GRIA Command Line Client Tutorial

GRIA Client v4.3.0


Introduction

This tutorial shows how to use GRIA services as a client. We will run some image processing jobs on remote GRIA service providers.

You should already be familiar with the general design of GRIA. In summary:

  1. Your organisation's budget holder opens accounts with one or more GRIA service providers. Each account is represented by an account conversation.

  2. When you want to use the service provider's resources, you create a resource allocation conversation within the account conversation. In this conversation, you agree on the resources (data and computation) that you will need.

  3. Within the allocation conversation, you create a data conversation for each input and output to the application you are going to run.

  4. Also within the allocation, you create a job conversation for each job.

  5. You run jobs using the job conversations, specifying the inputs and outputs as the various data conversations.

It is common to create several allocations on different service providers and to use data stagers within one allocation as inputs and outputs for jobs in another, as in this diagram:

Nested GRIA Conversations

Here, the arrows represent data-flow. The client uploads some inputs, and then runs a series of jobs. Some of the jobs use the inputs that the client has uploaded directly, while others use the output from other jobs. The whole process is orchestrated from the client.
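
The nesting described above can be pictured as a small data model. The following Python sketch is illustrative only and is not part of the GRIA client: the class names are hypothetical, and the allocation, stager and application names are borrowed from the two-provider example later in this tutorial.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataStager:
    """A data conversation: holds one input or output."""
    name: str


@dataclass
class Job:
    """A job conversation: reads some stagers and writes others."""
    application: str
    inputs: List[DataStager]
    outputs: List[DataStager]


@dataclass
class Allocation:
    """A resource allocation conversation, nested inside an account."""
    name: str
    stagers: Dict[str, DataStager] = field(default_factory=dict)
    jobs: List[Job] = field(default_factory=list)

    def stager(self, name: str) -> DataStager:
        # Return the existing stager with this name, registering a new
        # one on first use.
        return self.stagers.setdefault(name, DataStager(name))


@dataclass
class Account:
    """An account conversation with one service provider."""
    provider: str
    allocations: List[Allocation] = field(default_factory=list)


# Two allocations on different providers. The second job reads a stager
# that lives in the first allocation, so data flows between providers.
first = Allocation("First")
second = Allocation("Second")
accounts = [Account("Provider 1", [first]), Account("Provider 2", [second])]

first.jobs.append(Job("paint", [first.stager("input")], [first.stager("mid")]))
second.jobs.append(Job("swirl", [first.stager("mid")], [second.stager("final")]))
```

In this model the client holds every reference, which mirrors the point that the whole process is orchestrated from the client.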

Running gridcli on its own will display a help screen explaining all the commands:

c:\grid\gria-client-4.3.0> gridcli
Grid Client
Usage: gridcli <command> <args>

Where <command> is one of:

open            - open a new account
tender          - find a supplier willing to help you
upload          - upload data to a supplier
download        - download results from a supplier
run             - run a job on a remote supplier's machines
start           - start running a job but don't wait
monitor         - monitor the state of a job
finish          - tell the supplier you're finished
show            - show details of the current allocation
browse          - open the conversation browser window
check-accounts  - check the status of the accounts

You can get more help on any command with --help. Eg:

        gridcli tender --help

Example sequence:

        gridcli open Accounts.xml
        [ create a file called Requirements.xml ]
        [ create a file called Work.xml ]
        [ zip up input file (e.g. file.txt) into input.zip ]
        gridcli tender Accounts.xml Requirements.xml MyTask
        gridcli upload MyTask input.zip model
        gridcli run MyTask http://APPLICATION/URI Work.xml --input model --output results
        gridcli download results output.zip
        gridcli finish
        [ unzip output file (e.g. file.std) from output.zip ]

Step 1: Creating a new account

Typically, new accounts are opened by a company's budget holder, who will then grant certain other users access to the account. The budget holder is responsible for paying for usage of the account.

The budget holder opens a new account using the gridcli open command. As with all commands, the --help option may be passed to request help on how to use it:

> gridcli open --help
Grid Client
Usage: gridcli open Accounts.xml

Opens a dialog box asking the user to enter their account
details and a name for the account. A request is then sent
to the remote supplier asking for the account to be opened.
The new account is appended to the Accounts.xml file (if no
accounts file currently exists, a new one is created).
Note that you will probably have to wait for the credit checks
to pass before the account can be used.

  1. Run the command as it suggests:

    > gridcli open Accounts.xml

    A dialog box appears prompting you to enter the service provider details:

    Opening a new account

  2. Enter the Name for new account. This is only a label that lets you identify the account later. If you are testing with IT Innovation's demo server, you could use "IT-Innovation demo server".

  3. In Account service URL, use https://griademo.it-innovation.soton.ac.uk/GRIA/services/AccountService to use the demo server, or enter the address of the account service you wish to use.

  4. Enter the Budget holder's name, i.e. the name of the person within your organisation who is responsible for controlling budgets.

  5. Fill in the other fields with the budget holder's details.

  6. Press the OK button.

  7. The dialog box will disappear and the client prints Account requested. You will be left with an Accounts.xml file that lists all accounts you maintain with service providers.

  8. The new account needs to be approved by the service provider, which they will do by using the account administration tool. You can check the status of your account using the command:

    > gridcli check-accounts Accounts.xml

    Once the account is approved, its status changes from pending-credit-checks to open.

  9. To get your account approved, email the service provider (griasupport@it-innovation.soton.ac.uk for the demo server).

  10. You can see the details of the new account using the gridcli browse command:

    > gridcli browse

    Browser showing the new account conversation

Right-click an account to display the popup menu, which lets you close the account, get a statement, or grant other users access to the account. Since no resources have been allocated yet, the statements will all be blank.

Step 2: Getting a resource allocation

We will begin with a very simple example, which uploads an input image to a GRIA Data Stager, runs a processing job on it, and then downloads the results. Then, we will move on to a more complex example which runs two jobs on two different service providers, using the output of the first job as the input to the second.

Processing an image with GRIA

The resource allocation service is used to request an allocation of resources from a service provider. When you want to use remote GRID resources, you need to create a resource allocation.

You should start with two files: Accounts.xml and Requirements.xml. The accounts file is obtained as described above. The requirements file gives the processing requirements for your job (a full description is provided here).

  1. Download the sample Requirements.xml to your <install-directory>\gria-client-4.3.0 directory.

  2. Ensure the <application> elements in the Requirements.xml file give the names of the applications you need to use. In our case, you should have http://it-innovation.soton.ac.uk/grid/imagemagick/paint and http://it-innovation.soton.ac.uk/grid/imagemagick/swirl.

  3. Run the gridcli tender command. As before, running it with --help (or no arguments) will produce a usage message. Run the command as follows:

    > gridcli tender Accounts.xml Requirements.xml MyAllocation

  4. Invitations to submit tenders are sent to all the suppliers. You are prompted to choose one of the offers returned, and the client then confirms that offer with the supplier. At this point, your account is billed for the resources you requested.

Choosing an offer

The client.state file will now contain all the accounts from Accounts.xml, plus your new resource allocation conversation.

Step 3: Check the account statement

You can see the details of the new allocation by running gridcli browse, right-clicking over the account, and choosing Get statement:

> gridcli browse

Statement viewer dialog

Balance at end date shows the total amount that you owe, assuming that all currently open allocations are completely used. Each line in the table shows a single event. A line with an allocation ID and a value in the Charges column records the confirmation or extension of an allocation. Any other line with an allocation ID records that an allocation was finished (a value in the Payments column means that you were reimbursed for some fraction of the unused resources). A line with no allocation ID and a value in the Payments column records a payment made to the account.
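
The rules for reading a statement line can be written down as a small decision function. This is a minimal Python sketch of those rules only. The function name, its arguments and the returned labels are hypothetical, and it assumes an empty table cell is represented as None. The client itself shows statements only in the viewer dialog.

```python
from typing import Optional


def classify_statement_row(allocation_id: Optional[str],
                           charges: Optional[float],
                           payments: Optional[float]) -> str:
    """Label one statement row using the rules described above."""
    if allocation_id is not None:
        if charges is not None:
            # Allocation ID plus a charge: confirmation or extension.
            return "allocation confirmed or extended"
        if payments is not None:
            # Allocation ID plus a payment: finished with a reimbursement.
            return "allocation finished, unused resources reimbursed"
        return "allocation finished"
    if payments is not None:
        # No allocation ID: money paid into the account.
        return "payment made to the account"
    return "unknown"
```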

Step 4: Upload the input data

Before you can start running a job, you need to make sure that the input data is available to it. This is done by creating a new data stager within your new resource allocation, and uploading a file to it. The sample application requires a single JPEG image file as its input.

Use the gridcli upload command. This creates a new stager called original (if one doesn't already exist) in the MyAllocation allocation, and uploads image.jpg to it:

> gridcli upload MyAllocation image.jpg original
Uploading image.jpg...
Done

Step 5: Run the job

You will need a Work.xml file for your job, specifying the resources needed for this job (a full description of this file is provided here). These resources will be a subset of those requested for the allocation. The long URI in the command below gives the name of the application you want to run.

  1. Download the sample Work.xml.
  2. Now you can run the job (this should all be on one line):

    > gridcli run MyAllocation http://it-innovation.soton.ac.uk/grid/imagemagick/swirl
                                                  Work.xml --input original --output result
    

The client monitors the status of the job until the outputs are ready to download:

Grid Client
Creating new remote data stager for output 'result'
Creating job conversation...
Getting ID for job service...
Enabling read access to original...
Enabling write access to result...
Starting job...
Running...
Contacting https://vinsanto.it-innovation.soton.ac.uk/GRIA/services/JobService#61

http://it-innovation.soton.ac.uk/grid/imagemagick/swirl:
  URL   : https://vinsanto.it-innovation.soton.ac.uk/GRIA/services/JobService#61
  Status: Status is now input-retrieval-in-progress

...

Contacting https://vinsanto.it-innovation.soton.ac.uk/GRIA/services/JobService#61

http://it-innovation.soton.ac.uk/grid/imagemagick/swirl:
  URL   : https://vinsanto.it-innovation.soton.ac.uk/GRIA/services/JobService#61
  Status: Status is now output-staging-complete (FINISHED)
  > JOB_STATUS            FINISHED
  > Swirl wrapper started
  > Arguments are: -i ../stagedzips/input0.zip -o ../stagedzips/output0.zip
  > Copying inputs to work directory...
  > Transforming image...
  > Copying result to output stager...
  > Swirl job completed successfully


Success.

Step 6: Download the result

When the gridcli run command finishes, you can download the output (result) to your local directory using gridcli download.

Note: When a job finishes running, its status will say submitted (FINISHED), but the output data is not yet available. Wait until the status is output-staging-complete (FINISHED) before downloading.

> gridcli download result result.jpg
Grid Client
Downloading:
  From: result
   URL: https://griademo.it-innovation.soton.ac.uk/GRIA/services/DataService#60
    To: result.jpg
Done.
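
If you start jobs without waiting and check on them later, the distinction in the note above matters: only output-staging-complete (FINISHED) means the result can be downloaded. The Python sketch below shows one way to test a status line for that condition. It is not part of the client. The function names are hypothetical, and it assumes every status is reported in the "Status is now ..." form seen in the gridcli run output above.

```python
import re
from typing import Optional, Tuple

# Matches e.g. "Status is now output-staging-complete (FINISHED)".
# The bracketed part is optional, as in "input-retrieval-in-progress".
_STATUS = re.compile(r"Status is now (\S+)(?: \((\w+)\))?")


def parse_status(line: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (status, qualifier) from a status line, or None if absent."""
    match = _STATUS.search(line)
    if match is None:
        return None
    return match.group(1), match.group(2)


def outputs_ready(line: str) -> bool:
    """True only when the job's outputs have been fully staged."""
    return parse_status(line) == ("output-staging-complete", "FINISHED")
```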

Step 7: Finish the allocation

Use gridcli finish when done to remove the remote resource allocation. If you didn't use all the resources you originally requested, you'll be reimbursed for some fraction of their cost. Make sure you have no more jobs to run, or data to keep, within the allocation before finishing it.

> gridcli finish
Done.

Step 8: Using multiple allocations

So far, we have only used a single allocation. It is also possible to run jobs using data held within other allocations, possibly on other service providers. In this part, we will process the source image using two different jobs on two different machines:

Using two allocations

The following sequence of commands will perform these operations:

gridcli tender Accounts.xml Requirements.xml First

gridcli tender Accounts.xml Requirements.xml Second

gridcli upload First image.jpg input

gridcli run First http://it-innovation.soton.ac.uk/grid/imagemagick/paint Work.xml
	--input input --output mid

gridcli run Second http://it-innovation.soton.ac.uk/grid/imagemagick/swirl Work.xml
	--input mid --output final

gridcli download final final.jpg

Notice that the job on Service Provider 2 needs permission to access the data on Service Provider 1. This is handled automatically by the command line client: when you run the second job, the client sends a message to Service Provider 1 telling it to grant Service Provider 2 access to the second data stager.
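
The same pattern extends to longer chains, where each job's output stager becomes the next job's input. As a rough illustration, the Python sketch below builds the gridcli argument lists for such a chain. The helper chain_commands is hypothetical and uses only the command forms shown in this tutorial. It assumes each stage runs in its own allocation and has exactly one input and one output.

```python
from typing import List, Tuple

# (allocation name, application URI, output stager name) for each stage.
Stage = Tuple[str, str, str]


def chain_commands(accounts: str, requirements: str, work: str,
                   source_file: str, first_input: str,
                   stages: List[Stage], result_file: str) -> List[List[str]]:
    """Build gridcli argument lists for a linear chain of jobs."""
    # One tender per stage, since each stage has its own allocation.
    commands = [["gridcli", "tender", accounts, requirements, alloc]
                for alloc, _, _ in stages]
    # Upload the source file into the first stage's allocation.
    commands.append(["gridcli", "upload", stages[0][0], source_file, first_input])
    current = first_input
    for alloc, app, output in stages:
        commands.append(["gridcli", "run", alloc, app, work,
                         "--input", current, "--output", output])
        current = output  # this stage's output feeds the next stage
    # Download whatever the last stage produced.
    commands.append(["gridcli", "download", current, result_file])
    return commands


BASE = "http://it-innovation.soton.ac.uk/grid/imagemagick/"
for argv in chain_commands("Accounts.xml", "Requirements.xml", "Work.xml",
                           "image.jpg", "input",
                           [("First", BASE + "paint", "mid"),
                            ("Second", BASE + "swirl", "final")],
                           "final.jpg"):
    print(" ".join(argv))
```

With the two stages given here, the printed lines are the six commands listed above. The sketch only prints the commands; running them is left to you.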

Step 9: Delegating access to a data stager

You can also give other users read or write permission on a data stager:

  1. Use the gridcli browse command to show all the conversations.
  2. Open the popup menu over the data stager to share.
  3. Choose Enable read access or Enable write access.
  4. Choose the certificate of the user to whom you wish to grant access using the certificate browser.

Access can later be revoked by choosing the corresponding Disable method from the menu.

To use the data stager, another user will also need to know the conversation ID for the stager. The process is:

  1. Send the other user your client.state file.

  2. Have the other user try to download some data. The system should reject the request.

  3. Use gridcli browse to see your conversations. Bring up the menu over a data stager and choose Enable read access from the menu.

  4. Get the other user's certificate from them and select it in the dialog box using the Browse button.

  5. The other user should now be able to download the data.
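
The effect of these steps can be modelled as a per-certificate permission table on the stager. The Python sketch below is a toy model for illustration, not how the GRIA services are implemented. The class and the certificate labels are hypothetical, and it assumes the stager's owner starts with both read and write permission.

```python
from typing import Dict, Set


class StagerAccess:
    """Toy model of per-certificate permissions on one data stager."""

    def __init__(self, owner: str) -> None:
        # Assumption: the owner holds both permissions; nobody else has any.
        self._grants: Dict[str, Set[str]] = {owner: {"read", "write"}}

    def enable(self, certificate: str, permission: str) -> None:
        """Grant one permission ("read" or "write") to a certificate."""
        self._grants.setdefault(certificate, set()).add(permission)

    def disable(self, certificate: str, permission: str) -> None:
        """Revoke one permission; does nothing if it was never granted."""
        self._grants.get(certificate, set()).discard(permission)

    def allowed(self, certificate: str, permission: str) -> bool:
        return permission in self._grants.get(certificate, set())


stager = StagerAccess(owner="alice-cert")
assert not stager.allowed("bob-cert", "read")   # step 2: request rejected
stager.enable("bob-cert", "read")               # steps 3-4: grant read access
assert stager.allowed("bob-cert", "read")       # step 5: download now permitted
stager.disable("bob-cert", "read")              # access later revoked
assert not stager.allowed("bob-cert", "read")
```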

Delegating access to a data stager