Configuring Pulsar

If either installation procedure has been followed, your Pulsar directory should contain two files of interest: app.yml, which configures the Pulsar application, and server.ini, which configures the web server (unless you are running Pulsar without a web server).

All configuration options have default values that work if Pulsar is running on the same host as Galaxy. However, the “host” parameter must be specified for remote submissions to the Pulsar server to run properly.

Security

Out of the box, Pulsar essentially allows anyone with network access to the Pulsar server to execute arbitrary code and to read and write any files the web server can access. In most settings, steps should therefore be taken to secure the Pulsar server.

Pulsar Web Server

The default Pulsar web server (paster) can be configured to use SSL and to require the client (i.e. Galaxy) to pass along a private token authorizing use.

pyOpenSSL is required to configure a Pulsar web server to serve content via HTTPS/SSL. This dependency can be difficult to install and seems to be getting more difficult. Under Linux you will want to ensure that the dependencies needed to compile pyOpenSSL are available. For instance, on a fresh Ubuntu image you will likely need:

sudo apt-get install libffi-dev python-dev libssl-dev

Then pyOpenSSL can be installed with the following command (be sure to source your virtualenv first if you set one up above):

pip install pyOpenSSL

Under Windows, only older versions of pyOpenSSL are installable via pre-compiled binaries (i.e. using easy_install), so it might be good to use non-standard sources such as eGenix.

Once pyOpenSSL is installed, you will need to set the option ssl_pem in server.ini. This parameter should reference an OpenSSL certificate file for use by the Python paste server. It can also be set to * to generate such a certificate automatically. A certificate can be generated manually as follows:

$ openssl genrsa 2048 > host.key
$ chmod 400 host.key
$ openssl req -new -x509 -nodes -sha256 -days 365 \
          -key host.key > host.cert
$ cat host.cert host.key > host.pem
$ chmod 400 host.pem
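
Before pointing ssl_pem at the combined file, it can be worth confirming that it really contains both a certificate and a private key. The sketch below does this by scanning for PEM header lines only; it does not validate the certificate itself. The function names are illustrative and not part of Pulsar.

```python
import re

# Matches PEM header lines such as "-----BEGIN CERTIFICATE-----".
PEM_HEADER = re.compile(r"^-----BEGIN ([A-Z0-9 ]+)-----\s*$", re.MULTILINE)


def pem_labels(text: str) -> list:
    """Return the labels of all PEM blocks found in text, in order."""
    return PEM_HEADER.findall(text)


def has_cert_and_key(text: str) -> bool:
    """Return True if text holds at least one certificate and one private key."""
    labels = pem_labels(text)
    has_cert = "CERTIFICATE" in labels
    # Covers both the "PRIVATE KEY" and "RSA PRIVATE KEY" header variants.
    has_key = any(label.endswith("PRIVATE KEY") for label in labels)
    return has_cert and has_key


def check_pem_file(path: str) -> bool:
    """Read the file at path and check it for a certificate and a key."""
    with open(path) as handle:
        return has_cert_and_key(handle.read())
```

Calling check_pem_file("host.pem") on the file produced by the commands above is a quick way to catch a missed cat step before starting the server.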

More information can be found in the paste httpserver documentation.

Finally, in order to force Galaxy to authorize itself, you will want to specify a private token by setting private_token to some long random string in app.yml.
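
One way to produce such a string is Python's standard secrets module. The following is a minimal sketch; the helper name make_private_token and the 32-byte length are illustrative choices, not something Pulsar requires.

```python
import secrets


def make_private_token(nbytes: int = 32) -> str:
    """Return a random hexadecimal string of 2 * nbytes characters."""
    # token_hex draws from the operating system's secure random source.
    return secrets.token_hex(nbytes)


if __name__ == "__main__":
    # Print a line that can be pasted into app.yml. The quotes keep a YAML
    # parser from reading a token made only of digits as a number.
    print(f'private_token: "{make_private_token()}"')
```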

Once SSL has been enabled and a private token configured, Galaxy job destinations should include a private_token parameter to authenticate these jobs.

Pulsar Message Queue

If Pulsar is processing requests via a message queue instead of a web server, the underlying security mechanisms of the message queue should be used to secure communication. Deploying Pulsar with SSL and a private_token as described above is not required in this case.

This will likely consist of setting some combination of amqp_connect_ssl_ca_certs, amqp_connect_ssl_keyfile, amqp_connect_ssl_certfile, and amqp_connect_ssl_cert_reqs in Pulsar’s app.yml file. See app.yml.sample for more details and the Kombu documentation for even more information.
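
To illustrate how these options relate to one another, the sketch below collects the amqp_connect_ssl_* keys from a configuration dictionary into a single set of SSL options and turns a cert_reqs name such as cert_required into the matching constant from Python's ssl module. This is an illustration under assumptions rather than Pulsar's actual code: the function name, the file paths, and the cert_reqs spelling shown are hypothetical, and app.yml.sample remains the reference for real values.

```python
import ssl

PREFIX = "amqp_connect_ssl_"


def collect_amqp_ssl_options(config: dict) -> dict:
    """Gather amqp_connect_ssl_* settings into one dict, prefix removed."""
    options = {}
    for key, value in config.items():
        if key.startswith(PREFIX):
            options[key[len(PREFIX):]] = value
    if "cert_reqs" in options:
        # Map a name like "cert_required" to ssl.CERT_REQUIRED.
        options["cert_reqs"] = getattr(ssl, str(options["cert_reqs"]).upper())
    return options


# Hypothetical values, as they might look after app.yml has been loaded.
example_config = {
    "message_queue_url": "amqp://guest:guest@localhost:5672//",
    "amqp_connect_ssl_ca_certs": "/path/to/cacert.pem",
    "amqp_connect_ssl_keyfile": "/path/to/key.pem",
    "amqp_connect_ssl_certfile": "/path/to/cert.pem",
    "amqp_connect_ssl_cert_reqs": "cert_required",
}

print(collect_amqp_ssl_options(example_config))
```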

Customizing the Pulsar Environment (*nix only)

For many deployments, Pulsar’s environment will need to be tweaked, for instance to define a DRMAA_LIBRARY_PATH environment variable for the drmaa Python module, or to specify the location of a Galaxy installation (via GALAXY_HOME) if certain Galaxy tools require it or if Galaxy metadata is being set by Pulsar.

The file local_env.sh (created automatically by pulsar-config) will be sourced by Pulsar before launching the application and by child processes created by Pulsar that require this configuration.

Job Managers (Queues)

By default, Pulsar will maintain its own queue of jobs. While this is ideal for simple deployments such as those targeting a single Windows instance, if Pulsar is going to be used on more sophisticated clusters, it can be configured to maintain multiple such queues with different properties or to delegate to external job queues (via DRMAA, qsub/qstat CLI commands, or Condor).

For more information on configuring external job managers, see the job managers documentation.

Galaxy Tools

Some Galaxy tool wrappers require a copy of the Galaxy codebase itself to run. Such tools will not run under Windows, but on *nix hosts Pulsar can be configured to add the required Galaxy code to a job’s PYTHONPATH by setting the GALAXY_HOME environment variable in Pulsar’s local_env.sh file (described above).

Caching (Experimental)

Pulsar and its client can be configured to cache job input files. For some workflows this can result in a significant decrease in data transfer and greater throughput. On the Pulsar server side, the property file_cache_dir in app.yml must be set. See Galaxy’s job_conf.xml example file for information on configuring the client.

More discussion on this can be found in this galaxy-dev mailing list thread and future plans and progress can be tracked on this Trello card.

Message Queue (Experimental)

Galaxy and Pulsar can be configured to communicate via a message queue instead of a Pulsar web server. In this mode, Pulsar will download files from and upload files to Galaxy instead of the inverse. This may be very advantageous if Pulsar needs to be deployed behind a firewall or if the Galaxy server is already set up (via a proxy web server) for large file transfers.

To bind the Pulsar server to a message queue, first ensure that the kombu Python dependency is installed (pip install kombu). Once this is available, simply set the message_queue_url property in app.yml to the correct URL of your configured AMQP endpoint.
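
AMQP URLs generally take the form amqp://user:password@host:port/vhost. Credentials or virtual host names that contain characters such as @ or / need to be percent-encoded, which is easy to get wrong by hand. The following sketch builds such a URL with the standard library. The function name, host, and credentials are placeholders, and the URL your broker actually expects depends on how it has been set up.

```python
from urllib.parse import quote


def build_amqp_url(user, password, host, port=5672, vhost="/"):
    """Assemble an AMQP URL, percent-encoding the parts that may need it."""
    return "amqp://{user}:{password}@{host}:{port}/{vhost}".format(
        user=quote(user, safe=""),
        password=quote(password, safe=""),
        host=host,
        port=port,
        # With safe="" the default vhost "/" is encoded as %2F.
        vhost=quote(vhost, safe=""),
    )


# Placeholder values; substitute those of your own AMQP endpoint.
print(build_amqp_url("galaxy", "p@ss/word", "mq.example.org"))
```

The printed value is what would then be assigned to message_queue_url in app.yml.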

Information on configuring RabbitMQ, one such compatible message queue, can be found here.