If either installation procedure has been followed, your Pulsar directory
should contain two files of interest:
app.yml, to configure the Pulsar application, and
server.ini, to configure the web server (unless you are
running Pulsar without a web server).
Default values are specified for all configuration options, and these defaults
will work if Pulsar is running on the same host as Galaxy (e.g. for testing and
development). Otherwise, the host setting in server.ini will need to be
modified to listen for external requests.
app.yml settings can be overridden by
setting environment variables, just as with Galaxy, by prefixing the config
setting name with
PULSAR_CONFIG_OVERRIDE_. For example:
$ export PULSAR_CONFIG_OVERRIDE_PRIVATE_TOKEN=changed
$ pulsar
Defaults can also be set via environment variables by prefixing them with
PULSAR_CONFIG_.
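As an illustration, a default for the private token might be supplied as sketched below. The value the_default is a placeholder chosen for this example, and the sketch assumes the default is only used when app.yml does not set private_token itself.

```shell
# Placeholder value (an assumption for illustration); choose your own.
# Unlike PULSAR_CONFIG_OVERRIDE_, this prefix supplies a default rather
# than an override.
export PULSAR_CONFIG_PRIVATE_TOKEN=the_default

# Then start Pulsar as usual from the same shell (for example, by running
# the pulsar command) so that it inherits the variable.
```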
Out of the box, Pulsar essentially allows anyone with network access to the Pulsar server to execute arbitrary code and read and write any files the web server can access. Hence, in most settings steps should be taken to secure the Pulsar server.
If running Pulsar with a web server, you must specify a private token (a
shared secret between Pulsar and the Galaxy server) to prevent unauthorized
access. This is done by simply setting private_token in
app.yml to some
long random string.
Once a private token is configured, Galaxy job destinations should include a
private_token parameter to authenticate these jobs.
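One possible way to produce a long random string uses only standard Unix tools, as sketched here. The variable name token is illustrative, and any other source of strong randomness would serve equally well.

```shell
# Read 32 bytes from the kernel's random source and hex-encode them,
# giving a 64-character token. The variable name is illustrative.
token="$(head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n')"

# Print a line that could be pasted into app.yml.
printf 'private_token: %s\n' "$token"
```

The same value then needs to be given to Galaxy as the private_token parameter of the job destination.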
Pulsar Web Server
The default Pulsar web server, Paste, can be configured to use SSL and to require the client (i.e. Galaxy) to pass along a private token authorizing use.
pyOpenSSL is required to configure a Pulsar web server to serve content via
HTTPS/SSL. This dependency can be difficult to install and seems to be getting
more difficult. Under Linux you will want to ensure the needed dependencies to
compile pyOpenSSL are available - for instance in a fresh Ubuntu image you
will likely need:
$ sudo apt-get install libffi-dev python3-dev libssl-dev
Then pyOpenSSL can be installed with the following command (be sure to activate your virtualenv if one was set up above):
$ pip install pyOpenSSL
Once installed, you will need to set the web server's SSL certificate option.
This parameter should reference an OpenSSL certificate file for use by the
Paste server. It can be set to * to automatically generate such
a certificate. A self-signed certificate for testing purposes can be
generated manually as follows:
$ openssl genrsa 1024 > host.key
$ chmod 400 host.key
$ openssl req -new -x509 -nodes -sha1 -days 365 \
    -key host.key > host.cert
$ cat host.cert host.key > host.pem
$ chmod 400 host.pem
More information can be found in the paste httpserver documentation.
If Pulsar is processing requests via a message queue instead of a web server, the underlying security
mechanisms of the message queue should be used to secure communication.
Deploying Pulsar with SSL and a
private_token as described above is not required.
This can be done via two (not mutually exclusive) methods: client SSL certificates, or password authentication. In either case, you should configure your AMQP server with SSL.
If using client certificates, you will likely need to set the appropriate (for
your PKI) combination of AMQP SSL settings, such as
amqp_connect_ssl_cert_reqs, in Pulsar’s
app.yml file. See
app.yml.sample for more details.
If using password authentication, this information can be set in the
message_queue_url setting in
app.yml.
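A rough sketch of how such a URL with SSL might be assembled is shown below. The user name, password, and host are hypothetical placeholders, and the amqps scheme with the ssl=1 query parameter is an assumption about what your Kombu version accepts, so check it against the Kombu documentation.

```shell
# All values below are hypothetical placeholders, not real credentials.
mq_user="pulsar_user"
mq_password="change_me"
mq_host="mqserver.example.org"
mq_port=5671   # conventional port for AMQP over TLS

# The "//" after the port selects the default virtual host "/".
message_queue_url="amqps://${mq_user}:${mq_password}@${mq_host}:${mq_port}//?ssl=1"

# Print a line that could be pasted into app.yml.
printf 'message_queue_url: %s\n' "$message_queue_url"
```

Passwords containing characters such as @ or / would need to be URL-encoded before being placed in the URL.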
You can consult the Kombu documentation for even more information.
Customizing the Pulsar Environment (*nix only)
For many deployments, Pulsar’s environment will need to be tweaked, for
instance to define a
DRMAA_LIBRARY_PATH environment variable for the
drmaa Python module, or to define the location of Galaxy (via
GALAXY_HOME) if certain Galaxy tools require it or if Galaxy
metadata is being set by the Pulsar.
The file local_env.sh (created automatically by
pulsar-config) will be sourced by
pulsar before launching the application and by child processes
created by Pulsar that require this configuration.
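The contents of local_env.sh might look something like the following sketch. Both paths are hypothetical and need to be replaced with the real locations on your system.

```shell
# Sketch of local_env.sh contents; both paths are hypothetical.

# Shared library used by the drmaa Python module.
export DRMAA_LIBRARY_PATH=/opt/scheduler/lib/libdrmaa.so

# Root of a Galaxy checkout, for tools that need Galaxy code.
export GALAXY_HOME=/srv/galaxy
```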
Job Managers (Queues)
By default the Pulsar will maintain its own queue of jobs. While ideal for simple deployments such as those targeting a single Windows instance, if Pulsar is going to be used on more sophisticated clusters, it can be configured to maintain multiple such queues with different properties or to delegate to external job queues (via DRMAA, qsub/qstat CLI commands, or Condor).
For more information on configuring external job managers, see Job Managers.
Some Galaxy tool wrappers require a copy of the Galaxy codebase itself to run.
Such tools will not run under Windows, but on *nix hosts the Pulsar can be
configured to add the required Galaxy code to a job's
PYTHON_PATH by setting the
GALAXY_HOME environment variable in the Pulsar’s local_env.sh file.
Message Queue (AMQP)
Galaxy and Pulsar can be configured to communicate via a message queue instead of a Pulsar web server. In this mode, Pulsar and Galaxy will send and receive job control and status messages via an external message queue server using the AMQP protocol. This is sometimes referred to as running Pulsar “webless”.
In addition, when using a message queue, Pulsar will download files from and upload files to Galaxy instead of the inverse. Message queue mode may be very advantageous if Pulsar needs to be deployed behind a firewall or if the Galaxy server is already set up (via proxy web server) for large file transfers.
A template configuration for using Galaxy with a message queue can be created with:
$ pulsar-config --mq
You will also need to ensure that the
kombu Python dependency is installed
(pip install kombu). Once this is available, simply set the
message_queue_url property in
app.yml to the correct URL of your
configured AMQP endpoint.
AMQP does not guarantee message receipt. It is possible to have Pulsar (and
Galaxy) require acknowledgement of receipt and resend messages that have not
been acknowledged, using the
amqp_ack* options documented in
app.yml.sample, but beware that enabling this option can give rise to the
Two Generals Problem, especially when Galaxy or the Pulsar server are down
(and thus not draining the message queue).
In the event that the connection to the AMQP server is lost during message
publish, the Pulsar server can retry the connection, governed by the
amqp_publish* options documented in app.yml.sample.
Pulsar and its client can be configured to cache job input files. For some
workflows this can result in a significant decrease in data transfer and
greater throughput. On the Pulsar server side, the caching property in
app.yml must be set. See Galaxy’s job_conf.xml example file for information on configuring the client.