Configuring Pulsar

If either installation procedure has been followed, your Pulsar directory should contain two files of interest: app.yml to configure the Pulsar application and server.ini to configure the web server (unless you are running Pulsar without a web server).

The default values for all configuration options work when Pulsar runs on the same host as Galaxy (e.g. for testing and development). Otherwise, the host setting in server.ini must be changed so that Pulsar listens for external requests.

app.yml settings can be overridden by setting environment variables, just as with Galaxy, by prefixing the config setting name with PULSAR_CONFIG_OVERRIDE_. For example:

$ export PULSAR_CONFIG_OVERRIDE_PRIVATE_TOKEN=changed
$ pulsar

Defaults can also be set via environment variables by prefixing them with PULSAR_CONFIG_. For example, PULSAR_CONFIG_PRIVATE_TOKEN.
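
The precedence described above can be sketched in Python. This is an illustrative model only, not Pulsar's actual loading code: the helper name resolve_setting is hypothetical, and the rule that a setting name is upper-cased to form the environment variable name is an assumption inferred from the PRIVATE_TOKEN examples.

```python
import os

OVERRIDE_PREFIX = "PULSAR_CONFIG_OVERRIDE_"
DEFAULT_PREFIX = "PULSAR_CONFIG_"


def resolve_setting(name, file_config, environ=None):
    """Return the effective value for one setting (hypothetical helper).

    Assumed precedence, highest first:
      1. PULSAR_CONFIG_OVERRIDE_<NAME> in the environment
      2. the value read from app.yml (passed in as ``file_config``)
      3. PULSAR_CONFIG_<NAME> in the environment, used as a default
    """
    environ = os.environ if environ is None else environ
    key = name.upper()
    if OVERRIDE_PREFIX + key in environ:
        return environ[OVERRIDE_PREFIX + key]
    if name in file_config:
        return file_config[name]
    return environ.get(DEFAULT_PREFIX + key)
```

With this model, an override variable wins over app.yml, while a default variable is only consulted when app.yml does not set the option at all.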

Security

Out of the box, Pulsar essentially allows anyone with network access to the Pulsar server to execute arbitrary code and read and write any files the web server can access. Hence, in most settings steps should be taken to secure the Pulsar server.

Private Token

If running Pulsar with a web server, you must specify a private token (a shared secret between Pulsar and the Galaxy server) to prevent unauthorized access. This is done by simply setting private_token in app.yml to some long random string.

Once a private token is configured, Galaxy job destinations should include a private_token parameter to authenticate these jobs.
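
One way to produce a suitably long random string is the secrets module from the Python standard library. The helper name generate_private_token is hypothetical; any source of cryptographically strong randomness would serve equally well.

```python
import secrets


def generate_private_token(num_bytes=32):
    """Return a URL-safe random string built from ``num_bytes`` random bytes."""
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Paste the printed value into app.yml as the private_token setting.
    print(generate_private_token())
```

The same value then goes into the private_token parameter of the matching Galaxy job destination.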

Pulsar Web Server

The default Pulsar web server, Paste, can be configured to use SSL and to require the client (i.e. Galaxy) to pass along a private token authorizing use.

Tip

SSL support is built into uWSGI, an alternative web server that can be installed (see Installing Pulsar).

pyOpenSSL is required to configure a Pulsar web server to serve content via HTTPS/SSL. This dependency can be difficult to install. Under Linux, ensure that the dependencies needed to compile pyOpenSSL are available; for instance, on a fresh Ubuntu image you will likely need:

$ sudo apt-get install libffi-dev python3-dev libssl-dev

Then pyOpenSSL can be installed with the following command (be sure to activate your virtualenv first if you set one up above):

$ pip install pyOpenSSL

Once installed, you will need to set the option ssl_pem in server.ini. This parameter should reference an OpenSSL certificate file for use by the Paste server. This parameter can be set to * to automatically generate such a certificate. A self-signed certificate for testing purposes can be manually generated by the following method:

$ openssl genrsa 2048 > host.key
$ chmod 400 host.key
$ openssl req -new -x509 -nodes -sha256 -days 365  \
          -key host.key > host.cert
$ cat host.cert host.key > host.pem
$ chmod 400 host.pem

More information can be found in the paste httpserver documentation.
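
Before pointing ssl_pem at a file such as host.pem, it can help to confirm that the file really contains both parts. The following is a minimal sketch with a hypothetical helper name; it only looks for the PEM block markers and does not validate the certificate or check that the key matches it.

```python
import re

CERT_MARKER = re.compile(r"-----BEGIN CERTIFICATE-----")
# Matches "BEGIN PRIVATE KEY" as well as "BEGIN RSA PRIVATE KEY".
KEY_MARKER = re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----")


def pem_has_cert_and_key(pem_text):
    """Return True if the text contains both a certificate and a private key block."""
    return bool(CERT_MARKER.search(pem_text)) and bool(KEY_MARKER.search(pem_text))
```

A typical call would read the file first, for example pem_has_cert_and_key(open("host.pem").read()).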

Message Queue

If Pulsar is processing requests via a message queue instead of a web server, the underlying security mechanisms of the message queue should be used to secure communication; the SSL and private_token configuration described above does not apply.

This can be done via two (not mutually exclusive) methods: client SSL certificates, or password authentication. In either case, you should configure your AMQP server with SSL.

If using client certificates, you will likely need to set the appropriate (for your PKI) combination of amqp_connect_ssl_ca_certs, amqp_connect_ssl_keyfile, amqp_connect_ssl_certfile, and amqp_connect_ssl_cert_reqs, in Pulsar’s app.yml file. See app.yml.sample for more details.

If using password authentication, this information can be set in the message_queue_url setting in app.yml, e.g., with SSL:

message_queue_url: amqps://user:password@mqserver.example.org:5671//

You can consult the Kombu documentation for even more information.
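
Credentials embedded in a URL need percent-encoding when they contain characters such as @, : or /. The sketch below assembles a URL in the shape shown above; the helper name build_amqp_url and its defaults are assumptions made for illustration, and the virtual host is appended unchanged so that the default "/" yields the trailing "//".

```python
from urllib.parse import quote


def build_amqp_url(user, password, host, port=5671, vhost="/", scheme="amqps"):
    """Assemble an AMQP URL with percent-encoded credentials (hypothetical helper)."""
    # safe="" makes quote() also encode "/", so characters such as "@", ":"
    # and "/" in the credentials cannot be mistaken for URL delimiters.
    credentials = f"{quote(user, safe='')}:{quote(password, safe='')}"
    return f"{scheme}://{credentials}@{host}:{port}/{vhost}"
```

The result can be pasted into app.yml as the value of message_queue_url.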

User Authentication/Authorization

You can configure Pulsar to authenticate the user during request processing and to check whether this user is allowed to run a job.

Various authentication/authorization plugins can be configured in app.yml for this purpose; the plugin parameters depend on the auth type. For example, the following configuration uses the oidc plugin for authentication and the userlist plugin for authorization:

user_auth:
  authentication:
    - type: oidc
      oidc_jwks_url: https://login.microsoftonline.com/xxx/discovery/v2.0/keys
      oidc_provider: azure
      oidc_username_in_token: preferred_username
      oidc_username_template: "*."
  authorization:
    - type: userlist
      userlist_allowed_users:
        - xxx

See the plugins folder for available plugins and their parameters.

Customizing the Pulsar Environment (*nix only)

For many deployments, Pulsar’s environment will need to be tweaked, for instance to define a DRMAA_LIBRARY_PATH environment variable for the drmaa Python module, or to define the location of Galaxy (via GALAXY_HOME) if certain Galaxy tools require it or if Galaxy metadata is being set by Pulsar.

The file local_env.sh (created automatically by pulsar-config) will be sourced by Pulsar before launching the application and by child processes created by Pulsar that require this configuration.

Job Managers (Queues)

By default, Pulsar maintains its own queue of jobs. While this is ideal for simple deployments such as those targeting a single Windows instance, on more sophisticated clusters Pulsar can be configured to maintain multiple such queues with different properties or to delegate to external job queues (via DRMAA, qsub/qstat CLI commands, or Condor).

For more information on configuring external job managers, see Job Managers.

Galaxy Tools

Some Galaxy tool wrappers require a copy of the Galaxy codebase itself to run. Such tools will not run under Windows, but on *nix hosts Pulsar can be configured to add the required Galaxy code to a job’s PYTHONPATH by setting the GALAXY_HOME environment variable in Pulsar’s local_env.sh file (described above).

Most Galaxy tools require external command-line tools, known as Galaxy Tool Dependencies, to execute correctly. In Galaxy, these are provided by its Dependency Resolution system. Pulsar uses this same system, which can be configured via the dependency_resolution option in app.yml. See the example in app.yml.sample for additional information. In its default configuration, Pulsar will automatically install Conda but not automatically install missing tool dependencies. Administrators sending large numbers of tools to Pulsar most likely want to enable the auto_install option on the conda dependency resolver or the conda_auto_install global option so that it is not necessary to manually install dependencies for tools sent to Pulsar. Both options are documented in the app.yml.sample file.

Message Queue (AMQP)

Galaxy and Pulsar can be configured to communicate via a message queue instead of a Pulsar web server. In this mode, Pulsar and Galaxy will send and receive job control and status messages via an external message queue server using the AMQP protocol. This is sometimes referred to as running Pulsar “webless”.

Information on configuring RabbitMQ, one such compatible message queue, can be found in Message Queues with Galaxy and Pulsar.

In addition, when using a message queue, Pulsar will download files from and upload files to Galaxy instead of the inverse. Message queue mode may be very advantageous if Pulsar needs to be deployed behind a firewall or if the Galaxy server is already set up (via proxy web server) for large file transfers.

A template configuration for using Galaxy with a message queue can be created by pulsar-config:

$ pulsar-config --mq

You will also need to ensure that the kombu Python dependency is installed (pip install kombu). Once this is available, simply set the message_queue_url property in app.yml to the correct URL of your configured AMQP endpoint.

AMQP does not guarantee message receipt. It is possible to have Pulsar (and Galaxy) require acknowledgement of receipt and resend messages that have not been acknowledged, using the amqp_ack* options documented in app.yml.sample, but beware that enabling this option can give rise to the Two Generals Problem, especially when Galaxy or the Pulsar server are down (and thus not draining the message queue).

In the event that the connection to the AMQP server is lost during message publish, the Pulsar server can retry the connection, governed by the amqp_publish* options documented in app.yml.sample.
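
The general idea of retrying a failed publish with a growing delay can be sketched as follows. This is a generic pattern, not Pulsar's implementation: every function and parameter name below is illustrative and none of them is a Pulsar option name. The real behaviour is controlled by the amqp_publish* options documented in app.yml.sample.

```python
import time


def publish_with_retry(publish, max_retries=3, interval_start=1.0,
                       interval_step=2.0, interval_max=30.0, sleep=time.sleep):
    """Call ``publish()`` and retry on ConnectionError with a growing delay.

    All parameter names here are illustrative, not Pulsar option names.
    Returns the number of attempts that were needed.
    """
    delay = interval_start
    for attempt in range(1, max_retries + 2):  # first try + max_retries retries
        try:
            publish()
            return attempt
        except ConnectionError:
            if attempt > max_retries:
                raise  # retries exhausted: surface the last error
            sleep(delay)
            delay = min(delay + interval_step, interval_max)
```

Passing the sleep function as a parameter keeps the sketch easy to exercise without real waiting.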

Message Queue (pulsar-relay)

Pulsar can also communicate with Galaxy via an experimental pulsar-relay server, an HTTP-based message relay. This mode is similar to the AMQP message queue mode but uses HTTP long-polling instead of a message broker like RabbitMQ. This can help when:

  • Galaxy cannot directly reach Pulsar (e.g., due to firewall restrictions)

  • You want to avoid deploying and managing a RabbitMQ server

  • You prefer HTTP-based communication for simplicity and observability

Architecture

In this mode:

  1. Galaxy → Relay: Galaxy posts control messages (job setup, status requests, kill commands) to the relay via HTTP POST

  2. Relay → Pulsar: Pulsar polls the relay via HTTP long-polling to receive these messages

  3. Pulsar → Relay: Pulsar posts status updates to the relay

  4. Relay → Galaxy: Galaxy polls the relay to receive status updates

  5. File Transfers: Pulsar transfers files directly to/from Galaxy via HTTP (not through the relay)

Galaxy ──POST messages──> pulsar-relay ──poll──> Pulsar Server
                                                       │
                                                       │
Galaxy <────────direct HTTP for file transfers─────────┘
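
The four messaging steps above can be modelled with a toy in-memory relay. This is only a sketch of the message flow: the class InMemoryRelay and the message fields (job_id, status) are hypothetical, no HTTP is involved, and the real pulsar-relay API may look quite different.

```python
from collections import defaultdict, deque


class InMemoryRelay:
    """Toy stand-in for the relay: one FIFO queue per topic."""

    def __init__(self):
        self._topics = defaultdict(deque)

    def post(self, topic, message):
        """Store a message, as an HTTP POST to the relay would."""
        self._topics[topic].append(message)

    def poll(self, topic):
        """Return and remove all pending messages for a topic.

        A real long-poll would block until a message arrives or a timeout
        expires; this toy version returns immediately.
        """
        pending = list(self._topics[topic])
        self._topics[topic].clear()
        return pending


relay = InMemoryRelay()
# 1. Galaxy -> Relay: Galaxy posts a control message.
relay.post("job_setup", {"job_id": "42"})
# 2. Relay -> Pulsar: Pulsar polls and receives it.
setup_messages = relay.poll("job_setup")
# 3. Pulsar -> Relay: Pulsar posts a status update.
relay.post("job_status_update", {"job_id": "42", "status": "complete"})
# 4. Relay -> Galaxy: Galaxy polls and receives the update.
status_updates = relay.poll("job_status_update")
```

File transfers (step 5) are deliberately absent from the model, since they bypass the relay.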

Pulsar Configuration

To configure Pulsar to use pulsar-relay, set the message_queue_url in app.yml with an http:// or https:// prefix:

message_queue_url: http://proxy-server.example.org:9000
message_queue_username: admin
message_queue_password: your_secure_password

The http:// / https:// prefix tells Pulsar to use the relay communication mode instead of AMQP.
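
The scheme-based selection can be illustrated with a few lines of Python. The function name communication_mode and its return values are made up for this sketch; it shows the idea of dispatching on the URL scheme, not Pulsar's actual code.

```python
from urllib.parse import urlparse


def communication_mode(message_queue_url):
    """Classify a message_queue_url by its scheme (illustrative helper)."""
    scheme = urlparse(message_queue_url).scheme.lower()
    if scheme in ("http", "https"):
        return "relay"
    # Everything else is treated as a message broker URL in this sketch.
    return "amqp"
```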

Optional Topic Prefix

You can optionally set a relay_topic_prefix to namespace your topics. This is useful when multiple independent Galaxy/Pulsar instance pairs share the same relay:

message_queue_url: http://proxy-server.example.org:9000
message_queue_username: admin
message_queue_password: your_secure_password
relay_topic_prefix: production

Note

Unlike AMQP mode, the pulsar-relay mode does not require the kombu Python dependency. It only requires the requests library, which is a standard dependency of Pulsar.

Galaxy Configuration

In Galaxy’s job configuration (job_conf.yml), configure a Pulsar destination with proxy parameters:

runners:
  pulsar:
    load: galaxy.jobs.runners.pulsar:PulsarMQJobRunner
    # Proxy connection
    proxy_url: http://proxy-server.example.org:9000
    proxy_username: your_username
    proxy_password: your_secure_password
    # Optional topic prefix (must match Pulsar configuration)
    # relay_topic_prefix: production


execution:
  default: pulsar_relay
  environments:
    pulsar_relay:
      runner: pulsar
      # Galaxy's URL (for Pulsar to reach back for file transfers)
      url: http://galaxy-server.example.org:8080
      # Remote job staging directory
      jobs_directory: /data/pulsar/staging

Note

The relay_topic_prefix must match on both Galaxy and Pulsar sides. If set on one side but not the other, messages will not be routed correctly.

Authentication

The pulsar-relay uses JWT (JSON Web Token) authentication. Galaxy and Pulsar authenticate with the relay using the username and password provided in the configuration. Tokens are automatically managed and refreshed as needed.
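
Token refresh is handled automatically, so nothing needs to be implemented by the administrator. For readers curious about the mechanics, the sketch below shows how a client could decide that a JWT is about to expire by reading the standard exp claim. It assumes the token carries an exp claim, does not verify the signature, and uses hypothetical helper names; it is not taken from Pulsar or pulsar-relay.

```python
import base64
import json
import time


def jwt_expiry(token):
    """Return the ``exp`` claim (seconds since the epoch) from a JWT payload.

    The signature is NOT verified; this only reads the claim.
    """
    payload_b64 = token.split(".")[1]
    # base64url payloads omit padding, so restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    return payload["exp"]


def needs_refresh(token, margin_seconds=60, now=None):
    """Return True if the token expires within ``margin_seconds``."""
    now = time.time() if now is None else now
    return jwt_expiry(token) - now <= margin_seconds
```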

Tip

In production, always use HTTPS for the relay URL to encrypt credentials and message content during transit:

message_queue_url: https://proxy-server.example.org:443

Security Considerations

  • Use HTTPS: Always use HTTPS for the relay URL in production

  • Strong Passwords: Use strong, unique passwords for relay authentication

  • Network Isolation: Deploy the relay in a DMZ accessible to both Galaxy and Pulsar

  • Firewall Rules:
    • Galaxy → Relay: Allow outbound HTTPS

    • Pulsar → Relay: Allow outbound HTTPS

    • Pulsar → Galaxy: Allow outbound HTTP/HTTPS for file transfers

Multiple Pulsar Instances

You can deploy multiple Pulsar instances with different managers, all using the same relay. Messages are routed by topic names that include the manager name.

For example, configure two Pulsar servers:

Pulsar Server 1 (app.yml):

message_queue_url: http://proxy-server:9000
message_queue_username: admin
message_queue_password: password
managers:
  cluster_a:
    type: queued_slurm

Pulsar Server 2 (app.yml):

message_queue_url: http://proxy-server:9000
message_queue_username: admin
message_queue_password: password
managers:
  cluster_b:
    type: queued_condor

In Galaxy’s job configuration, route jobs to specific clusters using the manager parameter:

execution:
  environments:
    cluster_a_jobs:
      runner: pulsar
      proxy_url: http://proxy-server:9000
      manager: cluster_a
      # ... other settings

    cluster_b_jobs:
      runner: pulsar
      proxy_url: http://proxy-server:9000
      manager: cluster_b
      # ... other settings

Multiple Galaxy/Pulsar Instance Pairs

You can have multiple independent Galaxy and Pulsar instance pairs all sharing the same relay by using different topic prefixes. This is useful for:

  • Running separate production and staging environments

  • Supporting multiple research groups with isolated instances

  • Multi-tenant deployments

Example: Production and Staging Environments

Production Pulsar (app.yml):

message_queue_url: https://shared-relay:9000
message_queue_username: admin
message_queue_password: password
relay_topic_prefix: production
managers:
  cluster_a:
    type: queued_slurm

Staging Pulsar (app.yml):

message_queue_url: https://shared-relay:9000
message_queue_username: admin
message_queue_password: password
relay_topic_prefix: staging
managers:
  cluster_a:
    type: queued_slurm

Production Galaxy (job_conf.yml):

runners:
  pulsar:
    load: galaxy.jobs.runners.pulsar:PulsarMQJobRunner
    proxy_url: https://shared-relay:9000
    proxy_username: admin
    proxy_password: password
    relay_topic_prefix: production

execution:
  environments:
    pulsar_jobs:
      runner: pulsar
      manager: cluster_a
      # ... other settings

Staging Galaxy (job_conf.yml):

runners:
  pulsar:
    load: galaxy.jobs.runners.pulsar:PulsarMQJobRunner
    proxy_url: https://shared-relay:9000
    proxy_username: admin
    proxy_password: password
    relay_topic_prefix: staging

execution:
  environments:
    pulsar_jobs:
      runner: pulsar
      manager: cluster_a
      # ... other settings

In this setup, the topics will be completely isolated:

  • Production: production_job_setup_cluster_a, production_job_status_update_cluster_a

  • Staging: staging_job_setup_cluster_a, staging_job_status_update_cluster_a

Topic Naming

Messages are organized by topic with automatic naming based on the optional prefix and manager name:

  • Job setup: job_setup (default manager, no prefix)

  • Job setup: job_setup_{manager_name} (named manager, no prefix)

  • Job setup: {prefix}_job_setup (default manager, with prefix)

  • Job setup: {prefix}_job_setup_{manager_name} (named manager, with prefix)

The same pattern applies to other message types:

  • Status requests: job_status_request, job_status_request_{manager_name}

  • Kill commands: job_kill, job_kill_{manager_name}

  • Status updates: job_status_update, job_status_update_{manager_name}

When a relay_topic_prefix is configured, it is prepended to all topic names:

  • production_job_setup

  • production_job_setup_cluster_a

  • production_job_status_update_cluster_a

This allows:

  • Multiple Pulsar instances to share the same relay (using different manager names)

  • Multiple independent Galaxy/Pulsar instance pairs to share the same relay (using different topic prefixes)
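
The naming scheme above can be expressed as a small function. The name topic_name is hypothetical, and using None to stand for the default manager and for an unset relay_topic_prefix is a convention chosen for this sketch rather than something taken from Pulsar.

```python
def topic_name(message_type, manager_name=None, prefix=None):
    """Build a relay topic name from the pieces listed above.

    ``manager_name=None`` stands for the default manager and ``prefix=None``
    for an unset relay_topic_prefix (both conventions are assumptions).
    """
    parts = [prefix, message_type, manager_name]
    return "_".join(part for part in parts if part)
```

For example, topic_name("job_setup", "cluster_a", "production") gives production_job_setup_cluster_a, matching the production example earlier in this section.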

Comparison with AMQP Mode

Feature                AMQP (RabbitMQ)          pulsar-relay
---------------------  -----------------------  ---------------------
Protocol               AMQP over TCP            HTTP/HTTPS
Dependencies           kombu, RabbitMQ server   requests (built-in)
Deployment Complexity  Moderate (broker setup)  Simple (HTTP service)
Message Delivery       Push-based               Long-polling
Observability          Queue monitoring tools   HTTP access logs
SSL/TLS                Via AMQPS                Via HTTPS
Firewall Friendly      Moderate                 High (standard HTTP)

For more information on deploying pulsar-relay, see the pulsar-relay documentation.

Caching (Experimental)

Pulsar and its client can be configured to cache job input files. For some workflows this can result in a significant decrease in data transfer and greater throughput. On the Pulsar server side, the property file_cache_dir in app.yml must be set. See Galaxy’s job_conf.xml example file for information on configuring the client.

More discussion on this can be found in this galaxy-dev mailing list thread and future plans and progress can be tracked on this Trello card.