Pulsar on a remote (SGE) cluster

Pulsar on a remote (SGE) cluster

Joe Greer

Hi all,

 

We have Pulsar working in a synchronous fashion using RESTful services on our local network. Now, we’re trying to use Pulsar on a remote (SGE) cluster and we will not have the ability to mount a shared file system.

 

What is the best way to use Galaxy to queue jobs? If Pulsar is the answer, how do we defer to the cluster’s queue and have Pulsar wait on it? How should Pulsar notify Galaxy when multiple jobs are done running?

 

We’ve looked through the documentation and listservs and haven’t found anything directly related.

 

Any help or suggestions would be great.

 

Thanks,

 

Joe Greer

Proteomics Center of Excellence

Northwestern University

 


___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  https://lists.galaxyproject.org/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/
Re: Pulsar on a remote (SGE) cluster

John Chilton-4
On Thu, Jan 29, 2015 at 4:43 PM, Joseph Brent Greer
<[hidden email]> wrote:

> Hi all,
>
>
>
> We have Pulsar working in a synchronous fashion using RESTful services on
> our local network. Now, we’re trying to use Pulsar on a remote (SGE) cluster
> and we will not have the ability to mount a shared file system.
>
>
>
> What is the best way to use Galaxy to queue jobs?

Having a shared file system is definitely the best way to go if at all
possible. If you definitely cannot make that happen - then Pulsar is
probably the way to go - but it has limitations - it cannot, for
instance, leverage tool shed installed dependencies and it certainly
doesn't scale as well as Galaxy itself yet. I am working on these
problems and am happy to help work through the problems you encounter
- but I just wanted to throw out the warning about Pulsar.

> If Pulsar is the answer,
> how do we defer to the cluster’s queue and have Pulsar wait on it?

So it sounds like you have experience setting up Pulsar - you will now
need to set it up on a node connected to the remote cluster and open
a port for it (unless you want to set up a message queue - but I
recommend the RESTful mode when possible - it seems more robust
currently).

Once you have Pulsar set up - you will need to configure it to talk to
your SGE cluster. For that you will need to install an SGE DRMAA
library on the node (it may already be available). Then copy
local_env.sh.sample to local_env.sh in Pulsar and set
DRMAA_LIBRARY_PATH:

export DRMAA_LIBRARY_PATH=/path/to/your/libdrmaa.so

and finally set up a job_managers.ini file (cp job_managers.ini.sample
job_managers.ini) and change the default section to something like:

[manager:_default_]
type=queued_drmaa

More information about configuring job managers is available here:
https://pulsar.readthedocs.org/en/latest/job_managers.html
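
A quick way to catch typos in job_managers.ini before starting Pulsar
is to parse it yourself. This is only a sketch: it assumes the file is
plain INI that Python's configparser can read, and the helper name
default_manager_type is made up for illustration (it is not part of
Pulsar).

```python
import configparser

# The section from the message above, inlined so the sketch runs on its own.
SAMPLE = """\
[manager:_default_]
type=queued_drmaa
"""


def default_manager_type(ini_text):
    """Return the `type` of the [manager:_default_] section, or None.

    Hypothetical helper for illustration; not a Pulsar API.
    """
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    section = "manager:_default_"
    if not parser.has_section(section):
        return None
    return parser.get(section, "type", fallback=None)


if __name__ == "__main__":
    # For a real file, read it first: open("job_managers.ini").read()
    print(default_manager_type(SAMPLE))
```

If this prints None, the default section is missing or misspelled, so
Pulsar would presumably not pick up the DRMAA manager either.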

Finally - you need to configure Galaxy to use the correct Pulsar job
runner - it might look something like this:
https://gist.github.com/jmchilton/ca39ef1a3241d9074121

That should be it - and then Galaxy should send jobs to the cluster as needed.
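
If jobs never reach SGE, one thing worth ruling out early is whether
the library named by DRMAA_LIBRARY_PATH can be loaded at all on the
Pulsar node. The sketch below only tries to load the shared object with
ctypes; it does not open a DRMAA session or talk to SGE, and
check_drmaa_library is a hypothetical helper name, not something Pulsar
provides.

```python
import ctypes
import os


def check_drmaa_library(path=None):
    """Return (ok, message) for `path`, defaulting to $DRMAA_LIBRARY_PATH.

    Hypothetical helper for illustration; not part of Pulsar or drmaa.
    """
    path = path or os.environ.get("DRMAA_LIBRARY_PATH")
    if not path:
        return False, "DRMAA_LIBRARY_PATH is not set"
    if not os.path.isfile(path):
        return False, "no such file: " + path
    try:
        # Only loads the shared object; no DRMAA session is opened.
        ctypes.CDLL(path)
    except OSError as exc:
        return False, "could not load %s: %s" % (path, exc)
    return True, "loaded " + path


if __name__ == "__main__":
    ok, message = check_drmaa_library()
    print("OK" if ok else "FAILED", "-", message)
```

Run it in a shell where local_env.sh has been sourced, so that it sees
the same environment Pulsar would.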

> How should Pulsar notify Galaxy when multiple jobs are done running?

Galaxy will poll the remote Pulsar server to determine when the jobs
are complete. If you cannot open a port on the Pulsar side for Galaxy
to do this polling - then you will need to set up a message queue and
then configure both sides to use that - I would definitely try to bully
your cluster admins into opening that port before resorting to that
though.
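
To make the polling idea concrete, here is a rough sketch of a loop
that watches several jobs at once. It is not Galaxy's actual runner
code. The status names in TERMINAL_STATES are assumptions, and
get_status is a caller-supplied function (hypothetical) that would wrap
whatever request you use to ask Pulsar about one job.

```python
import time

# Assumed status names; check what your Pulsar version actually reports.
TERMINAL_STATES = {"complete", "failed", "cancelled"}


def poll_until_done(get_status, job_ids, interval=5.0, max_rounds=None,
                    sleep=time.sleep):
    """Poll every pending job once per round until all reach a terminal state.

    Returns (finished, pending): a dict of job_id -> final status, and the
    set of job ids still unfinished if max_rounds ran out.
    """
    pending = set(job_ids)
    finished = {}
    rounds = 0
    while pending:
        for job_id in sorted(pending):
            status = get_status(job_id)
            if status in TERMINAL_STATES:
                finished[job_id] = status
        pending.difference_update(finished)
        rounds += 1
        if not pending or (max_rounds is not None and rounds >= max_rounds):
            break
        sleep(interval)  # wait before asking the server again
    return finished, pending


if __name__ == "__main__":
    # Fake status source so the sketch runs without a Pulsar server.
    fake = {"job1": iter(["queued", "running", "complete"]),
            "job2": iter(["running", "failed"])}
    done, left = poll_until_done(lambda j: next(fake[j]), ["job1", "job2"],
                                 interval=0)
    print(done, left)
```

Each job is tracked independently, so several jobs finishing at
different times need no special handling.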

>
>
>
> We’ve looked through the documentation and listservs and haven’t found
> anything directly related.

Pulsar was previously called the LWR - so there are more conversations
related to it on galaxy-dev referring to it as the LWR.

>
>
>
> Any help or suggestions would be great.

Hopefully this helps and good luck!

-John
