Galaxy on HPC and Bright Cluster Manager?

Carlos Lijeron

Good day everyone,

Has any of you been able to implement Galaxy on an HPC cluster using Bright Cluster Manager as the main DRM?  I noticed that only a few DRMs are known to work with Galaxy, and the list (below) does not include Bright.  Any advice or ideas will be greatly appreciated.

* TORQUE Resource Manager
* PBS Professional
* Open Grid Engine
* Univa Grid Engine (previously known as Sun Grid Engine and Oracle Grid Engine)
* Platform LSF
* HTCondor
* Slurm
* Galaxy Pulsar (formerly LWR)

Thanks.

Carlo


___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  https://lists.galaxyproject.org/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/

Re: Galaxy on HPC and Bright Cluster Manager?

John Chilton-4
Hello Carlos,

  I have never heard of anyone running Galaxy with Bright Cluster
Manager (though hopefully someone will chime in if they have). If you
are interested in adding support, it should be possible. One
complication is that Bright Cluster Manager doesn't appear to have a
DRMAA interface (http://www.drmaa.org/), which is the most direct way
to support a new DRM. Without that, my approach would be to build a new
CLI runner.

There are a few examples here that one can use as a template:

https://github.com/galaxyproject/galaxy/tree/dev/lib/galaxy/jobs/runners/util/cli/job

I guess you would have to write a new one targeting cmsub. You would
also need to be able to parse a job status somehow. I haven't
figured out how to do that from the documentation, but I assume there
is a way.
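
To make the shape of such a plugin concrete, here is a minimal, self-contained sketch. It does not import Galaxy. The class name, the method names (loosely modeled on the examples in the directory linked above), the state names, and the one-job-per-line status format are all assumptions for illustration. In particular, the status command is a placeholder, since it is not clear from the documentation how Bright reports job status.

```python
import shlex
from typing import Dict

# Job states in the style Galaxy's runners use (names are illustrative here).
QUEUED, RUNNING, OK, ERROR = "queued", "running", "ok", "error"


class CmsubSketch:
    """Illustrative stand-in for a CLI job plugin; not Galaxy's real base class."""

    # ASSUMPTION: placeholder state names. The real status vocabulary has to
    # come from the cluster's own documentation.
    STATE_MAP = {"PENDING": QUEUED, "RUNNING": RUNNING, "DONE": OK, "FAILED": ERROR}

    def submit(self, script_file: str) -> str:
        # ASSUMPTION: cmsub takes the job script path as its argument.
        return "cmsub " + shlex.quote(script_file)

    def get_status(self) -> str:
        # ASSUMPTION: hypothetical command; replace with whatever lists jobs.
        return "hypothetical-status-command"

    def parse_status(self, status_output: str) -> Dict[str, str]:
        """Turn '<numeric job id> <state>' lines into {job_id: state}."""
        states = {}
        for line in status_output.splitlines():
            parts = line.split()
            if len(parts) != 2 or not parts[0].isdigit():
                continue  # skip headers, blank lines, anything unexpected
            job_id, raw_state = parts
            # Unknown states are reported as errors rather than silently dropped.
            states[job_id] = self.STATE_MAP.get(raw_state.upper(), ERROR)
        return states
```

The methods that build commands only return strings; running them (locally or over SSH) would be left to the runner that loads the plugin. Only the parsing step needs real logic, and that is the part that depends on what the scheduler actually prints.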

It looks like Bright supports running SGE, SLURM, and Torque on the
cluster. Running one of those and interfacing with it directly
might be a better approach for Galaxy (and for other users, if your
cluster has them).

-John







On Wed, Apr 22, 2015 at 8:56 AM, Carlos Lijeron
<[hidden email]> wrote:

> Good day everyone,
>
>
>
> Has anyone of you been able to implement Galaxy on a HPC using Bright
> Cluster Manager as the main DRM?  I noticed that only a few have been known
> to work with Galaxy, but the list does not include Bright.  Any advice/ideas
> will be greatly appreciated.

Re: Galaxy on HPC and Bright Cluster Manager?

David Trudgian
Carlo,

We have Bright Cluster Manager in use on our cluster for node provisioning etc., but the actual job scheduler in use in our case is SLURM, which we use directly.

Are you using one of the integrated workload managers such as SLURM / SGE / TORQUE directly, or indirectly via cmsub?

The easiest way for us to come up with some kind of advice would be for you to provide an example of a generic job script you are using on your system. If you're using cmsub, is it specifying a --wlmanager etc.?

DT


________________________________

UT Southwestern Medical Center
The future of medicine, today.


Re: Galaxy on HPC and Bright Cluster Manager?

David Trudgian
Hi Carlos, sorry for the slow reply.

With a small 10-user setup (I assume just one group), your installation is probably going to be a lot less complex than ours. We are running services for users from multiple labs and departments, so a chief concern was making sure private datasets always stay private, which is relatively involved when using Galaxy to run jobs on a general-purpose shared cluster.

If I had to recommend a job scheduler, I would suggest SLURM, since it's probably the most 'fashionable' choice at present. You also have the advantage that SLURM is used with Galaxy in the Galaxy Docker image etc., which ensures people notice if the Galaxy->DRMAA->SLURM setup isn't working. I've also used Galaxy with GridEngine in the past, and that was fine, but it is becoming a less common choice of scheduler.
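
In a Galaxy->DRMAA->SLURM setup, per-job resource requests are typically handed to the scheduler as a "native specification" string of sbatch-style options. The sketch below only builds such a string. The function name, its parameters, and the example values are hypothetical, and which options your DRMAA library actually accepts should be checked against its documentation.

```python
from typing import Optional


def slurm_native_spec(ntasks: int = 1,
                      mem_mb: Optional[int] = None,
                      time_limit: Optional[str] = None,
                      partition: Optional[str] = None) -> str:
    """Build an sbatch-style option string for a DRMAA native specification."""
    if ntasks < 1:
        raise ValueError("ntasks must be at least 1")
    options = [f"--ntasks={ntasks}"]
    if mem_mb is not None:
        # sbatch interprets a bare number for --mem as megabytes.
        options.append(f"--mem={mem_mb}")
    if time_limit is not None:
        # Wall-clock limit, e.g. "02:00:00" for two hours.
        options.append(f"--time={time_limit}")
    if partition is not None:
        # ASSUMPTION: partition names are site-specific.
        options.append(f"--partition={partition}")
    return " ".join(options)


print(slurm_native_spec(ntasks=4, mem_mb=8192, time_limit="02:00:00"))
```

Keeping this in one small function makes it easy to give different tools different resource requests while leaving the rest of the job configuration alone.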

Having said that, I don't think that the job scheduler needs to be your biggest concern. I would focus most on the file system and user account setup you are going to need.

* How are you going to migrate from standalone Galaxy to a situation where your new cluster can see the Galaxy data files, tools, etc.? Are you purchasing storage with the cluster? If so, do you move Galaxy onto that storage, or can you mount existing Galaxy data onto the cluster nodes? If you can do that, is your networking such that performance is sufficient for the type of analysis you are going to run?

* Do you need to, or will you need to, keep track of per-user usage of the cluster for things that Galaxy will be running? If not then you can just have a galaxy user on your cluster and things are pretty easy for file permissions etc. If you need to track jobs per-user then it becomes more complex, and the solution depends on how much privacy you need for datasets, how your cluster will authenticate users etc.

The filesystem and user accounts issues are, in my mind, the ones to focus on. You can always modify Galaxy's config to switch to a different job scheduler fairly easily. You cannot as easily move around large amounts of data, and reconcile local vs cluster user accounts, should that be necessary.
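
One cheap way to test the filesystem side early is a small preflight script, run as the galaxy user on both the Galaxy server and a compute node, that checks whether the shared directories are visible and writable. This is only a sketch: the function name is hypothetical, and you would pass in whatever paths your installation actually uses.

```python
import os
import tempfile
from typing import Dict, Iterable


def check_shared_paths(paths: Iterable[str]) -> Dict[str, str]:
    """Report whether each directory exists and is writable by the current user."""
    report = {}
    for path in paths:
        if not os.path.isdir(path):
            report[path] = "missing"
            continue
        try:
            # Create (and automatically remove) a real file, because permission
            # bits alone may not tell the whole story on network file systems.
            with tempfile.NamedTemporaryFile(dir=path):
                pass
            report[path] = "ok"
        except OSError:
            report[path] = "not writable"
    return report
```

If the report differs between the Galaxy server and a compute node, that points at a mount or account problem worth solving before any scheduler is configured.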

Cheers,

Dave Trudgian

-----Original Message-----
From: Carlos Lijeron [mailto:[hidden email]]
Sent: Wednesday, April 22, 2015 10:54 AM
To: David Trudgian; John Chilton
Cc: RODRIGO GONZALEZ SERRANO
Subject: Re: [galaxy-dev] Galaxy on HPC and Bright Cluster Manager?

Hello David,

Thank you for the great feedback.  We are at Hunter College in NYC, part of the City University of New York.  We recently ordered the cluster, which comes with Bright Cluster Manager, and our PI wants to implement Galaxy for all the users (about 10) on the cluster and manage all job submissions through a job scheduler.

So, to answer your question, we are not really using any scheduler at this point, only a standalone server with a local installation of Galaxy.  Our cluster should be assembled and installed by the end of May, so I'm trying to gather as much information as possible in preparation for the deployment.

Based on your experience, what do you think I should focus on to ensure we maximize outcome and reduce the possibility of mistakes?  In other words, any lessons learned that you would like to share will be greatly appreciated.



Thanks again,


Carlos Lijeron.

