Unable to build pbs_python via scramble

Unable to build pbs_python via scramble

Peter Schmitt
Following the directions from here: https://wiki.galaxyproject.org/Admin/Config/Performance/Cluster#PBS

I'm trying to get pbs_python to work, as I'm using Torque to schedule Galaxy jobs.

Note: This is a fresh install of Galaxy from galaxy-dist on CentOS 5.10.

I have the pbs_python 4.4.0 module installed into a source-built Python 2.7.6.

I get the following error in the output of run.sh:

galaxy.jobs INFO 2014-03-03 15:46:45,485 Handler 'main' will load all configured runner plugins
Traceback (most recent call last):
  File "/nextgen3/galaxy/galaxy-dist-py27/lib/galaxy/webapps/galaxy/buildapp.py", line 39, in app_factory
    app = UniverseApplication( global_conf = global_conf, **kwargs )
  File "/nextgen3/galaxy/galaxy-dist-py27/lib/galaxy/app.py", line 130, in __init__
    self.job_manager = manager.JobManager( self )
  File "/nextgen3/galaxy/galaxy-dist-py27/lib/galaxy/jobs/manager.py", line 31, in __init__
    self.job_handler = handler.JobHandler( app )
  File "/nextgen3/galaxy/galaxy-dist-py27/lib/galaxy/jobs/handler.py", line 30, in __init__
    self.dispatcher = DefaultJobDispatcher( app )
  File "/nextgen3/galaxy/galaxy-dist-py27/lib/galaxy/jobs/handler.py", line 568, in __init__
    self.job_runners = self.app.job_config.get_job_runner_plugins( self.app.config.server_name )
  File "/nextgen3/galaxy/galaxy-dist-py27/lib/galaxy/jobs/__init__.py", line 449, in get_job_runner_plugins
    module = __import__( module_name )
  File "/nextgen3/galaxy/galaxy-dist-py27/lib/galaxy/jobs/runners/pbs.py", line 31, in <module>
    raise Exception( egg_message % str( e ) )
Exception:

The 'pbs' runner depends on 'pbs_python' which is not installed or not
configured properly.  Galaxy's "scramble" system should make this installation
simple, please follow the instructions found at:

    http://wiki.galaxyproject.org/Admin/Config/Performance/Cluster

Additional errors may follow:
pbs-python==4.3.5

This is the job_conf.xml file:

<?xml version="1.0"?>
<job_conf>
    <plugins>
        <plugin id="pbs" type="runner" load="galaxy.jobs.runners.pbs:PBSJobRunner"/>
    </plugins>
    <handlers>
        <handler id="dirigo"/>
    </handlers>
    <destinations default="pbs_default">
        <destination id="pbs_default" runner="pbs"/>
                <param id="Resource_List">walltime=72:00:00,nodes=1:ppn=4</param>
    </destinations>
</job_conf>

I did not use the scramble system to install the pbs_python module.  I downloaded the latest version
available and installed it from the root account.


--
Pete Schmitt


___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/
Reply | Threaded
Open this post in threaded view
|

Re: Unable to build pbs_python via scramble

Nate Coraor (nate@bx.psu.edu)
Hi Pete,

Your subject says you are unable to build pbs_python using scramble.
Could you provide details on what's not working there?

Galaxy is not going to work with a different version of pbs_python
unless a bit of hacking is done to make it attempt to do so. We test
Galaxy against specific versions of its dependencies, which is why we
pin those versions and provide the scramble script to (hopefully) make
it painless to build them yourself when that's necessary, as it always
is with pbs_python.
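
For reference, the scramble invocation for pbs_python is a single command, run as the Galaxy user from the galaxy-dist directory; the LIBTORQUE_DIR value below is only an example and should point at your Torque library directory:

% cd galaxy-dist
% LIBTORQUE_DIR=/opt/torque/active/lib python scripts/scramble.py -e pbs_python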

--nate


Re: Unable to build pbs_python via scramble

Peter Schmitt
I uninstalled pbs_python 4.4.0 and reinstalled 4.3.5 as root (not using scramble).

When I try this method as the galaxy user:
LIBTORQUE_DIR=/opt/torque/active/lib python scripts/scramble.py -e pbs_python

I get the following output:

src/pbs_wrap.c:2813: warning: function declaration isn't a prototype
gcc -pthread -shared build/temp.linux-x86_64-2.7/src/pbs_wrap.o -L/opt/torque/4.2.7/lib -L. -ltorque -lpython2.7 -o build/lib.linux-x86_64-2.7/_pbs.so -L/opt/torque/4.2.7/lib -ltorque -Wl,-rpath -Wl,/opt/torque/4.2.7/lib
/usr/bin/ld: cannot find -lpython2.7

LD_LIBRARY_PATH="/opt/python/2.7.6/lib:/opt/torque/active/lib:/usr/local/lib"

I'm not sure why it can't find the libpython2.7.so file. When I built it as root, there was a -L/opt/python/2.7.6/lib in that gcc line.



--
Pete Schmitt
Technical Director: 
   Discovery Cluster
   NH INBRE Grid
   Computational Genetics Lab
   Institute for Quantitative
          Biomedical Sciences
Dartmouth College, HB 6203
L12 Berry/Baker Library
Hanover, NH 03755

Phone: 603-646-8109

http://discovery.dartmouth.edu
http://columbia.dartmouth.edu/grid
http://www.epistasis.org
http://iQBS.org





Re: Unable to build pbs_python via scramble

Nate Coraor (nate@bx.psu.edu)
Pete,

Is it possible that `python` as the Galaxy user is calling a python other than /opt/python/2.7.6/bin/python (e.g. the system version without the -dev/-devel package installed)? To make sure you're using the right python and that it won't pick up conflicting modules, the safest bet is a Python virtualenv. It's easy to set up; just make sure you create it with the correct python executable, for example:

% wget https://pypi.python.org/packages/source/v/virtualenv/virtualenv-1.11.4.tar.gz
% tar zxf virtualenv-1.11.4.tar.gz
% /opt/python/2.7.6/bin/python ./virtualenv-1.11.4/virtualenv.py galaxyvenv
% . ./galaxyvenv/bin/activate
% cd galaxy-dist
% python ./scripts/fetch_eggs.py
% LIBTORQUE_DIR=/opt/torque/active/lib python scripts/scramble.py -e pbs_python
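
A couple of quick checks can also confirm which python (and which libpython) the galaxy user actually picks up; the paths below are the ones from this thread, so adjust them for your system. Note that LD_LIBRARY_PATH only affects runtime loading; at link time, -lpython2.7 is resolved through -L directories (or gcc's LIBRARY_PATH), which is presumably why the root build worked once -L/opt/python/2.7.6/lib appeared on the gcc line:

% which python
% python -c 'import sys; print sys.prefix'
% ls /opt/python/2.7.6/lib/libpython2.7.so*
% export LIBRARY_PATH=/opt/python/2.7.6/lib    # link-time library search path for gcc, if needed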

--nate



Re: Unable to build pbs_python via scramble

Peter Schmitt
Okay, I gave that a try. Although it didn't succeed, the error has changed:

(galaxyvenv)[galaxy@dirigo galaxy-dist]$ LIBTORQUE_DIR=/opt/torque/active/lib python scripts/scramble.py -e pbs_python
fetch_one(): Using existing source, remove to download again:
  /nextgen3/galaxy/galaxy-dist-py27/scripts/scramble/archives/pbs_python-4.3.5.tar.gz
unpack_source(): Removing old build directory at:
  /nextgen3/galaxy/galaxy-dist-py27/scripts/scramble/build/py2.7-linux-x86_64-ucs2/pbs_python
unpack_source(): Unpacked to:
  /nextgen3/galaxy/galaxy-dist-py27/scripts/scramble/build/py2.7-linux-x86_64-ucs2/pbs_python
copy_build_script(): Using build script /nextgen3/galaxy/galaxy-dist-py27/scripts/scramble/scripts/pbs_python.py
run_scramble_script(): Beginning build
run_scramble_script(): Executing in /nextgen3/galaxy/galaxy-dist-py27/scripts/scramble/build/py2.7-linux-x86_64-ucs2/pbs_python:
  /nextgen3/galaxy/galaxyvenv/bin/python scramble.py
checking for pbs-config... /opt/torque/active/lib/../bin/pbs-config
Found torque version: 4.2.7
checking for python... /nextgen3/galaxy/galaxyvenv/bin/python
checking for python version... 2.7
checking for python platform... linux2
checking for python script directory... ${prefix}/lib/python2.7/site-packages
checking for python extension module directory... ${exec_prefix}/lib/python2.7/site-packages
configure: creating ./config.status
config.status: creating Makefile
config.status: creating setup.py
scramble(): Patching setup.py
usage: scramble.py [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
   or: scramble.py --help [cmd1 cmd2 ...]
   or: scramble.py --help-commands
   or: scramble.py cmd --help

error: invalid command 'egg_info'
Traceback (most recent call last):
  File "scripts/scramble.py", line 50, in <module>
    egg.scramble()
  File "/nextgen3/galaxy/galaxy-dist-py27/lib/galaxy/eggs/scramble.py", line 57, in scramble
    self.run_scramble_script()
  File "/nextgen3/galaxy/galaxy-dist-py27/lib/galaxy/eggs/scramble.py", line 206, in run_scramble_script
    raise ScrambleFailure( self, "%s(): Egg build failed for %s %s" % ( sys._getframe().f_code.co_name, self.name, self.version ) )
galaxy.eggs.scramble.ScrambleFailure: run_scramble_script(): Egg build failed for pbs_python 4.3.5

Re: Unable to build pbs_python via scramble

Peter Schmitt
Are there any alternatives to pbs_python for interfacing with a Torque scheduler? This method appears to be a dead end.



Re: Unable to build pbs_python via scramble

Nate Coraor (nate@bx.psu.edu)
Hi Pete,

The latest error is pretty strange and not one I've encountered before. It suggests that scramble is not loading setuptools in place of distutils and thus does not have access to the setuptools extensions (notably, egg-related functionality). Something abnormal still seems to be going on with your python environment.
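
One quick way to check is to confirm, inside the activated galaxyvenv, that setuptools is importable and where it is coming from, since the 'egg_info' command is provided by setuptools rather than plain distutils (a sketch, nothing Galaxy-specific):

% python -c 'import setuptools; print setuptools.__file__'
% python -c 'import sys; print sys.executable'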

You can use drmaa if you like (this is known to work well). You will want to use the libdrmaa for Torque that's maintained by the Poznan Supercomputing and Networking Center, rather than the libdrmaa that can be built directly with the Torque source. PSNC libdrmaa for Torque/PBS can be found here: http://apps.man.poznan.pl/trac/pbs-drmaa
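
If you go that route, the build is the usual routine; the sketch below assumes the PSNC tarball uses a standard autotools build (check its README for the exact options) and uses /opt/pbs-drmaa as an example install prefix:

% tar xzf pbs-drmaa-1.x.y.tar.gz        # tarball downloaded from the PSNC site above
% cd pbs-drmaa-1.x.y
% ./configure --prefix=/opt/pbs-drmaa
% make
% make install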

--nate



Re: Unable to build pbs_python via scramble

Peter Schmitt
Hello Nate,

I have that version installed and was using it with older versions of Galaxy for a few years. Once I loaded this new version, it no longer worked with the old definition in the universe file: default_cluster_job_runner = drmaa:///

Do I need a job_conf.xml that uses the drmaa runner?



Re: Unable to build pbs_python via scramble

Nate Coraor (nate@bx.psu.edu)
The old-style URL syntax is supposed to continue to work; if you have any details on what's not working, I can look into it. That said, job_conf.xml is the way forward, and a job_conf.xml for the drmaa runner would be a pretty trivial change from the one you have for the pbs runner, e.g.:

<?xml version="1.0"?>
<job_conf>
    <plugins>
        <plugin id="pbs" type="runner" load="galaxy.jobs.runners.drmaa:DRMAAJobRunner"/>
    </plugins>
    <handlers>
        <handler id="dirigo"/>
    </handlers>
    <destinations default="pbs_default">
        <destination id="pbs_default" runner="pbs"/>
                <param id="nativeSpecification">-l walltime=72:00:00,nodes=1:ppn=4</param>
        </destination>
    </destinations>
</job_conf>

Just make sure you set $DRMAA_LIBRARY_PATH in your environment to the correct libdrmaa.so.
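
For example (the path is hypothetical; point it at wherever the PSNC libdrmaa.so was actually installed), set it in the environment that launches run.sh:

% export DRMAA_LIBRARY_PATH=/opt/pbs-drmaa/lib/libdrmaa.so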

--nate



No such file or directory: '/nextgen3/galaxy/galaxy-dist/database/job_working_directory/000/###/galaxy_624.o'

Peter Schmitt


In trying something simple, I used Galaxy to download data from UCSC Main. The data gets downloaded, but the job errors out. I verified that the job actually ran and completed successfully according to the scheduler, but I get errors like this:

galaxy.jobs.runners.drmaa DEBUG 2014-03-05 18:17:35,941 (624/46.dirigo.mdibl.org) state change: job finished normally
galaxy.jobs.runners ERROR 2014-03-05 18:17:36,060 (624/46.dirigo.mdibl.org) Job output not returned from cluster: [Errno 2] No such file or directory: '/nextgen3/galaxy/galaxy-dist/database/job_working_directory/000/624/galaxy_624.o'

There are no directories being created below the 000 directory. I verified that the directory tree is owned by galaxy and that the galaxy user can run jobs from the command line as a normal user.

I set the parameter "cleanup_job = never". It was set to "always", which is probably why the files were never there. Now the files are there, including the galaxy_###.o file, but galaxy still errors as above.

I had set the parameter "cluster_files_directory = database/pbs", but that doesn't seem to work any longer. The .o and .e files used to end up there.
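
For reference, these are the two settings as they would appear in the universe file (universe_wsgi.ini); the values shown are simply the ones described above:

cleanup_job = never
cluster_files_directory = database/pbs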

Here is an example:

(galaxyvenv)[galaxy@dirigo 630]$ ll
total 16
-rw------- 1 galaxy galaxy    0 Mar  5 19:29 galaxy_630.e
-rw-rw-r-- 1 galaxy galaxy    2 Mar  5 19:29 galaxy_630.ec
-rw------- 1 galaxy galaxy  940 Mar  5 19:29 galaxy_630.o
-rwxr-xr-x 1 galaxy galaxy 2429 Mar  5 19:29 galaxy_630.sh
-rw-rw-r-- 1 galaxy galaxy  138 Mar  5 19:29 galaxy.json
-rw-rw-r-- 1 galaxy galaxy 2139 Mar  5 19:29 metadata_in_HistoryDatasetAssociation_1182_o830e3
-rw-rw-r-- 1 galaxy galaxy   20 Mar  5 19:29 metadata_kwds_HistoryDatasetAssociation_1182_hOhPp7
-rw-rw-r-- 1 galaxy galaxy   55 Mar  5 19:29 metadata_out_HistoryDatasetAssociation_1182_Ynb70M
-rw-rw-r-- 1 galaxy galaxy    2 Mar  5 19:29 metadata_override_HistoryDatasetAssociation_1182_HsMljG
-rw-rw-r-- 1 galaxy galaxy   44 Mar  5 19:29 metadata_results_HistoryDatasetAssociation_1182_LxdsAZ
(galaxyvenv)[galaxy@dirigo 630]$ pwd
/nextgen3/galaxy/galaxy-dist/database/job_working_directory/000/630

Here is the error from this:

galaxy.jobs.runners.drmaa DEBUG 2014-03-05 19:31:37,731 (630/51.dirigo.mdibl.org) state change: job is running
galaxy.jobs.runners.drmaa DEBUG 2014-03-05 19:31:49,119 (630/51.dirigo.mdibl.org) state change: job finished normally
galaxy.jobs.runners ERROR 2014-03-05 19:31:50,225 (630/51.dirigo.mdibl.org) Job output not returned from cluster: [Errno 2] No such file or directory: '/nextgen3/galaxy/galaxy-dist/database/job_working_directory/000/630/galaxy_630.o'
galaxy.jobs DEBUG 2014-03-05 19:31:50,252 finish(): Moved /nextgen3/galaxy/galaxy-dist/database/job_working_directory/000/630/galaxy_dataset_856.dat to /nextgen3/galaxy/galaxy-dist/database/files/000/dataset_856.dat
galaxy.jobs DEBUG 2014-03-05 19:31:50,351 job 630 ended

On the Galaxy page, the history item shows in pink:
1 UCSC Main on Human: knownGene (chr22:1-51304566)
error
An error occurred with this dataset:
Job output not returned from cluster

But the dataset is there.

On 3/5/14, 3:33 PM, Nate Coraor wrote:
The old-style url syntax is supposed to continue to work, if you have any details on what's not working I can look in to it. That said, job_conf.xml is the way forward, and a job_conf.xml for the drmaa runner would be a pretty trivial change from the one you have for the pbs runner, e.g.:

<?xml version="1.0"?>
<job_conf>
    <plugins>
        <plugin id="pbs" type="runner" load="galaxy.jobs.runners.drmaa:DRMAAJobRunner"/>
    </plugins>
    <handlers>
        <handler id="dirigo"/>
    </handlers>
    <destinations default="pbs_default">
        <destination id="pbs_default" runner="pbs"/>
                <param id="nativeSpecification">-l walltime=72:00:00,nodes=1:ppn=4</param>
        </destination>
    </destinations>
</job_conf>

Just make sure you set $DRMAA_LIBRARY_PATH in your environment to the correct libdrmaa.so.
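
For example (the path below is a placeholder; point it at wherever the PSNC pbs-drmaa library actually gets installed):

% export DRMAA_LIBRARY_PATH=/usr/local/lib/libdrmaa.so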

--nate


On Wed, Mar 5, 2014 at 3:27 PM, Pete Schmitt <[hidden email]> wrote:
Hello Nate,

I have that version installed and was using it with older versions of Galaxy for a few years.  Once I loaded this new version, it no longer worked with the old definitions in the universe file using: default_cluster_job_runner = drmaa:///

Do I need a job_conf.xml that uses the drmaa runner?



On 3/5/14, 3:21 PM, Nate Coraor wrote:
Hi Pete,

The latest error is pretty strange and not one I've encountered before. It suggests that scramble is not loading setuptools in place of distutils and thus does not have access to the setuptools extensions (notably, egg-related functionality). Something abnormal still seems to be going on with your python environment.

You can use drmaa if you like (this is known to work well). You will want to use the libdrmaa for Torque that's maintained by the Poznan Supercomputing and Networking Center, rather than the libdrmaa that can be built directly with the Torque source. PSNC libdrmaa for Torque/PBS can be found here: http://apps.man.poznan.pl/trac/pbs-drmaa
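
Building it is a fairly standard autotools affair. Roughly (the version number and prefix below are placeholders, and the configure options for pointing at your Torque install should be checked with ./configure --help):

% tar xzf pbs-drmaa-1.0.x.tar.gz
% cd pbs-drmaa-1.0.x
% ./configure --prefix=/opt/pbs-drmaa
% make
% make install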

--nate


On Wed, Mar 5, 2014 at 3:10 PM, Pete Schmitt <[hidden email]> wrote:
Are there any other alternatives to pbs_python for interfacing to a Torque scheduler?  This method appears to be a dead end.



On 3/4/14, 9:51 AM, Nate Coraor wrote:
Pete,

Is it possible that `python` as the Galaxy user is calling a python other than /opt/python/2.7.6/bin/python (e.g. the system version without the -dev/-devel package installed)?  The safest bet for ensuring you're using the right python, and that it won't pick up conflicting modules, is a Python virtualenv. This is easy to set up; just make sure you create it with the correct python executable, for example:

% wget https://pypi.python.org/packages/source/v/virtualenv/virtualenv-1.11.4.tar.gz
% tar zxf virtualenv-1.11.4.tar.gz
% /opt/python/2.7.6/bin/python ./virtualenv-1.11.4/virtualenv.py galaxyvenv
% . ./galaxyvenv/bin/activate
% cd galaxy-dist
% python ./scripts/fetch_eggs.py
% LIBTORQUE_DIR=/opt/torque/active/lib python scripts/scramble.py -e pbs_python

--nate


On Mon, Mar 3, 2014 at 5:19 PM, Pete Schmitt <[hidden email]> wrote:
I uninstalled pbs_python 4.4.0 and reinstalled 4.3.5 as root (not using scramble)

When I try this method as the galaxy user:
LIBTORQUE_DIR=/opt/torque/active/lib python scripts/scramble.py -e pbs_python

I get the following output:

src/pbs_wrap.c:2813: warning: function declaration isn't a prototype
gcc -pthread -shared build/temp.linux-x86_64-2.7/src/pbs_wrap.o -L/opt/torque/4.2.7/lib -L. -ltorque -lpython2.7 -o build/lib.linux-x86_64-2.7/_pbs.so -L/opt/torque/4.2.7/lib -ltorque -Wl,-rpath -Wl,/opt/torque/4.2.7/lib
/usr/bin/ld: cannot find -lpython2.7

LD_LIBRARY_PATH="/opt/python/2.7.6/lib:/opt/torque/active/lib:/usr/local/lib"

I'm not sure why it can't find the libpython2.7.so file.  When I built it as root, there was a -L/opt/python/2.7.6/lib in that gcc line.
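
(One untested workaround, assuming gcc drives the link and the Python build lives under /opt/python/2.7.6: LD_LIBRARY_PATH only affects runtime loading, whereas gcc consults LIBRARY_PATH when resolving -l options at link time, so exporting it before scrambling may help.)

% export LIBRARY_PATH=/opt/python/2.7.6/lib:$LIBRARY_PATH
% LIBTORQUE_DIR=/opt/torque/active/lib python scripts/scramble.py -e pbs_python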




On 3/3/14, 4:13 PM, Nate Coraor wrote:
Hi Pete,

Your subject says you are unable to build pbs_python using scramble.
Could you provide details on what's not working there?

Galaxy is not going to work with a different version of pbs_python
unless a bit of hacking is done to make it attempt to do so. We test
Galaxy with specific versions of its dependencies, which is why we
control the versions of those dependencies and provide the scramble
script to (hopefully) make it painless to build them yourself, should
it be necessary to do so, as is always the case with pbs_python.

--nate


--
Pete Schmitt
Technical Director: 
   Discovery Cluster
   NH INBRE Grid
   Computational Genetics Lab
   Institute for Quantitative
          Biomedical Sciences
Dartmouth College, HB 6203
L12 Berry/Baker Library
Hanover, NH 03755

Phone: 603-646-8109

http://discovery.dartmouth.edu
http://columbia.dartmouth.edu/grid
http://www.epistasis.org
http://iQBS.org





Re: No such file or directory: '/nextgen3/galaxy/galaxy-dist/database/job_working_directory/000/###/galaxy_624.o'

Nate Coraor (nate@bx.psu.edu)
Hi Pete,

I'd suggest setting retry_job_output_collection > 0 in universe_wsgi.ini. This is usually a symptom of attribute caching on network filesystems.
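
For example, in the main app section of universe_wsgi.ini (the value here is only an illustration; any integer greater than zero enables the retries):

# re-check for the job output files this many times before failing the job
retry_job_output_collection = 3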

--nate



Re: No such file or directory: '/nextgen3/galaxy/galaxy-dist/database/job_working_directory/000/###/galaxy_624.o'

Nate Coraor (nate@bx.psu.edu)
Just a warning, -noac has a pretty severe impact on performance in my (and others' on this list) experience. You might also want to try messing with the 'lookupcache' mount option.
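
For comparison, the two approaches look roughly like this as NFS mount options (the server and export names below are placeholders): noac disables attribute caching entirely, while lookupcache=positive only stops negative lookups from being cached and is usually much cheaper:

# heavy-handed: no attribute caching at all
fileserver:/export/nextgen3  /nextgen3  nfs  rw,noac  0 0
# lighter-weight option worth testing first
fileserver:/export/nextgen3  /nextgen3  nfs  rw,lookupcache=positive  0 0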

--nate


On Thu, Mar 6, 2014 at 2:20 PM, Pete Schmitt <[hidden email]> wrote:
Hello Nate,

I had that parameter set to 1, but I upped it to 5.  I also added -noac to the NFS mounts for /nextgen3.

That appears to have fixed it.

Thank you!!!



Re: No such file or directory: '/nextgen3/galaxy/galaxy-dist/database/job_working_directory/000/###/galaxy_624.o'

Peter Schmitt
Hello Nate,

I had that parameter set to 1, but I upped it to 5.  I also added -noac to the NFS mounts for /nextgen3.

That appears to have fixed it.

Thank you!!!


On 3/6/14, 1:57 PM, Nate Coraor wrote:
Hi Pete,

I'd suggest setting retry_job_output_collection > 0 in universe_wsgi.ini. This is usually a symptom of attribute caching on network filesystems.

--nate



___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/