Workflow Compatibility

Katherine Beaulieu
Hi Everyone,
Would anyone be able to tell me the conditions that make a tool non-workflow compatible? I have a tool that imports files from a third-party application and auto-detects the file format. There is also an option to upload multiple files at once, so the tool always uploads at least two files. From what I have described, can anyone see why this tool would not be able to send one of its files to the next tool in the chain, e.g. a text manipulation tool?
Thanks,
Katherine

___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  https://lists.galaxyproject.org/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/

Re: Workflow Compatibility

Peter Cock
Hi Katherine,

Are you asking about compatibility staying on the same Galaxy
instance, or the harder problem of compatibility sharing workflows
between Galaxy servers?

Taking data from input Galaxy datasets should be fine, anything else
may not be portable. Even the relatively simple case of using
datafiles referenced via an example.loc file (e.g. BLAST databases)
would require the entries in the example.loc file be synchronised
between Galaxy instances, and the associated data files too.
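
As a rough illustration of the synchronisation problem, here is a minimal Python sketch that compares the entries of an example.loc file from two instances. It assumes the usual tab-separated layout with '#' comment lines; the function names, column names, and paths are made up for the example and are not part of Galaxy.

```python
import csv
import io


def read_loc_entries(text):
    """Parse tab-separated .loc content, skipping blank lines and '#' comments."""
    rows = csv.reader(io.StringIO(text), delimiter="\t")
    return {tuple(row) for row in rows if row and not row[0].startswith("#")}


def loc_differences(text_a, text_b):
    """Return (only_in_a, only_in_b) as sorted lists of entries."""
    a, b = read_loc_entries(text_a), read_loc_entries(text_b)
    return sorted(a - b), sorted(b - a)


# Hypothetical contents of example.loc on two Galaxy instances.
server_a = "#value\tname\tpath\nnt\tNCBI nt\t/data/blast/nt\nnr\tNCBI nr\t/data/blast/nr\n"
server_b = "#value\tname\tpath\nnt\tNCBI nt\t/data/blast/nt\n"

only_a, only_b = loc_differences(server_a, server_b)
print(only_a)  # entries present on server A but missing on server B
print(only_b)  # entries present on server B but missing on server A
```

A workflow that selects an entry listed in only_a would have nothing to refer to on the second server, and even matching entries still need the data files themselves to be present at the listed paths.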

Peter



Re: Workflow Compatibility

Katherine Beaulieu


On Fri, Aug 26, 2016 at 10:42 AM, Katherine Beaulieu <[hidden email]> wrote:
Hi Peter,
I don't think I explained myself well. What I meant was: if I have a tool that takes multiple file paths and outputs multiple Galaxy datasets to the history, would this tool be workflow compatible, meaning capable of being part of any workflow? From the behaviour I am seeing now I assume it isn't, but I wanted to confirm that this isn't supported functionality.


Re: Workflow Compatibility

Langhorst, Brad
Hi Katherine:

Multi-input/output should work fine…

You could have a look at the bowtie wrappers for an example.

It takes up to 2 fastq files and a reference. It can output a few files (unmapped, mapped, etc.)


Brad

Brad Langhorst, Ph.D.
Development Scientist
New England Biolabs





Re: Workflow Compatibility

Katherine Beaulieu
Hi Brad,
What if the inputs are multiple file paths that the user types in rather than actual files, which means the tool can't be executed in batch mode? Would it still be workflow compatible at that point? Thanks for pointing me to the Bowtie wrappers as an example.
Katherine


Re: Workflow Compatibility

Langhorst, Brad
Hi Katherine:

I’d recommend not having users type in paths if at all possible (they will make frustrating mistakes).
If there is a fixed selection of these, maybe consider turning them into dropdown lists.

Either way, these would be no different from, e.g., user-specified downsampling amounts, library names, etc.
I have tools that accept many of these.


Brad


Brad Langhorst, Ph.D.
Development Scientist
New England Biolabs





Re: Workflow Compatibility

Katherine Beaulieu
Hi Brad,
So this is possible with multiple-selection dropdown lists? Do you have an example of a tool that can do this, or would it be possible to see one of the tools you are talking about? Thanks for the help!
Katherine


Re: Workflow Compatibility

Langhorst, Brad
Hi:


Yes. If you have hard-coded paths, you could put them into a select list pretty easily (just embed the paths the way the bowtie wrapper embeds its paired-end vs. single-end select list).

You might also want to consider a table lookup though.
The bowtie wrapper does that for reference genomes.
You can make your own lookup table to refer to whatever files you want, which would allow you to update paths without installing a new version of the tool.
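
To make the table-lookup idea concrete, here is a minimal Python sketch, not the bowtie wrapper's actual mechanism. It assumes a small tab-separated table in which the tool only ever stores a key and the path is resolved at run time; the table contents, column layout, and function names are all hypothetical.

```python
# Hypothetical lookup table: the tool stores only the key, and the path is
# resolved at run time, so paths can change without a new tool version.
LOOKUP_TABLE = """\
#key\tlabel\tpath
hg19\tHuman (hg19)\t/data/genomes/hg19
mm10\tMouse (mm10)\t/data/genomes/mm10
"""


def load_table(text):
    """Build {key: {"label": ..., "path": ...}} from tab-separated rows."""
    table = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        key, label, path = line.split("\t")
        table[key] = {"label": label, "path": path}
    return table


def select_options(table):
    """Return (label, key) pairs, as a select list would present them."""
    return [(entry["label"], key) for key, entry in sorted(table.items())]


def resolve_path(table, key):
    """Translate the key chosen by the user into the current path."""
    return table[key]["path"]


table = load_table(LOOKUP_TABLE)
print(select_options(table))
print(resolve_path(table, "mm10"))
```

Because a saved workflow would record only the key, editing the table is enough to move the files.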



brad

Brad Langhorst, Ph.D.
Development Scientist
New England Biolabs





Re: Workflow Compatibility

Katherine Beaulieu
Hmm, unfortunately the paths have to be dynamically generated, because this is basically just an upload tool, but for a filesystem on your computer other than the default one (like Finder on a Mac, for example), in my case iRODS. So it's not really feasible to have something like a file browser or to hard-code the paths.

I do have a dynamic select list working at the moment, and I do get multiple files written to my history. However, when I try to use one of the uploaded files in a workflow, I get this error:

File '/home/katie/galaxy/lib/galaxy/workflow/run.py', line 308 in replacement_for_connection

  raise Exception( message )

Exception: Workflow evaluation problem - failed to find output_name __new_primary_file_output|foo.txt__ in step_outputs {'output': <galaxy.model.HistoryDatasetAssociation object at 0x7f976ecbb410>}

When I look in the database (table workflow_step_connection), every other, normal file's output_name is the name of the output variable from the tool config file. For example, if the parameter is <output name=output1... then that is the output_name for the file resulting from that tool execution. So why is it giving me something strange like __new_primary_file_output|foo.txt__ when the file comes from a tool execution where multiple files were written to the history?
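
The error message suggests that the connection was saved under a name the producing step never registered. Here is a minimal Python sketch of that kind of lookup, assuming a much simplified model; the function below is a hypothetical stand-in written for illustration, not Galaxy's actual code in run.py.

```python
def replacement_for_connection(step_outputs, output_name):
    """Look up the dataset a connection refers to, keyed by output name.

    Simplified, hypothetical stand-in for the check that raised the error
    above; this is not Galaxy's implementation.
    """
    if output_name not in step_outputs:
        raise Exception(
            "Workflow evaluation problem - failed to find output_name %s "
            "in step_outputs %s" % (output_name, step_outputs)
        )
    return step_outputs[output_name]


# The step registered only the output declared in the tool config ...
step_outputs = {"output": "<dataset for 'output'>"}

# ... so a connection to the declared name resolves,
print(replacement_for_connection(step_outputs, "output"))

# while a connection recorded under a discovered file's name does not.
try:
    replacement_for_connection(step_outputs, "__new_primary_file_output|foo.txt__")
except Exception as exc:
    print(exc)
```

In this model, the declared output is the only key available, so any connection saved with the generated __new_primary_file_output|...__ name fails at lookup time.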

Katherine 


Re: Workflow Compatibility

Langhorst, Brad
Hi Katherine:

This is outside my experience, but I wonder if you could try using formulaic names for the outputs (output1, output2, output3, output4) rather than basing each output name on the input file’s name.
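
Here is a minimal Python sketch of that naming scheme, with a hypothetical helper. It only shows how file names could be mapped to positional names; whether Galaxy would then connect such outputs in a workflow is not something this thread confirms.

```python
def formulaic_output_names(filenames, prefix="output"):
    """Map each file to a positional name: output1, output2, ...

    Files are sorted first so the numbering is stable between runs.
    """
    return {
        "%s%d" % (prefix, index): filename
        for index, filename in enumerate(sorted(filenames), start=1)
    }


names = formulaic_output_names(["foo.txt", "bar.csv"])
print(names)
```

The positional names do not depend on what the user uploaded, so they could in principle be declared ahead of time in the tool config.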

Maybe someone else has some experience with something like this?


Brad

Brad Langhorst, Ph.D.
Development Scientist
New England Biolabs



