tool datatypes

tool datatypes

Matthias Bernt
Dear list,

just a request for links to documentation: how can I implement
tool-specific data types? I'm developing a set of tools that need their
own data types, but I don't want to add them to Galaxy's core data types
(yet).

I've seen examples of tools that had a datatypes_conf.xml. So I created
one, but it seems that it is ignored.

Cheers,
Matthias


--

-------------------------------------------
Matthias Bernt
Bioinformatics Service
Molekulare Systembiologie (MOLSYB)
Helmholtz-Zentrum für Umweltforschung GmbH - UFZ/
Helmholtz Centre for Environmental Research GmbH - UFZ
Permoserstraße 15, 04318 Leipzig, Germany
Phone +49 341 235 482296,
[hidden email], www.ufz.de

Sitz der Gesellschaft/Registered Office: Leipzig
Registergericht/Registration Office: Amtsgericht Leipzig
Handelsregister Nr./Trade Register Nr.: B 4703
Vorsitzender des Aufsichtsrats/Chairman of the Supervisory Board:
MinDirig Wilfried Kraus
Wissenschaftlicher Geschäftsführer/Scientific Managing Director:
Prof. Dr. Dr. h.c. Georg Teutsch
Administrative Geschäftsführerin/ Administrative Managing Director:
Prof. Dr. Heike Graßmann
-------------------------------------------
___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  https://lists.galaxyproject.org/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/

Re: tool datatypes

Peter Cock
More details might help - are you just defining the new datatype as a
subclass in the XML, or do you need to include Python code (e.g. for a
sniffer)?

If you want to see some examples of datatypes using Python code which
are available via the Tool Shed, here are two:

https://github.com/peterjc/galaxy_blast/tree/master/data_managers/ncbi_blastdb
(now also in the Galaxy core)
https://github.com/peterjc/galaxy_mira/tree/master/datatypes/mira_datatypes
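
For the pure-XML case (a subclass without any Python code), a minimal sketch might look like the script below. It is not taken from either repository above: the extension name `mytool_out` is made up, and the `type` and `subclass` attribute values follow the pattern used in Galaxy's sample datatypes_conf.xml, so they should be checked against the Galaxy version in use.

```python
import xml.etree.ElementTree as ET


def build_datatypes_conf(extension, parent_type):
    """Return a <datatypes> element registering one subclassed datatype."""
    root = ET.Element("datatypes")
    registration = ET.SubElement(root, "registration")
    ET.SubElement(
        registration,
        "datatype",
        extension=extension,
        type=parent_type,
        subclass="true",
    )
    return root


# "mytool_out" is a hypothetical extension used only for illustration.
conf = build_datatypes_conf("mytool_out", "galaxy.datatypes.data:Text")
xml_text = ET.tostring(conf, encoding="unicode")
print(xml_text)
```

Generating the file is optional, of course; the same single `<datatype>` element can be written by hand.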

Peter

On Thu, Aug 16, 2018 at 3:48 PM, Matthias Bernt <[hidden email]> wrote:

> Dear list,
>
> just a request for links to documentation: how can I implement tool-specific
> data types? I'm developing a set of tools that need their own data
> types, but I don't want to add them to Galaxy's core data types (yet).
>
> I've seen examples of tools that had a datatypes_conf.xml. So I created one,
> but it seems that it is ignored.
>
> Cheers,
> Matthias

Re: tool datatypes

Matthias Bernt
Hi Peter,

I hope that subclassing simple data types will be sufficient.

More details:

I'm currently trying to (auto)wrap the CheckM suite
https://github.com/Ecogenomics/CheckM. Current state here:
https://github.com/bernt-matthias/mb-galaxy-tools/tree/master/tools/checkm.

These tools often create an output folder which is then the input to
other tools. So I thought I would create an (output) data type for each of the
tools and save the folder using extra_files_path. Then I can pass the
folder to downstream tools (and can check for proper input). I hope that
subclassing txt or tabular will be sufficient.
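
The folder-passing idea can be sketched roughly as follows. This is only an illustration under assumptions: `store_folder_as_extra_files` and its parameter names are hypothetical, and in a real wrapper the primary file and the extra files directory would be supplied by Galaxy rather than created in a temporary directory.

```python
import shutil
import tempfile
from pathlib import Path


def store_folder_as_extra_files(tool_output_dir, primary_file, extra_files_dir):
    """Copy a tool's output folder next to a dataset and summarise it in the primary file."""
    src = Path(tool_output_dir)
    dst = Path(extra_files_dir)
    # dirs_exist_ok needs Python 3.8+; the target directory may already exist.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    n_files = sum(1 for p in dst.rglob("*") if p.is_file())
    # The primary (txt) file only carries a short human-readable summary.
    Path(primary_file).write_text("{} files stored from {}\n".format(n_files, src.name))
    return n_files


# Temporary directories stand in for the paths Galaxy would provide.
with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp, "tool_out")
    (out / "bins").mkdir(parents=True)
    (out / "bins" / "a.tsv").write_text("x\t1\n")
    (out / "summary.txt").write_text("ok\n")
    primary = Path(tmp, "dataset.dat")
    n_stored = store_folder_as_extra_files(out, primary, Path(tmp, "dataset_files"))
    summary = primary.read_text()
    print(summary, end="")
```

A downstream tool would then read from the extra files directory of its input dataset instead of from the primary file.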

Cheers,
Matthias


On 16.08.2018 17:02, Peter Cock wrote:

> More details might help - are you just defining the new datatype as a
> subclass in the XML, or do you need to include Python code (e.g. for a
> sniffer)?
>
> If you want to see some examples of datatypes using Python code which
> are available via the Tool Shed, here are two:
>
> https://github.com/peterjc/galaxy_blast/tree/master/data_managers/ncbi_blastdb
> (now also in the Galaxy core)
> https://github.com/peterjc/galaxy_mira/tree/master/datatypes/mira_datatypes
>
> Peter

Re: tool datatypes

Peter Cock
Defining 20 different text-based formats does not look ideal (if that
is what you are doing).

Do you have sample output in the repository? Perhaps at least some of
these can be better defined as tabular instead?

Or, perhaps you can define one composite datatype for the folder of
output instead?

Peter

On Thu, Aug 16, 2018 at 4:18 PM, Matthias Bernt <[hidden email]> wrote:

> Hi Peter,
>
> I hope that subclassing simple data types will be sufficient.
>
> More details:
>
> I'm currently trying to (auto)wrap the CheckM suite
> https://github.com/Ecogenomics/CheckM. Current state here:
> https://github.com/bernt-matthias/mb-galaxy-tools/tree/master/tools/checkm.
>
> These tools often create an output folder which is then the input to other
> tools. So I thought I would create an (output) data type for each of the tools and
> save the folder using extra_files_path. Then I can pass the folder to
> downstream tools (and can check for proper input). I hope that subclassing
> txt or tabular will be sufficient.
>
> Cheers,
> Matthias

Re: tool datatypes

Matthias Bernt
Dear Peter,

you are right. I hope that there will be far fewer in the end (I'm still learning about the package).

The main question remains: how do I get `planemo test` to include the data types defined in the XML file?

Best,
Matthias



On 16/08/18 18:32, Peter Cock <[hidden email]> wrote:

> Defining 20 different text-based formats does not look ideal (if that
> is what you are doing).
>
> Do you have sample output in the repository? Perhaps at least some of
> these can be better defined as tabular instead?
>
> Or, perhaps you can define one composite datatype for the folder of
> output instead?
>
> Peter


Re: tool datatypes

Peter Cock
This is probably a John Chilton question, as the Planemo lead.

The way I do it is to "manually" install the datatype into a Galaxy
test instance (adding entries to the datatypes_conf.xml and Python
files to Galaxy's internal library), and then call ``planemo test``
pointing at this test instance. You can see that approach in action
here:

https://github.com/peterjc/galaxy_blast/blob/579c348ced72d4e6f1ef7fb0cded98e52f454b92/.travis.yml#L106
https://github.com/peterjc/galaxy_blast/blob/579c348ced72d4e6f1ef7fb0cded98e52f454b92/.travis.datatypes_conf.xml#L20

and:

https://github.com/peterjc/galaxy_mira/blob/206259620376b322fc8ed99a6efdd3712f38764b/.travis.yml#L113
https://github.com/peterjc/galaxy_mira/blob/206259620376b322fc8ed99a6efdd3712f38764b/.travis.datatypes_conf.xml#L42

There may be a more elegant solution nowadays using planemo to do some
of the work.
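
The manual installation step described above might look roughly like this in Python. It is a sketch under assumptions, not the code from the linked Travis configurations: `merge_datatypes` is a hypothetical helper, `mytool_out` is a made-up extension, and both XML documents are inlined as stand-ins for the tool's and the test instance's datatypes_conf.xml.

```python
import xml.etree.ElementTree as ET

# Stand-in for the test instance's datatypes_conf.xml.
GALAXY_CONF = """<datatypes>
  <registration>
    <datatype extension="txt" type="galaxy.datatypes.data:Text"/>
  </registration>
</datatypes>"""

# Stand-in for the tool's own datatypes_conf.xml (hypothetical extension).
TOOL_CONF = """<datatypes>
  <registration>
    <datatype extension="mytool_out" type="galaxy.datatypes.data:Text" subclass="true"/>
  </registration>
</datatypes>"""


def merge_datatypes(galaxy_xml, tool_xml):
    """Append the tool's <datatype> entries to Galaxy's registration, skipping known extensions."""
    galaxy_root = ET.fromstring(galaxy_xml)
    registration = galaxy_root.find("registration")
    known = {dt.get("extension") for dt in registration.findall("datatype")}
    for dt in ET.fromstring(tool_xml).findall("registration/datatype"):
        if dt.get("extension") not in known:
            registration.append(dt)
            known.add(dt.get("extension"))
    return galaxy_root


merged = merge_datatypes(GALAXY_CONF, TOOL_CONF)
extensions = [dt.get("extension") for dt in merged.findall("registration/datatype")]
print(extensions)
```

The merged document would then be written into the test instance's configuration before running `planemo test` against that instance.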

Peter

On Thu, Aug 16, 2018 at 9:30 PM, Matthias Bernt <[hidden email]> wrote:

> Dear Peter,
>
> you are right. I hope that there will be far fewer in the end (I'm still
> learning about the package).
>
> The main question remains: how do I get `planemo test` to include the
> data types defined in the XML file?
>
> Best,
> Matthias

Re: tool datatypes

Matthias Bernt
Hi,

thanks for your support. This helps.

I have thought a bit about composite data types and have some additional
questions.

In my case the additional data is essentially a folder (with subfolders).
The contents of the folder vary (they depend on the input of the programs
that generate them). So the question is whether I can/should

- add a single folder (with the whole hierarchy as a single composite data
element); I have only seen an add_composite_file function,

- add a dynamic number of composite files,

- or use the extra_files_path mechanism (which seems
to be the simplest for me).

In any of these cases, my next question would be how to create a test
using such data as input and output (for output I have seen an example
somewhere).

Cheers,
Matthias



On 17.08.2018 11:53, Peter Cock wrote:

> This is probably a John Chilton question, as the Planemo lead.
>
> The way I do it is to "manually" install the datatype into a Galaxy
> test instance (adding entries to the datatypes_conf.xml and Python
> files to Galaxy's internal library), and then call ``planemo test``
> pointing at this test instance. You can see that approach in action
> here:
>
> https://github.com/peterjc/galaxy_blast/blob/579c348ced72d4e6f1ef7fb0cded98e52f454b92/.travis.yml#L106
> https://github.com/peterjc/galaxy_blast/blob/579c348ced72d4e6f1ef7fb0cded98e52f454b92/.travis.datatypes_conf.xml#L20
>
> and:
>
> https://github.com/peterjc/galaxy_mira/blob/206259620376b322fc8ed99a6efdd3712f38764b/.travis.yml#L113
> https://github.com/peterjc/galaxy_mira/blob/206259620376b322fc8ed99a6efdd3712f38764b/.travis.datatypes_conf.xml#L42
>
> There may be a more elegant solution nowadays using planemo to do some
> of the work.
>
> Peter

Re: tool datatypes

Peter Cock
Hi Matthias,

If you look at the HTML composite datatype, there is a master file (HTML),
and an arbitrary number of arbitrarily named child files like images.

The BLAST database datatype follows this, but we have used a simple
text file as the master file (just the stdout from makeblastdb) in order to
have something for Galaxy to show in the main pane when clicking on
these datasets.
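
The layout described above (one master file plus arbitrarily named child files, possibly in subfolders) can be sketched with the standard library alone. This is only an illustration of the idea, not Galaxy's datatype API: the function names and the example paths (`dataset_1.dat`, `dataset_1_files`) are assumptions made up for the sketch.

```python
import os
import tempfile


def list_child_files(extra_files_dir):
    """Return the sorted paths of all files below extra_files_dir, relative to it."""
    children = []
    for root, _dirs, files in os.walk(extra_files_dir):
        for name in files:
            full_path = os.path.join(root, name)
            children.append(os.path.relpath(full_path, extra_files_dir))
    return sorted(children)


def write_master_file(master_path, extra_files_dir):
    """Write a plain-text index of the child files to act as the master file."""
    children = list_child_files(extra_files_dir)
    with open(master_path, "w") as handle:
        handle.write("Files in this dataset: %d\n" % len(children))
        for child in children:
            handle.write(child + "\n")
    return children


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical output folder with one subfolder, as a tool might create it.
        extra = os.path.join(tmp, "dataset_1_files")
        os.makedirs(os.path.join(extra, "bins"))
        for rel in ("summary.tsv", os.path.join("bins", "bin1.fa")):
            with open(os.path.join(extra, rel), "w") as handle:
                handle.write("placeholder\n")
        print(write_master_file(os.path.join(tmp, "dataset_1.dat"), extra))
```

The text index gives a viewer something to display as the main file, while the folder contents stay free to vary from run to run.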

I can't think of any other composite examples which would be more
relevant, hopefully someone else on the list can?

Regards,

Peter

On Fri, Aug 17, 2018 at 2:55 PM, Matthias Bernt <[hidden email]> wrote:

> Hi,
>
> thanks for your support. This helps.
>
> I have thought a bit about composite data types and have some additional
> questions.
>
> In my case the additional data is essentially a folder (with subfolders). The
> contents of the folder vary (they depend on the input of the programs that
> generate them). So the question is whether I can/should
>
> - add a single folder (with the whole hierarchy as a single composite data
> element); I have only seen an add_composite_file function,
>
> - add a dynamic number of composite files,
>
> - or use the extra_files_path mechanism instead (which seems to
> be the simplest for me).
>
> In any of these cases, my next question would be how to create a test that
> uses such data as input and output (for output I have seen an example
> somewhere).
>
> Cheers,
> Matthias
>
>
>
>
> On 17.08.2018 11:53, Peter Cock wrote:
>>
>> This is probably a John Chilton question, as the Planemo lead.
>>
>> The way I do it is to "manually" install the datatype into a Galaxy
>> test instance (adding entries to the datatypes_conf.xml and Python
>> files to Galaxy's internal library), and then call ``planemo test``
>> pointing at this test instance. You can see that approach in action
>> here:
>>
>>
>> https://github.com/peterjc/galaxy_blast/blob/579c348ced72d4e6f1ef7fb0cded98e52f454b92/.travis.yml#L106
>>
>> https://github.com/peterjc/galaxy_blast/blob/579c348ced72d4e6f1ef7fb0cded98e52f454b92/.travis.datatypes_conf.xml#L20
>>
>> and:
>>
>>
>> https://github.com/peterjc/galaxy_mira/blob/206259620376b322fc8ed99a6efdd3712f38764b/.travis.yml#L113
>>
>> https://github.com/peterjc/galaxy_mira/blob/206259620376b322fc8ed99a6efdd3712f38764b/.travis.datatypes_conf.xml#L42
>>
>> There may be a more elegant solution nowadays using planemo to do some
>> of the work.
>>
>> Peter
>>
>> On Thu, Aug 16, 2018 at 9:30 PM, Matthias Bernt <[hidden email]> wrote:
>>>
>>> Dear Peter,
>>>
>>> you are right. I hope that this will be much less in the end (I'm still
>>> learning about the package).
>>>
>>> The main question still remains: how do I get `planemo test` to include
>>> the data types defined in the XML file?
>>>
>>> Best,
>>> Matthias
>>>
>>>
>>>
>>> Am 16/08/18 18:32 schrieb Peter Cock <[hidden email]>:
>>>
>>> Defining 20 different text-based formats does not look ideal (if that
>>> is what you are doing).
>>>
>>> Do you have sample output in the repository? Perhaps at least some of
>>> these can be better defined as tabular instead?
>>>
>>> Or, perhaps you can define one composite datatype for the folder of
>>> output instead?
>>>
>>> Peter
>>>
>>> On Thu, Aug 16, 2018 at 4:18 PM, Matthias Bernt <[hidden email]> wrote:
>>>>
>>>> Hi Peter,
>>>>
>>>> I hope that subclassing simple data types will be sufficient.
>>>>
>>>> More details:
>>>>
>>>> I'm currently trying to (auto)wrap the checkm suite
>>>> https://github.com/Ecogenomics/CheckM. Current state here:
>>>>
>>>>
>>>> https://github.com/bernt-matthias/mb-galaxy-tools/tree/master/tools/checkm.
>>>>
>>>> These tools often create an output folder which is then the input to other
>>>> tools. So I thought to create an (output) data type for each of the tools
>>>> and save the folder using extra_files_path. Then I can pass the folder to
>>>> downstream tools (and can check for proper input). I hope that subclassing
>>>> txt or tabular will be sufficient.
>>>>
>>>> Cheers,
>>>> Matthias
>>>>
>>>>
>>>>
>>>> On 16.08.2018 17:02, Peter Cock wrote:
>>>>>
>>>>>
>>>>> More details might help - are you just defining the new datatype as a
>>>>> subclass in the XML, or do you need to include Python code (e.g. for a
>>>>> sniffer)?
>>>>>
>>>>> If you want to see some examples of datatypes using Python code which
>>>>> are available via the Tool Shed, here are two:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> https://github.com/peterjc/galaxy_blast/tree/master/data_managers/ncbi_blastdb
>>>>> (now also in the Galaxy core)
>>>>>
>>>>>
>>>>>
>>>>> https://github.com/peterjc/galaxy_mira/tree/master/datatypes/mira_datatypes
>>>>>
>>>>> Peter
>>>>>
>>>>> On Thu, Aug 16, 2018 at 3:48 PM, Matthias Bernt <[hidden email]> wrote:
>>>>>>
>>>>>>
>>>>>> Dear list,
>>>>>>
>>>>>> just a request for links to documentation: How can I realize
>>>>>> tool-specific data types? I'm just developing a set of tools that need
>>>>>> their own data types, but I don't want to add them to Galaxy's core data
>>>>>> types (yet).
>>>>>>
>>>>>> I've seen examples of tools that had a datatypes_conf.xml. So I created
>>>>>> one, but it seems that it is ignored.
>>>>>>
>>>>>> Cheers,
>>>>>> Matthias
>>>>>>
>>>>>>
>>>>
>>>>
>>>>
>
>