writing datatypes

writing datatypes

Eric Rasche
I'm trying to add a new datatype for GenBank files to my Galaxy
instance, but I'm running into various issues. I've followed the
tutorial (https://wiki.galaxyproject.org/Admin/Datatypes/Adding%20Datatypes)

However, that example subclasses Tabular. I'd like to subclass Text
instead, since GenBank files are plain text, and I'd like to be able to
define a sniffer for them (not possible if your
type=galaxy.datatypes.data:Text).

I figured the call ought to be something like

<datatype extension="gb" type="galaxy.datatypes.data:Genbank"
subclass="True" />

However, everything I try fails with

> Error importing datatype module galaxy.datatypes.data: 'module' object has no attribute 'Genbank'

To avoid this particular issue, I tried writing a separate datatype just
for GenBank files (type="galaxy.datatypes.genbank:Genbank"), but
that fails with a similar error:

> galaxy.datatypes.registry ERROR 2014-07-14 13:23:23,100 Error importing datatype module galaxy.datatypes.genbank: 'module' object has no attribute 'genbank'
> Traceback (most recent call last):
>   File "/home/hxr/work/galaxy-central/lib/galaxy/datatypes/registry.py", line 206, in load_datatypes
>     module = getattr( module, mod )
> AttributeError: 'module' object has no attribute 'genbank'
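
One possible reading of this traceback, offered as an assumption that nothing in the thread confirms: the failing line walks a dotted module path with getattr, and in Python a package only exposes a submodule as an attribute once something has imported that submodule. The sketch below is not Galaxy's code. It builds a throwaway package (demo_pkg, formats and genbank are hypothetical names) and contrasts the getattr walk with importlib.import_module:

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package on disk: demo_pkg/formats/genbank.py.
# Every name here is hypothetical and exists only for this demonstration.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg", "formats")
os.makedirs(pkg_dir)
for init_dir in (os.path.join(root, "demo_pkg"), pkg_dir):
    open(os.path.join(init_dir, "__init__.py"), "w").close()
with open(os.path.join(pkg_dir, "genbank.py"), "w") as handle:
    handle.write("class Genbank(object):\n    file_ext = 'gb'\n")
sys.path.insert(0, root)
importlib.invalidate_caches()

# The parent package is already loaded, but the new submodule is not.
importlib.import_module("demo_pkg.formats")


def resolve_by_getattr(dotted_name):
    """Import only the top-level package, then walk the rest as attributes."""
    parts = dotted_name.split(".")
    module = __import__(parts[0])
    for part in parts[1:]:
        module = getattr(module, part)
    return module


try:
    resolve_by_getattr("demo_pkg.formats.genbank")
    getattr_error = None
except AttributeError as exc:
    # The submodule was never imported, so the package has no such attribute.
    getattr_error = str(exc)

# Importing the full dotted path loads the submodule and binds it on its parent,
# after which the same getattr walk succeeds.
genbank_module = importlib.import_module("demo_pkg.formats.genbank")
resolved_after_import = resolve_by_getattr("demo_pkg.formats.genbank")
```

If this is what is happening here, the class itself would be fine and the new module would simply need to be imported before the registry tries to resolve it. Whether that applies to this Galaxy version is a guess.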

Here's what my lib/galaxy/datatypes/genbank.py looks like:

> import pkg_resources
> pkg_resources.require( "bx-python" )
> import logging
> from galaxy.datatypes import data
> log = logging.getLogger(__name__)
>
> class Genbank( data.Text ):
>     file_ext = "gb"
>
>     def sniff( self, filename ):
>         # A GenBank record begins with the keyword LOCUS.
>         with open(filename) as handle:
>             return handle.read(5) == 'LOCUS'

To debug this, I've tried copying the tabular datatype module completely,
removing all the classes other than Tabular, and renaming that class to
"Genbank", but this fails too, with the same error.

Can anyone offer some insight?

Cheers,
Eric
___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/

Re: writing datatypes

Peter Cock
Hi Eric

There is already a genbank format in the EMBOSS datatypes
(although there is talk of moving this and other formats into a set
of smaller repositories that it would depend on, for more
modularity). Note it uses "genbank" not "gb" as the name!

https://toolshed.g2.bx.psu.edu/view/devteam/emboss_datatypes

However that doesn't answer your question :(

Peter

On Mon, Jul 14, 2014 at 7:31 PM, Eric Rasche <[hidden email]> wrote:

> [...]

Re: writing datatypes

Eric Rasche
Hi Peter,

I saw that in an initial search for GenBank modules, but it didn't
meet our requirements (it lacks features and is "heavy", since it
requires all of EMBOSS). And you are correct, it doesn't fix the
problem. Thanks for the suggestion.

Cheers,
Eric

On 07/15/2014 03:14 AM, Peter Cock wrote:

> [...]

Re: writing datatypes

Björn Grüning-3
In reply to this post by Eric Rasche
Hi Eric,

please have a look at:

https://github.com/bgruening/galaxytools/blob/master/datatypes/msa_datatypes/datatypes_conf.xml

You need something like:
<datatype extension="genbank" type="galaxy.datatypes.data:Text"
subclass="True" />
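
A rough sketch of what an XML-level subclass amounts to, under the assumption (not confirmed in this thread) that subclass="True" derives a new class at runtime from the existing class named after the colon. The Text class and the KNOWN_CLASSES lookup are hypothetical stand-ins, not Galaxy's registry:

```python
import xml.etree.ElementTree as ET


class Text(object):
    """Hypothetical stand-in for galaxy.datatypes.data.Text."""
    file_ext = "txt"

    def sniff(self, filename):
        return False


# Stand-in for the module named before the colon in type="module:Class".
KNOWN_CLASSES = {"galaxy.datatypes.data": {"Text": Text}}

elem = ET.fromstring(
    '<datatype extension="genbank" type="galaxy.datatypes.data:Text" '
    'subclass="True" />'
)
module_name, class_name = elem.get("type").split(":")

# The class named in "type" has to exist already; a name such as "Genbank"
# would not be found in this lookup.
parent = KNOWN_CLASSES[module_name][class_name]

if elem.get("subclass", "False").lower() == "true":
    # Derive an otherwise empty subclass; only the extension differs.
    datatype = type(class_name, (parent,), {"file_ext": elem.get("extension")})
else:
    datatype = parent
```

Under that assumption, the derived class inherits every method from its parent unchanged, which would explain why a custom sniffer has no place to live in a declaration like this.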

Let's try to split the EMBOSS datatypes a little bit into small chunks,
e.g. sequences_datatypes, msa_datatypes ... and so on ...

Cheers,
Bjoern


Am 14.07.2014 20:31, schrieb Eric Rasche:

> [...]

Re: writing datatypes

Peter Cock
Indeed - ideally (once working) we can upload it to the ToolShed under
the IUC as a community-maintained resource rather than under a personal
account, which becomes a single point of failure (the bus factor).

We (the IUC) have previously discussed doing this so that the EMBOSS
datatypes could become more of a meta-entry depending on other smaller,
specific datatype-defining ToolShed repositories. But it hasn't reached the
top of my personal TODO list yet ;)

Peter

On Wed, Jul 16, 2014 at 1:47 PM, Björn Grüning
<[hidden email]> wrote:

> [...]

Re: writing datatypes

Eric Rasche
Forgive me, I'm not 100% clear on the custom plugin system used by Galaxy. If I "subclass" from the Text datatype, will the sniffers I implement override Text's and actually be used? Since I couldn't add an entry to the sniffer section (unlike in the Tabular example), I assumed my GenBank datatype wouldn't be sniffed.

Additionally, I'd still like to be able to add completely new datatypes. Do you know of any working examples of this? As mentioned in my original post, duplicating an existing datatype and renaming it surprisingly doesn't work.

It'd be lovely to have the EMBOSS datatypes split out.

Cheers,
Eric

On July 16, 2014 8:34:55 AM CDT, Peter Cock <[hidden email]> wrote:
> [...]

Re: writing datatypes

Björn Grüning-3
Hi Eric,

> Forgive me, I'm not 100% clear on the custom plugin system used by Galaxy. If I "subclass" from the Text datatype, will the sniffers I implement override Text's and actually be used? Since I couldn't add an entry to the sniffer section (unlike in the Tabular example), I assumed my GenBank datatype wouldn't be sniffed.

That's true. If you want to override functions, you need to subclass
at the Python level, not at the XML level.
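
A minimal sketch of what subclassing at the Python level could look like. The Text class below is a hypothetical stand-in for galaxy.datatypes.data.Text so that the example runs without Galaxy, and the sample LOCUS line is made up for illustration:

```python
import os
import tempfile


class Text(object):
    """Hypothetical stand-in for galaxy.datatypes.data.Text."""
    file_ext = "txt"

    def sniff(self, filename):
        return False


class Genbank(Text):
    """Python-level subclass: sniff is overridden here, not in the XML."""
    file_ext = "genbank"

    def sniff(self, filename):
        # A GenBank record begins with the keyword LOCUS.
        with open(filename) as handle:
            return handle.read(5) == "LOCUS"


def write_temp(contents):
    """Write contents to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w") as handle:
        handle.write(contents)
    return path


genbank_path = write_temp("LOCUS       DEMO0001    20 bp    DNA\n")
fasta_path = write_temp(">seq1\nACGT\n")
is_genbank = Genbank().sniff(genbank_path)
is_fasta_genbank = Genbank().sniff(fasta_path)
os.remove(genbank_path)
os.remove(fasta_path)
```

The XML entry would then name this class in its type attribute rather than deriving one from Text with subclass="True".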

> Additionally, I'd still like to be able to add completely new datatypes. Do you know of any working examples of this? As mentioned in my original post, duplicating an existing datatype and renaming it surprisingly doesn't work.

https://github.com/bgruening/galaxytools/tree/master/datatypes/msa_datatypes
https://github.com/bgruening/galaxytools/blob/master/chemicaltoolbox/datatypes/datatypes_conf.xml

Is that enough to get started?

> It'd be lovely to have the EMBOSS datatypes split out.

OK, then let's start :)
I will try to fork the EMBOSS datatypes into my galaxytools/datatypes
repository and split them up. You will get commit access and can improve
your GenBank datatype (and a few more ;)). Finally, we will talk to the
devteam about rewriting EMBOSS to depend on our separate datatype
repositories. OK?

Ciao,
Bjoern

> Cheers,
> Eric
>
> On July 16, 2014 8:34:55 AM CDT, Peter Cock <[hidden email]> wrote:
>> Indeed - ideally (once working) we can upload under the IUC ToolShed as
>> a
>> community maintained resource rather than under a personal account
>> which
>> becomes a single point of failure (the bus factor).
>>
>> We (the ICU) have previously discussed doing this so that the EMBOSS
>> datatypes could become more of a meta-entry depending on other smaller
>> specific datatype defining ToolShed repositories. But it hasn't reached
>> the
>> top of my personal TODO list yet ;)
>>
>> Peter
>>
>> On Wed, Jul 16, 2014 at 1:47 PM, Björn Grüning
>> <[hidden email]> wrote:
>>> Hi Eric,
>>>
>>> please have a look at:
>>>
>>>
>> https://github.com/bgruening/galaxytools/blob/master/datatypes/msa_datatypes/datatypes_conf.xml
>>>
>>> You need somthing like:
>>> <datatype extension="genbank" type="galaxy.datatypes.data:Text"
>>> subclass="True" />
>>>
>>> Lets try to split the EMBOSS datatypes a little bit into small
>> chunks. E.g.
>>> sequences_datatypes, msa_datatypes ... and so on ...
>>>
>>> Cheers,
>>> Bjoern
>>>
>>>
>>> Am 14.07.2014 20:31, schrieb Eric Rasche:
>>>
>>>> -----BEGIN PGP SIGNED MESSAGE-----
>>>> Hash: SHA1
>>>>
>>>> I'm trying to add a new datatype to my galaxy instance for genbank
>>>> files, however I'm running into various issues. I've followed the
>>>> tutorial
>>>> (https://wiki.galaxyproject.org/Admin/Datatypes/Adding%20Datatypes)
>>>>
>>>> however that example subclasses tabular, and I'd like to subclass
>> Text
>>>> as they're plain text files, and I'd like to be able to define a
>> sniffer
>>>> for them (not possible if your type=galaxy.datatypes.data:Text)
>>>>
>>>> I figured the call ought to be something like
>>>>
>>>> <datatype extension="gb" type="galaxy.datatypes.data:Genbank"
>>>> subclass="True" />
>>>>
>>>> however, everything I try fails with
>>>>
>>>>> Error importing datatype module galaxy.datatypes.data: 'module'
>> object
>>>>> has no attribute 'Genbank'
>>>>
>>>>
>>>> To avoid this particular issue, I tried writing a separate datatype
>> just
>>>> for genbank files (type="galaxy.datatypes.genbank:Genbank"), however
>>>> that fails with the same error:
>>>>
>>>>> galaxy.datatypes.registry ERROR 2014-07-14 13:23:23,100 Error
>> importing
>>>>> datatype module galaxy.datatypes.genbank: 'module' object has no
>> attribute
>>>>> 'genbank'
>>>>> Traceback (most recent call last):
>>>>>     File
>> "/home/hxr/work/galaxy-central/lib/galaxy/datatypes/registry.py",
>>>>> line 206, in load_datatypes
>>>>>       module = getattr( module, mod )
>>>>> AttributeError: 'module' object has no attribute 'genbank'
>>>>
>>>>
>>>> Here's my lib/galaxy/datatypes/genbank.py looks like:
>>>>
>>>>> import pkg_resources
>>>>> pkg_resources.require( "bx-python" )
>>>>> import logging
>>>>> from galaxy.datatypes import data
>>>>> log = logging.getLogger(__name__)
>>>>>
>>>>> class Genbank( data.Text ):
>>>>>       file_ext = "gb"
>>>>>
>>>>>       def sniff( self, filename ):
>>>>>           header = open(filename).read(5)
>>>>>           return header == 'LOCUS'
>>>>
>>>>
>>>> To debug this, I've tried copying the tabular data type completely,
>>>> removed all the classes other than Tabular, and renamed it
>> "Genbank",
>>>> however this fails too with the same error.
>>>>
>>>> Can anyone offer some insight?
>>>>
>>>> Cheers,
>>>> Eric
>>>> -----BEGIN PGP SIGNATURE-----
>>>> Version: GnuPG v2.0.22 (GNU/Linux)
>>>>
>>>> iQIcBAEBAgAGBQJTxCHwAAoJEMqDXdrsMcpVmbsQAJ3eFIhZtZmVP9LCz/F9Ywg/
>>>> 148NJZy4lmxZU0KScJlc8kVDCDSADXIHd0Db/kpJwuUKEX7zei9q2uXfO7sWl3yt
>>>> yxrFEdtX/a5SMVsa6F5WZuKwBs0zfvfsnIUoraOgh6nXeJnr53l9mYeWaKB6bi3Z
>>>> xAlgJG/kdIR1jRjAimuQf4vMjNgtDQPOmotYBQTytbhsV6/nRzGI8RZAYwQ7GnVs
>>>> XYOWFyhzrBgALndVI3BjI21rbRqguhrqr2t7i0Ma7Pp2JmAnNjmUaq70NN3Rueh6
>>>> DvnTtxInM1dVOQY+Yam6MCMmAedV1cG+rNGdpP2l82MajQAsMtbXckBXXKcSgyTq
>>>> WCFoLVURYO1tHkWyq4ikamfFDHtJp1DogBYhUiPMyRw+CV+3sOvr0U5DcyRdiDsJ
>>>> Xcm3ygqYVLGwauNmuN3yGcQcnfypDOOeFs1lppbNe3lw0w3ikZN4Zmu1ec5s1ITK
>>>> MEcgBrGYgZrKDRXkx53lnABGpv6mYflYpag7fguDNL8j0lh9beaaNmHr4tmeEcug
>>>> VZ1b1EWoLMj/ikJ/vZcluiHPTSTheiAP8Ttvh1WAayq4rKwVtZygaI9IDauqqBQ1
>>>> Dgotes3vcomlTQXDUEZACyOZDxl7wbAUh0LZVaa2fYNIOoPNPOItUFSjf6YveF88
>>>> dLiw3ddVm+BFmczJzRpt
>>>> =4m2j
>>>> -----END PGP SIGNATURE-----
___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/

Re: writing datatypes

Eric Rasche
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Björn,

On 07/16/2014 09:20 AM, Björn Grüning wrote:

> Hi Eric,
>
>> Forgive me, I'm not 100% clear on the custom plugin system used by
>> galaxy, but if I "subclass" from the text data type, will the sniffers I
>> implement override Text's and actually work? The lack of being able to add
>> an entry to the sniffer section (unlike with the tabular example) led
>> me to believe my genbank datatype wouldn't be sniffed.
>
> That's true. If you want to override functions, you need to subclass at
> the Python level, not at the XML level.

Okay, good, as I figured then.
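
A Python-level subclass of the kind discussed here might look like the minimal sketch below. To keep it runnable without a Galaxy checkout, `Text` is a stand-in written for this example (in Galaxy the base class would be `galaxy.datatypes.data.Text`, as in the original post), and `sniff_text` is a made-up helper used only to exercise the sniffer.

```python
import os
import tempfile


class Text(object):
    """Stand-in for galaxy.datatypes.data.Text so the sketch runs without Galaxy."""
    file_ext = "txt"

    def sniff(self, filename):
        return False


class Genbank(Text):
    """Python-level subclass that overrides sniff()."""
    file_ext = "gb"

    def sniff(self, filename):
        # GenBank flat files begin with a LOCUS line; read only the first bytes.
        with open(filename) as handle:
            return handle.read(5) == "LOCUS"


def sniff_text(contents):
    """Helper for this example only: write contents to a temp file and sniff it."""
    fd, path = tempfile.mkstemp(suffix=".gb")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(contents)
        return Genbank().sniff(path)
    finally:
        os.remove(path)
```

Compared with the version in the original post, the only change is that the file is opened in a `with` block so it is closed after sniffing.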

>> Additionally, I'd still like to be able to add completely new
>> datatypes, do you know of any working examples of this? As mentioned
>> in my original post, duplicating an existing datatype and changing
>> names on it surprisingly doesn't work.
>
> https://github.com/bgruening/galaxytools/tree/master/datatypes/msa_datatypes
>
> https://github.com/bgruening/galaxytools/blob/master/chemicaltoolbox/datatypes/datatypes_conf.xml

Those should absolutely be enough, thank you.

I'll probably strip them down and post a minimal working example, with
code, to the wiki as well for future reference.

> Is that enough to get started?
>
>> It'd be lovely to have the emboss datatypes split out.
>
> OK, then let's start :)
> I will try to fork emboss into my galaxytools/datatypes repository and
> try to split them. You will get commit access and can improve your
> genbank datatype (and a few more ;)). Finally, we will talk to the
> devteam to rewrite EMBOSS to depend on our separate data type
> repositories. OK?

Ja, sounds good! Happy to help.

>
> Ciao,
> Bjoern
>
>> Cheers,
>> Eric
>>
>> On July 16, 2014 8:34:55 AM CDT, Peter Cock
>> <[hidden email]> wrote:
>>> Indeed - ideally (once working) we can upload under the IUC ToolShed as a
>>> community maintained resource rather than under a personal account which
>>> becomes a single point of failure (the bus factor).
>>>
>>> We (the IUC) have previously discussed doing this so that the EMBOSS
>>> datatypes could become more of a meta-entry depending on other smaller
>>> specific datatype defining ToolShed repositories. But it hasn't reached the
>>> top of my personal TODO list yet ;)
>>>
>>> Peter
>>>
>>> On Wed, Jul 16, 2014 at 1:47 PM, Björn Grüning
>>> <[hidden email]> wrote:
>>>> Hi Eric,
>>>>
>>>> please have a look at:
>>>>
>>>> https://github.com/bgruening/galaxytools/blob/master/datatypes/msa_datatypes/datatypes_conf.xml
>>>>
>>>> You need something like:
>>>> <datatype extension="genbank" type="galaxy.datatypes.data:Text"
>>>> subclass="True" />
>>>>
>>>> Let's try to split the EMBOSS datatypes a little bit into small chunks, e.g.
>>>> sequences_datatypes, msa_datatypes ... and so on ...
>>>>
>>>> Cheers,
>>>> Bjoern
>>>>

- --
Eric Rasche
Programmer II
Center for Phage Technology
Texas A&M University
College Station, TX 77843
404-692-2048
[hidden email]
[hidden email]
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)

iQIcBAEBAgAGBQJTxo7pAAoJEMqDXdrsMcpVKnsP/3Oqaux9jdJobAuCL4py5wW9
Kbxe5Io5ua8ZhrDUj4qeQsYzNPt9bHKuYWK1VHz+Jf8ZPZDW6a/hHTf6mTqfXrzX
mDUHtf9j6xEW+ye1JTL7QsOImxl7JRDvI0MVQOkQt8C8QZTWu+pjXLMrVd/QygGL
AntX/1ngmEYDKxwPAagD+P1bUxwNalZ96FE9qIubL5GLjFn7yuG6fBE98/40UM27
x4bHdRB+svUTjCiH/E7MKZGN2OEL8H6QHOnl7rfA70Z2SezNr7Ivzgb+pVStyVum
QF2/g8C0dUooYaM2hZhosOM6mLKSNH2NAIsRumfIQMDoZBMxlQE6iD0550tORvSf
1MP1T2B0jqPckca30udDZ7qtksB0u/QJgLFunZ26uQZE/B/jKoL/FjNsNIFgxCto
EdIab40rH7ysnFjbiLV8AiSjgDV0V8VCDjxNbBZRjwy34RP2ZN4Ggew+vyGVv3iH
28biFIhbxVWmYccDYGWVYaqw2wdUdk4l9j0OvlCaqHH3fXsPyhpAWivmVzwc3kGz
/Wyj3KYMEySiJ++Wkw1H29QH0wyEpbh89dX2ULQjqJV7qibWUM7KZtFiizEr55tP
QI0KhJzGe/NQ4CzP1szl/mbxjBUr3cRGXF1dausVu1hKgjhb2QJBJKw9nFvbTnKw
BKOkNRQ/vkr3aHL1IrVE
=AP7s
-----END PGP SIGNATURE-----

Re: writing datatypes

John Chilton-4
In reply to this post by Björn Grüning-3
Is this going to work? I get that this would be a better design if
done from the beginning, but what happens if you install an emboss
repository upgrade (on an existing install) that brings in conflicting
types from other repositories that already exist and have been
previously installed? Does the tool shed have a mechanism to handle
that?

-John


Re: writing datatypes

Greg Von Kuster
Assuming this comment:

>> Finally, we will talk to the devteam to
>> rewrite EMBOSS to depend on our separate data type repositories.

refers to the emboss_5 repository owned by devteam, then what is being proposed should work (although I may not fully understand what is being proposed).

If the emboss datatypes are split up and the emboss_5 repository's repository_dependencies.xml file is altered, then a new installable revision of the emboss_5 repository will be created. This implies that a previous installation cannot be updated to include the split-up emboss datatypes; instead, a new installation of the emboss_5 repository will be required. This new installation may depend on emboss datatypes that conflict with those in the older emboss_5 installation, and the second version of each conflicting datatype will not be loaded into the Galaxy datatypes registry. However, if the datatypes are the same, this shouldn't be a problem, since the first version will already have been loaded.
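
The load order described above can be pictured with a small toy model. This is only a sketch of the "first definition wins" behaviour as described in this message, not Galaxy's actual registry code; `ToyRegistry`, its `load` method, the two `Genbank` classes, and the repository labels are all hypothetical names chosen for illustration.

```python
class ToyRegistry(object):
    """Hypothetical stand-in for a datatypes registry; not Galaxy's real API."""

    def __init__(self):
        self.datatypes_by_extension = {}
        self.skipped = []

    def load(self, extension, datatype_class, source):
        """Register a datatype unless the extension is already taken."""
        if extension in self.datatypes_by_extension:
            # A later, conflicting definition is ignored and only recorded.
            self.skipped.append((extension, source))
            return False
        self.datatypes_by_extension[extension] = (datatype_class, source)
        return True


class GenbankV1(object):
    """Pretend datatype shipped with the older emboss_5 installation."""


class GenbankV2(object):
    """Pretend datatype shipped with a split-out datatypes repository."""


registry = ToyRegistry()
registry.load("genbank", GenbankV1, "emboss_5 (old installation)")
registry.load("genbank", GenbankV2, "split-out datatypes repository")
```

In this model the second `load` call changes nothing, which is harmless when both definitions are identical but means a differing later definition would never take effect.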



Re: writing datatypes

Eric Rasche
In reply to this post by Eric Rasche
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

For those reading this thread from the future, there's a secret to
adding completely new datatypes locally (and not through a toolshed).

You have to manually edit lib/galaxy/datatypes/registry.py and import
the module you've written at the top of the file.

For instance, if you add a new "gbk.py" datatype, you'll need to add
"import gbk" to the top of registry.py. This will cause your errors to
go away and your datatype to be loaded on startup.
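
Judging from the traceback earlier in the thread, the registry imports the top-level package and then walks the rest of the dotted path with getattr, which only finds submodules that something has already imported. The self-contained sketch below reproduces that behaviour with a throwaway package; `demo_pkg`, `resolve_by_getattr` and `demo` are made-up names for illustration, and this is not Galaxy's actual code.

```python
import importlib
import os
import shutil
import sys
import tempfile


def resolve_by_getattr(dotted_path):
    """Import the top-level package only, then walk the rest with getattr."""
    parts = dotted_path.split(".")
    module = __import__(parts.pop(0))
    for name in parts:
        # Raises AttributeError if the submodule was never imported.
        module = getattr(module, name)
    return module


def demo():
    """Return (failed_before_import, file_ext_after_import)."""
    root = tempfile.mkdtemp()
    pkg_dir = os.path.join(root, "demo_pkg")
    os.makedirs(pkg_dir)
    # The package __init__ is empty, so it does not import its submodule.
    open(os.path.join(pkg_dir, "__init__.py"), "w").close()
    with open(os.path.join(pkg_dir, "gbk.py"), "w") as handle:
        handle.write("class Genbank(object):\n    file_ext = 'gbk'\n")
    sys.path.insert(0, root)
    importlib.invalidate_caches()
    try:
        try:
            resolve_by_getattr("demo_pkg.gbk")
            failed_before_import = False
        except AttributeError:
            failed_before_import = True
        # Plays the role of adding "import gbk" at the top of registry.py.
        importlib.import_module("demo_pkg.gbk")
        module = resolve_by_getattr("demo_pkg.gbk")
        return failed_before_import, module.Genbank.file_ext
    finally:
        # Clean up so the demo can be run again from a fresh state.
        sys.path.remove(root)
        sys.modules.pop("demo_pkg.gbk", None)
        sys.modules.pop("demo_pkg", None)
        shutil.rmtree(root, ignore_errors=True)


if __name__ == "__main__":
    print(demo())
```

The lookup fails until the submodule has been imported once, after which the same getattr walk succeeds, which is consistent with the explicit import making the error go away.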

Thanks to John Chilton for answering this on IRC.

Cheers,
Eric

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)

iQIcBAEBAgAGBQJTx+w/AAoJEMqDXdrsMcpVXqgP/0IaiufJwoP5gKS1suQ8fLJz
U/V/9ysgZsW0NUfZR7sCuPP/h6x+HlhRM41IweoYwqDI5qJHClrDIHahYNM4rJ76
OyP1qgpQdlZE8R/kveKRUIEh1YpzAHsIZlFUAnuuFrEeJN2QGrmffsuDEQ/E5AoS
tvLxcFrJ1gY45KhfhUr9OLgsTX1pt30jlgswzlG7I6ii3hmWgex/EKh+Xf0CRJHD
fIS0qc3RNzrxvUmfFtXlFLn6WM/ZbJyLMB4qE8B2S2hLvIQa14KlsziCs9n13GtW
qr0o+6E05LpqbKYCFvINEbasyxjVpFKoccRYWsZNu8UP3taiyw14COTgqvlnyXJQ
QlM7a8NlmG2wnOpuwY2uEnqbAKeaUbtawz0jIlRGbVs4x7TkC/O8UrL8VTcqOt+0
s5Ix2Rf5qevt5jKIvLxHxjwXvP3mP8gZSWJjMG31Kq3vQjErNn/bczb3WKgfVCW7
h39bjt0nALam5bLcHcCvzS39/ea0M7NlvJqUA1b/a/ViqIxru3IPL927fWsvACe/
1Cfep6gFc/tmJHZM8hZEtgiOnh8pqkGOiuEevM4NAaBLWsrT1a/oijq9xQEyoG2+
vEyFmzGF1DmqELRdh97AD7MWqlZSB3V+TfsuEboF+67sB1p0MLv5HQthtP63k9eH
0xstC2V6X0LHoTMUDiwa
=ClA/
-----END PGP SIGNATURE-----
___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/

Re: writing datatypes

John Chilton-4
In reply to this post by Greg Von Kuster

Even more out of office than normal, so maybe I don't have the throughput to process this, but it sounds like it won't work then. If the new types aren't going to be loaded, then we cannot evolve the datatypes with new functionality in new repositories.

Perhaps I am missing something, but in the abstract it seems Galaxy would have no way of knowing which types are new and which types are old in this scenario.

Not really my business, so obviously feel free to proceed however you like; just letting you know that it makes me nervous. I will try to find the time to understand how the tool shed handles data types so I can speak with less ignorance in the future.

-John



Re: writing datatypes

Björn Grüning-3
Hi,

I think you are right, John. As far as I can tell from a few bug
reports, datatypes have many issues in that regard. IMHO datatypes
should be handled like "Tool dependency definitions". There should be
only one "installable revision".

But that aside, emboss datatypes are already broken. For example asn1
was added into Galaxy but it still exists in emboss_datatypes.

Moreover, how do we add a proper genbank datatype with sniffer, split
and merge functions? Ideally, every datatype should have its own
repository, but that is an overhead I would like to avoid ... any other
ideas?

I would love to discuss that issue further, maybe a hangout with Greg
and Peter?

Thanks John for your input,
Bjoern
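
For the sniffer, split and merge question above, here is a minimal standalone Python sketch of what those three operations could look like for GenBank text. It assumes each record starts with LOCUS and ends with a line containing only //. The function names (sniff_genbank, split_genbank, merge_genbank) are hypothetical, and the sketch deliberately does not use or mirror Galaxy's datatype API.

```python
def sniff_genbank(text):
    """Return True if the text looks like it starts with a GenBank record."""
    return text.startswith("LOCUS")


def split_genbank(text):
    """Split multi-record GenBank text into one string per record.

    Assumption: a record ends at a line that contains only '//'.
    """
    records, current = [], []
    for line in text.splitlines(keepends=True):
        current.append(line)
        if line.rstrip() == "//":
            records.append("".join(current))
            current = []
    # Keep any trailing, unterminated record rather than dropping it.
    if "".join(current).strip():
        records.append("".join(current))
    return records


def merge_genbank(records):
    """Concatenate records, making sure each one ends with a newline."""
    return "".join(r if r.endswith("\n") else r + "\n" for r in records)
```

In a real datatype these would presumably become methods on the class, but the record-boundary logic would stay roughly the same.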


Re: writing datatypes

Peter Cock
In reply to this post by Eric Rasche
On Thu, Jul 17, 2014 at 4:31 PM, Eric Rasche <[hidden email]> wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> For those reading this thread from the future, there's a secret to
> adding completely new datatypes locally (and not through a toolshed).
>
> You have to manually edit lib/galaxy/datatypes/registry.py and import
> the module you've written at the top of the file.
>
> For instance, if you add a new "gbk.py" datatype, you'll need to add
> "import gbk" to the top of registry.py. This will cause your errors to
> go away and your datatype to be loaded on startup.
>
> Thanks to John Chilton for answering this on IRC.
>
> Cheers,
> Eric

Indeed - sorry I hadn't spotted that complication.

The README files for these datatype extensions may help:

https://github.com/peterjc/galaxy_blast/tree/master/datatypes/blast_datatypes
https://github.com/peterjc/pico_galaxy/tree/master/datatypes/mira_datatypes

I have to do this manually with some sed magic in my TravisCI
automated test setup, see:

http://blastedbio.blogspot.co.uk/2013/09/using-travis-ci-for-testing-galaxy-tools.html

Peter
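
The manual edit described in this message (adding an import for the new module near the top of registry.py) can also be scripted. The following is a rough Python sketch of that idea, not the sed command from the blog post: ensure_import is a hypothetical helper that works on the file's text, and it assumes the imports are simple one-line "import x" or "from x import y" statements at the top level.

```python
def ensure_import(source, module_name):
    """Return source with 'import <module_name>' placed after the last
    top-level import line. If the statement is already present, the
    source is returned unchanged, so the edit is safe to repeat."""
    statement = "import %s" % module_name
    lines = source.splitlines()
    if statement in (line.strip() for line in lines):
        return source
    last_import = -1
    for index, line in enumerate(lines):
        # Only unindented lines count as top-level imports.
        if line.startswith(("import ", "from ")):
            last_import = index
    lines.insert(last_import + 1, statement)
    return "\n".join(lines) + "\n"
```

A CI job could read registry.py, pass its contents through ensure_import(text, "gbk"), and write the result back before starting Galaxy.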

Re: writing datatypes

Peter Cock
In reply to this post by Björn Grüning-3
On Thu, Jul 17, 2014 at 5:45 PM, Björn Grüning
<[hidden email]> wrote:

> Hi,
>
> I think you are right, John. As far as I can tell from a few bug reports,
> datatypes have many issues in that regard. IMHO datatypes should be handled
> like "Tool dependency definitions". There should be only one "installable
> revision".
>
> But that aside, emboss datatypes are already broken. For example asn1 was
> added into Galaxy but it still exists in emboss_datatypes.
>
> Moreover, how do we add a proper genbank datatype with sniffer, split and
> merge functions? Ideally, every datatype should have its own repository, but
> that is an overhead I would like to avoid ... any other ideas?
>
> I would love to discuss that issue further, maybe a hangout with Greg and
> Peter?
>
> Thanks John for your input,
> Bjoern

This could be high level, e.g. "other sequence file formats" repository
covering GenBank, EMBL, SwissProt plain text, UniProt XML, etc;
one for multiple sequence alignments; one for EMBOSS' own output...

But it wouldn't be that much more work to do one ToolShed repo
per additional file format, would it?

One reason I have been meaning to do some of these is familiarity with
many of these formats from looking after/writing parsers in Biopython.

Having this done sooner rather than later ought to head off too many
incompatible datatype names which worries me. Is it too late to adopt
something like the EDAM ontology for the datatypes within Galaxy?

Peter


Re: writing datatypes

Eric Rasche
In reply to this post by Peter Cock
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Not a problem Peter, it's a somewhat subtle bug to have, and there isn't
a lot of documentation on the wiki about writing new datatypes (though I
plan to fix that soon).

That particular error message could stand to be a bit more explicit.
(e.g., "Did you forget to add import mylib to registry.py?").
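
As an illustration of what a more explicit message could look like, here is a small Python sketch of loading a "module:Class" string and reporting which step failed. It is not Galaxy's registry code: load_datatype_class and the wording of the hints are assumptions made for this example, and it uses the standard importlib module.

```python
import importlib


def load_datatype_class(dotted):
    """Resolve a 'package.module:ClassName' string to a class,
    with error messages that say which part could not be found."""
    module_path, _, class_name = dotted.partition(":")
    try:
        module = importlib.import_module(module_path)
    except ImportError as exc:
        raise ImportError(
            "Could not import datatype module %r (%s). Does the file exist, "
            "and is it imported where the registry expects it?"
            % (module_path, exc)
        ) from exc
    try:
        return getattr(module, class_name)
    except AttributeError:
        raise AttributeError(
            "Module %r was imported, but it defines no class named %r. "
            "Check the part after ':' in the datatype's type attribute."
            % (module_path, class_name)
        ) from None
```

Separating the two failure cases means a missing module and a misspelled class name no longer produce the same generic AttributeError.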

Also, thanks for sharing the blog post. Since we develop all of our
tools internally, I may adapt and publish your post with similar
instructions for Jenkins, if that's all right by you.

Cheers,
Eric

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)

iQIcBAEBAgAGBQJTx//oAAoJEMqDXdrsMcpVY40P/2KI2RniuGgsl7w0Mt3by4wP
XIWAsRRYUL/I4pTqEgtg3/aMn/9J2PFfPTvJMJbwCboT7Bn/4q0vc4qW7MDPSsjR
1V1XZ/5dEi0Q/gjXQYZmib2uSBgnRR58XR8/ae2UUKDINJv2BsToIB7Z60bB2XAI
a/b7qLXgq37NOFaZmBsqCse1yf7D9qD20Gf3c2uNYRPdARbkTVNMfjNoCzbNkMiJ
QyPt0c7ZetrKUseEgKoBa4EtO/y8uU7EHdYo2WxtmymZFdeIzTit9XKk/l6V0p2G
pqwcc504r0AsKA46/5BY5g9MpboEk36CRG0u+CG3vWv958MKxKMblKYE7qexqq9p
6UrsdxvHohX4IlTMU4GEwCMvks+jn2JwMqYGUOpk8yQLkTALxRUfJcheN3RtMvfF
jRT2xzUm0s3dwKCHX5v7dePYIYLRvpig8CwRtL2FQZTntxJh2FAvwnL6ViUi/jGL
+FYjfGFDMRvqqY81nAqUh7dfjEOVf8J5lTAL2YTzZ8y8sLDtZNaeCdNj+4IUOYJT
5QEDpKH/TR4W8MnlmE5gLFZC0Yf0v951pikjMR+rI2mYVf1uYT1UVeWpPT2JZXdw
gbNOt/Gu9gcK2GTAmd223bCy3zPZGkVW3JVJlTo1wiyx7Bx3umQGLQEDu3aGpOEm
b2DJ01ovMrEsr9X83v9i
=TZls
-----END PGP SIGNATURE-----

Re: writing datatypes

Peter Cock
On Thu, Jul 17, 2014 at 5:55 PM, Eric Rasche <[hidden email]> wrote:

>
> Not a problem Peter, it's a somewhat subtle bug to have, and there isn't
> a lot of documentation on the wiki about writing new datatypes (though I
> plan to fix that soon).
>
> That particular error message could stand to be a bit more explicit.
> (e.g., "Did you forget to add import mylib to registry.py?").
>
> Also, thanks for sharing the blog post. Since we develop all of our
> tools internally, I may adapt and publish your post with similar
> instructions for Jenkins, if that's all right by you.
>
> Cheers,
> Eric

Please do :)

Peter

P.S. I know Saket is using this approach too now:
https://github.com/saketkc/galaxy_tools

Re: writing datatypes

Björn Grüning-3
In reply to this post by Peter Cock


On 17.07.2014 18:51, Peter Cock wrote:

> On Thu, Jul 17, 2014 at 5:45 PM, Björn Grüning
> <[hidden email]> wrote:
>> Hi,
>>
>> I think you are right, John. As far as I can tell from a few bug reports,
>> datatypes have many issues in that regard. IMHO datatypes should be handled
>> like "Tool dependency definitions". There should be only one "installable
>> revision".
>>
>> But that aside, emboss datatypes are already broken. For example asn1 was
>> added into Galaxy but it still exists in emboss_datatypes.
>>
>> Moreover, how do we add a proper genbank datatype with sniffer, split and
>> merge functions? Ideally, every datatype should have its own repository, but
>> that is an overhead I would like to avoid ... any other ideas?
>>
>> I would love to discuss that issue further, maybe a hangout with Greg and
>> Peter?
>>
>> Thanks John for your input,
>> Bjoern
>
> This could be high level, e.g. "other sequence file formats" repository
> covering GenBank, EMBL, SwissProt plain text, UniProt XML, etc;
> one for multiple sequence alignments; one for EMBOSS' own output...

That was my initial idea. Starting point is here:
https://github.com/bgruening/galaxytools/tree/master/datatypes

> But it wouldn't be that much more work to do one ToolShed repo
> per additional file format, would it?

Uploading and creating descriptions in the toolshed will take most of
the time :)
Let's see if I can use a train trip to do that ... but the problem will
stay the same ... one repository can have multiple versions ...

> One reason I have been meaning to do some of these is familiarity with
> many of these formats from looking after/writing parsers in Biopython.
>
> Having this done sooner rather than later ought to head off too many
> incompatible datatype names which worries me. Is it too late to adopt
> something like the EDAM ontology for the datatypes within Galaxy?
>
> Peter
>

Re: writing datatypes

Eric Rasche
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1



On 07/17/2014 12:10 PM, Björn Grüning wrote:

>
>
> On 17.07.2014 18:51, Peter Cock wrote:
>> On Thu, Jul 17, 2014 at 5:45 PM, Björn Grüning
>> <[hidden email]> wrote:
>>> Hi,
>>>
>>> I think you are right, John. As far as I can tell from a few bug
>>> reports, datatypes have many issues in that regard. IMHO datatypes
>>> should be handled like "Tool dependency definitions". There should be
>>> only one "installable revision".
>>>
>>> But that aside, emboss datatypes are already broken. For example asn1
>>> was added into Galaxy but it still exists in emboss_datatypes.
>>>
>>> Moreover, how do we add a proper genbank datatype with sniffer, split
>>> and merge functions? Ideally, every datatype should have its own
>>> repository, but that is an overhead I would like to avoid ... any
>>> other ideas?

We could use something like what I do, CI scripts and hidden .yaml files
to manage which folders get pushed to which toolshed repositories and
when. My initial version of that blindly updates things when there are
changes, but I'm working to add support for things like "create a new
versioned toolshed repository on major version # changes".

That would remove a lot of the overhead for maintaining that many
repositories.
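
To make the "new versioned repository on major version changes" rule concrete, here is a tiny Python sketch of the decision. It is only an illustration under stated assumptions: plan_push is a hypothetical function, versions are assumed to be dotted strings with a numeric first component, and the "<name>_<major>" naming scheme for new repositories is invented for the example.

```python
def major(version):
    """Return the major component of a dotted version string, e.g. '2.0.1' -> 2."""
    return int(version.split(".")[0])


def plan_push(repo_name, old_version, new_version):
    """Decide whether a change updates the existing toolshed repository
    or should go to a newly created, versioned one."""
    if major(new_version) > major(old_version):
        # Hypothetical naming scheme: append the new major version.
        return ("create", "%s_%d" % (repo_name, major(new_version)))
    return ("update", repo_name)
```

A CI script could call this per folder listed in the .yaml file and then act on the returned ("create" or "update", repository name) pair.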

>>>
>>> I would love to discuss that issue further, maybe a hangout with Greg
>>> and Peter?
>>>
>>> Thanks John for your input,
>>> Bjoern
>>
>> This could be high level, e.g. "other sequence file formats" repository
>> covering GenBank, EMBL, SwissProt plain text, UniProt XML, etc;
>> one for multiple sequence alignments; one for EMBOSS' own output...
>
> That was my initial idea. Starting point is here:
> https://github.com/bgruening/galaxytools/tree/master/datatypes
>
>> But it wouldn't be that much more work to do one ToolShed repo
>> per additional file format, would it?
>
> Uploading and creating descriptions in the toolshed will take most of
> the time :)
> Let's see if I can use a train trip to do that ... but the problem will
> stay the same ... one repository can have multiple versions ...

And how do we solve that? You're right, datatypes shouldn't have multiple
revisions, since the file format should not be changing. I don't have an
answer for this either, unfortunately :/

>
>> One reason I have been meaning to do some of these is familiarity with
>> many of these formats from looking after/writing parsers in Biopython.

Peter, similar case here with BioPerl. All of my tools can output the
full range of Bio::SeqIO output formats, so having datatypes would be
great. Happy to contribute there.

>>
>> Having this done sooner rather than later ought to head off too many
>> incompatible datatype names which worries me. Is it too late to adopt
>> something like the EDAM ontology for the datatypes within Galaxy?
>>
>> Peter
>>


- --
Eric Rasche
Programmer II
Center for Phage Technology
Texas A&M University
College Station, TX 77843
404-692-2048
[hidden email]
[hidden email]
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)

iQIcBAEBAgAGBQJTyAfbAAoJEMqDXdrsMcpVcFUQAJMQMyZ7eDM3fDhppOHjPgxU
16hpuQ14MW2UqZsAl4V0H8R+1C1xnBIH1rErUPfvaloEAVk6FWogDY5L79XHz6b5
6G7UkDM+7K+zKb6pDyVynm8Kx5Kg+D7gHtu0R2HTFxYGRhVbuldskKJfp9g8aziP
NPVALTLUi+hotzsNSJpP8rBct6WYWNNIM3o1TIKLVVsQfrhlTfYXuYF8Xb0n8GTs
Tf3ad6ZIY7BJTftGdlzE0O3ZPgXe5J/cb9RCyzTN69R6uKUIhg1XaOGHlA+JubbG
161e9fiuNzFF54bmQZYCIZTR9YBPF7aRjRQJcRVjBvTaQ3NbTmUdzvhW1fLT9Yuv
8WPVKIyB0lWECVx85fuSGE1PH7rwJZATO0bkHgsxqUT2TI7TFy0HWl6hJaPolP5/
1u3uvvsBu4aDiBK9uI+fzkqn+fu4D+A8GwllL0sOsyNcDlbjBUXWfYA0xVI41+m1
PFeQ6MRHf332kY/iqhnX5GJfzQIp0KHmEwpDTzwa9SkDSnZm7SLhZi46vFZpQAgR
AvBObz8ztstZP9yRwNF1cXYIap+tFQ0vKa9uqNTeC3sTWwypsK5SKl1jCfHUI71T
saxqNuML+G+uJiVPaFmeh19eVrHAPSR1oQLYl0fC2X4Qt9Jw2/Tgj8cEl08Cj3NO
LAMs0NIOwRhkJ556uA/P
=JeRi
-----END PGP SIGNATURE-----

Re: writing datatypes

Peter Cock
In reply to this post by Björn Grüning-3
On Thu, Jul 17, 2014 at 6:10 PM, Björn Grüning
<[hidden email]> wrote:
>
> ... but the problem will stay the same ... one [datatype definition] repository
> can have multiple versions ...
>

I like your idea that, like tool dependency definitions, this should be a
special repository type on the ToolShed:

Earlier, Björn Grüning <[hidden email]> wrote:
>
> IMHO datatypes should be handled like "Tool dependency definitions".
> There should be only one "installable revision".
>

This is something Greg will have to comment on - there may be
ramifications I'm not seeing.

Peter
