loading with Auto detect and fastqsanger options issue!

loading with Auto detect and fastqsanger options issue!

Hakeem Almabrazi-2

Hi,

I have installed Galaxy as a cluster in my local environment. I have a few fastq.gz files on my file system that I want to link to Galaxy instead of loading them. When I select “Auto Detect” during loading, I notice two things.

First, linking the files takes much longer with the Auto Detect option than when I choose fastqsanger as the file type. It takes less than a second if I specify the type, i.e. fastqsanger. Why is that? Is there a way to speed up Auto Detect?

Second, when linking finishes with the Auto Detect option, the datatype shown for these files is “data” rather than fastq.

Does this mean my fastq.gz files are not recognized as fastq files?

To validate my fastq files, I tried the FastqValidator tool, but I get an error saying “BGZF EOF marker is missing”. How can I fix these files?
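A note on the “BGZF EOF marker is missing” message: BGZF is the blocked gzip variant written by tools such as bgzip, and a complete BGZF file ends with a fixed 28-byte empty block. A file compressed with plain gzip does not carry that block, so the message can appear for an intact plain-gzip file as well as for a truncated BGZF one. The sketch below is a minimal way to tell the two cases apart, not the validator's own check; `has_bgzf_eof` and `gzip_is_intact` are hypothetical helper names introduced here for illustration.

```python
import gzip
import os
import zlib

# The 28-byte empty BGZF block that terminates a complete BGZF file.
BGZF_EOF = bytes.fromhex(
    "1f8b0804" "00000000" "00ff" "0600"
    "4243" "0200" "1b00" "0300"
    "00000000" "00000000"
)


def has_bgzf_eof(path):
    """Return True if the last 28 bytes of the file are the BGZF EOF block."""
    size = os.path.getsize(path)
    if size < len(BGZF_EOF):
        return False
    with open(path, "rb") as raw:
        raw.seek(size - len(BGZF_EOF))
        return raw.read() == BGZF_EOF


def gzip_is_intact(path, chunk_size=1 << 20):
    """Return True if the whole file decompresses without error."""
    try:
        with gzip.open(path, "rb") as stream:
            # Read to the end so that truncation or a bad checksum is detected.
            while stream.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        return False
    return True
```

If `gzip_is_intact` returns True and `has_bgzf_eof` returns False, the file is most likely a healthy plain-gzip file rather than a damaged one. In that case, recompressing it with a BGZF writer such as bgzip is one thing to try; whether the validator then accepts it is an assumption, not something confirmed in this thread.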

 

I appreciate any kind of help.

Regards,

Hak

 

 

Disclaimer: This email and its attachments may be confidential and are intended solely for the use of the individual to whom it is addressed. If you are not the intended recipient, any reading, printing, storage, disclosure, copying or any other action taken in respect of this e-mail is prohibited and may be unlawful. If you are not the intended recipient, please notify the sender immediately by using the reply function and then permanently delete what you have received. Any views or opinions expressed are solely those of the author and do not necessarily represent those of Sidra Medical and Research Center.
___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  https://lists.galaxyproject.org/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/

Re: loading with Auto detect and fastqsanger options issue!

Martin Vickers [mjv08]
Hi Hak,

I noticed the same thing when linking fastq.gz files into a data library. Auto-detect was reading the whole file because the type was unknown, presumably looking for some characteristic by which to classify it. To resolve this, I added the archive data format datatype from the Tool Shed, which includes fastq.gz. After that, auto-detect found the correct type instantly. Finally, I added the gunzip tool to Galaxy, and it works well.
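To illustrate why a recognised compressed type can be detected almost instantly: a sniffer that knows about gzip only needs the first few bytes and lines of a file, not the whole thing. The sketch below is an illustration under that assumption and is not Galaxy's actual sniffing code; `looks_like_gzipped_fastq` is a hypothetical helper. It checks the two-byte gzip magic number and then decompresses only the first four lines to see whether they form a FASTQ record.

```python
import gzip
import zlib

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def looks_like_gzipped_fastq(path):
    """Return True if the file is gzip and its first record looks like FASTQ."""
    # Cheap check first: read only two bytes from disk.
    with open(path, "rb") as raw:
        if raw.read(2) != GZIP_MAGIC:
            return False
    try:
        # Decompress just enough to get the first four lines.
        with gzip.open(path, "rt", encoding="ascii", errors="replace") as text:
            lines = [text.readline().rstrip("\r\n") for _ in range(4)]
    except (OSError, EOFError, zlib.error):
        return False
    header, sequence, separator, quality = lines
    # A FASTQ record: "@" header, sequence, "+" separator, equal-length qualities.
    return (
        header.startswith("@")
        and separator.startswith("+")
        and len(sequence) > 0
        and len(sequence) == len(quality)
    )
```

The check does not distinguish quality encodings (for example fastqsanger versus other FASTQ variants); it only shows that the decision between "gzipped FASTQ" and "unknown data" can be made from the start of the file.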

There is also a Trello feature request asking for this to be built in, so if you're interested you may wish to vote for it. I will forward you the link when I'm back at a computer.

Hope that helps,

Martin 

Sent on the move.


Re: loading with Auto detect and fastqsanger options issue!

Hakeem Almabrazi-2

Thank you, Martin, for your reply.

I will give it a try. Could you please send me the name of the tool you are referring to?

Also, I hope others will be able to share their insights on the other concerns in my post.

Regards,

 


Re: loading with Auto detect and fastqsanger options issue!

Martin Vickers [mjv08]


Hi Hak,

The tool is called archive_datatypes, written by cmonjeau, and is available in the Galaxy Main Tool Shed.

Here is the Trello card to vote for support of compressed versions of standard datatypes:

https://trello.com/c/3RkTDnIn/345-666-support-gzipped-gz-compressed-versions-of-standard-datatypes

It was also discussed here:

https://biostar.usegalaxy.org/p/13076/

Cheers,

Martin

--
Dr. Martin Vickers

Data Manager/HPC Systems Administrator
Institute of Biological, Environmental and Rural Sciences
IBERS New Building
Aberystwyth University

w: http://www.martin-vickers.co.uk/
e: [hidden email]
t: 01970 62 2807