Output Excel file

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

Output Excel file

Patrick Leyshock
Hello,

I am trying to write a Galaxy tool that will output an Excel file. 

Currently the tool wrapper calls an R script, which uses the "xlsx" package to read and write to and from Excel files.

After being invoked by Galaxy, the script is able to successfully read an input Excel file:

    suppressPackageStartupMessages(library(xlsx, quietly=TRUE));
    raw.data <- read.xlsx(commandArgs(trailingOnly=TRUE)[1]), sheetName="input_records");
   
The script then does its work on the data just read in.  Then, when the work is done, I'd like to output the results to an Excel file.  Here's where I run into a problem.  I first tried to output the result like this:

    write.xlsx(processed.data, file=commandArgs(trailingOnly=TRUE)[2]);

but write.xlsx throws an error.  Looks like  ".xls" and ".xlsx" are the only legal file extensions for the write.xlsx() function.  Inspecting commandArgs(trailingOnly=TRUE)[2] shows that Galaxy provided a filename extension of ".dat". 

I tried a workaround using this:

    write.xlsx(processed.data, file=paste(commandArgs(trailingOnly=TRUE)[2]), ".xls", sep="");
   
The write.xlsx function no longer throws an error (since the file name supplied as a parameter has an acceptable file extension) but now Galaxy won't display the result in the History.  If I look in Galaxy's database I see two files there corresponding to my tool's output.  Supposing that Galaxy assigned the result the name "dataset_87", then looking in the Galaxy database I see:

    dataset_87.dat
    dataset_87.dat.xls
   
"dataset_87.dat" is empty but visible to Galaxy (and so displayed as an empty dataset in the History window). "dataset_87.dat.xls" has the results I want but isn't displayed by Galaxy in the History window.

There might be R libraries that can write xls or xlsx files without requiring a ".xls" or ".xlsx" file extension.  That's a path I'm willing to explore.  That said, is there a configuration option I can set that'll let me continue to use write.xlsx()?

I've been working from the suggestions made on this helpful thread:

http://lists.bx.psu.edu/pipermail/galaxy-dev/2011-December/007807.html

so the relevant entries in datatypes_conf.xml are:

    <datatype extension="xls" type="galaxy.datatypes.binary.Xls" display_in_upload="true" />
    <datatype extension="xlsx" type="galaxy.datatypes.binary.Xlsx" display_in_upload="true" />
   
and the relevant additions to binary.py are:

    class Xls(Binary):
        '''Class describing an Excel 2003 (xls) file'''

        file_ext='xls'
       
    class Xlsx(Binary):
        '''Class describing an Excel 2007 (xlsx) file'''

        file_ext='xlsx'       
   
Any suggestions appreciated.

Thanks, Patrick


___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/
Reply | Threaded
Open this post in threaded view
|

Re: Output Excel file

Björn Grüning-3
Hi Patrick,

a nice workaround is the following:

Please create your file.xls file, but not in the path galaxy will
provide, use a temporary path, for example in the current working dir in
your wrapper. Than after your program has finished, just move the *.xls
file to the galaxy output path. Done. Like this.

myprogram.r -i $input -o foo.xls -p 4;
mv foo.xls $output;

Cheers,
Bjoern





Am 04.09.2014 um 23:27 schrieb Patrick Leyshock:

> Hello,
>
> I am trying to write a Galaxy tool that will output an Excel file.
>
> Currently the tool wrapper calls an R script, which uses the "xlsx" package
> to read and write to and from Excel files.
>
> After being invoked by Galaxy, the script is able to successfully read an
> input Excel file:
>
>      suppressPackageStartupMessages(library(xlsx, quietly=TRUE));
>      raw.data <- read.xlsx(commandArgs(trailingOnly=TRUE)[1]),
> sheetName="input_records");
>
> The script then does its work on the data just read in.  Then, when the
> work is done, I'd like to output the results to an Excel file.  Here's
> where I run into a problem.  I first tried to output the result like this:
>
>      write.xlsx(processed.data, file=commandArgs(trailingOnly=TRUE)[2]);
>
> but write.xlsx throws an error.  Looks like  ".xls" and ".xlsx" are the
> only legal file extensions for the write.xlsx() function.  Inspecting
> commandArgs(trailingOnly=TRUE)[2] shows that Galaxy provided a filename
> extension of ".dat".
>
> I tried a workaround using this:
>
>      write.xlsx(processed.data,
> file=paste(commandArgs(trailingOnly=TRUE)[2]), ".xls", sep="");
>
> The write.xlsx function no longer throws an error (since the file name
> supplied as a parameter has an acceptable file extension) but now Galaxy
> won't display the result in the History.  If I look in Galaxy's database I
> see two files there corresponding to my tool's output.  Supposing that
> Galaxy assigned the result the name "dataset_87", then looking in the
> Galaxy database I see:
>
>      dataset_87.dat
>      dataset_87.dat.xls
>
> "dataset_87.dat" is empty but visible to Galaxy (and so displayed as an
> empty dataset in the History window). "dataset_87.dat.xls" has the results
> I want but isn't displayed by Galaxy in the History window.
>
> There might be R libraries that can write xls or xlsx files without
> requiring a ".xls" or ".xlsx" file extension.  That's a path I'm willing to
> explore.  That said, is there a configuration option I can set that'll let
> me continue to use write.xlsx()?
>
> I've been working from the suggestions made on this helpful thread:
>
> http://lists.bx.psu.edu/pipermail/galaxy-dev/2011-December/007807.html
>
> so the relevant entries in datatypes_conf.xml are:
>
>      <datatype extension="xls" type="galaxy.datatypes.binary.Xls"
> display_in_upload="true" />
>      <datatype extension="xlsx" type="galaxy.datatypes.binary.Xlsx"
> display_in_upload="true" />
>
> and the relevant additions to binary.py are:
>
>      class Xls(Binary):
>          '''Class describing an Excel 2003 (xls) file'''
>
>          file_ext='xls'
>
>      class Xlsx(Binary):
>          '''Class describing an Excel 2007 (xlsx) file'''
>
>          file_ext='xlsx'
>
> Any suggestions appreciated.
>
> Thanks, Patrick
>
>
>
> ___________________________________________________________
> Please keep all replies on the list by using "reply all"
> in your mail client.  To manage your subscriptions to this
> and other Galaxy lists, please use the interface at:
>    http://lists.bx.psu.edu/
>
> To search Galaxy mailing lists use the unified search at:
>    http://galaxyproject.org/search/mailinglists/
>
___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/
Reply | Threaded
Open this post in threaded view
|

Re: Output Excel file

Patrick Leyshock-2
Bjorn, that did the trick, thank you for the suggestion.

Onwards!

Patrick

On Thu, Sep 4, 2014 at 2:41 PM, Björn Grüning <[hidden email]> wrote:
Hi Patrick,

a nice workaround is the following:

Please create your file.xls file, but not in the path galaxy will provide, use a temporary path, for example in the current working dir in your wrapper. Than after your program has finished, just move the *.xls file to the galaxy output path. Done. Like this.

myprogram.r -i $input -o foo.xls -p 4;
mv foo.xls $output;

Cheers,
Bjoern





Am 04.09.2014 um 23:27 schrieb Patrick Leyshock:
Hello,

I am trying to write a Galaxy tool that will output an Excel file.

Currently the tool wrapper calls an R script, which uses the "xlsx" package
to read and write to and from Excel files.

After being invoked by Galaxy, the script is able to successfully read an
input Excel file:

     suppressPackageStartupMessages(library(xlsx, quietly=TRUE));
     raw.data <- read.xlsx(commandArgs(trailingOnly=TRUE)[1]),
sheetName="input_records");

The script then does its work on the data just read in.  Then, when the
work is done, I'd like to output the results to an Excel file.  Here's
where I run into a problem.  I first tried to output the result like this:

     write.xlsx(processed.data, file=commandArgs(trailingOnly=TRUE)[2]);

but write.xlsx throws an error.  Looks like  ".xls" and ".xlsx" are the
only legal file extensions for the write.xlsx() function.  Inspecting
commandArgs(trailingOnly=TRUE)[2] shows that Galaxy provided a filename
extension of ".dat".

I tried a workaround using this:

     write.xlsx(processed.data,
file=paste(commandArgs(trailingOnly=TRUE)[2]), ".xls", sep="");

The write.xlsx function no longer throws an error (since the file name
supplied as a parameter has an acceptable file extension) but now Galaxy
won't display the result in the History.  If I look in Galaxy's database I
see two files there corresponding to my tool's output.  Supposing that
Galaxy assigned the result the name "dataset_87", then looking in the
Galaxy database I see:

     dataset_87.dat
     dataset_87.dat.xls

"dataset_87.dat" is empty but visible to Galaxy (and so displayed as an
empty dataset in the History window). "dataset_87.dat.xls" has the results
I want but isn't displayed by Galaxy in the History window.

There might be R libraries that can write xls or xlsx files without
requiring a ".xls" or ".xlsx" file extension.  That's a path I'm willing to
explore.  That said, is there a configuration option I can set that'll let
me continue to use write.xlsx()?

I've been working from the suggestions made on this helpful thread:

http://lists.bx.psu.edu/pipermail/galaxy-dev/2011-December/007807.html

so the relevant entries in datatypes_conf.xml are:

     <datatype extension="xls" type="galaxy.datatypes.binary.Xls"
display_in_upload="true" />
     <datatype extension="xlsx" type="galaxy.datatypes.binary.Xlsx"
display_in_upload="true" />

and the relevant additions to binary.py are:

     class Xls(Binary):
         '''Class describing an Excel 2003 (xls) file'''

         file_ext='xls'

     class Xlsx(Binary):
         '''Class describing an Excel 2007 (xlsx) file'''

         file_ext='xlsx'

Any suggestions appreciated.

Thanks, Patrick



___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
   http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
   http://galaxyproject.org/search/mailinglists/



___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/
Reply | Threaded
Open this post in threaded view
|

Re: Output Excel file

Patrick Leyshock
In reply to this post by Björn Grüning-3
Bjorn, that did the trick, thank you for the suggestion.

Onwards!

On Thu, Sep 4, 2014 at 2:41 PM, Björn Grüning <[hidden email]> wrote:
Hi Patrick,

a nice workaround is the following:

Please create your file.xls file, but not in the path galaxy will provide, use a temporary path, for example in the current working dir in your wrapper. Than after your program has finished, just move the *.xls file to the galaxy output path. Done. Like this.

myprogram.r -i $input -o foo.xls -p 4;
mv foo.xls $output;

Cheers,
Bjoern





Am 04.09.2014 um 23:27 schrieb Patrick Leyshock:
Hello,

I am trying to write a Galaxy tool that will output an Excel file.

Currently the tool wrapper calls an R script, which uses the "xlsx" package
to read and write to and from Excel files.

After being invoked by Galaxy, the script is able to successfully read an
input Excel file:

     suppressPackageStartupMessages(library(xlsx, quietly=TRUE));
     raw.data <- read.xlsx(commandArgs(trailingOnly=TRUE)[1]),
sheetName="input_records");

The script then does its work on the data just read in.  Then, when the
work is done, I'd like to output the results to an Excel file.  Here's
where I run into a problem.  I first tried to output the result like this:

     write.xlsx(processed.data, file=commandArgs(trailingOnly=TRUE)[2]);

but write.xlsx throws an error.  Looks like  ".xls" and ".xlsx" are the
only legal file extensions for the write.xlsx() function.  Inspecting
commandArgs(trailingOnly=TRUE)[2] shows that Galaxy provided a filename
extension of ".dat".

I tried a workaround using this:

     write.xlsx(processed.data,
file=paste(commandArgs(trailingOnly=TRUE)[2]), ".xls", sep="");

The write.xlsx function no longer throws an error (since the file name
supplied as a parameter has an acceptable file extension) but now Galaxy
won't display the result in the History.  If I look in Galaxy's database I
see two files there corresponding to my tool's output.  Supposing that
Galaxy assigned the result the name "dataset_87", then looking in the
Galaxy database I see:

     dataset_87.dat
     dataset_87.dat.xls

"dataset_87.dat" is empty but visible to Galaxy (and so displayed as an
empty dataset in the History window). "dataset_87.dat.xls" has the results
I want but isn't displayed by Galaxy in the History window.

There might be R libraries that can write xls or xlsx files without
requiring a ".xls" or ".xlsx" file extension.  That's a path I'm willing to
explore.  That said, is there a configuration option I can set that'll let
me continue to use write.xlsx()?

I've been working from the suggestions made on this helpful thread:

http://lists.bx.psu.edu/pipermail/galaxy-dev/2011-December/007807.html

so the relevant entries in datatypes_conf.xml are:

     <datatype extension="xls" type="galaxy.datatypes.binary.Xls"
display_in_upload="true" />
     <datatype extension="xlsx" type="galaxy.datatypes.binary.Xlsx"
display_in_upload="true" />

and the relevant additions to binary.py are:

     class Xls(Binary):
         '''Class describing an Excel 2003 (xls) file'''

         file_ext='xls'

     class Xlsx(Binary):
         '''Class describing an Excel 2007 (xlsx) file'''

         file_ext='xlsx'

Any suggestions appreciated.

Thanks, Patrick



___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
   http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
   http://galaxyproject.org/search/mailinglists/



___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/