New (reckless) tools

Assaf Gordon
Hello,

I'd like to share three tools that I've recently made.
One is semi-useful; the other two are completely reckless.

The first one is an improved "sort" tool, which allows sorting according
to multiple keys. It uses the GNU sort utility, and therefore doesn't
need a script file (except for the XML itself).

BEGIN SORT TOOL
---------------
<tool id="cshl_sort_tool" name="Sort">
   <command>sort
       #for $key in $sortkeys
        '-k ${key.column},${key.column}${key.order}${key.style}'
       #end for
    $input > $out_file1</command>
   <inputs>
     <param format="txt" name="input" type="data" label="Sort Query" />
        <repeat name="sortkeys" title="sort key">
            <param name="column" label="on column" type="data_column"
                        data_ref="input" accept_default="true" />
            <param name="order" type="select"
                          display="radio" label="in">
              <option value="r">Descending order</option>
              <option value="">Ascending order</option>
            </param>
            <param name="style" type="select"
                           display="radio" label="Flavor">
              <option value="n">Numerical sort</option>
              <option value="">Alphabetical sort</option>
            </param>
        </repeat>
   </inputs>
   <outputs>
     <data format="input" name="out_file1" metadata_source="input"/>
   </outputs>
</tool>
--------------
END SORT TOOL
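
To give an idea of what the tool ends up running, here is a minimal
sketch of a two-key sort done by hand. The file name and the data are
made up for illustration, and the keys are written in the conventional
"-k POS,POS" form rather than as the single quoted argument the template
produces. The first key sorts column 2 numerically in descending order
(the "n" and "r" suffixes), the second breaks ties on column 1
alphabetically.

```shell
# Hypothetical tab-separated dataset (not part of the original tool).
printf 'b\t10\nc\t2\na\t10\n' > demo_input.txt

# Key 1: column 2, numerical, descending.
# Key 2: column 1, alphabetical, ascending (used to break ties).
sorted=$(sort -k 2,2nr -k 1,1 demo_input.txt)

printf '%s\n' "$sorted"
```

Restricting each key to a single column (2,2 and not just 2) matters:
without the end position the key would run to the end of the line.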


The two other tools allow running AWK and SED directly from galaxy.
These should NEVER EVER be installed on a publicly accessible system (as
a matter of fact, these shouldn't be installed on any system...).

The reason I wrote these tools is that sometimes the galaxy tools just
can't do something which would be very simple to do if I just had
access to sed or awk.

This usually happens when biologists here work on galaxy, and get
'stuck' trying to do something which just can't be done with the current
galaxy tools.
Then they ask me to do it.
One way (for me) to do it is to SSH into the galaxy server, get the
dataset file, run sed/awk, create a new file, and manually upload it
back to galaxy. Another way is to run awk and sed from within galaxy...

I won't say this is the most elegant solution ever - but it gets the job
done.

Note about security:
With these tools - there is NONE. An AWK script can execute shell
commands (the 'system' function) and can read any local file (the
'getline' command).
A SED script can read files (the 'r' command) and write files (the 'w'
command).
The 'password' field is completely useless. The password is not hidden,
and the web browser will probably cache it and display it the next time
you access the tool's page.
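
To make the note above concrete, here is a small sketch with harmless
stand-ins for each of the four features. All the "demo_" file names are
made up for illustration. Note that none of these programs contains a
single quote, so a check that only rejects single quotes would let them
all through.

```shell
# Hypothetical scratch files standing in for a private file and a dataset.
printf 'secret\n' > demo_secret.txt
printf 'row1\n' > demo_data.txt

# awk: system() hands any string to the shell.
awk_cmd=$(awk 'BEGIN { system("echo ran-a-shell-command") }')

# awk: getline can read a file that has nothing to do with the input.
awk_read=$(awk 'BEGIN { while ((getline line < "demo_secret.txt") > 0) print line }')

# sed: the r command appends the contents of another file to the output.
sed_read=$(sed 'r demo_secret.txt' demo_data.txt)

# sed: the w command writes each line to a file of the script's choosing.
sed -n 'w demo_written.txt' demo_data.txt

printf '%s\n' "$awk_cmd" "$awk_read" "$sed_read"
cat demo_written.txt
```

Replace the echo with anything else the galaxy user account is allowed
to run, and the file names with anything it is allowed to read or
write, and you get the picture.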


BEGIN AWK TOOL
--------------
<tool id="cshl_awk_tool" name="awk">
   <command>awk '$url_paste' OFS="\t" $input > $output </command>
   <inputs>
     <param format="txt" name="input" type="data" label="File to process" />
     <!-- Note: the parameter name MUST BE 'url_paste' -
          This is a hack in the galaxy library (see
           ./lib/galaxy/util/__init__.py line 142)
         If the name is 'url_paste' the string won't be sanitized,
           and all the non-alphanumeric characters
         will be passed to the shell script -->
     <param name="url_paste" type="text" area="true"
              size="5x35" label="AWK Program" help="">
      <validator type="expression"
     message="Invalid Program!">value.find('\'')==-1</validator>
     </param>

     <!-- Enable this parameter for a completely insecure
         password mechanism. You can also change the validator
          type to use an external file -->
     <param name="passwd" type="text" area="false"
               label="Secret password">
         <validator type="regex"
           message="Oops! wrong password.">^awk$</validator>
     </param>

   </inputs>
   <outputs>
     <data format="txt" name="output" metadata_source="input" />
   </outputs>
</tool>
--------------
END AWK TOOL
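
As an example of the kind of one-liner this is meant for, here is a
sketch of the same command line run by hand. The input file and the
program are invented for illustration: the program prints the first
column and the difference between the third and second columns. Since
OFS="\t" is given as an operand before the input file, awk applies it
before reading that file (but after any BEGIN block), so the output is
tab-separated even though the input is separated by spaces.

```shell
# Hypothetical whitespace-separated intervals: name, start, end.
printf 'chr1 100 250\nchr2 40 90\n' > demo_intervals.txt

# Same shape as the tool's command: awk 'PROGRAM' OFS="\t" input
awk_out=$(awk '{ print $1, $3 - $2 }' OFS="\t" demo_intervals.txt)

printf '%s\n' "$awk_out"
```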



BEGIN SED TOOL
--------------
<tool id="cshl_sed_tool" name="sed">
   <command>sed '$url_paste' $input > $output </command>
   <inputs>
     <param format="txt" name="input" type="data" label="File to process" />

     <!-- Note: the parameter name MUST BE 'url_paste' -
          This is a hack in the galaxy library (see
          ./lib/galaxy/util/__init__.py line 142)
         If the name is 'url_paste' the string won't be sanitized,
           and all the non-alphanumeric characters
         will be passed to the shell script -->
     <param name="url_paste" type="text" area="true"
           size="5x35" label="SED Program" help="">
      <validator type="expression"
       message="Invalid Program!">value.find('\'')==-1</validator>
     </param>

     <!-- Enable this parameter for a completely insecure
          password mechanism. You can also change the validator type
           to use an external file -->
     <param name="passwd" type="text" area="false"
             label="Secret password">
         <validator type="regex"
             message="Oops! wrong password.">^sed$</validator>
     </param>
   </inputs>
   <outputs>
     <data format="txt" name="output" metadata_source="input" />
   </outputs>
</tool>
--------------
END SED TOOL
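
A typical use would be a quick substitution, sketched here by hand with
an invented input file and program: strip a leading "chr" from every
line. The program has no single quote in it, so it would fit inside the
quotes that the tool's command line puts around it.

```shell
# Hypothetical dataset with chromosome names in the first column.
printf 'chr1\t100\nchrX\t200\n' > demo_chroms.txt

# Same shape as the tool's command: sed 'PROGRAM' input
sed_out=$(sed 's/^chr//' demo_chroms.txt)

printf '%s\n' "$sed_out"
```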

The above XML snippets are free for use under the same license as galaxy
(the LICENSE.txt file).


-Gordon.