
                                cutgextract 
                                      
   
   
Function

   Extract data from CUTG
   
Description

   Given the name of a directory containing the CUTG database
   (ftp://ftp.ebi.ac.uk/pub/databases/cutg), cutgextract calculates codon
   usage tables for individual species (e.g. EHomo_sapiens.cut) and
   places them in the CODONS subdirectory of the EMBOSS data directory.
   This is an all-or-nothing extraction: it creates many files and takes
   several minutes. Each usage table is the sum of codons over all
   sequences for that organism.
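   
   The per-organism summation can be illustrated with a minimal Python
   sketch. It is not the cutgextract implementation: the function name
   sum_codon_usage and the (organism, counts) record layout are
   assumptions made for illustration and do not reflect the CUTG file
   format.

```python
from collections import Counter, defaultdict


def sum_codon_usage(records):
    """Sum codon counts over all sequences for each organism.

    `records` is a hypothetical iterable of (organism, {codon: count})
    pairs, one pair per sequence. It is not the CUTG file format.
    """
    totals = defaultdict(Counter)
    for organism, counts in records:
        # Counter.update adds the counts to any already stored.
        totals[organism].update(counts)
    return totals


# Made-up counts, used only to show the summation.
records = [
    ("Homo sapiens", {"ATG": 1, "GCC": 3}),
    ("Homo sapiens", {"ATG": 1, "TAA": 1}),
    ("Mus musculus", {"ATG": 2}),
]
totals = sum_codon_usage(records)
print(totals["Homo sapiens"]["ATG"])  # prints 2
```

   Each organism ends up with a single table, however many sequences
   contributed to it.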
   
   The EMBOSS distribution includes a set of codon usage tables. These
   tables are calculated from the files in
   ftp://ftp.ebi.ac.uk/pub/databases/codonusage/README, with a few
   additions whose exact derivation cannot easily be determined. Many
   people would prefer to create their own from the public CUTG data.
   
   Run cutgextract on a local copy of the CUTG database downloaded from
   ftp://ftp.ebi.ac.uk/pub/databases/cutg. Before running it, get all
   the required *.codon files from CUTG and uncompress any that are
   compressed.
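   
   If the downloaded files are compressed, they can be uncompressed
   with a few lines of Python. The sketch below assumes gzip
   compression and file names ending in ".codon.gz"; both are
   assumptions, as is the helper name gunzip_codon_files, so check how
   your copy of CUTG is actually packaged.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_codon_files(directory):
    """Decompress every *.codon.gz file in `directory`.

    The compressed originals are kept. Returns the paths written.
    The ".codon.gz" naming is an assumption, not a documented CUTG rule.
    """
    written = []
    for gz_path in sorted(Path(directory).glob("*.codon.gz")):
        out_path = gz_path.with_suffix("")  # drop the trailing ".gz"
        with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
        written.append(out_path)
    return written
```

   Calling gunzip_codon_files("../../data") would leave the *.codon
   files next to the compressed originals, ready for cutgextract.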
   
   Downloading the CUTG database and running cutgextract to create the
   codon usage table files from it would normally be done only once,
   when the EMBOSS package is being installed, or again when a new
   version of the CUTG database is released.
   
   Note that CUTG has a drawback: it has one table for each organism
   and makes no distinction between different gene populations.
   
Algorithm

   cutgextract looks in the specified directory and opens all the files
   with the extension '.codon'. These are all expected to be CUTG data
   files.
   
   It then parses out the codon usage data from these *.codon files and
   writes one file per species into the EMBOSS data/CODONS directory. The
   names of the files are derived from the species names in the CUTG
   files. These file names will be long (and therefore descriptive).
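   
   The two steps above (finding the '.codon' files and naming one
   output file per species) can be sketched in Python. This is an
   illustration only. The helper names are hypothetical, and the naming
   rule (an "E" prefix, underscores for spaces, a ".cut" suffix) is
   inferred from the example EHomo_sapiens.cut; the real program may
   treat other characters differently.

```python
from pathlib import Path


def find_codon_files(directory, pattern="*.codon"):
    """Return the files in `directory` that match `pattern`, sorted."""
    return sorted(Path(directory).glob(pattern))


def usage_table_name(species):
    """Derive an output file name from a species name.

    Assumed rule, inferred from the example EHomo_sapiens.cut:
    prefix "E", whitespace replaced by "_", suffix ".cut".
    """
    return "E" + "_".join(species.split()) + ".cut"


print(usage_table_name("Homo sapiens"))  # prints EHomo_sapiens.cut
```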
   
Usage

   Here is a sample session with cutgextract
   

% cutgextract 
Extract data from CUTG
CUTG directory [.]: ../../data
   
Command line arguments

   Standard (Mandatory) qualifiers:
  [-directory]         dirlist    CUTG directory

   Additional (Optional) qualifiers: (none)
   Advanced (Unprompted) qualifiers:
   -wildspec           string     Type of codon file

   Associated qualifiers: (none)
   General qualifiers:
   -auto                boolean    Turn off prompts
   -stdout              boolean    Write standard output
   -filter              boolean    Read standard input, write standard output
   -options             boolean    Prompt for standard and additional values
   -debug               boolean    Write debug output to program.dbg
   -verbose             boolean    Report some/full command line options
   -help                boolean    Report command line options. More
                                  information on associated and general
                                  qualifiers can be found with -help -verbose
   -warning             boolean    Report warnings
   -error               boolean    Report errors
   -fatal               boolean    Report fatal errors
   -die                 boolean    Report deaths
   

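Input file format

   cutgextract reads the CUTG data files with the extension '.codon'
   found in the specified directory.
   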
Output file format

   cutgextract writes a set of EMBOSS codon usage data files to the
   EMBOSS data/CODONS directory.
   
Output files for usage example

File: CODONS

Data files

   EMBOSS data files are distributed with the application and stored in
   the standard EMBOSS data directory, which is defined by the EMBOSS
   environment variable EMBOSS_DATA.
   
   To see the available EMBOSS data files, run:
   
% embossdata -showall

   To fetch one of the data files (for example 'Exxx.dat') into your
   current directory for you to inspect or modify, run:

% embossdata -fetch -file Exxx.dat

   Users can provide their own data files in their own directories.
   Project specific files can be put in the current directory, or for
   tidier directory listings in a subdirectory called ".embossdata".
   Files for all EMBOSS runs can be put in the user's home directory, or
   again in a subdirectory called ".embossdata".
   
   The directories are searched in the following order:
     * . (your current directory)
     * .embossdata (under your current directory)
     * ~/ (your home directory)
     * ~/.embossdata
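       
   The search order listed above can be mimicked with a short Python
   sketch. It covers only the four user directories in the list, not
   the standard EMBOSS data directory, and the function name
   find_data_file and its parameters are hypothetical.

```python
from pathlib import Path


def find_data_file(name, cwd=None, home=None):
    """Return the first copy of `name` found in the listed order.

    Order: current directory, ./.embossdata, home directory,
    ~/.embossdata. Returns None if no copy is found. `cwd` and `home`
    can be overridden, which is convenient for testing.
    """
    cwd = Path(cwd) if cwd is not None else Path.cwd()
    home = Path(home) if home is not None else Path.home()
    search_dirs = [cwd, cwd / ".embossdata", home, home / ".embossdata"]
    for directory in search_dirs:
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None
```

   Because the current directory comes first, a project-specific copy
   of a file such as Exxx.dat takes precedence over one kept in the
   home directory.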
       
Notes

   None.
   
References

   None.
   
Warnings

   None.
   
Diagnostic Error Messages

   None.
   
Exit status

   It always exits with status 0.
   
Known bugs

   None.
   
See also

   Program name     Description
   aaindexextract   Extract data from AAINDEX
   printsextract    Extract data from PRINTS
   prosextract      Builds the PROSITE motif database for patmatmotifs
                    to search
   rebaseextract    Extract data from REBASE
   tfextract        Extract data from TRANSFAC
   
Author(s)

   Alan Bleasby (ableasby  rfcgr.mrc.ac.uk)
   MRC Rosalind Franklin Centre for Genomics Research Wellcome Trust
   Genome Campus, Hinxton, Cambridge, CB10 1SB, UK
   
History

   Written (June 2001) - Alan Bleasby.
   
Target users

   This program is intended to be run by people maintaining the data
   associated with an installation of EMBOSS.
