
                                 prettyseq 
                                      
   
   
Function

   Output sequence with translated ranges
   
Description

   This writes out a nicely formatted display of the sequence with the
   translation (within specified ranges) displayed beneath it.
   
   The translated nucleic acid region will be shown in lower-case letters
   while the rest of the input sequence will be left in the input case.
   
   The base and residue numbers of the sequences are shown beside the
   sequences in the output.
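
   The residue numbers beside each translation line can be reproduced
   with a little arithmetic. The sketch below is not taken from the
   prettyseq source; it assumes, from the example output in this
   document, that a residue is printed on the line that holds the first
   base of its codon. The function name and its parameters are
   hypothetical.

```python
def residues_on_line(line_start, line_end, range_start, range_end):
    """Return (first, last) residue numbers for one display line, or None.

    Assumption (inferred from the example output, not from the source code):
    a residue belongs to the line containing the first base of its codon.
    """
    # Clip the displayed bases to the translated range.
    lo = max(line_start, range_start)
    hi = min(line_end, range_end)
    if lo > hi:
        return None
    # Only complete codons inside the range produce a residue.
    n_codons = (range_end - range_start + 1) // 3
    # First codon whose first base is at or after 'lo' (ceiling division).
    first = (lo - range_start + 2) // 3 + 1
    # Last codon whose first base is at or before 'hi'.
    last = min((hi - range_start) // 3 + 1, n_codons)
    if first > last:
        return None
    return first, last


if __name__ == "__main__":
    # Lines of the usage example (range 135-1292, 60 bases per line).
    for start in (61, 121, 181):
        print(start, residues_on_line(start, start + 59, 135, 1292))
```

   For the usage example this gives residues 1 to 16 for bases 121-180
   and 17 to 36 for bases 181-240, matching the output file shown under
   'Output file format'.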
   
   Slightly unusually, this application translates the codons using a
   codon usage table (by default 'Ehum.cut'; see 'Data files').
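
   As a rough illustration of table-driven translation, the sketch
   below maps each codon of a range to a residue. The dictionary is a
   hypothetical stand-in for the codon-to-residue assignments that a
   codon usage table supplies; it holds only the codons needed for the
   start of the usage example. Printing 'X' for an unknown codon is an
   assumption of this sketch, not documented prettyseq behaviour.

```python
# Hypothetical, partial codon-to-residue map (standard genetic code).
# A real run would take these assignments from a codon usage file.
CODON_TO_RESIDUE = {
    "atg": "M", "gga": "G", "tcg": "S", "cac": "H", "cag": "Q",
    "gag": "E", "cgg": "R", "ccg": "P", "ctg": "L", "atc": "I",
    "ggc": "G", "ttc": "F", "tcc": "S", "gaa": "E",
}


def translate(dna):
    """Translate complete codons of 'dna'; unknown codons become 'X'."""
    dna = dna.lower()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TO_RESIDUE.get(dna[i:i + 3], "X") for i in range(0, usable, 3)
    )


if __name__ == "__main__":
    # Bases 135-182 of tembl:paamir, copied from the example output.
    print(translate("atgggatcgcaccaggagcggccgctgatcggcctgctgttctccgaa"))
```

   The residues produced for these 48 bases agree with the first 16
   residues shown in the example output file.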
   
Usage

   Here is a sample session with prettyseq:
   

% prettyseq 
Output sequence with translated ranges
Input sequence: tembl:paamir
Range(s) to translate [1-2167]: 135-1292
Output file [paamir.prettyseq]: 
   
   
Command line arguments

   Standard (Mandatory) qualifiers:
  [-sequence]          sequence   Sequence USA
   -range              range      Range(s) to translate
  [-outfile]           outfile    Output file name

   Additional (Optional) qualifiers:
   -[no]ruler          boolean    Add a ruler
   -[no]plabel         boolean    Number translations
   -[no]nlabel         boolean    Number DNA sequence

   Advanced (Unprompted) qualifiers:
   -cfile              codon      Codon usage table name
   -width              integer    Width of screen

   Associated qualifiers:

   "-sequence" associated qualifiers
   -sbegin1             integer    First base used
   -send1               integer    Last base used, def=seq length
   -sreverse1           boolean    Reverse (if DNA)
   -sask1               boolean    Ask for begin/end/reverse
   -snucleotide1        boolean    Sequence is nucleotide
   -sprotein1           boolean    Sequence is protein
   -slower1             boolean    Make lower case
   -supper1             boolean    Make upper case
   -sformat1            string     Input sequence format
   -sdbname1            string     Database name
   -sid1                string     Entryname
   -ufo1                string     UFO features
   -fformat1            string     Features format
   -fopenfile1          string     Features file name

   "-outfile" associated qualifiers
   -odirectory2         string     Output directory

   General qualifiers:
   -auto                boolean    Turn off prompts
   -stdout              boolean    Write standard output
   -filter              boolean    Read standard input, write standard output
   -options             boolean    Prompt for standard and additional values
   -debug               boolean    Write debug output to program.dbg
   -verbose             boolean    Report some/full command line options
   -help                boolean    Report command line options. More
                                  information on associated and general
                                  qualifiers can be found with -help -verbose
   -warning             boolean    Report warnings
   -error               boolean    Report errors
   -fatal               boolean    Report fatal errors
   -die                 boolean    Report deaths
   

   Qualifier        Description             Allowed values         Default

   Standard (Mandatory) qualifiers
   [-sequence]      Sequence USA            Readable sequence      Required
   (Parameter 1)
   -range           Range(s) to translate   Sequence range         Whole sequence
   [-outfile]       Output file name        Output file            <sequence>.prettyseq
   (Parameter 2)

   Additional (Optional) qualifiers
   -[no]ruler       Add a ruler             Boolean value Yes/No   Yes
   -[no]plabel      Number translations     Boolean value Yes/No   Yes
   -[no]nlabel      Number DNA sequence     Boolean value Yes/No   Yes

   Advanced (Unprompted) qualifiers
   -cfile           Codon usage table name  Codon usage file in    Ehum.cut
                                            EMBOSS data path
   -width           Width of screen         Integer 10 or more     60
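
   For scripted use, the qualifiers above can be assembled into a
   non-interactive command line. The helper below is a hypothetical
   sketch: it only builds the argument list from qualifiers documented
   in this section and does not run the program. Its name, parameters
   and the choice to always append '-auto' are assumptions.

```python
def prettyseq_args(sequence, ranges, outfile, ruler=True, plabel=True,
                   nlabel=True, width=60, cfile=None):
    """Build an argument list for a non-interactive prettyseq run."""
    if width < 10:
        # The qualifier table documents '-width' as "Integer 10 or more".
        raise ValueError("-width must be 10 or more")
    args = ["prettyseq", "-sequence", sequence,
            "-range", ranges, "-outfile", outfile]
    # Boolean qualifiers default to Yes; emit the '-no' form only when off.
    if not ruler:
        args.append("-noruler")
    if not plabel:
        args.append("-noplabel")
    if not nlabel:
        args.append("-nonlabel")
    if width != 60:
        args += ["-width", str(width)]
    if cfile is not None:
        args += ["-cfile", cfile]
    args.append("-auto")  # turn off prompts
    return args


if __name__ == "__main__":
    print(" ".join(prettyseq_args("tembl:paamir", "135-1292",
                                  "paamir.prettyseq")))
```

   The resulting list can be passed to a process launcher such as
   Python's subprocess.run on a system where EMBOSS is installed.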
   
Input file format

   prettyseq reads any nucleic acid sequence specified as a USA (Uniform
   Sequence Address).
   
  Input files for usage example
  
   'tembl:paamir' is a sequence entry in the example nucleic acid
   database 'tembl'
   
  Database entry: tembl:paamir
  
ID   PAAMIR     standard; DNA; PRO; 2167 BP.
XX
AC   X13776; M43175;
XX
SV   X13776.1
XX
DT   19-APR-1989 (Rel. 19, Created)
DT   17-FEB-1997 (Rel. 50, Last updated, Version 22)
XX
DE   Pseudomonas aeruginosa amiC and amiR gene for aliphatic amidase regulation
XX
KW   aliphatic amidase regulator; amiC gene; amiR gene.
XX
OS   Pseudomonas aeruginosa
OC   Bacteria; Proteobacteria; gamma subdivision; Pseudomonadaceae; Pseudomonas.
XX
RN   [1]
RP   1167-2167
RA   Rice P.M.;
RT   ;
RL   Submitted (16-DEC-1988) to the EMBL/GenBank/DDBJ databases.
RL   Rice P.M., EMBL, Postfach 10-2209, Meyerhofstrasse 1, 6900 Heidelberg, FRG.
XX
RN   [2]
RP   1167-2167
RX   MEDLINE; 89211409.
RA   Lowe N., Rice P.M., Drew R.E.;
RT   "Nucleotide sequence of the aliphatic amidase regulator gene of Pseudomonas
RT   aeruginosa";
RL   FEBS Lett. 246:39-43(1989).
XX
RN   [3]
RP   1-1292
RX   MEDLINE; 91317707.
RA   Wilson S., Drew R.;
RT   "Cloning and DNA seqence of amiC, a new gene regulating expression of the
RT   Pseudomonas aeruginosa aliphatic amidase, and purification of the amiC
RT   product.";
RL   J. Bacteriol. 173:4914-4921(1991).
XX
RN   [4]
RP   1-2167
RA   Rice P.M.;
RT   ;
RL   Submitted (04-SEP-1991) to the EMBL/GenBank/DDBJ databases.
RL   Rice P.M., EMBL, Postfach 10-2209, Meyerhofstrasse 1, 6900 Heidelberg, FRG.
XX
DR   SWISS-PROT; P10932; AMIR_PSEAE.
DR   SWISS-PROT; P27017; AMIC_PSEAE.
DR   SWISS-PROT; Q51417; AMIS_PSEAE.


  [Part of this file has been deleted for brevity]

FT                   phenotype"
FT                   /replace=""
FT                   /gene="amiC"
FT   misc_feature    1
FT                   /note="last base of an XhoI site"
FT   misc_feature    648..653
FT                   /note="end of 658bp XhoI fragment, deletion in  pSW3 causes
FT                   constitutive expression of amiE"
FT   conflict        1281
FT                   /replace="g"
FT                   /citation=[3]
XX
SQ   Sequence 2167 BP; 363 A; 712 C; 730 G; 362 T; 0 other;
     ggtaccgctg gccgagcatc tgctcgatca ccaccagccg ggcgacggga actgcacgat        60
     ctacctggcg agcctggagc acgagcgggt tcgcttcgta cggcgctgag cgacagtcac       120
     aggagaggaa acggatggga tcgcaccagg agcggccgct gatcggcctg ctgttctccg       180
     aaaccggcgt caccgccgat atcgagcgct cgcacgcgta tggcgcattg ctcgcggtcg       240
     agcaactgaa ccgcgagggc ggcgtcggcg gtcgcccgat cgaaacgctg tcccaggacc       300
     ccggcggcga cccggaccgc tatcggctgt gcgccgagga cttcattcgc aaccgggggg       360
     tacggttcct cgtgggctgc tacatgtcgc acacgcgcaa ggcggtgatg ccggtggtcg       420
     agcgcgccga cgcgctgctc tgctacccga ccccctacga gggcttcgag tattcgccga       480
     acatcgtcta cggcggtccg gcgccgaacc agaacagtgc gccgctggcg gcgtacctga       540
     ttcgccacta cggcgagcgg gtggtgttca tcggctcgga ctacatctat ccgcgggaaa       600
     gcaaccatgt gatgcgccac ctgtatcgcc agcacggcgg cacggtgctc gaggaaatct       660
     acattccgct gtatccctcc gacgacgact tgcagcgcgc cgtcgagcgc atctaccagg       720
     cgcgcgccga cgtggtcttc tccaccgtgg tgggcaccgg caccgccgag ctgtatcgcg       780
     ccatcgcccg tcgctacggc gacggcaggc ggccgccgat cgccagcctg accaccagcg       840
     aggcggaggt ggcgaagatg gagagtgacg tggcagaggg gcaggtggtg gtcgcgcctt       900
     acttctccag catcgatacg cccgccagcc gggccttcgt ccaggcctgc catggtttct       960
     tcccggagaa cgcgaccatc accgcctggg ccgaggcggc ctactggcag accttgttgc      1020
     tcggccgcgc cgcgcaggcc gcaggcaact ggcgggtgga agacgtgcag cggcacctgt      1080
     acgacatcga catcgacgcg ccacaggggc cggtccgggt ggagcgccag aacaaccaca      1140
     gccgcctgtc ttcgcgcatc gcggaaatcg atgcgcgcgg cgtgttccag gtccgctggc      1200
     agtcgcccga accgattcgc cccgaccctt atgtcgtcgt gcataacctc gacgactggt      1260
     ccgccagcat gggcggggga ccgctcccat gagcgccaac tcgctgctcg gcagcctgcg      1320
     cgagttgcag gtgctggtcc tcaacccgcc gggggaggtc agcgacgccc tggtcttgca      1380
     gctgatccgc atcggttgtt cggtgcgcca gtgctggccg ccgccggaag ccttcgacgt      1440
     gccggtggac gtggtcttca ccagcatttt ccagaatggc caccacgacg agatcgctgc      1500
     gctgctcgcc gccgggactc cgcgcactac cctggtggcg ctggtggagt acgaaagccc      1560
     cgcggtgctc tcgcagatca tcgagctgga gtgccacggc gtgatcaccc agccgctcga      1620
     tgcccaccgg gtgctgcctg tgctggtatc ggcgcggcgc atcagcgagg aaatggcgaa      1680
     gctgaagcag aagaccgagc agctccagga ccgcatcgcc ggccaggccc ggatcaacca      1740
     ggccaaggtg ttgctgatgc agcgccatgg ctgggacgag cgcgaggcgc accagcacct      1800
     gtcgcgggaa gcgatgaagc ggcgcgagcc gatcctgaag atcgctcagg agttgctggg      1860
     aaacgagccg tccgcctgag cgatccgggc cgaccagaac aataacaaga ggggtatcgt      1920
     catcatgctg ggactggttc tgctgtacgt tggcgcggtg ctgtttctca atgccgtctg      1980
     gttgctgggc aagatcagcg gtcgggaggt ggcggtgatc aacttcctgg tcggcgtgct      2040
     gagcgcctgc gtcgcgttct acctgatctt ttccgcagca gccgggcagg gctcgctgaa      2100
     ggccggagcg ctgaccctgc tattcgcttt tacctatctg tgggtggccg ccaaccagtt      2160
     cctcgag                                                                2167
//
   
   You can specify a file of ranges to translate by giving the '-range'
   qualifier the value '@' followed by the name of the file containing
   the ranges (e.g. '-range @myfile').
   
   The format of the range file is:
     * Comment lines start with '#' in the first column.
     * Comment lines and blank lines are ignored.
     * The line may start with white-space.
     * There are two positive (integer) numbers per line separated by one
       or more space or TAB characters.
     * The second number must be greater than or equal to the first
       number.
     * There can be optional text after the two numbers to annotate the
       line.
     * White-space before or after the text is removed.
       
   An example range file is:

# this is my set of ranges
12   23
 4   5       this is like 12-23, but smaller
67   10348   interesting region
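
   The range file rules listed above can be checked with a small
   parser. This is a minimal sketch of those rules as stated here, not
   the parser EMBOSS itself uses; the function name and the error
   messages are hypothetical.

```python
def parse_range_file(text):
    """Parse range-file text into a list of (start, end, annotation)."""
    ranges = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        # Comment lines have '#' in the first column; blank lines are ignored.
        if raw.startswith("#") or not raw.strip():
            continue
        # Leading white-space is allowed; fields are separated by spaces/TABs.
        fields = raw.strip().split(None, 2)
        if len(fields) < 2:
            raise ValueError(f"line {lineno}: expected two numbers")
        start, end = int(fields[0]), int(fields[1])
        if start < 1 or end < 1:
            raise ValueError(f"line {lineno}: numbers must be positive")
        if end < start:
            raise ValueError(f"line {lineno}: second number is smaller")
        # Optional annotation, with surrounding white-space removed.
        note = fields[2].strip() if len(fields) == 3 else ""
        ranges.append((start, end, note))
    return ranges


if __name__ == "__main__":
    example = (
        "# this is my set of ranges\n"
        "12   23\n"
        " 4   5       this is like 12-23, but smaller\n"
        "67   10348   interesting region\n"
    )
    for entry in parse_range_file(example):
        print(entry)
```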

Output file format

  Output files for usage example
  
  File: paamir.prettyseq
  
PRETTYSEQ of PAAMIR from 1 to 2167

           ---------|---------|---------|---------|---------|---------|
         1 GGTACCGCTGGCCGAGCATCTGCTCGATCACCACCAGCCGGGCGACGGGAACTGCACGAT 60


           ---------|---------|---------|---------|---------|---------|
        61 CTACCTGGCGAGCCTGGAGCACGAGCGGGTTCGCTTCGTACGGCGCTGAGCGACAGTCAC 120


           ---------|---------|---------|---------|---------|---------|
       121 AGGAGAGGAAACGGatgggatcgcaccaggagcggccgctgatcggcctgctgttctccg 180
         1               M  G  S  H  Q  E  R  P  L  I  G  L  L  F  S  E 16

           ---------|---------|---------|---------|---------|---------|
       181 aaaccggcgtcaccgccgatatcgagcgctcgcacgcgtatggcgcattgctcgcggtcg 240
        17   T  G  V  T  A  D  I  E  R  S  H  A  Y  G  A  L  L  A  V  E 36

           ---------|---------|---------|---------|---------|---------|
       241 agcaactgaaccgcgagggcggcgtcggcggtcgcccgatcgaaacgctgtcccaggacc 300
        37   Q  L  N  R  E  G  G  V  G  G  R  P  I  E  T  L  S  Q  D  P 56

           ---------|---------|---------|---------|---------|---------|
       301 ccggcggcgacccggaccgctatcggctgtgcgccgaggacttcattcgcaaccgggggg 360
        57   G  G  D  P  D  R  Y  R  L  C  A  E  D  F  I  R  N  R  G  V 76

           ---------|---------|---------|---------|---------|---------|
       361 tacggttcctcgtgggctgctacatgtcgcacacgcgcaaggcggtgatgccggtggtcg 420
        77   R  F  L  V  G  C  Y  M  S  H  T  R  K  A  V  M  P  V  V  E 96

           ---------|---------|---------|---------|---------|---------|
       421 agcgcgccgacgcgctgctctgctacccgaccccctacgagggcttcgagtattcgccga 480
        97   R  A  D  A  L  L  C  Y  P  T  P  Y  E  G  F  E  Y  S  P  N 116

           ---------|---------|---------|---------|---------|---------|
       481 acatcgtctacggcggtccggcgccgaaccagaacagtgcgccgctggcggcgtacctga 540
       117   I  V  Y  G  G  P  A  P  N  Q  N  S  A  P  L  A  A  Y  L  I 136

           ---------|---------|---------|---------|---------|---------|
       541 ttcgccactacggcgagcgggtggtgttcatcggctcggactacatctatccgcgggaaa 600
       137   R  H  Y  G  E  R  V  V  F  I  G  S  D  Y  I  Y  P  R  E  S 156

           ---------|---------|---------|---------|---------|---------|
       601 gcaaccatgtgatgcgccacctgtatcgccagcacggcggcacggtgctcgaggaaatct 660
       157   N  H  V  M  R  H  L  Y  R  Q  H  G  G  T  V  L  E  E  I  Y 176

           ---------|---------|---------|---------|---------|---------|
       661 acattccgctgtatccctccgacgacgacttgcagcgcgccgtcgagcgcatctaccagg 720
       177   I  P  L  Y  P  S  D  D  D  L  Q  R  A  V  E  R  I  Y  Q  A 196



  [Part of this file has been deleted for brevity]

      1441 GCCGGTGGACGTGGTCTTCACCAGCATTTTCCAGAATGGCCACCACGACGAGATCGCTGC 1500


           ---------|---------|---------|---------|---------|---------|
      1501 GCTGCTCGCCGCCGGGACTCCGCGCACTACCCTGGTGGCGCTGGTGGAGTACGAAAGCCC 1560


           ---------|---------|---------|---------|---------|---------|
      1561 CGCGGTGCTCTCGCAGATCATCGAGCTGGAGTGCCACGGCGTGATCACCCAGCCGCTCGA 1620


           ---------|---------|---------|---------|---------|---------|
      1621 TGCCCACCGGGTGCTGCCTGTGCTGGTATCGGCGCGGCGCATCAGCGAGGAAATGGCGAA 1680


           ---------|---------|---------|---------|---------|---------|
      1681 GCTGAAGCAGAAGACCGAGCAGCTCCAGGACCGCATCGCCGGCCAGGCCCGGATCAACCA 1740


           ---------|---------|---------|---------|---------|---------|
      1741 GGCCAAGGTGTTGCTGATGCAGCGCCATGGCTGGGACGAGCGCGAGGCGCACCAGCACCT 1800


           ---------|---------|---------|---------|---------|---------|
      1801 GTCGCGGGAAGCGATGAAGCGGCGCGAGCCGATCCTGAAGATCGCTCAGGAGTTGCTGGG 1860


           ---------|---------|---------|---------|---------|---------|
      1861 AAACGAGCCGTCCGCCTGAGCGATCCGGGCCGACCAGAACAATAACAAGAGGGGTATCGT 1920


           ---------|---------|---------|---------|---------|---------|
      1921 CATCATGCTGGGACTGGTTCTGCTGTACGTTGGCGCGGTGCTGTTTCTCAATGCCGTCTG 1980


           ---------|---------|---------|---------|---------|---------|
      1981 GTTGCTGGGCAAGATCAGCGGTCGGGAGGTGGCGGTGATCAACTTCCTGGTCGGCGTGCT 2040


           ---------|---------|---------|---------|---------|---------|
      2041 GAGCGCCTGCGTCGCGTTCTACCTGATCTTTTCCGCAGCAGCCGGGCAGGGCTCGCTGAA 2100


           ---------|---------|---------|---------|---------|---------|
      2101 GGCCGGAGCGCTGACCCTGCTATTCGCTTTTACCTATCTGTGGGTGGCCGCCAACCAGTT 2160


           -------
      2161 CCTCGAG 2167
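
   The ruler printed above each sequence line follows a simple pattern:
   a '|' at every tenth column and '-' elsewhere, cut to the number of
   bases on the line. The sketch below reproduces that pattern as seen
   in this example output; the function name is hypothetical and the
   code is not taken from prettyseq.

```python
def ruler(n_bases):
    """Return a ruler string with a '|' tick at every tenth column."""
    return "".join("|" if col % 10 == 0 else "-"
                   for col in range(1, n_bases + 1))


if __name__ == "__main__":
    print(ruler(60))  # a full line at the default width
    print(ruler(7))   # the final, short line of the example
```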
   

Data files

   The codon usage table is read by default from "Ehum.cut" in the
   'data/CODONS' directory of the EMBOSS distribution. If the name of a
   codon usage file is specified on the command line, then this file will
   first be searched for in the current directory and then in the
   'data/CODONS' directory of the EMBOSS distribution.
   
   To see the available EMBOSS codon usage files, run:
   
% embossdata -showall

   To fetch one of the codon usage tables (for example 'Emus.cut') into
   your current directory for you to inspect or modify, run:
   
% embossdata -fetch -file Emus.cut
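
   The search order described above (current directory first, then
   'data/CODONS') can be sketched as follows. The function name and the
   'emboss_root' parameter are hypothetical; where the EMBOSS
   distribution lives on a given system is an assumption the caller
   must supply.

```python
import os


def find_codon_file(name, emboss_root, cwd="."):
    """Return the first existing path for a codon usage file, or None."""
    candidates = [
        os.path.join(cwd, name),                           # current directory
        os.path.join(emboss_root, "data", "CODONS", name),  # distribution data
    ]
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


if __name__ == "__main__":
    # '/usr/local/emboss' is only a placeholder location.
    print(find_codon_file("Emus.cut", "/usr/local/emboss"))
```

   A copy fetched with 'embossdata -fetch' into the current directory
   is therefore found before the copy in the distribution.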

Notes

   None.
   
References

   None.
   
Warnings

   None.
   
Diagnostic Error Messages

   "Range outside length of sequence" - this is self explanatory. You
   should specify a range of positions to translate that lies within the
   length of the input sequence.
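
   A script can make the same check before calling prettyseq. The
   sketch below is a hypothetical pre-check, not the program's own
   validation; it reuses the message text quoted above and assumes a
   single range written as 'start-end'.

```python
def check_range(spec, seq_length):
    """Parse 'start-end' and verify it lies within 1..seq_length."""
    start_text, end_text = spec.split("-", 1)
    start, end = int(start_text), int(end_text)
    if start < 1 or end > seq_length:
        raise ValueError("Range outside length of sequence")
    return start, end


if __name__ == "__main__":
    print(check_range("135-1292", 2167))  # the range of the usage example
    try:
        check_range("1-3000", 2167)
    except ValueError as err:
        print(err)
```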
   
Exit status

   It always exits with a status of 0.
   
Known bugs

   None.
   
See also

   Program name   Description
   abiview        Reads ABI file and displays the trace
   backtranseq    Back translate a protein sequence
   cirdna         Draws circular maps of DNA constructs
   coderet        Extract CDS, mRNA and translations from feature tables
   lindna         Draws linear maps of DNA constructs
   pepnet         Displays proteins as a helical net
   pepwheel       Shows protein sequences as helices
   plotorf        Plot potential open reading frames
   prettyplot     Displays aligned sequences, with colouring and boxing
   remap          Display a sequence with restriction cut sites, translation etc
   seealso        Finds programs sharing group names
   showalign      Displays a multiple sequence alignment
   showdb         Displays information on the currently available databases
   showfeat       Show features of a sequence
   showorf        Pretty output of DNA translations
   showseq        Display a sequence with features, translation etc
   sixpack        Display a DNA sequence with 6-frame translation and ORFs
   textsearch     Search sequence documentation text. SRS and Entrez are faster!
   transeq        Translate nucleic acid sequences
   
   showseq has more options for specifying various ways of displaying a
   sequence, with or without various ways of translating it.
   
Author(s)

   Alan Bleasby (ableasby  hgmp.mrc.ac.uk)
   HGMP-RC, Genome Campus, Hinxton, Cambridge CB10 1SB, UK
   
History

   Written (1999) - Alan Bleasby
   
Target users

   This program is intended to be used by everyone and everything, from
   naive users to embedded scripts.
   
Comments
