
Module Bio.expressions.genbank

Martel based parser to read GenBank formatted files.

This is a huge regular expression for GenBank, built using the 'regular expressions on steroids' capabilities of Martel.

Documentation for GenBank format that I found:

o GenBank/EMBL feature tables are described at: http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html

o There are also descriptions of different GenBank lines at: http://www.ibc.wustl.edu/standards/gbrel.txt

Function Summary
  define_block(identifier, block_tag, block_data, std_block_tag, std_tag)
Define a Martel grouping which can parse a block of text.

Variable Summary
Group accession = <Martel.Expression.Group instance at 0x2aaaa...
Group accession_block = <Martel.Expression.Group instance at 0...
Group authors_block = <Martel.Expression.Group instance at 0x2...
Group base_count = <Martel.Expression.Group instance at 0x2aaa...
Group base_count_line = <Martel.Expression.Group instance at 0...
Group base_number = <Martel.Expression.Group instance at 0x2aa...
Str big_indent_space = <Martel.Expression.Str instance at 0x...
MaxRepeat blank_space = <Martel.Expression.MaxRepeat instance at 0...
Group comment_block = <Martel.Expression.Group instance at 0x2...
Group consrtm_block = <Martel.Expression.Group instance at 0x2...
Group contig_block = <Martel.Expression.Group instance at 0x2a...
Group contig_location = <Martel.Expression.Group instance at 0...
Group data_file_division = <Martel.Expression.Group instance a...
Group date = <Martel.Expression.Group instance at 0x2aaaad7dfe...
Group db_source_block = <Martel.Expression.Group instance at 0...
Group definition_block = <Martel.Expression.Group instance at ...
list divisions = [<Martel.Expression.Str instance at 0x2aaaad...
Group feature = <Martel.Expression.Group instance at 0x2aaaad7...
Group feature_block = <Martel.Expression.Group instance at 0x2...
Group feature_key = <Martel.Expression.Group instance at 0x2aa...
int FEATURE_KEY_INDENT = 5                                                                     
Group feature_key_line = <Martel.Expression.Group instance at ...
int FEATURE_QUALIFIER_INDENT = 21                                                                    
Group features_line = <Martel.Expression.Group instance at 0x2...
ParseRecords format = <Martel.Expression.ParseRecords instance at 0x2...
Group gi = <Martel.Expression.Group instance at 0x2aaaad7e4200...
Seq header = <Martel.Expression.Seq instance at 0x2aaaad7f44...
int INDENT = 12                                                                    
Group journal_block = <Martel.Expression.Group instance at 0x2...
Group keywords_block = <Martel.Expression.Group instance at 0x...
Group location = <Martel.Expression.Group instance at 0x2aaaad...
Group locus = <Martel.Expression.Group instance at 0x2aaaad7df...
Group locus_line = <Martel.Expression.Group instance at 0x2aaa...
Group medline_line = <Martel.Expression.Group instance at 0x2a...
HeaderFooter ncbi_format = <Martel.Expression.HeaderFooter instance a...
Group nid = <Martel.Expression.Group instance at 0x2aaaad7e3b9...
Group nid_line = <Martel.Expression.Group instance at 0x2aaaad...
Group organism = <Martel.Expression.Group instance at 0x2aaaad...
Group organism_block = <Martel.Expression.Group instance at 0x...
Group origin_line = <Martel.Expression.Group instance at 0x2aa...
Group pid = <Martel.Expression.Group instance at 0x2aaaad7e3e1...
Group pid_line = <Martel.Expression.Group instance at 0x2aaaad...
Group primary = <Martel.Expression.Group instance at 0x2aaaad7...
Group primary_line = <Martel.Expression.Group instance at 0x2a...
Group primary_ref_line = <Martel.Expression.Group instance at ...
Group pubmed_line = <Martel.Expression.Group instance at 0x2aa...
Group qualifier = <Martel.Expression.Group instance at 0x2aaaa...
Alt qualifier_space = <Martel.Expression.Alt instance at 0x2...
Str quote = <Martel.Expression.Str instance at 0x2aaaad7e9a2...
Group quoted_chars = <Martel.Expression.Group instance at 0x2a...
Seq quoted_string = <Martel.Expression.Seq instance at 0x2aa...
Group record = <Martel.Expression.Group instance at 0x2aaaad7e...
Group record_end = <Martel.Expression.Group instance at 0x2aaa...
Group reference = <Martel.Expression.Group instance at 0x2aaaa...
Group reference_bases = <Martel.Expression.Group instance at 0...
Group reference_line = <Martel.Expression.Group instance at 0x...
Group reference_num = <Martel.Expression.Group instance at 0x2...
Group region = <Martel.Expression.Group instance at 0x2aaaad7e...
Group remark_block = <Martel.Expression.Group instance at 0x2a...
list residue_prefixes = [<Martel.Expression.Str instance at 0...
Group residue_type = <Martel.Expression.Group instance at 0x2a...
list residue_types = [<Martel.Expression.Str instance at 0x2a...
Group segment = <Martel.Expression.Group instance at 0x2aaaad7...
Group segment_line = <Martel.Expression.Group instance at 0x2a...
Group sequence = <Martel.Expression.Group instance at 0x2aaaad...
Group sequence_entry = <Martel.Expression.Group instance at 0x...
Group sequence_line = <Martel.Expression.Group instance at 0x2...
Group sequence_plus_spaces = <Martel.Expression.Group instance...
Group size = <Martel.Expression.Group instance at 0x2aaaad7df5...
Str small_indent_space = <Martel.Expression.Str instance at ...
Group source_block = <Martel.Expression.Group instance at 0x2a...
Group taxonomy = <Martel.Expression.Group instance at 0x2aaaad...
Group title_block = <Martel.Expression.Group instance at 0x2aa...
Seq unquoted_string = <Martel.Expression.Seq instance at 0x2...
list valid_divisions = ['PRI', 'ROD', 'MAM', 'VRT', 'INV', 'P...
list valid_residue_prefixes = ['ss-', 'ds-', 'ms-']
list valid_residue_types = ['DNA', 'RNA', 'mRNA', 'tRNA', 'rR...
Group version = <Martel.Expression.Group instance at 0x2aaaad7...
Group version_line = <Martel.Expression.Group instance at 0x2a...

Function Details

define_block(identifier, block_tag, block_data, std_block_tag=None, std_tag=None)

Define a Martel grouping which can parse a block of text.

Many of the GenBank lines we'll want to process are grouped into a block like:

IDENTIFIER Blah blah blah

Where blah blah blah can wrap across multiple lines. This function makes it easy to define these blocks consistently.

Arguments:

o identifier - The identifier that begins the block (like DEFINITION).

o block_tag - A callback tag for the entire block.

o block_data - A callback tag for the data in the block (i.e. the stuff you are interested in).

o std_block_tag - A Bio.Std Martel tag used to register the entire block as being a "standard" type of information.

o std_tag - A Bio.Std Martel tag used to register just the information in the block as being "standard".
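
The block layout described above (an identifier, then data that may wrap onto indented continuation lines) can be illustrated with a minimal sketch. This is not Martel code and not how define_block is implemented: it uses Python's standard re module, the helper names block_pattern and parse_block are hypothetical, the sample record text is made up, and it assumes continuation lines are indented by the INDENT value of 12 listed for this module.

```python
import re

# Assumption: continuation lines are indented by INDENT spaces (12 in this module).
INDENT = 12


def block_pattern(identifier):
    """Build a regex for one 'IDENTIFIER data...' block (hypothetical helper)."""
    pad = " " * INDENT
    return re.compile(
        # First line: the identifier, some spaces, then the start of the data.
        r"^" + re.escape(identifier) + r" +(?P<first>.*)\n"
        # Zero or more continuation lines, each starting with INDENT spaces.
        r"(?P<rest>(?:" + pad + r".*\n)*)",
        re.MULTILINE,
    )


def parse_block(identifier, text):
    """Return the block's data joined into one string, or None if absent."""
    match = block_pattern(identifier).search(text)
    if match is None:
        return None
    lines = [match.group("first")]
    # Strip the indent from each continuation line.
    lines += [line[INDENT:] for line in match.group("rest").splitlines()]
    return " ".join(part.strip() for part in lines)


# Made-up sample text, for illustration only.
sample = (
    "DEFINITION  Homo sapiens example gene,\n"
    + " " * INDENT + "complete cds.\n"
    + "ACCESSION   X00000\n"
)

print(parse_block("DEFINITION", sample))
# prints: Homo sapiens example gene, complete cds.
```

In the real module, the block_tag and block_data arguments name the callback tags attached to the whole block and to its data; the sketch only mirrors that split loosely through the two named groups.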

Variable Details

accession

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7e1d40>                   

accession_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7e3a70>                   

authors_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7e6e60>                   

base_count

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7eb8c0>                   

base_count_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7eb9e0>                   

base_number

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7ebea8>                   

big_indent_space

Type:
Str
Value:
<Martel.Expression.Str instance at 0x2aaaad7df290>                     

blank_space

Type:
MaxRepeat
Value:
<Martel.Expression.MaxRepeat instance at 0x2aaaad7df2d8>               

comment_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7ea5a8>                   

consrtm_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad890488>                   

contig_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7ecc20>                   

contig_location

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7ec830>                   

data_file_division

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7e15a8>                   

date

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7dfef0>                   

db_source_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7e4b90>                   

definition_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7e3200>                   

divisions

Type:
list
Value:
[<Martel.Expression.Str instance at 0x2aaaad7dfe60>,
 <Martel.Expression.Str instance at 0x2aaaad7dffc8>,
 <Martel.Expression.Str instance at 0x2aaaad7e1050>,
 <Martel.Expression.Str instance at 0x2aaaad7e1098>,
 <Martel.Expression.Str instance at 0x2aaaad7e10e0>,
 <Martel.Expression.Str instance at 0x2aaaad7e1128>,
 <Martel.Expression.Str instance at 0x2aaaad7e1170>,
 <Martel.Expression.Str instance at 0x2aaaad7e11b8>,
...                                                                    

feature

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7eb6c8>                   

feature_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7eb7a0>                   

feature_key

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7e92d8>                   

FEATURE_KEY_INDENT

Type:
int
Value:
5                                                                     

feature_key_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7e9950>                   

FEATURE_QUALIFIER_INDENT

Type:
int
Value:
21                                                                    

features_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7e9170>                   

format

Type:
ParseRecords
Value:
<Martel.Expression.ParseRecords instance at 0x2aaaad7ed560>            

gi

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7e4200>                   

header

Type:
Seq
Value:
<Martel.Expression.Seq instance at 0x2aaaad7f44d0>                     

INDENT

Type:
int
Value:
12                                                                    

journal_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7e8098>                   

keywords_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7e71b8>                   

location

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7e9518>                   

locus

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7df440>                   

locus_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7e1680>                   

medline_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7e8128>                   

ncbi_format

Type:
HeaderFooter
Value:
<Martel.Expression.HeaderFooter instance at 0x2aaaad7ed3f8>            

nid

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7e3b90>                   

nid_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7e3cb0>                   

organism

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7e7ef0>                   

organism_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7e6320>                   

origin_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7ebd88>                   

pid

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7e3e18>                   

pid_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7e3f38>                   

primary

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7eafc8>                   

primary_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7ea950>                   

primary_ref_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7eaf80>                   

pubmed_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7e8488>                   

qualifier

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7eb5a8>                   

qualifier_space

Type:
Alt
Value:
<Martel.Expression.Alt instance at 0x2aaaad7df368>                     

quote

Type:
Str
Value:
<Martel.Expression.Str instance at 0x2aaaad7e9a28>                     

quoted_chars

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7e9b00>                   

quoted_string

Type:
Seq
Value:
<Martel.Expression.Seq instance at 0x2aaaad7e9ea8>                     

record

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7ed2d8>                   

record_end

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7ece18>                   

reference

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7e8ef0>                   

reference_bases

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7e65a8>                   

reference_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7e67e8>                   

reference_num

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7e6440>                   

region

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7e33f8>                   

remark_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7e8d40>                   

residue_prefixes

Type:
list
Value:
[<Martel.Expression.Str instance at 0x2aaaad7df5f0>,
 <Martel.Expression.Str instance at 0x2aaaad7df758>,
 <Martel.Expression.Str instance at 0x2aaaad7df7a0>]                   

residue_type

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7dfbd8>                   

residue_types

Type:
list
Value:
[<Martel.Expression.Str instance at 0x2aaaad7df7e8>,
 <Martel.Expression.Str instance at 0x2aaaad7df830>,
 <Martel.Expression.Str instance at 0x2aaaad7df878>,
 <Martel.Expression.Str instance at 0x2aaaad7df8c0>,
 <Martel.Expression.Str instance at 0x2aaaad7df908>,
 <Martel.Expression.Str instance at 0x2aaaad7df950>,
 <Martel.Expression.Str instance at 0x2aaaad7df998>,
 <Martel.Expression.Str instance at 0x2aaaad7df9e0>,
...                                                                    

segment

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7e7320>                   

segment_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7e7758>                   

sequence

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7ec050>                   

sequence_entry

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7ec638>                   

sequence_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7ec440>                   

sequence_plus_spaces

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7ec320>                   

size

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7df5a8>                   

small_indent_space

Type:
Str
Value:
<Martel.Expression.Str instance at 0x2aaaad7df248>                     

source_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7e7e18>                   

taxonomy

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7e6128>                   

title_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad890a70>                   

unquoted_string

Type:
Seq
Value:
<Martel.Expression.Seq instance at 0x2aaaad7eb0e0>                     

valid_divisions

Type:
list
Value:
['PRI', 'ROD', 'MAM', 'VRT', 'INV', 'PLN', 'BCT', 'RNA', 'VRL']        

valid_residue_prefixes

Type:
list
Value:
['ss-', 'ds-', 'ms-']                                                  

valid_residue_types

Type:
list
Value:
['DNA', 'RNA', 'mRNA', 'tRNA', 'rRNA', 'uRNA', 'scRNA', 'snRNA', 'snoRNA']
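
As an illustration of how lists like valid_residue_prefixes and valid_residue_types can drive a parser, the sketch below turns them into a regular-expression alternation. It uses Python's standard re module rather than Martel, the helper names alternation and split_residue are hypothetical, and the idea that a residue token is an optional prefix followed by a type is an assumption, not something this page states.

```python
import re

# Values copied from the module's variable listing.
valid_residue_prefixes = ['ss-', 'ds-', 'ms-']
valid_residue_types = ['DNA', 'RNA', 'mRNA', 'tRNA', 'rRNA', 'uRNA',
                       'scRNA', 'snRNA', 'snoRNA']


def alternation(options):
    """Join literal options into a non-capturing alternation (hypothetical helper)."""
    # Longest options first, so a shorter option never shadows a longer one.
    ordered = sorted(options, key=len, reverse=True)
    return "(?:" + "|".join(re.escape(option) for option in ordered) + ")"


# Assumption: a residue token is an optional prefix followed by a type.
residue_re = re.compile(
    "(?P<prefix>" + alternation(valid_residue_prefixes) + ")?"
    "(?P<type>" + alternation(valid_residue_types) + ")"
)


def split_residue(token):
    """Return (prefix, type) for a valid token, or None if it does not match."""
    match = residue_re.fullmatch(token)
    if match is None:
        return None
    return match.group("prefix"), match.group("type")


print(split_residue("ds-DNA"))   # prints: ('ds-', 'DNA')
print(split_residue("mRNA"))     # prints: (None, 'mRNA')
print(split_residue("protein"))  # prints: None
```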

version

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7e4050>                   

version_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x2aaaad7e4518>                   

Generated by Epydoc 2.1 on Mon Aug 14 08:26:43 2006 http://epydoc.sf.net