CMR Tool Descriptions

CMR Home Page Tools


CMR Home Page: The Comprehensive Microbial Resource (CMR) is a tool that allows the researcher to access all of the bacterial genome sequences completed to date. For each genome not sequenced at TIGR two kinds of annotation are displayed: the Primary annotation taken from the genome sequencing center and the TIGR annotation generated by an automated annotation process at TIGR. Use the CMR to access information on all of the bacterial genomes or any subset of them. To get to the tools available on the CMR click on the menu options in the lower right hand corner of the header at the top of the page. Each menu option takes you to a listing of the tools available under that main category. To go to a genome page select a genome from the genome menu at the lower right hand corner of this page.

CMR Publication: The CMR is fully described in: J.D. Peterson, L.A. Umayam, T.M. Dickinson, E.K. Hickey and O. White. The Comprehensive Microbial Resource. Nucleic Acids Research, 29:1 (2001), 123-125.

CMR Data Release Control: To cope with the numerous complete prokaryotic genomes released by the scientific community, the CMR instituted a quarterly data release schedule at the beginning of 2001. At the beginning of each quarter we obtain all the new genomes available at GenBank and run several analyses on the data, including TIGR automated annotation of the genome. The data is then added to the CMR. The Data Release numbers used to track the CMR updates start with the dataset of genomes available before quarterly updates were implemented in 2001, which is Data Release 1.0. Each subsequent quarterly update is incremented by 1, i.e. the first quarterly release after Data Release 1.0 is Data Release 2.0. Smaller releases in between quarters are incremented by one-tenth, i.e. the first interim release between Data Release 2.0 and Data Release 3.0 is Data Release 2.1. A summary of the CMR Data Release Control numbers for all genomes in the CMR is available on this page. Click on the name of the organism to get to the CMR genome page.
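The numbering rule above can be sketched in a few lines of Python. This is only an illustration of the scheme as described; the function name `next_release` is hypothetical and is not part of the CMR. The label is treated as two integers rather than a floating-point number so that repeated one-tenth increments stay exact.

```python
def next_release(current: str, quarterly: bool) -> str:
    """Return the Data Release label that follows `current`.

    quarterly=True  -> next quarterly release (2.1 -> 3.0)
    quarterly=False -> next interim release   (2.0 -> 2.1)
    """
    major, minor = (int(part) for part in current.split("."))
    if quarterly:
        # A quarterly update increments the release by 1 and resets the tenths.
        return f"{major + 1}.0"
    # An interim update increments the release by one-tenth.
    return f"{major}.{minor + 1}"


print(next_release("1.0", quarterly=True))   # 2.0
print(next_release("2.0", quarterly=False))  # 2.1
```

How the tenths would roll over after a ninth interim release is not stated in the description above, so the sketch simply keeps counting.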

CMR Update Schedule: At TIGR we are continuously adding newly published prokaryotic genomes to our CMR database. As new genomes not sequenced at TIGR become available at GenBank we load the genome and sequencing center annotation into an individual organism database and perform TIGR automated annotation. Once the TIGR automated annotation is complete, we add the genome to an internal version of the CMR database which is not visible from the CMR web site.

All of the proteins in the CMR are routinely blasted against one another to produce the "All v/s All" data. Many of the tools on the CMR depend on the results from the All v/s All searches. Once several new genomes have been added to the internal CMR database we run All v/s All searches. After the All v/s All searches are complete we push the new data to the external CMR database, where it is visible from the CMR web site.

We have recently implemented a quarterly system for updating the external CMR database; please refer to the schedule below. Because of the amount of time it takes to prepare a genome for inclusion on the CMR, as well as the amount of time needed to perform the All v/s All searches, genomes must be present at GenBank at least two months prior to the date the external CMR is updated to be included in the update. As we approach an update, a more specific date will be posted below. Newly published TIGR genomes will be made available on the CMR web site on the date of publication.

CMR TIGR Role Ids: TIGR assigned role IDs and their corresponding role categories are listed on this page.

CMR TIGR Role Notes: This page gives additional information on the role category selected.

CMR Genomes Page Tools


CMR Genomes Page: This page lists the tools available for showing the genomes in the CMR.

CMR Genomes: Shows all genomes in the CMR ordered by organism name. Also includes links and other information on the genomes. Click on the organism name in the table to go to the Genome Home Page for the selected genome. Click on the links in the table to go to the taxon page, sequencing center, the sequencing center genome page, the NCBI genome page, the FTP sites, or the publication for the genome.

CMR Genomes Sorted by Taxonomy: Shows all genomes in the CMR ordered by taxonomy. The genomes are shown in a taxonomic tree that can be expanded or collapsed. Click on the + in the tree to expand a taxonomy and click on the - to compress a taxonomy.

Genome Summary Page: The Genome Page for a genome. This page shows general information on a genome, including genome-specific links and taxonomy. The genome-specific menu in the top right hand corner of the page links to all of the tools available for this genome.

CMR Search Page Tools


CMR Search Page: This page lists the tools available for searching the CMR database including annotation, EC #, HMM, COG, and role category searches.

CMR Search Tool: This tool allows the user to search the CMR database based on different criteria including:
    The Annotation Search Tool - search for a string in the annotation of a genome. This includes searches of the locus names, gene symbols, and common names of the genomes selected. You can choose to search either the Primary or the TIGR annotation, but the default is to search both. When you are done making your selections hit the "Submit" button. All searches are case independent.
    The SWISS-PROT/TrEMBL Search Tool - find any genes from the selected genomes with a specific SWISS-PROT or TrEMBL accession. You can choose to search either the Primary or the TIGR annotation, but the default is to search both. When you are done making your selections hit the "Submit" button. All searches are case independent.
    The GenBank Search Tool - find any genes from the selected genomes with a specific GenBank accession. You can choose to search either the Primary or the TIGR annotation, but the default is to search both. When you are done making your selections hit the "Submit" button. All searches are case independent.
    The EcoCyc Search Tool - find any genes from the selected genomes with a specific EcoCyc accession. You can choose to search either the Primary or the TIGR annotation, but the default is to search both. When you are done making your selections hit the "Submit" button. All searches are case independent.
    The Enzyme Commission (EC) Number Search - lists all genes from the selected genomes that are assigned a specific EC Number. You can choose to search either the Primary or the TIGR annotation, but the default is to search both. When you are done making your selections hit the "Submit" button. All searches are case independent.
    The Gene Ontology (GO) Search Tool - lists all genes from the selected genomes that are assigned a specific GO Term. You can choose to search either the Primary or the TIGR annotation, but the default is to search both. When you are done making your selections hit the "Submit" button. All searches are case independent.
    The TIGRFAM/Pfam Search Tool - lists all genes from the selected genomes that hit a specific HMM, either TIGRFAM or Pfam. You can choose to search either the Primary or the TIGR annotation, but the default is to search both. When you are done making your selections hit the "Submit" button. All searches are case independent.
    The Clusters of Orthologous Groups of proteins (COGs) Search Tool - lists all genes from the selected genomes that hit a specific NCBI COG. You can choose to search either the Primary or the TIGR annotation, but the default is to search both. When you are done making your selections hit the "Submit" button. All searches are case independent.
    The Interpro Search Tool - lists all genes from the selected genomes that hit a specific Interpro accession number. You can choose to search either the Primary or the TIGR annotation, but the default is to search both. When you are done making your selections hit the "Submit" button. All searches are case independent.
    The PROSITE Search Tool - lists all genes from the selected genomes that hit a specific PROSITE accession number. You can choose to search either the Primary or the TIGR annotation, but the default is to search both. When you are done making your selections hit the "Submit" button. All searches are case independent.

EGAD Search: Shows the description of a given EGAD, including name, nucleotide and protein information, and the accession information used to derive the EGAD term.

The Role Category Search Tool: This tool lists all genes assigned one or more TIGR role categories for the selected genome(s). To select the roles first click on a mainrole category. Next click on the subrole category of interest and use the "Add" button to add this subrole category to your list. You can choose to search either the Primary or the TIGR annotation. When you are done making your selections hit the "Submit" button.

The Selected Role Category Search Tool: This tool lists all genes assigned to the TIGR role category passed to the program for the selected genome(s). Choose the genomes you would like to search by clicking on the genome name and hitting the "Add" button. You can choose to search either the Primary or the TIGR annotation. When you are done making your selections hit the "Submit" button.

The Region View Search Tool: This tool displays small regions of a DNA molecule either based on a locus name or a set of coordinates. Either enter a locus name or enter your coordinates and select the organism and DNA molecule of choice. When you are done making your selections hit the "Submit" button.

The Position Search/Segment Retrieval Tool: This tool retrieves either the sequence or a list of genes found between a pair of coordinates. Choose "Retrieve sequence segment" to get the sequence between the two coordinates, choose "View predicted coding regions" to get a gene list. Select the organism and DNA molecule of choice and input your left and right coordinates. If you chose "View predicted coding regions" you must select the annotation type you wish to see. If you chose "Retrieve sequence segment" you must choose if you want the sequence from the forward or the reverse strand. When you are done making your selections hit the "Submit" button.
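As a rough illustration of what "Retrieve sequence segment" does, the sketch below extracts the bases between a left and a right coordinate and, for the reverse strand, returns the reverse complement. It assumes 1-based, inclusive coordinates, which is a common convention but is not stated above; the function name `retrieve_segment` is hypothetical.

```python
# Translation table used to complement unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def retrieve_segment(sequence: str, left: int, right: int,
                     strand: str = "forward") -> str:
    """Return the segment between `left` and `right` (1-based, inclusive)."""
    if not 1 <= left <= right <= len(sequence):
        raise ValueError("coordinates must satisfy 1 <= left <= right <= length")
    segment = sequence[left - 1:right]
    if strand == "forward":
        return segment
    if strand == "reverse":
        # Complement each base, then reverse so the result reads 5' to 3'.
        return segment.translate(_COMPLEMENT)[::-1]
    raise ValueError("strand must be 'forward' or 'reverse'")


print(retrieve_segment("ATGCCGTA", 2, 5))             # TGCC
print(retrieve_segment("ATGCCGTA", 2, 5, "reverse"))  # GGCA
```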

Annotation Search Report: Here the results from your annotation search are displayed. To get to the Gene Page for a gene click on the Locus. Click on the EC Number to get a description of the number. Click on the "Options" button in the top right hand corner of the page to select a different annotation type or to download the nucleotide or protein sequences for all genes in the table. Finally, click on the "Download" button in the top right hand corner of the page to download the table in tab delimited format.

Accession Search Report: Shows the genes for the selected genomes that are part of a Swiss-Prot, GenBank or EcoCyc accession. To get to the Gene Page for a gene click on the Locus. Click on the EC Number to get a description of the number. Click on the "Options" button in the top right hand corner of the page to select a different annotation type or to download the nucleotide or protein sequences for all genes in the table. Finally, click on the "Download" button in the top right hand corner of the page to download the table in tab delimited format.

EC Search Report: Shows all genes for the selected genomes that have a specific EC number. To get to the Gene Page for a gene click on the Locus. Click on the EC Number to get a description of the number. Click on the "Options" button in the top right hand corner of the page to select a different annotation type or to download the nucleotide or protein sequences for all genes in the table. Finally, click on the "Download" button in the top right hand corner of the page to download the table in tab delimited format.

GO Term Search Report: Shows all genes for the selected genomes that are assigned a specific GO Term. To get to the Gene Page for a gene click on the Locus. Click on the EC Number to get a description of the number. Click on the "Options" button in the top right hand corner of the page to select a different annotation type or to download the nucleotide or protein sequences for all genes in the table. Finally, click on the "Download" button in the top right hand corner of the page to download the table in tab delimited format.

Evidence Search Report: Shows the genes for the selected genomes that are hit by a TIGRFAM, Pfam, Interpro, PROSITE, or COG accession. To get to the Gene Page for a gene click on the Locus. Click on the EC Number to get a description of the number. For TIGRFAM/Pfam searches all hits are shown and the noise and trusted cutoffs are shown under the title for the page. To show only hits above the "noise" or "trusted" cutoff for the HMM click on the "Noise Cutoff" or "Trusted Cutoff" links under the title for the page. To see all hits (even those under the cutoffs) click on the "Show all Hits" link under the title for the page. To see the HMM or COG information page click on the HMM or COG accession number in the table. Click on the "Options" button in the top right hand corner of the page to select a different annotation type or to download the nucleotide or protein sequences for all genes in the table. Finally, click on the "Download" button in the top right hand corner of the page to download the table in tab delimited format.

Role Category Search Report: Shows the genes for the selected genomes that are within the selected role categories. To get to the Gene Page for a gene click on the Locus. Click on the EC Number to get a description of the number. Click on the "Options" button in the top right hand corner of the page to select a different annotation type or to download the nucleotide or protein sequences for all genes in the table. Finally, click on the "Download" button in the top right hand corner of the page to download the table in tab delimited format.

Region View Report: This tool displays a region of a DNA molecule showing all genes. To view the information on each of the genes displayed, place the cursor over the gene of interest. The fields (locus, coordinates, name, etc.) will fill in the appropriate information. To access the gene information page on a gene, click on the gene of interest. Very small genes will not be labeled on the graphical display below to keep gene name overlap to a minimum. A second table displays the start and stop sites for all six translational reading frames (three for the "top" DNA strand and three, reading in the opposite orientation, on the "bottom" strand). The long black lines represent stop sites while the short colored lines represent the 3 different start sites. The vertical position of each gene in the gene display above corresponds to the reading frame in which it is found.
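The six-frame table described above can be approximated with a short script that scans each reading frame for start and stop codons. This is a sketch only: the three start codons are assumed here to be ATG, GTG and TTG (the description says only that there are three), and positions for the "bottom" strand are reported as offsets along the reverse complement.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODONS = {"ATG", "GTG", "TTG"}  # assumption: the three starts shown
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def frame_sites(sequence: str) -> dict:
    """Map each of the six frames to 0-based start and stop codon offsets.

    Frames "+1".."+3" read the given strand; "-1".."-3" read the reverse
    complement, and their offsets are measured along that strand.
    """
    seq = sequence.upper()
    result = {}
    for label, strand in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            starts, stops = [], []
            # Step through complete codons only.
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if codon in STOP_CODONS:
                    stops.append(i)
                elif codon in START_CODONS:
                    starts.append(i)
            result[f"{label}{frame + 1}"] = {"starts": starts, "stops": stops}
    return result


sites = frame_sites("ATGAAATAG")
print(sites["+1"])  # {'starts': [0], 'stops': [6]}
print(sites["+2"])  # {'starts': [], 'stops': [1]}
```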

Position Search/Segment Retrieval Report: This tool shows either the nucleotide sequence or a gene list between two coordinates on a DNA molecule. For the gene list click on the locus name in the table to get to the Gene Page for that gene. Click on the "Options" button in the top right hand corner of the page to select a different annotation type or to download the nucleotide or protein sequences for all genes in the table. Finally, click on the "Download" button in the top right hand corner of the page to download the table in tab delimited format.

CMR Toolbox Page Tools


CMR Toolbox: This page lists the "toolbox" tools available, including BLAST searches, restriction digest, operon predictions, as well as many other tools.

Restriction Digest Tool Selection Page: Type II restriction digestions can be generated from any organism using the restriction enzymes displayed. You must select your enzymes, select your sequence, and then select your output options. In the "Enzyme Selection" section you can either select a list of one or more enzymes, choose your own recognition site, or choose a minimum or maximum number of cuts. To select a list of enzymes choose enzymes from the column on the left and add or remove those to your list on the right using the "Add" and "Remove" buttons. Select an enzyme and click on the "See Enzyme Info" button to get more information on that enzyme. In the "Sequence Selection" section you must select either an organism and molecule or input your sequence into the text box or upload a file of sequence. Finally, select your output options in the "Output Options" section. If no locus or coordinates are provided, the complete molecule will be digested.
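The core of a restriction digest is locating each recognition site and cutting at a fixed offset within it. The sketch below does this for a linear molecule, scanning the given strand only, with exact matching and no degenerate bases; these simplifications and the function name `digest` are assumptions for illustration, not a description of how the CMR tool is implemented. The example uses EcoRI, which recognizes GAATTC and cuts after the first G.

```python
def digest(sequence: str, site: str, cut_offset: int):
    """Cut a linear sequence at every occurrence of `site`.

    `cut_offset` is the number of bases of the site left on the upstream
    fragment (1 for EcoRI, G^AATTC). Returns (cut_positions, fragments).
    """
    seq = sequence.upper()
    site = site.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        # Advance by one base so overlapping sites are also found.
        pos = seq.find(site, pos + 1)

    fragments = []
    previous = 0
    for cut in cuts:
        fragments.append(seq[previous:cut])
        previous = cut
    fragments.append(seq[previous:])
    return cuts, fragments


cuts, fragments = digest("AAGAATTCGGGAATTCTT", "GAATTC", 1)
print(cuts)       # [3, 11]
print(fragments)  # ['AAG', 'AATTCGGG', 'AATTCTT']
```

A minimum or maximum number of cuts, as offered in the "Enzyme Selection" section, could then be applied by checking `len(cuts)` for each enzyme.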

DNA molecule Information Selection Page: This tool provides detailed information on each of the DNA molecules found in the organism (chromosomes, plasmids) including the topology (linear or circular), length, %A, T, G, C and number of genes. Select your organisms of interest and add them to your selected genome list by using the "add" button. Hit the Submit button to get the results.

DNA molecule Information Results: This tool provides detailed information on each of the DNA molecules found in the organism (chromosomes, plasmids) including the topology (linear or circular), length, %A, T, G, C and number of genes. To download the table into a tab delimited file click on the "Download" button.
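The %A, T, G, C figures reported for each molecule are simple base frequencies. A minimal sketch of the calculation, assuming the percentages are taken over the full length of the molecule (so ambiguous bases such as N count toward the length but not toward any of the four bases), could look like this; `base_composition` is a hypothetical name.

```python
from collections import Counter


def base_composition(sequence: str) -> dict:
    """Return the percentage of A, T, G and C in `sequence`."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence is empty")
    counts = Counter(seq)
    # Counter returns 0 for bases that never occur.
    return {base: 100.0 * counts[base] / len(seq) for base in "ATGC"}


print(base_composition("AATTGGCC"))
# {'A': 25.0, 'T': 25.0, 'G': 25.0, 'C': 25.0}
```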

Pseudo-2D Gel Selection Page: This tool makes a computer model of a 2-Dimensional gel for the selected organism. Select your genome and hit the submit button.

Pseudo-2D Gel Results: This tool makes a computer model of a 2-Dimensional gel for the selected organism. The results here are an approximation of what a 2D gel for this organism should look like. No attempt has been made to try to deal with issues like post-transcriptional processing. Mouse over the dots in the plot to view more information. Use the pull down menu to select different options including: Zooming in or out of the graph, switching the X and Y axis and selecting the Gene Page for a gene of interest. Click on the link at the bottom of the page to get the data from the graph in tab delimited format. Click on the "Options" button to change the annotation type that is currently being displayed.

Compact Genome Display Selection Page: This tool shows all of the genes on a DNA molecule as different colored stars. The colors represent different functional roles. Select an organism of interest and then select the DNA molecule of interest. Hit the "Submit" button to get the results.

Compact Genome Display Results: This tool shows the genes on the selected DNA molecule as different colored stars. The colors represent different functional roles. A window with a legend that shows which colors represent which functional roles also opens when this page is loaded. To display basic information about each gene, place the cursor over the gene of interest and the information will be displayed below. To see the Gene Page for a gene, to zoom in on a region, to highlight the selected role category or to highlight and show the RNAs in the molecule select the appropriate option and click on the gene/region of interest.

Circular Genome Display Selection Page: This tool shows a circular image of a selected DNA molecule. The image includes representation of all genes on the molecule as well as all tRNAs and rRNAs. Select an organism of interest and then select the DNA molecule of interest. Also select the annotation type of interest. Hit the "Submit" button to get the results.

Circular Genome Display Results: This tool shows a circular image of a selected DNA molecule. The image includes representation of all genes on the molecule as well as all tRNAs and rRNAs. The colors of the genes represent different functional roles. A window with a legend that shows which colors represent which functional roles also opens when this page is loaded. To display a list of genes/RNAs in a region, place the cursor over the area of interest and the information will be displayed below. To zoom in on a region click on the region of interest.

Role Category Color Chart: This page shows the colors used to represent each role category in the CMR.

GC Content Display Selection Page: This tool shows a graph displaying the %GC content for a selected DNA molecule. Select an organism of interest and then select the DNA molecule of interest. Also select the annotation type of interest. Hit the "Submit" button to get the results.

GC Content Display Results: This graph shows the %GC content for the selected DNA molecule. The genome is represented on the X-axis from left to right. The Y-axis of the plot shows the average %GC for a set "window" of nucleotides; use the window option below to change this setting. The black center line is the median %GC found in this genome. The two dashed red lines represent the 5% lower limit and the 95% upper limit of the average GC content for the DNA molecule shown; 90% of the GC content of this genome falls within these two dashed red lines. If a specific gene was passed to this program, the location of the gene is shown on the plot by a blue line.
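The quantities plotted on this page can be reproduced approximately as follows. The sketch computes the average %GC for each window, then the median and the 5% and 95% limits of those window values. Whether the CMR uses overlapping or non-overlapping windows is not stated, so the step size is left as a parameter (non-overlapping by default); the function names are hypothetical.

```python
import statistics


def gc_windows(sequence: str, window: int, step=None) -> list:
    """Return the %GC of each window along the sequence."""
    seq = sequence.upper()
    step = step or window  # default: non-overlapping windows
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        values.append(100.0 * gc / window)
    return values


def gc_summary(values: list) -> dict:
    """Median plus the 5% and 95% limits of the window values."""
    # n=20 yields 19 cut points at 5%, 10%, ..., 95%. The "inclusive"
    # method interpolates within the data rather than extrapolating.
    cuts = statistics.quantiles(values, n=20, method="inclusive")
    return {
        "median": statistics.median(values),
        "lower_5": cuts[0],
        "upper_95": cuts[-1],
    }


values = gc_windows("GGCCAATTGCAT", window=4)
print(values)  # [100.0, 0.0, 50.0]
print(gc_summary(values)["median"])  # 50.0
```

In practice the window would be hundreds or thousands of nucleotides; the tiny sequence here is only meant to make the arithmetic easy to follow.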

Paralog Description: Shows the family members in a paralogous family. Click on the "Locus" to get to the Gene Page for a family member. Click on "Alignment" to see the alignment of the family.

Paralog Alignment: Shows the alignment of family members in a paralogous family.

Codon Usage: Shows codon usage within a genome. Select a genome and annotation type. Optional parameters include specific molecules, role categories, and coordinates.
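Codon usage is, at its simplest, a tally of the in-frame codons across a set of coding sequences. The following sketch shows the idea under the assumption that each input sequence starts in frame; trailing bases that do not form a complete codon are ignored. The name `codon_usage` is hypothetical.

```python
from collections import Counter


def codon_usage(coding_sequences) -> Counter:
    """Count in-frame codons across an iterable of coding sequences."""
    counts = Counter()
    for cds in coding_sequences:
        cds = cds.upper()
        usable = len(cds) - len(cds) % 3  # drop any incomplete final codon
        for i in range(0, usable, 3):
            counts[cds[i:i + 3]] += 1
    return counts


usage = codon_usage(["ATGAAATAA", "ATGAAA"])
print(usage["ATG"], usage["AAA"], usage["TAA"])  # 2 2 1
```

Restricting the input to genes from particular molecules, role categories, or coordinate ranges corresponds to the optional parameters mentioned above.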

CMR Comparative Analyses Page Tools


CMR Comparative Analyses: Here you can find the tools that compare the different genomes in the CMR to one another. Tools include GC comparison, Genomes region comparison, Genome v/s Genome protein hits, and the Genome v/s Genome protein scatter plot.

Genome Region Comparison Selection Page: All of the proteins in the CMR are routinely blasted against one another. We call these the All v/s All searches. When a new genome is added to the database the All v/s All searches are re-run. This tool takes the protein encoded by the supplied locus and finds similar proteins in other organisms on the CMR using the All v/s All search results. The supplied locus and its surrounding genes are then aligned with genes that produce similar proteins and their surrounding genes. This allows the user to see regions of similarity that two or more genomes may have in common. Input your locus of interest and select the organisms to which you would like to compare the locus.

Genome Region Comparison Results: All of the proteins in the CMR are routinely blasted against one another. We call these the All v/s All searches. When a new genome is added to the database the All v/s All searches are re-run. This tool takes the protein encoded by the supplied locus and finds similar proteins in other organisms on the CMR using the All v/s All search results. The supplied locus and its surrounding genes are then aligned with genes that produce similar proteins and their surrounding genes. This allows the user to see regions of similarity that two or more genomes may have in common. The loci are color coded by role category. Mouse over the gene of interest to see more information. The genomes are ordered by overall similarity of the entire region to the comparison organism, i.e., the genome with the most similarity to the comparison genome is listed first in the graphic.

Genome v/s All Genomes Protein Hits Selection Page: All of the proteins in the CMR are routinely blasted against one another to produce the "All v/s All" data. When a new genome is added to the database the all v/s all searches are re-run. This page shows the number of genes the supplied organism has in common with all of the genomes in the CMR (not including itself), based on all v/s all searches. Select your reference genome and select whether you want to see "Total Protein Hits" or "Best Protein Hits". The "Best Protein Hits" option shows only the protein or proteins from the blast search that have the lowest P value. The "Total Protein Hits" option shows all of the hits stored in the database.
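The difference between "Total Protein Hits" and "Best Protein Hits" can be illustrated with a small filter. The sketch assumes each hit is a (query locus, match locus, P value) tuple and that "best" is decided per query protein, keeping ties; the tuple layout, the locus names, and the function name `best_hits` are all made up for the example.

```python
def best_hits(hits):
    """Keep, for each query protein, the hit or hits with the lowest P value."""
    lowest = {}
    for query, _match, p_value in hits:
        if query not in lowest or p_value < lowest[query]:
            lowest[query] = p_value
    # Ties are retained: "the protein or proteins" with the lowest P value.
    return [hit for hit in hits if hit[2] == lowest[hit[0]]]


# Hypothetical hits; "Total Protein Hits" would show all four rows.
total_hits = [
    ("GENE_0001", "ORG2_0042", 1e-50),
    ("GENE_0001", "ORG2_0107", 1e-12),
    ("GENE_0002", "ORG2_0300", 1e-20),
    ("GENE_0002", "ORG2_0301", 1e-20),
]
for hit in best_hits(total_hits):
    print(hit)
```

Here GENE_0001 keeps only its strongest match, while GENE_0002 keeps both matches because they share the same lowest P value.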

Genome v/s All Genomes Protein Hits Results: All of the proteins in the CMR are routinely blasted against one another to produce the "All v/s All" data. When a new genome is added to the database the all v/s all searches are re-run. This page shows the number of genes the supplied organism has in common with all of the genomes in the CMR (not including itself), based on all v/s all searches. Each dot on the plot below represents a "match" organism in the CMR. The X-axis value is the total number of genes in each match organism. The Y-axis value is the number of hits between the match organism and the supplied organism, based on the all v/s all blast searches. Mouse over the dots to see the information for each genome shown in the form. Click on the link at the bottom of the page to get a table of the data shown in the graphic.

Scatter Plot Selection Page: All of the proteins in the CMR are routinely blasted against one another to produce the "All v/s All" data. When a new genome is added to the database the all v/s all searches are re-run. This tool will generate a plot of protein hits between two organisms and is intended to show how closely two genomes are related. The protein hits between two organisms are taken from the all v/s all blast searches. Choose a reference genome and DNA molecule as well as a comparison genome and molecule. The reference and comparison genomes can be the same organism. Also select if you want to see "Total Protein Hits" or "Best Protein Hits". The "Best Protein Hits" only shows the protein or proteins from the blast search that have the lowest P value. The "Total Protein Hits" shows all of the hits stored in the database.

Scatter Plot Results: All of the proteins in the CMR are routinely blasted against one another to produce the "All v/s All" data. When a new genome is added to the database the all v/s all searches are re-run. This tool will generate a plot of protein hits between two organisms and is intended to show how closely two genomes are related. The protein hits between two organisms are taken from the all v/s all blast searches. Each dot in the graphic represents two genes that are similar to one another, one gene is from the X-axis genome while the other is from the Y-axis genome. Mouse over the dots to see the information for each genome shown in the form. Click on the link at the bottom of the page to get a table of the data shown in the graphic. Click on the "Options" button to change the annotation type that is currently being displayed.

Role Category Pie Graph Selection Page: This tool produces a pie chart that shows the percentage of genes represented in each of the TIGR role categories. Select a genome and hit the "Submit" button to get the results.

Role Category Pie Graph Results: The pie chart shows the percentage of genes that are represented in each of the TIGR role categories. Beside the pie chart a table also shows the number and percentages of genes in each family represented. To see a list of the genes in each of the families, click on the number of genes in the table. Click on the "Download" button to download the results from the table into a tab delimited file. Click on the "Options" button to change the annotation type that is currently being displayed.
Warning: All Role Category data shown below was generated on the TIGR annotation of the genome. The data shown in this graph was generated at the time each of the genomes was entered into the CMR. Consequently, newer genomes may have more genes assigned to role categories such as conserved hypothetical protein than older genomes. In addition, because some genes are assigned to more than one role category the total number of genes represented below may be higher than the number of genes in the genome.
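The effect described in the warning, where the per-category totals can add up to more than the number of genes, follows directly from how the counts are made. A minimal sketch, assuming percentages are taken relative to the number of genes in the genome (the denominator is not stated above) and using invented gene and role names:

```python
from collections import Counter


def role_percentages(gene_roles: dict) -> dict:
    """Map each role category to (gene count, percentage of all genes).

    `gene_roles` maps a locus name to the list of role categories
    assigned to that gene.
    """
    n_genes = len(gene_roles)
    counts = Counter(
        role for roles in gene_roles.values() for role in set(roles)
    )
    return {
        role: (count, 100.0 * count / n_genes)
        for role, count in counts.items()
    }


# Hypothetical genome of four genes; g2 has two role categories.
genes = {
    "g1": ["Transport"],
    "g2": ["Transport", "Energy metabolism"],
    "g3": ["Energy metabolism"],
    "g4": ["Hypothetical"],
}
table = role_percentages(genes)
print(table["Transport"])                          # (2, 50.0)
print(sum(count for count, _ in table.values()))   # 5, more than 4 genes
```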

Role Category Bar Chart Selection Page: This tool shows the number and percentage of genes in the CMR assigned to each of the role categories. Select the organisms you would like to show up in the graph as well as your TIGR role category of interest.

Role Category Bar Chart Results: The bar chart shows the percentage of genes that are represented in the selected TIGR role category. Above the bar chart a table also shows the number and percentages of genes in each family represented. To see a list of the genes in each of the organisms, click on the number of genes in the role category in the table. Click on the "Download" button to download the results from the table into a tab delimited file. Click on the "Options" button to change the annotation type that is currently being displayed.
Warning: All Role Category data shown below was generated on the TIGR annotation of the genome. The data shown in this graph was generated at the time each of the genomes was entered into the CMR. Consequently, newer genomes may have more genes assigned to role categories such as conserved hypothetical protein than older genomes. In addition, because some genes are assigned to more than one role category the total number of genes represented below may be higher than the number of genes in the genome.

Terminator Bar Chart Selection Page: This tool shows the number and percentage of genes in each genome predicted to have Rho-independent terminators. Add as many genomes as you would like to your "Selected Genomes" list by clicking on the organism name and then hitting the "Add" button. The graph will only include the genomes selected on this page.

Terminator Bar Chart Results: This tool shows the number and percentage of genes in each genome predicted to have Rho-independent terminators. A summary of the results is shown in a table, and the bar chart is also shown. To download the data in the table click on the "Download" button. Click on the "Options" button to change the annotation type that is currently being displayed.

GC Comparison Selection Page: All of the proteins in the CMR are routinely blasted against one another to produce the "All v/s All" data. When a new genome is added to the database the all v/s all searches are re-run. This tool will generate a plot of protein hits between two organisms and compares the % GC of the hits. Choose a reference genome and DNA molecule as well as a comparison genome and molecule. The reference and comparison genomes can be the same organism. Also select if you want to see "Total Protein Hits" or "Best Protein Hits". The "Best Protein Hits" only shows the protein or proteins from the blast search that have the lowest P value. The "Total Protein Hits" shows all of the hits stored in the database.

GC Comparison Results: All of the proteins in the CMR are routinely blasted against one another to produce the "All v/s All" data. When a new genome is added to the database the all v/s all searches are re-run. This tool will generate a plot of protein hits between two organisms and compares the % GC of the hits. The protein hits between two organisms are taken from the all v/s all blast searches. Each dot in the graphic represents two genes that are similar to one another, one gene is from the X-axis genome while the other is from the Y-axis genome. Those genes along the diagonal have approximately the same %GC content. Blast matches that have significantly different GC contents may indicate interesting evolutionary processes, such as horizontal gene transfer or strong selection. Mouse over the dots to see the information for each genome shown in the form. Click on the link at the bottom of the page to get a table of the data shown in the graphic. Click on the "Options" button to change the annotation type that is currently being displayed.
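To make the idea of "genes along the diagonal" concrete, the sketch below computes the %GC of both genes in each matched pair and reports the pairs whose values differ by more than a chosen threshold. The threshold of 10 percentage points, the tuple layout, and the function names are assumptions for illustration only; the page itself does not define what counts as a significant difference.

```python
def gc_percent(sequence: str) -> float:
    """Return the %GC of a nucleotide sequence."""
    seq = sequence.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def off_diagonal_pairs(pairs, threshold: float = 10.0):
    """Return matched gene pairs whose %GC differs by more than `threshold`.

    Each pair is (x_locus, x_sequence, y_locus, y_sequence).
    """
    flagged = []
    for x_locus, x_seq, y_locus, y_seq in pairs:
        gc_x, gc_y = gc_percent(x_seq), gc_percent(y_seq)
        # Points on the diagonal have gc_x roughly equal to gc_y.
        if abs(gc_x - gc_y) > threshold:
            flagged.append((x_locus, y_locus, gc_x, gc_y))
    return flagged


# Hypothetical matched pairs from two genomes.
pairs = [
    ("A_0001", "GCGC", "B_0001", "GCGC"),
    ("A_0002", "GCGC", "B_0002", "ATAT"),
]
print(off_diagonal_pairs(pairs))  # [('A_0002', 'B_0002', 100.0, 0.0)]
```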

CMR Gene List Page Tools


CMR Gene Lists: Find tools here that list different feature types of CMR genomes including ORFs, terminators, RNAs, TIGRFAMs, Pfams, and EC numbers. Some of the tools list by TIGR role category.

CMR Role Categories: This page shows all of the CMR TIGR role categories. Click on a "+" sign to expand the main role category and show all sub role categories under it. To download the entire list click on the "Download" button in the upper right hand corner of the page. To see a list of genes from a role category click on the role category name.

RNA List Selection Page: This is the selection page to display the RNAs from the selected genomes. Choose the genomes you would like to search by clicking on the genome name and hitting the "Add" button. You can choose to display the Transfer RNAs (tRNAs), Ribosomal RNAs (rRNAs), or Structural RNAs (sRNAs), but the default is to display all three. When you are done making your selections hit the "Submit" button.

RNA List Results: All of the selected RNA types from the selected genomes are shown on this page. The tables are broken down by organism and by RNA type. To go to the RNA report page click on the RNA name. To download a table click on the "Download" button at the top right hand corner of the table.

RNA Report Page: This page shows the information on a selected RNA, including the sequence, length, and coordinates. If this is a tRNA the anticodon is also included. To download the table click on the "Download" button at the top right hand corner of the page.

Clone Viewer Page: Clones in the area of the selected gene that are available at the TIGR/ATCC Microbial Genome Special Collection are displayed below along with their corresponding ATCC numbers. The clones that span your selected gene are marked in blue. You may order clones from the TIGR/ATCC Microbial Genome Special Collection by clicking the "Order Clone" button below the image. Clones are available at ATCC's website.

Terminator List Selection Page: This page will display a list of rho-independent transcriptional terminators for the selected genomes. The terminators are identified using an algorithm that searches for a common mRNA motif. The algorithm is fully described in: M.D. Ermolaeva, H.G. Khalak, O. White, H.O. Smith and S.L. Salzberg. Prediction of Transcription Terminators in Bacterial Genomes. Journal of Molecular Biology, 301 (2000), 27-33. Choose the genomes you would like to search by clicking on the genome name and hitting the "Add" button. When you are done making your selections hit the "Submit" button.

Terminator List Results: This page shows the predicted rho-independent terminators for the selected genomes. The terminators are identified using an algorithm that searches for a common mRNA motif. The algorithm is fully described in: M.D. Ermolaeva, H.G. Khalak, O. White, H.O. Smith and S.L. Salzberg. Prediction of Transcription Terminators in Bacterial Genomes. Journal of Molecular Biology, 301 (2000), 27-33. To see the Gene Page for a gene click on the Terminator Locus. To download the table click on the "Download" button at the top right hand corner of the page.

Enzyme Commission Numbers List: This page lists all of the Enzyme Commission (EC) numbers in the CMR. To expand the list and see the EC numbers under the main numbers click on the "+" sign. To compress the list click on the "-" sign. To search for all genes in the CMR assigned a specific EC number click on the EC number in the list. To download the data click on the "Download" button at the top right hand corner of the page.

The Evidence List Page: This page lists the accessions of different evidence types, including COGs, TIGRFAMs, and Pfams. See below to find explanations of these different pages.
    CMR TIGRFAMs Ordered by Role Category - here the different TIGRFAMs assigned to the CMR are all listed ordered by TIGR role category. To expand the role categories click on the "+" button. To compress the categories click on the "-" button. To download the entire page click on the "Download" button at the top right hand corner of the page. To download one table click on the "Download" button at the top right hand corner of the table. To go to the HMM information page for a TIGRFAM click on the TIGRFAM accession. To search the CMR for genes that are assigned a TIGRFAM click on the "Hits" link in the table.
    CMR TIGRFAMs - here all of the TIGRFAMs in the CMR are listed ordered by accession number. To download the table click on the "Download" button at the top right hand corner of the page. To go to the HMM information page for a TIGRFAM click on the TIGRFAM accession. To search the CMR for genes that are assigned a TIGRFAM click on the "Hits" link in the table.
    CMR Pfams - here all of the Pfams in the CMR are listed ordered by accession number. To download the table click on the "Download" button at the top right hand corner of the page. To go to the Pfam information page click on the Pfam accession. To search the CMR for genes that are assigned a Pfam click on the "Hits" link in the table.
    CMR COGs - here all of the NCBI Clusters of Orthologous Groups of proteins (COGs) in the CMR are listed ordered by accession number. To download the table click on the "Download" button at the top right hand corner of the page. To go to the COG information page click on the COG accession. To search the CMR for genes that are assigned a COG click on the "Hits" link in the table.
    CMR Interpros - here all of the Interpro protein families, domains and functional sites in the CMR are listed ordered by accession number. To download the table click on the "Download" button at the top right hand corner of the page. To go to the Interpro information page click on the Interpro accession. To search the CMR for genes that are assigned an Interpro click on the "Hits" link in the table.
    CMR PROSITEs - here all of the PROSITE protein families and domains in the CMR are listed ordered by accession number. To download the table click on the "Download" button at the top right hand corner of the page. To go to the PROSITE information page click on the PROSITE accession. To search the CMR for genes that are assigned a PROSITE click on the "Hits" link in the table.

CMR Download Page Tools


CMR Downloads: Find tools to download data from the CMR here.

Batch Download Selection Page: The CMR Batch Download allows you to retrieve a large number of sequences from the CMR and save them to your computer's local disk. There are several different options you can choose from:
    Retrieve all sequences for a specific organism - get all ORFs from an organism.
    Retrieve all sequences for a specific role category - get all ORFs from all organisms that are in a selected set of role categories.
    Retrieve all sequences for a specific organism AND a role category - get all ORFs from one organism that are in a selected set of role categories.
    Retrieve all sequences from a file of accessions - get all ORFs from a list of accessions uploaded to this program.
    Retrieve nucleotide sequences for an entire DNA molecule - get the entire DNA molecule for an organism.

Batch Download Results: The CMR Batch Download allows you to retrieve a large number of sequences from the Comprehensive Microbial Resource and save them to your computer's local disk. You can choose to download either all nucleotide sequences from one or all DNA molecules of an organism, or the nucleotide or protein sequences of all Open Reading Frames (ORFs) from an organism. There are three ways of submitting a query to retrieve ORF sequences: sending a list of accessions (Primary or TIGR annotation accession numbers), selecting an organism, or selecting a role category.

Gene Attribute Download Selection Page: Retrieves to your local disk selected gene attributes for either a user-generated list of genes or all genes from a specific organism and/or role category.

Gene Attribute Download Results: Retrieves to your local disk selected gene attributes for either a user-generated list of genes or all genes from a specific organism and/or role category.

Gene Page Tools


Gene Page Annotation Display: This tool displays the annotation, either primary or TIGR automated, for a gene.

Gene Page Sequence: Here the sequence, both nucleotide and protein, is displayed for a gene.

Gene Page Graphical Display: This tool presents the gene and associated information (including rho-independent terminators, ribosomal binding sites, TIGRFam/Pfam matches, secondary structure, SP sites, LPs, Interpro hits, and PROSITE hits). Clicking on any element will center the display on that particular element. Clicking on a region of the scale will zoom into that particular region. Zoom out doubles the current view. Refresh sets the image back to its original size. If the image is zoomed in close enough to display amino acids or nucleotides, they will automatically show next to the scale. The gene is located on the highlighted strand. The start and stop nucleotides are highlighted in red and green respectively. The arrows on either side of the scale scroll the view in their respective directions.

Gene Page Matrix: Shows, for the given gene, the number of hits from every genome to each type of evidence (Interpro, PROSITE, COG, HMMs, Roles).

Gene Page Codon Display: In this tool the codon table for the gene is shown. To see a table that can be downloaded click on the "View the Downloadable Codon Table" link at the bottom of the page.
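
For readers who want to reproduce a similar table from a downloaded sequence, the following is a minimal sketch that tallies codons in a coding sequence. The function name codon_counts is hypothetical, and the sketch assumes the sequence starts in frame; it does not claim to match the exact layout of the CMR codon table.

```python
from collections import Counter


def codon_counts(cds):
    """Count each codon in a coding sequence read in frame from position 0.

    Assumption: the sequence starts at the first base of the start codon.
    Any trailing one or two bases that do not form a full codon are ignored.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))
```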

Gene Page Protein v/s All: This table shows all of the proteins in the CMR that match the query gene using blastp. To see an alignment of the proteins select your proteins of interest by clicking on the "Alignment" button next to each gene and click on the "Get alignment of selected proteins" button at the bottom of the page. Alignments are determined using Smith-Waterman alignment.
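
The idea behind Smith-Waterman local alignment can be sketched as below. This is a simplified illustration only: it returns the best local alignment score rather than the alignment itself, and it uses a flat match/mismatch score with a linear gap penalty. The scoring values are assumptions for the example; the substitution matrix and gap penalties used by the CMR are not stated here.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between sequences a and b.

    Uses the standard Smith-Waterman recurrence, in which every cell is
    floored at zero so an alignment can start anywhere in either sequence.
    The match, mismatch and gap values are illustrative defaults.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

Because scores never drop below zero, a shared region scores the same whether or not it is flanked by unrelated sequence, which is what makes the alignment local.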

Gene Page Identification Alignment: This tool shows a summary table and protein alignment generated at the time this gene was annotated at TIGR. It compares this gene's protein to a non-redundant amino acid database which includes proteins from EGAD (a TIGR-created database of genes and proteins), SWISS-PROT/TrEMBL, GenPept, PIR and the CMR. This alignment was generated using praze (G. Sutton, unpublished), which produces an optimal gapped alignment using an implementation of the Smith-Waterman algorithm (J. Mol. Biol. (1981) 147, 195-197) that allows for frameshifts. The praze output displays codons stacked vertically above each translated amino acid of the query sequence. All query sequences were extended 300 nt at both ends to show possible upstream and downstream frameshifts or point mutations. The table and alignment display only the best hits to this gene's protein. To get more information on the genes that align with this gene, click on the accession number.

Gene Page Transmembrane HMM (TmHMM): This tool shows putative transmembrane helices in proteins based on Hidden Markov Models. Click the following link for a detailed description of TmHMM. Click here to go to the TmHMM server.

Gene Page HMM: This table shows all HMMs (both TIGRFAM and Pfam) that hit the gene. Hits above the trusted cutoff for the gene are shown in red, hits between the trusted and noise cutoffs are shown in gold, and hits below the noise cutoff are shown in black. For more information on the HMM click on the HMM accession in the table.
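
The color coding described above amounts to a three-way comparison against the two cutoffs, sketched below. The function name hit_color is hypothetical, and the treatment of scores exactly equal to a cutoff is an assumption, since the description only says "above", "between" and "below".

```python
def hit_color(score, trusted_cutoff, noise_cutoff):
    """Map an HMM hit score to the display color described for the table.

    Assumption: a score equal to a cutoff is treated as meeting it.
    """
    if score >= trusted_cutoff:
        return "red"    # above the trusted cutoff
    if score >= noise_cutoff:
        return "gold"   # between the trusted and noise cutoffs
    return "black"      # below the noise cutoff
```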

Gene Page Membrane Protein: This table shows several membrane protein attributes for the gene, including TmHMM transmembrane predictions, outer membrane protein predictions, lipoprotein predictions, and signal peptide predictions. See the table below for more detailed descriptions.

TmHMM: Transmembrane HMM - This tool shows putative transmembrane helices in proteins based on Hidden Markov Models. Click the following link for a detailed description of TmHMM. Click here to go to the TmHMM server.
OMP: Outer membrane protein - Prediction based on the presence of an N-terminal aromatic amino acid and at least 2 more aromatics in the last 10 amino acids of the protein. True positives should always be found in conjunction with a signal sequence. Outer membrane porins use beta barrels to cross the membrane, so TmHMM regions are generally not predictive.
LP: Lipoprotein - Prediction based on the presence of a particular sequence pattern within the first 35 amino acids. See PROSITE PS00013. Usually coincides with a signal sequence prediction. Membrane attachment is assumed to be by the lipid modification, so TmHMM regions are contra-predictive.
C-score: Raw cleavage site score - The output score from networks trained to recognize cleavage sites vs. other sequence positions. Trained to be high at position +1 (immediately after the cleavage site) and low at all other positions.
S-score: Signal peptide score - The output score from networks trained to recognize signal peptide vs. non-signal-peptide positions. Trained to be high at all positions before the cleavage site and low at 30 positions after the cleavage site and in the N-terminals of non-secretory proteins.
Y-score: Combined cleavage site score - The prediction of cleavage site location is optimized by observing where the C-score is high and the S-score changes from a high to a low value. The Y-score formalizes this by combining the height of the C-score with the slope of the S-score. Links to a graph of SignalP scores for the first 100 amino acids of the protein.
S-mean - The mean of the S-scores from the beginning of the protein to the position of highest Y-score.
Site: Predicted cleavage site - Cleavage is predicted to occur after this position. For Gram negative organisms, cleavage sites are most frequently between positions 18 and 30; for Gram positives, 25 and 42. Cleavage positions greater than 50 are suspect.
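
Two of the entries above lend themselves to a short worked sketch: the S-mean and the position ranges given for the predicted cleavage site. The code below is a hedged illustration with hypothetical function names. It assumes per-residue score lists of equal length, assumes the position of the highest Y-score is included in the mean, and uses the labels "typical", "atypical" and "suspect", which are invented for the example and do not appear in the CMR.

```python
def s_mean(s_scores, y_scores):
    """Mean of the S-scores from the first residue through the position
    of the highest Y-score (inclusive, which is an assumption)."""
    if not s_scores or len(s_scores) != len(y_scores):
        raise ValueError("score lists must be non-empty and of equal length")
    peak = max(range(len(y_scores)), key=y_scores.__getitem__)
    window = s_scores[:peak + 1]
    return sum(window) / len(window)


def site_note(site, gram_negative):
    """Classify a predicted cleavage position using the ranges quoted above.

    The returned labels are illustrative only.
    """
    low, high = (18, 30) if gram_negative else (25, 42)
    if site > 50:
        return "suspect"
    if low <= site <= high:
        return "typical"
    return "atypical"
```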

Gene Page Related Links: This table shows links specific for this gene or this genome.

Gene Page Terminator: This table shows the predicted Rho-independent transcription terminator for this gene.

Gene Page RBS: This table shows the predicted Ribosomal Binding Site for this gene. The RBS site is predicted by the gene finding program Glimmer.

Gene Page COG: This page shows any hits to NCBI's Clusters of Orthologous Groups of Proteins (COG).

Gene Page Paralog Display: This table shows annotation for paralogous gene families (genes which have been duplicated within a particular organism during evolution). To view the alignment of this family click on the "Alignment" link. To see the other members of this family click on the "Family Members" link.

Gene Page Gene Ontology (GO): The Gene Ontology Consortium has created a controlled vocabulary of terms to describe the properties of gene products. There are three ontologies: Biological Process, Molecular Function, and Cellular Component. A gene product can and should have multiple GO terms assigned. When a GO term is assigned, the type of evidence used for an assignment is captured in the Evidence Code, for example, sequence similarity (ISS), or electronic annotation (IEA). For a detailed description of the evidence codes please visit the GO evidence codes page.

Proteins from TIGR genomes are given an initial automated IEA GO term assignment. These are then manually reviewed and the IEA code is changed in favor of one reflecting manual curation. Proteins from genomes not sequenced at TIGR are given an automated assignment (IEA) and are not manually reviewed. For genes that have been assigned ISS the "With" column stores an accession of a protein, HMM, or motif representative of the evidence used for the annotation. The "Reference" column stores the literature citation for experiments done on the proteins or the citation of the publication describing the genome sequence. For more information on the Gene Ontology project please visit the GO home page.

Gene Page PSIPRED Secondary Structure Prediction: PSIPRED is a secondary structure prediction method, incorporating two feed-forward neural networks which perform an analysis on output obtained from PSI-BLAST (Position Specific Iterated - BLAST) (Altschul et al., 1997). This page displays output from version 2.0 of PSIPRED. PSIPRED version 2.0 includes a new algorithm which averages the output from up to 4 separate neural networks in the prediction process to further increase prediction accuracy. Using a very stringent cross validation method to evaluate the method's performance, PSIPRED 2.0 is capable of achieving an average Q3 score of nearly 78%. Predictions produced by PSIPRED were also submitted to the CASP4 server and assessed during the CASP4 meeting, which took place in December 2000 at Asilomar. PSIPRED 2.0 achieved an average Q3 score of 80.6% across all 40 submitted target domains with no obvious sequence similarity to structures present in PDB, which placed PSIPRED in first place out of 20 evaluated methods (an earlier version of PSIPRED was also ranked first in CASP3, held in 1998).
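
For context, the Q3 score is the percentage of residues whose three-state secondary structure (helix, strand or coil) is predicted correctly. The sketch below shows that calculation for one protein, assuming both structures are given as equal-length strings of H, E and C; the function name q3_score is hypothetical, and the sketch is not part of PSIPRED.

```python
def q3_score(predicted, observed):
    """Percentage of positions where the predicted three-state label
    (H = helix, E = strand, C = coil) matches the observed label."""
    if not predicted or len(predicted) != len(observed):
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(predicted)
```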

Gene Page Operon: Gene clusters often, but not always, represent a co-transcribed unit, or operon. We developed a method to detect and analyze conserved gene pairs: pairs of genes that are located close together on the same DNA strand in two or more bacterial genomes. For each conserved gene pair, we calculate an estimate of the probability that the genes belong to the same operon. The algorithm takes into account several alternative possibilities.

The method is described in: Maria D. Ermolaeva, Owen White and Steven L. Salzberg. Prediction of Operons in Microbial Genomes. Nucleic Acids Research, 29, 1216-1221, (2001)
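
The notion of a conserved gene pair can be illustrated with the sketch below. It is not the published method: it does not estimate any probability, it ignores gene order within a pair, and it rests on several assumptions that are labeled in the code. Genes are given as hypothetical (family, strand, start, end) tuples, shared family identifiers stand in for matching genes across genomes, and "close" is defined by an arbitrary max_gap parameter.

```python
from collections import defaultdict


def close_pairs(genes, max_gap=200):
    """Return neighboring same-strand gene pairs in one genome.

    `genes` is a list of (family, strand, start, end) tuples.
    Assumption: max_gap (in bp) is an arbitrary illustrative threshold.
    """
    ordered = sorted(genes, key=lambda g: g[2])
    pairs = set()
    for left, right in zip(ordered, ordered[1:]):
        same_strand = left[1] == right[1]
        gap = right[2] - left[3]
        if same_strand and gap <= max_gap:
            pairs.add(frozenset((left[0], right[0])))
    return pairs


def conserved_pairs(genomes, min_genomes=2, max_gap=200):
    """Return family pairs found close together in at least min_genomes genomes.

    `genomes` maps a genome name to its list of gene tuples.
    """
    seen = defaultdict(set)
    for name, genes in genomes.items():
        for pair in close_pairs(genes, max_gap):
            seen[pair].add(name)
    return {pair for pair, names in seen.items() if len(names) >= min_genomes}
```

A pair returned by conserved_pairs is only a candidate; the operon probability described above would be computed in a separate step that this sketch does not attempt.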

Carts


Genome Cart: Allows users to preselect a set of genomes of interest. To add a set of preferred genomes to the cart, select and add all genomes of interest. Click "Submit" once, and an alert box appears stating whether the preferences were saved. Once the genome cart has been set, other selection boxes for genomes in the CMR (lists of genomes in any search tools, toolbox tools, comparative analyses tools, and gene lists) will automatically display only this list of preferred genomes. To override the default of displaying only preferred genomes, click on the "Show All" button. Cookies must be enabled to use this tool.

Gene Cart: Allows users to preselect a set of genes of interest. Genes of interest can be added from any page that displays a table of genes, such as search results, resulting gene tables from toolbox and comparative analyses tools, and gene lists. Selected genes can be viewed and downloaded from the Gene Cart page. In addition, any gene cart genes found in any graphical results from comparative analyses and toolbox tools will automatically be highlighted. Cookies must be enabled to use this tool. The cart holds a maximum of 300 genes at once.

HMM Page Tools


HMM Summary Page: Shows all of the information for the HMM.

HMM NIAA Members: Shows all Non-Identical Amino Acid (NIAA) database members of an HMM.

HMM Alignment: Shows the MSF or FASTA alignment of the seed for an HMM.

Contact Us | © J. Craig Venter Institute