How to cite PLANTORDB

PlantOrDB has been published in BMC Plant Biology.

Li L, Ji G, Ye C, Shu C, Zhang J, Liang C (2015) PlantOrDB: a genome-wide ortholog database for land plants and green algae, BMC Plant Biology, 15:161.

How to use PLANTORDB

Suggested Browsers and Resolution

Modern browsers such as Chrome, Firefox, Safari, and Internet Explorer (10.0 or later) can be used to browse PlantOrDB.

Firefox and Chrome are recommended:

Mozilla Firefox 3.5 or greater (http://www.mozilla.org/)

Google Chrome 6.0 or greater (http://www.google.com/chrome/)

To fit more of the alignment on the page, the web interface is designed for high resolutions. The display works best at a screen width greater than 1500 pixels.

Our suggested resolution is 1680x1050.

* When sequences are loaded by AJAX, the average loading time is about 0.5-4 seconds. This is data transfer time.

** The average waiting time is about 10-15 seconds. This is background processing time.

*** The average waiting time for the Individual Gene Sequence-Annotation Viewer is about 5-15 seconds, depending on your internet connection.

**** At insufficient resolution, alignments may overlap in the Tree-alignment Combined Viewer. We observed this only in Chrome on Mac OS. In tests on a 13-inch MacBook Pro, other browsers on Mac OS displayed the alignment viewer correctly, as did Chrome on an iPad. The problem can be solved by adjusting the screen scaling.

1. Introduction

PlantOrDB (http://bioinfolab.miamioh.edu/plantordb) is a genome-wide ortholog database for plants. Through genome-wide ortholog identification across 41 plant species, we clustered 1,291,670 sequences into 49,355 homolog gene families based on character (amino acid)-based orthology and homology. For each gene family, we generated a phylogenetic tree and a protein sequence alignment and identified its diagnostic characters. These characters help place a query sequence appropriately into the existing phylogenetic tree and sequence alignment of the query's best-matched gene family. Users can selectively view the phylogenetic tree, sequence alignment, and diagnostic characters of a gene family for a desired species or subgroup of species. Query sequences can be uploaded to find the best-matched gene family and then visualized within the relevant phylogenetic tree and sequence alignment. PlantOrDB offers powerful search and visualization functions for functional and evolutionary genomics research.

PlantOrDB allows users to:

1) Browse summary information on ortholog/homolog gene families (Database Browser)

2) Search protein sequences or gene families in our database (Gene Family Search)

3) Search for orthologs of a gene (Ortholog Search)

4) Upload a query sequence and insert it into a particular gene family (Query Classification)

2. Database Browser

2.1 Gene Family Browser

Homolog gene family information is shown in a datagrid with three columns: gene family ID, gene family size, and gene family alignment length.

Users can filter the data by gene family size and gene family ID.

2.2 Protein Sequence Browser

Protein sequence information is shown in a datagrid with six columns: Gene ID, Gene name, Taxonomy ID, Species, Sequence, and Homolog Family ID.

Users can filter the data by Gene ID or Taxonomy ID. Users can also filter the data with the Homolog switch. When the switch is "yes", only genes that were clustered into a homolog gene family are displayed. When the switch is "no", only genes that could not be clustered into a homolog gene family are displayed. When the switch is "all", the filter is ignored and all genes are displayed. The default value is "yes".
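The three switch states can be summarized in a few lines of Python. This is only an illustrative sketch of the behavior described above, not PlantOrDB's actual implementation; the `ProteinRecord` class, the `filter_by_homolog_switch` function, and the gene IDs and family IDs in the sample data are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProteinRecord:
    # Hypothetical, simplified row of the Protein Sequence Browser datagrid.
    gene_id: str
    taxonomy_id: int
    homolog_family_id: Optional[int]  # None: gene is not in any homolog family


def filter_by_homolog_switch(records, switch="yes"):
    """Return the records kept by the Homolog switch ("yes" is the default)."""
    if switch == "all":
        return list(records)  # filter ignored, every gene is shown
    if switch == "yes":
        return [r for r in records if r.homolog_family_id is not None]
    if switch == "no":
        return [r for r in records if r.homolog_family_id is None]
    raise ValueError(f"unknown switch value: {switch!r}")


# Made-up example rows.
records = [
    ProteinRecord("geneA", 3702, 12),
    ProteinRecord("geneB", 3702, None),
    ProteinRecord("geneC", 4530, 12),
]

clustered = filter_by_homolog_switch(records)          # default "yes"
unclustered = filter_by_homolog_switch(records, "no")
everything = filter_by_homolog_switch(records, "all")
```

With the sample rows, the default setting keeps `geneA` and `geneC`, while "no" keeps only `geneB`.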

2.3 Gene Annotation Browser

Gene annotation information is shown in a datagrid with 16 columns. The meaning of each column is given in the following table.

As shown in the following figure, a page size selector sits at the bottom of the table (A). The available page sizes are 10, 25, 50, 100, 250, and 500 (B).

2.4 Individual Gene Sequence-annotation Viewer

A separate page, the Individual Gene Sequence-Annotation Viewer, displays the sequence, annotation, and details for a specific gene. It also provides a separate gene annotation table. As shown in the following figure, the page contains gene summary information (A), gene annotation (B), and gene sequence (C).

3. Gene Family Search

3.1 Homolog Gene Family Search

Users can search for a homolog gene family by a member's ID, a member's name, the gene family ID, or gene annotation (for convenience, only the numeric ID is needed). We also provide a fuzzy search function that lets users search gene families by component species and gene family size.

3.2 Gene family viewer

PlantOrDB provides a highly interactive web interface for each gene family, whether ortholog or homolog, that allows selective visualization of the phylogenetic tree and multiple sequence alignment for a desired species or subgroup of species. The main web interface has two panels: “Gene Family Details” and “Tree-alignment Combined Viewer”.

The “Gene Family Details” panel consists of the following six sections.

“Summary Information” (B) shows the gene family ID, the total number of component sequences (gene members), the total number of species, and the consensus sequence length.

“Download” (C) lets users download the protein sequence alignment, the sequences in FASTA format, the phylogenetic tree, and the consensus sequence.

“Consensus Sequence Viewer” (D) shows the consensus sequence with a ruler and pattern search capability.

“Pie Viewer” (E) shows all component species in a given gene family and their composition percentages (i.e., how many different gene sequences from each species belong to the gene family). From this pie chart, users can easily see the species distribution of the current gene family: whether the family is specific to one species, to a subgroup of species, or to all 35 species.

“Datagrid Viewer” (F) provides more detailed information about the species composition of a given gene family. It shows the species taxon ID, the abbreviated and full species names, and the number of different genes from each species within the gene family. The last column of “Datagrid Viewer” is a checkbox that is checked by default. When a user unchecks the checkbox for a species or subgroup of species, all parts of the sequence alignment and phylogenetic tree that belong to the unchecked species become invisible. This unique feature lets users focus on a desired species or subgroup of species for selective, partial visualization of the phylogenetic tree and sequence alignment within a given gene family.

“Tree Viewer” (G) shows species composition information for a given gene family, using a species-based phylogenetic tree with the component gene family numbers highlighted for each species.

The “Tree-alignment Combined Viewer” panel (H) has two parts: the phylogenetic tree on the left and the multiple sequence alignment on the right. In the phylogenetic tree, PlantOrDB represents genes with their gene names and species icons. When a user moves the mouse over a gene name or species icon, a table (I) appears that contains all details about this gene. Users can also click a gene name or species icon to open an Individual Gene Sequence-Annotation Viewer page that shows more detailed information. A navigation bar floats at the bottom right (J). It shows the position of the currently displayed part of the alignment and has four buttons for moving the alignment to the left or right, at a normal or faster pace.
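The species composition shown in “Pie Viewer” and the checkbox-based selection in “Datagrid Viewer” can be illustrated with a short Python sketch. It is not the code behind the web interface; the function names, the species abbreviations, and the tiny alignment below are made up for illustration.

```python
from collections import Counter

# Hypothetical gene family rows: (gene name, species abbreviation, aligned sequence).
alignment = [
    ("gene1", "Ath", "MK-LV"),
    ("gene2", "Ath", "MKALV"),
    ("gene3", "Osa", "MR-LV"),
    ("gene4", "Ppa", "MKSLI"),
]


def species_composition(rows):
    """Map each species to (gene count, percentage of the gene family)."""
    counts = Counter(species for _, species, _ in rows)
    total = sum(counts.values())
    return {sp: (n, 100.0 * n / total) for sp, n in counts.items()}


def visible_rows(rows, checked_species):
    """Keep only rows whose species checkbox is still checked."""
    return [row for row in rows if row[1] in checked_species]


composition = species_composition(alignment)

# Unchecking "Osa" hides its sequences; the other rows keep their order.
shown = visible_rows(alignment, {"Ath", "Ppa"})
```

In this toy family, "Ath" contributes two of four genes (50%), and unchecking "Osa" leaves `gene1`, `gene2`, and `gene4` visible.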

4. Ortholog Search

Users can search for the orthologs of a specific gene.

To show more information about ortholog genes, PlantOrDB provides another highly interactive web interface, as shown in the following figure. This interface has four parts. The first is “Expandable Ortholog Tree Viewer” (A), which shows all orthologs of a given gene and, recursively, their own orthologs, with the root node being the gene that the user selected or specified. The second is “Gene and its RBH Ortholog Genes” (B), which provides details about the specific gene and its ortholog genes. The third is “Ortholog Path Viewer” (C), which shows the concrete ortholog pathway between any two genes within an ortholog gene tree, so that users can see how the two genes are linked through their orthologs. This is a novel function that is not available in any of the other aforementioned databases. The fourth is “Ortholog Gene Details” (D), which presents a pie chart and a datagrid table describing the species composition and other information for all orthologs of a given gene.
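The idea of an ortholog path can be sketched as a path search in a graph whose nodes are genes and whose edges are ortholog pairs. The sketch below uses a breadth-first search, which returns a path with the fewest intermediate genes. This is an assumption made for illustration: the text above does not state how “Ortholog Path Viewer” computes its paths, and the function name and gene IDs are hypothetical.

```python
from collections import deque


def ortholog_path(ortholog_pairs, start, goal):
    """Return a shortest chain of genes linking start to goal, or None."""
    # Build an undirected graph: each ortholog pair links both genes.
    graph = {}
    for a, b in ortholog_pairs:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    if start not in graph or goal not in graph:
        return None

    previous = {start: None}  # gene -> gene it was reached from
    queue = deque([start])
    while queue:
        gene = queue.popleft()
        if gene == goal:
            # Walk back to the start to reconstruct the path.
            path = []
            while gene is not None:
                path.append(gene)
                gene = previous[gene]
            return path[::-1]
        for neighbor in sorted(graph[gene]):  # sorted for reproducible output
            if neighbor not in previous:
                previous[neighbor] = gene
                queue.append(neighbor)
    return None  # the two genes are not connected through orthologs


# Made-up ortholog pairs.
pairs = [("g1", "g2"), ("g2", "g3"), ("g3", "g4"), ("g1", "g5")]
path = ortholog_path(pairs, "g5", "g4")
```

Here `g5` and `g4` are not direct orthologs, but they are linked through the chain `g5`, `g1`, `g2`, `g3`, `g4`.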

5. Query Classification

Users can find the best-aligned ortholog or homolog gene family for a query sequence by uploading the sequence through the "query classification" interface.

New users can try our example by clicking the link at the bottom of the page. Two radio buttons let users choose whether to identify an ortholog or a homolog gene family.

The interface for the query sequence classification result is the same as in 3.2 Gene family viewer.