Genome Cluster Database. A Sequence Family Analysis Platform for Arabidopsis and Rice

Kevin Horan, Josh Lauricha, Julia Bailey-Serres, Natasha Raikhel and Thomas Girke
Plant Physiology
Vol. 138, No. 1 (May, 2005), pp. 47-54
Stable URL: http://www.jstor.org/stable/4629801
Page Count: 8

Abstract

The genome-wide protein sequences from Arabidopsis (Arabidopsis thaliana) and rice (Oryza sativa ssp. japonica) were clustered into families using sequence similarity and domain-based clustering. The two fundamentally different methods resulted in separate cluster sets with complementary properties that compensate for each other's limitations in accurate family analysis. Functional names for the identified families were assigned with an efficient computational approach that uses the description of the most common molecular function Gene Ontology node within each cluster. Subsequently, multiple alignments and phylogenetic trees were calculated for the assembled families. All clustering results and their underlying sequences were organized in the Web-accessible Genome Cluster Database (http://bioinfo.ucr.edu/projects/GCD) with rich interactive and user-friendly sequence family mining tools to facilitate the analysis of any given family of interest for the plant science community. An automated clustering pipeline keeps the information current as the annotations of the two genomes are updated and the clustering is improved. The analysis allowed the first systematic identification of family and singlet proteins present in both organisms as well as those restricted to one of them. In addition, the established Web resources for mining these data provide a road map for future studies of the composition and structure of protein families between the two species.
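
The family-naming step described in the abstract (take the most common molecular function Gene Ontology description within each cluster) can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The function name `name_clusters`, the input dictionaries, the protein and cluster IDs, the alphabetical tie-break, and the "unknown" fallback are all assumptions made for illustration.

```python
from collections import Counter


def name_clusters(clusters, go_terms):
    """Name each cluster after its most common GO molecular-function description.

    clusters: dict mapping a cluster ID to a list of protein IDs (hypothetical layout)
    go_terms: dict mapping a protein ID to a list of GO molecular-function
              descriptions (hypothetical layout)
    """
    names = {}
    for cluster_id, members in clusters.items():
        counts = Counter()
        for protein in members:
            # Count each description once per protein so that a heavily
            # annotated protein does not dominate the vote.
            counts.update(set(go_terms.get(protein, [])))
        if counts:
            # Highest count wins; ties are broken alphabetically (an assumption,
            # chosen only to make the result deterministic).
            names[cluster_id] = min(counts.items(), key=lambda kv: (-kv[1], kv[0]))[0]
        else:
            # No annotated member: fall back to a placeholder name (assumption).
            names[cluster_id] = "unknown"
    return names


# Hypothetical example data, not taken from the database.
example_clusters = {"fam1": ["prot_a", "prot_b", "prot_c"], "fam2": ["prot_d"]}
example_go = {
    "prot_a": ["kinase activity"],
    "prot_b": ["kinase activity", "ATP binding"],
    "prot_c": ["ATP binding", "kinase activity"],
}
print(name_clusters(example_clusters, example_go))
```

In this toy input, "kinase activity" is carried by three members of `fam1` and "ATP binding" by two, so `fam1` takes the former, while `fam2` has no annotated member and receives the placeholder.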
