Enriching the SCOP Database with Phylogenetic and Protein Function Data
Different protein databases classify and organize proteins according to different criteria, so the organization and information they offer often vary. By mapping protein domains between databases, I can analyze them together and surface information that is not apparent from any single database. This summer, I am augmenting the computational resources of the Structural Classification of Proteins (SCOP) database by mapping it to the UniProt Gene Ontology Annotation database (UniProt-GOA) and the Pfam database. Using the Pfam-to-SCOP mapping, I can uncover the internal phylogeny of SCOP families and place new protein sequences within SCOP without necessarily knowing their structures. The UniProt-GOA mapping will be used to assign protein functions to the Protein Data Bank entries in SCOP and to extend the search function to include GO annotations.
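To make the idea of a cross-database mapping concrete, here is a minimal sketch of how GO annotations might be attached to SCOP domains by joining on PDB ID and chain. It is an illustration only, not the project's actual pipeline: the record layouts, the function names `map_domains_to_go` and `search_by_go`, the placeholder PDB IDs, and the pairing of GO terms with those entries are all assumptions, and the real SCOP and UniProt-GOA files use their own formats.

```python
from collections import defaultdict

# Hypothetical toy records; PDB IDs and GO assignments are placeholders.
scop_domains = [
    # (SCOP domain id, PDB id, chain)
    ("d1abca_", "1abc", "A"),
    ("d1abcb_", "1abc", "B"),
    ("d2xyza_", "2xyz", "A"),
]

goa_annotations = [
    # (PDB id, chain, GO term)
    ("1abc", "A", "GO:0005524"),
    ("1abc", "A", "GO:0016301"),
    ("2xyz", "A", "GO:0003677"),
]


def map_domains_to_go(domains, annotations):
    """Return {SCOP domain id: sorted GO terms}, joined on (PDB id, chain)."""
    go_by_chain = defaultdict(set)
    for pdb_id, chain, go_term in annotations:
        go_by_chain[(pdb_id.lower(), chain)].add(go_term)
    return {
        sid: sorted(go_by_chain.get((pdb_id.lower(), chain), set()))
        for sid, pdb_id, chain in domains
    }


def search_by_go(domain_to_go, go_term):
    """Return the SCOP domain ids annotated with the given GO term."""
    return sorted(sid for sid, terms in domain_to_go.items() if go_term in terms)


domain_to_go = map_domains_to_go(scop_domains, goa_annotations)
print(domain_to_go)
print(search_by_go(domain_to_go, "GO:0003677"))
```

Domains whose chain has no annotation keep an empty list rather than being dropped, so gaps in the mapping stay visible. A chain-level join like this is coarse: a real mapping would likely need residue ranges to tell apart multiple domains on the same chain.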
Message to Sponsor
- Major: Molecular and Cell Biology, Computer Science (minor)
- Sponsor: Rose Hills Foundation
- Mentor: Steven Brenner, Plant and Microbial Biology