TY - GEN
T1 - Using protein fragments for searching and data-mining protein databases
AU - Keasar, Chen
AU - Kolodny, Rachel
PY - 2013
Y1 - 2013
AB - Proteins are macromolecules involved in virtually all life processes. Protein sequence and structure data are accumulating at an ever-increasing rate in publicly available databases. To extract knowledge from these databases, we need efficient and accurate tools; this is a major goal of computational structural biology. The tasks we consider are searching and mining protein data; we rely on protein fragment libraries to build more efficient tools. We describe FragBag, an example of using fragment libraries to improve protein structural search. To search for patterns in structure space, we discuss methods to generate efficient low-dimensional maps. In particular, we use these maps to identify patterns of functional diversity and sequence diversity. Finally, we discuss how to extend these methods to protein sequences. To do this, one needs to predict local structure from sequence; we survey previous work suggesting that this is a feasible task. Furthermore, we show that such predictions can be used to improve sequence alignments. That is, protein fragments can be used to leverage protein structural data to improve remote homology detection.
UR - http://www.scopus.com/inward/record.url?scp=84898860034&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84898860034
SN - 9781577356172
T3 - AAAI Workshop - Technical Report
SP - 14
EP - 19
BT - Artificial Intelligence and Robotics Methods in Computational Biology - Papers from the 2013 AAAI Workshop, Technical Report
PB - AI Access Foundation
T2 - 2013 AAAI Workshop
Y2 - 14 July 2013
ER -