Fold recognition using sequence and secondary structure information.
We applied a succession of sequence search and structure prediction methods to the targets in the fold recognition part of the CASP3 experiment. For each target, we expanded an initial sequence space, obtained through PSI-BLAST, first by searching for statistically significant relationships to low-scoring sequences and then by searching for conserved sequence patterns. We then divided the proteins in the sequence space into families and built an alignment hierarchically, using the multiple alignment program MACAW. If no significant similarity to a protein of known structure was apparent at this point, we submitted the alignment to the Jpred server for consensus secondary structure prediction and searched the structure space using the secondary structure mapping program MAP. If this search also failed, we compared the structural properties that we believed we recognized in the aligned proteins to the folds in the SCOP database by visual inspection. If all these methods failed to uncover a plausible match, we predicted that the target would adopt a novel fold. This procedure yielded correct answers for seven of twenty-one targets and a partly correct answer for one. A retrospective analysis shows that automating the sequence search procedures would have improved the results significantly, yielding at least three additional correct predictions.
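The fall-through logic of this procedure can be sketched in code. The following is a minimal illustration only, not the implementation used in the experiment: the function `predict_fold`, the helper `lookup_stage`, the stage labels, and the target and fold names are all hypothetical, and each stage is a stand-in lookup table rather than a call to PSI-BLAST, MACAW, Jpred, or MAP. It shows only that stages are tried in order and that a novel fold is predicted when none of them finds a match.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

# A stage takes a target identifier and returns a fold name,
# or None when it finds no plausible match. (Hypothetical interface.)
Stage = Callable[[str], Optional[str]]

NOVEL_FOLD = "novel fold"


def predict_fold(target: str, stages: Sequence[Tuple[str, Stage]]) -> Tuple[str, str]:
    """Try each stage in order and return (stage label, fold) for the first match.

    If no stage reports a match, fall through to a novel-fold prediction.
    """
    for label, stage in stages:
        fold = stage(target)
        if fold is not None:
            return label, fold
    return "no match", NOVEL_FOLD


def lookup_stage(table: Dict[str, str]) -> Stage:
    """Build a toy stage from a dictionary; a stand-in for a real search method."""
    return lambda target: table.get(target)


# Toy stages in the order described in the text. The contents are invented.
stages = [
    ("sequence search", lookup_stage({"target_1": "fold_A"})),
    ("secondary structure mapping", lookup_stage({"target_2": "fold_B"})),
    ("visual inspection", lookup_stage({"target_3": "fold_C"})),
]

for target in ("target_1", "target_2", "target_3", "target_4"):
    print(target, predict_fold(target, stages))
```

Representing each stage as an independent callable keeps the ordering explicit and would make it straightforward to swap a manual step for an automated one.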