Publications of F. Eisenhaber

2001 Jan; 14(1): 17-25. PubMed: 11287675

2000 1 publication(s).

18.

Automated annotation of GPI anchor sites: case study C. elegans.

Eisenhaber B, Bork P, Yuan YP, Löffler G, Eisenhaber F

Trends Biochem Sci.

2000 Jul; 25(7): 340-1. PubMed: 10871885

1999 3 publication(s).

17.

Prediction of potential GPI-modification sites in proprotein sequences.

Eisenhaber B, Bork P, Eisenhaber F

1999 Sep 24; 292(3): 741-58. PubMed: 10497036

16.

Evaluation of human-readable annotation in biomolecular sequence databases with biological rule libraries.

Eisenhaber F, Bork P

Bioinformatics.

1999 Jul-Aug; 15(7-8): 528-35. PubMed: 10487860

15.

PSIC: profile extraction from sequence alignments with position-specific counts of independent observations.

Sunyaev SR, Eisenhaber F, Rodchenkov IV, Eisenhaber B, Tumanyan VG, Kuznetsov EN

1999 May; 12(5): 387-94. PubMed: 10360979

1998 8 publication(s).

14.

Predicting function: from genes to genomes and back.

Bork P, Dandekar T, Diaz-Lazcoz Y, Eisenhaber F, Huynen MA, Yuan YP

1998 Nov 6; 283(4): 707-25. PubMed: 9790834

13.

Characterization of targeting domains by sequence analysis: glycogen-binding domains in protein phosphatases.

Bork P, Dandekar T, Eisenhaber F, Huynen MA

J Mol Med (Berl).

1998 Feb; 76(2): 77-9. PubMed: 9500672

12.

Sequence properties of GPI-anchored proteins near the omega-site: constraints for the polypeptide binding site of the putative transamidase.

Eisenhaber B, Bork P, Eisenhaber F

1998 Dec; 11(12): 1155-61. PubMed: 9930665

11.

Stops Genejockeys being taken for a ride.

Eisenhaber F

Trends Cell Biol.

1998; 8: 377-378. PID: 351.

10.

Sequence and Structure of Proteins.

Eisenhaber F, Bork P

In: Recombinant proteins, monoclonal antibodies and therapeutic genes (series Biotechnology, 2nd edition). (H.-J. Rehm, G. Reed, A. Mountain, U. Ney and D. Schomburg) 5a, pp. 43-86. BID: 11.

Publisher: Wiley-VCH Weinheim

Wanted: subcellular localization of proteins based on sequence.

Eisenhaber F, Bork P

Trends Cell Biol.

1998 Apr; 8(4): 169-70. PubMed: 9695832

Homology-based fold predictions for Mycoplasma genitalium proteins.

Huynen MA, Doerks T, Eisenhaber F, Orengo C, Sunyaev SR, Yuan YP, Bork P

1998 Jul 17; 280(3): 323-6. PubMed: 9665839

Are knowledge-based potentials derived from protein structure sets discriminative with respect to amino acid types?

Sunyaev SR, Eisenhaber F, Argos P, Kuznetsov EN, Tumanyan VG

Proteins.

1998 May 15; 31(3): 225-46. PubMed: 9593195

1997 1 publication(s).

Probabilistic evaluation of similarity between pairs of three-dimensional structures utilizing temperature factors.

Carugo O, Eisenhaber F

Journal of Applied Crystallography.

1997 Oct 1; 30(5): 547-549. PID: 299.

1996 5 publication(s).

Hydrophobic regions on protein surfaces. Derivation of the solvation energy from their area distribution in crystallographic protein structures.

Eisenhaber F

Protein Sci.

1996 Aug; 5(8): 1676-86. PubMed: 8844856

Hydrophobic regions on protein surfaces: definition based on hydration shell structure and a quick method for their computation.

Eisenhaber F, Argos P

1996 Dec; 9(12): 1121-33. PubMed: 9010925

The hydrophobic part of the solvent-accessible surface of a typical monomeric globular protein consists of a single, large interconnected region formed from faces of apolar atoms and constituting approximately 60% of the solvent-accessible surface area. Therefore, the direct delineation of the hydrophobic surface patches on an atom-wise basis is impossible. Experimental data indicate that, in a two-state hydration model, a protein can be considered to be unified with its first hydration shell in its interaction with bulk water. We show that, if the surface area occupied by water molecules bound at polar protein atoms as generated by AUTOSOL is removed, only about two-thirds of the hydrophobic part of the protein surface remains accessible to bulk solvent. Moreover, the organization of the hydrophobic part of the solvent-accessible surface experiences a drastic change, such that the single interconnected hydrophobic region disintegrates into many smaller patches, i.e. the physical definition of a hydrophobic surface region as unoccupied by first hydration shell water molecules can distinguish between hydrophobic surface clusters and small interconnecting channels. It is these remaining hydrophobic surface pieces that probably play an important role in intra- and intermolecular recognition processes such as ligand binding, protein folding and protein-protein association in solution conditions. These observations have led to the development of an accurate and quick analytical technique for the automatic determination of hydrophobic surface patches of proteins. This technique is not aggravated by the limiting assumptions of the methods for generating explicit water hydration positions. Formation of the hydrophobic surface regions owing to the structure of the first hydration shell can be computationally simulated by a small radial increment in solvent-accessible polar atoms, followed by calculation of the remaining exposed hydrophobic patches. 
We demonstrate that a radial increase of 0.35-0.50 Å resembles the effect of tightly bound water on the organization of the hydrophobic part of the solvent-accessible surface.
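
The radial-increment idea in the abstract above can be illustrated with a small numerical sketch. This is not the authors' analytical technique: it samples points on each apolar atom's solvent-accessible sphere and counts those still exposed after polar atoms are inflated by `delta`. The probe radius, the atom radii, the two-atom toy structure and the function names are all assumptions made for illustration, and the sketch only measures the remaining exposed hydrophobic area without grouping it into patches.

```python
import math

PROBE = 1.4  # assumed water probe radius in angstroms


def sphere_points(n):
    """Roughly uniform points on the unit sphere (Fibonacci spiral)."""
    golden = math.pi * (3.0 - math.sqrt(5.0))
    pts = []
    for i in range(n):
        z = 1.0 - (2.0 * i + 1.0) / n
        r = math.sqrt(1.0 - z * z)
        pts.append((r * math.cos(golden * i), r * math.sin(golden * i), z))
    return pts


def exposed_apolar_area(atoms, delta=0.0, n=500):
    """Exposed accessible area of apolar atoms; polar radii grow by delta.

    atoms: list of (x, y, z, vdw_radius, is_polar) tuples (hypothetical format).
    """
    unit = sphere_points(n)
    total = 0.0
    for i, (x, y, z, rad, polar) in enumerate(atoms):
        if polar:
            continue  # only the hydrophobic part of the surface is measured
        big = rad + PROBE
        free = 0
        for ux, uy, uz in unit:
            px, py, pz = x + big * ux, y + big * uy, z + big * uz
            buried = False
            for j, (ox, oy, oz, orad, opolar) in enumerate(atoms):
                if j == i:
                    continue
                reach = orad + PROBE + (delta if opolar else 0.0)
                d2 = (px - ox) ** 2 + (py - oy) ** 2 + (pz - oz) ** 2
                if d2 < reach * reach:
                    buried = True
                    break
            if not buried:
                free += 1
        total += 4.0 * math.pi * big * big * free / n
    return total


# Toy structure (made-up coordinates): one apolar and one polar atom.
TOY_ATOMS = [(0.0, 0.0, 0.0, 1.9, False), (3.0, 0.0, 0.0, 1.4, True)]
plain = exposed_apolar_area(TOY_ATOMS)
inflated = exposed_apolar_area(TOY_ATOMS, delta=0.4)
print(plain, inflated)
```

In this toy case, inflating the polar atom can only bury additional sample points, so the exposed apolar area shrinks; the value 0.4 is simply a point inside the 0.35-0.50 Å range quoted in the abstract.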

Prediction of secondary structural content of proteins from their amino acid composition alone. II. The paradox with secondary structural class.

Eisenhaber F, Frömmel C, Argos P

Proteins.

1996 Jun; 25(2): 169-79. PubMed: 8811733

The success rates reported for secondary structural class prediction with different methods are contradictory. On one side, the problem of recognizing the secondary structural class of a protein knowing only its amino acid composition appears completely solved by simply applying jury decision with an elliptically scaled distance function. Chou and coworkers repeatedly (see Crit. Rev. Biochem. Mol. Biol. 30:275-349, 1995) published prediction accuracies near 100%. On the other hand, traditional secondary structure prediction techniques achieve success rates of about 70% for the secondary structural state per residue and about 75% for structural class only with extensive input information (full sequence of the query protein, its amino acid composition and length, multiple alignments with homologous sequences). In this article, we resolve the paradox and consider (1) the question of the secondary structural class definition, (2) the role of the representativity of the test set of protein tertiary structure for the current state of the Protein Data Bank (PDB); and (3) we estimate the real impact of amino acid composition on secondary structural class. We formulate three objective criteria for a reasonable definition of secondary structural classes and show that only the criterion of Nakashima et al. (J. Biochem. 99:153-162, 1986) complies with all of them. Only this definition matches the distribution of secondary structural content in representative PDB subsets, whereas other criteria leave many proteins (up to 65% of all PDB entries) simply unassigned. We review critically specialized secondary-structural class prediction methods, especially those of Chou and coworkers, which claim almost 100% accuracy using only amino acid composition, and resolve the paradox that these prediction accuracies are better than those from secondary structure predictions from multiple alignments. 
We show (i) that these techniques rely on a preselection of test sets which removes irregular proteins and other proteins without any class assignment (about 35% of all PDB entries); and (ii) that even for preselected representative test sets, the success rate drops to 60% and lower for a 4-type classification (alpha, beta, alpha + beta, alpha/beta). The prediction accuracies fall to about 50% if the secondary structural class definition of Nakashima et al. is applied and only few irregular proteins are preselected and removed from automatically generated, representative subsets of the PDB. We have applied two new vector decomposition methods for secondary structural content prediction from amino acid composition alone, with and without consideration of amino acid compositional coupling in the learning set of tertiary structures respectively, to the problem of class prediction and achieve about 60% correct assignment among four classes (alpha, beta, mixed, irregular) as well as single sequence-based secondary structure prediction methods like GORIII and COMBI. Our results demonstrate that 60% correctness is the upper limit for a 4-type class prediction from amino acid composition alone for an unknown query protein and that consideration of compositional coupling does not improve the prediction success. The prediction program SSCP offering secondary structural class assignment for query compositions and sequences has been made available as a World Wide Web and E-mail service.

Prediction of secondary structural content of proteins from their amino acid composition alone. I. New analytic vector decomposition methods.

Eisenhaber F, Imperiale F, Argos P, Frömmel C

Proteins.

1996 Jun; 25(2): 157-68. PubMed: 8811732

The predictive limits of the amino acid composition for the secondary structural content (percentage of residues in the secondary structural states helix, sheet, and coil) in proteins are assessed quantitatively. For the first time, techniques for prediction of secondary structural content are presented which rely on the amino acid composition as the only information on the query protein. In our first method, the amino acid composition of an unknown protein is represented by the best (in a least square sense) linear combination of the characteristic amino acid compositions of the three secondary structural types computed from a learning set of tertiary structures. The second technique is a generalization of the first one and takes into account also possible compositional couplings between any two sorts of amino acids. Its mathematical formulation results in an eigenvalue/eigenvector problem of the second moment matrix describing the amino acid compositional fluctuations of secondary structural types in various proteins of a learning set. Possible correlations of the principal directions of the eigenspaces with physical properties of the amino acids were also checked. For example, the first two eigenvectors of the helical eigenspace correlate with the size and hydrophobicity of the residue types respectively. As learning and test sets of tertiary structures, we utilized representative, automatically generated subsets of the Protein Data Bank (PDB) consisting of non-homologous protein structures at the resolution thresholds ≤ 1.8 Å, ≤ 2.0 Å, ≤ 2.5 Å, and ≤ 3.0 Å. We show that the consideration of compositional couplings improves prediction accuracy, albeit not dramatically. Whereas in the self-consistency test (learning with the protein to be predicted), a clear decrease of prediction accuracy with worsening resolution is observed, the jackknife test (leave the predicted protein out) yielded best results for the largest dataset (≤ 3.0 Å, almost no difference to the self-consistency test!), i.e., only this set, with more than 400 proteins, is sufficient for stable computation of the parameters in the prediction function of the second method. The average absolute errors in predicting the fraction of helix, sheet, and coil from amino acid composition of the query protein are 13.7, 12.6, and 11.4%, respectively, with r.m.s. deviations in the range of 8.6-11.8% for the 3.0 Å dataset in a jackknife test. The absolute precision of the average absolute errors is in the range of 1-3% as measured for other representative subsets of the PDB. Secondary structural content prediction methods found in the literature have been clustered in accordance with their prediction accuracies. To our surprise, much more complex secondary structure prediction methods utilized for the same purpose of secondary structural content prediction achieve prediction accuracies very similar to those of the present analytic techniques, implying that all the information beyond the amino acid composition is, in fact, mainly utilized for positioning the secondary structural state in the sequence but not for determination of the overall number of residues in a secondary structural type. This result implies that higher prediction accuracies cannot be achieved relying solely on the amino acid composition of an unknown query protein as prediction input. Our prediction program SSCP has been made available as a World Wide Web and E-mail service.
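
The first method in the abstract above (the best least-squares linear combination of characteristic compositions) can be sketched as follows. This is a minimal illustration, not the SSCP program: the four-component "compositions" are invented toy numbers standing in for real 20-residue vectors, the function names are hypothetical, and the fit is assumed to be unconstrained (the abstract does not say whether the coefficients are forced to sum to one).

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve_linear(a, b):
    """Solve a small dense system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def predict_content(query, basis):
    """Least-squares coefficients of `query` over the characteristic compositions.

    Solves the normal equations (B^T B) f = B^T q for the coefficient vector f.
    """
    names = list(basis)
    gram = [[dot(basis[p], basis[q]) for q in names] for p in names]
    rhs = [dot(basis[p], query) for p in names]
    return dict(zip(names, solve_linear(gram, rhs)))


# Invented characteristic compositions over four residue groups (toy data).
BASIS = {
    "helix": [0.40, 0.30, 0.20, 0.10],
    "sheet": [0.10, 0.20, 0.30, 0.40],
    "coil": [0.25, 0.15, 0.35, 0.25],
}

# A query built as an exact 50/30/20 mixture, so the fit should recover it.
TRUE = {"helix": 0.5, "sheet": 0.3, "coil": 0.2}
QUERY = [sum(TRUE[k] * BASIS[k][i] for k in BASIS) for i in range(4)]

content = predict_content(QUERY, BASIS)
print(content)
```

Because the toy query is an exact mixture, the recovered coefficients match the mixing fractions up to rounding error; for real proteins the fit leaves a residual, which is consistent with the error levels of roughly 11-14% reported in the abstract.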

Principles of helix-helix packing in proteins: the helical lattice superposition model.

Walther D, Eisenhaber F, Argos P

1996 Jan 26; 255(3): 536-53. PubMed: 8568896