10.
Prediction of potential GPI-modification sites in proprotein sequences.
J Mol Biol.
1999 Sep 24; 292(3): 741-58. PubMed: 10497036.
Glycosylphosphatidylinositol (GPI) lipid anchoring is a common posttranslational modification known mainly from extracellular eukaryotic proteins. Attachment of the GPI moiety to the carboxyl terminus (omega-site) of the polypeptide follows after proteolytic cleavage of a C-terminal propeptide. For the first time, a new prediction technique locating potential GPI-modification sites in precursor sequences has been applied for large-scale protein sequence database searches. The composite prediction function (with separate parametrisation for metazoan and protozoan proteins) consists of terms evaluating both amino acid type preferences at sequence positions near a supposed omega-site as well as the concordance with general physical properties encoded in multi-residue correlation within the motif sequence. The latter terms are especially successful in rejecting non-appropriate sequences from consideration. The algorithm has been validated with a self-consistency and two jack-knife tests for the learning set of fully annotated sequences from the SWISS-PROT database as well as with a newly created database "big-Pi" (more than 300 GPI-motif mutations extracted from original literature sources). The accuracy of predicting the effect of mutations in the GPI sequence motif was above 83 %. Lists of potential precursor proteins which are non-annotated in SWISS-PROT and SPTrEMBL are presented on the WWW-page http://www.embl-heidelberg.de/beisenha/gpi/gpi_prediction.html. The algorithm has been implemented in the prototype software "big-Pi predictor" which may find application as a genome annotation and target selection tool.
9.
Domain organization of Mac-2 binding protein and its oligomerization to linear and ring-like structures.
Müller SA, Sasaki T, Bork P, Wolpensinger B, Schulthess T, Timpl R, Engel A, Engel J
J Mol Biol.
1999 Aug 27; 291(4): 801-13. PubMed: 10452890.
The multidomain Mac-2 binding protein (M2BP) is present in serum and in the extracellular matrix in the form of linear and ring-shaped oligomers, which interact with galectin-3, fibronectin, collagens, integrins and other large glycoproteins. Domain 1 of M2BP (M2BP-1) shows homology with the cysteine-rich SRCR domain of scavenger receptor. Domains 2 and 3 are related to the dimerization domains BTB/POZ and IVR of the Drosophila kelch protein. Recombinant M2BP, its N-terminal domain M2BP-1 and a fragment consisting of putative domains 2, 3 and 4 (M2BP-2,3,4) were investigated by scanning transmission electron microscopy, transmission electron microscopy, analytical ultracentrifugation and binding assays. The ring oligomers formed by the intact protein are comprised of approximately 14 nm long segments composed of two 92 kDa M2BP monomers. Although the rings vary in size, decamers predominate. The various linear oligomers also observed are probably ring precursors; dimers predominate. M2BP-1 exhibits a native fold, does not oligomerize and is inactive in cell attachment. M2BP-2,3,4 aggregates to heterogeneous, protein-filled ring-like structures as shown by metal-shadowed preparations. These aggregates retain the cell-adhesive potential, indicating native folding. It is hypothesized that the rings provide an interaction pattern for multivalent interactions of M2BP with target molecules or complexes of ligands.
8.
Eukaryotic signalling domain homologues in archaea and bacteria. Ancient ancestry and horizontal gene transfer.
Ponting CP, Aravind L, Schultz J, Bork P, Koonin EV
J Mol Biol.
1999 Jun 18; 289(4): 729-45. PubMed: 10369758.
Phyletic distributions of eukaryotic signalling domains were studied using recently developed sensitive methods for protein sequence analysis, with an emphasis on the detection and accurate enumeration of homologues in bacteria and archaea. A major difference was found between the distributions of enzyme families that are typically found in all three divisions of cellular life and non-enzymatic domain families that are usually eukaryote-specific. Previously undetected bacterial homologues were identified for plant pathogenesis-related proteins, Pad1, von Willebrand factor type A, src homology 3 and YWTD repeat-containing domains. Comparisons of the domain distributions in eukaryotes and prokaryotes enabled distinctions to be made between the domains originating prior to the last common ancestor of all known life forms and those apparently originating as consequences of horizontal gene transfer events. A number of transfers of signalling domains from eukaryotes to bacteria were confidently identified, in contrast to only a single case of apparent transfer from eukaryotes to archaea.