Computational characterization of multiple Gag-like human proteins.
2006 Nov 15; 22(11): 585-9. Epub 2006 Sep 18; PubMed: 16979784.
In a genome-wide analysis, we have identified 85 human genes encoding 103 protein isoforms that resemble retroviral Gag proteins. These genes were domesticated from retrotransposons in at least five independent events during vertebrate evolution and were subsequently duplicated further in mammals. Structural insights into the mammalian proteins can be inferred by homology to Gag from viruses such as HIV; in turn, the cellular roles of the mammalian Gag homologs, such as apoptosis-related functions and binding to ubiquitin ligases, might hint at further functionality of viral Gag itself.
A novel MSH2 germline mutation in homozygous state in two brothers with colorectal cancers diagnosed at the age of 11 and 12 years.
Müller A, Schackert HK, Lange B, Rüschoff J, Füzesi L, Willert J, Burfeind P, Shah PK, Becker H, Epplen JT, Stemmler S
Hereditary non-polyposis colorectal cancer (HNPCC) syndrome is caused by heterozygous germline mutations in DNA mismatch repair (MMR) genes (MSH2, MLH1, MSH6, and PMS2) and is inherited in an autosomal dominant pattern with high penetrance. Several patients carrying bi-allelic MMR gene mutations have been reported; their phenotype resembled a syndrome with childhood malignancies, including hematological malignancies and brain and colorectal tumors. This phenotype is similar to the tumor spectrum of MMR knockout mice. Herein we describe two brothers of healthy consanguineous parents from Pakistan, who had developed two and three colorectal cancers at the ages of 11 and 12 years, respectively, and fewer than 30 polyps. Tumor specimens showed high microsatellite instability (MSI-H), and expression of MSH2 and MSH6 was lost. Mutation analyses of DNA samples from both patients revealed a novel homozygous c.2006-5T>A mutation in intron 12 of the MSH2 gene. The brothers' phenotype is unusual: they developed neither hematological malignancies nor brain tumors, and they presented at an older age than other patients with homozygous MSH2 mutations. The milder phenotype may be due to the expression of low amounts of MSH2 protein with reduced activity. (c) 2005 Wiley-Liss, Inc.
LSAT: learning about alternative transcripts in MEDLINE.
2006 Apr 1; 22(7): 857-65. Epub 2006 Jan 12; PubMed: 16410322.
MOTIVATION: Generation of alternative transcripts from the same gene is an important biological event because it contributes to functional diversity in eukaryotes. In this work, we extract information on this complex topic using a two-step procedure involving machine learning and information extraction. RESULTS: In the first step, we trained a classifier that inductively learns to identify sentences about physiological transcript diversity in MEDLINE abstracts. Using a large hand-built corpus, we compared the sentence classification performance of various text categorization methods. Support vector machines (SVM), followed by the maximum entropy classifier, outperformed other methods for the sentence classification task. The SVM with the radial basis function kernel and optimized parameters achieved an Fbeta-measure of 91% during four-fold cross validation and of 74% when applied to all sentences in more than 12 million MEDLINE abstracts. In the second step, we identified eight frequently occurring semantic categories in the sentences and performed a limited amount of semantic role labeling. The role labeling step also achieved a very high Fbeta-measure for all eight categories. AVAILABILITY: The results of our two-step procedure are summarized in the LSAT database of alternative transcripts. LSAT is available at http://www.bork.embl.de/LSAT/.
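For readers unfamiliar with the Fbeta-measure quoted above, the following is a minimal sketch of how it is conventionally computed from classification counts. It is not the authors' evaluation code: the function name f_beta, the example counts, and the beta values are illustrative assumptions, and the abstract does not state which beta was used.

```python
def f_beta(tp, fp, fn, beta=1.0):
    """Standard F-beta score from true positives, false positives, false negatives.

    beta > 1 weights recall more heavily; beta < 1 weights precision more heavily.
    The beta used in the LSAT evaluation is not given in the abstract, so the
    default of 1.0 here is only an assumption for illustration.
    """
    if tp == 0:
        # No correct positive predictions: precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    # Hypothetical counts for a sentence classifier (not data from the paper).
    print(f_beta(tp=90, fp=10, fn=10))          # balanced precision and recall
    print(f_beta(tp=8, fp=2, fn=8, beta=2.0))   # recall-weighted score
```

With beta = 1 this reduces to the harmonic mean of precision and recall, which is why a classifier with equal precision and recall scores exactly that shared value.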