Genome Res. 2004 Oct;14(10B):2136-44.
doi: 10.1101/gr.2576704.

From ORFeome to biology: a functional genomics pipeline

Stefan Wiemann et al. Genome Res. 2004 Oct.

Abstract

As several model genomes have been sequenced, the elucidation of protein function is the next challenge toward the understanding of biological processes in health and disease. We have generated a human ORFeome resource and established a functional genomics and proteomics analysis pipeline to address the major topics in the post-genome-sequencing era: the identification of human genes and splice forms, and the determination of protein localization, activity, and interaction. Combined with the understanding of when and where gene products are expressed in normal and diseased conditions, we create information that is essential for understanding the interplay of genes and proteins in the complex biological network. We have implemented bioinformatics tools and databases that are suitable to store, analyze, and integrate the different types of data from high-throughput experiments and to include further annotation that is based on external information. All information is presented in a Web database (http://www.dkfz.de/LIFEdb). It is exploited for the identification of disease-relevant genes and proteins for diagnosis and therapy.

Figures

Figure 1
The functional genomics and proteomics pipeline. Starting with the large-scale production and molecular analysis of cDNAs, a human ORFeome resource is generated. This physical resource is systematically exploited in high-throughput applications of protein localization, cell-based assays, and proteomics applications. Information that is derived from these experiments is integrated with expression profiling data from clinical studies and external information to allow for an efficient mining of data. The output is functionally characterized genes and proteins with their possible disease relations. The results are presented through http://www.dkfz.de/LIFEdb.
Figure 2
UCSC genome browser view of the gene locus of PRO0971. The exons (numbered bars) and introns (connecting lines) are immediately apparent when cDNAs are aligned with the genome sequence. Arrowheads in the intron lines indicate the orientation of the gene (left to right), with a CpG island (green bar, “CpG: 106”) supporting the 5′-end of the gene and transcript. Multiple coverage of the gene with individual cDNAs (accession nos. BC009485 from the MGC, AK094126 from the FLJ project, and BX647702 from the German cDNA Consortium) helps to identify putative splice variants. An example of exon skipping in the IMAGE:3623656 cDNA (BC009485) as compared with the DKFZp686P0859 cDNA (BX647702) is highlighted within the yellow circle. The UCSC genome browser is at http://genome.ucsc.edu/cgi-bin/hgGateway.
Figure 3
Effect of the orientation of the GFP-tag relative to the ORF. (A) Of 567 proteins tested, 340 localized correctly in both orientations (same). Another 219 proteins localized differently in the two orientations: 120 localized correctly as the ORF-GFP construct, and 99 localized correctly in the GFP-ORF orientation. Eight expression constructs did not show any detectable expression (none). (B) An example of a mitochondrial protein (upper image). The fusion protein mislocalized (lower image) when the signal peptide at its N terminus was blocked by the GFP-tag. The cytoplasmic and nuclear staining of the GFP-ORF fusion protein is also the default localization of GFP alone. The bar indicates 10 μm.
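The counts in panel A can be checked with a few lines of arithmetic. The sketch below only tallies the numbers quoted in the caption; the dictionary keys and the function name are illustrative labels, not names used by the authors, and the tag positions in the comments follow the reading that ORF-GFP places the tag at the C terminus.

```python
# Counts copied from the Figure 3A caption; the key names are hypothetical labels.
counts = {
    "same": 340,          # correct localization in both orientations
    "orf_gfp_only": 120,  # correct as ORF-GFP (tag read as C-terminal)
    "gfp_orf_only": 99,   # correct as GFP-ORF (tag read as N-terminal)
    "none": 8,            # no detectable expression
}


def summarize(counts):
    """Return the total, the number of orientation-dependent proteins,
    and each category as a fraction of the total."""
    total = sum(counts.values())
    differing = counts["orf_gfp_only"] + counts["gfp_orf_only"]
    fractions = {name: value / total for name, value in counts.items()}
    return total, differing, fractions


total, differing, fractions = summarize(counts)
print(f"total tested: {total}, orientation-dependent: {differing}")
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.1%}")
```

The four categories add up to the 567 proteins tested, and the two orientation-dependent groups add up to the 219 proteins that localized differently.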
Figure 4
Established assays (gray boxes) to address processes of the cell cycle (yellow circle). G1, S, G2, and M are the phases of the cell cycle (G, growth; S, DNA synthesis; M, mitosis).
Figure 5
Effect of protein overexpression during mitosis. The protein encoded by cDNA DKFZp434P097 was overexpressed as a CFP fusion protein in NIH-3T3 cells (B); antitubulin staining is shown in A. Phosphorylated histone H3 in mitotic cells was detected with a specific antibody (C), which shows a punctate staining pattern in nuclei of cells in prophase (yellow arrows). The overlay (D) shows colocalization of the DKFZp434P097 and the phosphohistone H3 proteins in the cytoplasm. The white arrowhead in A-D marks a cell expressing the DKFZp434P097 protein. Bar, 10 μm.
Figure 6
Identification of apoptosis modulators. Shown are plots of the fluorescence intensity in the YFP channel (expression of the recombinant proteins) against the level of activated caspase-3 (measured with an APC-labeled antibody). NIH-3T3 cells were transfected with ORFs that were C- or N-terminally tagged with YFP. After 24 h, the cells were stained with an antibody directed against the active form of caspase-3 and measured by FACS. For every protein, the percentage of transfected cells (YFP > 10e1) that were positive for activated caspase-3 (APC > 10e1) is given as compared with the transfected cells (YFP > 10e1) that were negative for active caspase-3 (APC < 10e1). APC is the fluorescence of the secondary antibody labeled with allophycocyanin. FAS (Chinnaiyan et al. 1995) is the receptor for the cytokine ligand known as FASL. Activated FAS results in the formation of the death-inducing signaling complex, which ultimately leads to cell death (activator control). Bcl-2 (Hockenbery et al. 1990) is an integral protein of the inner mitochondrial membrane that blocks apoptotic death (inhibitor control). YFP is the YFP protein. P097 is the DKFZp434P097 protein. All proteins were expressed as fusion proteins with YFP.
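The gating described in this caption amounts to a two-threshold count. The following is a minimal sketch, not the authors' analysis code: it assumes each cell is given as a (YFP, APC) intensity pair, reads the caption's "10e1" as 10^1 = 10 for both gates (an assumption, so the gates are left as parameters), and uses made-up events purely for illustration.

```python
def percent_caspase_positive(events, yfp_gate=10.0, apc_gate=10.0):
    """Percentage of transfected cells (YFP above its gate) that are also
    positive for activated caspase-3 (APC above its gate).

    `events` is an iterable of (yfp, apc) intensity pairs, one per cell.
    Returns None when no cell passes the YFP gate.
    """
    transfected = [(yfp, apc) for yfp, apc in events if yfp > yfp_gate]
    if not transfected:
        return None
    positive = sum(1 for _, apc in transfected if apc > apc_gate)
    return 100.0 * positive / len(transfected)


# Made-up (YFP, APC) intensities, not measured data.
events = [
    (2.0, 3.0),     # untransfected, caspase-negative
    (50.0, 4.0),    # transfected, caspase-negative
    (120.0, 80.0),  # transfected, caspase-positive
    (300.0, 15.0),  # transfected, caspase-positive
    (15.0, 9.0),    # transfected, caspase-negative
    (5.0, 40.0),    # untransfected, caspase-positive (ignored by the gate)
]
print(percent_caspase_positive(events))
```

Untransfected cells are excluded before the percentage is computed, so a high background of apoptotic but untransfected cells does not inflate the value reported for a given ORF.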
Figure 7
In vitro phosphorylation of arrayed proteins. Purified proteins were arrayed in quadruplicate on glass slides and incubated with different protein kinases in the presence of [γ-33P]ATP. Rb is the retinoblastoma protein (Lee et al. 1987), which served as a positive control. GFP-GST is a purified fusion protein of GFP with a GST-tag, which should not be phosphorylated by the kinases. The protein from cDNA DKFZp434P097 was expressed as a fusion protein with the C terminus of GST. (A) The array was incubated with CDK2/cyclin E kinase. (B) The array was incubated with p42 MAPK kinase.
Figure 8
Statistical power analysis for the number of cells. The plot shows means (dots) and 95% confidence intervals (vertical bars) of the measured effect on the proliferation rate of transfection with cyclin A (a positive control in the assay) as a function of the number of cells analyzed. The effect was measured by a robust local regression of the anti-BrdU intensity on the intensity from the YFP-tag (arbitrary fluorescence units). The dependence on the number of cells was simulated by random sampling from the full data set with 2211 cells. The red line represents the approximate true effect, and the blue line no effect. In this example, we would have detected cyclin A as an activator of cell proliferation with 95% probability only for cell numbers ≥1000. Conversely, we would have assigned an activating effect to a protein that is in fact neutral with <5% probability. To reliably detect subtler modifiers of cell proliferation, or to achieve probabilities better than 95%, cell numbers must be even higher.
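The subsampling idea behind this figure can be sketched as follows. This is an illustration under stated assumptions, not the published analysis: an ordinary least-squares slope stands in for the robust local regression, the interval is an empirical 2.5/97.5 percentile range over random subsamples, and the cells are synthetic (the slope, noise level, and all function names are hypothetical). Only the full data set size of 2211 cells comes from the caption.

```python
import random
import statistics


def ols_slope(pairs):
    """Least-squares slope of y on x for a list of (x, y) pairs."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    return sxy / sxx


def effect_interval(cells, n, repeats, rng):
    """Draw `repeats` random subsamples of n cells (without replacement) and
    return (mean, lower, upper): the mean slope and its empirical
    2.5th and 97.5th percentiles."""
    slopes = sorted(ols_slope(rng.sample(cells, n)) for _ in range(repeats))
    lower = slopes[int(0.025 * (repeats - 1))]
    upper = slopes[int(0.975 * (repeats - 1))]
    return statistics.fmean(slopes), lower, upper


def make_synthetic_cells(count, true_slope, noise_sd, rng):
    """Synthetic (YFP, anti-BrdU) pairs with a linear effect plus Gaussian noise."""
    cells = []
    for _ in range(count):
        yfp = rng.uniform(0.0, 100.0)
        brdu = 50.0 + true_slope * yfp + rng.gauss(0.0, noise_sd)
        cells.append((yfp, brdu))
    return cells


if __name__ == "__main__":
    rng = random.Random(0)
    cells = make_synthetic_cells(2211, true_slope=0.1, noise_sd=20.0, rng=rng)
    for n in (50, 200, 1000, 2211):
        mean, lower, upper = effect_interval(cells, n, repeats=200, rng=rng)
        verdict = "detected" if lower > 0 else "not detected"
        print(f"n={n:5d} mean={mean:.3f} interval=[{lower:.3f}, {upper:.3f}] {verdict}")
```

In this sketch an effect counts as detected when the lower end of the interval lies above zero (the "no effect" line). The interval narrows as more cells are sampled, which is the behavior the caption describes for the cyclin A control.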

References

    1. Adams, M.D., Dubnick, M., Kerlavage, A.R., Moreno, R., Kelley, J.M., Utterback, T.R., Nagle, J.W., Fields, C., and Venter, J.C. 1992. Sequence identification of 2,375 human brain genes. Nature 355: 632-634.
    2. Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. 1997. Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res. 25: 3389-3402.
    3. Aza-Blanc, P., Cooper, C.L., Wagner, K., Batalov, S., Deveraux, Q.L., and Cooke, M.P. 2003. Identification of modulators of TRAIL-induced apoptosis via RNAi-based phenotypic screening. Mol. Cell 12: 627-637.
    4. Bannasch, D., Mehrle, A., Glatting, K.-H., Pepperkok, R., Poustka, A., and Wiemann, S. 2004. LIFEdb: A database for functional genomics experiments integrating information from external sources, and serving as a sample tracking system. Nucleic Acids Res. 32: D505-D508.
    5. Bashirullah, A., Cooperstock, R.L., and Lipshitz, H.D. 2001. Spatial and temporal control of RNA stability. Proc. Natl. Acad. Sci. 98: 7025-7028.

WEB SITE REFERENCES

    1. http://genome.ucsc.edu/cgi-bin/hgGateway; UCSC Genome Browser GoldenPath.
    2. http://mips.gsf.de/projects/cdna; database with annotation of the cDNAs sequenced by the German cDNA Consortium.
    3. http://www.dkfz.de/LIFEdb; database with subcellular localizations and protein annotation (the address is case-sensitive).
    4. http://www.ebi.ac.uk/interpro/; InterPro database of protein families, domains, and functional sites.
    5. http://www.ncbi.nlm.nih.gov/LocusLink/; LocusLink database with curated sequence and descriptive information on genetic loci.