Friday, August 20, 2010

Multiple Gene Expression

Quantitative gene expression analysis aims to define the gene expression patterns determining cell behavior. So far, these assessments can only be performed at the population level. Therefore, they determine the average gene expression within a population, overlooking possible cell-to-cell heterogeneity that could lead to different cell behaviors/cell fates. Understanding individual cell behavior requires multiple gene expression analyses of single cells, and may be fundamental for the understanding of all types of biological events and/or differentiation processes. We here describe a new reverse transcription-polymerase chain reaction (RT-PCR) approach allowing the simultaneous quantification of the expression of 20 genes in the same single cell. This method has broad application, in different species and any type of gene combination. RT efficiency is evaluated. Uniform and maximized amplification conditions for all genes are provided. Abundance relationships are maintained, allowing the precise quantification of the absolute number of mRNA molecules per cell, ranging from 2 to 1.28 × 10^9 for each individual gene. We evaluated the impact of this approach on functional genetic read-outs by studying an apparently homogeneous population (monoclonal T cells recovered 4 d after antigen stimulation), using either this method or conventional real-time RT-PCR. Single-cell studies revealed considerable cell-to-cell variation: not all T cells expressed every individual gene. Gene coexpression patterns were very heterogeneous. mRNA copy numbers varied between different transcripts and in different cells. As a consequence, this single-cell assay introduces new and fundamental information regarding functional genomic read-outs. By comparison, we also show that conventional quantitative assays determining population averages supply insufficient information, and may even be highly misleading.
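
The point about population averages can be illustrated with a small sketch. The copy numbers below are invented for illustration and are not data from the study; the function name `summarize` is likewise hypothetical. The sketch only shows how a single average can hide the fact that most cells do not express a transcript at all.

```python
from statistics import mean

# Hypothetical copy numbers of one transcript in six single cells
# (illustrative values only, not measurements from the study).
copies_per_cell = [0, 0, 0, 1200, 0, 1800]


def summarize(copies):
    """Return (population mean, fraction of expressing cells,
    mean copy number among expressing cells only)."""
    expressing = [c for c in copies if c > 0]
    fraction = len(expressing) / len(copies)
    mean_expressing = mean(expressing) if expressing else 0.0
    return mean(copies), fraction, mean_expressing


pop_mean, fraction, mean_expressing = summarize(copies_per_cell)
print("population mean:", pop_mean)
print("fraction of expressing cells:", round(fraction, 2))
print("mean among expressing cells:", mean_expressing)
```

In this made-up example the population mean is 500 copies per cell, although no cell actually contains 500 copies: one third of the cells carry well over a thousand copies and the rest carry none.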

Model of Down Syndrome

Trisomy 21, or Down syndrome (DS), is the most common genetic cause of mental retardation. Changes in the neuropathology, neurochemistry, neurophysiology, and neuropharmacology of DS patients' brains indicate that there is probably abnormal development and maintenance of central nervous system structure and function. The segmental trisomy mouse (Ts65Dn) is a model of DS that shows analogous neurobehavioral defects. We have studied the global gene expression profiles of normal and Ts65Dn male and normal female mice brains (P30) using the serial analysis of gene expression (SAGE) technique. From the combined sample we collected a total of 152,791 RNA tags and observed 45,856 unique tags in the mouse brain transcriptome. There are 14 ribosomal protein genes (nine underexpressed) among the 330 statistically significant differences between normal male and Ts65Dn male brains, which possibly implies abnormal ribosomal biogenesis in the development and maintenance of DS phenotypes. This study contributes to the establishment of a mouse brain transcriptome and provides the first overall analysis of the differences in gene expression in aneuploid versus normal mammalian brain cells.

Congenic Strains and Microarray Gene Expression

Combining congenic mapping with microarray expression profiling offers an opportunity to establish functional links between genotype and phenotype for complex traits such as type 1 diabetes (T1D). We used high-density oligonucleotide arrays to measure the relative expression levels of >39,000 genes and ESTs in the NOD mouse (a murine model of T1D and other autoimmune conditions), four NOD-derived diabetes-resistant congenic strains, and two nondiabetic control strains. We developed a simple, yet general, method for measuring differential expression that provides an objective assessment of significance and used it to identify >400 gene expression differences and eight new candidates for the Idd9.1 locus. We also discovered a potential early biomarker for autoimmune hemolytic anemia that is based on different levels of erythrocyte-specific transcripts in the spleen. Overall, however, our results suggest that the dramatic disease protection conferred by six Idd loci (Idd3, Idd5.1, Idd5.2, Idd9.1, Idd9.2, and Idd9.3) cannot be rationalized in terms of global effects on the noninduced immune system. They also illustrate the degree to which regulatory systems appear to be robust to genetic variation. These observations have important implications for the design of future microarray-based studies in T1D and, more generally, for studies that aim to combine genome-wide expression profiling and congenic mapping.

Neighboring Gene Expression

Transposable elements (TEs) are ubiquitous genomic parasites. The deleterious consequences of the presence and activity of TEs have fueled debate about the evolutionary forces countering their expansion. Purifying selection is thought to purge TE insertions from the genome, and TE sequences are targeted by hosts for epigenetic silencing. However, the interplay between epigenetic and evolutionary forces countering TE expansion remains unexplored. Here we analyze genomic, epigenetic, and population genetic data from Arabidopsis thaliana to yield three observations. First, gene expression is negatively correlated with the density of methylated TEs. Second, the signature of purifying selection is detectable for methylated TEs near genes but not for unmethylated TEs or for TEs far from genes. Third, TE insertions are distributed by age and methylation status, such that older, methylated TEs are farther from genes. Based on these observations, we present a model in which host silencing of TEs near genes has deleterious effects on neighboring gene expression, resulting in the preferential loss of methylated TEs from gene-rich chromosomal regions. This mechanism implies an evolutionary tradeoff in which the benefit of TE silencing imposes a fitness cost via deleterious effects on the expression of nearby genes.
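
The first observation, a negative correlation between gene expression and the density of methylated TEs, can be sketched as a plain Pearson correlation. The values below are invented for illustration and the helper `pearson` is a hypothetical name; the abstract does not say which correlation statistic was actually used, so this is only one possible way to express the idea.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical per-gene values (not from the study): number of methylated
# TEs near each gene and that gene's expression level on a log scale.
methylated_te_density = [0, 1, 2, 3, 4, 5]
expression_level = [9.1, 8.4, 7.9, 6.5, 6.0, 4.8]

r = pearson(methylated_te_density, expression_level)
print("correlation:", round(r, 3))  # negative for these made-up values
```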

Polymorphism Markers in Arabidopsis

Expression microarrays hybridized with RNA can simultaneously provide both phenotypic (gene expression) and genotypic (marker) data. We developed two types of genetic markers from Affymetrix GeneChip expression data to generate detailed haplotypes for 148 recombinant inbred lines (RILs) derived from Arabidopsis thaliana accessions Bayreuth and Shahdara. Gene expression markers (GEMs) are based on differences in transcript levels that exhibit bimodal distributions in segregating progeny, while single feature polymorphism (SFP) markers rely on differences in hybridization to individual oligonucleotide probes. Unlike SFPs, GEMs can be derived from any type of DNA-based expression microarray. Our method identifies SFPs independent of a gene’s expression level. Alleles for each GEM and SFP marker were ascertained with GeneChip data from parental accessions as well as RILs; a novel algorithm for allele determination using RIL distributions capitalized on the high level of genetic replication per locus. GEMs and SFP markers provided robust markers in 187 and 968 genes, respectively, which allowed estimation of gene order consistent with that predicted from the Col-0 genomic sequence. Using microarrays on a population to simultaneously measure gene expression variation and obtain genotypic data for a linkage map will facilitate expression QTL analyses without the need for separate genotyping. We have demonstrated that gene expression measurements from microarrays can be leveraged to identify polymorphisms across the genome and can be efficiently developed into genetic markers that are verifiable in a large segregating RIL population. Both marker types also offer opportunities for massively parallel mapping in unsequenced and less studied species.
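
The idea behind a gene expression marker, a transcript whose levels fall into two groups across segregating lines, can be sketched with a simple one-dimensional two-means split. This is not the allele-determination algorithm described in the abstract, whose details are not given here; the function `call_alleles`, the labels "A" and "B", and the expression values are all hypothetical.

```python
def call_alleles(values, iterations=20):
    """Split expression values into a low and a high group using
    1-D two-means clustering, then label each value "A" (low) or "B" (high)."""
    low_center, high_center = min(values), max(values)
    for _ in range(iterations):
        low = [v for v in values
               if abs(v - low_center) <= abs(v - high_center)]
        high = [v for v in values
                if abs(v - low_center) > abs(v - high_center)]
        if not high:  # all values identical: no second group to find
            break
        low_center = sum(low) / len(low)
        high_center = sum(high) / len(high)
    threshold = (low_center + high_center) / 2
    return ["A" if v <= threshold else "B" for v in values]


# Hypothetical log-scale expression of one transcript in six RILs.
ril_expression = [2.1, 2.3, 7.8, 8.1, 2.0, 7.9]
calls = call_alleles(ril_expression)
print(calls)
```

In practice the group labels would still have to be tied to a parental accession, for example by checking which group the Bayreuth and Shahdara parental arrays fall into, and a transcript without a clearly bimodal distribution would not be used as a marker.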

Genetic Modifiers in Centromere

Centromere protein B (CENP-B) binds constitutively to mammalian centromere repeat DNA and is highly conserved between humans and mice. Cenpb null mice appear normal but have lower body and testis weights. We demonstrate here that testis-weight reduction is seen in male null mice generated on three different genetic backgrounds (denoted R1, W9.5, and C57), whereas body-weight reduction is dependent on the genetic background as well as the gender of the animals. In addition, Cenpb null females show 31%, 33%, and 44% reduced uterine weights on the R1, W9.5, and C57 backgrounds, respectively. Production of “revertant” mice lacking the targeted frameshift mutation but not the other components of the targeting construct corrected these differences, indicating that the observed phenotype is attributable to Cenpb gene disruption rather than a neighbouring gene effect induced by the targeting construct. The R1 and W9.5 Cenpb null females are reproductively competent but show age-dependent reproductive deterioration leading to a complete breakdown at or before 9 months of age. Reproductive dysfunction is much more severe in the C57 background as Cenpb null females are totally incompetent or are capable of producing no more than one litter. These results implicate a further genetic modifier effect on female reproductive performance. Histology of the uterus reveals normal myometrium and endometrium but grossly disrupted luminal and glandular epithelium. Tissue in situ hybridization demonstrates high Cenpb expression in the uterine epithelium of wild-type animals. This study details the first significant phenotype of Cenpb gene disruption and suggests an important role of Cenpb in uterine morphogenesis and function that may have direct implications for human reproductive pathology.

RNA Expression Pattern

The recently identified mouse obese (ob) gene apparently encodes a secreted protein that may function in the signaling pathway of adipose tissue. Mutations in the mouse ob gene are associated with the early development of gross obesity. A detailed knowledge concerning the RNA expression pattern and precise genomic location of the human homolog, the OB gene, would facilitate examination of the role of this gene in the inheritance of human obesity. Northern blot analysis revealed that OB RNA is present at a high level in adipose tissue but at much lower levels in placenta and heart. OB RNA is undetectable in a wide range of other tissues. Comparative mapping of mouse and human DNA indicated that the ob gene is located within a region of mouse chromosome 6 that is homologous to a portion of human chromosome 7q. We mapped the human OB gene on a yeast artificial chromosome (YAC) contig from chromosome 7q31.3 that contains 43 clones and 19 sequence-tagged sites (STSs). Among the 19 STSs are eight corresponding to microsatellite-type genetic markers, including seven (CA)n repeat-type Genethon markers. Because of their close physical proximity to the human OB gene, these eight genetic markers represent valuable tools for analyzing families with evidence of hereditary obesity and for investigating the possible association between OB mutations and human obesity.

Cancer Genome Anatomy Project

SNPs (Single-Nucleotide Polymorphisms), the most common DNA variant in humans, represent a valuable resource for the genetic analysis of cancer and other illnesses. These markers may be used in a variety of ways to investigate the genetic underpinnings of disease. In gene-based studies, the correlations between allelic variants of genes of interest and particular disease states are assessed. An extensive collection of SNP markers may enable entire molecular pathways regulating cell metabolism, growth, or differentiation to be analyzed by this approach. In addition, high-resolution genetic maps based on SNPs will greatly facilitate linkage analysis and positional cloning. The National Cancer Institute's CGAP-GAI (Cancer Genome Anatomy Project Genetic Annotation Initiative) group has identified 10,243 SNPs by examining publicly available EST (Expressed Sequence Tag) chromatograms. More than 6800 of these polymorphisms have been placed on expression-based integrated genetic/physical maps. In addition to a set of comprehensive SNP maps, we have produced maps containing single nucleotide polymorphisms in genes expressed in breast, colon, kidney, liver, lung, or prostate tissue. The integrated maps, a SNP search engine, and a Java-based tool for viewing candidate SNPs in the context of EST assemblies can be accessed via the CGAP-GAI web site. Our SNP detection tools are available to the public for noncommercial use.

Genetic Regulatory Networks

Gene expression profiles are an increasingly common data source that can yield insights into the functions of cells at a system-wide level. The present work considers the limitations in information content of gene expression data for reverse engineering regulatory networks. An in silico genetic regulatory network was constructed for this purpose. Using the in silico network, a formal identifiability analysis was performed that considered the accuracy with which the parameters in the network could be estimated using gene expression data and prior structural knowledge (which transcription factors regulate which genes) as a function of the input perturbation and stochastic gene expression. The analysis yielded experimentally relevant results. It was observed that, in addition to prior structural knowledge, prior knowledge of kinetic parameters, particularly mRNA degradation rate constants, was necessary for the network to be identifiable. Also, with the exception of cases where the noise due to stochastic gene expression was high, complex perturbations were more favorable for identifying the network than simple ones. Although the results may be specific to the network considered, the present study provides a framework for posing similar questions in other systems.
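
One way to see why prior knowledge of mRNA degradation rate constants can matter is a deliberately minimal model, assumed here for illustration and much simpler than the in silico network of the study: a single transcript produced at a constant rate k_syn and degraded at rate k_deg, so that dm/dt = k_syn - k_deg * m. The function name and parameter values are hypothetical.

```python
from math import exp


def mrna_level(k_syn, k_deg, t):
    """mRNA level at time t for dm/dt = k_syn - k_deg * m, with m(0) = 0.
    The closed-form solution is (k_syn / k_deg) * (1 - exp(-k_deg * t))."""
    steady_state = k_syn / k_deg
    return steady_state * (1.0 - exp(-k_deg * t))


# Two hypothetical parameter sets with the same ratio k_syn / k_deg.
slow = (10.0, 0.1)    # slow synthesis, slow degradation
fast = (100.0, 1.0)   # fast synthesis, fast degradation

for t in (1.0, 10.0, 1000.0):
    print(t, round(mrna_level(*slow, t), 2), round(mrna_level(*fast, t), 2))
```

Both parameter sets approach the same steady-state level, so steady-state measurements alone cannot tell them apart; only the transient response differs. In this toy setting, fixing k_deg from prior knowledge is what makes k_syn recoverable from a steady-state measurement.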

Detecting Genetic Variation

The use of high-density oligonucleotide arrays to measure the expression levels of thousands of genes in parallel has become commonplace. To take further advantage of the growing body of data, we developed a method, termed “GeSNP,” to mine the detailed hybridization patterns in oligonucleotide array expression data for evidence of genetic variation. To demonstrate the performance of the algorithm, the hybridization patterns in data obtained previously from SAMP8/Ta, SAMP10/Ta, and SAMR1/Ta inbred mice and from humans and chimpanzees were analyzed. Genes with consistent strain-specific and species-specific hybridization pattern differences were identified, and 90% of the candidate genes were independently confirmed to harbor sequence differences. Importantly, the quality of gene expression data was also improved by masking the probes of regions with putative sequence differences between species and strains. To illustrate the application to human disease groups, data from an inflammatory bowel disease study were analyzed. GeSNP identified sequence differences in candidate genes previously discovered in independent association and linkage studies and uncovered many promising new candidates. This approach enables the opportunistic extraction of genetic variation information from new or pre-existing gene expression data obtained with high-density oligonucleotide arrays.

Influence of Genetic Variation

The view that changes to the control of gene expression rather than alterations to protein sequence are central to the evolution of organisms has become something of a truism in molecular biology. In reality, the direct evidence for this is limited, and only recently have we had the ability to look more globally at how genetic variation influences gene expression, focusing upon inter-individual variation in gene expression and using microarrays to test for differences in mRNA levels. Here, we review the scope of these experimental analyses, what they are designed to tell us about genetic variation, and what their limitations are from both a technical and a conceptual viewpoint. We conclude that while we are starting to understand the impact of this class of genetic variation upon steady-state mRNA levels, we are still far from identifying the potential phenotypic and evolutionary outcomes.