Advancing the understanding of the embryo transcriptome co-regulation using meta-, functional, and gene network analysis tools

in Reproduction
Authors: S L Rodriguez-Zas, Y Ko, H A Adams, and B R Southey
Department of Animal Sciences, Institute for Genomic Biology, Department of Chemistry


Abstract

Embryo development is a complex process orchestrated by hundreds of genes and influenced by multiple environmental factors. We demonstrate the application of simple and effective meta-study and gene network analysis strategies to characterize the co-regulation of the embryo transcriptome in a systems biology framework. A meta-analysis of nine microarray experiments aimed at characterizing the effect of agents potentially harmful to mouse embryos improved the ability to accurately characterize gene co-expression patterns compared with traditional within-study approaches. Simple overlap of significant gene lists may result in under-identification of differentially expressed genes. Sample-level meta-analysis techniques are recommended when common treatment levels or samples are present in more than one study. Otherwise, study-level meta-analysis of standardized estimates provides information on the significance and direction of the differential expression. Cell communication pathways were highly represented among the genes differentially expressed across studies. Mixture and dependence Bayesian network approaches were able to reconstruct embryo-specific interactions among genes in the adherens junction, axon guidance, and actin cytoskeleton pathways. Gene networks inferred by both approaches were mostly consistent, with minor differences due to the complementary nature of the methodologies. The top–down approach used to characterize gene networks can offer insights into the mechanisms by which the conditions studied influence gene expression. Our work illustrates that further examination of gene expression information from microarray studies, including meta- and gene network analyses, can help characterize transcript co-regulation and identify biomarkers for the reproductive and embryonic processes under a wide range of conditions.

Introduction

Ongoing efforts to characterize the transcriptome influencing reproductive and embryonic processes are generating vast amounts of gene expression information. Microarray technology supports the simultaneous characterization of thousands of gene expression profiles associated with the biological processes underlying embryonic development, embryonic stem cell proliferation, and somatic cloning. Statistical approaches with extensive theoretical foundations have been extended to accurately model gene expression data and provide a biologically and statistically sound hypothesis-testing framework (Wolfinger et al. 2001, Cui et al. 2003, Kerr 2003, Bolstad et al. 2004, Rodriguez-Zas et al. 2006). The widespread availability of microarrays and effective statistical approaches enables the implementation of a systems biology approach that integrates information across genes and experiments to provide a more comprehensive and multidimensional understanding of the gene–gene and gene–environment interactions influencing embryonic development.

Meta-analysis of microarray experiments and gene pathway reconstruction are two areas of active research. The meta-analysis of microarray experiments can increase the statistical power to accurately characterize gene expression patterns (Hu et al. 2006, Adams et al. 2007). Meta-analysis of microarray data ranges from the comparison of lists of genes with differential expression, or the consideration of gene expression data across treatments and experiments (Singh et al. 2005), to the combination of P values (Rhodes et al. 2002) or estimates (Choi et al. 2003). Meta-analysis is challenging because most studies evaluate different conditions, and the original gene expression information or comparable statistics are not always available across all studies.

Inference of gene pathways is also challenging because the networks underlying complex biological processes like embryogenesis and development or embryo–maternal interactions can involve hundreds of genes; however, the computational demands of most gene network building algorithms grow exponentially with the number of genes. Typical pathway reconstruction algorithms impose limits on the networks explored, and gene expression measurements are discretized, potentially jeopardizing the discovery of weak yet critical gene interactions (Li & Zhan 2006, Wang et al. 2006). The limited data and complex approaches available have hindered the application of gene network tools to embryogenomic information. Some studies have addressed the topic of embryonic gene networks and provided insights into general gene relationships (Ang & Constam 2004). However, conclusions have mostly relied on the informal integration of isolated interactions between a limited number of genes reported in studies (e.g., Revilla-i-Domingo et al. 2007) designed to test particular conditions.
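The growth of the network search space with the number of genes can be illustrated by counting the directed acyclic graphs (the structures underlying Bayesian networks) that can be drawn on a given number of nodes. The sketch below uses Robinson's recurrence for labeled directed acyclic graphs; it is offered only as an illustration of the size of the search space, not as a description of any particular network-building algorithm, and the function name `count_dags` is hypothetical.

```python
from math import comb


def count_dags(n: int) -> int:
    """Number of labeled directed acyclic graphs on n nodes (Robinson's recurrence)."""
    counts = [1]  # a(0) = 1: the empty graph
    for m in range(1, n + 1):
        total = 0
        for k in range(1, m + 1):
            sign = 1 if k % 2 == 1 else -1
            # k nodes without incoming edges; each may point to any of the other m - k nodes
            total += sign * comb(m, k) * 2 ** (k * (m - k)) * counts[m - k]
        counts.append(total)
    return counts[n]


for genes in (2, 3, 4, 5, 10):
    print(genes, count_dags(genes))
```

Three genes already admit 25 candidate structures and five genes admit 29281, which is why exhaustive search is restricted to small sub-networks or replaced by heuristic search.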

We demonstrate a systems biology approach to exploit the extensive embryogenomic expression data available and offer additional insights into the molecular regulation of embryo development. Complementary meta-analytical techniques are used to integrate different types of information from multiple embryogenomic experiments (Table 1). Insights gained from the integration of information from multiple experiments are applied to gene pathway reconstruction using the gene expression data. Standard statistical packages and web-based tools are used to implement the meta-analyses and gene network approaches. Results from our integrative and systems biology approach enhance the understanding of the dynamics and interactions within the transcriptome of embryos.

Table 1

Comparison of the input information used (√) by the study-level (STU) meta-analysis of estimates, study-level meta-analysis of P values (FIS), sample-level (SAM) meta-analyses, and comparison of gene lists from the analysis of individual experiments (EXP).

Input information    EXP    SAM    STU    FIS
Raw/normalized expression data
Estimate (fold change) with/without standard error
P value

Results

Meta-analysis of all studies

For the ALL dataset, which included all nine studies, there were 539 genes differentially expressed (raw P<0.001, approximately false discovery rate (FDR)-adjusted (Benjamini & Hochberg 1995) P<0.1) in either the EXP, STU_FIS, or SAM_ALL analyses. The number of differentially expressed genes (P<0.001) ranged from 1 to 204 for the individual experiments (Table 2). The STU_FIS and SAM_ALL meta-analyses of all studies identified 216 and 257 genes differentially expressed (P<0.001) respectively. The minimum fold change between any two original treatment levels among the genes differentially expressed was 1.8. At P<0.0001, there were 245 genes that exhibited differential expression between treatment levels. The median, lower 10 percentile, and minimum fold change expression of the differentially expressed genes were 10.5, 3.9, and 1.8 respectively. A list with the identification of 81 microarray elements that were differentially expressed in at least two out of the nine experiments considered or at least one meta-analysis (STU_FIS or SAM_ALL) and at least one study (P<0.001) is available in Supplementary Table 1, which is available online at www.reproduction-online.org/supplemental/. The apparent weak differential expression found in experiments GSE 1068, GSE 1069, GSE 1075, GSE 1076, and GSE 1079 may be due to the low impact of the treatment levels evaluated on gene expression patterns, the moderate sample sizes used in these experiments (8, 6, 8, 13, and 10 microarrays respectively), and the conservative P value thresholds used to account for multiple analyses, multiple testing, and multiple comparisons.
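The relationship between raw and FDR-adjusted P values mentioned above can be sketched with the Benjamini & Hochberg (1995) step-up adjustment. The following is a minimal illustration written with the Python standard library; the function name `bh_adjust` and the example P values are hypothetical and do not reproduce the results of any experiment analyzed here.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg FDR-adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices from smallest to largest P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P values for five microarray elements.
raw = [0.0004, 0.03, 0.0009, 0.2, 0.01]
for p, q in zip(raw, bh_adjust(raw)):
    print(f"raw P = {p:<7} FDR-adjusted P = {q:.4f}")
```

The adjusted value of each element depends on the total number of tests and on its rank, so the raw threshold that corresponds to a given FDR level varies with the number of elements on the platform and the distribution of their P values.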

Table 2

Number of genes (diagonals) differentially expressed (P<0.001) within individual experiment (EXP or GSE) or meta-analyses and number (upper diagonals) and proportion (lower diagonals) of genes differentially expressed in pairs of analyses of nine murine embryonic microarray experiments.

EXP(a) (GSE)                                                        Meta-analyses
       1068   1069   1070   1072   1074   1075   1076   1077   1079    STU    SAM
1068      1      0      0      0      0      0      0      0      0      0      0
1069 0.0(b)      2      0      1      0      1      0      0      0      0      0
1070    0.0    0.0     54      0      7      2      0      3      0     27      6
1072    0.0    0.5    0.0     61     11      1      1      1      0     28     18
1074    0.0    0.0    0.1    0.2    204      1      0      4      0    106     98
1075    0.0    0.5    0.3    0.2    0.2      6      1      0      0      6      2
1076    0.0    0.0    0.0    0.2    0.0    0.2      5      0      0      2      1
1077    0.0    0.0    0.1    0.0    0.1    0.0    0.0     54      0     24      5
1079    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0      2      1      1
ALL     0.0    0.0    0.1    0.3    0.5    0.3    0.2    0.1    0.5    216     74
FIS     0.0    0.0    0.5    0.5    0.5    1.0    0.4    0.4    0.5    0.3    257

(a) EXP denotes the nine individual experiments analyzed, SAM denotes the SAM_ALL sample-level meta-analysis, and FIS denotes the STU_FIS study-level meta-analysis using Fisher's meta-test.

(b) Proportion of genes relative to the lowest number of genes within analysis.

Very few genes significant in one individual study were found to be differentially expressed in another study. The number of genes differentially expressed in at least two individual experiments ranged from 0 to 11. The low number of genes that surpassed the conservative P value threshold and were considered differentially expressed within individual experiments limits the number of differentially expressed genes found in two or more studies. The STU_FIS approach identified more differentially expressed genes from the individual studies than the SAM_ALL approach. Only 74 genes were common between the meta-analysis approaches, indicating that the two meta-analysis approaches can provide different information.
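The off-diagonal entries of Table 2 can be reproduced from lists of differentially expressed genes with a simple set operation: the count is the size of the intersection, and the proportion is that count divided by the smaller of the two list sizes. The sketch below assumes hypothetical gene identifiers; only the list sizes (54 and 216) and the overlap (27) mirror the entry for experiment GSE 1070 and the STU column.

```python
def overlap_summary(genes_a, genes_b):
    """Return (shared count, proportion relative to the smaller list, rounded to one decimal)."""
    a, b = set(genes_a), set(genes_b)
    shared = len(a & b)
    smallest = min(len(a), len(b))
    proportion = shared / smallest if smallest else 0.0
    return shared, round(proportion, 1)


# Hypothetical identifiers: 54 genes in one experiment, 216 in a meta-analysis, 27 in common.
experiment = {f"gene{i}" for i in range(54)}
meta_analysis = {f"gene{i}" for i in range(27, 243)}

print(overlap_summary(experiment, meta_analysis))  # (27, 0.5)
```

Because the denominator is the smaller list, a proportion of 1.0 means that every gene of the shorter list was recovered by the other analysis, regardless of how long the other list is.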

Meta-analysis of control versus treatment

For the TWO dataset, which included five experiments and the comparison of control and treated samples, the diagonal of Table 3 presents the number of genes with significant over-expression (positive) and under-expression (negative) in the control relative to the treated embryo samples for the EXP, STU_STD, and SAM_TWO analyses. The total number of genes showing differential expression between treated and control samples ranged from 12 (experiment GSE 1069) to 61 (experiment GSE 1075). The SAM_TWO and STU_STD approaches identified 20 and 9 genes differentially expressed (P<0.01) respectively. The numbers of genes with the same differential expression magnitude and sign across analyses are presented in the upper off-diagonals of Table 3, while the lower off-diagonals present these numbers as proportions relative to the lowest number of differentially expressed genes among the pairs of analyses considered. The overlap of genes between the EXP and SAM_TWO meta-analysis ranged from 0 to 40%, indicating that the sample-level meta-analysis can uncover genes with a consistent pattern across studies when the same treatment levels have the same or similar impact on the expression levels across studies.

Table 3

Number of genes (diagonals) over (+) or under (−) differentially expressed (P<0.01) in control relative to treated samples within individual experiment (EXP or GSE) and meta-analyses and number (upper diagonals) and proportion (lower diagonals) of genes differentially expressed in pairs of analyses of five murine embryonic microarray experiments.

EXP(a) (GSE)                                                          Meta-analyses
GSE              1068        1069        1074        1075        1076     STU_STD         SAM
              +     −     +     −     +     −     +     −     +     −     +     −     +     −
1068    +    10           0     0     0     0     0     0     0     0     0     0     0     0
        −           6     0     0     0     0     0     1     0     0     0     0     0     0
1069    +0.0(b)   0.0     5           0     0     4     0     0     0     0     0     2     0
        −   0.0   0.0           7     0     0     0     0     0     0     0     0     0     0
1074    +   0.0   0.0   0.0   0.0    16           0     0     0     0     0     0     0     0
        −   0.0   0.0   0.0   0.0          16     2     0     0     1     0     0     0     2
1075    +   0.0   0.0   0.8   0.0   0.0   0.1    40           0     0     0     0     2     0
        −   0.0   0.2   0.0   0.0   0.0   0.0          21     0     1     0     0     0     1
1076    +   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0    17           0     0     2     0
        −   0.0   0.0   0.0   0.0   0.0   0.1   0.0   0.0          26     0     0     0     3
STU_std +   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0     4           1     0
        −   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0           5     0     0
TWO     +   0.0   0.0   0.4   0.0   0.0   0.0   0.2   0.0   0.2   0.0   0.3   0.0     9
        −   0.0   0.0   0.0   0.0   0.0   0.2   0.0   0.1   0.0   0.3   0.0   0.0          11

(a) EXP denotes the five individual studies analyzed, STU denotes the STU_STD study-level-standardized estimate meta-analysis of the five studies, and SAM denotes the SAM_TWO sample-level meta-analysis of the five studies.

(b) Proportion of genes relative to the lowest number of genes within analysis.

The meta-analysis of the standardized estimates (STU_STD) identified only nine differentially expressed genes, but these genes were not identified by the individual studies. Only one of the over-expressed genes was also identified by the SAM_TWO meta-analysis. The other three genes identified by the STU_STD meta-analysis were not identified by the SAM_TWO sample-level meta-analysis because the additional data were not able to compensate for the additional variation resulting from the combination of the data into one analysis. The STU_STD study-level meta-analysis was able to uncover these three genes because the lower variation within the study compensated for the lower number of estimates analyzed.

Funnel plots of control versus treatment

A funnel plot comparing the estimates of differential expression between control and treated samples from the EXP, SAM_TWO, STU_STD, and STU_NON analyses for myristoylated alanine-rich protein kinase C substrate (MARCKS) is presented in Fig. 1, and those for zinc finger 263 (ZNF263) and sterol-C4-methyloxidase like, U93162 (SC4MOL) are presented in Supplementary Figures 1 and 2 respectively, which are available online at www.reproduction-online.org/supplemental/. These genes were selected because the corresponding funnel plots illustrate three different scenarios that can be found in the meta-analysis of thousands of transcripts.

Figure 1

Comparison of estimates of differential expression between control and treated samples for the gene MARCKS across analyses of five murine microarray experiments. Each row includes the estimate (central marker) and 95% confidence limits (whiskers) of the differential expression (in log2 units) between control samples and samples treated with teratogenic agents of the individual analysis of each of the five studies (GSE 1068, 1069, 1074, 1075, and 1076), the sample-level meta-analysis (TWO), the study-level meta-analysis of the standardized estimates (STU_std), and the study-level meta-analysis of the non-standardized estimates (STU_non).

Citation: REPRODUCTION 135, 2; 10.1530/REP-07-0391

Each line in a funnel plot denotes an analysis, the center marker in each row corresponds to the estimate of the differential expression (in log2 units) between treated and control samples for each analysis, and the whiskers denote the 95% confidence intervals. The limited overlap between the SAM_TWO and STU_STD results is an artifact of the standardization of estimates in the latter approach, especially when the standard error is low, such that multiplying the estimates from the STU_STD meta-analyses by the standard error recovers an estimate similar to that of the STU_NON approach.

The funnel plot for gene MARCKS (Fig. 1) illustrates the similarity of the fold change across the EXP analyses, resulting in consistent estimates in the STU_NON and SAM_TWO meta-analysis approaches. The estimate of the STU_STD results addressed the variation in the individual experiments, resulting in a slightly more significant effect than the STU_NON analysis (P<0.01 versus P<0.1). The reason why the sample-level SAM_TWO meta-analysis was not able to detect a significant differential expression is that the joint analysis of all five studies may have introduced more noise within the control and treated levels, thus hindering the statistical significance of the treatment effect. The funnel plots for ZNF263 and SC4MOL in Supplementary Figures 1 and 2 demonstrate the ability of meta-analyses to accommodate the variability in the individual experimental results and substantial heterogeneity between the individual experiments.
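The quantities displayed in each row of a funnel plot, and the link between standardized and non-standardized estimates described in this section, can be sketched as follows. The sketch assumes a normal approximation (multiplier 1.96) for the 95% confidence limits, which may differ from the limits used to draw Fig. 1, and the estimate and standard error shown are hypothetical values rather than results for MARCKS.

```python
from dataclasses import dataclass

Z_95 = 1.96  # normal-approximation multiplier; an assumption of this sketch


@dataclass
class FunnelRow:
    label: str
    estimate: float   # differential expression in log2 units
    std_error: float

    def confidence_limits(self):
        """Lower and upper 95% limits (the whiskers of the row)."""
        half_width = Z_95 * self.std_error
        return self.estimate - half_width, self.estimate + half_width

    def standardized(self):
        """Estimate divided by its standard error."""
        return self.estimate / self.std_error

    def fold_change(self):
        """Back-transform the log2 estimate to a fold change."""
        return 2 ** self.estimate


row = FunnelRow("hypothetical study", estimate=1.0, std_error=0.25)
low, high = row.confidence_limits()
print(f"{row.label}: {row.estimate} [{low:.2f}, {high:.2f}] log2 units")
print("fold change:", row.fold_change())
print("standardized estimate:", row.standardized())
# Multiplying the standardized estimate by the standard error recovers the original scale.
print("recovered estimate:", row.standardized() * row.std_error)
```

A small standard error inflates the standardized estimate, which is why standardized and non-standardized results can look different on a shared axis even when they describe the same underlying difference.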

Functional analysis

Cell communication, metabolism, and transcription factor regulation were among the gene ontology (GO) themes extensively represented among the 81 genes listed in Supplementary Table 1. A description of the thematic representation is presented in the Supplementary Representation of GO themes section, which is available online at www.reproduction-online.org/supplemental/. Representative genes, functions, and processes are described below.

The gene coding for integrin β4 binding protein was significantly differentially expressed in experiments GSE 1070 and GSE 1075 and borderline differentially expressed (0.001<P<0.01) in experiments GSE 1072, GSE 1076, and GSE 1079 and P<0.1 in experiments GSE 1069, GSE 1074, and GSE 1077. Villin 2 or ezrin was differentially expressed in experiments GSE 1074 and GSE 1076 (Supplementary Table 1). Three phosphoinositide 3-kinase units were differentially expressed in three to eight studies (Supplementary Table 1). These units participate in the cascades of reactions that follow many cell surface receptor-linked signaling pathways and regulate numerous cellular functions.

Many of the genes differentially expressed in more than one experiment and in meta-analyses pertain to the same pathways. In particular, certain pathways associated with cell communication (tight junction, adherens junction, focal adhesion), regulation of actin cytoskeleton, glycolysis and pentose pathways, proteasome, and ribosome components were represented by multiple significant differentially expressed genes. For example, the pathways of glycolysis, tight junction, adherens junction, and focal adhesion had 14, 14, 19, and 22 members respectively, which were differentially expressed (P<0.001) in at least one study and borderline differentially expressed (P<0.01) in multiple experiments, SAM_ALL and STU_FIS.

In addition to the genes associated with communication and membrane-related functions, the genes involved in carbohydrate and protein metabolism and transcription factor regulation were also differentially expressed in multiple studies. For example, five proteasome subunits associated with protein catabolic processes were differentially expressed (P<0.001) in one to three studies. The lactate dehydrogenase A gene was differentially expressed in studies GSE 1074 and GSE 1077. Likewise, four genes coding for ZNF proteins and four karyopherin (importin) genes were differentially expressed in at least one study. A cold shock domain protein was differentially expressed in experiments GSE 1072 and GSE 1074 and borderline differentially expressed in experiment GSE 1075.
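Whether a pathway or GO theme is represented among a gene list more often than expected by chance is commonly assessed with a hypergeometric (one-sided Fisher exact) test. The sketch below illustrates that general technique only; it is an assumption that such a test is appropriate here, it is not presented as the procedure behind the thematic representation reported in this section, and all counts in the example are hypothetical.

```python
from math import comb


def enrichment_p(population, annotated, selected, hits):
    """P(X >= hits) when `selected` genes are drawn without replacement from
    `population` genes, of which `annotated` belong to the pathway."""
    upper = min(annotated, selected)
    total = comb(population, selected)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical platform: 2000 genes, 60 in a pathway; 80 genes selected, 9 of them in the pathway.
print(enrichment_p(population=2000, annotated=60, selected=80, hits=9))
```

The reference population should be the set of genes present on the microarray platform rather than the whole genome, because genes absent from the platform can never appear in the selected list.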

Gene network analysis

The pathways that were extensively represented among the differentially expressed genes identified using meta-analytical techniques were selected for gene pathway reconstruction. The expression estimates of genes pertaining to the adherens junction, axon guidance, and actin cytoskeleton pathways were obtained from each study for the gene network analysis approaches. The Bayesian dependence and mixture network approaches were used to reconstruct gene sub-networks within the actin cytoskeleton (Fig. 2A and B). The adherens junction gene sub-network and two axon guidance sub-networks associated with axon repulsion, depicted in Supplementary Figures 3, 4A and B respectively (available online at www.reproduction-online.org/supplemental/), further demonstrate the suitability of the Bayesian network approaches for a wide range of pathways. For all the sub-networks considered, the Bayesian mixture and dependence approaches provided similar gene relationships. The investigation of sub-networks was favored over the inference of complete networks because many genes pertaining to the corresponding KEGG pathways were not present in the microarray platform used.

Figure 2

Bayesian dependence (A) and mixture Bayesian (B) inference of the gene network associated with actomyosin assembly contraction and focal complex assembly within the KEGG regulation of actin cytoskeleton pathway. Gene names are: ITG, integrin β4 or ITGB4; c-Src, c-src tyrosine kinase; GRLF1, glucocorticoid receptor DNA-binding factor 1; RHO, ras homolog gene family, member A; RhoGEF, Rho guanine nucleotide exchange factor 11; MLCP, protein phosphatase 1, catalytic subunit, α isoform; MLC, myosin, light polypeptide 3; PI4P5K, phosphatidylinositol-4-phosphate 5-kinase, type II, β; and VCL, vinculin. Brown edges denote gene relationships identified by the Bayesian dependence model. Blue continuous-line edges denote direct gene relationships identified by the mixture Bayesian model and confirmed in the KEGG pathway. Green dash-dot edges denote indirect gene relationships identified by the mixture Bayesian model and confirmed in the KEGG pathway. Red dotted edges denote gene relationships present in the KEGG pathway and not identified by the mixture Bayesian model.


The Bayesian dependence approach was able to uncover direct and indirect relationships between genes, based on the available gene expression information, that were consistent with the three associated KEGG pathways. Most of the direct and indirect relationships between genes present in the KEGG actin cytoskeleton pathway were identified by the dependence and mixture Bayesian approaches. All gene–gene relations in the actin cytoskeleton network predicted by the Bayesian dependence approach presented in Fig. 2A are consistent with the KEGG pathway, with the exception of the relationships of gene MLC. The mixture Bayesian network was not able to correctly identify the parts of the network involving PI4P5K and VCL, but showed the correct relationship between Rho, MLCP, and MLC.

The actin cytoskeleton network predicted by the mixture Bayesian approach correctly detected the relationship between MLCP and MLC. However, the mixture Bayesian approach was not able to detect the relationships between ITG and c-Src, RhoGEF and Rho, and GRLF1 and Rho. The mixture Bayesian approach also detected direct relationships between ITG (or RhoGEF) and VCL and PI4P5K instead of an indirect relationship mediated through Rho. The rest of the relationships were consistently detected by both network approaches and were confirmed by the KEGG pathway.
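The comparison between an inferred network and a reference pathway described above amounts to comparing two sets of edges. The sketch below treats relationships as undirected gene pairs and uses only the pairs named in the preceding paragraph for the mixture Bayesian approach; it is an illustrative subset, not the complete KEGG pathway or the full content of Fig. 2, and the function names are hypothetical.

```python
def normalize(edges):
    """Represent each undirected relationship as an order-independent pair."""
    return {frozenset(edge) for edge in edges}


def compare_networks(inferred, reference):
    inferred_set, reference_set = normalize(inferred), normalize(reference)
    return {
        "confirmed": inferred_set & reference_set,    # inferred and present in the reference
        "missed": reference_set - inferred_set,       # in the reference but not inferred
        "unconfirmed": inferred_set - reference_set,  # inferred but not in the reference
    }


# Illustrative subset of reference relationships named in the text.
reference = [("ITG", "c-Src"), ("RhoGEF", "Rho"), ("GRLF1", "Rho"), ("MLCP", "MLC")]
# Direct relationships reported for the mixture Bayesian approach in the text.
mixture = [("MLCP", "MLC"), ("ITG", "VCL"), ("ITG", "PI4P5K")]

for category, edges in compare_networks(mixture, reference).items():
    print(category, sorted(tuple(sorted(edge)) for edge in edges))
```

Treating the unconfirmed direct edges separately is useful because, as noted above, some of them correspond to indirect relationships in the reference pathway (here, mediated through Rho) rather than to spurious associations.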

Discussion

Meta-analysis of P values

Of the two meta-analysis approaches, STU_FIS identified more or the same number of genes in common with any experiment than the SAM_ALL analysis. Among the experiments with more than two genes differentially expressed, the percentage of genes in common between STU_FIS and individual experiments ranged from 40 to 100%, and that between SAM_ALL and individual experiments ranged from 10 to 50%. The SAM_ALL meta-analysis identified more genes differentially expressed overall than the STU_FIS. This finding indicates that the results from the SAM_ALL sample-level meta-analysis are slightly less susceptible to single-experiment differential expression than the STU_FIS, thus resulting in lower consistency levels with the individual experiments. However, SAM_ALL is more likely to uncover differentially expressed genes that are less clearly supported by single-experiment studies. The most likely reason for the different behavior of the two approaches is that the SAM_ALL approach takes into consideration the uncertainty and the number of observations within the study, whereas the STU_FIS approach weights the P values from all studies equally. Even though two experiments may have the same P value, one P value may be the result of substantial fold changes in the expression between two treatment levels combined with low sample size and precision, while the other P value may be the result of smaller fold changes in expression combined with higher sample size and precision.

For example, experiment GSE 1070 had 7, 2, and 3 differentially expressed genes in common with experiments GSE 1074, GSE 1075, and GSE 1077 respectively. The similarity between the individual experiments is partly due to the similar experimental designs to compare the effect of the ligand PK11195 by the drug 2CdA (GSE 1070) or other teratogenic substances such as ethanol and methylmercury (GSE 1074 and GSE 1077). The SAM_ALL and STU_FIS meta-analyses detected 6 and 27 genes respectively of the 54 genes differentially expressed in experiment GSE 1070. These results indicate that many of the 54 genes differentially expressed in study GSE 1070 responded to the treatment level combinations specific to that experiment. There was also a high overlap of the results between experiments GSE 1072, GSE 1074, and the meta-analysis, which can be explained by the substantial overlap of genes differentially expressed in the individual experiments that were borderline significant, between P<0.01 and P<0.001.
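The equal weighting of P values by a Fisher-type meta-test discussed in this section can be sketched with Fisher's combined probability statistic, X = −2 Σ ln(P), which follows a chi-square distribution with 2k degrees of freedom under the null hypothesis for k independent studies. The sketch assumes independent studies and uses the closed form of the chi-square tail for even degrees of freedom so that only the standard library is needed; the function name and the example P values are hypothetical.

```python
from math import exp, log


def fisher_combined_p(p_values):
    """Combined P value from Fisher's method for k independent P values."""
    k = len(p_values)
    half_stat = -sum(log(p) for p in p_values)  # X / 2, with X = -2 * sum(ln P)
    # Chi-square tail with 2k degrees of freedom: exp(-X/2) * sum_{i<k} (X/2)^i / i!
    term, tail = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        tail += term
    return exp(-half_stat) * tail


# Hypothetical P values for one gene in three studies.
print(fisher_combined_p([0.02, 0.04, 0.30]))
# The order of the studies, and therefore their size or precision, does not matter.
print(fisher_combined_p([0.30, 0.02, 0.04]))
```

Only the P values enter the statistic, so a study with few microarrays and a large fold change contributes exactly as much as a larger study with a small fold change and the same P value, and the direction of the change is not used.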

Meta-analysis of control versus treatment

Conclusions from the approaches based solely on significant P values do not take into account the direction of the gene expression changes. For example, a gene may be differentially expressed between treated and control samples in two studies; however, the gene may have been over-expressed in the control sample in one experiment and in the treated sample in the other. Consideration of the sign of the estimate of differential expression in the STU_STD study-level and SAM_TWO sample-level meta-analyses offers additional insights into the results stemming from the meta-analysis approaches. The grouping of treatment levels into control and treated and the removal of microarrays with samples including treatment with ligand agents are reasons why fewer genes were identified as differentially expressed within the experiment in the TWO dataset. The grouping permits the identification of genes with consistent magnitude and direction of the effects within and across experiments, regardless of the teratogenic treatment. These genes are likely to be key players in the impact of teratogenic agents on embryo development. The grouping of treatment levels is generally undesirable because the test involves the comparison of the average effect of the treatment levels within group.

The majority of the genes that exhibited differential expression between two individual experiments or meta-analysis also showed a consistent pattern or sign. There were two genes that were over-expressed in control relative to treated samples in experiment GSE 1075, and were under-expressed in experiment GSE 1074. As expected, these two genes were not among the two genes identified by the SAM_TWO sample-level meta-analysis due to their opposite trends.
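Checking whether a gene shows the same direction of differential expression across studies reduces to comparing the signs of its estimates. In the sketch below a positive estimate denotes over-expression in control relative to treated samples, following the convention of Table 3; the gene names and estimates are hypothetical.

```python
def direction_summary(estimates):
    """Classify a gene by the signs of its per-study log2 estimates (control minus treated)."""
    signs = {(1 if value > 0 else -1) for value in estimates if value != 0}
    if not signs:
        return "no change"
    if len(signs) == 2:
        return "opposite directions"
    return "over-expressed in control" if signs == {1} else "under-expressed in control"


# Hypothetical estimates for three genes in two studies.
genes = {
    "geneA": [1.2, 0.8],
    "geneB": [-0.9, -1.4],
    "geneC": [1.1, -0.7],
}
for name, values in genes.items():
    print(name, direction_summary(values))
```

A gene such as the hypothetical geneC can be significant in each study and in a P value-based meta-test while contributing little to a sample-level analysis, because its opposite effects tend to cancel when the samples are pooled.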

The SAM_TWO sample-level meta-analysis had more power to detect differentially expressed genes when the samples grouped into treated or control levels exhibited a consistent expression pattern. The STU_STD study-level approach was more appropriate for genes with differential expression between treated and control samples within the experiment that experienced a weakening in the differential expression signal when the samples were pooled across experiments. Overall, the STU_STD approach may not have power to detect consistent differential expression when the estimates from a limited number of microarray experiments (e.g., five) are available. The lower number of genes differentially expressed found with the SAM_TWO and STU_STD approaches, compared with the SAM_ALL and STU_FIS approaches, can be attributed to the fewer experiments considered and the grouping of treatment levels into two main groups used for the implementation of the former two approaches.

The funnel plots for selected genes demonstrated the ability of different meta-analytical approaches to gain strength from the objective combination of data and results from the studies and to identify differentially expressed genes that otherwise would have been considered non-differentially expressed within the study. The meta-analyses were able to accumulate the consistent over-expression of genes across experiments and gained precision in the testing for differential expression. In addition, the funnel plots illustrate how genes with substantial heterogeneity of variances between studies favor the use of mixed-effects models that incorporate the heteroscedasticity across studies.
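How the combination of study-level estimates gains precision, and how heterogeneity between studies can be screened, can be illustrated with inverse-variance weighting and Cochran's Q statistic. This is a standard fixed-effect technique used here purely for illustration; it is a simpler stand-in for, and not a description of, the mixed-effects models referred to above, and the estimates and standard errors are hypothetical.

```python
from math import sqrt


def pool_estimates(estimates, std_errors):
    """Inverse-variance weighted estimate, its standard error, and Cochran's Q."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    pooled_se = sqrt(1.0 / total_weight)
    # Q sums the weighted squared deviations of each study from the pooled estimate.
    q_stat = sum(w * (e - pooled) ** 2 for w, e in zip(weights, estimates))
    return pooled, pooled_se, q_stat


# Hypothetical log2 estimates and standard errors from five studies.
estimates = [0.9, 1.1, 0.7, 1.3, 1.0]
std_errors = [0.5, 0.6, 0.4, 0.7, 0.5]
pooled, pooled_se, q_stat = pool_estimates(estimates, std_errors)
print(f"pooled = {pooled:.3f}, SE = {pooled_se:.3f}, Q = {q_stat:.3f}")
```

The pooled standard error is smaller than that of any single study, which reflects the gain in precision; a Q value that is large relative to the number of studies minus one suggests heterogeneity that a fixed-effect combination does not capture.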

The association between the expression profiles of genes MARCKS, ZNF263, and SC4MOL and the effect of agents that can be detrimental to normal embryo development portrayed in the funnel plots is consistent with previous reports. Calabrese & Halpain (2005) demonstrated that MARCKS sustains dendritic spine morphology and influences morphological plasticity in rodent hippocampal neurons. Spine morphology is controlled by the intracellular signals that influence cytoskeletal and membrane dynamics, further confirming the detection of numerous genes pertaining to the actin cytoskeleton and cell communication pathways by the meta-analytical approaches. The family of ZNF proteins (Zfp; homologs to the human Znf) has long been associated with embryo development (Chowdhury et al. 1988) and recently with embryo stem cell fate (Bartsevich et al. 2003). Likewise, gene SC4MOL has been associated with mouse embryonic stem cell differentiation (Hailesellasse Sene et al. 2007).

Functional analysis

Representation of genes and gene families

Overall, numerous genes that were identified by the meta-analyses and individual experiment analysis have been associated with embryo development. The findings from the meta-analyses approaches corroborate reports that numerous toxic agents that can cause birth defects share some of the same effects on the transcriptome, while others are unique to the dose, exposure timing, embryogenetic line, and other factors (Finnell et al. 2002, Singh et al. 2005, Green et al. 2007, Suzuki et al. 2007). Our findings also corroborate reports that some toxic agents that can cause birth defects have the same or similar effects on the transcriptome (Nemeth et al. 2005, Singh et al. 2005). The meta-study approaches also support reports of additional condition-dependent effects of teratogenic agents with gene expression patterns highly dependent on the treatment levels compared, including toxic factor, dose, sampling time, and strain (Green et al. 2007).

The association between embryo development and individual genes such as phosphoinositide 3-kinase, integrin, villin, and tropomyosin detected in the present study was consistent with other studies. Druse et al. (2007) described the association between phosphoinositide 3-kinase and serotonin on fetal rhombencephalic neurons in rat fetuses treated with alcohol. Takahashi et al. (2005) provided a detailed review of the critical role of the phosphoinositide 3-kinase pathway in the proliferation, survival, and maintenance of pluripotency in embryonic stem cells. Berti et al. (2006) reviewed the role of integrin in the development of the peripheral nervous system. Integrin is found in the basal plasma membrane and is responsible for cell–matrix adhesion and multicellular organismal development. The differential expression of the integrin gene is consistent with the previous work by Miller et al. (2006). Maunoury et al. (1992) described the association between the regulation of villin gene activity during mouse embryogenesis and the tissue-specific expression observed in adults. Villin is responsible for establishing and maintaining the apical/basal membrane polarity. Green et al. (2007) reported differential expression of cytoskeleton genes in the presence of alcohol, and the present meta-analysis confirmed this finding. Lee et al. (1997) reported dependency of tropomyosin transcripts on alcohol.

Representation of pathway themes

Among the group of genes detected by the meta-analysis of experiments evaluating the effect of agents that can cause birth defects, there was an extensive representation of genes pertaining to glycolysis, cell communication, and proteasome pathways. Singh et al. (2005) compiled multiple microarray experiments, including the studies considered in the present meta-analysis and others from multiple species and platforms, and reported many of the differentially expressed GO categories and pathways identified in this study. As expected, many of the results in this study are consistent with the results reported by Singh et al. (2005); however, additional biological processes (e.g., cellular development) were also uncovered. Wang et al. (2007) reported on the effect of ethanol on the alcohol metabolic pathway during Japanese medaka embryogenesis. Monetti et al. (2002) conducted extensive studies of the effects of mercury on Xenopus embryos and identified a biomarker for these effects in the apoptotic signaling pathway. Analysis of experiment GSE 1072, which included treatment with methylmercury, also uncovered representation of genes involved in anti-apoptotic processes. Many of the genes in the glycolysis pathway can be associated with mitochondrial-related processes. Soleman et al. (2003) described the activation of the mitochondrial apoptotic pathway in mouse embryos by teratogenic agents and Singh et al. (2005) noted the critical role of mitochondria in embryo development. Green et al. (2007) reported significant representation of genes associated with protein degradation and translocation, glycolysis, chaperone proteins, and proteasome subunits in embryos treated with ethanol. The significant presence of genes associated with carbohydrate metabolism in study GSE 1074 was expected because these processes are activated by the presence of alcohol.

The meta-analytical approaches implemented in this study also uncovered additional functions not discussed by Singh et al. (2005). In particular, genes with transferase activity and signal transducer activity were identified by the meta-analytical approaches. Likewise, genes related to the biological processes of lipoprotein, vitamin, and steroid metabolic pathways, germ cell development, developmental growth and maturation, cell differentiation, cellular developmental process, and neuron migration were differentially represented in the meta-analysis gene list but were not differentially represented in more than one of the five studies with more than ten genes differentially expressed. These findings are supported by the literature. With respect to lipoprotein, vitamin, and steroid metabolic pathways, a review by Kelley (2000) on inborn errors of cholesterol biosynthesis indicates that abnormal cholesterol metabolism (lipoprotein metabolic pathway) hinders the function of embryonic signaling proteins that guide the vertebrate body plan during early embryonic development. Bavik et al. (1997) described the importance of the metabolic pathway from vitamin A to retinoic acid for normal development of tissues in rat embryos. Likewise, the role of various genes in neural crest cell and motor neuron migration and differentiation during normal embryo development has been studied by Baum et al. (1999) and Hansson et al. (1999), respectively.

Gene network analysis

The Bayesian network approaches were able to reconstruct gene networks identical or highly similar to the pathways reported in the KEGG database. The results suggest that the network approaches have complementary advantages in capturing the true relationship (or lack thereof) between genes depending on the absolute and relative expression profile exhibited by the genes under consideration. The Bayesian network approaches were able to predict many relationships reported in the KEGG database, even though only a few of the genes assigned to the selected pathway in the KEGG database were present in the microarray platform. Predicted gene relationships not found in the KEGG pathway may be real but not yet entered into the database, or they may be artifacts due to the substantial number of genes that were differentially expressed in the teratogenic experiments considered and the absence of some pathway genes from the microarray platform.

Some of these relationships have been reported in independent studies and have not been entered into the KEGG database. Experimental evidence supports some relationships found within the axon guidance pathway. Sasaki et al. (2002) described how Fyn and Cdk5 mediate signaling by semaphorin-3A that is associated with the regulation of dendrite orientation in cerebral cortex. Adey & Kay (1997) described the relationship between actinin and vinculin. The relationship between the ephrin B and Abl genes inferred using the Bayesian network analysis was discussed by Yu et al. (2001), who demonstrated that the ephrin family of receptor tyrosine kinases and the Abl family of non-receptor tyrosine kinases have shared signaling activities that influence tissue morphogenesis. The complex relationship between actinin, catenin, cadherin, actin, and vinculin within the adherens junction pathway was described in the context of the granule cell layer of the cerebellar glomerulus in mouse embryos by Rose et al. (1995).

Conclusion

Research in numerous fields of reproduction, in particular embryo development, is generating vast amounts of gene expression information. The integration of information from multiple studies within a systems biology approach can result in more accurate and precise characterization of gene expression patterns across a large number of conditions and offer new insights or further confirm conclusions based on isolated experiments. The standard comparison of significant gene lists is often used as the first step to combine results from multiple experiments. However, the selection of arbitrary threshold values may hinder the identification of genes consistently differentially expressed.

Two methodologies that harvest knowledge from extensive gene expression datasets, meta-analysis and gene network analysis, were presented. The meta-analysis approaches can uncover additional evidence supporting consistent expression patterns across studies and offer an objective technique to effectively integrate microarray experiments. We compared the sample-level and study-level meta-analytical approaches, which integrate information in different ways. The ability of these approaches to further mine for consistent patterns across experiments is best described by the funnel plots depicting all analyses simultaneously. Gene network analysis approaches were used to identify known relationships between genes by reconstructing known pathways. In some cases, additional relationships were identified that were supported by experimental evidence.

Traditional single-gene (single or meta-study) analysis of gene expression patterns was complemented with the Bayesian network inference of relationships between genes. This tool not only offers a powerful visual representation of gene–gene interactions but also allows the characterization of condition-dependent relationships that may be less represented in general pathway databases. The application of multiple meta-analytical and the Bayesian network approaches to mouse embryo gene expression experiments and consideration of particular results aids in the understanding of the complementary strengths of each approach.

Overall, the results from our study exemplify the benefits that can be harvested from the application of integrative and systematic analytical approaches to combine transcript information from multiple studies and genes. Standard statistical packages and web-based tools can be used to implement the meta-analyses and gene network approaches utilized in this study. The results from meta-, functional, and gene network analyses can offer additional and complementary understanding of gene co-regulation and uncover accurate biomarkers that can be used to characterize general or specific reproductive and embryonic processes.

Materials and Methods

Data

Gene expression data from nine experiments were obtained from the NCBI gene expression omnibus or GEO repository (http://www.ncbi.nlm.nih.gov/geo). The GEO series are GSE 1068, GSE 1069, GSE 1070, GSE 1072, GSE 1074, GSE 1075, GSE 1076, GSE 1077, and GSE 1079. All experiments (or studies) used the same two-dye cDNA platform and fluorescence intensities were available for 2382 sequence-verified human gene elements that effectively hybridize to mouse, human, and rat target mRNA. Detailed descriptions of the treatment doses and timing, sampling time, mouse strains, platform, RNA isolation, labeling, and hybridization are available in the GEO database and are also provided by Nemeth et al. (2005), Singh et al. (2005), and Green et al. (2007). These microarray experiments were conducted to assess the expression profile of mouse embryos exposed to teratogenic agents that can potentially interfere with normal embryo development and cause birth defects. The treatments included exposure to ethanol, methylmercury, low oxygen, or the metabolic toxin 2-chloro analog of 2′-deoxyadenosine. All experiments used headfold or forebrain samples from young mouse embryos that were collected ∼3 h after dams were injected with the agents at ∼8 days into gestation. Five studies also included treatment with two mitochondrial peripheral benzodiazepine receptor site ligands that may influence the effects of the teratogenic agents (Green et al. 2007).

All experimental factors or treatment levels were present in at least two microarrays following a dye-swap design and a total of 90 microarrays were available for the analysis. Microarray elements were removed when the median background-subtracted fluorescence intensity across both channels did not surpass a minimum intensity of 100. Subsequently, the background-subtracted intensities were log transformed, a loess normalization was applied to remove channel-dependent biases, and the normalized intensities were adjusted for global dye or microarray technical noise (Wolfinger et al. 2001, Cui et al. 2003, Kerr 2003, Rodriguez-Zas et al. 2006).
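The filtering and log-transformation steps described above can be sketched as follows. This is a minimal illustration only, not the authors' pipeline: the input layout (a dictionary of per-microarray intensity pairs), the function name, and the floor of 1.0 applied before taking logarithms are assumptions, and the loess normalization and the adjustment for dye and microarray effects are not shown.

```python
import math
from statistics import median

MIN_INTENSITY = 100  # minimum median intensity stated in the text


def filter_and_log(elements):
    """Drop low-intensity elements and log2-transform the rest.

    `elements` maps an element name to a list of (channel 1, channel 2)
    background-subtracted intensity pairs, one pair per microarray.
    This layout is hypothetical and chosen only for illustration.
    """
    kept = {}
    for name, pairs in elements.items():
        values = [v for pair in pairs for v in pair]
        if median(values) <= MIN_INTENSITY:
            continue  # median across both channels does not surpass 100
        # Assumed guard: floor intensities at 1.0 so the logarithm is defined.
        kept[name] = [
            (math.log2(max(a, 1.0)), math.log2(max(b, 1.0))) for a, b in pairs
        ]
    return kept


# Made-up intensities: geneB has a median of 85 and is removed.
example = {
    "geneA": [(400, 800), (512, 256)],
    "geneB": [(50, 120), (80, 90)],
}
filtered = filter_and_log(example)
```

The threshold is applied to the median over all microarrays and both channels of an element, which is one reading of the description above; a per-microarray filter would be an equally simple variant.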

Meta-, functional, and network analyses

Meta-analyses

Two datasets that allowed the investigation of different meta-analytical approaches were formed. One dataset (ALL) included all the nine experiments available and the observations were assigned to the original treatment levels received. Alternatively, because most of the experiments included control (untreated) samples and samples treated with the teratogenic agents, a second dataset (TWO) was created using the microarrays that included control and treated samples from five out of the nine studies. In the TWO dataset, the original treatment levels were grouped into a control (or untreated) class and a treated class and the microarrays including ligand-treated samples were excluded to minimize the assignment of samples with different transcriptome profiles to the same group.

Different meta-analytic approaches were implemented to combine the information from multiple experiments and assess the treatment effects. The basic information used was the gene expression measurement pertaining to the samples that received different treatment levels within the experiment. These measurements can be analyzed within the experiment to provide an estimate of fold change (with associated standard errors) and the P value of treatment effect. The meta-analytical techniques that combine within-study results (estimates or P values) are termed study-level (STU) meta-analyses. Alternatively, all the measurements across studies and treatments can be analyzed together under the assumption that samples assigned to the same treatment level pertain to the same population. This type of approach that combines within-sample measurements (normalized gene expression intensities) was termed sample-level (SAM) meta-analysis because of the equivalencies with the ‘patient-level’ meta-analysis approach implemented in the field of medicine (e.g., Berghella et al. 2005, Salerno et al. 2007) and to differentiate it from the analysis of results available on a study basis (e.g., Carey et al. 2007).

Within the ALL and TWO datasets, each study or experiment (EXP) was analyzed using a standard mixed-effects model that included the fixed effects of treatment (multiple levels in the ALL dataset or the two levels, treated and control, in the TWO dataset) and dye and the random effect of microarray. In addition to the EXP analyses, STU and SAM meta-analyses of the ALL and TWO datasets were implemented. In the SAM meta-analysis of the ALL dataset (SAM_ALL), the effects of dye and treatment (multiple levels) were modeled as fixed effects and microarray and study were modeled as blocking random effects. In the ALL dataset, the STU_FIS meta-analysis consisted of aggregating EXP P values (of treatment effects) into a Fisher's χ2 test statistic (Hedges & Olkin 1995).
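As an illustration of the STU_FIS idea, Fisher's method combines k independent P values into the statistic −2 Σ ln(p), which follows a χ2 distribution with 2k degrees of freedom under the null hypothesis. The sketch below is not the SAS implementation used in the study; the function name is hypothetical and the P values in the example are made up.

```python
import math


def fisher_combined(p_values):
    """Combine independent P values (each > 0) with Fisher's method.

    Returns (chi-square statistic, degrees of freedom, combined P value).
    """
    k = len(p_values)
    stat = -2.0 * sum(math.log(p) for p in p_values)
    # For even degrees of freedom (2k) the chi-square upper tail has the
    # closed form exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    half = stat / 2.0
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return stat, 2 * k, math.exp(-half) * total


# Made-up within-study P values for one gene across three experiments.
stat, df, p_combined = fisher_combined([0.04, 0.20, 0.08])
```

Because only P values enter the statistic, this approach conveys significance but not the direction of the differential expression, which is why the estimate-based study-level analyses are reported alongside it.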

In the SAM meta-analysis of the TWO dataset (SAM_TWO), the effects of dye and treatment (two levels) were modeled as fixed effects and microarray and study were modeled as blocking random effects. The standardized estimate of differential expression (STU_STD) between control and treated samples (or log2 (fold change)) from each EXP analysis was analyzed in a study-level meta-analysis of the TWO dataset using a mixed-effects model including an overall mean (or overall difference between treated and control samples) and the random effect of study. The variance–covariance matrix of the study effect was diagonal with variances equal to the squared standard error computed within the EXP. This variance–covariance matrix accounted for the heterogeneity of variance across studies. The SAM_TWO approach is expected to provide slightly more power than the STU_STD because the SAM_TWO combines ∼40 gene expression measurements and the STU_STD combines five observations (the standardized estimates of differential expression between treatment levels from the five studies). A study-level meta-analysis of the non-standardized estimates of differential expression between treated and control samples (STU_NON) was also implemented. All analyses were implemented using the linear mixed model (MIXED) procedure in SAS (SAS Institute Inc., Cary, NC, USA). A comparative review of the information used by the study-level meta-analyses of estimates or P values, the sample-level meta-analysis, and the comparison of lists of significant genes obtained from the analysis of individual experiments is presented in Table 1. A detailed review of the meta-analysis can be found in Lipsey & Wilson (2000) and the implementation of the approaches used in this study is described by Wang & Bushman (1999) and Arthur et al. (2001).
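A simplified analogue of the study-level analysis of estimates is the inverse-variance weighted mean, in which each study contributes its estimate of log2 (fold change) weighted by the reciprocal of its squared standard error. The sketch below is an assumption-laden illustration, not the MIXED procedure model used in the study: it treats the within-study variances as known and does not estimate any additional variance component. The function name and the numbers are hypothetical.

```python
import math


def inverse_variance_pool(estimates, std_errors):
    """Pool per-study estimates with weights 1 / SE^2.

    Returns (pooled estimate, pooled standard error, z statistic).
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * est for w, est in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, pooled_se, pooled / pooled_se


# Made-up log2 fold changes and standard errors from five studies.
log2_fc = [0.42, 0.31, 0.55, 0.10, 0.38]
se = [0.20, 0.15, 0.30, 0.25, 0.18]
pooled, pooled_se, z = inverse_variance_pool(log2_fc, se)
```

Studies with smaller standard errors dominate the pooled estimate, which is the same mechanism by which a diagonal variance matrix built from squared standard errors accommodates heterogeneity of variance across studies.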

Funnel plots were used to facilitate the visual comparison of results from the meta-analyses (Rodriguez-Zas et al. 2008). These plots are traditionally used in the meta-analysis to depict the estimates within and across studies. The funnel plots offer a comprehensive view of the gene expression pattern within and across studies. These plots are particularly informative, in that the estimates and the uncertainty of the estimates from each analysis are depicted simultaneously, facilitating the interpretation of results. Presented in the same funnel plot, the STU_NON overall estimates of differential expression are more similar to the SAM_ALL estimates than the STU_STD because the standardization of estimates in the latter approach may result in division of the estimates by small standard error values.

Functional analysis

The cDNA platform has been mapped to the UniGene clusters (NCBI UniGene Homo sapiens Build #202) and the KEGG (http://www.genome.jp/kegg) pathways. Representation of the GO categories in individual studies with more than five genes differentially expressed (GSE 1070, GSE 1072, GSE 1074, GSE 1075, and GSE 1077), the SAM_ALL sample-level meta-analysis, and the STU_FIS Fisher's meta-test of all the nine studies was examined using GeneTools (http://www.genetools.microarray.ntnu.no). A hypergeometric test was used to identify the GO categories significantly (P<0.005) over- or under-represented and with at least two genes within the EXP, SAM_ALL, and STU_FIS list of genes differentially expressed.
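The over-representation side of a hypergeometric test can be sketched as below. This is a generic illustration and not the GeneTools implementation; the function and argument names are hypothetical, and the counts in the example (other than the 2382 platform elements mentioned in the Data section) are made up.

```python
from math import comb


def hypergeom_upper_tail(n_platform, n_category, n_selected, n_overlap):
    """P(X >= n_overlap) for a hypergeometric draw without replacement.

    n_platform: genes on the platform
    n_category: platform genes annotated to the GO category
    n_selected: genes declared differentially expressed
    n_overlap:  selected genes that belong to the category
    """
    upper = min(n_category, n_selected)
    denom = comb(n_platform, n_selected)
    num = sum(
        comb(n_category, k) * comb(n_platform - n_category, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return num / denom


# Made-up counts: 40 category genes, 60 selected genes, 6 in common.
p_over = hypergeom_upper_tail(2382, 40, 60, 6)
```

Under-representation would use the lower tail, P(X <= n_overlap), computed with the same terms summed from 0 to n_overlap.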

Gene network analysis

A Bayesian network approach was used to infer the relationships between genes using gene expression measurements from the nine mouse embryo studies designed to evaluate the effect of the teratogenic factors. Under the Bayesian framework, the networks are represented with directed acyclic graphs where the genes (or any other random variable) are denoted with nodes (i.e., circles or ovals) and the relationships between genes are denoted with arcs (or edges) connecting the nodes. The estimates of gene expression for each treatment level obtained from the SAM_ALL analyses (adjusted for all other sources of variation) were used to construct gene networks using mixture and dependence Bayesian network approaches.
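The directed acyclic graph requirement can be made concrete with a small check. The sketch below stores a candidate network as a map from each gene to its parent genes and tests for cycles with the Python standard library (graphlib, Python 3.9 or later). The gene labels and arcs are placeholders, not relationships inferred in the study.

```python
from graphlib import CycleError, TopologicalSorter


def is_acyclic(parents):
    """Return True if the parent map describes a directed acyclic graph.

    `parents` maps each gene (node) to the set of its parent genes.
    """
    try:
        # TopologicalSorter expects node -> predecessors, which matches a
        # parent map; iterating static_order() raises CycleError on a cycle.
        list(TopologicalSorter(parents).static_order())
    except CycleError:
        return False
    return True


# Placeholder network: geneA -> geneB -> geneC and geneA -> geneC.
valid_network = {"geneC": {"geneA", "geneB"}, "geneB": {"geneA"}}
# Placeholder network with a cycle: geneA -> geneB -> geneA.
invalid_network = {"geneA": {"geneB"}, "geneB": {"geneA"}}
```

A structure search would typically apply a check of this kind whenever an arc is added or reversed, so that only valid networks are scored.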

The mixture Bayesian network uses data-driven weighted mixtures of Gaussian models to describe the continuous gene expression data (Davies & Moore 2000, Friedman et al. 2000, Pe'er et al. 2001, Pe'er 2005, Ko et al. 2007a, 2007b) and models the dependencies between genes. Briefly, the distribution of each gene and the associated parental genes is described using a mixture of normal distributions instead of a single distribution (McLachlan & Peel 2000). The complete network is considered to be composed of sub-networks, each corresponding to a given gene node and associated parent nodes. The unknown parameters of the mixture of Gaussian distributions are the means and the variance–covariance matrices describing the association between the gene nodes for each component of the mixture and the weights of each component of the mixture (Davies & Moore 2000). The numbers of parental genes and mixture components are identified using the Bayesian information criterion (BIC; Kass & Raftery 1995), which is a function of the model likelihood, the number of parameters, and the sample size. This criterion favors models that better fit the data while penalizing for model complexity. An expectation-maximization algorithm is used to estimate the relationship (covariance) between the genes and the probability of each mixture component. The beneficial features of the mixture Bayesian network approaches of Davies & Moore (2000) and Ko et al. (2007a, 2007b) include the direct analysis of continuous gene expression measurements, modeling of multiple relationships simultaneously, and use of a mixture of distributions to reflect the potentially complex behavior of gene expression data across a large number of microarrays and experiments evaluating different conditions (Rodriguez-Zas et al. 2008).
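The trade-off between fit and complexity can be illustrated with the common "smaller is better" form of the criterion, BIC = −2 ln L + k ln n, where L is the maximized likelihood, k the number of parameters, and n the sample size. The sketch below is not the software used in the study; the function names are hypothetical, and the log-likelihoods and parameter counts are made-up inputs rather than fitted values.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion in the 'smaller is better' form."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def best_model(candidates, n_obs):
    """Return the label of the candidate with the lowest BIC.

    `candidates` maps a label to (maximized log-likelihood, parameter count).
    """
    return min(candidates, key=lambda name: bic(*candidates[name], n_obs))


# Made-up fits for one gene node: the second model fits slightly better
# but uses more parameters.
candidates = {
    "1 component": (-120.0, 2),
    "2 components": (-118.5, 5),
}
chosen = best_model(candidates, n_obs=40)
```

With these illustrative numbers the small gain in likelihood does not offset the penalty for three extra parameters, so the simpler model is retained.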

The dependence model Bayesian network was implemented using B-course, a web-based data analysis tool for Bayesian modeling (Myllymäki et al. 2002). The dependence modeling approach infers the relationships (i.e., causalities) between random variables (i.e., genes) based on the estimated probabilistic dependences among them. The beneficial features of the Myllymäki et al. (2002) approach include the capability to identify potentially non-linear relationships between genes, the ability to accommodate any gene expression distributional assumption through the modeling of unordered gene expression categories, and the prompt availability of results using the B-course website. To demonstrate the potential of gene network approaches, we concentrated on reconstructing the networks that included genes identified as differentially expressed in the SAM_ALL meta-analysis approach and present in the KEGG pathway knowledgebase.

Declaration of interest

The authors declare that there is no conflict of interest that would prejudice the impartiality of this scientific work.

Funding

This work was supported by NIH/NIGMS (1R01GM068946-01), NSF/ITR (0428472), NSF/FIBR (0425852), and NIH/NIDA (5P30DA018310-039003). This article is based on research presented at the 2nd International Meeting on Mammalian Embryogenomics, which was sponsored by the Organisation for Economic Co-operation and Development (OECD), Le conseil Régional Ile-de-France, the Institut National de la Recherche Agronomique (INRA), Cogenics-Genome Express, Eurogentec, Proteigene, Sigma-Aldrich France and Diagenode sa. S L R-Z received funding from the OECD to attend the meeting. Y K, H A A and B R S have no relationship with any of the meeting sponsors.

Acknowledgements

The authors would like to thank Chengxiang Zhai (Department of Computer Science, University of Illinois) for his contributions to this study.

References

  • Adams HA, Rodriguez-Zas SL & Southey BR 2007 Comparison of meta-analytical approaches for gene expression profiling. Proceedings of the American Statistical Association [CD-ROM], Biometrics Section, Abstract number 310099. Alexandria, VA: American Statistical Association.

  • Adey NB & Kay BK 1997 Isolation of peptides from phage-displayed random peptide libraries that interact with the talin-binding domain of vinculin. Biochemical Journal 324 523–528.

  • Ang SL & Constam DB 2004 A gene network establishing polarity in the early mouse embryo. Seminars in Cell and Developmental Biology 15 555–561.

  • Arthur W, Bennett W & Huffcutt AI 2001 Conducting Meta-Analysis Using SAS. Mahwah, NJ: Lawrence Erlbaum Associates, Inc.

  • Bartsevich VV, Miller JC, Case CC & Pabo CO 2003 Engineered zinc finger proteins for controlling stem cell fate. Stem Cells 21 632–637.

  • Baum PD, Guenther C, Frank CA, Pham BV & Garriga G 1999 The Caenorhabditis elegans gene ham-2 links Hox patterning to migration of the HSN motor neuron. Genes and Development 13 472–483.

  • Benjamini Y & Hochberg Y 1995 Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society. Series B 57 289–300.

  • Berghella V, Odibo AO, To MS, Rust OA & Althuisius SM 2005 Cerclage for short cervix on ultrasonography: meta-analysis of trials using individual patient-level data. Obstetrics and Gynecology 106 181–189.

  • Berti CA, Wrabetz NL & Feltri ML 2006 Role of integrins in peripheral nerves and hereditary neuropathies. Neuromolecular Medicine 8 191–204.

  • Bolstad BM, Collin F, Simpson KM, Irizarry RA & Speed TP 2004 Experimental design and low-level analysis of microarray data. International Review of Neurobiology 60 25–58.

  • Calabrese B & Halpain S 2005 Essential role for the PKC target MARCKS in maintaining dendritic spine morphology. Neuron 48 77–90.

  • Carey KB, Scott-Sheldon LA, Carey BK & Demartini KS 2007 Individual-level interventions to reduce college student drinking: a meta-analytic review. Addictive Behaviors 32 2469–2494.

  • Choi JK, Yu U, Kim S & Yoo OJ 2003 Combining multiple microarray studies and modeling inter-study variation. Bioinformatics 19 i84–i90.

  • Chowdhury K, Rohdewohld H & Gruss P 1988 Specific and ubiquitous expression of different Zn finger protein genes in the mouse. Nucleic Acids Research 16 9995–10011.

  • Cui X, Kerr MK & Churchill GA 2003 Transformations for cDNA microarray data. Statistical Applications in Genetics and Molecular Biology 2 Article 4. Available at http://www.bepress.com/sagmb/vol2/iss1/art4.

  • Davies S & Moore A 2000 Mix-nets: factored mixtures of Gaussians in Bayesian networks with mixed continuous and discrete variables. Pittsburgh, PA: School of Computer Science, Carnegie Mellon University, Technical report CMU-CS-00-119. Available at: http://citeseer.ist.psu.edu/cache/papers/cs/15276/http:zSzzSzwww-cgi.cs.cmu.eduzSzafszSzcs.cmu.eduzSzuserzSzscottdzSzwwwzSzuai2000.pdf/davies00mixnets.pdf.

  • Druse MJ, Gillespie RA, Tajuddin NF & Rich M 2007 S100B-mediated protection against the pro-apoptotic effects of ethanol on fetal rhombencephalic neurons. Brain Research 1150 46–54.

  • Finnell RH, Waes JG, Eudy JD & Rosenquist TH 2002 Molecular basis of environmentally induced birth defects. Annual Review of Pharmacology and Toxicology 42 181–208.

  • Friedman N, Linial M, Nachman I & Pe'er D 2000 Using Bayesian networks to analyze expression data. Journal of Computational Biology 7 601–620.

  • Green ML, Singh AV, Zhang Y, Nemeth KA, Sulik KK & Knudsen TB 2007 Reprogramming of genetic networks during initiation of the fetal alcohol syndrome. Developmental Dynamics: An Official Publication of the American Association of Anatomists 236 613–631.

  • Hailesellasse Sene K, Porter CJ, Palidwor G, Perez-Iratxeta C, Muro EM, Campbell PA, Rudnicki MA & Andrade-Navarro MA 2007 Gene function in early mouse embryonic stem cell differentiation. BMC Genomics 8 85.

  • Hansson SR, Mezey E & Hoffman BJ 1999 Serotonin transporter messenger RNA expression in neural crest-derived structures and sensory pathways of the developing rat embryo. Neuroscience 89 243–265.

  • Hedges LV & Olkin I 1995 Statistical Methods for Meta-analysis. Burlington: Academic Press, Elsevier Inc.

  • Hu P, Greenwood CMT & Beyene J 2006 Statistical methods for meta-analysis of microarray data: a comparative study. Information Systems Frontiers 8 9–20.

  • Kass RE & Raftery AE 1995 Bayes factors. Journal of the American Statistical Association 90 773–795.

  • Kerr MK 2003 Linear models for microarray data analysis: hidden similarities and differences. Journal of Computational Biology 10 891–901.

  • Ko Y, Zhai C & Rodriguez-Zas SL 2007a An efficient mixture model approach to characterize gene pathways using Bayesian networks. Proceedings of the American Statistical Association, [CD-ROM], Biometrics Section, Abstract number 310340. Alexandria, VA: American Statistical Association.

  • Ko Y, Zhai C & Rodriguez-Zas SL 2007b Inference of gene pathways using Gaussian mixture models. Proceedings of the 2007 IEEE International Conference on Bioinformatics and Biomedicine, San Jose, CA, USA. 362–367. IEEE Computer Society Press.

  • Lee IJ, Soh Y & Song BJ 1997 Molecular characterization of fetal alcohol syndrome using mRNA differential display. Biochemical and Biophysical Research Communications 240 309–313.

  • Li H & Zhan M 2006 Systematic intervention of transcription for identifying network response to disease and cellular phenotypes. Bioinformatics 22 96–102.

  • Lipsey MW & Wilson DB 2000 Practical Meta-Analysis. Thousand Oaks, CA: Sage Publications.

  • Maunoury R, Robine S, Pringault E, Leonard N, Gaillard JA & Louvard D 1992 Developmental regulation of villin gene expression in the epithelial cell lineages of mouse digestive and urogenital tracts. Development 115 717–728.

  • McLachlan G & Peel D 2000 Finite Mixture Models. Indianapolis, IN: Wiley Publishing Inc.

  • Miller MW, Mooney SM & Middleton FA 2006 Transforming growth factor β1 and ethanol affect transcription and translation of genes and proteins for cell adhesion molecules in B104 neuroblastoma cells. Journal of Neurochemistry 97 1182–1190.

  • Monetti C, Vigetti D, Prati M, Sabbioni E, Bernardini G & Gornati R 2002 Gene expression in Xenopus embryos after methylmercury exposure: a search for molecular biomarkers. Environmental Toxicology and Chemistry/SETAC 21 2731–2736.

  • Myllymäki P, Silander T, Tirri H & Uronen P 2002 B-Course: a web-based tool for Bayesian and causal data analysis. International Journal on Artificial Intelligence Tools 11 369–387.

  • Nemeth KA, Singh AV & Knudsen TB 2005 Searching for biomarkers of developmental toxicity with microarrays: normal eye morphogenesis in rodent embryos. Toxicology and Applied Pharmacology 206 219–228.

  • Pe'er D 2005 Bayesian network analysis of signaling networks: a primer. Science's STKE 281 pl4.

  • Pe'er D, Regev A, Elidan G & Friedman N 2001 Inferring subnetworks from perturbed expression profiles. Bioinformatics 17 S215–S224.

  • Revilla-i-Domingo R, Oliveri P & Davidson EH 2007 A missing link in the sea urchin embryo gene regulatory network: hesC and the double-negative specification of micromeres. PNAS 104 12383–12388.

  • Rhodes DR, Barrette TR, Rubin MA, Ghosh D & Chinnaiyan AM 2002 Meta-analysis of microarrays: interstudy validation of gene expression profiles reveals pathway dysregulation in prostate cancer. Cancer Research 62 4427–4433.

  • Rodriguez-Zas SL, Southey BR, Whitfield CW & Robinson GE 2006 Semiparametric approach to characterize unique gene expression trajectories across time. BMC Genomics 13 233.

  • Rodriguez-Zas SL, Schellander K & Lewin HA 2008 Biological interpretations of transcriptomic profiles in mammalian oocytes and embryos. Reproduction 135 129–139.

  • Rose O, Grund C, Reinhardt S, Starzinski-Powitz A & Franke WW 1995 Contactus adherens a special type of plaque-bearing adhering junction containing M-cadherin in the granule cell layer of the cerebellar glomerulus. PNAS 92 6022–6026.

  • Salerno F, Camma C, Enea M, Rossle MF & Wong F 2007 Transjugular intrahepatic portosystemic shunt for refractory ascites: a meta-analysis of individual patient data. Gastroenterology 133 825–834.

  • Sasaki Y, Cheng C, Uchida Y, Nakajima O, Ohshima T, Yagi T, Taniguchi M, Nakayama T, Kishida R, Kudo Y, Ohno S, Nakamura F & Goshima Y 2002 Fyn and Cdk5 mediate semaphorin-3A signaling which is involved in regulation of dendrite orientation in cerebral cortex. Neuron 35 907–920.

  • Singh AV, Knudsen KB & Knudsen TB 2005 Computational systems analysis of developmental toxicity: design, development and implementation of a Birth Defects Systems Manager (BDSM). Reproductive Toxicology 19 421–439.

  • Soleman D, Cornel L, Little SA & Mirkes PE 2003 Teratogen-induced activation of the mitochondrial apoptotic pathway in the yolk sac of day 9 mouse embryos. Birth Defects Research. Part A, Clinical and Molecular Teratology 67 98–107.

  • Suzuki A, Urushitani H, Sato T, Kobayashi T, Watanabe H, Ohta Y & Iguchi T 2007 Gene expression change in the Mullerian duct of the mouse fetus exposed to diethylstilbestrol in utero. Experimental Biology and Medicine 232 503–514.

  • Takahashi K, Murakami M & Yamanaka S 2005 Role of the phosphoinositide 3-kinase pathway in mouse embryonic stem (ES) cells. Biochemical Society Transactions 33 1522–1525.

  • Wang MC & Bushman BJ 1999 Integrating Results Through Meta-Analytic Review Using SAS Software. Cary, NC: SAS Press, SAS Institute Inc.

  • Wang Y, Joshi T, Zhang XS, Xu D & Chen L 2006 Inferring gene regulatory networks from multiple microarray datasets. Bioinformatics 22 2413–2420.

  • Wang X, Zhu S, Khan IA & Dasmahapatra AK 2007 Ethanol attenuates Aldh9 mRNA expression in Japanese medaka (Oryzias latipes) embryogenesis. Comparative Biochemistry and Physiology. Part B, Biochemistry and Molecular Biology 146 357–363.

  • Wolfinger RD, Gibson G, Wolfinger ED, Bennett L, Hamadeh H, Bushel P, Afshari C & Paules RS 2001 Assessing gene significance from cDNA microarray expression data via mixed models. Journal of Computational Biology 8 625–637.

  • Yu HH, Zisch AH, Dodelet VC & Pasquale ER 2001 Multiple signaling interactions of Abl and Arg kinases with the EphB2 receptor. Oncogene 20 3995–4006.


This article was presented at the 2nd International Meeting on Mammalian Embryogenomics, 17–20 October 2007. The Co-operative Research Programme: Biological Resource Management for Sustainable Agricultural Systems of The Organisation for Economic Co-operation and Development (OECD) has supported the publication of this article. The meeting was also sponsored by Le conseil Régional Ile-de-France, the Institut National de la Recherche Agronomique (INRA), Cogenics-Genome Express, Eurogentec, Proteigene, Sigma-Aldrich France and Diagenode sa.

 


    Comparison of estimates of differential expression between control and treated samples for the gene MARCKS across analyses of five murine microarray experiments. Each row shows the estimate (central marker) and 95% confidence limits (whiskers) of the differential expression (in log2 units) between control samples and samples treated with teratogenic agents. Rows correspond to the individual analysis of each of the five studies (GSE 1068, 1069, 1074, 1075, and 1076), the sample-level meta-analysis (TWO), the study-level meta-analysis of the standardized estimates (STU_std), and the study-level meta-analysis of the non-standardized estimates (STU_non).


    Bayesian dependence (A) and mixture Bayesian (B) inference of the gene network associated with actomyosin assembly contraction and focal complex assembly within the KEGG regulation of actin cytoskeleton pathway. Gene names are: ITG, integrin β4 or ITGB4; c-Src, c-src tyrosine kinase; GRLF1, glucocorticoid receptor DNA-binding factor 1; RHO, ras homolog gene family, member A; RhoGEF, Rho guanine nucleotide exchange factor 11; MLCP, protein phosphatase 1, catalytic subunit, α isoform; MLC, myosin, light polypeptide 3; PI4P5K, phosphatidylinositol-4-phosphate 5-kinase, type II, β; and VCL, vinculin. Brown edges denote gene relationships identified by the Bayesian dependence model. Blue solid edges denote direct gene relationships identified by the mixture Bayesian model and confirmed in the KEGG pathway. Green dash-dot edges denote indirect gene relationships identified by the mixture Bayesian model and confirmed in the KEGG pathway. Red dotted edges denote gene relationships present in the KEGG pathway but not identified by the mixture Bayesian model.