Mendelian randomization (MR) is a framework for causal inference that combines cross-sectional data with genetic information. This paper summarizes statistical methods that are commonly applied and straightforward to use for conducting MR analyses, including those that take advantage of the rich dataset of SNP-trait associations revealed in the last decade through large-scale genome-wide association studies. Using these data, powerful MR studies are possible. However, the causal estimate may be biased if the assumptions of MR are violated. The source and type of this bias are described, together with a summary of the mathematical formulas that should help in estimating the magnitude and direction of the potential bias in a specific research setting. Finally, methods for relaxing the assumptions and for conducting sensitivity analyses are discussed. Future research in the field of MR includes the assessment of non-linear causal effects and the automatic detection of invalid instruments.
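As an illustration of the kind of summary-data method this abstract refers to, the following is a minimal sketch of a fixed-effect inverse-variance-weighted (IVW) combination of per-SNP Wald ratios. The abstract does not name the specific estimators covered, so the choice of IVW, the first-order standard error, the function names, and the toy numbers are all assumptions made for illustration.

```python
import math

def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Per-SNP causal estimate with a first-order standard error.

    The SE ignores uncertainty in the SNP-exposure estimate (an assumption).
    """
    estimate = beta_outcome / beta_exposure
    se = se_outcome / abs(beta_exposure)
    return estimate, se

def ivw_estimate(snps):
    """Fixed-effect inverse-variance-weighted mean of the Wald ratios.

    snps: iterable of (beta_exposure, beta_outcome, se_outcome) tuples.
    Returns (combined estimate, standard error).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for beta_x, beta_y, se_y in snps:
        estimate, se = wald_ratio(beta_x, beta_y, se_y)
        weight = 1.0 / se ** 2
        weighted_sum += weight * estimate
        weight_total += weight
    return weighted_sum / weight_total, math.sqrt(1.0 / weight_total)

# Hypothetical summary statistics, invented for illustration only.
toy_snps = [(0.10, 0.020, 0.010), (0.20, 0.040, 0.010), (0.05, 0.010, 0.010)]
beta_ivw, se_ivw = ivw_estimate(toy_snps)
```

The estimate is only interpretable as causal if every SNP is a valid instrument; a pleiotropic SNP would bias it, which is the situation the sensitivity analyses mentioned in the abstract are meant to address.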
Genome-wide association studies (GWAS) are used to identify genetic markers linked with at least partially heritable diseases or phenotypes without prior knowledge of any disease-associated genetic loci. In summer 2008, all individuals of the population-based cohort Study of Health in Pomerania (SHIP) were individually genotyped using the Affymetrix Genome-Wide Human SNP Array 6.0 microarray. The aim of this work was to establish an efficient workflow for GWAS using the more than 4000 individually genotyped samples of the SHIP cohort as well as pooled samples, focusing exclusively on analyzing genetic variations based on single nucleotide polymorphisms (SNPs). Firstly, an optimal array platform for the genotyping analysis had to be chosen that detected most of the available genetic variants at a high level of accuracy. Secondly, extensive quality controls had to be performed, starting from DNA extraction and including tests of the generated array data by the analysis software, to obtain the most reliable data for the subsequent association studies. For the identification of loci with smaller genetic influences, individual cohorts were meta-analyzed in large nationally and internationally organized consortia (e.g. CHARGE, BPGen, HaemGen, GIANT, CKD Gen). To participate in those meta-analyses, a comparable common set of genetic data had to be generated. This was done by imputation of the data generated by individual array-based genotyping on the basis of a reference panel using chromosomal linkage information. Due to the extensive phenotype information in the SHIP study, it was possible to perform many genome-wide discovery analyses and replication studies of possible susceptibility loci in a short time once the genetic data were available and processed.
This made it necessary to set up an efficient workflow for storing the huge amount of genetic data, converting it into different formats readable by specific analysis software, performing the association analyses and processing the results into a clear, human-readable format. This included replications, GWAS and meta-analyses of several cohorts. Many susceptibility loci were newly identified in different association studies with the SHIP data included and were subsequently published. In this work, genetic association studies with the SHIP data included were performed and published on blood pressure, uric acid concentrations, cardiac structure and function, lipid metabolism, hematological parameters, kidney function, smoking quantity, circulating IGF-I and IGFBP-3 concentrations and thyroid volume including the risk of goiter development. Besides the SHIP cohort, there was a need to use other cohorts, especially patient cohorts, for GWAS. Since no genotype information from these patient cohorts was available and the individual genotyping of many probands is still expensive and therefore often not affordable, we established a cost-effective allelotyping method that relies on pooling DNA samples prior to hybridization with microarrays. After estimating the pooling-specific error of a case-control allelotyping study, the allelotyping approach was used for identifying genetic susceptibility loci associated with aggressive periodontitis. Unless they refer to work of collaborators, all statistical analyses, data handling and in silico work concerning the SHIP data described in this context were performed by the author of this dissertation.
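To make the role of a pooling-specific error concrete, here is a hedged sketch of one common way to test an allele-frequency difference between a case pool and a control pool. The abstract does not give the statistic used in the thesis; the formula below (binomial sampling variance plus an independent measurement-error term per pool), the function names, and the example numbers are assumptions.

```python
import math

def pooled_allele_z(p_case, p_ctrl, n_case, n_ctrl, pooling_sd):
    """Z statistic for an allele-frequency difference between two DNA pools.

    p_case, p_ctrl: allele frequencies estimated from the case and control pools
    n_case, n_ctrl: number of individuals contributing DNA to each pool
    pooling_sd: assumed SD of the pooling-specific measurement error per pool
    """
    # Binomial sampling variance; each diploid individual contributes 2 alleles.
    v_sampling = (p_case * (1 - p_case) / (2 * n_case)
                  + p_ctrl * (1 - p_ctrl) / (2 * n_ctrl))
    # Measurement error is assumed independent and equal in both pools.
    v_pooling = 2 * pooling_sd ** 2
    return (p_case - p_ctrl) / math.sqrt(v_sampling + v_pooling)

def two_sided_p(z):
    """Two-sided p-value under a standard normal null distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical numbers: the same frequency difference with and without pooling error.
z_no_error = pooled_allele_z(0.30, 0.25, 500, 500, pooling_sd=0.0)
z_with_error = pooled_allele_z(0.30, 0.25, 500, 500, pooling_sd=0.01)
```

Under these assumptions, even a small pooling error visibly shrinks the test statistic, which is why the error has to be estimated before pooled data are screened for susceptibility loci.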
Genomics is the field of modern biology that studies the genome as the sum of all genes of a given organism. Genomics includes the analysis of genomic variations in order to identify genetic susceptibility loci for various human diseases. Besides genomics, there are related fields summarized by the term "Omics", such as transcriptomics and proteomics, which study the sum of all transcripts and proteins in a defined biological system, respectively. Genetic variants, namely single nucleotide polymorphisms (SNPs) and copy number variations (CNVs), are used to identify genomic loci associated with human traits and diseases. Genome-wide association studies (GWASs) based on SNP data have been performed for a wide range of human traits and diseases. In the population-based Study of Health in Pomerania (SHIP) and the independent SHIP-TREND study, whole-genome genotyping data were available for 4081 and 986 individuals, respectively. In contrast to the widely used GWAS based on SNPs, association studies using CNV data are difficult to implement and thus less common. Therefore, one aim of this work was to detect CNVs using the whole-genome genotyping data available for 4081 individuals from SHIP. Another aim was to develop an efficient workflow for the analysis of these CNVs. As most common genetic variants exhibit only relatively small effects on phenotypic variability, large sample sizes are needed to maximize the statistical power to detect such effects. Therefore, the integration of data from multiple collaborating studies is indispensable. In this context, several CNV studies with the SHIP data have been performed and published, for example on body mass index (BMI) phenotypes where the SHIP cohort was used as a population-based control.
Trait-associated genetic markers identified through GWASs are often intergenic or synonymous coding, and those loci identified through whole-genome CNV analyses often contain multiple genes, making it difficult to identify the causal variants. In this context, the functional analysis of identified loci aids in determining the causal variant(s). One possibility for conducting functional analysis is expression quantitative trait loci (eQTL) analysis, defined as the association of genome-wide genotyping data with genome-wide gene expression data based on measured transcriptomes. This allows the identification of genetic variants influencing the expression levels of defined genes. A further example is transcriptome-wide association analysis (TWAS), defined as the association of phenotype data with whole-genome expression data. Thus, another aim of this work was to establish an analysis pipeline for processing such expression data, which were available for about 1000 individuals from the SHIP-TREND study. Here, array-based gene expression data were generated using RNA prepared from whole blood. Interpretation of TWAS results is often difficult because of possible reverse causation on gene expression data. Furthermore, technical errors of measurement may bias the results. In a comprehensive work, biological and technical factors influencing measured gene expression data have been identified and were subsequently taken into account to improve the association analyses. To further elucidate the molecular mechanisms underlying the relationship of gene expression levels with human traits or diseases, pathway analyses using the Ingenuity Pathway Analysis (IPA) tool have been performed in connection with the TWAS. As for GWASs, the associations identified in TWAS usually exhibit only small effect sizes, highlighting the need for larger studies or meta-analyses to identify all susceptibility variants.
In this context, several eQTL and TWAS meta-analyses using the SHIP-TREND data have been performed, for example on the phenotypes age, sex, BMI, smoking status and serum lipid traits. The results of these analyses are in preparation for publication and the most advanced example, the correlation of expression data with BMI, is presented here. The integration of whole-genome genotyping and expression data provides new functional information on the underlying biological mechanisms of complex human traits and diseases. Within the frame of this work, this could be demonstrated for the example of susceptibility to Helicobacter pylori infection.
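The eQTL definition in this abstract (association of genotype data with gene expression data) can be sketched for a single SNP-gene pair as a simple linear regression of expression on allele dosage. This is a minimal sketch, not the thesis pipeline: the additive dosage coding, the absence of covariates, the function name, and the toy data are assumptions. A real analysis would also adjust for the biological and technical factors the abstract mentions.

```python
import math

def eqtl_ols(dosage, expression):
    """Regress expression on allele dosage (0, 1, 2) by ordinary least squares.

    Returns (slope, standard error of the slope, t statistic).
    """
    n = len(dosage)
    mean_g = sum(dosage) / n
    mean_e = sum(expression) / n
    sxx = sum((g - mean_g) ** 2 for g in dosage)
    sxy = sum((g - mean_g) * (e - mean_e) for g, e in zip(dosage, expression))
    slope = sxy / sxx
    intercept = mean_e - slope * mean_g
    # Residual sum of squares with n - 2 degrees of freedom
    rss = sum((e - intercept - slope * g) ** 2
              for g, e in zip(dosage, expression))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, se, slope / se

# Hypothetical data for six individuals, invented for illustration only.
toy_dosage = [0, 0, 1, 1, 2, 2]
toy_expression = [5.0, 5.2, 5.6, 5.4, 6.1, 5.9]
slope, se, t_stat = eqtl_ols(toy_dosage, toy_expression)
```

A genome-wide eQTL scan repeats this test over many SNP-gene pairs, so the resulting p-values need correction for multiple testing.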
Metabolites are intermediates or end products of biochemical processes involved in both health and disease. Here, we take advantage of the well-characterized Cooperative Health Research in South Tyrol (CHRIS) study to perform an exome-wide association study (ExWAS) on absolute concentrations of 175 metabolites in 3294 individuals. To increase power, we imputed the identified variants into an additional 2211 genotyped individuals of CHRIS. In the resulting dataset of 5505 individuals, we identified 85 single-variant genetic associations, of which 39 have not been reported previously. Fifteen associations emerged at ten variants with >5-fold enrichment in CHRIS compared to non-Finnish Europeans reported in the gnomAD database. For example, the CHRIS-enriched ETFDH stop gain variant p.Trp286Ter (rs1235904433-hexanoylcarnitine) and the MCCC2 stop lost variant p.Ter564GlnextTer3 (rs751970792-carnitine) have been found in patients with glutaric acidemia type II and 3-methylcrotonylglycinuria, respectively, but the loci have not been associated with the respective metabolites in a genome-wide association study (GWAS) previously. We further identified three gene-trait associations, where multiple rare variants contribute to the signal. These results not only provide further evidence for previously described associations, but also describe novel genes and mechanisms for diseases and disease-related traits.
Osteoporosis, a complex chronic disease with increasing prevalence, is characterised by reduced bone mineral density (BMD) and increased fracture risk. The high heritability of BMD suggests a substantial impact of the individual genetic disposition on bone phenotypes and the development of osteoporosis. In recent years, genome-wide association studies (GWAS) identified hundreds of genetic variants associated with BMD or osteoporosis. Here, we analysed 1103 single nucleotide polymorphisms (SNPs) previously identified as associated with estimated BMD (eBMD) in the UK Biobank. We assessed whether these SNPs are related to heel stiffness index obtained by quantitative ultrasound in 5665 adult participants of the Study of Health in Pomerania (SHIP). We confirmed 45 significant associations after correction for multiple testing. Next, we analysed six selected SNPs in 631 patients evaluated for osteoporosis [rs2707518 (CPED1/WNT16), rs3779381 (WNT16), rs115242848 (LOC101927709/EN1), rs10239787 (JAZF1), rs603424 (PKD2L1) and rs6968704 (JAZF1)]. Differences in minor allele frequencies (MAF) of rs2707518 and rs3779381 between SHIP participants (higher MAF) and patients evaluated for osteoporosis (lower MAF) indicated a protective effect of the minor allele on bone integrity. In contrast, differences in MAF of rs603424 indicated a harmful effect. Co-localisation analyses indicated that the rs603424 effect may be mediated via the expression of stearoyl-CoA desaturase (SCD), an enzyme highly expressed in adipose tissue with a crucial role in lipogenesis. Taken together, our results support the role of the WNT16 pathway in the regulation of bone properties and indicate a novel causal role of SCD expression in adipose tissue on bone integrity.
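The abstract reports significance "after correction for multiple testing" without naming the procedure. As a sketch of what such a correction can look like for 1103 tested SNPs, the following assumes a Bonferroni correction at a family-wise level of 0.05; both the method and the level are assumptions, and the p-values are invented.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests

def significant_indices(p_values, alpha=0.05):
    """Indices of the p-values that pass the Bonferroni threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [i for i, p in enumerate(p_values) if p < threshold]

# Threshold for the 1103 SNPs mentioned in the abstract (assumed alpha = 0.05).
threshold_1103 = bonferroni_threshold(0.05, 1103)

# Hypothetical p-values, for illustration only.
hits = significant_indices([1e-6, 0.02, 0.5])
```

Under these assumptions a SNP would need a p-value below roughly 4.5e-5 to count as confirmed. A less conservative procedure, such as false discovery rate control, would give a different cut-off.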