Refine
Year of publication
- 2022 (2)
Document Type
- Article (2)
Language
- English (2)
Has Fulltext
- yes (2)
Is part of the Bibliography
- no (2)
Keywords
- - (2)
- metabolomics (2)
- ExWAS (1)
- GWAS (1)
- R programming (1)
- annotation (1)
- association study (1)
- imputation (1)
- reproducible research (1)
- small-compound databases (1)
Institute
Publisher
- MDPI (2)
Metabolites are intermediates or end products of biochemical processes involved in both health and disease. Here, we take advantage of the well-characterized Cooperative Health Research in South Tyrol (CHRIS) study to perform an exome-wide association study (ExWAS) on absolute concentrations of 175 metabolites in 3294 individuals. To increase power, we imputed the identified variants into an additional 2211 genotyped individuals of CHRIS. In the resulting dataset of 5505 individuals, we identified 85 single-variant genetic associations, of which 39 have not been reported previously. Fifteen associations emerged at ten variants with >5-fold enrichment in CHRIS compared to non-Finnish Europeans reported in the gnomAD database. For example, the CHRIS-enriched ETFDH stop gain variant p.Trp286Ter (rs1235904433-hexanoylcarnitine) and the MCCC2 stop lost variant p.Ter564GlnextTer3 (rs751970792-carnitine) have been found in patients with glutaric acidemia type II and 3-methylcrotonylglycinuria, respectively, but the loci have not been associated with the respective metabolites in a genome-wide association study (GWAS) previously. We further identified three gene-trait associations, where multiple rare variants contribute to the signal. These results not only provide further evidence for previously described associations, but also describe novel genes and mechanisms for diseases and disease-related traits.
Liquid chromatography-mass spectrometry (LC-MS)-based untargeted metabolomics experiments have become increasingly popular because of the wide range of metabolites that can be analyzed and the possibility to measure novel compounds. LC-MS instrumentation and analysis conditions can differ substantially among laboratories and experiments, thus resulting in non-standardized datasets demanding customized annotation workflows. We present an ecosystem of R packages, centered around the MetaboCoreUtils, MetaboAnnotation and CompoundDb packages that together provide a modular infrastructure for the annotation of untargeted metabolomics data. Initial annotation can be performed based on MS1 properties such as m/z and retention times, followed by an MS2-based annotation in which experimental fragment spectra are compared against a reference library. Such reference databases can be created and managed with the CompoundDb package. The ecosystem supports data from a variety of formats, including, but not limited to, MSP, MGF, mzML, mzXML, netCDF as well as MassBank text files and SQL databases. Through its highly customizable functionality, the presented infrastructure allows to build reproducible annotation workflows tailored for and adapted to most untargeted LC-MS-based datasets. All core functionality, which supports base R data types, is exported, also facilitating its re-use in other R packages. Finally, all packages are thoroughly unit-tested and documented and are available on GitHub and through Bioconductor.