Age is the single biggest risk factor for most major human diseases. As such, understanding the intricate molecular changes that drive biological aging holds great promise in attempting to slow the onset of systemic diseases and thereby increase the effective health-span in modern societies.
This thesis explores several computational approaches to capture and analyze the molecular biological alterations triggered by intrinsic and extrinsic aging, using skin as a model tissue, in order to deliver genes and pathways as potential targets for intervention strategies.
Publication 1 demonstrates the utility of multi-omics data integration strategies for aging research, leading to the identification of four latent aging phases in skin tissue through an integrated cluster analysis of gene expression and DNA methylation data. The four phases improved the detection of molecular aging signals and were shown to be associated with sunbathing habits of the test subjects. Deeper analysis revealed extensive non-linear alterations in various biological pathways, particularly at the transition into the fourth aging phase, coinciding with menopause, with potentially wide-reaching functional implications.
Publication 2 describes the development of a novel type of age clock that provides a new level of interpretability by embedding biological pathway information in the architecture of an artificial neural network. The clock not only generates meaningful biological age estimates from gene expression data, but also allows simultaneous monitoring of the aging states of various biological processes through the activations of intermediate neurons. Analyses of the inner workings of the clock revealed a wide-spread impact of aging on the global pathway landscape. Simulation experiments using the transcriptomic clock recapitulated known functional aging gene associations and allowed deciphering of the pathways by which accelerated aging conditions such as chronic sun exposure and Hutchinson-Gilford progeria syndrome exert their effects.
Publication 3 further explores the molecular alterations caused by the pro-aging effector UV irradiation in the skin. The multi-omics data analysis of repetitively irradiated skin revealed signs of the immediate acquisition of aging- and cancer-related epigenetic signatures and concurrent wide-spread transcriptional changes across various biological processes. Investigations into the varying resilience to irradiation between subjects revealed prognostic biomarker signatures capable of predicting individual UV tolerances, with accuracies far surpassing the traditional Fitzpatrick classification scheme. Further analysis of the transcripts and pathways associated with UV tolerance identified a form of melanin-independent DNA damage protection in individuals with higher innate UV resilience.
Together, the approaches and findings described in this thesis explore several new angles to advance our understanding of aging processes and of external drivers of aging, such as UV irradiation, in human skin, and deliver new insights into the target genes and pathways involved.
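The idea behind the age clock of Publication 2, embedding pathway membership in the network architecture so that intermediate neurons can be read as pathway states, can be illustrated with a masked layer. The following is only a minimal sketch under stated assumptions: the abstract does not specify the actual architecture, activation function, or training procedure, and the gene names, pathway names, function names, and weights below are hypothetical placeholders.

```python
import numpy as np


def build_pathway_mask(genes, pathways):
    """Binary (n_genes, n_pathways) matrix: 1 where a gene belongs to a pathway."""
    gene_index = {gene: i for i, gene in enumerate(genes)}
    mask = np.zeros((len(genes), len(pathways)))
    for j, members in enumerate(pathways.values()):
        for gene in members:
            if gene in gene_index:
                mask[gene_index[gene], j] = 1.0
    return mask


def forward(expression, gene_to_pathway, mask, pathway_to_age, bias):
    """Return (age estimate, pathway activations) for a batch of samples.

    Multiplying the weights by the mask removes every connection between a
    gene and a pathway it is not a member of, so each intermediate neuron
    only sees the genes of "its" pathway.
    """
    pathway_activations = np.tanh(expression @ (gene_to_pathway * mask))
    age = pathway_activations @ pathway_to_age + bias
    return age, pathway_activations


if __name__ == "__main__":
    # Hypothetical toy annotation, not taken from the thesis.
    genes = ["GENE_A", "GENE_B", "GENE_C"]
    pathways = {"PATHWAY_1": ["GENE_A", "GENE_B"], "PATHWAY_2": ["GENE_C"]}
    mask = build_pathway_mask(genes, pathways)

    rng = np.random.default_rng(0)
    weights = rng.normal(size=mask.shape)        # untrained, illustrative only
    readout = rng.normal(size=len(pathways))
    sample = np.array([[0.2, 0.4, 0.1]])         # one sample, three genes
    age, activations = forward(sample, weights, mask, readout, bias=40.0)
    print("pathway activations:", activations)
    print("age estimate:", age)
```

In such a layout, changing the expression of a gene can only affect the neurons of the pathways that contain it, which is what makes the intermediate activations interpretable per pathway.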
We introduce a multi-step machine learning approach and use it to classify data from EEG-based brain computer interfaces. This approach works very well for high-dimensional EEG data. First, all features are divided into subgroups, and linear discriminant analysis is used to obtain a score for each subgroup. Linear discriminant analysis is then applied to subgroups of the resulting scores. This procedure is iterated until only one score remains, and this score is used for classification. In this way we avoid estimating the high-dimensional covariance matrix of all features. We investigate the classification performance with special attention to the small sample size case. For the normal model, we study the asymptotic error rate when the dimension p and the sample size n tend to infinity. This indicates how to define the sizes of the subgroups at each step. In addition, we present a theoretical error bound for the spatio-temporal normal model with separable covariance matrix, which results in a recommendation on how subgroups should be formed for this kind of data. Finally, some techniques, for example wavelets and independent component analysis, are used to extract features from some kinds of EEG-based brain computer interface data.
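The iterated procedure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the thesis implementation: it assumes two classes labelled 0 and 1, contiguous subgroups of a fixed size, a small ridge term for numerical stability, and a midpoint threshold on the final score. The function names and the synthetic data are hypothetical.

```python
import numpy as np


def fit_lda_direction(X, y, ridge=1e-6):
    """Fisher LDA weights for one small feature subgroup (labels 0/1)."""
    X0, X1 = X[y == 0], X[y == 1]
    mean_diff = X1.mean(axis=0) - X0.mean(axis=0)
    # Pooled within-class covariance of this subgroup only (small matrix).
    centered = np.vstack([X0 - X0.mean(axis=0), X1 - X1.mean(axis=0)])
    cov = centered.T @ centered / (len(X) - 2)
    cov += ridge * np.eye(X.shape[1])  # assumed stabiliser, not from the source
    return np.linalg.solve(cov, mean_diff)


def fit_multistep_lda(X, y, group_size):
    """Apply LDA to subgroups repeatedly until a single score remains."""
    if group_size < 2:
        raise ValueError("group_size must be at least 2")
    steps, scores = [], X
    while scores.shape[1] > 1:
        groups = [np.arange(start, min(start + group_size, scores.shape[1]))
                  for start in range(0, scores.shape[1], group_size)]
        step = [(idx, fit_lda_direction(scores[:, idx], y)) for idx in groups]
        steps.append(step)
        # One LDA score per subgroup becomes one feature of the next step.
        scores = np.column_stack([scores[:, idx] @ w for idx, w in step])
    final = scores[:, 0]
    threshold = 0.5 * (final[y == 0].mean() + final[y == 1].mean())
    return steps, threshold


def multistep_score(steps, X):
    """Push new samples through the fitted steps and return the final score."""
    scores = X
    for step in steps:
        scores = np.column_stack([scores[:, idx] @ w for idx, w in step])
    return scores[:, 0]


def predict(steps, threshold, X):
    """Classify as 1 when the final score exceeds the threshold."""
    return (multistep_score(steps, X) > threshold).astype(int)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p = 40, 64  # synthetic stand-in for EEG features
    X = np.vstack([rng.normal(0.0, 1.0, (n, p)), rng.normal(1.0, 1.0, (n, p))])
    y = np.repeat([0, 1], n)
    steps, threshold = fit_multistep_lda(X, y, group_size=4)
    print("number of steps:", len(steps))
    print("training accuracy:", (predict(steps, threshold, X) == y).mean())
```

With 64 features and subgroups of four, only 4 x 4 covariance matrices are ever estimated, which is the point of the construction: the full 64 x 64 covariance matrix never has to be estimated from a small sample.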
Humanity is plagued by many diseases. Besides environmental influences, many, if not all, diseases are also subject to genetic predisposition and then display molecular alterations such as proteomic or metabolic aberrations. The elucidation of the molecular principles underlying human diseases is one of the prime goals of biomedical research. To this end, there has been an advent of large-scale omics profiling studies. While the field of molecular biology has experienced tremendous development, data analysis remains a bottleneck. In the context of this thesis, we developed a number of analysis strategies for different types of omics data resulting from different experimental settings. These include approaches for association studies of plasma miRNAs and of time-resolved plasma omics data. Furthermore, we devised analyses of different RNA-Seq transcriptome profiling studies that cope with problems such as a lack of replicates or a multifactorial experimental design. We also designed machine learning frameworks for the identification of discriminatory biomolecular signatures from case-control or time-to-event data. All of the strategies mentioned above were developed and applied in the context of multi-disciplinary endeavours. They aided in the identification of plasma miRNAs associated with age, sex, and BMI as well as plasma miRNAs bearing potential as diagnostic biomarkers for non-alcoholic fatty liver disease (NAFLD). This thesis significantly contributed to a study demonstrating the utility of plasma miRNAs as prognostic biomarkers for major cardiovascular events such as ST-elevation myocardial infarction. Our approaches for analysing RNA-Seq data aided in the characterisation of murine models for Alzheimer's disease and of the transcriptional response of human gingival fibroblasts to ionizing radiation exposure.
Furthermore, the developed approaches were applied to study a human model for thyrotoxicosis and to successfully identify a multi-omics plasma biomarker signature of thyroid status. We are only beginning to understand the molecular principles underlying human diseases. The approaches and results presented in this thesis will contribute to an improved understanding of the biomolecular processes involved in common diseases such as Alzheimer's disease, NAFLD, and cardiovascular diseases.