Computing the logarithmic capacity of compact sets having (infinitely) many components with the charge simulation method (2023)

Liesen, Jörg ; Nasser, Mohamed M. S. ; Sète, Olivier

We apply the charge simulation method (CSM) in order to compute the logarithmic capacity of compact sets consisting of (infinitely) many “small” components. This application allows us to use just a single charge point for each component. The resulting method is therefore significantly more efficient than methods based on discretizations of the boundaries (for example, our own method presented in Liesen et al. (Comput. Methods Funct. Theory 17, 689–713, 2017)), while maintaining a very high level of accuracy. We study properties of the linear algebraic systems that arise in the CSM, and show how these systems can be solved efficiently using preconditioned iterative methods, where the matrix-vector products are computed using the fast multipole method. We illustrate the use of the method on generalized Cantor sets and the Cantor dust.

learnMSA: learning and aligning large protein families (2022)

Becker, Felix ; Stanke, Mario

Background: The alignment of large numbers of protein sequences is a challenging task and its importance grows rapidly along with the size of biological datasets. State-of-the-art algorithms have a tendency to produce less accurate alignments with an increasing number of sequences. This is a fundamental problem since many downstream tasks rely on accurate alignments. Results: We present learnMSA, a novel statistical learning approach of profile hidden Markov models (pHMMs) based on batch gradient descent. Fundamentally different from popular aligners, we fit a custom recurrent neural network architecture for (p)HMMs to potentially millions of sequences with respect to a maximum a posteriori objective and decode an alignment. We rely on automatic differentiation of the log-likelihood, and thus, our approach is different from existing HMM training algorithms like Baum–Welch. Our method does not involve progressive, regressive, or divide-and-conquer heuristics. We use uniform batch sampling to adapt to large datasets in linear time without the requirement of a tree. When tested on ultra-large protein families with up to 3.5 million sequences, learnMSA is both more accurate and faster than state-of-the-art tools. On the established benchmarks HomFam and BaliFam with smaller sequence sets, it matches state-of-the-art performance. All experiments were done on a standard workstation with a GPU. Conclusions: Our results show that learnMSA does not share the counterintuitive drawback of many popular heuristic aligners, which can substantially lose accuracy when many additional homologs are input. LearnMSA is a future-proof framework for large alignments with many opportunities for further improvements.

Geometric T-Duality: Buscher Rules in General Topology (2023)

Waldorf, Konrad

The classical Buscher rules describe T-duality for metrics and B-fields in a topologically trivial setting. On the other hand, topological T-duality addresses aspects of non-trivial topology while neglecting metrics and B-fields. In this article, we develop a new unifying framework for both aspects.

Defining Binary Phylogenetic Trees Using Parsimony (2022)

Fischer, Mareike

Phylogenetic (i.e., leaf-labeled) trees play a fundamental role in evolutionary research. A typical problem is to reconstruct such trees from data like DNA alignments (whose columns are often referred to as characters), and a simple optimization criterion for such reconstructions is maximum parsimony. It is generally assumed that this criterion works well for data in which state changes are rare. In the present manuscript, we prove that each binary phylogenetic tree T with n ≥ 20k leaves is uniquely defined by the set A_k(T), which consists of all characters with parsimony score k on T. This can be considered as a promising first step toward showing that maximum parsimony as a tree reconstruction criterion is justified when the number of changes in the data is relatively small.
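
To make the notion of a parsimony score concrete, the following is a minimal sketch of the standard Fitch algorithm, which computes the parsimony score of one character on a binary tree. It is not taken from the paper. The nested-tuple tree encoding, the dictionary representation of a character, and the function name fitch_score are assumptions chosen for illustration. The tree is written as rooted for convenience; the score does not depend on where the root is placed.

```python
# Illustrative sketch only: the tree encoding and names are hypothetical.
# A tree is either a leaf label (str) or a pair (left_subtree, right_subtree).
# A character maps every leaf label to a state.

def fitch_score(tree, character):
    """Return (candidate_states, score) for the subtree using Fitch's algorithm."""
    if isinstance(tree, str):
        # A leaf contributes its observed state and no changes.
        return {character[tree]}, 0
    left, right = tree
    left_states, left_score = fitch_score(left, character)
    right_states, right_score = fitch_score(right, character)
    common = left_states & right_states
    if common:
        # The children agree on at least one state: no extra change needed.
        return common, left_score + right_score
    # The children disagree: take the union and count one state change.
    return left_states | right_states, left_score + right_score + 1


tree = ((("a", "b"), "c"), ("d", "e"))
one_change = {"a": 0, "b": 0, "c": 1, "d": 1, "e": 1}
two_changes = {"a": 0, "b": 1, "c": 0, "d": 1, "e": 0}
print(fitch_score(tree, one_change)[1])
print(fitch_score(tree, two_changes)[1])
```

In this encoding, the character one_change would belong to A_1(T) and two_changes to A_2(T) for the example tree.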

Walsh’s Conformal Map onto Lemniscatic Domains for Polynomial Pre-images I (2022)

Schiefermayr, Klaus ; Sète, Olivier

We consider Walsh’s conformal map from the exterior of a compact set E ⊆ ℂ onto a lemniscatic domain. If E is simply connected, the lemniscatic domain is the exterior of a circle, while if E has several components, the lemniscatic domain is the exterior of a generalized lemniscate and is determined by the logarithmic capacity of E and by the exponents and centers of the generalized lemniscate. For general E, we characterize the exponents in terms of the Green’s function of E^c. Under additional symmetry conditions on E, we also locate the centers of the lemniscatic domain. For polynomial pre-images E = P^{-1}(Ω) of a simply-connected infinite compact set Ω, we explicitly determine the exponents in the lemniscatic domain and derive a set of equations to determine the centers of the lemniscatic domain. Finally, we present several examples where we explicitly obtain the exponents and centers of the lemniscatic domain, as well as the conformal map.

Galba: genome annotation with miniprot and AUGUSTUS (2023)

Brůna, Tomáš ; Li, Heng ; Guhlin, Joseph ; Honsel, Daniel ; Herbold, Steffen ; Stanke, Mario ; Nenasheva, Natalia ; Ebel, Matthis ; Gabriel, Lars ; Hoff, Katharina J.

Background: The Earth Biogenome Project has rapidly increased the number of available eukaryotic genomes, but most released genomes continue to lack annotation of protein-coding genes. In addition, no transcriptome data is available for some genomes. Results: Various gene annotation tools have been developed but each has its limitations. Here, we introduce GALBA, a fully automated pipeline that utilizes miniprot, a rapid protein-to-genome aligner, in combination with AUGUSTUS to predict genes with high accuracy. Accuracy results indicate that GALBA is particularly strong in the annotation of large vertebrate genomes. We also present use cases in insects, vertebrates, and a land plant. GALBA is fully open source and available as a docker image for easy execution with Singularity in high-performance computing environments. Conclusions: Our pipeline addresses the critical need for accurate gene annotation in newly sequenced genomes, and we believe that GALBA will greatly facilitate genome annotation for diverse organisms.

Global, highly specific and fast filtering of alignment seeds (2022)

Ebel, Matthis ; Migliorelli, Giovanna ; Stanke, Mario

Background: An important initial phase of arguably most homology search and alignment methods, such as those required for genome alignments, is seed finding. The seed finding step is crucial to curb the runtime as potential alignments are restricted to and anchored at the sequence position pairs that constitute the seed. To identify seeds, it is good practice to use sets of spaced seed patterns, a method that locally compares two sequences and requires exact matches at certain positions only. Results: We introduce a new method for filtering alignment seeds that we call geometric hashing. Geometric hashing achieves a high specificity by combining non-local information from different seeds using a simple hash function that only requires a constant and small amount of additional time per spaced seed. Geometric hashing was tested on the task of finding homologous positions in the coding regions of human and mouse genome sequences. Thereby, the number of false positives was decreased about a million-fold over sets of spaced seeds while maintaining a very high sensitivity. Conclusions: An additional geometric hashing filtering phase could improve the run-time, accuracy or both of programs for various homology-search-and-align tasks.
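
As a rough illustration of the spaced seed idea mentioned in this abstract (exact matches required at certain positions only), the sketch below finds all position pairs at which two sequences agree under one spaced seed pattern. It shows plain spaced-seed matching, not the geometric hashing filter the paper introduces. The pattern notation ("1" for a position that must match, "0" for a position that is ignored) and the function names masked and spaced_seed_hits are assumptions made for this example.

```python
from collections import defaultdict

# Illustrative sketch only: names and pattern notation are hypothetical.

def masked(seq, start, pattern):
    """Keep only the characters of the window that sit at '1' positions."""
    return "".join(seq[start + k] for k, c in enumerate(pattern) if c == "1")


def spaced_seed_hits(a, b, pattern):
    """Return all (i, j) where a[i:] and b[j:] match at every '1' position."""
    span = len(pattern)
    # Index every window of sequence a by its masked content.
    index = defaultdict(list)
    for i in range(len(a) - span + 1):
        index[masked(a, i, pattern)].append(i)
    # Look up every window of sequence b in that index.
    hits = []
    for j in range(len(b) - span + 1):
        for i in index.get(masked(b, j, pattern), []):
            hits.append((i, j))
    return hits


# The sequences differ at the third position, which the pattern ignores.
print(spaced_seed_hits("ACGTAC", "ACTTAC", "1101"))
# A contiguous pattern of the same span finds no seed here.
print(spaced_seed_hits("ACGTAC", "ACTTAC", "1111"))
```

A filter such as the one described in the abstract would operate on the resulting seed pairs; how it combines them is not reproduced here.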

Relevance and Regulation of Alternative Splicing in Plant Heat Stress Response: Current Understanding and Future Directions (2022)

Rosenkranz, Remus R. E. ; Ullrich, Sarah ; Löchli, Karin ; Simm, Stefan ; Fragkostefanakis, Sotirios

Alternative splicing (AS) is a major mechanism for gene expression in eukaryotes, increasing proteome diversity but also regulating transcriptome abundance. High temperatures have a strong impact on the splicing profile of many genes and therefore AS is considered an integral part of the heat stress response. While many studies have established a detailed description of the diversity of the RNAome under heat stress in different plant species and stress regimes, little is known about the underlying mechanisms that control this temperature-sensitive process. AS is mainly regulated by the activity of splicing regulators. Changes in the abundance of these proteins through transcription and AS, post-translational modifications and interactions with exonic and intronic cis-elements and core elements of the spliceosomes modulate the outcome of pre-mRNA splicing. As a major part of pre-mRNAs are spliced co-transcriptionally, the chromatin environment along with the RNA polymerase II elongation play a major role in the regulation of pre-mRNA splicing under heat stress conditions. Despite its importance, our understanding of the regulation of heat stress sensitive AS in plants is scarce. In this review, we summarize the current status of knowledge on the regulation of AS in plants under heat stress conditions. We discuss possible implications of different pathways based on results from non-plant systems to provide a perspective for researchers who aim to elucidate the molecular basis of AS under high temperatures.

Identifying Differentially Expressed MicroRNAs, Target Genes, and Key Pathways Deregulated in Patients with Liver Diseases (2020)

Gholizadeh, Maryam ; Szelag-Pieniek, Sylwia ; Post, Mariola ; Kurzawski, Mateusz ; Prieto, Jesus ; Argemi, Josepmaria ; Drozdzik, Marek ; Kaderali, Lars

Liver diseases are important causes of morbidity and mortality worldwide. The aim of this study was to identify differentially expressed microRNAs (miRNAs), target genes, and key pathways as innovative diagnostic biomarkers in liver patients with different pathology and functional state. We determined, using RT-qPCR, the expression of 472 miRNAs in 125 explanted livers from subjects with six different liver pathologies and from control livers. ANOVA was employed to obtain differentially expressed miRNAs (DEMs), and miRDB (MicroRNA target prediction database) was used to predict target genes. A miRNA–gene differential regulatory (MGDR) network was constructed for each condition. Key miRNAs were detected using topological analysis. Enrichment analysis for DEMs was performed using the Database for Annotation, Visualization, and Integrated Discovery (DAVID). We identified important DEMs common and specific to the different patient groups and disease progression stages. hsa-miR-1275 was universally downregulated regardless of the disease etiology and stage, while hsa-let-7a*, hsa-miR-195, hsa-miR-374, and hsa-miR-378 were deregulated. The most significantly enriched pathways of target genes controlled by these miRNAs comprise p53 tumor suppressor protein (TP53)-regulated metabolic genes, and those involved in regulation of methyl-CpG-binding protein 2 (MECP2) expression, phosphatase and tensin homolog (PTEN) messenger RNA (mRNA) translation and copper homeostasis. Our findings show a novel panel of deregulated miRNAs in the liver tissue from patients with different liver pathologies. These miRNAs hold potential as biomarkers for diagnosis and staging of liver diseases.

NOD-Like Receptors: Guards of Cellular Homeostasis Perturbation during Infection (2021)

Pei, Gang ; Dorhoi, Anca

The innate immune system relies on families of pattern recognition receptors (PRRs) that detect distinct conserved molecular motifs from microbes to initiate antimicrobial responses. Activation of PRRs triggers a series of signaling cascades, leading to the release of pro-inflammatory cytokines, chemokines and antimicrobials, thereby contributing to the early host defense against microbes and regulating adaptive immunity. Additionally, PRRs can detect perturbation of cellular homeostasis caused by pathogens and fine-tune the immune responses. Among PRRs, nucleotide binding oligomerization domain (NOD)-like receptors (NLRs) have attracted particular interest in the context of cellular stress-induced inflammation during infection. Recently, mechanistic insights into the monitoring of cellular homeostasis perturbation by NLRs have been provided. We summarize the current knowledge about the disruption of cellular homeostasis by pathogens and focus on NLRs as innate immune sensors for its detection. We highlight the mechanisms employed by various pathogens to elicit cytoskeleton disruption, organelle stress as well as protein translation block, point out exemplary NLRs that guard cellular homeostasis during infection and introduce the concept of stress-associated molecular patterns (SAMPs). We postulate that integration of information about microbial patterns, danger signals, and SAMPs endows the innate immune system with adequate plasticity and precision in elaborating responses to microbes of variable virulence.
