Project

Cell type-specific expression of 3’ untranslated region isoforms: quantification, modeling, and prediction of functional impact

English title Cell type-specific expression of 3’ untranslated region isoforms: quantification, modeling, and prediction of functional impact
Applicant Zavolan Mihaela
Number 189063
Funding scheme Project funding
Research institution Biozentrum der Universität Basel
Institution of higher education University of Basel - BS
Main discipline Molecular Biology
Start/End 01.01.2020 - 31.12.2023
Approved amount 800'000.00

Keywords (5)

PWM; 3' UTR; modeling; single cell; poly(A) site

Lay Summary (translated from German)

Lead
The aim of our work is to uncover new forms of messenger RNAs that are produced with very high specificity and thus help to identify subtypes of human cells, both normal and disease-associated.
Lay summary

A major challenge of modern biology is to work out how the many cell types in the body are generated from a single cell, based on the same genome sequence, solely through the regulation of gene expression. Newly developed technologies make it possible to identify a relatively large fraction of the RNAs present in individual human cells, and large-scale initiatives have therefore been launched to build a "Human Cell Atlas" of all cell types in the human body. However, a human gene can typically produce more than one type of RNA, and it is generally unclear which RNA is produced in a given cell type and why. Yet it is precisely this diversity that may enable reliable identification of cell types based on RNA sequencing. The aim of our project is to develop a coherent set of synergistic methods to systematically annotate cell type-specific mRNA isoforms that arise from the use of alternative polyadenylation sites, to quantify their usage at the single-cell level, and to model the regulation of cell type-specific isoform expression. By applying these tools to data from large-scale initiatives, e.g. for personalized medicine, we aim to link the relative expression of alternative isoforms to the morphology and behavior of individual cell types, provide new disease markers, and support the development of individualized therapies.

Last update: 15.11.2019

Lay Summary (English)

Lead
The goal of our work is to uncover new forms of messenger RNAs that are produced with very high specificity and thereby help identify subtypes of human cells, both normal and disease-associated.
Lay summary

A central challenge of modern biology is to uncover how the many cell types in the body are generated from a single cell, based on the same genome sequence, through the regulation of gene expression. Recently developed technologies make it possible to identify a relatively large fraction of the RNAs present in individual human cells, and large-scale initiatives have therefore been started to construct a ‘human cell atlas’ of all cell types in the human body. However, a human gene can typically give rise to more than one type of RNA, and it is generally unclear which RNA is generated in a specific cell type and why. Nevertheless, it is precisely this diversity that may allow reliable identification of cell types based on RNA sequencing. The goal of this project is to develop a coherent set of synergistic methods to systematically annotate cell type-specific mRNA isoforms that result from the use of alternative polyadenylation sites, to quantify their usage at the single-cell level, and to model the regulation of cell type-specific isoform expression. By applying these tools to data from large-scale initiatives, e.g. for personalized health, we aim to link the relative expression of alternative isoforms to the morphology and behavior of individual cell types, provide new disease markers, and support the development of individualized therapies.

Last update: 15.11.2019

Associated projects

Number | Title | Start | Funding scheme
170216 | Regulation of mRNA translation and its relationship with disease processes | 01.10.2016 | Project funding
204517 | Modeling multi-layer large-scale data to decipher the translational regulatory code of cellular functions | 01.05.2022 | Project funding
141735 | NCCR RNA & disease: Understanding the role of RNA biology in disease mechanisms (phase I) | 01.05.2014 | National Centres of Competence in Research (NCCRs)

Abstract

The human body contains hundreds of cell types. A central challenge of modern biology is therefore to uncover the principles that guide the formation and maintenance of this complex cellular landscape, starting from a single cell and based on the same genome sequence. Large-scale initiatives such as the ‘human cell atlas’ are ongoing, aiming to comprehensively map cell states with a variety of technologies, including single-cell sequencing. However, although most human genes express multiple isoforms, differing primarily in transcription start and termination sites, expression is mostly described at the ‘gene level’. Furthermore, the annotation of regulatory elements and of their specific involvement in a particular cell remains sparse.

The term alternative cleavage and polyadenylation (APA) describes the tissue-dependent choice of a poly(A) site among the many available within a gene. APA leads to the expression of isoforms that differ in their protein-coding sequences (CDS) and/or in their 3’ untranslated regions (3’ UTRs). During animal evolution, 3’ UTRs have expanded from ~140 nucleotides in the worm Caenorhabditis elegans to 1-2 kilobases in humans, suggesting a parallel increase in the complexity of post-transcriptional regulation. Indeed, there is increasing evidence that APA contributes to cell identity. For instance, it has been reported that the cell type-specific expression pattern of single-3’ UTR genes is due primarily to transcriptional regulation, whereas that of multi-UTR genes is due rather to changes in isoform ratios. Because 3’ UTR isoforms differ in relative stability, localization, and translation rate, APA-dependent remodeling of 3’ UTRs has many far-reaching consequences for cell physiology. A very surprising finding has been that transcripts with distinct 3’ UTRs can direct the localization of the (same) encoded protein to distinct cellular compartments.
Although these findings made clear that much of the diversity of cellular phenotypes could come from differences in APA-dependent isoform expression, this possibility has received relatively little consideration. Having established unique resources and methods in the field of APA, I here propose to develop a coherent set of synergistic methods to systematically annotate cell type-specific poly(A) sites and isoforms, quantify their usage at the single-cell level, and model the regulation of cell type-specific isoform expression. Because the broadly used 10x Genomics technology for single-cell sequencing captures precisely the 3’ ends of transcripts, we will exploit such data in combination with comprehensive genome-wide annotations of alternative poly(A) sites to quantify APA in single cells. We will develop new methods to systematically infer regulatory motifs for a large set of RNA-binding proteins (RBPs) using extensive eCLIP data from ENCODE. Using these RBP motifs in combination with RNA sequencing data from The Cancer Genome Atlas, the human cell atlas, and other consortia, we will model poly(A) site usage in terms of configurations of RBP binding sites, to infer how cell type-dependent processing of poly(A) sites is regulated in physiological and pathological conditions. Finally, we will develop methods to assess the downstream effects of APA on RNA stability, localization, and translation. The resources generated in this project will be combined with data from large-scale initiatives, e.g. for personalized health, to link isoform expression to cellular morphology and phenotype and to support the identification of biomarkers and the development of individualized therapies.
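
As an illustration of what quantifying poly(A) site usage from 3’ end data can mean in practice, the following is a minimal sketch and not the project's actual method. It assumes that read 3’ end positions have already been extracted per cell group and gene, that annotated poly(A) sites are given as sorted coordinates, and that a read is assigned to the nearest annotated site within a 25-nucleotide window. The function names, the window size, and the toy data are all hypothetical; a real pipeline would also have to handle strand, UMIs, and internal priming.

```python
from bisect import bisect_left
from collections import defaultdict


def assign_to_site(pos, sites, window=25):
    """Return the annotated site nearest to pos, or None if none lies within window.

    sites must be a sorted list of coordinates (hypothetical annotation).
    """
    i = bisect_left(sites, pos)
    # Only the neighbours on either side of the insertion point can be nearest.
    candidates = sites[max(i - 1, 0):i + 1]
    if not candidates:
        return None
    best = min(candidates, key=lambda s: abs(s - pos))
    return best if abs(best - pos) <= window else None


def site_usage(reads, sites_by_gene, window=25):
    """Compute relative poly(A) site usage per (cell group, gene).

    reads: iterable of (cell_group, gene, three_prime_end_position).
    Returns {(cell_group, gene): {site: fraction of assigned reads}}.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for group, gene, pos in reads:
        site = assign_to_site(pos, sites_by_gene.get(gene, []), window)
        if site is not None:  # reads far from any annotated site are ignored
            counts[(group, gene)][site] += 1
    usage = {}
    for key, per_site in counts.items():
        total = sum(per_site.values())
        usage[key] = {site: n / total for site, n in per_site.items()}
    return usage


# Toy data (invented): one gene with a proximal and a distal poly(A) site.
sites_by_gene = {"GENE_A": [1000, 2500]}
reads = [
    ("T_cell", "GENE_A", 1005), ("T_cell", "GENE_A", 998),
    ("T_cell", "GENE_A", 1010), ("T_cell", "GENE_A", 2495),
    ("T_cell", "GENE_A", 1800),  # not near any annotated site
    ("neuron", "GENE_A", 2501), ("neuron", "GENE_A", 2490),
    ("neuron", "GENE_A", 2510), ("neuron", "GENE_A", 1002),
]
usage = site_usage(reads, sites_by_gene)
```

In this toy example the first group mostly uses the proximal site and the second mostly the distal one, which is the kind of shift in isoform ratios that the abstract refers to as cell type-specific APA.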