Regulatory Genomics and Systems Biology

News & updates

Discovering and understanding oncogenic gene fusions through data intensive computational approaches.

Although gene fusions have been recognized as important drivers of cancer for decades, our understanding of the prevalence and function of gene fusions has been revolutionized by the rise of next-generation sequencing, advances in bioinformatics theory and an increasing capacity for large-scale computational biology. The computational work on gene fusions has been vastly diverse, and the present state of the literature is fragmented. It will be fruitful to merge three camps of gene fusion bioinformatics that appear to rarely cross over: (i) data-intensive computational work characterizing the molecular biology of gene fusions; (ii) development research on fusion detection tools, candidate fusion prioritization algorithms and dedicated fusion databases; and (iii) clinical research that seeks to either therapeutically target fusion transcripts and proteins or leverage advances in detection tools to perform large-scale surveys of gene fusion landscapes in specific cancer types. In this review, we unify these different, yet highly complementary and symbiotic, approaches with the view that increased synergy will catalyze advancements in gene fusion identification, characterization and significance evaluation. The paper by Natasha Latysheva and M. Madan Babu can be found here.

Affinity and competition for TBP are molecular determinants of gene expression noise

Cell-to-cell variation in gene expression levels (noise) generates phenotypic diversity and is an important phenomenon in evolution, development and disease. TATA-box binding protein (TBP) is an essential factor that is required at virtually every eukaryotic promoter to initiate transcription. While the presence of a TATA-box motif in the promoter has been strongly linked with noise, the molecular mechanism driving this relationship is less well understood. Through an integrated analysis of multiple large-scale data sets, computer simulation and experimental validation in yeast, we provide molecular insights into how noise arises as an emergent property of variable binding affinity of TBP for different promoter sequences, competition between interaction partners to bind the same surface on TBP (to either promote or disrupt transcription initiation) and variable residence times of TBP complexes at a promoter. These determinants may be fine-tuned under different conditions and during evolution to modulate eukaryotic gene expression noise. The paper by Charles Ravarani et al. can be found here.

Probing Gαi1 protein activation at single-amino acid resolution

We present comprehensive maps at single-amino acid resolution of the residues stabilizing the human Gαi1 subunit in nucleotide- and receptor-bound states. We generated these maps by measuring the effects of alanine mutations on the stability of Gαi1 and the rhodopsin-Gαi1 complex. We identified stabilization clusters in the GTPase and helical domains responsible for structural integrity and the conformational changes associated with activation. In activation cluster I, helices α1 and α5 pack against strands β1-β3 to stabilize the nucleotide-bound states. In the receptor-bound state, these interactions are replaced by interactions between α5 and strands β4-β6. Key residues in this cluster are Y320, which is crucial for the stabilization of the receptor-bound state, and F336, which stabilizes nucleotide-bound states. Destabilization of helix α1, caused by rearrangement of this activation cluster, leads to the weakening of the interdomain interface and release of GDP. The paper by Dawei Sun, Tilman Flock et al. can be viewed here.

Variable Glutamine-Rich Repeats Modulate Transcription Factor Activity

Excessive expansions of glutamine (Q)-rich repeats in various human proteins are known to result in severe neurodegenerative disorders such as Huntington’s disease and several ataxias. However, the physiological role of these repeats and the consequences of more moderate repeat variation remain unknown. Here, we demonstrate that Q-rich domains are highly enriched in eukaryotic transcription factors where they act as functional modulators. Incremental changes in the number of repeats in the yeast transcriptional regulator Ssn6 (Cyc8) result in systematic, repeat-length-dependent variation in expression of target genes that result in direct phenotypic changes. The function of Ssn6 increases with its repeat number until a certain threshold where further expansion leads to aggregation. Quantitative proteomic analysis reveals that the Ssn6 repeats affect its solubility and interactions with Tup1 and other regulators. Thus, Q-rich repeats are dynamic functional domains that modulate a regulator’s innate function, with the inherent risk of pathogenic repeat expansions. The paper by Rita Gemayel, Sreenivas Chavali et al can be viewed here.

Universal allosteric mechanism for Gα activation by GPCRs

G protein-coupled receptors (GPCRs) allosterically activate heterotrimeric G proteins and trigger GDP release. Given that there are 800 human GPCRs and 16 different Gα genes, this raises the question of whether a universal allosteric mechanism governs Gα activation. Here we show that different GPCRs interact with and activate Gα proteins through a highly conserved mechanism. Comparison of Gα with the small G protein Ras reveals how the evolution of short segments that undergo disorder-to-order transitions can decouple regions important for allosteric activation from receptor binding specificity. This might explain how the GPCR–Gα system diversified rapidly, while conserving the allosteric activation mechanism. The paper by Tilman Flock et al. can be viewed here.

Sequence composition of disordered regions fine-tunes protein half-life

The proteasome controls the concentrations of most proteins in eukaryotic cells. It recognizes its protein substrates through ubiquitin tags and initiates degradation at disordered regions within the substrate. Here we show that the proteasome has pronounced preferences for the amino acid sequence of the regions at which it initiates degradation. Specifically, proteins in which the initiation regions have biased amino acid compositions show longer half-lives in yeast than proteins with unbiased sequences in these regions. The relationship is also observed on a genomic scale in mouse cells. These preferences affect the degradation rates of proteins in vitro, can explain the unexpected stability of natural proteins in yeast and may affect the accumulation of toxic proteins in disease. We propose that the proteasome’s sequence preferences provide a second component to the degradation code and may fine-tune protein half-life in cells. The paper by Susan Fishbain, Sreenivas Chavali et al. can be viewed here.

Proteome response at the edge of protein aggregation

Proteins adopt defined structures and are crucial to most cellular functions. Their misfolding and aggregation are associated with numerous degenerative human disorders such as type II diabetes, Huntington’s or Alzheimer’s diseases. Here, we aim to understand why cells promote the formation of protein foci. Comparison of two amyloid-β-peptide variants, mostly insoluble but differently recruited by the cell (inclusion body versus diffused), reveals small differences in cell fitness and proteome response. We suggest that the levels of oxidative stress act as a sensor to trigger protein recruitment into foci. Our data support a common cytoplasmic response being able to discern and react to the specific properties of polypeptides. The paper by Natalia Sanchez de Groot can be viewed here.

Structured and disordered facets of the GPCR fold

The seven-transmembrane (7TM) helix fold of G-protein coupled receptors (GPCRs) has been adapted for a wide variety of physiologically important signaling functions. Here, we discuss the diversity in the structured and disordered regions of GPCRs based on the recently published crystal structures and sequence analysis of all human GPCRs. A comparison of the structures of rhodopsin-like receptors (class A), secretin-like receptors (class B), metabotropic receptors (class C) and frizzled receptors (class F) shows that the relative arrangement of the transmembrane helices is conserved across all four GPCR classes, although individual receptors can be activated by ligand binding at varying positions within and around the transmembrane helical bundle. A systematic analysis of GPCR sequences reveals the presence of disordered segments on the cytoplasmic side, abundant post-translational modification sites, evidence for alternative splicing and several putative linear peptide motifs that have the potential to mediate interactions with cytosolic proteins. While the structured regions permit the receptor to bind diverse ligands, the disordered regions appear to have an underappreciated role in modulating downstream signaling in response to the cellular state. An integrated paradigm combining the knowledge of structured and disordered regions is imperative for gaining a holistic understanding of the GPCR (un)structure-function relationship. The paper by AJ Venkatakrishnan et al. can be viewed here.
