"Towards predicting gene expression from DNA sequence"
Tuesday, November 12, 2019 - 13:00
School of Life Sciences, MSI Small Lecture Theatre
Dr Kasper Rasmussen
Professor Jussi Taipale
University of Cambridge
Abstract: Understanding the information encoded in the human genome requires two genetic codes: the first specifies how mRNA sequence is converted to protein sequence, and the second determines where and when the mRNAs are expressed. Although the proteins that read the second, regulatory code – transcription factors (TFs) – have been largely identified, the code is poorly understood because it is not known which sequences TFs can bind in the genome. To understand the regulatory code, we have analyzed the sequence-specific binding of TFs to unmodified and epigenetically modified DNA in the presence and absence of nucleosomes, using several different methods. Our findings indicate that DNA commonly mediates interactions between TFs, and that dimer formation changes the binding preferences of TFs. We also found that CpG methylation has both negative and positive effects on TF binding. The effect of nucleosomes is largely negative, but several TFs from diverse structural families can access nucleosomal DNA using five distinct binding modes. Despite extensive knowledge of TF binding preferences, reading the regulatory code remains a challenge. To identify the sources of this problem, we have taken a multiomic approach, performing several experiments that bridge the gap between in vivo analyses such as massively parallel reporter assays and in vitro studies such as HT-SELEX. A binding model required to understand how TFs bind to the genome will be discussed; it incorporates information about cellular TF DNA binding and transcriptional activity, protein-protein interactions induced by DNA, and the inheritance of epigenetic states across cell division.