Seminar: A new perspective on enhancer architecture and 3D structural modeling of whole interactomes for genomics studies

Speaker: Prof. Haiyuan Yu 
         Cornell University
Host: Prof. ZHANG Zhihua
Time: 10:00-12:00, July 25th, 2023
Location: The first floor conference room, BIG, CAS/CNCB

Speaker introduction:
Haiyuan Yu is the Tisch University Professor in the Department of Computational Biology and the Weill Institute for Cell and Molecular Biology, and the founding Director of the Center for Innovative Proteomics (CIP) at Cornell University. Yu's research spans the broad area of Network Systems Biology. The Yu group uses integrated computational-experimental systems biology approaches to determine protein interactions and complex structures on the scale of the whole cell. In particular, his group focuses on protein-protein and gene regulatory networks and seeks to understand how such intricate systems evolve and how their perturbations lead to human disease, especially Autism Spectrum Disorder and cancer. Towards these goals, Haiyuan led his group to develop the concept of “3D structurally-resolved interactome networks”, in which they integrate multi-scale structural modeling, machine learning, and high-throughput genomics/proteomics experiments to determine protein interactions and their binding interfaces on the whole-proteome scale. More recently, in close collaboration with John Lis and his group, the Yu group demonstrated that enhancer RNAs (eRNAs) detected by the novel PRO-cap assay are a critical mark for active enhancers genome-wide. Among RNA-sequencing assays, PRO-cap has great sensitivity and specificity for detecting eRNAs (and thus active enhancers) across the whole genome at high resolution.

Abstract:
Recent studies have shown that both enhancers and promoters can recruit RNA pol II and initiate transcription. The short half-life of enhancer RNAs (eRNAs) makes detection of distal initiation events challenging. Through systematic comparison of RNA sequencing assays, we find that the nascent transcriptome assays PRO-cap and PRO-seq have great sensitivity and specificity in detecting eRNA transcription genome-wide. In fact, we find that divergent transcription of eRNAs is a critical mark for all active enhancers genome-wide. Moreover, nascent transcription precisely delineates the sequence architecture of enhancers, whereby transcription start sites (TSSs) serve as critical anchors in revealing motif positioning within enhancers and their boundaries. By leveraging our high-precision, high-sensitivity nascent transcriptome assays PRO-cap and PRO-seq, we mapped the active transcriptional regulatory landscape across ~200 tissue and cell types of the human body with unprecedented resolution and depth.
Currently, fewer than 10% of all known human protein interactions have any structural information. To address this gap, we developed a unified deep learning framework to create a multiscale full-coverage structural interactome for all known human protein interactions reported in the literature. Furthermore, we developed a novel computational algorithm, named NetFlow3D, that integrates spatial cluster identification with a 3D structurally-informed protein network model to create a multiscale functional map of somatic mutations in cancer. By applying NetFlow3D to 415,017 somatic protein-altering mutations from 5,950 TCGA tumors across 19 cancer types, we identified 1,656 intra-protein and 3,343 inter-protein mutation clusters, of which ~50% would not have been found using only experimentally determined protein structures.