Ye Zheng

FreeHi-C: high fidelity Hi-C data simulation for benchmarking and data augmentation

The ability to simulate realistic high-throughput chromatin conformation capture (Hi-C) data is foundational for developing and benchmarking statistical and computational methods for Hi-C data analysis. We propose FreeHi-C, a data-driven Hi-C simulator for simulating and augmenting Hi-C datasets. FreeHi-C employs a non-parametric strategy to estimate the interaction distribution of genome fragments from a given sample and simulates Hi-C reads from the interacting fragments. Data from FreeHi-C exhibit higher fidelity to biological Hi-C data than data from other tools in its class. FreeHi-C not only enables benchmarking of a wide range of Hi-C analysis methods but also, through data augmentation, boosts the precision and power of differential chromatin interaction detection methods while preserving false discovery rate control.
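
To make the idea of a non-parametric, data-driven simulation concrete, here is a minimal sketch that resamples fragment pairs in proportion to their observed contact counts. It is only an illustration of empirical resampling, not FreeHi-C's actual procedure: the function name `simulate_contacts`, the fragment labels, and the counts are all hypothetical, and the sketch stops at contact counts rather than generating reads.

```python
import random
from collections import Counter


def simulate_contacts(observed_counts, n_reads, seed=0):
    """Draw n_reads fragment pairs from the empirical interaction distribution.

    observed_counts maps a (fragment_a, fragment_b) pair to its observed
    contact count. No parametric model is fitted; the observed counts
    themselves serve as the sampling weights.
    """
    rng = random.Random(seed)
    pairs = list(observed_counts)
    weights = [observed_counts[pair] for pair in pairs]
    draws = rng.choices(pairs, weights=weights, k=n_reads)
    return Counter(draws)


# Hypothetical observed contact counts between three fragments.
observed = {
    ("frag1", "frag2"): 50,
    ("frag1", "frag3"): 5,
    ("frag2", "frag3"): 20,
}

# Simulate a replicate at the same sequencing depth as the observed sample.
simulated = simulate_contacts(observed, n_reads=sum(observed.values()))
print(simulated)
```

Changing `n_reads` gives replicates at a different sequencing depth, and changing `seed` gives independent replicates of the same sample, which is the sense in which such a simulator can be used for both benchmarking and data augmentation.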

Workshop: 3D Genomics and Long-range Gene Regulations

Chromatin is dynamically organized within the three-dimensional nuclear space in a way that allows efficient genome packaging while ensuring proper expression and replication of the genetic material. Understanding genome architecture and its relationship to genomic function is therefore vital, and this understanding has progressed with the advancement of new technologies. Recently developed chromatin conformation capture (3C)-based assays have enabled the study of three-dimensional chromosomal architecture in a high-throughput fashion. Hi-C, in particular, has elucidated genome-wide long-range interactions among loci. In this workshop, we will go through state-of-the-art 3D genomics technologies and focus on the role of statistical methods and computational tools in analyzing 3D genomics data. Successfully running the complete pipeline and all software is not strictly required; instead, we will concentrate on the inference and interpretation of the results.

Keywords: three-dimensional chromatin organization, long-range gene regulation, statistical genomics analysis, computational tools

Requirements: You will need to bring your laptop with the latest versions of R and Python installed. Participants should be comfortable running commands in a terminal and have basic knowledge of statistics. Participants are not expected to have any knowledge of 3D genomics.

Relevance: This workshop is relevant to anyone interested in learning about three-dimensional chromatin structure from both biotechnological and quantitative perspectives. The target audience includes anyone who has come across 3C-based assays such as 3C, 4C, 5C, Hi-C, ChIA-PET, and HiChIP in the literature and wants to learn more about them systematically. If you are simply curious about three-dimensional chromatin structure and want to see some advanced experimental technologies and quantitative analysis tools, this workshop is also right for you!

Ye Zheng

PhD, University of Wisconsin – Madison
Postdoctoral Research Fellow, Fred Hutchinson Cancer Research Center

I received my B.E. in Statistics from Renmin University of China before starting my doctoral training at the University of Wisconsin – Madison in Fall 2014. I received my Ph.D. in Statistics with a doctoral minor in Quantitative Biology under the supervision of Professor Sündüz Keleş in August 2019. I am inherently drawn to problems at the interface of the statistical, biological, and biomedical sciences. My research thus far has concentrated on developing statistical and computational methods for studying three-dimensional chromatin organization and long-range regulatory interactions in DNA. I will join Professor Raphael Gottardo’s group at Fred Hutchinson Cancer Research Center as a Postdoctoral Research Fellow to further investigate genomic regulation mechanisms from the single-cell perspective.