About

I am a Research Fellow at the Margaret Turner Warwick Centre for Fibrosing Lung Disease at the National Heart and Lung Institute, Imperial College London. My current research focuses on understanding the functional consequences of genetic associations with interstitial lung diseases such as idiopathic pulmonary fibrosis (IPF) and familial pulmonary fibrosis (FPF).

I received a bachelor’s degree in mathematics from Shandong University in 2006 and a PhD in bioinformatics from the Beijing Institute of Genomics of the Chinese Academy of Sciences in 2011. During my doctoral studies, I developed approaches for measuring protein-coding sequence variation and explored the evolutionary trajectory of human introns.

After graduation, I was appointed Assistant Professor in 2011 and promoted to Associate Professor in 2014 at the same institute. During this period, I expanded my research interests to gene order, repetitive sequences and genome complexity, applying a newly established analytical framework to a very large number of genomes. This work led to the creation of several databases and web servers, such as RGKbase, KGCAK, Plastid-LCGbase and LCGserver.

In 2014, I moved to the UK to work as a Research Associate at the Cancer Institute at University College London, where I concentrated mainly on building data analysis pipelines for next-generation sequencing data, with application to leukaemia.

In 2016, I worked as a Postdoctoral Research Assistant in the Department of Plant Sciences at the University of Oxford. My research used RNA-Seq to assemble reference transcriptomes de novo and to characterise the differing gene expression patterns and networks in three tissues (bark, xylem and leaf) that underpin drought resistance mechanisms in two groups of conifer species, namely Pinaceae and Cupressaceae. In addition, I developed hppRNA, a streamlined pipeline tool for efficient and convenient RNA-Seq data analysis, and established IntronDB and GCevobase, which provide intron features and compositional properties, respectively, with data visualisation for thousands of eukaryotic genomes.

In 2018, I started a new role as a Senior Bioinformatics Research Officer at LeedsOmics at the University of Leeds. My primary responsibilities were to manage the day-to-day running of the LeedsOmics Institute; to organise and coordinate LeedsOmics activities such as annual symposia, research seminar series, training workshops and coding clubs; and to provide data analysis and training in bioinformatics and other cutting-edge omics technologies for the LeedsOmics community. In collaboration with other research groups, I provided bioinformatics support and input for projects that used a broad range of multi-omics techniques, including genomics (DNA-Seq, exome-Seq, phylogenomics and RAD-Seq), transcriptomics (bulk RNA-Seq and single-cell RNA-Seq), translatomics (Ribo-Seq), epigenomics (BS-Seq and ChIP-Seq), proteomics and metabolomics.

From 2021 to 2022, I spent a year as a Senior Bioinformatician in Integrative Analysis with the COVID-19 Multi-omics Blood ATlas (COMBAT) consortium at the Wellcome Centre for Human Genetics at the University of Oxford. The work combined multi-omics techniques with cutting-edge bioinformatic approaches and statistical methods to reveal the pathogenesis of COVID-19, stratify patients and inform treatment strategy based on molecular information. In particular, I managed the data warehousing, developed and maintained COMBATdb and the COMBAT consortium website, and supported data integration across multiple modalities.