Research Publication

The 1000 Chinese Pangenome empowers medical and population genetics.

Wang Yifei, Y Duan, Zhongqu Z et al.

PubMed ID: 41922767
Authors: 31
Published: 2026-04-01
Chapter I

Publication Details

Comprehensive information about this research publication

Authors

Wang Yifei
Y Duan
Zhongqu Z
Chen Dan
D Shi
Dandan D
Ding Yi
Y Wang
Zhibin Z
Li Baoqing
B Wang
Zhiyi Z
Guo Minmin
M Yang
Wen W
Hou Junren
J Chen
Wenhao W
Guo Yazhou
Y Wei
Wenjie W
Cao Yujie
Y Sun
Xiwei X
Bai Weiyang
W Lu
Mingdong M
Qi Ting
T Shen
Xian X
Yang Jian
Chapter II

Abstract

Summary of the research findings

Pangenomes are revolutionizing our ability to resolve genomic regions with complex variations [1]. However, existing human pangenomes [2,3], constrained by small sample sizes, provide limited utility for medical and population genetic applications. Here we generated 1,116 diploid genome assemblies (55 de novo and 1,061 pangenome-informed) with an average size of 2.98 Gb and a mean quality value of 46 as part of the 1000 Chinese Pangenome (1KCP) project. On the basis of these assemblies, we constructed a pangenome comprising 405.3 million base pairs of sequences absent from the current references GRCh38 and CHM13, including 26.2 million base pairs of functional genic and predicted regulatory elements. We catalogued a full spectrum of genetic variation, including 35.4 million small variants, 110,530 structural variants (SVs), 485,575 tandem repeats (TRs) and 0.86 million nested variants embedded in non-reference sequences. This extensive dataset enabled detailed characterization of multiscale genic variations relevant to medical genetics, including gene-altering SVs, TR expansions, gene cluster variations and HLA gene haplotypes. Coupled with the 1KCP gene expression data, we conducted pan-variant expression quantitative trait locus (eQTL) mapping to analyse diverse variant types. We identified 3,256 eQTLs involving complex variants (SVs, TRs and nested variants) and elucidated their regulatory complexity. Finally, we developed a 1KCP pan-variant imputation reference panel, which provides multitype genetic markers to enhance the resolution of future association studies. This resource advances our understanding of complex variants and their functional implications to provide new insights into human health.

Chapter III

Analysis

Comprehensive review of ancestry and genetic findings

Important Disclaimer: This review has been performed semi-automatically and is provided for informational purposes only. While we strive for accuracy, this analysis may contain errors, omissions, or misinterpretations of the original research. DNA Genics disclaims all liability for any inaccuracies, errors, or consequences arising from the use of this information. Users should independently verify all information and consult original research publications before making any decisions based on this content. This analysis is not intended as a substitute for professional scientific review or medical advice.
