Research Publication

yallHap: Modern Y-chromosome haplogroup inference with probabilistic scoring and ancient DNA support

Alaina Hardie

Published: 2026-01-01

Abstract

The human Y chromosome enables detailed reconstruction of paternal lineages through haplogroup classification. Existing tools for this purpose typically rely on outdated phylogenies, lack ancient DNA handling, or provide limited confidence metrics. Here I present yallHap, a Y-chromosome haplogroup classifier that integrates the YFull phylogenetic tree (185,780 SNPs) with probabilistic scoring, built-in ancient DNA damage filtering, and parallel processing for population-scale studies. Validation on 1,231 high-coverage gnomAD samples achieved 99.9% accuracy (95% CI: 99.5–100%) on GRCh38, and validation on 1,233 samples from 1000 Genomes Phase 3 achieved 99.8% accuracy (95% CI: 99.3–100%). For ancient DNA with moderate variant density (4–10%), Bayesian ancient mode improves accuracy by 19.3 percentage points (pp) over heuristic mode (+12 to +24 pp at 1% increments; see Supplementary Table S3), reaching 60–86% accuracy. On the full AADR ancient DNA validation set (7,333 samples spanning ∼45,000 years), Bayesian ancient mode reaches 90.7% overall accuracy (95% CI: 90.0–91.3%) versus 88.3% for heuristic transversions-only mode. At variant densities ≥10%, both modes reach 97–99% accuracy. yallHap supports multiple reference genomes (GRCh37, GRCh38, T2T-CHM13v2.0), provides detailed quality metrics including optional ISOGG nomenclature output, and offers multi-threaded batch processing for large-scale studies. The tool is designed for integration into modern bioinformatics pipelines, with example wrappers for nf-core/eager [16,17] and Snakemake [18] workflows. The software is open source, available at https://github.com/trianglegrrl/yallHap, and distributed via pip, Bioconda, and Docker.
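The abstract refers to a heuristic transversions-only mode for ancient DNA. As a rough illustration of the general idea (not yallHap's actual implementation, which this page does not describe), the sketch below drops transition SNPs (A<->G, C<->T), the substitution class that includes the C>T and G>A changes typical of post-mortem deamination, and keeps only transversions. The function names, the tuple layout, and the example positions are all hypothetical.

```python
# Minimal sketch of a transversions-only SNP filter.
# Assumption: variants are simple (position, ref, alt) tuples; this is an
# illustrative layout, not yallHap's internal data model.

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def is_transversion(ref: str, alt: str) -> bool:
    """Return True if ref>alt swaps a purine for a pyrimidine or vice versa."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        return False  # not a simple single-base substitution
    # Transitions stay within one chemical class; transversions cross classes.
    return (ref in PURINES) != (alt in PURINES)


def transversions_only(variants):
    """Keep only transversion SNPs from an iterable of (position, ref, alt)."""
    return [v for v in variants if is_transversion(v[1], v[2])]


# Hypothetical calls for illustration only.
calls = [
    (100, "C", "T"),  # transition, the classic deamination signature
    (200, "G", "A"),  # transition, its reverse-strand counterpart
    (300, "A", "C"),  # transversion, kept
    (400, "G", "T"),  # transversion, kept
]
kept = transversions_only(calls)
print(kept)
```

A filter of this kind trades informative markers for robustness: every transition is discarded, including genuine ones, which is consistent with the abstract's report that the heuristic mode trails the Bayesian mode at moderate variant density.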

Analysis

Comprehensive review of ancestry and genetic findings

Important Disclaimer: This review has been performed semi-automatically and is provided for informational purposes only. While we strive for accuracy, this analysis may contain errors, omissions, or misinterpretations of the original research. DNA Genics disclaims all liability for any inaccuracies, errors, or consequences arising from the use of this information. Users should independently verify all information and consult original research publications before making any decisions based on this content. This analysis is not intended as a substitute for professional scientific review or medical advice.
