Link to Pubmed [PMID] – 40918066
Link to DOI – 10.1093/nargab/lqaf118
NAR Genom Bioinform 2025 Sep; 7(3): lqaf118
Composite hypothesis testing using summary statistics is a well-established approach for assessing the effect of a single marker or gene across multiple traits or omics levels. Numerous procedures have been developed for this task and have been successfully applied to identify complex patterns of association between traits, conditions, or phenotypes. However, existing methods often struggle with scalability in large datasets or fail to account for dependencies between traits or omics levels, limiting their ability to control false positives effectively. To overcome these challenges, we present the qch_copula approach, which integrates mixture models with a copula function to capture dependencies between traits or omics and provides rigorously defined P-values for any composite hypothesis. Through a comprehensive benchmark against eight state-of-the-art methods, we demonstrate that qch_copula controls Type I error rates effectively while enhancing the detection of joint association patterns. Compared to other mixture model-based approaches, our method notably reduces memory usage during the EM algorithm, allowing the analysis of up to 20 traits and 10^5 to 10^6 markers. The effectiveness of qch_copula is further validated through two application cases in human and plant genetics. The method is available in the R package qch, accessible on CRAN.
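As an illustration of what a composite hypothesis over several traits can look like, the following minimal Python sketch enumerates every null/alternative configuration for a marker across Q traits and groups the configurations in which at least q traits are associated. It is not the API of the qch package and does not implement the copula or the EM algorithm; the function names and the uniform posterior values are hypothetical, chosen only to show how a composite hypothesis can be treated as a set of configurations.

```python
from itertools import product


def configurations(n_traits):
    """Return all 2^Q configurations as tuples of 0 (null) and 1 (associated)."""
    return list(product((0, 1), repeat=n_traits))


def at_least(configs, q):
    """Indices of the configurations in which at least q traits are associated."""
    return [i for i, c in enumerate(configs) if sum(c) >= q]


def composite_posterior(config_posteriors, h1_indices):
    """Posterior probability of a composite hypothesis: the sum over its configurations."""
    return sum(config_posteriors[i] for i in h1_indices)


# Hypothetical example: 3 traits, composite H1 = "associated with at least 2 traits".
configs = configurations(3)
h1 = at_least(configs, 2)

# Assumed per-configuration posteriors for one marker (uniform, for illustration only).
post = [1 / len(configs)] * len(configs)
prob_h1 = composite_posterior(post, h1)
```

The number of configurations doubles with each added trait (2^20 is about one million for 20 traits), which gives a sense of why memory usage matters for mixture model-based approaches of this kind.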