Population genetic summary statistics for five super-populations and 26 sub-populations were calculated using Genetic Data Analysis (GDA) and the ggplot2 package in RStudio®. These included haplotype and diplotype frequencies (analogous to allele and genotype frequencies), observed (Ho) and expected (He) heterozygosities, pairwise genetic distances, and tests for departures from Hardy-Weinberg equilibrium (HWE) and for linkage disequilibrium (LD). TreeView Version 1.6.6 (Build 7601) was used to create phylogenetic trees; haplotype network analyses were performed using Population Analysis with Reticulate Trees (PopART) with the ancestral parsimony setting.
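As an illustration of the quantities named above, the following minimal Python sketch computes haplotype frequencies, Ho, He, and a chi-square goodness-of-fit statistic against HWE proportions from a list of phased diplotypes. It is not the GDA implementation: the function names and the input layout (one tuple of two haplotype labels per individual) are assumptions made for this example, He is computed without a small-sample correction, and the chi-square statistic stands in for whatever test GDA applies.

```python
from collections import Counter
from itertools import combinations_with_replacement


def haplotype_frequencies(diplotypes):
    """Relative frequency of each haplotype across all chromosomes."""
    counts = Counter(h for pair in diplotypes for h in pair)
    total = sum(counts.values())  # two chromosomes per individual
    return {h: c / total for h, c in counts.items()}


def heterozygosities(diplotypes):
    """Return (Ho, He) for one locus.

    Ho is the observed fraction of individuals carrying two different
    haplotypes; He = 1 - sum(p_i^2), with no small-sample correction.
    """
    freqs = haplotype_frequencies(diplotypes)
    ho = sum(1 for a, b in diplotypes if a != b) / len(diplotypes)
    he = 1.0 - sum(p * p for p in freqs.values())
    return ho, he


def hwe_chi_square(diplotypes):
    """Chi-square statistic and degrees of freedom against HWE proportions."""
    n = len(diplotypes)
    freqs = haplotype_frequencies(diplotypes)
    # Diplotypes are unordered, so (A, B) and (B, A) are the same class.
    observed = Counter(tuple(sorted(pair)) for pair in diplotypes)
    chi2 = 0.0
    for a, b in combinations_with_replacement(sorted(freqs), 2):
        # Expected count: n * p_a^2 for homozygotes, 2 * n * p_a * p_b otherwise.
        expected = n * freqs[a] * freqs[b] * (1 if a == b else 2)
        chi2 += (observed.get((a, b), 0) - expected) ** 2 / expected
    k = len(freqs)
    return chi2, k * (k - 1) // 2


if __name__ == "__main__":
    # Hypothetical sample of 100 individuals at a two-haplotype locus.
    sample = [("A", "A")] * 25 + [("A", "B")] * 50 + [("B", "B")] * 25
    print(heterozygosities(sample))
    print(hwe_chi_square(sample))
```

With many rare haplotypes, as in full-gene data, most expected diplotype counts fall far below the range in which the chi-square approximation is reliable, so an exact or permutation-based test would be the more appropriate choice in practice.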
Enzyme activity was predicted using commonly typed and previously described polymorphisms for each gene. Because empirical data are not available for every polymorphism, any additional damaging or most likely damaging polymorphism in a gene was assumed to completely eliminate enzyme function. Logistic regression was used to explore possible relationships between the well-characterized CYP2D6-inferred metabolizer phenotype, represented as an activity score (a qualitative measure of phenotype derived from the activity conferred by each star (*) allele an individual carries), and the predicted activity of UGT2B7, ABCB1, OPRM1, and COMT. These data were then used to interpret the potential utility of a combinatorial pharmacogenetic profile.
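The two scoring rules described above can be sketched in Python as follows. This is an illustrative sketch only: the allele-to-activity table is a small hypothetical subset chosen for the example, the function names and the baseline value of 1.0 are assumptions, and the regression step itself is not shown.

```python
# Illustrative subset of CYP2D6 star-allele activity values; a real analysis
# would take these from a curated allele functionality table.
CYP2D6_ALLELE_ACTIVITY = {
    "*1": 1.0,  # normal function
    "*2": 1.0,  # normal function
    "*4": 0.0,  # no function
    "*5": 0.0,  # gene deletion, no function
}

# Effect labels treated as eliminating function, following the stated assumption.
LOSS_OF_FUNCTION_LABELS = {"damaging", "most likely damaging"}


def cyp2d6_activity_score(diplotype):
    """Sum the activity values of the two star alleles an individual carries."""
    return sum(CYP2D6_ALLELE_ACTIVITY[allele] for allele in diplotype)


def predicted_haplotype_activity(variant_effects, baseline=1.0):
    """Predicted activity of one haplotype of UGT2B7, ABCB1, OPRM1, or COMT.

    variant_effects holds the predicted effect label of every polymorphism on
    the haplotype. Any damaging or most likely damaging polymorphism is
    assumed to abolish function; otherwise the (hypothetical) baseline
    activity is returned unchanged.
    """
    if any(effect in LOSS_OF_FUNCTION_LABELS for effect in variant_effects):
        return 0.0
    return baseline


if __name__ == "__main__":
    print(cyp2d6_activity_score(("*1", "*4")))
    print(predicted_haplotype_activity(["benign", "most likely damaging"]))
```

Treating every damaging call as a complete loss of function is deliberately conservative; it makes the predicted activity a binary variable per haplotype, which is the kind of predictor a logistic regression against the CYP2D6 activity score could then use.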
Results and discussion
Conclusions
Full-gene haplotypes of four genes encoding trans-acting T-metabolism proteins, UGT2B7, ABCB1, OPRM1, and COMT, were defined and characterized using substantially more polymorphic sites than previously employed in pharmacogenetic studies. In doing so, a large number of haplotypes were observed. The data presented demonstrate significant LD between full-gene haplotypes of CYP2D6 and those of UGT2B7 and COMT; however, the functional effects of these findings need to be determined empirically. The relatively low frequency of each haplotype and associated diplotype may confound LD estimates simply because each haplotype was only observed in combination with one other haplotype. This study also proposed an extended ABCB1-Block -1, which included the distal untranslated exon 1 but did not substantially increase the information gained over the truncated Block -1 reported by Sai et al. Most individual haplotypes identified in this study were quite rare; however, relatively common haplotypes (≥1% global frequency) were identified that contain at least one damaging, or most likely damaging, polymorphism. It should be noted that copy number variation and CYP2D6/CYP2D7 gene conversion do occur in some individuals, primarily ultrarapid metabolizers (UMs), and may alter the presented LD and regression patterns. These events were not considered herein when determining CYP2D6 activity because of the limitations of the short-read sequences that comprise the 1000 Genomes Project data. It is likely that ongoing developments in longer-read sequencing technologies will provide more confident interpretation of structural variation from existing short-read sequences. The variant effects of many polymorphisms included in these haplotype definitions have not been empirically evaluated by the pharmacogenetics/pharmacogenomics community.
There are obvious limitations to using an algorithmic approach to variant effect prediction; however, the predicted implications for phenotype should not be overlooked. Instead, they can be used to narrow the pool of potentially causal variants/haplotypes to explore empirically. Because the 1000 Genomes Project includes only self-reported healthy individuals, additional functionally relevant haplotypes may be under-represented in this dataset. This limiting factor may impact the analyses performed above. It is likely that additional polymorphisms and/or specific haplotypes are enriched, or selected for, in affected, or T-exposed, cohorts. As such, these affected groups may carry additional damaging haplotypes that were not observed herein, so a full-gene interrogation of affected cohorts may provide greater resolution of the population distribution of damaging haplotypes. This possibility lends support to utilizing a comprehensive genotyping approach, such as relatively long-read MPS or continuous-read nanopore technology, in pharmacogenetic/pharmacogenomic interrogations.