Understanding Association and Causality in the Genetic Studies of Inflammatory Bowel Disease
Crohn’s disease (CD) and ulcerative colitis (UC) are the 2 major forms of idiopathic chronic inflammatory bowel disease (IBD). Although the pathogenic mechanisms of IBD are only partially defined, epidemiological and clinical observations support a multifactorial disease model with both genetic and non-genetic risk factors. Over the last decade, multiple genome-wide linkage searches have delineated numerous genomic regions containing putative IBD risk factors. Subsequent association studies using positional mapping and candidate gene approaches further identified specific genetic variants associated with CD: first, the 3 genetic variants in the CARD15/NOD2 gene,1, 2, 3, 4, 5 and second, the IBD5 risk haplotype.6, 7, 8, 9, 10, 11 The latter is a 250 kilobase (kb) region of the human genome on chromosome 5q31 that contains 5 genes and multiple genetic variants that are associated with CD and are highly correlated with one another; that is, there is strong linkage disequilibrium (LD) across this 250 kb region.6, 7 A more recent study has proposed that the genetic risk conferred by this 250 kb region can be explained by a promoter variant in the OCTN2 gene and a coding variant in the OCTN1 gene.12 Other associations of genetic variants with IBD have been proposed, such as a coding variant in the DLG5 gene,13, 14 but await further confirmation in larger replication studies.
These discoveries have placed the genetics of IBD in a near-unique situation among complex disease traits in having successfully identified multiple associated alleles. However, recent studies have highlighted the current challenges in the “post-linkage” phase of complex trait genetics, including replication of association, translation of genetic association to functional mechanisms of disease pathology, identification of biologically relevant genotype-phenotype correlations, and delineation of gene-gene and gene-environment interactions. The articles in this issue of Gastroenterology, by Vermeire et al15 and Noble et al,16 illustrate some of these challenges in their studies of the IBD5 and DLG5 associations. In the case of DLG5, it is the challenge of achieving statistically significant evidence of replication, and in the case of OCTN genes in the IBD5 risk haplotype, it is the difficulty of distinguishing the causal allele(s) from those with which it is in LD.
DLG5, a susceptibility locus on chromosome 10, was initially described in 1999 by Hampe et al,17 emerging from a genome-wide linkage screen in a European cohort. Stoll et al13 further narrowed this risk region using an association mapping approach and identified 2 distinct haplotypes in the region surrounding the DLG5 gene that were putatively associated with IBD and CD. They reported the association of a risk “haplotype D” defined by a nonsynonymous SNP, a 113G→A nucleotide change that results in the amino acid substitution R30Q. They observed that the Q allele was associated with IBD and CD in their family trios, with independent replication in a case-control cohort. They also described a protective “haplotype A,” identified by 8 haplotype-tagging SNPs (htSNPs), that was undertransmitted in the IBD trios and was confirmed in an independent case-control sample. Evidence for epistasis between DLG5 and CARD15 was also observed.
The excitement regarding DLG5 has been met with some frustration, however, as variable findings have emerged from other groups. Torok et al18 did not find any association of the 2 previously reported DLG5 haplotypes with IBD in their German case-control cohort. Noble et al19 also found no association of either haplotype with IBD, CD, or UC in a Scottish population. Neither study found evidence for locus-locus interactions between DLG5 and CARD15.18, 19 In contrast, Daly et al14 confirmed the association of DLG5 with IBD in 2 of their 3 European-derived populations. Interestingly, they were able to replicate the association of IBD with the R30Q variant (haplotype D) in their Quebec/Italian case-control cohort and in an independent set of trios from Quebec and the UK, but were not able to replicate it in a UK case-control group. They also found no association between IBD and the putatively protective “haplotype A” in any of these 3 populations. In this issue of Gastroenterology, Vermeire et al15 not only did not replicate the initial findings of Stoll et al,13 but actually observed an undertransmission of the R30Q variant (associated with the risk “haplotype D”) in their Flemish IBD family trios. They also confirmed this lack of association of the DLG5 mutations with IBD, CD, or UC in an independent case-control cohort.
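To make the notion of over- and undertransmission in family trios concrete, the following is a minimal sketch of one standard trio-based statistic, the transmission disequilibrium test (TDT). It is offered as an illustration only: the studies discussed above may have used other family-based methods, the function name `tdt` is hypothetical, and the transmission counts are invented for the example rather than taken from any cited cohort.

```python
from math import erfc, sqrt


def tdt(transmitted, untransmitted):
    """Transmission disequilibrium test for one allele.

    transmitted / untransmitted: number of times heterozygous parents
    did / did not pass the allele of interest to an affected child.
    Returns (chi-square statistic, p-value on 1 df, direction label).
    """
    informative = transmitted + untransmitted
    if informative == 0:
        raise ValueError("no informative (heterozygous-parent) transmissions")

    # McNemar-type statistic: (b - c)^2 / (b + c), 1 degree of freedom.
    chi2 = (transmitted - untransmitted) ** 2 / informative

    # Upper tail of a chi-square with 1 df equals erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(chi2 / 2.0))

    if transmitted > untransmitted:
        direction = "overtransmitted"
    elif transmitted < untransmitted:
        direction = "undertransmitted"
    else:
        direction = "balanced"
    return chi2, p_value, direction


# Hypothetical counts, chosen only to show both directions of effect.
for label, b, c in [("cohort 1", 60, 40), ("cohort 2", 40, 60)]:
    chi2, p, direction = tdt(b, c)
    print(f"{label}: chi2={chi2:.2f}, p={p:.4f}, allele {direction}")
```

Note that the statistic itself is symmetric: the same excess in either direction yields the same p-value, so an allele reported as a risk factor in one cohort and undertransmitted in another can produce equally "significant" but opposite results in modestly sized samples.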
The inconsistent replication of DLG5 illustrates a major difficulty in determining true positive associations in complex disease traits: multiple causal alleles, each individually conferring only a modest increase in disease risk. In general terms, the factors that influence our ability to replicate true association findings are statistical power and consistency of phenotype across studies. In the case of DLG5, the prevalence of these variant alleles has not been extensively studied across ethnic or geographic subgroups. This information would be critical in determining whether the discordant findings among studies are actually due to genetic or population heterogeneity, or are rather a consequence of sampling variation among modestly sized genetic studies. The statistical power of an association study depends on the sample size, the frequency and strength of the disease allele, and the frequency of the disease in the population. The lack of evidence of CD-associated CARD15 mutations and the IBD5 risk haplotype in the Japanese population supports the notion that allele frequencies vary among populations.20, 21 Alternatively, subject recruitment and sampling bias must be considered, as incomplete sampling can lead to overestimation of the frequency of some risk alleles and underestimation of others.22 Differences in phenotypic definition can also contribute to sampling variation and lack of consistency among studies. Finally, the genetic effect (or penetrance) of most genes contributing to complex traits is modest. Taking these factors into account, sample sizes of several thousand cases and controls are often required to achieve adequate statistical power and to allow sufficiently large subgroups to be studied for potential subphenotypic associations with various loci.14, 23 Functional studies that provide compelling evidence for causality will be powerful adjuncts to any positive genetic study.
At this point, more work is required to define the exact nature of potential associations of DLG5 to IBD.
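The dependence of power on sample size, allele frequency, and effect size can be sketched numerically. The example below is a rough normal-approximation calculation for a simple allelic case-control test; it is not the method of any cited study. It assumes the control allele frequency stands in for the population frequency, ignores disease prevalence and genotype-level models, and uses hypothetical inputs (a 10% allele frequency and an allelic odds ratio of 1.2). The function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def case_allele_freq(control_freq, odds_ratio):
    """Allele frequency expected in cases for a given allelic odds ratio."""
    odds = odds_ratio * control_freq / (1.0 - control_freq)
    return odds / (1.0 + odds)


def allelic_test_power(n_cases, n_controls, control_freq, odds_ratio,
                       alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test on alleles.

    Each subject contributes 2 alleles, so the comparison is between
    2 * n_cases and 2 * n_controls chromosomes.
    """
    p0 = control_freq
    p1 = case_allele_freq(p0, odds_ratio)
    se = sqrt(p1 * (1 - p1) / (2 * n_cases) + p0 * (1 - p0) / (2 * n_controls))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    z_effect = abs(p1 - p0) / se
    # Probability that the test statistic lands in either rejection tail.
    return (NormalDist().cdf(z_effect - z_crit)
            + NormalDist().cdf(-z_effect - z_crit))


# Hypothetical scenario: 10% allele frequency, allelic odds ratio 1.2.
for n in (250, 500, 1000, 3000):
    power = allelic_test_power(n, n, control_freq=0.10, odds_ratio=1.2)
    print(f"{n} cases / {n} controls: power = {power:.2f}")
```

Under these assumptions, power rises steadily with sample size for a fixed modest effect, which is the arithmetic behind the call for cohorts of several thousand subjects. A stricter significance threshold, as would be needed when many variants are tested, can be explored by lowering `alpha`.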
In contrast to DLG5, the IBD5 risk haplotype has been well replicated as an IBD susceptibility region. Significant evidence for linkage of the 5q31 region to CD was first reported in a Canadian population with subsequent fine mapping of this locus to a 250-kb risk haplotype.6, 7 Several groups have independently confirmed this as a CD-associated risk haplotype in different European-derived populations.8, 9, 10, 11 Unfortunately, identification of the underlying causal genetic variants within this region has been a more daunting task due to the strong LD across this region.7 An interesting study by Peltekova et al12 recently proposed 2 causal genetic variants in the OCTN genes within the IBD5 risk haplotype that would confer risk to CD. The 2 variants were in strong LD and created a 2-allele risk haplotype (TC-haplotype) that was associated with CD. They provided preliminary functional studies demonstrating that these 2 SNPs resulted in impaired OCTN transporter function of various organic cations as well as carnitine, an essential cofactor in lipid metabolism.12, 24 Based on the observed association of this TC-haplotype with CD independent of the IBD5 extended haplotype, and a possible link between OCTN function and intracellular homeostasis, they suggested that these 2 specific variants rather than other closely linked alleles were causal in CD susceptibility.
Unfortunately, these provocative findings have yet to be independently replicated. Recently, Torok et al18 found an association of the OCTN-TC haplotype with CD in a German cohort. However, they did not find conclusive genetic evidence that the OCTN polymorphisms were the causal variants, as the association of the OCTN-TC haplotype with CD was not independent of another genetic variant that was chosen as a proxy for the IBD5 risk haplotype. In this issue, both Vermeire et al15 and Noble et al16 further examine the association of the OCTN-TC/IBD5 haplotype, albeit with divergent findings. Noble et al16 confirmed the association of the OCTN1/2 variant with CD in a Scottish population but, like Torok et al,18 could not demonstrate this link in the absence of the IBD5 risk haplotype. In contrast to these and previous studies, Vermeire et al15 failed to find any association of the IBD5 risk haplotype (defined by the htSNPs) or the OCTN variants with IBD, CD, or UC in their Flemish population. While surprising, the authors point out that this was consistent with the lack of linkage to this region in a genome-wide scan in the same population.9
These studies highlight the challenge of distinguishing a causal variant from those with which it is in LD. In general, 2 factors determine the success with which a causal genetic variant can be identified in the setting of LD: the extent of LD in the region and the strength of effect of the variant. The MHC and IBD5 regions are 2 loci for which extensive replication of association has been established, but identification of the causal variants within each region has been problematic. Whereas the IBD5 haplotype displays less extensive LD than the MHC region, identifying the causal genes within it remains difficult because it confers an overall weaker genetic effect than the MHC. Given the difficulties in resolving these dilemmas, future progress with respect to OCTN and IBD5 will likely require supportive evidence from functional studies. Information providing a compelling biologic explanation for how impairment of these OCTN genes leads to the clinical phenotypes in CD would strengthen any positive associations. Although Peltekova et al12 provide preliminary data linking the OCTN mutations with impaired cation transport, it is unclear how these defects would translate to an increased risk for intestinal inflammation. In fact, the OCTNs are widely expressed in various human tissues (ie, brain, skeletal muscle, heart, kidney, intestine), and previously reported mutations in the human and mouse OCTN2 genes are associated with systemic carnitine deficiency, a condition characterized by diseases of skeletal muscle, cardiac muscle, and liver, rather than the intestinal system.24 At this time, much additional work will be necessary to prove causality and determine the precise mechanism of action.
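The "extent of LD" between 2 variants is usually summarized by the pairwise measures D′ and r². As a minimal sketch, assuming the 4 two-locus haplotype frequencies are already known (in practice they must be estimated from genotype data), these can be computed as follows. The function name `ld_stats` and all frequencies are hypothetical and do not describe the actual OCTN or IBD5 variants.

```python
def ld_stats(f_AB, f_Ab, f_aB, f_ab):
    """Return (D', r^2) for two biallelic loci from haplotype frequencies.

    f_AB, f_Ab, f_aB, f_ab are the frequencies of the four possible
    two-locus haplotypes and must sum to 1.
    """
    total = f_AB + f_Ab + f_aB + f_ab
    if abs(total - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")

    p_A = f_AB + f_Ab          # frequency of allele A at locus 1
    p_B = f_AB + f_aB          # frequency of allele B at locus 2
    d = f_AB - p_A * p_B       # raw disequilibrium coefficient D

    # D' rescales |D| by its maximum possible value given allele frequencies.
    if d > 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2 is the squared correlation between the two alleles.
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = d * d / denom if denom > 0 else 0.0
    return d_prime, r2


# Hypothetical haplotype frequencies for two tightly linked variants.
d_prime, r2 = ld_stats(0.40, 0.00, 0.20, 0.40)
print(f"D' = {d_prime:.2f}, r^2 = {r2:.2f}")
```

When r² between a candidate variant and its neighbors approaches 1, cases and controls differ at all of them almost identically, so association data alone cannot single out the causal allele. This is the statistical reason why functional evidence becomes decisive in a region such as IBD5.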
In addition to conferring disease susceptibility, genetic variation may also influence the clinical manifestations of IBD, including disease location, behavior, clinical course, and response to therapy. An eventual goal in the genomic study of IBD is to identify these biologically relevant genotype-phenotype associations and to apply them to our clinical practice. Although a handful of phenotypic associations with several IBD risk loci have been reported, particularly in CD, these findings have been quite variable, and have yet to have an impact in the clinical arena. The most firmly established phenotypic associations have been for CARD15. Multiple studies have consistently reported an association of all three CD-associated CARD15 variants with ileal disease3, 25, 26, 27, 28, 29 and possibly stricturing disease.26, 29 In fact, double-dose carriage of CARD15 alleles is rare in CD patients with exclusive colonic involvement, who generally have similar allele frequencies of these 3 major risk alleles to healthy control subjects.26
Recently, the IBD5 region has been studied for phenotypic associations, albeit with rather disparate findings. Several groups have reported a lack of association between IBD5 and specific disease location, including the current article by Noble et al,7, 8, 9, 10, 11, 16 whereas a handful have reported variable associations with specific disease sites.11, 15, 18, 30 Armuzzi et al11 reported an association of the IBD5 haplotype with both perianal and ileal CD. Newman et al30 reported an association of the OCTN-TC haplotype with ileal CD, independent of perianal disease involvement, which was further strengthened by the presence of CARD15 alleles. In contrast, Torok et al18 reported novel phenotypic associations of the IBD5/OCTN-TC haplotype with colonic CD and with non-fistulizing and non-stricturing behavior. In this issue, Vermeire et al15 observed that both the IBD5 haplotype and the OCTN-TC haplotype within the region were associated with perianal disease. Although the 2 featured articles in this issue15, 16 differ with respect to the disease localization associated with IBD5, both independently observed a putative association of IBD5/OCTN-TC with potential measures of disease severity. Specifically, Vermeire et al15 report a connection with penetrating behavior, and Noble et al16 observe a correlation with both penetrating/stricturing phenotype and with aggressive disease course (ie, progression to stricturing disease, requirement for surgery).
Taken together, these observations are among many genotype-phenotype associations that have been identified in the field of IBD genetics. With the exception of CARD15, the majority of these reported associations have been difficult to replicate across studies. This reflects the inherent difficulties in genotype-phenotype correlation studies: small sample size of individual stratified subgroups, and differences in phenotypic classification systems and definition of disease among studies. Future collaborative efforts to incorporate large data sets using standardized and rigorously defined phenotypic classification schemes will be critical in clarifying these conflicting observations regarding IBD5, as well as other IBD loci. Although much work remains, the recent progress already made in the genetic studies of IBD brings optimism for the future validation of a molecular classification of IBD.
In conclusion, the considerable insights in IBD genetics gained over the last decade offer hope that the genetic study of complex disease traits is indeed feasible. However, as highlighted in the 2 complementary articles in this issue of Gastroenterology, ongoing challenges for the future include finding true positive associations, identifying actual causal variants, and transitioning from genetic association to biological relevance. This can be a daunting task given the genetic and phenotypic heterogeneity among patients with IBD. As our understanding of the patterns of human genetic variation evolves and the available genetic resources increase, we will hopefully be able not only to identify additional IBD associations but also to address these challenges, with the ultimate goals of furthering our knowledge of the genetics of complex diseases and applying that knowledge to clinical practice.
References
- Mapping of a susceptibility locus for Crohn’s disease on chromosome 16. Nature. 1996;379:821–823
- Association of NOD2 leucine-rich repeat variants with susceptibility to Crohn’s disease. Nature. 2001;411:599–603
- Association between insertion mutation in NOD2 gene and Crohn’s disease in German and British populations. Lancet. 2001;357:1925–1928
- Association of NOD2 (CARD 15) genotype with clinical course of Crohn’s disease (a cohort study). Lancet. 2002;359:1661–1665
- International collaboration provides convincing linkage replication in complex disease through analysis of a large pooled data set (Crohn’s disease and chromosome 16). Am J Hum Genet. 2001;68:1165–1171
- Genomewide search in Canadian families with inflammatory bowel disease reveals two novel susceptibility loci. Am J Hum Genet. 2000;66:1863–1870
- Genetic variation in the 5q31 cytokine gene cluster confers susceptibility to Crohn’s disease. Nat Genet. 2001;29:223–228
- Genetic evidence for interaction of the 5q31 cytokine locus and the CARD15 gene in Crohn’s disease. Am J Hum Genet. 2003;72:1018–1022
- IBD5 is a general risk factor for inflammatory bowel disease (replication of association with Crohn disease and identification of a novel association with ulcerative colitis). Am J Hum Genet. 2003;73:205–211
- Analysis of the IBD5 locus and potential gene-gene interactions in Crohn’s disease. Gut. 2003;52:541–546
- Genotype-phenotype analysis of the Crohn’s disease susceptibility haplotype on chromosome 5q31. Gut. 2003;52:1133–1139
- Functional variants of OCTN cation transporter genes are associated with Crohn’s disease. Nat Genet. 2004;36:471–475
- Genetic variation in DLG5 is associated with inflammatory bowel disease. Nat Genet. 2004;36:476–480
- Association of DLG5 R30Q variant with inflammatory bowel disease. Eur J Hum Genet. 2005;13:835–839
- Association of organic cation transporter risk haplotype with perianal penetrating Crohn’s disease but not with susceptibility to IBD. Gastroenterology. 2005;129:1845–1853
- The contribution of OCTN1/2 variants within the IBD5 locus to disease susceptibility and severity in Crohn’s disease. Gastroenterology. 2005;129:1854–1864
- A genomewide analysis provides evidence for novel linkages in inflammatory bowel disease in a large European cohort. Am J Hum Genet. 1999;64:808–816
- Polymorphisms in the DLG5 and OCTN cation transporter genes in Crohn’s disease. Gut. 2005;54:1421–1427
- DLG5 variants do not influence susceptibility to inflammatory bowel disease in the Scottish population. Gut. 2005;54:1416–1420
- Lack of common NOD2 variants in Japanese patients with Crohn’s disease. Gastroenterology. 2002;123:86–91
- Absence of mutation in the NOD2/CARD15 gene among 483 Japanese patients with Crohn’s disease. J Hum Genet. 2004;47:469–472
- Using a genome-wide scan and meta-analysis to identify a novel IBD locus and confirm previously identified IBD loci. Inflamm Bowel Dis. 2002;8:375–381
- Genome-wide association studies (theoretical and practical concerns). Nat Rev Genet. 2005;6:109–118
- Carnitine transport by organic cation transporters and systemic carnitine deficiency. Mol Genet Metab. 2001;73:287–297
- The molecular classification of the clinical manifestations of Crohn’s disease. Gastroenterology. 2002;122:854–866
- CARD15/NOD2 mutational analysis and genotype-phenotype correlation in 612 patients with inflammatory bowel disease. Am J Hum Genet. 2002;70:845–857
- The contribution of NOD2 gene mutations to the risk and site of disease in inflammatory bowel disease. Gastroenterology. 2002;122:867–874
- CARD15 genetic variation in a Quebec population (prevalence, genotype-phenotype relationship, and haplotype structure). Am J Hum Genet. 2002;71:74–83
- Defining complex contributions of NOD2/CARD15 gene mutations, age at onset, and tobacco use on Crohn’s disease phenotypes. Inflamm Bowel Dis. 2003;9:281–289
- A risk haplotype in the Solute Carrier Family 22A4/22A5 gene cluster influences phenotypic expression of Crohn’s disease. Gastroenterology. 2005;128:260–269
PII: S0016-5085(05)02236-5
doi:10.1053/j.gastro.2005.10.056
© 2005 American Gastroenterological Association. Published by Elsevier Inc. All rights reserved.

