Hunting for Celiac Disease Genes

  • Ludvig M. Sollid
    Correspondence
    Address requests for reprints to Ludvig M. Sollid, MD, Centre for Immune Regulation, Institute of Immunology, University of Oslo and Rikshospitalet University Hospital, Rikshospitalet, N-0027 Oslo, Norway.
    Affiliations
    Centre for Immune Regulation, Institute of Immunology, University of Oslo and Rikshospitalet University Hospital, Oslo, Norway
      See “Combined functional and positional gene information for the identification of susceptibility variants in celiac disease,” by Castellanos–Rubio A, Martin–Pagola A, Santín I, et al on page 738.
      Celiac disease results from a dysregulated immune response to dietary wheat gluten and related cereal proteins.
      • Sollid L.M.
      Coeliac disease: dissecting a complex inflammatory disorder.
      • Kagnoff M.F.
      Celiac disease: pathogenesis of a model immunogenetic disease.
      The disease is an acquired disorder, but with a strong hereditary component. The evidence for the importance of genes comes from familial and twin studies. About 10% of first-degree relatives are affected by the disease, compared with the population prevalence of about 1%; the pairwise concordance rates in monozygotic and dizygotic twins are about 75% and 10%, respectively.
      • Greco L.
      • Romino R.
      • Coto I.
      • et al.
      The first large population based twin study of coeliac disease.
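      As a rough illustration of how these figures translate into relative risks, the following sketch simply divides the quoted rates. The function and variable names are hypothetical, and the inputs are the approximate percentages given above rather than exact estimates.

```python
def risk_ratio(risk_in_group: float, baseline_risk: float) -> float:
    """Return how many times higher the risk is in a group than at baseline."""
    if baseline_risk <= 0:
        raise ValueError("baseline_risk must be positive")
    return risk_in_group / baseline_risk


# Approximate figures quoted in the text (rounded, not exact estimates).
PREVALENCE_POPULATION = 0.01    # about 1% of the general population
PREVALENCE_FIRST_DEGREE = 0.10  # about 10% of first-degree relatives
CONCORDANCE_MZ = 0.75           # pairwise concordance, monozygotic twins
CONCORDANCE_DZ = 0.10           # pairwise concordance, dizygotic twins

relative_risk_first_degree = risk_ratio(PREVALENCE_FIRST_DEGREE, PREVALENCE_POPULATION)
mz_to_dz_ratio = risk_ratio(CONCORDANCE_MZ, CONCORDANCE_DZ)

print(f"First-degree relatives vs population: about {relative_risk_first_degree:.0f}-fold")
print(f"Monozygotic vs dizygotic concordance: about {mz_to_dz_ratio:.1f}-fold")
```

      The roughly tenfold excess among relatives, together with the much higher concordance in monozygotic than in dizygotic twins, is what points to a strong genetic component.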
      The association between celiac disease and the HLA locus was established as early as 1972.
      • Falchuk Z.M.
      • Rogentine G.N.
      • Strober W.
      Predominance of histocompatibility antigen HL-A8 in patients with gluten-sensitive enteropathy.
      • Stokes P.L.
      • Asquith P.
      • Holmes G.K.
      • et al.
      Histocompatibility antigens associated with adult coeliac disease.
      The primary association is with HLA alleles encoding HLA-DQ2 and HLA-DQ8.
      • Sollid L.M.
      • Markussen G.
      • Ek J.
      • et al.
      Evidence for a primary association of celiac disease to a particular HLA-DQ α/β heterodimer.
      • Spurkland A.
      • Sollid L.M.
      • Polanco I.
      • et al.
      HLA-DR and -DQ genotypes of celiac disease patients serologically typed to be non-DR3 or non-DR5/7.
      Most individuals expressing HLA-DQ2 or HLA-DQ8, however, never develop the disease. In addition, the concordance rate among HLA-identical dizygotic twins, who share their HLA genes in addition to, on average, half of their other genes, is much lower than that of monozygotic twins.
      • Greco L.
      • Romino R.
      • Coto I.
      • et al.
      The first large population based twin study of coeliac disease.
      Collectively, these genetic epidemiological observations point to the involvement of genes outside the HLA locus in the development of celiac disease. An intensive search for these genes has been going on for more than a decade (Figure 1). A long series of linkage-based, genome-wide scans using hundreds of multiallelic markers in multiplex families (typically affected sibling pairs) has confirmed the major role of the HLA locus and also suggested several minor loci outside the HLA locus. Strikingly, however, with the exception of the HLA locus, few of the same regions have been identified in the different linkage studies. A region on chromosome 5q31-33, first identified in a genome-wide linkage screen of an Italian dataset
      • Greco L.
      • Corazza G.
      • Babron M.C.
      • et al.
      Genome search in celiac disease.
      has been identified in several linkage studies. A recent comprehensive study using single nucleotide polymorphisms (SNPs) to fine-map the responsible mutation identified several SNPs weakly associated with celiac disease but concluded that none of the associated markers could alone explain the strong linkage signal; the causative gene(s) in this region remain elusive.
      • Amundsen S.S.
      • Adamovic S.
      • Hellqvist A.
      • et al.
      A comprehensive screen for SNP associations on chromosome region 5q31-33 in Swedish/Norwegian celiac disease families.
      Moreover, studies of Dutch cohorts have indicated a susceptibility gene located on chromosome 19p13.1,
      • Monsuur A.J.
      • de Bakker P.I.
      • Alizadeh B.Z.
      • et al.
      Myosin IXB variant increases the risk of celiac disease and points toward a primary intestinal barrier defect.
      but this finding has been hard to replicate in other populations. Thus, the enormous effort to identify celiac disease susceptibility genes by linkage analysis has, by and large, been a frustrating endeavor.
      Figure 1. Schematic outline of the development of methods used to identify genes implicated in complex disorders such as celiac disease over the last 10–15 years. The study by Castellanos–Rubio et al
      • Castellanos–Rubio A.
      • Martin–Pagola A.
      • Santín I.
      • et al.
      Combined functional and positional gene information for the identification of susceptibility variants in celiac disease.
      in this issue of Gastroenterology is an effort to identify celiac disease genes by association analysis of SNPs localized in regions of the genome previously identified to harbor celiac disease genes by linkage analysis. Linkage studies localize chromosomal regions containing disease genes by investigating co-inheritance of genetic markers and disease within families. Association studies assess whether a specific allele is more or less frequent in affected individuals, as can be done in a case-control setting. A marker associated with a disease is either the causative mutation itself or it is in linkage disequilibrium with (ie, localized close to) the causative mutation. During the last year, genome-wide association studies (GWAS) using SNPs have resulted in dramatic progress in the hunt for genes predisposing to or protecting against the development of complex diseases. Next will be the analysis of the impact of structural variants of DNA, such as variation in gene copy number, performed at the genome-wide level.
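      A case-control association test of the kind described in the caption can be sketched as follows. This is a minimal illustration, not the analysis used in any of the cited studies: the allele counts are invented, the function names are hypothetical, and the choice of a Pearson chi-square test with a Woolf confidence interval is an assumption.

```python
import math


def odds_ratio_with_ci(case_with, case_without, control_with, control_without, z=1.96):
    """Odds ratio and Woolf (log-based) confidence interval for a 2x2 table."""
    odds_ratio = (case_with * control_without) / (case_without * control_with)
    se_log_or = math.sqrt(
        1 / case_with + 1 / case_without + 1 / control_with + 1 / control_without
    )
    low = math.exp(math.log(odds_ratio) - z * se_log_or)
    high = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, low, high


def chi_square_2x2(case_with, case_without, control_with, control_without):
    """Pearson chi-square (1 degree of freedom, no continuity correction) and P value."""
    a, b, c, d = case_with, case_without, control_with, control_without
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# HYPOTHETICAL allele counts, invented for illustration only.
CASE_WITH, CASE_WITHOUT = 60, 140        # allele present / absent among cases
CONTROL_WITH, CONTROL_WITHOUT = 90, 110  # allele present / absent among controls

odds_ratio, ci_low, ci_high = odds_ratio_with_ci(
    CASE_WITH, CASE_WITHOUT, CONTROL_WITH, CONTROL_WITHOUT
)
chi2, p_value = chi_square_2x2(CASE_WITH, CASE_WITHOUT, CONTROL_WITH, CONTROL_WITHOUT)

print(f"Odds ratio {odds_ratio:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
print(f"Chi-square {chi2:.2f}, P = {p_value:.4f}")
```

      An odds ratio below 1, as in this invented example, corresponds to an allele that is less frequent in affected individuals; an odds ratio above 1 corresponds to one that is more frequent.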
      The experience from genome-wide linkage scans of celiac disease mirrors the experience from other complex diseases. The emerging picture from these genetic studies is that the susceptibility profile for complex diseases consists of multiple, probably interacting, risk alleles, each contributing a relatively small effect to the overall risk. It is also clear that linkage studies by themselves will not be effective in identifying these susceptibility alleles.
      In this issue of Gastroenterology, Castellanos–Rubio et al
      • Castellanos–Rubio A.
      • Martin–Pagola A.
      • Santín I.
      • et al.
      Combined functional and positional gene information for the identification of susceptibility variants in celiac disease.
      report on a strategy in which they combine gene expression profiling, genetic linkage information, and bioinformatics tools to hunt down celiac disease genes. By messenger RNA expression analysis, the authors identified genes with altered expression in celiac intestinal biopsies. Among these genes, the authors selected for further analysis those located in genomic regions that have demonstrated linkage with celiac disease. The authors then selected SNPs of these genes that, based on bioinformatic prediction, are likely to affect gene expression, and these markers were then tested for association with celiac disease. Altogether, 361 SNPs of 71 genes were tested in a cohort of 264 celiac patients and 214 controls. Ten SNPs showed association (P < .005) and the SNP with the most significant association had P = 2.4 × 10⁻⁵. The odds ratios for the associated markers were generally low (between 0.5 and 1.7), and thus the genetic effects detected in this study are modest. The marker with the strongest association was a synonymous SNP in exon 3 of the gene SERPINE2 (serine peptidase inhibitor, clade E). In accordance with the prediction that this mutation has implications for gene expression, the authors found reduced expression of SERPINE2 in intestinal biopsies of active celiac disease patients. The product of this gene is important in the initial stages of extracellular matrix production, and reduced expression of SERPINE2 would lead to increased loss of extracellular matrix.
      The 5 next most significant SNPs detected in this study were located in 2 genes, PBX3 and PPP6C, which are positioned 0.5 Mb apart on chromosome 9q34.11. The protein encoded by PBX3 is implicated in developmental and transcriptional gene regulation in numerous cell types, including in the maturation of immune cells. The PPP6C gene encodes a phosphatase involved in the G1/S transition of the cell cycle, and the authors found that this gene was up-regulated in biopsies of celiac disease patients. Possibly, the gene could be involved in the increased crypt cell proliferation seen in active celiac disease. Thus, the SERPINE2, PPP6C, and PBX3 genes are all plausible candidates for involvement in celiac disease based on our current understanding of celiac disease pathogenesis.
      The novel approach of this study is to combine expression profiling with association analysis of markers localized in regions suspected of harboring celiac disease genes. The authors arrived at several new genes. Are the genes they identified really involved in the pathogenesis of celiac disease? This is far from clear. The reported P values are relatively modest, and only one marker remains significant at the 5% level after correcting for multiple comparisons. A further concern is the lack of replication in an independent cohort of patients and controls. Replication of the findings is necessary before it can be established that variations in these genes influence the risk for celiac disease.
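      To make the multiple-comparison point concrete, the sketch below applies a Bonferroni correction to the figures quoted above (361 SNPs tested, best P = 2.4 × 10⁻⁵, screening cutoff P < .005). Which correction the authors actually used is not stated here, so Bonferroni is an assumption made purely for illustration, and the function names are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def bonferroni_adjust(p_value: float, n_tests: int) -> float:
    """Bonferroni-adjusted P value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Figures quoted in the text; the correction method itself is an assumption.
N_SNPS = 361      # SNPs tested
BEST_P = 2.4e-5   # most significant association reported
SCREEN_P = 0.005  # nominal cutoff met by the ten associated SNPs
ALPHA = 0.05      # 5% family-wise significance level

threshold = bonferroni_threshold(ALPHA, N_SNPS)
best_adjusted = bonferroni_adjust(BEST_P, N_SNPS)
screen_adjusted = bonferroni_adjust(SCREEN_P, N_SNPS)

print(f"Per-SNP threshold: {threshold:.2e}")
print(f"Best SNP, adjusted P: {best_adjusted:.4f}")
print(f"SNP at the screening cutoff, adjusted P: {screen_adjusted:.4f}")
```

      Under this assumed correction, the best SNP stays below the 5% level, whereas a SNP sitting just at the P < .005 cutoff does not, which is in line with the observation that few of the nominal associations survive correction.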
      The introduction of genome-wide association studies (GWAS) heralds a new era in complex disease genetics. GWAS are typically done by genotyping several hundred thousand SNPs in datasets consisting of many thousands of patients and controls. There has been a surge of GWAS for many complex diseases recently. These studies often report associations reaching P values < 10⁻¹⁰, although the genetic effects revealed are generally weak (ie, odds ratios < 2.0). Importantly, in contrast with the previous genome-wide linkage studies, the strongest association signals of the GWAS tend to be replicated across populations. To date, only a single GWAS on celiac disease has been published. This GWAS was done with 778 UK celiac disease patients and 1422 UK population controls.
      • van Heel D.A.
      • Franke L.
      • Hunt K.A.
      • et al.
      A genome-wide association study for celiac disease identifies risk variants in the region harboring IL2 and IL21.
      The HLA locus came out with very highly significant P values. The second strongest association was found for an SNP (rs6822844) located close to the IL2 and IL21 genes on chromosome 4q27 (P = 2.0 × 10⁻⁷). Upon retesting in independent Dutch and Irish cohorts, the UK data were replicated and a meta-analysis yielded a highly significant P value (1.3 × 10⁻¹⁴; odds ratio, 0.63). This association is within a linkage disequilibrium block of 480 kb harboring the KIAA1109 and TENR genes in addition to the IL2 and IL21 genes. Because of the strong linkage disequilibrium, it was impossible to pinpoint the mutation predisposing to celiac disease within this block, but the IL2 and IL21 genes are the prime suspects; they encode cytokines of central importance in immunity. The association of this region with celiac disease has been replicated in a Swedish/Norwegian cohort (Adamovic et al, unpublished observations). A follow-up study by van Heel et al of ∼1000 markers showing association signals in their initial celiac GWAS has identified an additional 7 loci (of genome-wide significance; P < 5 × 10⁻⁷) associated with celiac disease (van Heel et al, personal communication, January 3, 2008). Immune related genes map to 6 of these regions, underscoring the fact that celiac disease has an immunologic basis. Notably, none of the susceptibility genes identified by Castellanos–Rubio et al
      • Castellanos–Rubio A.
      • Martin–Pagola A.
      • Santín I.
      • et al.
      Combined functional and positional gene information for the identification of susceptibility variants in celiac disease.
      are among the celiac genes identified by the follow-up study by van Heel et al.
      How many celiac disease susceptibility genes are there? No clear answer can be given at this point, but it is likely there are many. In Crohn’s disease, a meta-analysis of data from GWAS of 3200 patients and 4800 controls documented >20 significant associations.

      • Daly M.J. (on behalf of Crohn’s disease GWA Meta-analysis Working Group)
      Joint genome-wide analysis of 3200 Crohn’s disease patients documents more than 20 significant associations.

      More than 40 loci had P < 10⁻⁵, and most of these loci were considered to be true risk factors for Crohn’s disease. The same picture may emerge for celiac disease. In addition to HLA, which is the overriding genetic factor, there could be ≥20 genes that each have a modest effect on the risk of developing the disease.
      Variation in the human genome is more than variation in single nucleotides. Structural variants, including insertions and deletions of DNA (copy-number variants) and balanced chromosomal rearrangements, such as inversions, are much more frequent than previously thought. In fact, structural variations represent more variation than SNPs when considering the total number of nucleotides affected,
      • Korbel J.O.
      • Urban A.E.
      • Affourtit J.P.
      • et al.
      Paired-end mapping reveals extensive structural variation in the human genome.
      and it seems likely that structural variation has a stronger impact on phenotypic variation than do SNPs.
      • Sebat J.
      Major changes in our DNA lead to major changes in our thinking.
      • McCarroll S.A.
      • Altshuler D.M.
      Copy-number variation and association studies of human disease.
      The first reports on structural variants predisposing to complex immunologic diseases by analysis of candidate genes or regions have emerged; susceptibility to systemic lupus erythematosus is associated with copy number variations in the genes for complement C4 (C4A/C4B)
      • Yang Y.
      • Chung E.K.
      • Wu Y.L.
      • et al.
      Gene copy-number variation and associated polymorphisms of complement component C4 in human systemic lupus erythematosus (SLE): low copy number is a risk factor for and high copy number is a protective factor against SLE susceptibility in European Americans.
      and Fc-γ receptor IIIb (FCGR3B),
      • Fanciulli M.
      • Norsworthy P.J.
      • Petretto E.
      • et al.
      FCGR3B copy number variation is associated with susceptibility to systemic, but not organ-specific, autoimmunity.
      variation in copy numbers for β-defensin (DEFB4) genes predisposes to Crohn’s disease of the colon
      • Fellermann K.
      • Stange D.E.
      • Schaeffeler E.
      • et al.
      A chromosome 8 gene-cluster polymorphism with low human beta-defensin 2 gene copy number predisposes to Crohn’s disease of the colon.
      and psoriasis,
      • Hollox E.J.
      • Huffmeier U.
      • Zeeuwen P.L.
      • et al.
      Psoriasis is associated with increased beta-defensin genomic copy number.
      and susceptibility to HIV infection
      • Gonzalez E.
      • Kulkarni H.
      • Bolivar H.
      • et al.
      The influence of CCL3L1 gene-containing segmental duplications on HIV-1/AIDS susceptibility.
      and rheumatoid arthritis
      • McKinney C.
      • Merriman M.E.
      • Chapman P.T.
      • et al.
      Evidence for an influence of chemokine ligand 3-like 1 (CCL3L1) gene copy number on susceptibility to rheumatoid arthritis.
      is related to copy number variation of the chemokine CCL3L1 gene. Many of the structural variants are covered poorly by linkage disequilibrium–based methods of association using SNPs.
      • Sebat J.
      Major changes in our DNA lead to major changes in our thinking.
      • Estivill X.
      • Armengol L.
      Copy number variants and common disorders: filling the gaps and exploring complexity in genome-wide association studies.
      Notably, the regions of copy number variation associated with the diseases mentioned above were not detected in the published GWAS due to unsatisfactory coverage by the arrays used.
      • Estivill X.
      • Armengol L.
      Copy number variants and common disorders: filling the gaps and exploring complexity in genome-wide association studies.
      Thus, the impact of structural DNA variants on complex disorders, including celiac disease, has not been properly addressed in the studies performed to date. This is consistent with the fact that the genes detected in most published GWAS, as in the case of celiac disease, can explain only a small proportion of the variance in disease risk. New chapters in the book of celiac disease genetics are to be opened. Exciting days lie ahead of us!

      References

        • Sollid L.M.
        Coeliac disease: dissecting a complex inflammatory disorder.
        Nat Rev Immunol. 2002; 2: 647-655
        • Kagnoff M.F.
        Celiac disease: pathogenesis of a model immunogenetic disease.
        J Clin Invest. 2007; 117: 41-49
        • Greco L.
        • Romino R.
        • Coto I.
        • et al.
        The first large population based twin study of coeliac disease.
        Gut. 2002; 50: 624-628
        • Falchuk Z.M.
        • Rogentine G.N.
        • Strober W.
        Predominance of histocompatibility antigen HL-A8 in patients with gluten-sensitive enteropathy.
        J Clin Invest. 1972; 51: 1602-1605
        • Stokes P.L.
        • Asquith P.
        • Holmes G.K.
        • et al.
        Histocompatibility antigens associated with adult coeliac disease.
        Lancet. 1972; 2: 162-164
        • Sollid L.M.
        • Markussen G.
        • Ek J.
        • et al.
        Evidence for a primary association of celiac disease to a particular HLA-DQ α/β heterodimer.
        J Exp Med. 1989; 169: 345-350
        • Spurkland A.
        • Sollid L.M.
        • Polanco I.
        • et al.
        HLA-DR and -DQ genotypes of celiac disease patients serologically typed to be non-DR3 or non-DR5/7.
        Hum Immunol. 1992; 35: 188-192
        • Greco L.
        • Corazza G.
        • Babron M.C.
        • et al.
        Genome search in celiac disease.
        Am J Hum Genet. 1998; 62: 669-675
        • Amundsen S.S.
        • Adamovic S.
        • Hellqvist A.
        • et al.
        A comprehensive screen for SNP associations on chromosome region 5q31-33 in Swedish/Norwegian celiac disease families.
        Eur J Hum Genet. 2007; 15: 980-987
        • Monsuur A.J.
        • de Bakker P.I.
        • Alizadeh B.Z.
        • et al.
        Myosin IXB variant increases the risk of celiac disease and points toward a primary intestinal barrier defect.
        Nat Genet. 2005; 37: 1341-1344
        • Castellanos–Rubio A.
        • Martin–Pagola A.
        • Santín I.
        • et al.
        Combined functional and positional gene information for the identification of susceptibility variants in celiac disease.
        Gastroenterology. 2008; 134: 738-746
        • van Heel D.A.
        • Franke L.
        • Hunt K.A.
        • et al.
        A genome-wide association study for celiac disease identifies risk variants in the region harboring IL2 and IL21.
        Nat Genet. 2007; 39: 827-829
        • Daly M.J. (on behalf of Crohn’s disease GWA Meta-analysis Working Group)
        Joint genome-wide analysis of 3200 Crohn’s disease patients documents more than 20 significant associations.
        57th Annual Meeting American Society of Human Genetics 2007. Abstract.
        • Korbel J.O.
        • Urban A.E.
        • Affourtit J.P.
        • et al.
        Paired-end mapping reveals extensive structural variation in the human genome.
        Science. 2007; 318: 420-426
        • Sebat J.
        Major changes in our DNA lead to major changes in our thinking.
        Nat Genet. 2007; 39: S3-S5
        • McCarroll S.A.
        • Altshuler D.M.
        Copy-number variation and association studies of human disease.
        Nat Genet. 2007; 39: S37-S42
        • Yang Y.
        • Chung E.K.
        • Wu Y.L.
        • et al.
        Gene copy-number variation and associated polymorphisms of complement component C4 in human systemic lupus erythematosus (SLE): low copy number is a risk factor for and high copy number is a protective factor against SLE susceptibility in European Americans.
        Am J Hum Genet. 2007; 80: 1037-1054
        • Fanciulli M.
        • Norsworthy P.J.
        • Petretto E.
        • et al.
        FCGR3B copy number variation is associated with susceptibility to systemic, but not organ-specific, autoimmunity.
        Nat Genet. 2007; 39: 721-723
        • Fellermann K.
        • Stange D.E.
        • Schaeffeler E.
        • et al.
        A chromosome 8 gene-cluster polymorphism with low human beta-defensin 2 gene copy number predisposes to Crohn’s disease of the colon.
        Am J Hum Genet. 2006; 79: 439-448
        • Hollox E.J.
        • Huffmeier U.
        • Zeeuwen P.L.
        • et al.
        Psoriasis is associated with increased beta-defensin genomic copy number.
        Nat Genet. 2008; 40: 23-25
        • Gonzalez E.
        • Kulkarni H.
        • Bolivar H.
        • et al.
        The influence of CCL3L1 gene-containing segmental duplications on HIV-1/AIDS susceptibility.
        Science. 2005; 307: 1434-1440
        • McKinney C.
        • Merriman M.E.
        • Chapman P.T.
        • et al.
        Evidence for an influence of chemokine ligand 3-like 1 (CCL3L1) gene copy number on susceptibility to rheumatoid arthritis.
        Ann Rheum Dis. 2007; (Jun 29 [Epub ahead of print]. doi:10.1136/ard.2007.075028)
        • Estivill X.
        • Armengol L.
        Copy number variants and common disorders: filling the gaps and exploring complexity in genome-wide association studies.
        PLoS Genet. 2007; 3: 1787-1799
