Alterations in Intestinal Microbiota of Children With Celiac Disease at Time of Diagnosis and on a Gluten-Free Diet

Please cite this article as: Zafeiropoulou K, Nichols B, Mackinder M, Biskou O, Rizou E, Karanikolou A, Clark C, Buchanan E, Cardigan T, Duncan H, Wands D, Russell J, Hansen R, Russell RK, McGrogan P, Edwards CA, Ijaz UZ, Gerasimidis K, Alterations in Intestinal Microbiota of Children With Celiac Disease at Time of Diagnosis and on a Gluten-Free Diet, Gastroenterology (2020), doi: https://doi.org/10.1053/ j.gastro.2020.08.007.


INTRODUCTION
Celiac disease (CD) is an autoimmune destruction of small intestinal villi triggered by ingestion of gluten in genetically susceptible individuals 1

. It causes nutrient malabsorption, leading to intestinal
and extra-intestinal symptoms 2 . Lifelong adherence to a gluten-free diet (GFD) is the only treatment.
The underlying pathogenesis of CD is multifactorial and while genetic predisposition occurs in 30-40% of the general population, only a small proportion of these individuals will develop CD, suggesting that environmental factors are at play 3 . This premise is supported by the rise in the incidence of CD over past decades, suggesting that changes in population genetics cannot explain this increase, nor can changes in gluten consumption 3 . Weaning practices, antibiotic exposure 4,5 , and viral gastrointestinal (GI) infections 6 have been implicated as risk factors for CD onset 7 . Recent evidence points to the involvement of the gut microbiota [8][9][10][11][12][13][14][15][16][17] ; driven by the rapid expansion of microbiota research.
Several non-communicable diseases, like inflammatory bowel disease, show increasing incidence similar to CD, and have been associated with distinct features of microbiota structure and function, also termed 'dysbiosis' 18 . There has been increasing interest in research exploring the role of the microbiota in CD. However, results remain inconclusive. Previous research was mostly of cross-sectional design providing a snapshot of the gut microbiota in CD and most importantly, did not determine whether a disturbed microbiota in feces of new-onset, untreated patients (UCD) is implicated in disease pathogenesis or is predominantly an epiphenomenon of the underlying disease, altered gastrointestinal (GI) motility and excessive substrate availability, owing to nutrient malabsorption and intestinal inflammation. Research has also described an altered microbiota in treated patients with CD (TCD) but did not explore the extent to which these signals were attributed to dietary modification and exclusions imposed during treatment with GFD. Studies describing the human fecal microbiota of CD using next-generation sequencing do not exist and previous research relied on characterization of selective microbial groups [8][9][10][11][12][13][14][15][16][17] .
This study characterized the gut microbiota of children with CD. We combined crosssectional and prospective patient cohorts and collected data on determinants of the microbiota likely to change in children with CD during GFD, or likely to differ when comparing CD with healthy controls (HC). For the first time in the literature, we profiled the fecal microbiota using 16S rRNA gene sequencing and measured metabolites likely to be influenced by adherence to GFD.
Additionally, we explored associations with dietary intake, GI symptoms and novel biomarkers of GFD compliance.
J o u r n a l P r e -p r o o f 7

Microbiota profiling
Genomic DNA was isolated using the chaotropic method within 3 months of sample collection 24 .
Sequencing of the V4 region of the 16S rRNA gene was performed with MiSeq (Illumina) using the Golay barcodes on the reverse strand 18,25 . Barcoded amplicons were purified using the Zymoclean Gel DNA Recovery Kit (D4001).

Fecal SCFA, lactate and sulfide
The short chain fatty acids (SCFA) acetic, propionic, butyric and valeric acids, and the branch chain fatty acids (BCFA) (isobutyric and isovaleric acids) were measured using gas chromatography (Agilent 7820A) with a DB-WAX UI column on diethyl ether extracts 23,25 . Nitrogen was the carrier gas. D-and L-lactate were measured using a commercial kit (D, L Lactic Acid, Roche, Cat No;11112821035) scaled-down for use in a 96-well plate. Free and total sulfide were measured colorimetrically 23 .

Bioinformatics analysis
Quality trimming was done using Sickle, applying a sliding window approach and trimming regions where average base quality (PHRED score) dropped below 20 26 . Assembly of paired reads was done using PANDAseq 27  taxonomically classified using QIIME with SILVA reference database (version 123). An approximatelymaximum-likelihood phylogenetic tree was produced using the 'ginsi' alignment algorithm in MAFFT 28 , followed by FastTree 29 .

Statistical analysis
General linear models on Box-Cox transformed data were used to compare groups, accounting for the paired design of the prospective cohort, and adjusted using Bonferroni correction. We first compared the fecal metabolite concentrations and microbiota between HC and siblings of the CD patients. In the absence of significant difference, we removed the sibling group from further analysis. Multivariate statistical analysis was performed in R (version 3.4.0) using the packages J o u r n a l P r e -p r o o f 8 vegan, phyloseq and DESeq2. Samples were rarefied to 5,000 reads before calculating α diversity, measured as species richness and Shannon index. Microbial composition was assessed using nonmetric multidimensional scaling plots (NMDS) at genus and OTU level based on Bray-Curtis dissimilarity indices and unweighted UniFrac distances. The former considers bacterial taxon abundance, while the latter considers phylogenetic distances between bacterial taxa through presence/absence, regardless of proportional representation. Permutation ANOVA was applied using the vegan Adonis function on distance matrices (Bray-Curtis/ unweighted UniFrac) with data stratified by subject to allow for repeated sampling from UCD participants during follow-up. Local contribution to β-diversity (LCBD) analysis was performed to measure the contribution of each sample to the total OTU β diversity; samples with high LCBD represent samples that are markedly different from the average β diversity of all study samples. Differences in OTU abundances between groups were found using the DESeq2 method, with participant ID included as a variable in the input formula for paired data. For correlations Kendall rank correlation was used. Benjamini-Hochberg correction was applied to cases of multiple testing. Analysis using the Bioenv function in vegan produced subsets of OTUs whose Euclidean distance matrices correlate maximally with the Bray-Curtis dissimilarity matrices derived from complete OTU tables, thus indicating major determinants of community structure.
Random forest analysis used OTUs that significantly differed in abundance between HC versus both UCD and TCD patients inclusive. Models were generated using Log-proportional abundances via the randomForest R package, with 10,000 decision trees used per model. Model performance was assessed using the rf.significance function in the rfUtilities R package 30 and ROC analysis using the ROCR R package 31 .

Ethical considerations
The study was approved by the West of Scotland Research Ethics Committee (Ref: 11 All UCD children were recommended to follow a GFD. From the 20 UCD patients with baseline fecal samples, 13 (65%) provided follow-up samples at 6 and 12 months after GFD initiation (prospective cohort); 4 (20%) patients were lost at follow-up and 3 (15%) others provided paired samples at 12 months only but were subsequently removed to avoid bias in statistical analysis. There was no difference in age (p=0.27), gender (p=0.998) or BMI z-scores (p=0.63) as well as in α and β diversity of the baseline microbiota between the 13 patients with follow-up samples and the 7 others who did not provide all samples. In total, 167 fecal samples were collected across all groups.
After commencement of a GFD, tTG-IgA titer decreased in the UCD children ( Table 1). The UCD group experienced more GI problems than TCD and HC. The mean PedsQL-GS score was also lower in TCD than HC, suggesting that TCD children had more GI symptoms than HC. In the UCD group, GI symptoms improved only 12 months post-diagnosis.
Thirty-eight of 45 TCD children (84%) had undetectable GIP, indicating at least recent compliance with GFD; the remaining 7 (16%) TCD children had detectable levels indicating either transgression from GFD recommendations or accidental exposure to gluten (Table 1). Compared to baseline, GIP concentration was almost 13 times lower at 6 months and 6 times lower at 12 months on GFD. At 6 and 12 months after recommendation to adhere to a GFD, 2 (15%) and 3 out of 13 (23%) UCD patients had detectable GIP.

Fecal microbiota profiling
There was no difference in α-diversity between TCD, UCD and HC groups, neither at OTU nor genus level (Figure 1a & Supplementary Table 1). In contrast, 2.8% (p=0.025) and 2.5% (p=0.027) of the variation in OTU community structure (β-diversity) for the Bray-Curtis dissimilarity index and unweighted UniFrac distance analyses, respectively, were explained by participant grouping. TCD clustered separately to HC and tended to do so with respect to UCD (Supplementary Table 1), J o u r n a l P r e -p r o o f 10 suggesting a significant effect of GFD on microbiota structure. Similar findings were observed at genus level. Gender did not influence this effect. No separation in community structure was seen between the HC and UCD groups, suggesting an absence of profound dysbiosis at disease onset.
LCBD analysis confirmed that the microbiota structure of TCD individuals differed from that of UCD and HC, with no difference seen between the latter 2 groups (Figure 1a).
Using the Bioenv workflow, a subset of 13 OTUs maintained the group clustering, explaining 92.3% of the variance described by the complete OTU dataset (Supplementary Table 2). When the Bioenv analysis was repeated for pairings of UCD vs TCD and HC vs TCD, separately, subsets of 14 and 12 OTUs were retrieved, explaining 92.0% and 91.4% of the variance described by the complete datasets, respectively. Of note, 9 OTUs featured in both pairwise comparisons, suggesting their strong influence on the fecal microbiota structure of children with CD on a GFD. Compared to disease diagnosis, there was no difference in β-diversity of the 13 UCD children at 6 and 12 months after initiation of GFD ( Figure 1b).

Differential analysis in OTU abundance between new-onset, untreated CD and HC
The UCD and HC groups were characterized by a total of 1,033 distinct OTUs. Thirty-one OTUs (3%) differed significantly between the 2 groups; all of which had a significantly lower abundance in UCD than HC ( Figure 2). Of these 31 discriminatory OTUs, only the abundance of OTU_1054 Alistipes correlated positively with PedsQL-GS score; thus suggesting that the remaining 30 discriminatory OTUs were less likely to be explained by differences in GI symptoms between the 2 groups.

Differential analysis in OTU abundance between patients on recommendation to a GFD with untreated CD or HC
Next, we explored the effect GFD might have on the CD microbiota. First, we looked for differences in OTU abundances between UCD and TCD. Fifty-one of 1,082 OTUs (5%) differed significantly in abundance between the 2 groups (Figure 2 & Supplementary Table 3). Forty-eight OTUs (94%) had significantly higher abundance in TCD than UCD apart from OTU_31 Megamonas, OTU_143 Ruminococcus 1, and OTU_135 Holdemanella.
Likewise, 29 of 1,082 OTUs (3%) had different abundance between HC and TCD. Almost half (n=13, 45%) were increased in TCD compared with HC ( Figure 2). Of the 13 OTUs more abundant in TCD than HC, 10 (77%) were significantly higher in TCD than UCD too, suggesting that treatment with GFD influences these taxa independently of disease status. Of the 16 OTUs with lower relative J o u r n a l P r e -p r o o f abundance in TCD than HC, OTU_31 Megamonas, OTU_143 Ruminococcus 1, and OTU_135 Holdemanella had significantly lower abundance in TCD than UCD ( Figure 2). Of note, these 3 OTUs were the only ones with lower abundance in TCD than UCD, strongly suggesting that their modulation is the consequence of treatment with GFD.
A celiac disease-specific microbiota signature Irrespective of treatment with GFD, the relative abundance of 11 OTUs were consistently lower in children with CD than HC ( Figure 2, Supplementary  (Figure 3). The 2 most influential OTUs were OTU_53 Clostridium sensu stricto 1 followed by OTU_143 Ruminococcus.

Discriminant analysis in OTU abundance in new-onset CD following recommendation to GFD
In the prospective cohort of UCD patients followed-up at 6 and 12 months on GFD, fecal samples were characterized by a total of 835 OTUs; 31 (3.7%) and 12 (1.4%) of which differed significantly 6 and 12 months after initiation of GFD, respectively ( Figure 4). Compared to CD diagnosis, at both 6 and 12 months on GFD, the relative abundance of 7 and 3 OTUs significantly decreased and increased, respectively. It is noteworthy that in this prospective cohort the mean effect size of GFD on OTU abundance ( Figure 4) was more pronounced than the magnitude of OTU abundance difference between TCD and UCD ( Figure 2).

Effect of dietary nutrients and food groups on the gut microbiota of CD
The effect dietary modifications, during treatment with GFD, might have on microbiota was explored using stepwise data analysis. First, we correlated the intake of macronutrients (e.g. carbohydrates) and food groups (e.g. Of the 200 OTUs which associated with either the macronutrient or food group intake of HC, 39 OTUs were differentially abundant between UCD and TCD children ( Figure 2) suggesting that differences in the abundance of 39 of the 51 (76%) discriminatory OTUs between UCD and TCD are likely to be explained by changes in dietary nutrient intake after GFD recommendation. Likewise, 23 of the 200 OTUs were significantly differentially abundant between HC and TCD children ( Figure 2). Therefore, differences in abundance of 23 of the 29 (79%) OTUs that discriminated between HC and TCD are likely to be explained by changes in dietary intake after initiation of GFD.
In the prospective cohort, from the 31 and 12 OTUs whose relative abundance changed at 6 and 12 months of treatment with GFD, 29 (94%) and 11 (82%) correlated with macronutrient or food group intake in HC (Figure 4), further supporting the hypothesis that the gut microbiota of treated CD patients differs to HC predominantly as the result of dietary modification during GFD.

Differential analysis in OTU abundance between patients with and without recent consumption of gluten
In pooled analysis (cross-sectional and prospective cohorts together), 12 (17%) children with recommended adherence to a GFD had detectable and 59 (83%) undetectable GIP. When we looked for differences in OTU abundance between these 2 groups, 89 OTUs differed (Supplementary Figure   4). Among these, all but 2 (OTU_99 Senegalimassilia and OTU_239 Clostridiales vadinBB60 group) OTUs were higher in children with undetectable fecal GIP.
For a few discriminatory OTUs between UCD and/or HC with TCD the direction of their change in abundance differed to that of those OTUs which discriminated between patients with and without recent gluten consumption. 19 out of the 87 OTUs (22%) with higher abundance in children without recent gluten ingestion were significantly increased in TCD compared with UCD, with 8 of them (9%) also significantly increased in TCD over HC (Supplementary Figure 4). Similarly, 1 of the 2 OTUs (50%) with lower abundance in children without recent gluten ingestion was also significantly lower in TCD than in HC.  Table 1). The microbiota structure of the unaffected siblings did not differ to HC either. Fifty-six of 964 OTUs (6%) were differentially abundant between TCD and their unaffected siblings, with 36 (64%) significantly decreased in the TCD group (Supplementary Table 5).

Diet-related microbiota metabolites
There was no difference between the unaffected siblings of TCD children and HC in all bacterial metabolites assayed, water content and pH in feces ( Table 2). No difference was found also in fecal water content and the absolute concentration of SCFA/BCFA among the UCD, TCD and HC. However, the relative abundance (%) of acetic acid was higher, and that of butyric and valeric acids were lower in TCD than HC ( Figure 5a).
In the prospective group of UCD patients the absolute concentrations of butyric acid (p=0.053) and the 2 BCFA (isovaleric acid, p=0.052, isobutyric acid, p=0.063) non-significantly decreased after GFD initiation ( Table 2). The effect of GFD on SCFA production was reflected also in their proportional profile ( Figure 5b). Compared with disease diagnosis, the relative abundance of acetic acid increased and the relative abundances of propionic, butyric and valeric acids decreased at 6 and/or 12 months on GFD, mirroring the observations in the cross-sectional cohort and between the TCD children and HC or UCD.
Samples from TCD children had lower ammonia concentration than HC or UCD, and patients with UCD had significantly less free sulfide than HC (Table 2). Mean fecal L-lactic acid concentration was significantly lower in UCD than TCD and HC, but its D-isomer was higher in UCD than TCD.
During the follow-up of the 13 UCD children, a non-significant (p=0.067) decrease in ammonia levels and a corresponding increase in free sulfide (p=0.074) and L-lactate (p=0.087) concentrations were observed.
There was no difference in bacterial metabolites, except for fecal ammonia which was significantly lower in patients who had undetectable GIP than those who had consumed gluten (Supplementary Table 6).
J o u r n a l P r e -p r o o f 14

DISCUSSION
It is still unclear the extent to which an altered microbiota observed in previous research [8][9][10][11][12][13][14][15][16][17] is involved in CD pathogenesis or if these are secondary effects of disease pathology, including increased epithelial cell turnover and nutrient malabsorption. Following diagnosis, adherence to GFD may associate with a decreased intake of non-digestible carbohydrates from cereals, thus affecting fiber fermenting species and colonic production of SCFA 32 . This is important as CD patients may be unable to compensate for decreased fiber intake from gluten-containing foods by increasing its intake from other sources, including fruits and vegetables. Here, by performing a data-rich study we tried to discern which microbial signals in patients with new-onset and treated CD are potentially involved in disease pathogenesis and which are secondary disease effects, including from treatment.
Even though we identified differences in the abundance of a few species between treatment-naïve UCD patients and HC, the profound microbial dysbiosis noted in Crohn's disease was not observed, at least using crude diversity indices 18 . Instead, significant effects were observed in TCD patients after recommendation to a GFD, confirming our a priori hypothesis. More importantly, we identified three major groups of bacterial taxa ( Figure 2); one group which is CDspecific and non-responsive to treatment with GFD; a second group which is associated with newonset CD but which is also treatment responsive; and a third one which is treatment dependent but does not differentiate between disease and the health state. Of these, the first cluster represents the microbial signature of CD which can distinguish children with from those without CD with a reasonably high likelihood, as demonstrated using machine learning algorithms. The magnitude of microbial alterations observed here are similar to other non-communicable diseases, in which the microbiota has been implicated in their underlying pathogenesis, such as type 1 diabetes 33 .
Although in the second group several other bacteria were different between UCD and the HC, these discriminatory microbial signals vanished following treatment with GFD. These represent bacteria which responded to recovery of gut pathology following treatment with GFD, or bacteria whose underlying role in CD cause and treatment might be important. The role of these taxa, the majority of which belong to Bacteroidetes warrants further research. Using denaturing gradient gel electrophoresis with group-specific primers Sánchez showed also that Bacteroides diversity was higher in duodenal biopsies from controls than in samples from patients with active and treated CD 15 .
The third and largest group should almost certainly represent microbial noise attributed to dietary modification during treatment with GFD and amelioration of disease activity. This speculation is supported by the fact that several differential OTUs between the TCD with the UCD J o u r n a l P r e -p r o o f 15 children or HC overlapped; most of these discriminatory species were associated with participants' diet, as well as by the observation that in patients on GFD almost 100 OTUs differed between patients with positive and negative gluten contents in their feces. As a prime example of these effects, avoidance of wheat products likely explains the decrease of Megamonas in TCD patients 34 and the same may apply for the fiber fermenters Coprococcus, Ruminococcus and Anaerostipes, and Bifidobacterium. The reduction in Bifidobacterium in CD is in accordance with previous research using molecular fingerprinting techniques 17  The role of Bifidobacterium in the underlying microbial origins of CD pathogenesis has received extensive attention within mechanistic research. Inoculation of peripheral blood mononuclear cells with feces from active and asymptomatic CD patients increased TNF-α production and CD86 expression, while decreased IL-10 cytokine production and CD4 expression compared with samples from HC but specific Bifidobacterium strains suppressed this Th1 pro-inflammatory milieu, characteristic of CD 37 . In a subsequent study of the same group, addition of Bifidobacterium strains changed the gliadin-derived peptide pattern and attenuated production of TNF-α and IL-1β and expression of NF-κB and chemokine CXCR3 receptor from Caco-2 cells exposed to gliadin digestions 38 . These in-vitro data were replicated in gliadin-induced enteropathy murine models sensitized with interferon-γ where Bifidobacterium longum CECT 7347 attenuated the production of TNF-α and the CD4 mediated immune response and increased the tissue mRNA levels of NFκB and IL-10 39 .
The exact mechanism by which Bifidobacteria may exhibit immune-modulating properties is not yet clear but it has been demonstrated that Bifidobacterium longum NCC2705 produces a serine protease inhibitor which attenuates gliadin-induced immunopathology and impacts on intestinal microbial composition in the NOD/DQ8 mouse model of gluten sensitivity 40  Irrespective of their primary role in CD pathogenesis, the disease-specific microbial signature identified here might be used as another adjutant, non-invasive biomarker to screen for CD. The observation that the abundance of fiber fermenters or cross-feeders and production of butyric acid diminish in patients on GFD has implications for the dietary management of this population. Dietary fiber intake in the westernized diet is low and adherence to GFD with low consumption of cereals may decrease patient intake even further. It is therefore important for colonic health and gut motility to promote intake of non-gluten containing sources of fiber in this population and routinely fortify gluten-free products with a broad variety of fibers (e.g. pectin, ispaghula).
Limitations of the current study include the modest sample size of the prospective group.
Although the mean effect size of microbial changes were more pronounced than the cross-sectional group, we may have been underpowered to identify smaller size differences. Some patients from the UCD group were lost at follow-up or their measurements were excluded. However, this group of patients did not differ in characteristics and microbiota features from patients who were retained in the analysis. Also, CD is a condition of the small bowel; hence the role of the fecal microbiota may be considered less relevant to its pathogenesis. While this is a fair argument to propose, it is possible that events in the large bowel influence disease pathogenesis upstream along the GI tract, as is perhaps the case in Crohn's disease where colonic microbiota changes can be seen in patients with disease affecting their small intestine. It is also possible that several fecal microbes are markers of the small bowel resident community. Collado et al previously showed that similar bacteria were related to CD in both fecal and duodenal biopsies 47 ; but further research is required to clarify the role of each of these gut niches and in the mucosal adherent microbiota as suggested 9,48 . In HC we       J o u r n a l P r e -p r o o f   What you need to know: Background and Context: It is not clear whether alterations in the intestinal microbiota of children with celiac disease cause the disease or are a result of disease and/or its treatment with gluten-free diet (GFD).

New Findings:
Although several alterations in the intestinal microbiota of children with established celiac disease appear to be effects of a GFD, there are specific bacteria that are distinct biomarkers of celiac disease.
Limitations: It is not clear whether the microbes identified contribute to pathogenesis of celiac disease or are the result of it.

Impact:
The GFD alters the intestinal microbiota, but in patients with celiac disease, there are additional differences, compared with healthy children.
Lay Summary: Children with celiac disease have differences in composition of intestinal microbes compared to healthy children. Some of these differences are caused by a gluten-free diet, but studies are needed to determine whether the other changes are a cause or a result of celiac disease.    paired data from 13 CD children at diagnosis and at six and 12 months after the initiation of GFD, and (b) from cross-sectional data from UCD patients (n=20) and TCD patients (n=45)         Values expressed as mean (SEM); * Mann-Whitney non-parametric test; gluten ingestion indicated by a faecal GIP concentration ≥ 0.156μg/g