microbiome data IBD classification Discussion

WHY STUDY THE MICROBIOME?
- Lean/obese mouse studies suggest that gut microbiota affects energy balance
- Microbiota diversity is reduced by antibiotic therapy, leading to pathogenic infections
  - antibiotic-associated diarrhea, salmonellosis, C. diff colitis
- Implicated in autoimmune diseases
  - IBD, diabetes

BALANCE BETWEEN IMMUNITY AND GUT MICROBIOME
- The immune system is one of the main determinants of associated microbial diversity
- Innate
  - Physical barriers keep microbes from reaching the epithelium
  - APCs trigger inflammation to reduce the bacterial load
- Adaptive
  - B cells secrete polyreactive and antigen-specific IgA
  - T cells mediate killing of specific microorganisms
- Microbiota influences both innate and adaptive immunity

DYSBIOSIS
[Figure: a | Immunological equilibrium; b | Immunological dysequilibrium. Labels: symbionts, commensals, pathobionts, pathogens; balance between regulation and inflammation.]
Figure 1 | Immunological dysregulation associated with dysbiosis of the microbiota. a | A healthy microbiota contains a balanced composition of many classes of bacteria. Symbionts are organisms with known health-promoting functions. Commensals are permanent residents of this complex ecosystem and provide no benefit or detriment to the host (at least to our knowledge).
Round et al. Nature Reviews Immunology (2009).

DYSBIOSIS /2
[Figure: causes of dysbiosis. Host genetics: mutations in NOD2, IL23R, ATG16L and IRGM. Lifestyle: diet, stress. Early colonization: birth in hospitals, altered exposure to microbes. Medical practices: vaccination, antibiotic use, hygiene. Outcomes: health (T Reg cells) versus disease (TH1, TH2 and TH17 cells).]
Figure 3 | Proposed causes of dysbiosis of the microbiota. We propose that the composition of the microbiota can shape a healthy immune response or predispose to disease. Many factors can contribute to dysbiosis, including host genetics, lifestyle, exposure to microorganisms and medical practices. Host genetics can potentially influence dysbiosis in many ways. An individual with mutations in genes involved in immune regulatory mechanisms or pro-inflammatory pathways could lead to unrestrained inflammation in the intestine. It is possible that inflammation alone influences the composition of the microbiota, skewing it in favour of pathobionts. Alternatively, a host could 'select' or exclude the colonization of particular organisms. This selection can be either active (as would be the case of an organism recognizing a particular receptor on the host) or passive (the host environment is more conducive to [...]
Round et al. Nature Reviews Immunology (2009).

IMMUNITY AND MICROBIOTA ARE DEEPLY INTERLINKED
- Microbiota is required for the proper development of immune responses
- Microbial influence on immunity is rarely exerted in isolation
- We need more systems-level data for all the players involved, measuring many variables at high resolution

HIGH-THROUGHPUT MAPPING OF ANTIBODY RESPONSE
- Profiling of immune responses traditionally relies on cell sorting or serum measurements
- No data on the secretions of single lymphocytes
  - quantity and timing of secreted cytokines?
  - antibody affinities?

MICROENGRAVING METHOD
[Figure: microengraving workflow. Labels: PDMS; culture dish; glass slides coated with capture Ab; secreted Ab is captured; microengraving; glass slides with replicated microarrays of Ab.]
[Figure: spot design. Antigen-specific spot: OVA (variable concentration, green). Non-specific spot: anti-mouse Ig (10 nM, red). Capture: anti-mouse Ig.]
[Figure: single-cell readouts. Row labels: microwell, isotype, affinity. Other labels: DNA, IgM, anti-B220, anti-IgM, [OVA] (10 pM, 100 pM, 1 nM, 10 nM, 100 nM), [IgG].]

SUMMARY
- Quantitative profiles that detail the cellular origin, extent and diversity of the B cell response
- Flow cytometry and immunosorbent assay data are correlated for each single cell
- Expandable to cytokine profiling, T cell profiling, primary splenocytes
- Allows cell retrieval

SINGLE LYMPHOCYTE STIMULATION
- Difficult to expose single naive lymphocytes to controlled stimuli (e.g. bacteria)
- Capture of antigen by B cells is critical for the antibody response
  - studied by biochemical and imaging methods
  - early dynamics? quantitative?

SINGLE LYMPHOCYTE “PULSE-CHASE” EXPERIMENTS
[Figure: flow device schematic. Labels: media, flow, region of observation, Pulse #1, Pulse #2.]
[Figure: naive B cell exposed to Pulse #1 (anti-IgM 568), then Pulse #2 (anti-IgM 647); concentration C(t) versus time.]
[Figure, panel c: imposed concentration, theory versus experiment; C eff / C (0 to 1) versus time (0 to 60 s).]
[Figure, panel d: fluorescence intensity (a.u.) versus time (0 to 300 s) for Tfn 568 and Tfn 647.]

SUMMARY
- Observe response to sequential doses of ligands in primary naïve B cells
- Measure early dynamics of labeled B cell receptors
- Expandable throughput with chip redesign

1. MACHINE LEARNING APPLIED TO MICROBIOME DATA
- Environmental shotgun 16S rRNA sequencing allows mapping of bacterial composition
  - 16S rRNA phylogeny is a good approximation of microbial distribution (von Mering 2007)
  - Gene content and phylogeny correlate well (Mueller 2011)
- Microbial compositional data is large and increasingly difficult to mine
  - Cheaper sequencing means that analysis is becoming the limiting step
- Needed: routine extraction of patterns in microbial data

WHAT IS MACHINE LEARNING?
- Machine learning algorithms use example data to
  - learn and discover structure in datasets
  - classify samples into distinct categories
  - predict new samples, once trained on example data
- Machine learning algorithms are the object of extensive research
  - applications in computing, finance, biology, etc.

MICROBIOME DATA
- Each microbial taxon is a feature that can be used to discriminate between bacterial communities
- We want to:
  - automatically find the taxa that discriminate best
  - accurately classify communities according to metadata

Taxa   Sample1   Sample2   ...
A      12        2
B      1         10
C      5         0
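
A minimal sketch of how the taxa table on this slide can be turned into one feature vector per sample. The counts are the toy values from the slide; the function name `to_feature_vectors` and the choice to normalise counts to relative abundances are assumptions made for illustration, not something the slides specify.

```python
# Toy OTU table copied from the slide: rows are taxa, columns are samples.
otu_table = {
    "A": {"Sample1": 12, "Sample2": 2},
    "B": {"Sample1": 1, "Sample2": 10},
    "C": {"Sample1": 5, "Sample2": 0},
}


def to_feature_vectors(table):
    """Return (taxa, {sample: [relative abundance per taxon]})."""
    taxa = sorted(table)
    samples = sorted({s for counts in table.values() for s in counts})
    vectors = {}
    for sample in samples:
        counts = [table[t].get(sample, 0) for t in taxa]
        total = sum(counts)
        # Assumption: normalise by sequencing depth so samples are comparable.
        vectors[sample] = [c / total if total else 0.0 for c in counts]
    return taxa, vectors


taxa, features = to_feature_vectors(otu_table)
print(taxa)                 # ['A', 'B', 'C']
print(features["Sample2"])  # one value per taxon, summing to 1
```

Each sample becomes a row of numbers with one column per taxon, which is the shape a classifier expects.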

PREPARING MICROBIOME DATA
[Figure: processing pipeline. Steps shown: 16S DNA sequence reads (e.g. AGCTGCTCGA, TAAGCTGCTCGA, AGCTGCTCGATTCTG); quality filtering and chimera check (MOTHUR); OTU clustering (UCLUST); representative sequences; RDP classification. Outputs: OTU table and phylogenetic tree.]

OTU table:
Taxa   Sample1   Sample2   ...
A      12        2
B      1         10
C      5         0

LEARNING AND CLASSIFICATION

Taxa       A    B    C    D
Sample1    12   1    5    0
Sample2    2    21   5    10
Sample3    12   11   3    2
Sample4    1    2    0    15

- Split the samples into a training set and a test set (the slide shows two different splits of the same table)
- Build a random forest model on the training set
- Use the model to
  - predict the test set: read taxa abundance from the test set, then check the prediction
  - or ask: what are the best taxa?
- Repeat the cross-validation and average
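
The train / predict / check / repeat loop above can be sketched as repeated k-fold cross-validation. This is an illustration only: the sample counts and labels are made up, and a nearest-centroid classifier stands in for the random forest so that the block stays short and self-contained. All function names are hypothetical.

```python
import random


def k_fold_indices(n, k, rng):
    """Shuffle sample indices and deal them into k folds of near-equal size."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_centroids(rows, labels):
    """Stand-in classifier (not a random forest): mean abundance per class."""
    sums, counts = {}, {}
    for row, y in zip(rows, labels):
        acc = sums.setdefault(y, [0.0] * len(row))
        for j, v in enumerate(row):
            acc[j] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict_centroid(model, row):
    """Return the class whose centroid is closest to the sample."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(model, key=lambda y: dist(model[y]))


def repeated_cv_accuracy(rows, labels, k=3, repeats=10, seed=0):
    """Average test-set accuracy over `repeats` rounds of k-fold CV."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(repeats):
        for test_idx in k_fold_indices(len(rows), k, rng):
            held_out = set(test_idx)
            train = [i for i in range(len(rows)) if i not in held_out]
            model = fit_centroids([rows[i] for i in train],
                                  [labels[i] for i in train])
            hits = sum(predict_centroid(model, rows[i]) == labels[i]
                       for i in test_idx)
            accuracies.append(hits / len(test_idx))
    return sum(accuracies) / len(accuracies)


# Hypothetical, well-separated toy data: columns are taxa A, B, C, D.
rows = [[12, 1, 9, 0], [11, 2, 8, 1], [13, 0, 10, 2],
        [2, 21, 1, 10], [1, 18, 0, 15], [3, 15, 2, 12]]
labels = ["healthy", "healthy", "healthy", "IBD", "IBD", "IBD"]
mean_accuracy = repeated_cv_accuracy(rows, labels)
print(mean_accuracy)
```

The test samples are never seen during fitting, so the averaged accuracy estimates how the model would do on new samples. Replacing `fit_centroids` and `predict_centroid` with a random forest would match the slide more closely.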

RANDOM FOREST

Taxa   A    B    C    D
       12   1    5    0
       2    21   5    10
       12   11   3    2

- Pick a random sample (e.g. the row 2, 21, 5, 10)
- Pick mtry taxa at random (e.g. A, B)
- Build a decision tree (e.g. splits A > 10, B < 11)
- Repeat Ntree times
- Average / take votes
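
The steps above can be sketched as a small from-scratch random forest. This is a simplified illustration under stated assumptions, not the implementation used for the results in this talk: it draws a bootstrap sample per tree, picks `mtry` taxa at random at every split, scores splits with Gini impurity, and takes a majority vote. The toy data are hypothetical and deliberately easy to separate.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels, features):
    """Best (score, feature, threshold) among `features`, or None."""
    best = None
    for f in features:
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint between consecutive observed values
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best


def build_tree(rows, labels, mtry, rng):
    """Grow a tree; leaves are labels (strings), nodes are tuples."""
    if len(set(labels)) == 1:
        return labels[0]
    features = rng.sample(range(len(rows[0])), mtry)  # pick mtry taxa at random
    split = best_split(rows, labels, features)
    if split is None:  # chosen taxa are constant here: majority-vote leaf
        return Counter(labels).most_common(1)[0][0]
    _, f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [y for _, y in left], mtry, rng),
            build_tree([r for r, _ in right], [y for _, y in right], mtry, rng))


def predict_tree(tree, row):
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


def random_forest(rows, labels, ntree=50, mtry=2, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(ntree):
        idx = [rng.randrange(len(rows)) for _ in rows]  # bootstrap sample
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], mtry, rng))
    return forest


def predict_forest(forest, row):
    """Majority vote over all trees."""
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]


# Hypothetical toy data: columns are taxa A, B, C, D.
rows = [[12, 1, 9, 0], [11, 2, 8, 1], [13, 0, 10, 2],
        [2, 21, 1, 10], [1, 18, 0, 15], [3, 15, 2, 12]]
labels = ["healthy", "healthy", "healthy", "IBD", "IBD", "IBD"]
forest = random_forest(rows, labels)
print(predict_forest(forest, [12, 1, 9, 1]))
```

Because every tree sees a different bootstrap sample and a different random subset of taxa, the individual trees disagree on hard cases, and the vote smooths those disagreements out.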

INCORPORATING PHYLOGENETICS
[Figure: phylogenetic tree with five numbered decision nodes (1 to 5) next to a hierarchical decision tree. Node outcomes: present / absent. Levels: Healthy / Sick, then Healthy / Crohn's / Colitis. Classes: NM (normal), CD (Crohn's), UC (colitis). Bar plots per class on a 0 to 1 scale. p = 0.003.]
Hierarchical decision tree outlining the classification of a patient as normal, Crohn's or colitis, depending on whether sequences are present at the given nodes in the phylogenetic tree. Average accuracy is 80%. Decision tree nodes are colored with respect to the hierarchical level. Tree branches are colored based on diagnosis. Bacterial groups in a normal patient are colored green; magenta for Crohn's samples and cyan for colitis samples.
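
One way to read "sequences are present at the given nodes" is that a node of the phylogenetic tree counts as present when any OTU beneath it was observed in the sample. The sketch below computes such presence/absence features. The tree, the OTU names, and the function `node_presence` are hypothetical and only illustrate the idea.

```python
# Hypothetical phylogeny: internal nodes map to their children; leaves are OTUs.
tree = {
    "root": ["clade1", "clade2"],
    "clade1": ["otuA", "otuB"],
    "clade2": ["otuC", "otuD"],
}


def node_presence(tree, node, counts, out=None):
    """Mark a node present if any OTU beneath it has a non-zero count."""
    if out is None:
        out = {}
    children = tree.get(node)
    if children is None:  # leaf OTU
        out[node] = counts.get(node, 0) > 0
    else:
        # Visit every child (no short-circuit) so all nodes get recorded.
        flags = [node_presence(tree, child, counts, out)[child]
                 for child in children]
        out[node] = any(flags)
    return out


sample_counts = {"otuA": 0, "otuB": 0, "otuC": 5, "otuD": 0}
presence = node_presence(tree, "root", sample_counts)
print(presence["clade1"], presence["clade2"])  # False True
```

The resulting booleans, one per tree node, can then serve as features for a decision tree that asks "present or absent?" at each level.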

2. THE CASE OF IBD
- Inflammation of autoimmune origin
- Presenting symptoms: abdominal pain, diarrhea, vomiting, weight loss
- No known causative agent
- IBD seems to have a complex etiology:
  - environmental: smoking, Western diet?
  - genetic: autophagy loci (NOD2, ATG16L1)
  - microbial: correlated with some bacteria, dysbiosis?

IBD TREATMENT
- IBD alternates between flares (active) and periods of remission (inactive)
- Long-term immunosuppressants to maintain remission
- Antibiotic therapy is used empirically to treat flare-ups
- When medical therapy fails, treatment is bowel resection

CLASSIFICATION CAN DISTINGUISH IBD AND HEALTHY
[Figure: ROC curves, Sensitivity versus 1 - Specificity, both axes 0.0 to 1.0.]
- Frank et al. survey
  - Frank (AUC = 0.73)
  - Pediatric (AUC = 0.71)
- Pediatric case control
  - IBD (AUC = 0.83)
  - active IBD (AUC = 0.91)
- Area under the ROC curve: probability that a classifier will rank a randomly chosen positive instance higher than a randomly chosen negative one
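
The ranking definition of AUC given above can be computed directly by comparing every positive/negative pair. The sketch below does exactly that; counting ties as half a win is a common convention and an assumption here, and the scores are made-up numbers, not data from the study.

```python
def auc(scores, labels):
    """Fraction of (positive, negative) pairs where the positive scores higher.

    Ties count as half a win. Labels are 1 (positive) or 0 (negative).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores: higher means "more likely IBD".
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]
print(auc(scores, labels))  # 0.75: three of the four pairs are ranked correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for reading the values on the slide.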

SUMMARY
- Classification can distinguish healthy and IBD patients accurately
- Patients can be stratified according to activity
- Identified novel taxa associated with IBD and remission
- Validated blindly and by fecal calprotectin measurements
- Careful statistical design should be the first step in larger studies

HOW DOES IT FIT INTO IBD PRACTICE?
- Clinical feasibility will depend on shrinking cost of sequencing
[Figure: diagnostic workflow. Steps shown: primary care screen; gastroenterologist review; serology; physician assessment; diagnosis suspected?; endoscopy; definitive diagnosis. Added to the workflow: fecal biomarkers & microbiome mapping.]

MACHINE LEARNING DEVELOPMENTS
While working on this application, other uses of machine learning techniques for microbiome data appeared:
- Detect mislabeled sequence samples (Knights 2010)
- Track the source of microbial contamination (Knights 2011)
- Predict response to diet in gnotobiotic mice (Faith 2011)
- Wastewater bioreactors (Werner 2011)

MOVING FORWARD
- Adapt machine learning methods to use additional data
- Integrate microbiome tools and immune tools
- Augment microbiome datasets with immune variables

Figure 1. Processes for Microbial Signature Discovery. The process begins with the collection of a large set of sequencing data from various bacterial communities associated with different environments or different host phenotypes. These sequences can serve directly as input to a machine-learning algorithm, or they can be transformed through a preprocessing step (data transformation). Although for microbial community analysis data transformation and supervised learning are typically performed as separate steps, we suggest that predictive models will be improved by the development of novel machine-learning techniques that are informed by the potential data transformations. For example, constructing a good predictive model using metabolic characterizations of metagenomics sequences might be easier if the algorithm has knowledge of the hierarchical relationships between metabolic functions. In the case of marker-gene surveys, a machine-learning algorithm may benefit from knowledge of the phylogenetic relationships of the observed lineages, or the network of average nucleotide similarities between the input sequences. These structures may allow models to share statistical strength across related independent variables in cases where there is high variability within a given environment or host phenotype (i.e., lack of a "core microbiome").
Knights et al. Cell Host & Microbe (2011).