Ph.D defense talk

elipapa
May 16, 2012

High-throughput experimental and computational tools for exploring immunity and the microbiome. Thesis defense, MIT, April 2012.

Transcript

  1. HIGH-THROUGHPUT EXPERIMENTAL
    AND COMPUTATIONAL TOOLS FOR
    EXPLORING IMMUNITY AND THE
    MICROBIOME
    Eliseo Papa
    HST

  2. Introduction Mapping antibody responses Single lymphocyte stimulation Machine learning for microbiome data IBD classification Discussion
    WHY STUDY THE MICROBIOME?
    Studies of lean and obese mice suggest that the gut microbiota affects energy balance
    Antibiotic therapy reduces microbiota diversity, leading to pathogenic infections
    (antibiotic-associated diarrhea, salmonellosis, C. diff colitis)
    Implicated in autoimmune diseases
    (IBD, diabetes)

  3. BALANCE BETWEEN IMMUNITY AND GUT MICROBIOME
    The immune system is one of the main determinants of associated microbial diversity
    Innate
    Physical barriers keep microbes from reaching the epithelium
    APCs trigger inflammation to reduce the bacterial load
    Adaptive
    B cells secrete polyreactive and antigen-specific IgA
    T cells mediate killing of specific microorganisms
    Microbiota influences both innate and adaptive immunity

  4. DYSBIOSIS
    [Figure 1 of the cited review: Immunological dysregulation associated with dysbiosis of the microbiota. Panel a: immunological equilibrium; panel b: immunological dysequilibrium. Labels: symbionts, commensals, pathobionts, pathogens; regulation vs. inflammation.]
    Caption excerpt: A healthy microbiota contains a balanced composition of many classes of bacteria. Symbionts are organisms with known health-promoting functions.
    Round et al. Nature Reviews Immunology (2009).

  5. DYSBIOSIS
    [Figure 1 of the cited review, enlarged: Immunological dysregulation associated with dysbiosis of the microbiota. Panel a: immunological equilibrium; panel b: immunological dysequilibrium. Labels: symbionts, commensals, pathobionts, pathogens; regulation vs. inflammation.]
    Caption excerpt: A healthy microbiota contains a balanced composition of many classes of bacteria. Symbionts are organisms with known health-promoting functions. Commensals are permanent residents of this complex ecosystem and provide no benefit or detriment to the host (at least to our knowledge).
    Round et al. Nature Reviews Immunology (2009).

  6. DYSBIOSIS /2
    [Figure 3 of the cited review, diagram labels]
    Host genetics: mutations in NOD2, IL23R, ATG16L and IRGM
    Lifestyle: diet, stress
    Early colonization: birth in hospitals, altered exposure to microbes
    Medical practices: vaccination use, antibiotics, hygiene
    Health: TReg cells
    Disease: TH1, TH2 and TH17 cells
    Figure 3 | Proposed causes of dysbiosis of the microbiota. We propose that the
    composition of the microbiota can shape a healthy immune response or predispose
    to disease. Many factors can contribute to dysbiosis, including host genetics, lifestyle,
    exposure to microorganisms and medical practices. Host genetics can potentially
    influence dysbiosis in many ways. An individual with mutations in genes involved
    in immune regulatory mechanisms or pro-inflammatory pathways could lead to
    unrestrained inflammation in the intestine. It is possible that inflammation alone
    influences the composition of the microbiota, skewing it in favour of pathobionts.
    Alternatively, a host could ‘select’ or exclude the colonization of particular organisms.
    This selection can be either active (as would be the case of an organism recognizing a
    particular receptor on the host) or passive (the host environment is more conducive to
    Round et al. Nature Reviews Immunology (2009).

  7. IMMUNITY AND MICROBIOTA ARE DEEPLY INTERLINKED
    Microbiota is required for the proper development of immune responses
    Microbial influence on immunity is rarely exerted in isolation

  8. IMMUNITY AND MICROBIOTA ARE DEEPLY INTERLINKED
    Microbiota is required for the proper development of immune responses
    Microbial influence on immunity is rarely exerted in isolation
    We need more systems-level data for all the players involved, measuring many variables at high resolution

  9. IMMUNITY
    1. High-throughput mapping of B cell responses
    2. Stimulating single lymphocytes

  10. HIGH-THROUGHPUT MAPPING OF ANTIBODY RESPONSE
    Profiling of immune responses traditionally relies on cell sorting or serum measurements
    No data on the secretions of single lymphocytes
    Quantity and timing of secreted cytokines?
    Antibody affinities?

  11. MICROENGRAVING CHIP

  12. MICROENGRAVING CHIP / CLOSER LOOK

  13. MICROENGRAVING METHOD
    [Diagram labels: culture dish; PDMS; glass slides coated with capture Ab; microengraving; secreted Ab is captured; glass slides with replicated microarrays of Ab]

  14. MICROENGRAVING METHOD
    [Diagram labels: culture dish; PDMS; glass slides coated with capture Ab; microengraving; secreted Ab is captured; glass slides with replicated microarrays of Ab]
    [Spot labels: antigen-specific spot; non-specific spot; anti-mouse Ig; OVA (var. conc., green); anti-mouse Ig (10 nM, red)]

  15. MICROENGRAVING METHOD
    [Diagram labels: culture dish; PDMS; glass slides coated with capture Ab; microengraving; secreted Ab is captured; glass slides with replicated microarrays of Ab]
    [Spot labels: antigen-specific spot; non-specific spot; anti-mouse Ig; OVA (var. conc., green); anti-mouse Ig (10 nM, red)]
    [Figure: image panels across [OVA] (10 pM, 100 pM, 1 nM, 10 nM, 100 nM) and [IgG]; rows labeled affinity, isotype and microwell; labels: DNA, IgM, αB220, αIgM]

  16. HIGH-THROUGHPUT AFFINITY MEASUREMENTS
    [Figure: panel a, microwells and microarrays imaged across [OVA] and [IgG]; panel b, Ag/Ab ratio (0 to 2) vs. concentration (OVA, 0.01 to 100 nM), n = 3461, with K app marked; comparison of ELISA and microengraving]

  17. HIGH-THROUGHPUT AFFINITY MEASUREMENTS
    [Figure: panel a, microwells and microarrays imaged across [OVA] and [IgG]; panel b, Ag/Ab ratio (0 to 2) vs. concentration (OVA, 0.01 to 100 nM), n = 3461, with K app marked; comparison of ELISA and microengraving]
    [Figure: Kd (nM, 0 to 40) for 1x boost and 2x boost]
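The affinity plots above relate the Ag/Ab ratio to antigen concentration and report a Kd in nM. As a rough illustration only, the sketch below assumes a single-site binding isotherm, ratio = r_max * C / (C + Kd); the slides do not state which model was used. The function names `binding_ratio` and `fit_kd`, the grid-search fit and the synthetic data are all hypothetical.

```python
def binding_ratio(conc_nm, kd_nm, r_max=1.0):
    """Single-site binding isotherm (an assumed model, not taken from the slides)."""
    return r_max * conc_nm / (conc_nm + kd_nm)


def fit_kd(concs_nm, ratios, r_max=1.0, kd_min=0.01, kd_max=1000.0, steps=2000):
    """Return the Kd (nM) on a log-spaced grid that minimizes the squared error."""
    best_kd, best_err = None, float("inf")
    for i in range(steps + 1):
        # Log-spaced candidate between kd_min and kd_max.
        kd = kd_min * (kd_max / kd_min) ** (i / steps)
        err = sum(
            (binding_ratio(c, kd, r_max) - r) ** 2
            for c, r in zip(concs_nm, ratios)
        )
        if err < best_err:
            best_kd, best_err = kd, err
    return best_kd


# Synthetic titration over the concentration range shown on the plot axis.
concs = [0.01, 0.1, 1.0, 10.0, 100.0]
ratios = [binding_ratio(c, kd_nm=5.0) for c in concs]
estimated_kd = fit_kd(concs, ratios)
```

A grid search is used here only to keep the sketch dependency-free; a nonlinear least-squares routine would be the more usual choice for real data.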

  18. CELL IDENTIFICATION VIA AFFINITY
    [Figure: panel a, microarray and microwells, green = tet H-2Kb (1 nM), red = anti-mouse IgG (10 nM); panel b, ratio (Ag/IgG), 0.0 to 1.0, vs. concentration (tet H-2Kb, 0.001 to 1000 nM) for c127, c136 and Y3]

  19. CELL IDENTIFICATION VIA AFFINITY
    [Figure: panel a, microarray and microwells, green = tet H-2Kb (1 nM), red = anti-mouse IgG (10 nM); panel b, ratio (Ag/IgG), 0.0 to 1.0, vs. concentration (tet H-2Kb, 0.001 to 1000 nM) for c127, c136 and Y3]
    [Figure: first vs. second principal component, points labeled by cell staining (C136, Y3, C127)]

  20. AFFINITY MAP - AN ALTERNATIVE REPRESENTATION
    [Figure, panels a and b: normalized Ag/Ab ratio (a.u.) vs. concentration (tet H-2Kb, 0.001 to 1000 nM), with the median marked; three classes: antigen-specific / saturated, low affinity / non-specific, antigen-specific / unsaturated]

  21. ISOLATE ANTIGEN-SPECIFIC CELLS
    [Figure: affinity maps for 1x boost and 2x boost over 10 pM to 100 nM, values from -1 to 1, ordered by increasing KD; ASC (n = 66 and n = 32) and non-specific cells (n = 1196 and n = 708)]

  22. QUANTITATIVE PROFILE OF AN IMMUNIZATION
    [Figure: cell counts at successive gates for unimmunized, 1x boost and 2x boost, with a timeline (days 0 to 35) marking immunization, sacrifice (*) and cells in culture/LPS]
    Labeled gating series: single cells (12457), BCR/B220+ (11112), Isotype+ (3119), Isotype+ Kd+ (2135; IgG2b: 71, IgG2a: 29, IgG1: 17, IgM: 2018)
    Other counts shown: 12425, 8987, 2908, 1135, Ag+ (IgG1: 1, IgM: 28), 4, 1, 29, 1101; 18253, 14199, 3135, 648, 12, 1, 43, 592, 14, 36

  23. SUMMARY
    Quantitative profiles that detail the cellular origin, extent and diversity of the B cell response
    Flow cytometry and immunosorbent assay data correlated for each single cell
    Expandable to cytokine profiling, T cell profiling, primary splenocytes
    Allows cell retrieval

  24. SINGLE LYMPHOCYTE STIMULATION
    Difficult to expose single naive lymphocytes to controlled stimuli (e.g. bacteria)
    Capture of antigen by B cells is critical for antibody response
    Studied by biochemical and imaging methods
    Early dynamics? Quantitative?

  25. SINGLE LYMPHOCYTE “PULSE-CHASE” EXPERIMENTS
    [Diagram: naive B cell in the region of observation under media flow; concentration C(t) over time with pulse #1 (αIgM 568) and pulse #2 (αIgM 647)]

  26. SINGLE LYMPHOCYTE “PULSE-CHASE” EXPERIMENTS
    [Diagram: naive B cell in the region of observation under media flow; concentration C(t) over time with pulse #1 (αIgM 568) and pulse #2 (αIgM 647)]
    [Figure: Ceff/C (0 to 1) vs. time (0 to 60 s), imposed vs. theory and imposed vs. experiment]

  27. SINGLE LYMPHOCYTE “PULSE-CHASE” EXPERIMENTS
    [Diagram: naive B cell in the region of observation under media flow; concentration C(t) over time with pulse #1 (αIgM 568) and pulse #2 (αIgM 647)]
    [Figure: Ceff/C (0 to 1) vs. time (0 to 60 s), imposed vs. theory and imposed vs. experiment]
    [Figure: fluorescence intensity (a.u.) vs. time (0 to 300 s) for Tfn 568 and Tfn 647]

  28. TRACKING B CELL RECEPTOR MICROCLUSTERING
    [Figure, panels a to c: image series at 0, 30 s, 60 s, 90 s, 120 s, 180 s, 210 s and 9 min after an αIgM 568 labeling pulse; normalized fluorescence intensity vs. angle; normalized fluorescence intensity (0 to 1) vs. time after pulse (0 to 10 min)]

  29. BCR DYNAMICS FOLLOWING REPEATED PULSES
    [Figure: normalized fluorescence intensity (0 to 1) vs. time after pulse (0 to 6 min) for an αIgM 568 pulse and an αIgM 647 pulse; images of αIgM 568, αIgM 647 and overlay; % colocalization vs. time after pulse (s); histograms of number of cells vs. τ (time after pulse, s) for αIgM 568 and αIgM 647, and vs. τ568 - τ647 (s)]

  30. MHC CLUSTERS RAPIDLY AND INDEPENDENTLY OF BCR
    [Figure: panel a, normalized fluorescence intensity (0 to 1) vs. time after pulse (0 to 18 min) for MHC II gfp, αIgM 568 and αIgM 647; panel b, colocalization (40 to 120)]

  31. SUMMARY
    Observe response to sequential doses of ligands in primary naïve B cells
    Measure early dynamics of labeled B cell receptors
    Expandable throughput with chip redesign

  32. MICROBIOME
    1. Machine Learning methods
    2. Distinguishing IBD and healthy

  33. 1. MACHINE LEARNING APPLIED TO MICROBIOME DATA
    Environmental shotgun 16S rRNA sequencing allows mapping of bacterial composition
    16S rRNA phylogeny is a good approximation of microbial distribution (VonMering 2007)
    Gene content and phylogeny correlate well (Mueller 2011)
    Microbial compositional data is large and increasingly difficult to mine
    Cheaper sequencing means that analysis is becoming the limiting step
    Needed: routine extraction of patterns in microbial data

  34. WHAT IS MACHINE LEARNING?
    Machine learning algorithms use example data to learn to:
    discover structure in datasets
    classify samples into distinct categories
    make predictions, once trained on example data
    Machine learning algorithms are the object of extensive research
    Applications in computing, finance, biology, etc.

  38. MICROBIOME DATA
    Each microbial taxon is a feature that can be used to discriminate between bacterial communities
    We want to:
    automatically find the taxa that discriminate best
    accurately classify communities according to metadata
    Taxa Sample1 Sample2 ...
    A 12 2
    B 1 10
    C 5 0
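To make "each taxon is a feature" concrete, here is a minimal sketch that turns the toy table above into one feature vector per sample. Converting counts to relative abundances is an assumption made for illustration; the slides do not say which normalization was used. The helper name `to_feature_vectors` is hypothetical.

```python
# Toy OTU table from the slide: rows are taxa, columns are samples.
otu_table = {
    "A": {"Sample1": 12, "Sample2": 2},
    "B": {"Sample1": 1, "Sample2": 10},
    "C": {"Sample1": 5, "Sample2": 0},
}


def to_feature_vectors(table):
    """Return (taxa, {sample: [relative abundance of each taxon]})."""
    taxa = sorted(table)
    samples = sorted({s for counts in table.values() for s in counts})
    vectors = {}
    for sample in samples:
        counts = [table[t].get(sample, 0) for t in taxa]
        total = sum(counts)
        # Guard against empty samples to avoid dividing by zero.
        vectors[sample] = [c / total if total else 0.0 for c in counts]
    return taxa, vectors


taxa, features = to_feature_vectors(otu_table)
```

Each entry of `features` can then be handed to a classifier together with the sample's metadata label.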

  39. PREPARING MICROBIOME DATA
    16S DNA sequence reads (AGCTGCTCGA, TAAGCTGCTCGA, AGCTGCTCGATTCTG)
    Quality filtering, chimera check (MOTHUR)
    RDP classification
    OTU clustering (UCLUST)
    Representative sequences
    OTU table
    Taxa Sample1 Sample2 ...
    A 12 2
    B 1 10
    C 5 0
    Phylogenetic tree

  40. LEARNING AND CLASSIFICATION
    Taxa Sample1 Sample2 Sample3 Sample4
    A 12 2 12 1
    B 1 21 11 2
    C 5 5 3 0
    D 0 10 2 15
    [Diagram: samples are split into a training set and a test set]

  41. LEARNING AND CLASSIFICATION
    Taxa Sample1 Sample2 Sample3 Sample4
    A 12 2 12 1
    B 1 21 11 2
    C 5 5 3 0
    D 0 10 2 15
    [Diagram: samples are split into a training set and a test set]
    Build random forest model
    What are the best taxa?

  42. LEARNING AND CLASSIFICATION
    Taxa Sample1 Sample2 Sample3 Sample4
    A 12 2 12 1
    B 1 21 11 2
    C 5 5 3 0
    D 0 10 2 15
    [Diagram: samples are split into a training set and a test set]
    Build random forest model
    What are the best taxa?
    Read taxa abundance from test set

  43. LEARNING AND CLASSIFICATION
    Taxa Sample1 Sample2 Sample3 Sample4
    A 12 2 12 1
    B 1 21 11 2
    C 5 5 3 0
    D 0 10 2 15
    [Diagram: samples are split into a training set and a test set]
    Build random forest model
    What are the best taxa?
    or
    Read taxa abundance from test set
    Predict test set

  44. LEARNING AND CLASSIFICATION
    Taxa Sample1 Sample2 Sample3 Sample4
    A 12 2 12 1
    B 1 21 11 2
    C 5 5 3 0
    D 0 10 2 15
    [Diagram: samples are split into a training set and a test set]
    Build random forest model
    What are the best taxa?
    or
    Read taxa abundance from test set
    Predict test set
    Check prediction

  45. LEARNING AND CLASSIFICATION
    Taxa Sample1 Sample2 Sample3 Sample4
    A 12 2 12 1
    B 1 21 11 2
    C 5 5 3 0
    D 0 10 2 15
    [Diagram: the same table shown twice, each time with a different split into training set and test set]
    Build random forest model
    What are the best taxa?
    or
    Read taxa abundance from test set
    Predict test set
    Check prediction

  46. LEARNING AND CLASSIFICATION
    Taxa Sample1 Sample2 Sample3 Sample4
    A 12 2 12 1
    B 1 21 11 2
    C 5 5 3 0
    D 0 10 2 15
    [Diagram: the same table shown twice, each time with a different split into training set and test set]
    Build random forest model
    What are the best taxa?
    or
    Read taxa abundance from test set
    Predict test set
    Check prediction
    Repeat the cross-validation and average
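The train, predict, check and repeat loop described above can be sketched as repeated k-fold cross-validation. To keep the example short, a nearest-centroid classifier stands in for the random forest, so this shows only the evaluation loop and not the model from the talk. The toy abundances, the labels "g1" and "g2", and all function names are hypothetical.

```python
import random


def nearest_centroid_fit(X, y):
    """Stand-in learner (not a random forest): one mean vector per class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def nearest_centroid_predict(model, row):
    """Predict the class whose centroid is closest in squared Euclidean distance."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(sorted(model), key=lambda label: dist(model[label]))


def repeated_cv_accuracy(X, y, k=4, repeats=5, seed=0):
    """Average test-fold accuracy over `repeats` random k-fold splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        idx = list(range(len(X)))
        rng.shuffle(idx)
        folds = [idx[i::k] for i in range(k)]
        for test_idx in folds:
            held_out = set(test_idx)
            train_idx = [i for i in idx if i not in held_out]
            # Build the model on the training set only.
            model = nearest_centroid_fit(
                [X[i] for i in train_idx], [y[i] for i in train_idx]
            )
            # Predict the held-out samples and check the predictions.
            correct = sum(
                nearest_centroid_predict(model, X[i]) == y[i] for i in test_idx
            )
            scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)


# Toy abundances for taxa A to D; labels are made up for illustration.
X = [[12, 1, 5, 0], [11, 2, 4, 1], [13, 0, 6, 0], [12, 3, 5, 1],
     [1, 2, 0, 15], [2, 1, 1, 14], [0, 3, 0, 16], [1, 2, 1, 13]]
y = ["g1"] * 4 + ["g2"] * 4
accuracy = repeated_cv_accuracy(X, y)
```

With four samples per class and folds of two, every training split still contains both classes, which is why the toy data can be classified reliably.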

  47. RANDOM FOREST
    Taxa Sample1 Sample2 Sample3
    A 12 2 12
    B 1 21 11
    C 5 5 3
    D 0 10 2
    Pick a random sample (e.g. 2, 21, 5, 10)
    Pick at random mtry taxa (e.g. A, B)
    Build a decision tree (e.g. A > 10, B < 11)

  48. RANDOM FOREST
    Taxa Sample1 Sample2 Sample3
    A 12 2 12
    B 1 21 11
    C 5 5 3
    D 0 10 2
    Pick a random sample (e.g. 2, 21, 5, 10)
    Pick at random mtry taxa (e.g. A, B)
    Build a decision tree (e.g. A > 10, B < 11)
    Repeat Ntree times
    Average / take votes
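The steps above (random sample, random subset of mtry taxa, one tree, repeat Ntree times, vote) can be sketched in a few lines. This is a simplified illustration, not the implementation used in the talk: each "tree" is a depth-1 stump, and the taxa are drawn once per tree, whereas standard random forests grow deeper trees and redraw the candidate features at every split. The data, labels and function names are hypothetical.

```python
import random
from collections import Counter


def fit_stump(X, y, features):
    """Best single-threshold split (a depth-1 tree) over the given taxa indices."""
    best = None  # (errors, feature, threshold, left_label, right_label)
    for f in features:
        values = sorted({row[f] for row in X})
        # Candidate thresholds are midpoints between consecutive observed values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = sum(lab != l_lab for lab in left)
            errors += sum(lab != r_lab for lab in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    if best is None:  # no split possible: always predict the majority label
        majority = Counter(y).most_common(1)[0][0]
        return (None, None, majority, majority)
    return best[1:]


def predict_stump(stump, row):
    f, t, l_lab, r_lab = stump
    if f is None:
        return l_lab
    return l_lab if row[f] <= t else r_lab


def fit_forest(X, y, ntree=25, mtry=2, seed=0):
    rng = random.Random(seed)
    n, n_features = len(X), len(X[0])
    forest = []
    for _ in range(ntree):
        boot = [rng.randrange(n) for _ in range(n)]   # pick a random sample
        feats = rng.sample(range(n_features), mtry)   # pick mtry taxa at random
        forest.append(
            fit_stump([X[i] for i in boot], [y[i] for i in boot], feats)
        )
    return forest


def predict_forest(forest, row):
    """Majority vote over all stumps."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]


# Toy abundances for taxa A to D; labels are made up for illustration.
X = [[12, 1, 5, 0], [11, 2, 4, 1], [13, 0, 6, 0], [12, 3, 5, 1],
     [1, 12, 0, 15], [2, 11, 1, 14], [0, 13, 0, 16], [1, 12, 1, 13]]
y = ["g1"] * 4 + ["g2"] * 4
forest = fit_forest(X, y)
prediction = predict_forest(forest, [10, 2, 5, 1])
```

Counting how often each taxon is chosen for a split across the forest gives a crude answer to "what are the best taxa?"; library implementations report more careful importance scores.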

  49. SUPERVISED LEARNING WORKS WELL

  50. SUPERVISED LEARNING WORKS WELL
    [Figure: classification error (0.10 to 0.35) vs. OTU identity threshold (80 to 95), with panels for habitat and host]

  51. Introduction Mapping antibody responses Single lymphocyte stimulation Machine learning for microbiome data IBD classification Discussion
    SUPERVISED LEARNING WORKS WELL
    OTU identity threshold
    error
    0.10
    0.15
    0.20
    0.25
    0.30
    0.35
    habitat
    ● ● ● ●

    ● ● ● ●
    80 85 90 95
    host









    80 85 90 95
    number of samples
    area under ROC curve
    0.6
    0.8
    1.0
    40 60 80 100

    View Slide

  52. SUPERVISED LEARNING WORKS WELL
    [Figure, panel 1: error (0.10 to 0.35) versus OTU identity threshold (80 to 95), for habitat and host]
    [Figure, panel 2: area under ROC curve (0.6 to 1.0) versus number of samples (40 to 100)]
    [Figure, panel 3: NO3 predictions (feat selection: significance ranking); predicted log[NO3] versus measured log[NO3], both axes 0.0 to 0.6]

  53. INCORPORATING PHYLOGENETICS
    [Figure: decision tree nodes 1 to 5 placed on a phylogenetic tree, each split on present / absent; levels Healthy / Sick and Healthy / Crohn's / Colitis; bar charts (0 to 1) for CD, NM, UC; p = 0.003]
    Hierarchical decision tree outlining the classification of a patient as normal, Crohn's or colitis, depending on whether sequences are present
    at the given nodes in the phylogenetic tree. Average accuracy is 80%. Decision tree nodes are colored with respect to the hierarchical level.
    Tree branches are colored based on diagnosis. Bacterial groups in a normal patient are colored green;
    magenta for Crohn's samples and cyan for colitis samples.

  54. 2. THE CASE OF IBD
    Inflammation of autoimmune origin
    Presenting symptoms:
    abdominal pain, diarrhea, vomiting, weight loss
    No known causative agent
    IBD seems to have a complex etiology:
    environmental - smoking, Western diet?
    genetic - autophagy loci (NOD2, ATG16)
    microbial - correlated with some bacteria, dysbiosis?

  55. IBD TREATMENT
    IBD alternates between flares (active) and periods of remission (inactive).
    Long-term immunosuppressants to maintain remission
    Antibiotic therapy is used empirically to treat flare-ups
    When medical therapy fails, treatment is bowel resection

  58. CLASSIFICATION CAN DISTINGUISH IBD AND HEALTHY
    [Figure: two ROC plots, Sensitivity versus 1 − Specificity, both axes 0.0 to 1.0]
    Frank et al. survey: Frank (AUC = 0.73), Pediatric (AUC = 0.71)
    Pediatric case control: IBD (AUC = 0.83), active IBD (AUC = 0.91)
    Area under the ROC curve:
    probability that a classifier will rank a randomly chosen positive instance higher than a randomly chosen negative one
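
    The probabilistic reading of the area under the ROC curve can be computed directly by comparing every positive instance with every negative one. The sketch below is illustrative only: the scores are made up, and counting ties as one half is a common convention assumed here rather than something stated on the slide.

```python
def auc(scores_pos, scores_neg):
    """Fraction of (positive, negative) pairs in which the positive
    instance gets the higher score; ties count as one half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical classifier scores, e.g. the fraction of trees voting "IBD".
ibd_scores = [0.9, 0.8, 0.4]
control_scores = [0.7, 0.3, 0.2]

# 8 of the 9 pairs are ranked correctly (only 0.4 versus 0.7 is not).
print(auc(ibd_scores, control_scores))
```

    An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale used for the AUC values quoted in this talk.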

  59. MUCOSAL AND STOOL SAMPLES HAVE SIMILAR PROFILES
    mean difference in abundance (negative = control, positive = IBD)
    Clostridia (cls)
    Clostridiales (ordr)
    Firmicutes (phylum)
    Ruminococcaceae (family)
    Lachnospiraceae (family)
    NA (genus)
    Subdoligranulum (genus)
    Porphyromonadaceae (family)
    Rikenellaceae (family)
    Alistipes (genus)
    Coprococcus (genus)
    Streptococcaceae (family)
    Eubacterium (genus)
    Eubacteriaceae (family)
    Oscillibacter (genus)
    Odoribacter (genus)
    Butyricicoccus (genus)
    Parvimonas (genus)
    Incertae Sedis XIII (family)
    Anaerovorax (genus)
    Akkermansia (genus)
    Verrucomicrobiaceae (family)
    Verrucomicrobiae (cls)
    Citrobacter (genus)
    Anaerotruncus (genus)
    Roseburia (genus)
    Coriobacteriales (ordr)
    Pasteurellaceae (family)
    Pasteurellales (ordr)
    Lactobacillaceae (family)
    Actinobacteria (cls)
    Actinobacteria (phylum)
    Lactobacillales (ordr)
    Bacilli (cls)
    Escherichia−Shigella (genus)
    Enterobacteriaceae (family)
    Enterobacteriales (ordr)
    Gammaproteobacteria (cls)
    Proteobacteria (phylum)
    Study: Frank, Pediatric

  60. DEEPER SEQUENCING ALSO HELPED
    Median relative abundance
    Anaerotruncus
    Clostridia
    Clostridiales
    Coprococcus
    Lachnospiraceae
    Peptococcus
    Ruminococcaceae
    NA
    Incertae Sedis XIII
    Peptococcaceae
    Akkermansia
    Coriobacteriaceae
    Coriobacteriales
    Verrucomicrobia
    Verrucomicrobiaceae
    Verrucomicrobiae
    Verrucomicrobiales
    Acetivibrio
    Escherichia−Shigella
    Parabacteroides
    Phascolarctobacterium
    Ruminococcus
    Collinsella
    Eubacterium
    Porphyromonadaceae
    Sporobacter
    Eubacteriaceae
    Odoribacter
    Enterobacteriaceae
    Enterobacteriales
    Anaerovorax
    Oscillibacter
    Butyricicoccus
    Proteobacteria
    Gammaproteobacteria
    Subdoligranulum
    Alistipes
    Rikenellaceae
    [Figure axes: median relative abundance, 10^−3 to 10^−0.5; log10(pvalue) scale, 3 to 6]

  61. CHARACTERISTIC TAXA ARE ASSOCIATED WITH IBD
    [Figure: q−value of each taxon, ibd versus nonibd]
    Verrucomicrobia
    Proteobacteria
    Verrucomicrobiae
    Gammaproteobacteria
    Verrucomicrobiales
    Enterobacteriales
    Peptococcaceae
    Verrucomicrobiaceae
    Enterobacteriaceae
    Incertae Sedis XIII
    Porphyromonadaceae
    Rikenellaceae
    Peptococcus
    Ethanoligenens
    Lawsonia
    Phascolarctobacterium
    Sporobacter
    Anaerotruncus
    Akkermansia
    Butyricicoccus
    Ruminococcus
    Odoribacter
    Escherichia−Shigella
    Parabacteroides
    Oscillibacter
    Anaerovorax
    Subdoligranulum
    Alistipes
    [Figure panels: eff. size and rel. abundance (% of max) of each taxon, by diagnosis (Control, CD, UC), activity (control, inactive, mild, moderate, severe), antibiotics and immunosuppr.; taxa grouped by phylum, class, order, family, genus]

  62. CLASSIFICATION IDENTIFIES ACTIVITY LEVELS
    [Figure: avg. % reads of each taxon by activity level (control, inactive, mild, moderate, severe), with q−value; taxa grouped by phylum, class, order, family, genus]
    Verrucomicrobia
    Actinobacteria
    Proteobacteria
    Clostridia
    Bacilli
    Verrucomicrobiae
    Actinobacteria
    Gammaproteobacteria
    Clostridiales
    Bifidobacteriales
    Lactobacillales
    Verrucomicrobiales
    Coriobacteriales
    Enterobacteriales
    Peptococcaceae
    Lachnospiraceae
    Ruminococcaceae
    Incertae Sedis XIV
    Bifidobacteriaceae
    Peptostreptococcaceae
    Verrucomicrobiaceae
    Corynebacteriaceae
    Incertae Sedis XIII
    Coriobacteriaceae
    Porphyromonadaceae
    Eubacteriaceae
    Enterobacteriaceae
    Rikenellaceae
    Roseburia
    NA
    Veillonella
    Oribacterium
    Coprococcus
    Blautia
    Bifidobacterium
    Acetivibrio
    Akkermansia
    Varibaculum
    Anaerotruncus
    Atopobium
    Phascolarctobacterium
    Sporobacter
    Corynebacterium
    Parabacteroides
    Ruminococcus
    Odoribacter
    Anaerostipes
    Escherichia−Shigella
    Collinsella
    Serratia
    Eubacterium
    Oscillibacter
    Anaerovorax
    Butyricicoccus
    Alistipes
    Subdoligranulum
    [Figure: Shannon div. index (1.0 to 2.5) by activity level: inactive, mild, moderate, severe, control]
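
    For readers unfamiliar with the Shannon diversity index plotted on this slide, a minimal sketch of how it is usually computed from taxon counts follows. The natural logarithm and the example counts are assumptions; the slide does not state which log base was used.

```python
import math

def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over taxa with non-zero counts."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)

# Hypothetical read counts per taxon for two samples.
even_sample = [10, 10, 10, 10]     # four equally abundant taxa
dominated_sample = [37, 1, 1, 1]   # one taxon dominates

print(shannon_index(even_sample))       # equals ln(4), the maximum for four taxa
print(shannon_index(dominated_sample))  # lower: less even community
```

    A higher index means more taxa, more evenly distributed, so a drop in the index indicates reduced diversity.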

  63. CLASSIFICATION CAN DIFFERENTIATE CD AND UC
    [Figure: two ROC plots, Sensitivity versus 1 − Specificity]
    CD vs. UC: AUC = 0.76
    One vs. all: CD (AUC = 0.68), UC (AUC = 0.82), Control (AUC = 0.83)
    [Figure: taxa compared between CD and UC, with log10(q−value)]
    Verrucomicrobia
    Bacteroidetes
    Gammaproteobacteria
    Bacilli
    Verrucomicrobiae
    Bacteroidia
    Lactobacillales
    Verrucomicrobiales
    Bacteroidales
    Verrucomicrobiaceae
    Bacteroidaceae
    Eubacteriaceae
    Alistipes
    Akkermansia
    Butyricimonas
    Coprobacillus
    Eggerthella
    Parasutterella
    Bacteroides
    Eubacterium
    [Figure axis: Avg. % reads; taxa grouped by phylum, class, order, family, genus]

  64. BLIND VALIDATION CONFIRMS ACCURACY OF THE MODEL
    [Figure: two ROC plots, both axes 0.0 to 1.0]
    Test set (n=68), classification between ibd/nonibd: AUC = 0.848
    Training set (n=91), pediatric case control: IBD (AUC = 0.83), active IBD (AUC = 0.91)

  65. CLASSIFICATION IS BETTER THAN FECAL CALPROTECTIN
    [Figure: two ROC plots, Sensitivity versus 1 − Specificity, both axes 0.0 to 1.0]
    all IBD vs. control (training & validation): calprotectin (AUC = 0.77), SLiME (AUC = 0.85)
    CD vs. UC classification (training & validation): calprotectin (AUC = 0.50), SLiME (AUC = 0.69)

  66. SUMMARY
    Classification can accurately distinguish healthy and IBD patients
    Patients can be stratified according to disease activity
    Identified novel taxa associated with IBD and remission
    Validated blindly and compared against fecal calprotectin measurements
    Careful statistical design should be the first step in larger studies

  68. HOW DOES IT FIT INTO IBD PRACTICE?
    Clinical feasibility will depend on the shrinking cost of sequencing
    [Diagram: Primary care screen; Gastroenterologist review; Serology; Physician assessment; Diagnosis suspected?; Endoscopy; Definitive diagnosis]
    [Added to the diagram: fecal biomarkers & microbiome mapping]

  70. MACHINE LEARNING DEVELOPMENTS
    While this application was being developed, other uses of machine learning techniques for microbiome data appeared:
    Detecting mislabeled sequence samples (Knights 2010)
    Tracking the source of microbial contamination (Knights 2011)
    Predicting response to diet in gnotobiotic mice (Faith 2011)
    Wastewater bioreactors (Werner 2011)

  71. MOVING FORWARD
    Adapt machine learning methods to use additional data
    Integrate microbiome tools and immune tools
    Augment microbiome datasets with immune variables
    Figure 1. Processes for Microbial Signature Discovery
    The process begins with the collection of a large set of sequencing data from various bacterial communities associated with different environments or different
    host phenotypes. These sequences can serve directly as input to a machine-learning algorithm, or they can be transformed through a preprocessing step (data
    transformation). Although for microbial community analysis data transformation and supervised learning are typically performed as separate steps, we suggest
    that predictive models will be improved by the development of novel machine-learning techniques that are informed by the potential data transformations. For
    example, constructing a good predictive model using metabolic characterizations of metagenomics sequences might be easier if the algorithm has knowledge of
    the hierarchical relationships between metabolic functions. In the case of marker-gene surveys, a machine-learning algorithm may benefit from knowledge of the
    phylogenetic relationships of the observed lineages, or the network of average nucleotide similarities between the input sequences. These structures may allow
    models to share statistical strength across related independent variables in cases where there is high variability within a given environment or host phenotype (i.e.,
    lack of a "core microbiome").
    Knights et al. Cell Host & Microbe, Commentary (2011)

  72. Eric Alm & the Alm lab
    Chris Love & the Love lab
    Ploegh lab
    Athos Bousvaros & Dirk Gevers
    Lynn Bry
    Funding: HST, Poitras & NSERC
    THANK YOU!
    Any questions?
