OpenTalks.AI - Ivan Bolokhov, Hybrid Intelligence for Data Labelling Tasks

opentalks3
February 05, 2021

Transcript

  1. Hybrid Intelligence
    for Data Labelling
    Kognitivnye Sistemy
    Ivan Bolokhov
    cogsys.company
    mashinist.cogsys.company

  2. Data Labelling with
    Mashinist Studio

  3. Data types
    Text: classification, segmentation, pairwise comparison, validation
    Audio: classification, free string, segmentation
    Images: classification, landmarks, defining boundaries

  4. Data Labelling Process
    Tasks → People → Labelled Data
    The classic human-driven labelling system
    is time-consuming and scales poorly

  5. Real-time results analysis
    Real-time task execution
    statistics. For each
    assignment, the system
    reports a quality
    indicator and the results of
    all labelers involved.
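
The slide does not say how the quality indicator is computed. As a minimal sketch, assume it is the share of labelers who agree with the majority label for an assignment. The function name `assignment_quality` and the majority-vote rule are illustrative assumptions, not Mashinist's actual method.

```python
from collections import Counter


def assignment_quality(labels_by_labeler):
    """Return (consensus_label, agreement) for one assignment.

    labels_by_labeler maps a labeler id to the label they submitted.
    Agreement is the share of labelers who chose the consensus label
    (an assumed stand-in for the slide's "quality indicator").
    """
    if not labels_by_labeler:
        raise ValueError("at least one label is required")
    counts = Counter(labels_by_labeler.values())
    consensus, votes = counts.most_common(1)[0]
    return consensus, votes / len(labels_by_labeler)


# Three labelers, two of whom agree.
consensus, quality = assignment_quality({"ann": "cat", "bob": "cat", "eve": "dog"})
```

With this rule, an assignment on which every labeler agrees scores 1.0, and disagreement lowers the score proportionally.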

  6. Monitoring the performance of
    labelers
    The dynamic rating of each labeler is based on a quality indicator of
    completed tasks and the overall percentage of tasks completed.
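
One way such a rating could be combined, as a sketch only: a weighted sum of the mean quality of completed tasks and the completion rate. The function `labeler_rating`, the 0.7 weight, and the reading of "percentage of tasks completed" as completed divided by assigned are all assumptions made for illustration.

```python
def labeler_rating(quality_scores, tasks_assigned, quality_weight=0.7):
    """Combine mean task quality with completion rate into a score in [0, 1].

    quality_scores holds one quality value in [0, 1] per completed task.
    quality_weight is a hypothetical tuning parameter, not a documented one.
    """
    if tasks_assigned <= 0 or not quality_scores:
        return 0.0
    mean_quality = sum(quality_scores) / len(quality_scores)
    completion_rate = min(len(quality_scores) / tasks_assigned, 1.0)
    return quality_weight * mean_quality + (1 - quality_weight) * completion_rate


# Two of four assigned tasks completed, with quality 1.0 and 0.5.
rating = labeler_rating([1.0, 0.5], tasks_assigned=4)
```

Recomputing the rating after every completed task would make it "dynamic" in the sense the slide describes.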

  7. HYBRID INTELLIGENCE

  8. Adding AI to the process
    • Bots (models) are trained on partial
    results of human labelers
    • Labelling bots start pre-labelling data
    • Bots compete with each other for
    "survival", judged by the quality of
    their labelling
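
The competition step above can be sketched as ranking candidate bots by how often they match human labels and keeping the best ones. The names `accuracy` and `select_survivors`, the toy bots, and the use of plain accuracy as the quality measure are assumptions; the slides do not specify the selection rule.

```python
def accuracy(predict, examples):
    """Share of examples where the bot's prediction matches the human label."""
    hits = sum(1 for item, label in examples if predict(item) == label)
    return hits / len(examples)


def select_survivors(bots, human_labelled, keep=1):
    """Rank bots by agreement with human labels and keep the best `keep`."""
    scored = sorted(
        ((accuracy(predict, human_labelled), name) for name, predict in bots.items()),
        reverse=True,
    )
    return [(name, score) for score, name in scored[:keep]]


# Partial results from human labelers serve as the reference set.
human_labelled = [
    ("great product", "pos"),
    ("awful service", "neg"),
    ("great support", "pos"),
]
# Two toy bots standing in for trained models.
bots = {
    "always_pos": lambda text: "pos",
    "keyword": lambda text: "neg" if "awful" in text else "pos",
}
survivors = select_survivors(bots, human_labelled)
```

In practice the reference set would be held out from the data the bots were trained on, so that the ranking reflects generalisation rather than memorisation.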

  9. Competition between humans and
    bots
    Bots label data on
    par with humans

  10. AI considerably improves data
    labelling process
    • Labelers agree or disagree with the
    system, which greatly speeds up the
    markup process and makes the model
    even "smarter".
    • As output, the customer receives not
    only the labelled data but also the
    best model (bot) for future automatic
    labelling.
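
The agree/disagree loop described above might look like the following sketch: the bot suggests a label, a human confirms or corrects it, and corrections are collected as new training examples. The function `review_prelabels` and its callback signature are hypothetical.

```python
def review_prelabels(items, bot_predict, human_review):
    """Collect final labels plus the corrections to feed back into training.

    human_review(item, suggestion) returns None to agree with the bot,
    or the corrected label to disagree.
    """
    final_labels, corrections = [], []
    for item in items:
        suggestion = bot_predict(item)
        verdict = human_review(item, suggestion)
        if verdict is None:
            final_labels.append((item, suggestion))
        else:
            final_labels.append((item, verdict))
            corrections.append((item, verdict))
    return final_labels, corrections


# Toy bot that always answers "pos"; the reviewer corrects one item.
final_labels, corrections = review_prelabels(
    ["great product", "awful service"],
    bot_predict=lambda text: "pos",
    human_review=lambda text, label: "neg" if "awful" in text else None,
)
```

Agreeing costs the labeler a single confirmation instead of a full annotation, which is where the speed-up would come from; the corrections are the examples most useful for retraining.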

  11. Results management
    Download labelling results in
    convenient formats.
    Not only a dataset, but also a
    ready-to-use model
    • CSV
    • JSON
    • XLS
    • Download Model

  12. Dataset and model marketplace

  13. Comparative analysis
    Markup   Cost   Quality   Speed
    Person    -        +        -
    Robot     +        -        +
    Hybrid    +        +        +

  14. For OpenTalks.AI participants
    Free access invite to
    Mashinist labelling studio
    @bolokhov
    Still have any questions?
    +7-977-687-71-96
