Building AI Chat bot using Python 3 & TensorFlow

Chat bots have been at the center of public attention as a new mobile user interface since 2015. They are widely used to reduce human-to-human interaction, from consultation to online shopping and negotiation, and their application coverage is still expanding. Chat bots are also the basis of conversational interfaces and, combined with voice recognition, of non-physical input interfaces.

Traditional chat bots were developed based on natural language processing (NLP) and Bayesian statistics, recognizing user intention and returning template-based responses. However, since 2012, the accelerated advance of deep learning and of deep-learning-based NLP has opened the possibility of building chat bots with machine learning. Machine learning (ML)-based chat bot development has advantages; for instance, once the model is trained to an appropriate level, an ML-based bot can generate (somewhat nonsensical but acceptable) responses to arbitrary questions that have no connection with the context.

In this talk, I will introduce the garage chat bot creation process step by step. First, get the data and preprocess it with Python 3 and pandas, reshaping it into a more trainable form. With the preprocessed data, design a deep-learning model in TensorFlow suitable for sentence-type input/output and train it. After training, serve the model behind a messenger interface built with the Telegram API and Python 3, and demonstrate the result.

In the process, we have to solve several problems. The first is preprocessing Korean sentences with natural language processors and tokenizing them into fragments of proper length and type. We also have to solve the "josa (Korean postposition) hell" and conjunction problems to construct the TensorFlow model. In addition to preprocessing, a model architecture that recognizes conversational context is needed. Serving the bot with a Python HTTP server and the Telegram API also demands some deliberation. I'll share my multi-modal bot model idea, its implementation, and tips for solving these problems.

Chat bots have been attracting attention since 2015 as a new user UI, centered on mobile. They are used in many areas, from reducing human-to-human interaction in consultations to online shopping purchases, and their coverage keeps growing. A chat bot is the foundation of conversational interfaces and, combined with voice recognition, the base technology of no-input interfaces.

Traditional chat bots were developed on the principle of recognizing user intention patterns through natural language analysis and Bayesian statistics, then returning template responses. However, deep learning, which has advanced rapidly since 2012, and the natural language recognition technology built on it have opened the possibility of creating chat bots with machine learning. When a chat bot is developed through machine learning, once a sufficiently trained model is built it can, depending on the training data, generate reasonable answers even to arbitrary sentence inputs outside the context.

This talk walks step by step through building a practically usable chat bot with Python 3. First, obtain the data and preprocess it with Python 3 and pandas, then reshape the preprocessed data into a form suitable for training. Next, install the TensorFlow Python 3 package. Using TensorFlow, design a deep-learning model suitable for sentence-type input/output and train it with the preprocessed data. Finally, build an interface around the trained model with the Telegram API, register the bot as a friend on Telegram, and demonstrate a conversation.

Several problems must be solved along the way. First, the data must be properly preprocessed for Korean natural language processing, and sentences must be tokenized with appropriate length and form for model training. Next, while designing the TensorFlow model and training it, obstacles such as postpositions (josa), conjunctions, and typos must be handled. To support continuous conversation, a model design that recognizes context rather than isolated sentence-level input/output is also needed. There are also a few points to consider when serving the trained result with a Python HTTP server and the Telegram API. I will share ideas, implementation details, and tips for all of these parts.

Jeongkyu Shin
August 14, 2016

Transcript

  1. Building AI Chat bot
    using
    Python 3
    & TensorFlow
    Jeongkyu Shin
    Lablup Inc.
    Illustration * © Idol M@aster / Bandai Namco Games. All rights reserved.


  2. I’m
    ▪ Humble businessman
    ▪ Lablup Inc. (All members are speaking in PYCON APAC 2016!)
    ▪ Open-source devotee
    ▪ Textcube maintainer
    ▪ Play with some (open||hidden) projects / companies
    ▪ Physicist / Neuroscientist
    ▪ Studied information-processing procedures in the brain
    ▪ Ph.D. in statistical physics (complex systems)
    ▪ Majored in physics / computer science
    신정규 / Jeongkyu Shin / @inureyes


  3. > runme --loop=2
    ▪ Became the first man to get 2 official
    presenter shirts in PyCON APAC 2016!
    ▪ 8.13.2016 (in Korean)
    ▪ 8.14.2016 (in English)
    ▪ Are you ready? (I’m not ready)*
    *Parody of something. Never mind.


  4. Welcome to my garage!
    Tons of garbage here!


  5. –Bryce Courtenay's The Power of One
    “First with the head, then with the heart.”


  6. Today’s Entree: Chat bot
    ▪ Python 3
    ▪ Twitter Korean Analyzer / Komoran with KoNLPy / pandas
    ▪ TensorFlow
    ▪ 0.8 -> 0.9 -> 0.10RC
    ▪ And special sauce!
    ▪ Special data with unique order
    ▪ Special python program to organize / use the data!
    Clipart* (c) thetomatos.com


  7. Ingredients for today's recipe
    ▪ Data
    ▪ Test: FAS dataset (26GB)
    ▪ Today: “Idolm@ster” series, etc.
    ▪ Tools
    ▪ TensorFlow + Python 3
    ▪ Today’s insight
    ▪ Multi-modal Learning models and model chaining


  8. I’m not sure but
    I’ll try to explain the
    whole process I did
    Game screenshot* (c) CAVE
    Forkcrane* (c) Iconix


  9. And I assume that
    you already have
    experience /
    knowledge about
    machine learning
    and TensorFlow
    Illustration *(c) marioandluigi97.deviantart.com


  10. Things that will not be covered today
    ▪ Phase space / embedding
    dimension
    ▪ Recurrent Neural Network (RNN)
    ▪ GRU cell / LSTM cell
    ▪ Multi-layer stacking
    ▪ Batch process for training
    ▪ Vector representation of
    language sentence
    ▪ Sequence-to-sequence model
    ▪ Word2Vec / Senti-Word-Net
    Clip * © Idol M@aster the animation / Bandai Namco Games All rights reserved.


  11. One day in Seoul Itaewon, 2013
    All started with dinner talks of neuroscientists...


  12. “When will AI-based program pass Turing test?”
    “I believe it will happen before 2020.”
    “Is it too fast to be true?”
    “Weak intelligence will achieve the goal with accelerated technology advances and
    our understanding about human brain.”
    “Do you really believe that it will happen in that short time?”
    “Ok, then, let’s make a small bet.”
    …and I started making my own chat bot the following month.


  13. What is a chat bot?
    ▪ “Chatting bots”
    ▪ One of the
    ▪ Oldest Human-Computer Interface (HCI)-based machines
    ▪ Challenging lexical topics
    ▪ Interface: text → speech (vocal) → Brain-Computer Interface (BCI)
    ▪ Commercial UI: messengers!


  14. (Image-only slide)

  15. Basic chat bot components
    (Diagram) Lexical Input → Natural Language Processor → Context Analyzer → Decision maker → Response Generator → Lexical Output


  16. Traditional chat bots
    (Diagram) The same pipeline from Lexical Input to Lexical Output, where the Natural Language Processor uses a morphemic analyzer and a taxonomy analyzer, and the Decision maker / Response Generator draw on templates, a knowledge base, and a search engine.


  17. Chat bots with Machine Learning
    (Diagram) The same pipeline from Lexical Input to Lexical Output, where a sentence-to-vector converter and a deep-learning model (SyntaxNet / NLU, Natural Language Understanding) handle language processing and context analysis, a deep-learning model (RNN / sentence-to-sentence) generates responses, and a knowledge base (useful with TF/IDF ask bots) plus per-user context memory support the decision maker.


  18. Problems
    ▪ Hooray! Deep-learning-based chat bots work well in Q&A scenarios!
    ▪ General problems
    ▪ Inhuman: restricted to the model's training set
    ▪ Cannot "start" a conversation
    ▪ Cannot handle continuous conversational context and its changes
    ▪ Korean-specific problems
    ▪ Dynamic type-changes
    ▪ Postpositions / conjunctions (josa hell)


  19. 헬조사 (Hell Josa)
    The great wall of Korean ML+NLP = like ActiveX+N*+F* in the Korean Web


  20. We expect these but...
    Clip art *Lego ©


  21. We got these.
    Photo * © amazon.com
    ...How can I assemble them?


  22. Back to the origin
    What I learned for 9 years…


  23. How the brain works
    ▪ Parallelism: tasks are performed in separate areas
    (Diagram) Cognition · Decision making · Language processes (Broca / Wernicke) · Reflex conversation
    Clipart* (c) cliparts.co


  24. Information pathway during conversation
    ▪ During conversation:
    1. Preprocessing → 2. Send information → 3. Context recognition → 4. Spread / gather processes to determine the answer → 5. Send conceptual response to the parietal lobe → 6. Postprocessing to generate the sentence
    Clipart* (c) cliparts.co


  25. Understanding brain process
    ▪ Intelligence / cognitive tasks
    ▪ Temporal information circuit between prefrontal-frontal lobe
    ▪ Language processing
    ▪ Happens after backward information signal
    ▪ Related to somatosensory cortex activity
    ▪ Ok, ok, then?
    ▪ Language process is highly separated in brain
    ▪ Integration / disintegration process is very important
    Clipart* (c) cliparts.co


  26. Architecting
    ▪ Separate the dots
    ▪ Simplify the information fed to the context analyzer
    ▪ Generate complex responses using diverse models
    ▪ Sentence generator
    ▪ Grammar generator model
    ▪ Turns a simple word sequence into a complete sentence
    ▪ Tone generator model
    ▪ Recasts the sentence sequence with a specific tone


  27. Ideas from structure
    ▪ During conversation:
    1. Disintegrator → 2. Send information → 3. Context parser → 4. Decision maker using ML model → 5. Send conceptual response to the sentence generators → 6. Postprocessing with the grammar engine and tone engine to generate the sentence
    Clipart* (c) cliparts.co


  28. Ideas from structure
    ▪ Multi-modal model
    ▪ Disintegrator (simplifies a sentence into morphemes)
    ▪ Bot engine
    ▪ Generates a morpheme sequence
    ▪ Grammar model
    ▪ Makes a meaningful sentence from the morpheme sequence
    ▪ Tone model
    ▪ Changes some endings (eomi) / words in the grammar model result


  29. Final structure
    (Diagram) Lexical Input → Disintegrator (NLP + StV) → Context analyzer + Decision maker: context parser fed by context memory, knowledge engine, and emotion engine, plus a deep-learning model (sentence-to-sentence + context-aware word generator) → Response generator: sentence generator = grammar generator → tone generator → Lexical Output


  30. Making models
    The importance of Prototyping


  31. Creating ML models
    ▪ Define: input function, step function, evaluator, batch
    ▪ Prepare: train dataset, test dataset, runtime environment
    ▪ Make: Estimator, Optimizer
    ▪ Do: training, testing, predicting
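    Below is a minimal sketch of this Define / Prepare / Make / Do flow, assuming the TensorFlow 0.9/0.10-era tf.contrib.learn API used elsewhere in this deck; the toy logistic-regression model and the random data are placeholders, not the bot model.

    import numpy as np
    from tensorflow.contrib import learn

    # Define: a model function mapping (X, y) to predictions and a loss.
    def toy_model(X, y):
        return learn.models.logistic_regression(X, y)

    # Prepare: train / test datasets and their labels (random placeholders here).
    X_train = np.random.rand(100, 4).astype(np.float32)
    y_train = np.random.randint(0, 2, 100)
    X_test = np.random.rand(20, 4).astype(np.float32)
    y_test = np.random.randint(0, 2, 20)

    # Make: an Estimator with its optimizer settings.
    classifier = learn.TensorFlowEstimator(model_fn=toy_model, n_classes=2,
                                           steps=500, optimizer='Adam', learning_rate=0.01)

    # Do: training, then testing / predicting.
    classifier.fit(X_train, y_train)
    print('accuracy:', np.mean(classifier.predict(X_test) == y_test))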


  35. Model chain order
    (Diagram) Lexical Input → Disintegrator (NLP + StV) → Context analyzer + Decision maker (AI) → Response generator: grammar generator → tone generator (sentence generator) → Lexical Output


  36. Model chain order
    (Diagram, with the data at each step) Lexical Input (normal text) → Disintegrator (NLP + StV) → fragmented text sequence → Context analyzer + Decision maker (AI): semantic sequence → fragmented text sequence → Grammar generator → (almost) normal text → Tone generator → text with tones → Lexical Output


  37. Disintegrator
    ▪ a.k.a. morpheme analyzer for speech / talk analysis
    ▪ Input
    ▪ Text as conversation
    ▪ Output
    ▪ Ordered word fragments


  38. Disintegrator
    ▪ Twitter Korean analyzer
    ▪ Compact and very fast
    ▪ Can be easily adopted through the KoNLPy package
    ▪ Komoran can be a good alternative (given enough time)
    ▪ Komoran with the ko_restoration package (https://github.com/lynn-hong/ko_restoration)
    ▪ Increases both model training accuracy and speed
    ▪ However, it is soooooooo slow... (> 100 times longer execution time)


  39. Disintegrator
    import konlpy

    def get_training_data_by_disintegration(sentence):
        # Normalized / stemmed morphemes for the bot & grammar-model input ...
        disintegrated_sentence = konlpy.tag.Twitter().pos(sentence, norm=True, stem=True)
        # ... and the lightly tokenized original for the grammar-model output
        original_sentence = konlpy.tag.Twitter().pos(sentence)
        inputData = []
        outputData = []
        is_asking = False
        for w, t in disintegrated_sentence:
            if t not in ['Eomi', 'Josa', 'Number', 'KoreanParticle', 'Punctuation']:
                inputData.append(w)
        for w, t in original_sentence:
            if t not in ['Number', 'Punctuation']:
                outputData.append(w)
        if original_sentence[-1][1] == 'Punctuation' and original_sentence[-1][0] == '?':
            if len(inputData) != 0 and len(outputData) != 0:
                is_asking = True  # To extract ask-response raw data
        return ' '.join(inputData), ' '.join(outputData), is_asking

    get_training_data_by_disintegration


  40. Sample disintegrator
    ▪ Super simple disintegrator using twitter Korean analyzer (with KoNLPy interface)
    나는 오늘 아침에 된장국을 먹었습니다.
    [('나', 'Noun'), ('는', 'Josa'), ('오늘', 'Noun'), ('아침', 'Noun'), ('에', 'Josa'), ('된장국',
    'Noun'), ('을', 'Josa'), ('먹다', 'Verb'), ('.', 'Punctuation')]
    나 오늘 아침 된장국 먹다
    (venv) disintegrator » python test.py
    Original : 나는 오늘 아침에 된장국을 먹었습니다.
    Disintegrated for bot / grammar input : 나 오늘 아침 된장국 먹다
    Training data for grammar model output: 나 는 오늘 아침 에 된장국 을 먹었 습니다
    I ate miso soup this morning.
    I / this morning / miso soup / eat
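    A minimal driver for the slide 39 function, run on the example sentence above; the expected output strings follow the slide, though exact morphemes can vary with the analyzer version.

    ask, grammar_target, is_asking = get_training_data_by_disintegration(
        '나는 오늘 아침에 된장국을 먹었습니다.')
    print('Disintegrated for bot / grammar input :', ask)             # 나 오늘 아침 된장국 먹다
    print('Training data for grammar model output:', grammar_target)  # 나 는 오늘 아침 에 된장국 을 먹었 습니다
    print('Is it a question?', is_asking)                             # False (no trailing '?')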


  41. Data recycling / reusing
    ▪ Data recycling
    ▪ Input of disintegrator → Output of grammar model
    ▪ Output of disintegrator → Input of grammar model
    original sentence (output for grammar model): 그럼 다시 한 번 프로듀서 께서 소신 표명 을 해주시 겠 어요 ?
    Disintegrated sentence (input for grammar model): 그렇다 다시 하다 번 프로듀서 소신 표명 해주다
    original sentence (output for grammar model): 저기 . 그러니까 .
    Disintegrated sentence (input for grammar model): 저기 그러니까
    original sentence (output for grammar model): 프로듀서 로서 아직 경험 은 부족하지 만 아무튼 열심히 하겠 습니다 .
    Disintegrated sentence (input for grammar model): 프로듀서 로서 아직 경험 부족하다 아무튼 열심히 하다
    original sentence (output for grammar model): 꿈 은 다 함께 톱 아이돌 !
    Disintegrated sentence (input for grammar model): 꿈 다 함께 톱 아이돌
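    A small sketch of the recycling rule above, reusing the slide 39 function: the disintegrated form becomes the grammar model's input and the lightly tokenized original becomes its target. raw_sentences is a placeholder list of sentences.

    def make_grammar_pairs(raw_sentences):
        pairs = []
        for sentence in raw_sentences:
            bot_input, grammar_target, _ = get_training_data_by_disintegration(sentence)
            # output of disintegrator -> input of grammar model
            # original sentence       -> output of grammar model
            pairs.append((bot_input, grammar_target))
        return pairs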


  42. Conversation bot model
    ▪ Embedding RNN sequence-to-sequence model for chit-chat
    ▪ For testing purposes: 4-layer to 8-layer shallow learning (without input / output layers)
    ▪ Use tensorflow.contrib.learn (formerly the skflow package, with a scikit-learn-style interface)
    ▪ Simpler and easier than a traditional (3 months ago?) handcrafted RNN
    ▪ Of course, seq2seq, LSTMCell, and GRUCell are all bundled!
    What is a deep-learning model? According to review papers, ML models with > 10 layers are. And that is changing now... it became a buzzword.


  43. Context parser
    ▪ Challenges
    ▪ Continuous conversation
    ▪ Context-aware talks
    ▪ Ideas
    ▪ Context memory
    ▪ Knowledge engine
    ▪ Emotion engine
    (Diagram) Context parser fed by context memory, knowledge engine, and emotion engine


  44. Context parser input: memory and emotion
    ▪ Context memory as short-term memory
    ▪ Memorizes the current context (variable categories; tested with 4-type situations)
    ▪ Emotion engine as a model
    ▪ Understands the past / current emotion of the user
    ▪ Use context memory / emotion engine as
    ▪ The first inputs of the context parser model (for training / serving)
    (Diagram) Disintegrated sentence fragments + context memory + emotion engine → input


  45. Emotion engine
    ▪ Input: text sequence
    ▪ Output: emotion flag (6-type / 3-bit)
    ▪ Training set
    ▪ Sentences with 6-type categorized emotion
    ▪ Uses SentiWordNet to extract emotion
    ▪ 6-axis emotional space built with a Word2Vec model
    ▪ Current emotion indicator: the most-weighted emotion axis in that space
    Illustration *(c) http://ontotext.fbk.eu/
    Position in senti-space: [0.95, 0.14, 0.01, 0.05, 0.92, 0.23] (axes 1–6) → [1, 0, 0, 0, 0, 0] → 0x01
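    A hedged sketch of the flag encoding above: pick the most-weighted axis of the 6-dimensional senti-space vector, one-hot it, and keep the axis index as a 3-bit flag. The vector values follow the slide; the function name is made up.

    def emotion_flag(senti_vector):
        # senti_vector: six weights, one per emotion axis
        strongest = max(range(6), key=lambda i: senti_vector[i])
        one_hot = [1 if i == strongest else 0 for i in range(6)]
        return one_hot, strongest + 1   # axis index 1..6 fits in 3 bits

    one_hot, flag = emotion_flag([0.95, 0.14, 0.01, 0.05, 0.92, 0.23])
    # one_hot == [1, 0, 0, 0, 0, 0], flag == 0x01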


  46. Knowledge engine
    ▪ Advanced topic: not necessary for chit-chat bots
    ▪ Searches tokenized knowledge related to the current conversation
    ▪ Querying information
    ▪ If the target of the conversation is a query, use the knowledge engine result as input to the sentence generator
    ▪ If the information fitness is high enough, knowledge + template shows great results
    ▪ That's why information-serving bots will be the first to reach us
    ▪ Big topic: I'll not cover it today.


  47. Sentence generator
    ▪ Generates a human-understandable sentence as the reply in a conversation
    ▪ Idea
    ▪ Thinking and speaking are separate processes in the brain
    ▪ So why use the same model for both?
    ▪ Models
    ▪ Consists of two models: grammar generator + tone generator
    ▪ Why separate models?
    ▪ Training cost
    ▪ More useful: various tones for user preferences
    Clip art *Lego ©


  48. Grammar generator
    ▪ Assembling sentence from word sequence
    ▪ Input: Sequence of Nouns, pronouns, verbs, adjectives
    ▪ sentence without postpositions / conjunction.
    ▪ Output: Sequence of normal / monotonic sentence


  49. Grammar generator
    ▪ Training set
    ▪ Make sequence by disintegrating normal sentence
    ▪ Remove postpositions / conjunction from sequence
    ▪ Normalize nouns, verbs, adjectives
    ▪ Model
    ▪ 3-layer Sequence-to-sequence model
    ▪ Estimator: ADAM optimizer with GRU cell
    ▪ Adagrad with LSTM cell is also ok. In my case, ADAM+GRU works slightly better. (Data
    size effect?)
    ▪ Hidden feature size of GRU cell: 25, Embedding dimension for each word: 25.


  50. RNN seq2seq grammar model
    import tensorflow as tf
    from tensorflow.contrib import learn

    HIDDEN_SIZE = 25       # hidden feature size of each GRU cell
    EMBEDDING_SIZE = 25    # embedding dimension for each word

    # n_disintegrated_words / n_recovered_words / MAX_DOCUMENT_LENGTH come from
    # the vocabulary preprocessing step.
    def grammar_model(X, y):
        # Embed disintegrated-word ids into dense vectors
        word_vectors = learn.ops.categorical_variable(X,
            n_classes=n_disintegrated_words,
            embedding_size=EMBEDDING_SIZE, name='words')
        in_X, in_y, out_y = learn.ops.seq2seq_inputs(
            word_vectors, y, MAX_DOCUMENT_LENGTH, MAX_DOCUMENT_LENGTH)
        encoder_cell = tf.nn.rnn_cell.GRUCell(HIDDEN_SIZE)
        # Project decoder outputs onto the recovered-word vocabulary
        decoder_cell = tf.nn.rnn_cell.OutputProjectionWrapper(
            tf.nn.rnn_cell.GRUCell(HIDDEN_SIZE), n_recovered_words)
        decoding, _, sampling_decoding, _ = learn.ops.rnn_seq2seq(in_X, in_y,
            encoder_cell, decoder_cell=decoder_cell)
        return learn.ops.sequence_classifier(decoding, out_y, sampling_decoding)

    Simple grammar model (word-based model with GRUCell and RNN seq2seq / TensorFlow translation example)
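    Not shown on the slide: a hedged sketch of how such a model function could be wired into the 0.10-era Estimator API that slide 80 later restores. X_train / y_train are assumed to be padded word-id matrices built from the disintegrated / original sentence pairs, and the hyperparameters loosely follow slide 49.

    classifier = learn.TensorFlowEstimator(
        model_fn=grammar_model, n_classes=n_recovered_words,
        optimizer='Adam', learning_rate=0.01,        # ADAM + GRU, as on slide 49
        batch_size=64, steps=10000, continue_training=True)
    classifier.fit(X_train, y_train)
    classifier.save('./grammar')   # later restored via learn.TensorFlowEstimator.restore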


  51. Tone generator
    ▪ "Tones" make a sentence more human-like
    ▪ Every sentence carries the speaker's tone
    ▪ The most important part of building the "pretty girl chat bot"
    ▪ Model
    ▪ 3-layer sequence-to-sequence model
    ▪ Almost the same as the grammar model (only the training set differs)
    ▪ Can also be used to make the chat bot speak in dialects


  52. Tone generator
    ▪ Input: sentence without tones
    ▪ Output: sentence with tones
    ▪ Data: normal sentences from various conversation sources
    ▪ Training / test set
    ▪ Remove tones from the normal sentences
    ▪ Morpheme analysis effectively removes the tone from a sentence.


  53. Useful tips
    ▪ The sequence-to-sequence model is inappropriate for the bot engine
    ▪ It easily diverges during training
    ▪ Of course, RNN training will not work either
    ▪ In this case, the input / output sequence relationship is too complex
    ▪ Very hard to inject context-awareness into the conversation
    ▪ A context-aware response needs to be "generated" not only from the question, but also from context data / the knowledge base / the decision-making process
    ▪ Idea: turn the input sequence into a semantic bundle
    ▪ It will work, I guess...


  54. Useful tips
    ▪ The sequence-to-sequence model really works well for the grammar / tone engines
    ▪ This is the important point for today.


  55. Training models
    Goal is near here


  56. Training bot model
    ▪ Input
    ▪ Disintegrated sentence sequence without postpositions / conjunctions
    ▪ Emotion flag (3 bits)
    ▪ Context flag (extensible; appended to the sentence as a special indicator / 2 bits)
    ▪ Output
    ▪ Answer sequence of nouns, pronouns, verbs, adjectives
    ▪ Learning
    ▪ Supervised learning (for the simple communication model / replaces templates)
    ▪ Reinforcement learning (for the emotion / context flags, on-the-fly production)
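    A hedged sketch of composing one training input as described above: the disintegrated sequence plus the emotion and context flags appended as special indicator tokens. The _EMO_ / _CTX_ token names are invented for illustration.

    def make_bot_input(disintegrated_sentence, emotion_flag, context_flag):
        tokens = disintegrated_sentence.split()
        tokens.append('_EMO_{:03b}'.format(emotion_flag))   # 3-bit emotion flag, e.g. _EMO_001
        tokens.append('_CTX_{:02b}'.format(context_flag))   # 2-bit context flag, e.g. _CTX_00
        return tokens

    make_bot_input('설마 날 신경 써주다 있다', emotion_flag=1, context_flag=0)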


  57. Training bot model
    ▪ Training set
    ▪ FAS log data ( http://antispam.textcube.org )
    ▪ 2006~2016 (from EAS data) / comments on weblogs / log size ~1 TB (including spam)
    ▪ Visited and crawled the non-spam data, following comment links (~26 GB / MariaDB)
    ▪ Original / reply pairs as input / output
    ▪ Preprocessing
    ▪ Remove non-Korean characters from the data
    ▪ Anonymize the data: id / name / e-mail information
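    A hedged sketch of the comment preprocessing above (dropping e-mail addresses and non-Korean characters); the patterns are illustrative only and id / name anonymization is not shown.

    import re

    EMAIL = re.compile(r'[\w.+-]+@[\w-]+\.[\w.-]+')
    NON_KOREAN = re.compile(r'[^가-힣0-9\s.,?!]')

    def preprocess_comment(text):
        text = EMAIL.sub(' ', text)        # anonymize: drop e-mail addresses entirely
        text = NON_KOREAN.sub(' ', text)   # remove non-Korean characters
        return re.sub(r'\s+', ' ', text).strip()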


  58. Training grammar generator
    ▪ Original data set
    ▪ Open books without license problems ( https://ko.wikisource.org )
    ▪ Comments are not a good dataset for learning grammar
    ▪ Preprocessing
    ▪ Input data: disintegrated sentence sequence
    ▪ Output data: original sentence sequence


  59. Training tone generator
    ▪ Original data set
    ▪ Open books without license problems
    ▪ Extract sentences wrapped in quotation marks
    ▪ e.g. "집에서 온 편지유? 무슨 걱정이 생겼수?"
    ▪ Preprocessing
    ▪ Input data: sentence sequence without tone
    ▪ e.g. "집에서 온 편지? 무슨 걱정 생기다?" (using the morpheme analyzer)
    ▪ Output data: original sentence sequence
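    A hedged sketch of the quoted-dialogue extraction above: pull sentences wrapped in (straight or curly) double quotes out of open-book text for the tone training set.

    import re

    QUOTED = re.compile(r'[\u201c"]([^\u201c\u201d"]+)[\u201d"]')

    def extract_quoted(text):
        return [match.strip() for match in QUOTED.findall(text)]

    extract_quoted('... "집에서 온 편지유? 무슨 걱정이 생겼수?" ...')
    # ['집에서 온 편지유? 무슨 걱정이 생겼수?']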


  60. One page summary
    The simplest is the best


  61. Final structure, with an example
    Lexical Input: 설마 날 신경써주고 있는 거야?
    → Disintegrator (NLP + StV): 설마 날 신경 써주다 있다
    → Context analyzer (context parser + context memory / knowledge engine / emotion engine): [GUESS] 날 [CARE] [PRESENT]
    → Decision maker (deep-learning model: sentence-to-sentence + context-aware word generator): 어제 네 기운 없다
    → Grammar generator: 어제 네가 기운이 없길래
    → Tone generator: 어제 네가 기운이 없길래 요
    → Lexical Output


  62. Final structure, with the same example (English gloss)
    Lexical Input: No way, are you caring me now?
    → Disintegrator (NLP + StV): no way you care I now
    → Context analyzer: [GUESS] I [CARE] [PRESENT]
    → Decision maker (deep-learning model): because yesterday you tired
    → Grammar generator: Because you looked tired yesterday
    → Tone generator: Because you looked tired yesterday hmm
    → Lexical Output


  63. But this is not what I promised...
    at PyCON APAC abstract


  64. Making a 미소녀 (pretty girl) bot
    Let's make an anime character bot (as I promised)!


  65. Data source
    ▪ Subtitle (caption) files of many animations!
    ▪ Prototyping
    ▪ Idol master conversation scripts (translated by online fans)
    ▪ Field tests
    ▪ Animations with only female characters


  66. Data converter (subtitle_converter.py)
    ▪ Convert .smi files to .srt
    ▪ Join the .srt files into one .txt (cat *.srt >> data.txt)
    ▪ Remove timestamps and blank lines
    ▪ Remove logo / ending song scripts: lines with Japanese characters, and the lines that follow them
    ▪ Fetch character names, nouns, and numbers using a custom dictionary (anime characters, locations, specific nouns)
    *.smi is the de facto standard file format for movie captions in Korea
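    A hedged sketch of the cleaning steps above (not the author's subtitle_converter.py): strip SRT index / timestamp lines and blank lines, and drop lines containing Japanese kana together with the line that follows each of them.

    import re

    TIMESTAMP = re.compile(r'^\d+$|^\d{2}:\d{2}:\d{2},\d{3} --> ')
    JAPANESE = re.compile(r'[\u3040-\u30ff\u31f0-\u31ff]')   # hiragana / katakana

    def clean_captions(lines):
        cleaned, skip_next = [], False
        for line in lines:
            line = line.strip()
            if skip_next:                     # drop the line after a Japanese one
                skip_next = False
                continue
            if not line or TIMESTAMP.match(line):
                continue
            if JAPANESE.search(line):         # logo / ending song scripts
                skip_next = True
                continue
            cleaned.append(line)
        return cleaned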


  67. Extract conversations (subtitle_converter.py, pandas)
    ▪ Reformat: merge sliced captions into one line
    ▪ Extract ask-response pairs:
        if last_sentence[-1] == '?':
            conversation.add((last_sentence, current_sentence))
    ▪ Remove: too-short sentences, duplicates
    ▪ Conversation data for the sequence-to-sequence bot model → train the bot model
    ▪ Sentence data for the disintegrator / grammar model / tone model → train the disintegrator together with the grammar and tone models
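    A hedged sketch of the ask-response extraction above: pair each caption line that ends with '?' with the line that follows it, then drop very short sentences and duplicates. The length threshold is arbitrary.

    def extract_conversations(lines, min_words=2):
        pairs = set()
        for last_sentence, current_sentence in zip(lines, lines[1:]):
            if last_sentence.endswith('?') \
                    and len(last_sentence.split()) >= min_words \
                    and len(current_sentence.split()) >= min_words:
                pairs.add((last_sentence, current_sentence))
        return sorted(pairs)   # the set removes duplicates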


  68. Conveniences for the demo
    ▪ Simple bot engine
    ▪ Ask-response sentence similarity match engine (similar to a template engine)
    ▪ Merge the grammar model with the tone model
    ▪ Grammar is not that important for an anime character bot?
    ▪ Loose parameter set
    ▪ For fast convergence: the data is neither big nor too diverse
    ▪ No knowledge engine
    ▪ We just want to talk with him / her.


  69. I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
    I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
    I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
    I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so locally
    I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
    total conversations: 4217
    Transforming...
    Total words, asked: 1062, response: 1128
    Steps: 0
    I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:924] successful NUMA node read from SysFS had
    negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
    name: GeForce GTX 970
    major: 5 minor: 2 memoryClockRate (GHz) 1.304
    pciBusID 0000:01:00.0
    Total memory: 4.00GiB
    Free memory: 3.92GiB
    I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
    I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y
    I tensorflow/core/common_runtime/gpu/gpu_device.cc:806] Creating TensorFlow device (/gpu:0) -> (device:
    0, name: GeForce GTX 970, pci bus id: 0000:01:00.0)
    I tensorflow/core/common_runtime/gpu/pool_allocator.cc:244] PoolAllocator: After 1501 get requests,
    put_count=1372 evicted_count=1000 eviction_rate=0.728863 and unsatisfied allocation rate=0.818787
    I tensorflow/core/common_runtime/gpu/pool_allocator.cc:256] Raising pool_size_limit_ from 100 to 110
    I tensorflow/core/common_runtime/gpu/pool_allocator.cc:244] PoolAllocator: After 2405 get requests,
    put_count=2388 evicted_count=1000 eviction_rate=0.41876 and unsatisfied allocation rate=0.432432
    I tensorflow/core/common_runtime/gpu/pool_allocator.cc:256] Raising pool_size_limit_ from 256 to 281
    Bot training procedure (initialization)


  70. ask: 시 분 시작 하다 이 것 대체 .
    response (pred): NAME 해오다 .
    response (gold): NAME 죄송하다.
    ask: 쟤 네 사무소 주제 너무 하다 거 알다.
    response (pred): NAME 해오다 .
    response (gold): 아깝다 꼴 찌다 주목 다 받다
    ask: 아니다 .
    response (pred): NAME 해오다 .
    response (gold): 더 못 참다
    ask: 이렇다 상태 괜찮다 .
    response (pred): 이렇다 여러분 .
    response (gold): NOUN 여러분.
    ask: 기다리다 줄 수 없다 .
    response (pred): 네 충분하다 기다리다 .
    response (gold): 네 충분하다 기다리다.
    ask: 넌 뭔가 생각 하다 거 있다 .
    response (pred): 물론 이 .
    response (gold): 물론 이.
    Bot model training procedure (after first fitting)
    Bot model training procedure (after 50 more fittings)
    Trust me. Your NVIDIA card can not only play Overwatch, but run this, too.


  71. I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
    I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
    I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
    I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so locally
    I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
    total line: 7496
    Fitting dictionary for disintegrated sentence...
    Fitting dictionary for recovered sentence...
    Transforming...
    Total words pool size: disintegrated: 3800, recovered: 5476
    I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:924] successful NUMA node read from SysFS had
    negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
    name: GeForce GTX 970
    major: 5 minor: 2 memoryClockRate (GHz) 1.304
    pciBusID 0000:01:00.0
    Total memory: 4.00GiB
    Free memory: 3.92GiB
    I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
    I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y
    I tensorflow/core/common_runtime/gpu/gpu_device.cc:806] Creating TensorFlow device (/gpu:0) -> (device: 0,
    name: GeForce GTX 970, pci bus id: 0000:01:00.0)
    I tensorflow/core/common_runtime/gpu/pool_allocator.cc:244] PoolAllocator: After 1501 get requests,
    put_count=1372 evicted_count=1000 eviction_rate=0.728863 and unsatisfied allocation rate=0.818787
    I tensorflow/core/common_runtime/gpu/pool_allocator.cc:256] Raising pool_size_limit_ from 100 to 110
    I tensorflow/core/common_runtime/gpu/pool_allocator.cc:244] PoolAllocator: After 2405 get requests,
    put_count=2388 evicted_count=1000 eviction_rate=0.41876 and unsatisfied allocation rate=0.432432
    I tensorflow/core/common_runtime/gpu/pool_allocator.cc:256] Raising pool_size_limit_ from 256 to 281
    Grammar+Tone model training procedure (initialization)


  72. disintegrated: 올해 우리 프로덕션 NOUN 의 활약 섭외 들어오다 .
    recovered (pred): 그래서 저기 들 나요 .
    recovered (gold): 올해 는 우리 프로덕션 도 NOUN 의 활약 으로 섭외 가 들어왔 답 니다.
    disintegrated: 둘 다 왜 그렇다 .
    recovered (pred): 어머 어머 아 .
    recovered (gold): 둘 다 왜 그래.
    disintegrated: 정말 우승 하다 것 같다 .
    recovered (pred): 정말 를 .
    recovered (gold): 정말 우승할 것 같네 요.
    disintegrated: 아 진짜 .
    recovered (pred): 아 아 을까 .
    recovered (gold): 아 진짜.
    disintegrated: 호흡 딱 딱 맞다 .
    recovered (pred): 무슨 을 .
    recovered (gold): 호흡 이 딱 딱 맞 습니다.
    disintegrated: 무슨 소리 NAME .
    recovered (pred): 무슨 소리 음 .
    recovered (gold): 무슨 소리 야 NAME.
    disintegrated: 너 맞추다 또 넘어지다 거 잖다 .
    recovered (pred): 너 겹친 또 넘어질 거 .
    recovered (gold): 너 한테 맞춰 주 면 또 넘어질 거 잖아.
    disintegrated: 중계 나름 신경 써주다 .
    recovered (pred): 무대 에서도 을 신경 .
    recovered (gold): 중계 에서도 나름 대로 신경 을 써줘.
    Grammar+Tone model training procedure (after first fitting)
    Grammar+Tone model training procedure (after 10 more fittings)
    The grammar model converges fast. With a GPU, it converges much faster.


  73. Training speed test
    (Chart) Calculation time, scaled to 100%: CPU-only vs. GPU (GTX 970), for grammar training and bot training
    And you really need a GPU-accelerated environment to make them work.


  74. Useful tips for the anime character bot
    ▪ DO NOT MIX subtitles from different anime
    ▪ The grammar model easily diverges during training. Strange, huh?
    ▪ Does it come from the different translators' tones? Need to check why.
    ▪ Choose animations with an extreme gender ratio
    ▪ It is very hard to separate gender-specific conversations in the data
    ▪ The tones of Japanese animation characters differ greatly by the speaker's gender
    ▪ Just choose a boys-only / girls-only animation for easy data categorization


  75. And today's obstacles
    ▪ As of TensorFlow 0.9RC, Estimator/TensorFlowEstimator.restore was removed and has not returned yet
    ▪ I can create / train the model but cannot load it with the original code on TF 0.10RC.
    ▪ Made some tricks for today's demo
    ▪ Auto-generated talk templates from the bot
    ▪ Response matcher (matches the ask sentence and returns a response from the template pool)
    ▪ The conversation dataset is too small to create a conversation model
    ▪ Talks are not smooth
    ▪ It easily diverges. Train many, many models to get a proper result.


  76. Serving
    Like peasant in Warcraft (OR workleft?)


  77. Telegram API
    ▪ Why Telegram?
    ▪ Telegram is my primary messenger
    ▪ Implementing the API is as easy as writing an echo bot
    ▪ Works well with Python 3


  78. Serving Telegram bot
    ▪ Python 3
    ▪ Supervisor (for continuous serving)
    [program:pycon-bot]
    command = /usr/bin/python3 /home/ubuntu/pycon_bot/serve.py
    /etc/supervisor/conf.d/pycon_bot.conf
    ~$ pip3 install python-telegram-bot
    Install python-telegram-bot package
    ubuntu@ip-###-###-###-###:~$ sudo supervisorctl
    pycon-bot RUNNING pid 12417, uptime 3:29:52
    supervisorctl


  79. Bot serving code
    from telegram import Updater
    # the start / help handlers are assumed to live in pycon_bot alongside the rest
    from pycon_bot import pycon_bot, start, error, model_server

    bot_server = None
    grammar_server = None

    def main():
        global bot_server, grammar_server
        updater = Updater(token='[TOKENS generated via bot_father]')
        job_queue = updater.job_queue
        dispatcher = updater.dispatcher
        dispatcher.addTelegramCommandHandler('start', start)
        dispatcher.addTelegramCommandHandler('help', start)
        dispatcher.addTelegramMessageHandler(pycon_bot)
        dispatcher.addErrorHandler(error)
        bot_server = model_server('./bot', 'ask.vocab', 'response.vocab')
        grammar_server = model_server('./grammar', 'fragment.vocab', 'result.vocab')
        updater.start_polling()
        updater.idle()

    if __name__ == '__main__':
        main()

    /home/ubuntu/pycon_bot/serve.py


  80. Model server
    import pickle
    from tensorflow.contrib import learn

    class model_server(object):
        """ pickle version of TensorFlow model server """
        def __init__(self, model_path='.', x_proc_path='', y_proc_path=''):
            self.classifier = learn.TensorFlowEstimator.restore(model_path)
            # Vocabulary processors saved during training, restored via pickle
            self.X_processor = pickle.loads(open(model_path + '/' + x_proc_path, 'rb').read())
            self.y_processor = pickle.loads(open(model_path + '/' + y_proc_path, 'rb').read())

        def predict(self, input_data):
            X_test = self.X_processor.transform(input_data)
            prediction = self.classifier.predict(X_test, axis=2)
            return self.y_processor.reverse(prediction)

    pycon_bot.model_server


  81. Bot engine code
    import konlpy

    def pycon_bot(bot, update):
        msg = disintegrate(update.message.text)
        raw_response = bot_server.predict(msg)
        response = grammar_server.predict(raw_response)
        bot.sendMessage(chat_id=update.message.chat_id, text=' '.join(response))

    def disintegrate(sentence):
        disintegrated_sentence = konlpy.tag.Twitter().pos(sentence, norm=True, stem=True)
        result = []
        for w, t in disintegrated_sentence:
            if t not in ['Eomi', 'Josa', 'Number', 'KoreanParticle', 'Punctuation']:
                result.append(w)
        return ' '.join(result)

    pycon_bot.pycon_bot
    pycon_bot.disintegrate


  82. Result
    That's one small step for a man, one giant leap for anime fans.
    Illustration *(c) Bandai Namco Games


  83. Hi
    When will we open this bot to public?
    Sorry Jeongkyu.
    Sorry? Why?
    I apologize to seniors
    ;;;
    [ERROR]
    The weather is so hot.
    Suddenly but I feel sorry
    What makes you feel like that?
    Nowadays I lose my concentration.
    Ah. sometimes I do too.
    Getting stressful?
    My work is very stressful.
    Let’s not be nervous today.
    I’m still nervous.
    Illustration * © Idol M@aster / Bandai Namco Games. All rights reserved.


  84. And finally... created a pretty sad bot.
    Reason? Idol M@ster's conversations are mostly about failure and recovery rather than success.
    Illustration * © Idol M@aster / Bandai Namco Games. All rights reserved.


  85. Summary
    ▪ Today
    ▪ Covers garage chat bot making procedure
    ▪ Making chat bot with TensorFlow + Python 3
    ▪ My contributions / insight to you
    ▪ Multi-modal Learning models / structures for chat-bots
    ▪ Idea to generate “data” for chat-bots


  86. And next...
    ▪ A suggestion from Shin Yeaji (PyCon APAC staff) and my wife this week
    ▪ Train the bot with some animations unknown (to me).
    ▪ Finish anonymizing the FAS data and re-train the bot with TensorFlow
    ▪ In fact, the FAS data-based bot runs on Caffe. (http://caffe.berkeleyvision.org/)
    ▪ Preparing this talk encouraged me to migrate my Caffe projects to TensorFlow
    ▪ Test seq2seq as the bot engine?
    ▪ By turning the input sequence into a semantic bundle


  87. Home assignment
    ▪ If you are a Loveliver*, you already know what to do.
    Internet meme * (c) Marble Entertainment / inven.co.kr
    Are you L..?
    Idol M@ster?
    *A fan of Love Live! (another Japanese animation)


  88. Home assignment
    ▪ If your native language is English, how about making
    ?


  89. –Bryce Courtenay's The Power of One
    “First with the head, then with the heart.”


  90. Thank you for listening :)
    @inureyes
    Slides available via pycon.kr
    Code will be available at https://github.com/inureyes/pycon-apac-2016


  91. Selected references
    ▪ De Brabandere, B., Jia, X., Tuytelaars, T., & Van Gool, L. (2016, June 1). Dynamic Filter Networks. arXiv.org.
    ▪ Noh, H., Seo, P. H., & Han, B. (2015, November 18). Image Question Answering using Convolutional Neural Network with Dynamic Parameter Prediction. arXiv.org.
    ▪ Andreas, J., Rohrbach, M., Darrell, T., & Klein, D. (2015, November 10). Neural Module Networks. arXiv.org.
    ▪ Bengio, S., Vinyals, O., Jaitly, N., & Shazeer, N. (2015, June 10). Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks. arXiv.org.
    ▪ Jordan, M. I., & Mitchell, T. M. (2015). Machine learning: Trends, perspectives, and prospects. Science, 349(6245), 253–255. http://doi.org/10.1126/science.aac4520
    ▪ Bahdanau, D., Cho, K., & Bengio, Y. (2014, September 2). Neural Machine Translation by Jointly Learning to Align and Translate. arXiv.org.
    ▪ Schmidhuber, J. (2014, May 1). Deep Learning in Neural Networks: An Overview. arXiv.org. http://doi.org/10.1016/j.neunet.2014.09.003
    ▪ Zaremba, W., Sutskever, I., & Vinyals, O. (2014, September 8). Recurrent Neural Network Regularization. arXiv.org.
    ▪ Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013, January 17). Efficient Estimation of Word Representations in Vector Space. arXiv.org.
    ▪ Smola, A., & Vishwanathan, S. V. N. (2010). Introduction to Machine Learning.
    ▪ Schmitz, C., Grahl, M., Hotho, A., & Stumme, G. (2007). Network properties of folksonomies. World Wide Web.
    ▪ Esuli, A., & Sebastiani, F. (2006). SentiWordNet: A publicly available lexical resource for opinion mining. Proceedings of LREC.
