AI-driven app that helps students and teachers with learning English
‣ Role-playing chatbots + automatic English assessment
‣ K12, universities, businesses
‣ iOS & Android; browser version on its way
Joyz, Inc.

case, it's:
‣ (Statistical) natural language processing
‣ Acoustics and speech processing
‣ Statistical modelling of students' abilities and learning processes

to personal taste
‣ Test suite
‣ Basic but usable contrib.auth module
‣ MIGRATION!
‣ Easier to hire for
Django allows the team to grow without slowing down.

three words: "I", "love" and "Django"
‣ Six permutations.
‣ "I love Django.", "Django, I love."
‣ What's YOUR vocabulary like?
Best to build a language model (LM) for learners.

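To make the "six permutations" point concrete, here is a minimal sketch that enumerates every ordering of the three words from the slide. It only uses the standard library; the variable names are illustrative and not taken from the talk. A language model's job would then be to decide which of these orderings a learner is likely to say, which this sketch does not attempt.

```python
from itertools import permutations

# The three-word vocabulary from the slide.
words = ["I", "love", "Django"]

# Every ordering of three distinct words: 3! = 6 candidate word sequences.
orderings = [" ".join(p) for p in permutations(words)]

for sentence in orderings:
    print(sentence)
```

Even with three words, only some orderings are plausible utterances, which is one way to read the slide's suggestion to build the model around what learners actually say.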
need time to pause when speaking
‣ No need for standing connection
‣ Some need references (e.g. a dictionary) when formulating what to say
Timescale of interaction is similar to that of web apps (1 to 10 s)

speech triggers transition
‣ Pronunciation, grammar, semantics
‣ In process of moving to a document-based approach
  ‣ Document & format versioning
  ‣ Need to explore serialiser options

stateless processes
‣ S3 for static resources
‣ Cache for hot dynamic content
Peak times: morning to early afternoon for K12, evening for universities and businesses

Django + Celery app
‣ Avoid cyclic dependencies (not even dirty hacks!)
2. Factor out to microservice if:
‣ RAM/CPU profiles are vastly different
‣ Code is very stable and/or a dedicated team can be assigned

Models (GBs of RAM)
‣ Parsers (GBs of RAM)
‣ Django + gunicorn needs multiple processes
‣ Hybrid setup (multiple web/worker processes + 1 process for each NLP) or microservice

‣ Speech data into phoneme-wise pronunciation accuracy
‣ Estimating vocabulary
‣ Measuring fluency
‣ Data query, aggregation, analysis using NLP microservices

Data processing
‣ Computational graph with multiple roots
    # Node D has two roots, B and C
    # A is an input node, i.e. it outputs raw data
    edges = [(A, B), (A, C), (B, D), (C, D)]
    # There can be cases where nodes can't even be organised into distinct layers like the example above

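One way to schedule a graph like the `edges` list above is a topological sort. The sketch below is an assumption about how such a graph could be processed, not the talk's implementation: it uses Python's standard `graphlib.TopologicalSorter` (Python 3.9+) and a hypothetical helper `parallel_batches` that groups nodes whose inputs are all finished, so each group could be dispatched in parallel.

```python
from graphlib import TopologicalSorter

# Edges as on the slide, read as (upstream, downstream); nodes are plain strings here.
edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]


def parallel_batches(edges):
    """Group nodes into batches; every node in a batch has all its inputs done."""
    sorter = TopologicalSorter()
    for upstream, downstream in edges:
        # add(node, *predecessors): downstream must wait for upstream.
        sorter.add(downstream, upstream)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle

    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # nodes whose inputs are all done
        batches.append(ready)
        sorter.done(*ready)  # mark them finished so their successors unlock
    return batches


batches = parallel_batches(edges)
print(batches)
```

For this small example the batches happen to look like layers, but the same loop works for graphs that do not split into clean layers: a node simply becomes ready whenever its last input completes.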
2. Can run on the same infrastructure as other background tasks such as
‣ Talking SMTP (sending emails)
‣ Data manipulation
3. Out-of-the-box API for parallel execution + callback

‣ Machine-learning-based components: assert on statistical validation, not on the output of a single inference
‣ Online learning algorithms need a modified strategy
‣ Tests should cover training as well as inference

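As an illustration of asserting on statistical validation rather than on a single inference, the sketch below trains a toy model inside the test and checks held-out accuracy against a threshold. Everything in it is hypothetical: the one-dimensional threshold "model", the synthetic data, the function names and the 0.9 cut-off are stand-ins, not the components described in the talk.

```python
import random


def train_threshold(samples):
    """Toy model: the midpoint between the two class means. samples: [(x, label)]."""
    pos = [x for x, y in samples if y == 1]
    neg = [x for x, y in samples if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict(threshold, x):
    return 1 if x > threshold else 0


def make_data(n, rng):
    """Synthetic data: class 1 is centred at +2, class 0 at -2, unit variance."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        data.append((rng.gauss(2.0 if label else -2.0, 1.0), label))
    return data


def held_out_accuracy(seed=0, n_train=500, n_test=500):
    """Train on one sample, then measure accuracy on a separate sample."""
    rng = random.Random(seed)  # fixed seed keeps the test reproducible
    threshold = train_threshold(make_data(n_train, rng))
    test_set = make_data(n_test, rng)
    correct = sum(predict(threshold, x) == y for x, y in test_set)
    return correct / n_test


def test_model_is_statistically_valid():
    # The test covers training as well as inference, and asserts on an
    # aggregate metric instead of on the output for any single input.
    assert held_out_accuracy() >= 0.9
```

The threshold is deliberately looser than the accuracy this toy setup usually reaches, so the test tolerates ordinary sampling noise while still catching a broken training step.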
‣ Accepted answers today should still be accepted tomorrow
‣ Running behavioural tests over user logs
‣ Very large fixture
‣ Aggressive parallelisation

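The "accepted today, accepted tomorrow" idea can be sketched as a regression check that replays previously accepted answers through the current checker, in parallel. This is only a sketch under stated assumptions: `ACCEPTED_LOG`, `accepts` and `find_regressions` are hypothetical names, the log entries are made up for illustration, and a thread pool stands in for whatever parallelisation the real test suite uses.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical fixture: (prompt, learner answer) pairs that were accepted in the past.
ACCEPTED_LOG = [
    ("How are you?", "I am fine, thank you."),
    ("How are you?", "i'm fine"),
    ("What do you like?", "I love Django."),
]


def accepts(prompt, answer):
    """Stand-in for the real answer checker; here any non-empty answer passes."""
    return bool(answer.strip())


def find_regressions(log, checker, max_workers=8):
    """Re-run the checker over the log in parallel; return entries now rejected."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so verdicts line up with log entries.
        verdicts = list(pool.map(lambda entry: checker(*entry), log))
    return [entry for entry, ok in zip(log, verdicts) if not ok]


regressions = find_regressions(ACCEPTED_LOG, accepts)
print(regressions)
```

A test built on this would fail whenever `regressions` is non-empty, and the returned entries point at the exact logged answers that a change has started to reject.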
lets you move fast
‣ Test your UX before you build anything (or pick a framework)
‣ Celery is robust, performant and versatile, allowing you to build complex logic on top
‣ Tests can be heavy, so plan your CI accordingly