Predict the future and make breakfast fun, with Markov Chains


Slides for Leeds Digital Festival talk presented on 25/04/2018

Chris Core

April 25, 2018


Transcript

  1. Complete Online Safety for Social Brands, Digital Platforms, Advertisers and Kids
    Through world-leading AI and a global team of digital risk experts, we deliver complete Trust and Safety for social brands, digital platforms, advertisers and kids, protecting them from bad actors that exploit, extort, distress, offend and misinform.
  2. Hello! Chris Core, Data Collection Lead at Crisp Thinking
    • Started in 2012
    • We keep online communities safe
    • Music / Electronic Engineering @ University of Leeds
    Hobbies:
    • Guitar
    • Photography
  3. Questions
    Will do a Q&A at the end. However, for any questions during the talk, please do ask!
  4. Talk Overview
    • High-level intro to state machines and Markov chains
    • MarkovSharp
    • Baconbot and other Markov chain uses
  5. State Machines
    State machines describe transitions between a fixed set of states. Can only be in one state at once!
    Examples: water freezing/melting/boiling; game development and animation; cat.
  6. Stochastic Process
    A stochastic process is a random process evolving with time, like rolling a die to decide which way to walk. Markov chains are a type of stochastic process where the next state depends only on the current state.
  7. Markov chains: probabilistic state machines
    They represent states, and the probability of transitions between them. No matter how the process arrived at its present state, the possible future states are fixed. A state could be anything: Sunny / Rainy.
  8. Markov Chains are trained by indexing training data
    Training data sequences are split into atomic parts. E.g. for each word in a book, index that word and store the next word as a possible future state. Over the course of training, a single word might accumulate a number of different possible future states. A trained model can then be used to create output with similar characteristics to the training set.
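    This indexing step can be sketched in a few lines (a minimal 1st-order illustration in Python; MarkovSharp itself is a C# library):

    ```python
    import random
    from collections import defaultdict

    def train(words):
        """Index each word against the word that follows it (1st order)."""
        model = defaultdict(list)
        for current, nxt in zip(words, words[1:]):
            model[current].append(nxt)
        return model

    words = "the quick brown fox jumps over the lazy dog".split()
    model = train(words)
    # 'the' has accumulated two possible futures over training:
    print(model["the"])  # ['quick', 'lazy']

    def walk(model, start, steps):
        """Generate output by repeatedly sampling a possible next state."""
        out = [start]
        for _ in range(steps):
            futures = model.get(out[-1])
            if not futures:
                break
            out.append(random.choice(futures))
        return out
    ```

    Each repeated occurrence of a word in the list makes that transition proportionally more likely when sampling, so no explicit probabilities need to be stored.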
  9. As we move through a chain, other potential futures are ignored
    (Example states from the slide's diagram: Thousand, Larger, Tank, High, Herd, Bit)
  10. Possible futures can be based on a combination of multiple previous states
    Remembering more than just the last state ‘strengthens’ prediction, causing predictions to fit more tightly to the trained data. The number of prior states encoded in the chain is called the ‘order’ of the model.
  11. In the most basic types of Markov chain, states correspond to single points in a sequence
    Single words in a sentence: [‘the’], [‘red’], [‘is’], [‘the’], [‘best’]
    But states can hold multiple points in a sequence, remembering more than just the last state.
    Multiple words in a sentence: [‘the’, ‘red’], [‘red’, ’is’], [‘is’, ’the’], [‘the’, ‘best’]
  12. Example: training on words
    1st order: current state is ‘this’. 2nd order: current state is ‘is this’. What word might come next? In the 2nd order chain, ‘is’ will almost certainly not be likely, but it may be very likely in the 1st order. Low-order chains can cause ‘flip flopping’: ‘is this is this is this is this is this is this is this[…]’. Markov chains can be useful in linguistic analysis.
  13. N-grams
    N-grams split a sequence into buckets; ‘n’ refers to bucket length. Creating n-grams for the word ‘quick’ depends on ‘n’:
    • Length 1 (unigram): [ q, u, i, c, k ]
    • Length 2 (bigram): [ qu, ui, ic, ck ]
    • Length 3 (trigram): [ qui, uic, ick ]
    • Length 4 (four-gram): [ quic, uick ]
    • Length 5 (five-gram): [ quick ]
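    A possible n-gram helper (a Python sketch of the bucketing described above):

    ```python
    def ngrams(seq, n):
        """Slide a window of length n across the sequence."""
        return [seq[i:i + n] for i in range(len(seq) - n + 1)]

    word = "quick"
    print(ngrams(word, 1))  # ['q', 'u', 'i', 'c', 'k']
    print(ngrams(word, 2))  # ['qu', 'ui', 'ic', 'ck']
    print(ngrams(word, 3))  # ['qui', 'uic', 'ick']
    print(ngrams(word, 5))  # ['quick']
    ```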
  14. Example process of a chain trained on text
    Given a sentence of text input as training data:
    • Split input into words (tokenise)
    • For every word: state = ‘key’ based on the word and the previous n-1 words (where n is the model order)
    • Store the next word in the sequence against the key
    Training on ‘The quick brown fox jumps over the lazy dog’:
    Key (state) → Value (next)
    the → quick, lazy
    quick → brown
    brown → fox
    fox → jumps
    jumps → over
    over → the
    lazy → dog
    dog → -
  15. What changing order looks like (words)
    1st Order (unigrams)
    • Habit is a manuscript do him untrustworthy is an appropriate receptacle.
    • Not the night of people who have anybody else, and all that a chorus of the end. Life is to do?
    2nd Order (bigrams)
    • Go confidently in the foot; C++ makes it easy to take over and surrender.
    • Artists who seek consolation in existing churches often pay for maturity.
    3rd Order (trigrams)
    • Resistance is never the agent of change. You have to put time into it to build an audience.
    • Victory attained by violence is tantamount to a defeat, for it is not in earning as much as you please.
  16. What changing order looks like (letters)
    1st Order (unigrams)
    • If hares naro frd t tiaristh ale o ceyouthemy sigers gsanjoredmig firche sthak indutequg be.
    2nd Order (bigrams)
    • If ust of meme ted? Why whing a hat but composecoment th alit th ting land move of Chimetive warehing.
    3rd Order (trigrams)
    • Trust founts a poingry took only God, premen was are you with storied itself - shalf the timind it.
  17. What changing order sounds like
    • Original
    • 1st order (next note based on a single previous note)
    • 2nd order (next note based on two previous notes)
  18. Viewing a chain as a transition table
    Given state → Next state (chance):
    A → F 33%, G 67%
    B → A 50%, G 50%
    C → B 50%, G 25%, END 25%
    D → C 100%
    E → C 50%, D 50%
    F → E 50%, F 50%
    G → A 22%, C 11%, D 11%, E 11%, G 45%
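    One way to work with such a table programmatically (a Python sketch; the probabilities are read off the slide's table, and `random.choices` does the weighted sampling):

    ```python
    import random

    # Transition table: state -> (possible next states, their probabilities)
    TABLE = {
        "A": (["F", "G"], [0.33, 0.67]),
        "B": (["A", "G"], [0.50, 0.50]),
        "C": (["B", "G", "END"], [0.50, 0.25, 0.25]),
        "D": (["C"], [1.00]),
        "E": (["C", "D"], [0.50, 0.50]),
        "F": (["E", "F"], [0.50, 0.50]),
        "G": (["A", "C", "D", "E", "G"], [0.22, 0.11, 0.11, 0.11, 0.45]),
    }

    def walk(start):
        """Random walk through the table until the END state is reached."""
        state, path = start, [start]
        while state != "END":
            nexts, probs = TABLE[state]
            state = random.choices(nexts, weights=probs)[0]
            path.append(state)
        return path
    ```

    Every state eventually reaches C (the only state that can transition to END), so the walk terminates with probability 1.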
  19. What is it?
    A C# library that can be used to create Markov chains without much fuss. Can model any type of data as state (generic type). No restriction on chain order length.
    // Make a new empty 3rd order model.
    // States are words represented as strings.
    var model = new StringMarkov(3);
    // Learn an array of text
    model.Learn(Quotes);
    // Generate some new text
    model.Walk(5).Dump();
  20. Why?
    • Most other implementations in C# were not generic
    • Others had fixed-order models
    • Others were not very easy to use
  21. MarkovSharp Generic Models
    MarkovSharp allows developers to use any data type they wish to represent state. The model automatically splits phrases (e.g. sentences) into unigrams (e.g. words). It comes with a number of pre-built models for common use cases, and allows developers to add their own if needed.
  22. Model Types
    StringMarkov: states are collections of strings (words)
    • Creates new sentences where words are based on trained words
    SubstringMarkov: states are collections of individual characters (letters)
    • Creates new words similar to those in the training set language in sentences
    • Words created are not necessarily real words - based on letter probability distributions in the training set:
      Betteem we is got but dignitive endity what by with is world, whethings, how lition
      Este nouce et êtréfique a exemeuvé entturtance magraire depuisans peilien ages de our la tre de bind le phratille
    SanfordMidiMarkov: states are collections of musical MIDI notes
    • Creates new MIDI files where notes and timings are based on trained MIDI music.
  23. Why?
    • Free breakfast on Fridays
    • Wasted time collecting orders
    • People missed breakfast
    • Forgetting to order (disaster!)
    We use Slack, so…
  24. Solving the important problems
    • Slack API
    • SQL Server database
    • C# Windows service, using:
      • Margiebot
      • MarkovSharp
  25. Margiebot
    A C# library to make writing Slack bots easier. Takes care of most low-level ‘plumbing’. Quick to get results.
    var myBot = new Bot();
    myBot.RespondsTo("hi bot").With("hey there");
  26. How MarkovSharp fits in
    Order suggestions:
    • Create a 2nd order Markov chain
    • Use breakfast orders as training data
    The higher the order of the chain, the more sensible the suggestion.
    baconBot.RespondsTo("what should I get?")
        .With(BreakfastModel.GetSuggestion());
  27. Making Predictions
    The current state is known. Possible future states are known, with probabilities. Using the simple sun/rain example:
    • Given Sunny: Prain = 25% and Psun = 75%
    • Given Rainy: Prain = 50% and Psun = 50%
    Often represented as a matrix (rows: given state; columns: chance of next state):
          S     R
    S   0.75  0.25
    R   0.50  0.50
  28. Making Predictions
    Prediction for two days' time given today is sunny?
    • We don't care about the weather tomorrow, only the day after
    There are two outcomes that satisfy sun in 2 days' time given it is sunny today: [S, S] or [R, S]
    • Given [S], PSS = 0.75 x 0.75
    • Given [S], PRS = 0.25 x 0.5
  29. Making Predictions
    AKA the dot product of two matrices (for mathematicians)
    • Given [S], PSS = 0.75 x 0.75
    • Given [S], PRS = 0.25 x 0.5
    Both are viable outcomes, so if today is sunny, the probability it is sunny in 2 days is:
    (0.75 x 0.75) + (0.25 x 0.5) = 0.5625 + 0.125 = 0.6875, or about 69%
    General solution for a state n steps ahead: raise the transition matrix to the nth power.
  30. Accuracy over long times
    Forecasts made for far-future states settle into a steady-state representation of the model.
    • The probability for a state 200 days in the future and 201 days in the future is the same
    • The steady state is no longer dependent on the initial state
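    This convergence can be checked numerically with the same sun/rain transition matrix (a Python sketch: after 200 steps both rows have converged to the same steady-state distribution, so the start state no longer matters):

    ```python
    # Sun/rain transition matrix, rows/cols ordered [Sunny, Rainy]
    T = [[0.75, 0.25],
         [0.50, 0.50]]

    def matmul(a, b):
        """Multiply two 2x2 matrices."""
        return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
                for i in range(2)]

    Tn = T
    for _ in range(199):  # raise T to the 200th power
        Tn = matmul(Tn, T)

    print(Tn)  # both rows approach [2/3, 1/3]
    ```

    The limit [2/3, 1/3] can also be found analytically by solving pi = pi * T for the stationary distribution pi.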
  31. Other uses
    • Suggested destinations (Google Maps)
    • Google PageRank
    • Natural language generation (subreddit simulator / dissociated press)
    • Reverse cypher (code breaking)
    • Algorithmic music composition (Max/MSP / SuperCollider)
    • Predictive text
    • Bioinformatics: DNA sequence simulation or gene drift over time
    • Economics: predictions for market value or share price
    • Basis for speech recognition (hidden Markov models)
  32. Thank You For Listening
    Any questions? If you have a laptop and a C# editor, feel free to try out MarkovSharp: https://github.com/chriscore/MarkovSharp