
Patterns, Automata and Regular Expressions

Date: March 2, 2016
Course: UiS DAT911 - Foundations of Computer Science (fall 2016)

Please cite, link to or credit this presentation when using it or part of it in your work.

#ComputerScience #CS #RegularExpressions #RegEx #Automata

Darío Garigliotti

Transcript

  1. Patterns and their definition
    • Importance of patterns across many areas of computer science
    • Two fundamental problems: definition and recognition
    • We will see two different but equivalent ways of defining a pattern (for us, a set of strings)
      • Finite automata
      • Regular expressions
  2. Programs, states, transitions
    • Let's think about a program that searches for patterns
    • Example: finding words that contain aeiou as a subsequence
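
As a minimal sketch of such a program, the function below keeps a single integer as its state: the number of letters of aeiou matched so far. The function name and the choice to lowercase the input are assumptions made for illustration, not part of the slides.

```python
def contains_aeiou_in_order(word: str) -> bool:
    """Return True if a, e, i, o, u appear in `word` in this order (gaps allowed)."""
    vowels = "aeiou"
    state = 0  # how many letters of "aeiou" have been matched so far
    for ch in word.lower():
        if state < len(vowels) and ch == vowels[state]:
            state += 1  # move on to the next state
    return state == len(vowels)  # only the last state means "pattern found"


print(contains_aeiou_in_order("facetious"))  # True
print(contains_aeiou_in_order("sequoia"))    # False: the only 'a' comes last
```

Each loop iteration reads one input character and either stays in the same state or moves forward, which is the behaviour a finite automaton captures.
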
  3. Finite automata
    • Let's give a finite automaton that represents it
    • Some automata terminology:
      • States
      • Transitions
      • Labels
      • Accepting states and start state
      • Input strings
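
The terminology can be made concrete with a table-driven sketch of an automaton for the aeiou example. The six states 0..5, the lowercase alphabet, and the names `DELTA` and `run_dfa` are assumptions for illustration; the figure on the slide may draw the automaton differently.

```python
import string

ALPHABET = string.ascii_lowercase  # assumed input alphabet
VOWELS = "aeiou"
START = 0          # start state
ACCEPTING = {5}    # accepting states

# Transitions: DELTA[(state, symbol)] is the next state.
# State k means "the first k letters of aeiou have been seen in order".
DELTA = {}
for state in range(6):
    for symbol in ALPHABET:
        advance = state < 5 and symbol == VOWELS[state]
        DELTA[(state, symbol)] = state + 1 if advance else state


def run_dfa(word: str) -> bool:
    """Follow one transition per input symbol; accept if we end in ACCEPTING."""
    state = START
    for symbol in word:
        state = DELTA[(state, symbol)]  # KeyError for symbols outside ALPHABET
    return state in ACCEPTING


print(run_dfa("facetious"))  # True
print(run_dfa("vowel"))      # False
```

Here the whole automaton is data: changing the pattern means changing the table, not the loop.
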
  4. Deterministic vs nondeterministic automata
    • The example was a deterministic automaton: from each state, each symbol labels at most one transition
    • Nondeterministic automata allow several transitions from the same state with the same symbol
    • An input string is accepted if at least one path labeled with it leads to an accepting state
    • We can convert one kind into the other, preserving the accepted language (i.e. the automata are equivalent)
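
One way to check the "at least one path" condition is to track the set of all states reachable so far. The sketch below uses a hypothetical nondeterministic automaton that accepts lowercase words ending in "man"; the automaton and all names are assumptions for illustration.

```python
import string

START = 0
ACCEPTING = {3}
# TRANSITIONS[(state, symbol)] is the SET of possible next states.
TRANSITIONS = {(0, letter): {0} for letter in string.ascii_lowercase}
TRANSITIONS[(0, "m")] = {0, 1}  # nondeterministic: stay in 0 or guess "man" starts here
TRANSITIONS[(1, "a")] = {2}
TRANSITIONS[(2, "n")] = {3}


def nfa_accepts(word: str) -> bool:
    """Accept if at least one path labeled `word` ends in an accepting state."""
    current = {START}  # every state some path could be in
    for symbol in word:
        current = {t for s in current
                   for t in TRANSITIONS.get((s, symbol), set())}
    return bool(current & ACCEPTING)


print(nfa_accepts("human"))    # True
print(nfa_accepts("command"))  # False
```

Tracking a set of states instead of a single state is also the idea behind converting a nondeterministic automaton into a deterministic one.
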
  6. Deterministic vs nondeterministic automata
    • Any deterministic automaton is also nondeterministic
    • We can convert a nondeterministic automaton N into a deterministic automaton D using the subset construction
      • The states of D are subsets of the states of N
      • The start state of D is {s0}, where s0 is the start state of N
      • For each state S in D, we consider each input symbol x in turn
      • There is a transition S->T in D whose label includes x, where T is the set of states t in N such that, for some s in S, there is a transition s->t in N whose label includes x
      • A state in D is accepting if it contains at least one accepting state of N
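
The steps above can be sketched as follows. The function names, the dictionary encoding of N, and the small example automaton (strings over {m, a, n, x} ending in "man") are assumptions for illustration. Only subsets reachable from {s0} are built, and the empty subset is left out, so a missing transition means rejection.

```python
from collections import deque


def subset_construction(n_transitions, s0, n_accepting, alphabet):
    """Build D from N. n_transitions maps (state, symbol) to a set of states."""
    start = frozenset({s0})
    d_transitions = {}
    d_states = {start}
    queue = deque([start])
    while queue:
        S = queue.popleft()
        for x in alphabet:  # consider each input symbol in turn
            T = frozenset(t for s in S for t in n_transitions.get((s, x), ()))
            if not T:
                continue  # no transition on x from any s in S
            d_transitions[(S, x)] = T
            if T not in d_states:
                d_states.add(T)
                queue.append(T)
    # A state of D is accepting if it contains an accepting state of N.
    d_accepting = {S for S in d_states if S & n_accepting}
    return start, d_states, d_transitions, d_accepting


def dfa_accepts(start, d_transitions, d_accepting, word):
    state = start
    for x in word:
        state = d_transitions.get((state, x))
        if state is None:
            return False
    return state in d_accepting


# Hypothetical N: accepts strings over {m, a, n, x} that end in "man".
N = {(0, "m"): {0, 1}, (0, "a"): {0}, (0, "n"): {0}, (0, "x"): {0},
     (1, "a"): {2}, (2, "n"): {3}}
start, states, delta, accepting = subset_construction(N, 0, {3}, "manx")
print(len(states))                                  # 4
print(dfa_accepts(start, delta, accepting, "xman")) # True
```
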
  7. Deterministic vs nondeterministic automata
    • Often the resulting deterministic automaton has many more states than the nondeterministic one
    • We can see that the automata are equivalent
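  8. Regular expressions
    • Regular expressions are an algebraic way of defining patterns
    • Operands and values
      1. A character x, $L(x) = \{x\}$
      2. The symbol $\epsilon$, $L(\epsilon) = \{\epsilon\}$
      3. The symbol $\emptyset$, $L(\emptyset) = \emptyset$
      4. A variable, whose value is a regular expression
    • Operators
      • Union R | S
      • Concatenation RS
      • (Kleene) closure R*
    • Precedence: the reverse of the order above (closure first, then concatenation, then union)
    • Equivalence of regular expressions: $R \equiv S$ if $L(R) = L(S)$
    • Algebraic laws
      1. $(\emptyset \mid R) \equiv (R \mid \emptyset) \equiv R$
      2. $\epsilon R \equiv R\epsilon \equiv R$
      3. $\emptyset R \equiv R\emptyset \equiv \emptyset$
      4. $(R \mid S) \equiv (S \mid R)$
      5. $((R \mid S) \mid T) \equiv (R \mid (S \mid T))$
      6. $(R(ST)) \equiv ((RS)T)$
      7. $(R(S \mid T)) \equiv (RS \mid RT)$
      8. $((S \mid T)R) \equiv (SR \mid TR)$
      9. $(R \mid R) \equiv R$
      10. $\emptyset^{*} \equiv \epsilon$
      11. $RR^{*} \equiv R^{*}R$
      12. $(RR^{*} \mid \epsilon) \equiv R^{*}$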
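
A quick way to gain confidence in a law is to compare the two languages on all short strings. The sketch below does this with Python's `re` module, where `(?:...)` is plain grouping and an empty alternative plays the role of epsilon; the helper names, the alphabet {a, b} and the length bound are assumptions. Agreement up to a bounded length is evidence, not a proof of equivalence.

```python
import re
from itertools import product


def language(pattern, alphabet="ab", max_len=4):
    """All strings over `alphabet` of length <= max_len matched in full by `pattern`."""
    compiled = re.compile(pattern)
    words = ("".join(letters)
             for n in range(max_len + 1)
             for letters in product(alphabet, repeat=n))
    return {w for w in words if compiled.fullmatch(w)}


def agree(p, q):
    """True if p and q accept the same strings up to the length bound."""
    return language(p) == language(q)


# Law 7 with R = a, S = b, T = ab:  R(S | T) vs RS | RT
print(agree("a(?:b|ab)", "ab|aab"))             # True
# Law 11 with R = ab:  RR* vs R*R
print(agree("(?:ab)(?:ab)*", "(?:ab)*(?:ab)"))  # True
# Law 12 with R = a:  (RR* | epsilon) vs R*
print(agree("(?:aa*|)", "a*"))                  # True
# Not a law: a*b* and (ab)* differ, e.g. on "aab"
print(agree("a*b*", "(?:ab)*"))                 # False
```
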
  9. From regular expressions to automata
    • We introduce automata with epsilon-transitions
      1. From a regular expression to an automaton with epsilon-transitions, accepting the same language
      2. From an automaton with epsilon-transitions to an automaton without epsilon-transitions, accepting the same language
  10. From regular expressions to automata
    1. From a regular expression to an automaton with epsilon-transitions, accepting the same language
    • Algorithm derived from a complete induction on the number of operator occurrences in the regular expression
    • Basis: n=0, i.e. R is an atomic operand
  11. From regular expressions to automata
    1. From a regular expression to an automaton with epsilon-transitions, accepting the same language
    • Algorithm derived from a complete induction on the number of operator occurrences in the regular expression
    • Induction: Case R = R1 | R2
  12. From regular expressions to automata
    1. From a regular expression to an automaton with epsilon-transitions, accepting the same language
    • Algorithm derived from a complete induction on the number of operator occurrences in the regular expression
    • Induction: Case R = R1 R2
  13. From regular expressions to automata
    1. From a regular expression to an automaton with epsilon-transitions, accepting the same language
    • Algorithm derived from a complete induction on the number of operator occurrences in the regular expression
    • Induction: Case R = R1*
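
The basis and the three inductive cases can be sketched as a recursive builder. This is one possible construction under stated assumptions: regular expressions are nested tuples, every sub-automaton gets its own fresh start and accepting state, and `EPS = None` labels an epsilon-transition. The diagrams on the slides may wire the pieces with fewer states.

```python
from itertools import count

EPS = None  # label used for epsilon-transitions


class Builder:
    def __init__(self):
        self.ids = count()
        self.edges = []  # (source, label, target)

    def build(self, r):
        """Return (start, accept) of an automaton for the tuple-encoded regex r."""
        start, accept = next(self.ids), next(self.ids)
        kind = r[0]
        if kind == "char":                      # basis: a character x
            self.edges.append((start, r[1], accept))
        elif kind == "eps":                     # basis: epsilon
            self.edges.append((start, EPS, accept))
        elif kind == "empty":                   # basis: empty set, no path at all
            pass
        elif kind == "union":                   # R1 | R2
            for sub in r[1:]:
                s, a = self.build(sub)
                self.edges += [(start, EPS, s), (a, EPS, accept)]
        elif kind == "concat":                  # R1 R2
            s1, a1 = self.build(r[1])
            s2, a2 = self.build(r[2])
            self.edges += [(start, EPS, s1), (a1, EPS, s2), (a2, EPS, accept)]
        elif kind == "star":                    # R1*
            s, a = self.build(r[1])
            self.edges += [(start, EPS, s), (a, EPS, accept),
                           (start, EPS, accept), (a, EPS, s)]
        else:
            raise ValueError(f"unknown operator: {kind}")
        return start, accept


def accepts(edges, start, accept, word):
    def closure(states):
        seen, stack = set(states), list(states)
        while stack:
            k = stack.pop()
            for src, label, dst in edges:
                if src == k and label is EPS and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen

    current = closure({start})
    for x in word:
        step = {dst for src, label, dst in edges if src in current and label == x}
        current = closure(step)
    return accept in current


# a | bc*
regex = ("union", ("char", "a"), ("concat", ("char", "b"), ("star", ("char", "c"))))
b = Builder()
s, a = b.build(regex)
print(accepts(b.edges, s, a, "bcc"))  # True
print(accepts(b.edges, s, a, "ab"))   # False
```
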
  14. From regular expressions to automata
    2. From an automaton with epsilon-transitions to an automaton without epsilon-transitions, accepting the same language
    • Reachability of states by paths of epsilon-transitions, in particular for accepting states
    • The new automaton bundles into one transition a path of epsilon-transitions followed by a transition to an important state, both taken from the old automaton
  19. From regular expressions to automata
    2. From an automaton A with epsilon-transitions to an automaton B without epsilon-transitions, accepting the same language
    • Define the set of important states S as the set of states of B
    • Add to B a transition from state i to state j (both in S) with label x if, in A, for some state k:
      • k is reachable from i by a path of epsilon-transitions
      • there is a transition from k to j with label x
    • Make a state i in S accepting if, in A, an accepting state is reachable from i by epsilon-transitions
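
The rules above can be sketched as follows. Two assumptions go beyond the slide text and are marked in the comments: a state counts as important if it is the start state or the target of a non-epsilon transition, and an epsilon-path may have length zero (so k = i is allowed). The edge-list encoding and all names are also illustrative.

```python
EPS = None  # label used for epsilon-transitions


def eliminate_epsilon(edges, start, accepting):
    """edges: list of (source, label, target) in A. Returns (S, edges of B, accepting of B)."""

    def eps_reachable(i):
        # Assumption: paths of length zero count, so i itself is included.
        seen, stack = {i}, [i]
        while stack:
            k = stack.pop()
            for src, label, dst in edges:
                if src == k and label is EPS and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen

    # Assumption: important = start state, or entered by a non-epsilon transition.
    important = {start} | {dst for _, label, dst in edges if label is not EPS}
    b_edges, b_accepting = set(), set()
    for i in important:
        reach = eps_reachable(i)
        if reach & accepting:
            b_accepting.add(i)
        for k, label, j in edges:
            if k in reach and label is not EPS:
                b_edges.add((i, label, j))  # bundle the epsilon-path and the x-step
    return important, b_edges, b_accepting


# Hypothetical A for a*b, with an extra epsilon-step into the accepting state 5.
A = [(0, EPS, 1), (1, "a", 2), (2, EPS, 1), (1, EPS, 3), (3, "b", 4), (4, EPS, 5)]
S, B, B_accepting = eliminate_epsilon(A, start=0, accepting={5})
print(sorted(S))            # [0, 2, 4]
print(sorted(B))            # [(0, 'a', 2), (0, 'b', 4), (2, 'a', 2), (2, 'b', 4)]
print(sorted(B_accepting))  # [4]
```
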
  21. From automata to regular expressions
    • We need to remove states from an automaton
    • Use union on each arc, and concatenation along a path
    • State-Elimination Construction
      • Add possibly $U$, $U = \emptyset$
      • Add possibly $R_{ij}$, $R_{ij} = \emptyset$
      • Replace $R_{ij} := R_{ij} \mid S_i U^{*} T_j$
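
A single elimination step can be sketched on arcs labeled with regular-expression strings. The reading of the symbols is an assumption, since the slide's figure is not in the transcript: U is taken to be the loop on the state being removed, S_i the label of the arc from predecessor i into it, T_j the label of the arc from it to successor j, and R_ij the existing arc from i to j. `None` stands for the empty set, and labels are assumed to be single characters or already parenthesized.

```python
def union(r, s):
    """R | S, with None standing for the empty set."""
    if r is None:
        return s
    if s is None:
        return r
    return f"({r}|{s})"


def concat(*parts):
    """Concatenation; anything concatenated with the empty set is the empty set."""
    if any(p is None for p in parts):
        return None
    return "".join(parts)


def star(r):
    """R*; the closure of the empty set is epsilon, written as the empty string."""
    return "" if r is None else f"({r})*"


def eliminate_state(arcs, u):
    """arcs maps (source, target) to a regex string. Returns the arcs without state u."""
    loop = arcs.get((u, u))  # U, or None if there is no loop
    preds = [i for (i, j) in arcs if j == u and i != u]
    succs = [j for (i, j) in arcs if i == u and j != u]
    result = {(i, j): r for (i, j), r in arcs.items() if u not in (i, j)}
    for i in preds:
        for j in succs:
            detour = concat(arcs[(i, u)], star(loop), arcs[(u, j)])  # S_i U* T_j
            result[(i, j)] = union(result.get((i, j)), detour)       # R_ij | S_i U* T_j
    return result


# Hypothetical automaton: s -a-> u, u -b-> u, u -c-> t, s -d-> t
arcs = {("s", "u"): "a", ("u", "u"): "b", ("u", "t"): "c", ("s", "t"): "d"}
print(eliminate_state(arcs, "u"))  # {('s', 't'): '(d|a(b)*c)'}
```
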
  22. From automata to regular expressions
    • Algorithm: complete reduction of the automaton
    • For each accepting state $t \in A$, with starting state $s$
      • Remove states from (a copy of the original) automaton until only $s$ and $t$ remain. Take the expression $R_{s,t}$
        • $s \neq t \implies R_{s,t} = S^{*}U(T \mid VS^{*}U)^{*}$
        • $s = t \implies R_{s,t} = S^{*}$
    • Take the union of the expressions for each accepting state
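
The two-state formula can be checked on a small case. In this sketch S is assumed to be the loop on s, U the arc from s to t, T the loop on t, and V the arc from t back to s; the concrete labels a, b, c, d and the helper names are hypothetical. The expression is written in Python `re` syntax and compared with a direct simulation on all short strings, which gives evidence rather than a proof.

```python
import re
from itertools import product


def two_state_expression(S, U, T, V):
    """R_{s,t} = S*U(T | V S*U)* for s != t, in Python `re` syntax."""
    return f"(?:{S})*(?:{U})(?:(?:{T})|(?:{V})(?:{S})*(?:{U}))*"


# Hypothetical reduced automaton: start s, accepting t.
DELTA = {("s", "a"): "s",   # S: loop on s
         ("s", "b"): "t",   # U: s to t
         ("t", "c"): "t",   # T: loop on t
         ("t", "d"): "s"}   # V: t back to s


def automaton_accepts(word):
    state = "s"
    for x in word:
        state = DELTA.get((state, x))
        if state is None:
            return False
    return state == "t"


def agree_up_to(max_len=5):
    """Compare the expression with the automaton on every string up to max_len."""
    pattern = re.compile(two_state_expression("a", "b", "c", "d"))
    for n in range(max_len + 1):
        for letters in product("abcd", repeat=n):
            word = "".join(letters)
            if bool(pattern.fullmatch(word)) != automaton_accepts(word):
                return False
    return True


print(agree_up_to())  # True
```
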