Natural Language Framework

d_date
June 11, 2018

This slide only contains public information.

Publication
2018/06/11 🦍
2018/06/14 STT


Transcript

  1. Natural Language

    Intelligence / Linguistics / Machine Learning
    Language Identification, Tokenization, Part of Speech, Lemmatization, Named Entity Recognition
    Word, Sentence, Paragraph
    Natural Language Input
    NaturalLanguage.framework
    Source: WWDC18 session 713, Introducing Natural Language
  2. NLTokenizer: Tokenize text into words

    Input: チャウダー食べたい！ ("I want to eat chowder!")
    Tokens: チャウダー / 食べ / たい

        let tokenizer = NLTokenizer(unit: .word)
        tokenizer.string = text
        tokenizer.tokens(for: text.startIndex..<text.endIndex)
            .forEach { (range) in print(text[range]) }
  3. NLTokenizer: Tokenize text into words

        let tokenizer = NLTokenizer(unit: .word)
        tokenizer.string = text
        tokenizer.tokens(for: text.startIndex..<text.endIndex)
            .forEach { (range) in print(text[range]) }
  4. NLTokenizer: Tokenize text into words

        let tokenizer = NLTokenizer(unit: .word)
        tokenizer.string = text
        tokenizer.tokens(for: text.startIndex..<text.endIndex)
            .forEach { (range) in print(text[range]) }

    - Initialize NLTokenizer with an NLTokenUnit
    - Set the string on the tokenizer
    - Get the ranges of the tokens
    - Subscript the text with each range
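
The four steps annotated on slide 4 can be wrapped in one small helper. This is a minimal sketch rather than code from the deck: the function name `tokens(in:unit:)` and the sample string are assumptions made for illustration, and it only uses the `NLTokenizer` calls already shown on the slides.

```swift
import NaturalLanguage

/// Hypothetical helper (not from the deck): splits `text` into tokens of the given unit.
func tokens(in text: String, unit: NLTokenUnit) -> [String] {
    // 1. Initialize NLTokenizer with an NLTokenUnit.
    let tokenizer = NLTokenizer(unit: unit)
    // 2. Set the string to tokenize.
    tokenizer.string = text
    // 3. Get the range of every token in the whole string.
    let ranges = tokenizer.tokens(for: text.startIndex..<text.endIndex)
    // 4. Subscript the original text with each range.
    return ranges.map { String(text[$0]) }
}

// Illustrative sample text (an assumption, not taken from the slides).
let sample = "Chowder time. I could not go WWDC in this year."
let units: [(name: String, unit: NLTokenUnit)] = [
    ("word", .word), ("sentence", .sentence),
    ("paragraph", .paragraph), ("document", .document),
]
for entry in units {
    print(entry.name, tokens(in: sample, unit: entry.unit))
}
```

Returning `[String]` instead of printing inside the closure keeps the helper reusable for every unit shown on the following slides.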
  5. NLTokenizer: Tokenize text into sentences

        let tokenizer = NLTokenizer(unit: .sentence)
        tokenizer.string = text
        tokenizer.tokens(for: text.startIndex..<text.endIndex)
            .forEach { (range) in print(text[range]) }
  6. NLTokenizer: Tokenize text into sentences

    Input: チャウダー食べたい！
    Output: チャウダー食べたい ！

        let tokenizer = NLTokenizer(unit: .sentence)
        tokenizer.string = text
        tokenizer.tokens(for: text.startIndex..<text.endIndex)
            .forEach { (range) in print(text[range]) }
  7. NLTokenizer: Tokenize text into sentences

    Input (the opening of Kenji Miyazawa's ポラーノの広場):

    あのイーハトーヴォのすきとおった風、夏でも底に冷たさをもつ青いそら、うつくしい森で飾られたモリーオ市、郊外のぎらぎらひかる草の波。
    　またそのなかでいっしょになったたくさんのひとたち、ファゼーロとロザーロ、羊飼のミーロや、顔の赤いこどもたち、地主のテーモ、山猫博士のボーガント・デストゥパーゴなど、いまこの暗い巨きな石の建物のなかで考えていると、みんなむかし風のなつかしい青い幻燈のように思われます。では、わたくしはいつかの小さなみだしをつけながら、しずかにあの年のイーハトーヴォの五月から十月までを書きつけましょう。

        let tokenizer = NLTokenizer(unit: .sentence)
        tokenizer.string = text
        tokenizer.tokens(for: text.startIndex..<text.endIndex)
            .forEach { (range) in print(text[range]) }

    Output:

    あのイーハトーヴォのすきとおった風、夏でも底に冷たさをもつ青いそら、うつくしい森で飾られたモリーオ市、郊外のぎらぎらひかる草の波。↵
    　またそのなかでいっしょになったたくさんのひとたち、ファゼーロとロザーロ、羊飼のミーロや、顔の赤いこどもたち、地主のテーモ、山猫博士のボーガント・デストゥパーゴなど、いまこの暗い巨きな石の建物のなかで考えていると、みんなむかし風のなつかしい青い幻燈のように思われます。
    では、わたくしはいつかの小さなみだしをつけながら、しずかにあの年のイーハトーヴォの五月から十月までを書きつけましょう。
  9. NLTokenizer: Tokenize text into paragraphs

    Input:

    あのイーハトーヴォのすきとおった風、夏でも底に冷たさをもつ青いそら、うつくしい森で飾られたモリーオ市、郊外のぎらぎらひかる草の波。
    　またそのなかでいっしょになったたくさんのひとたち、ファゼーロとロザーロ、羊飼のミーロや、顔の赤いこどもたち、地主のテーモ、山猫博士のボーガント・デストゥパーゴなど、いまこの暗い巨きな石の建物のなかで考えていると、みんなむかし風のなつかしい青い幻燈のように思われます。では、わたくしはいつかの小さなみだしをつけながら、しずかにあの年のイーハトーヴォの五月から十月までを書きつけましょう。

        let tokenizer = NLTokenizer(unit: .paragraph)
        tokenizer.string = text
        tokenizer.tokens(for: text.startIndex..<text.endIndex)
            .forEach { (range) in print(text[range]) }

    Output:

    あのイーハトーヴォのすきとおった風、夏でも底に冷たさをもつ青いそら、うつくしい森で飾られたモリーオ市、郊外のぎらぎらひかる草の波。↵
    　またそのなかでいっしょになったたくさんのひとたち、ファゼーロとロザーロ、羊飼のミーロや、顔の赤いこどもたち、地主のテーモ、山猫博士のボーガント・デストゥパーゴなど、いまこの暗い巨きな石の建物のなかで考えていると、みんなむかし風のなつかしい青い幻燈のように思われます。では、わたくしはいつかの小さなみだしをつけながら、しずかにあの年のイーハトーヴォの五月から十月までを書きつけましょう。
  10. NLTokenizer: Tokenize text into a document

    Input:

    あのイーハトーヴォのすきとおった風、夏でも底に冷たさをもつ青いそら、うつくしい森で飾られたモリーオ市、郊外のぎらぎらひかる草の波。
    　またそのなかでいっしょになったたくさんのひとたち、ファゼーロとロザーロ、羊飼のミーロや、顔の赤いこどもたち、地主のテーモ、山猫博士のボーガント・デストゥパーゴなど、いまこの暗い巨きな石の建物のなかで考えていると、みんなむかし風のなつかしい青い幻燈のように思われます。では、わたくしはいつかの小さなみだしをつけながら、しずかにあの年のイーハトーヴォの五月から十月までを書きつけましょう。

        let tokenizer = NLTokenizer(unit: .document)
        tokenizer.string = text
        tokenizer.tokens(for: text.startIndex..<text.endIndex)
            .forEach { (range) in print(text[range]) }

    Output:

    あのイーハトーヴォのすきとおった風、夏でも底に冷たさをもつ青いそら、うつくしい森で飾られたモリーオ市、郊外のぎらぎらひかる草の波。↵
    　またそのなかでいっしょになったたくさんのひとたち、ファゼーロとロザーロ、羊飼のミーロや、顔の赤いこどもたち、地主のテーモ、山猫博士のボーガント・デストゥパーゴなど、いまこの暗い巨きな石の建物のなかで考えていると、みんなむかし風のなつかしい青い幻燈のように思われます。では、わたくしはいつかの小さなみだしをつけながら、しずかにあの年のイーハトーヴォの五月から十月までを書きつけましょう。
  11. NLLanguageRecognizer: Specify language hints with factor

        let recognizer = NLLanguageRecognizer()
        recognizer.processString("Lorem ipsum dolor sit amet")
        recognizer.languageHints = [.english: 0.3, .portuguese: 0.1]
        print(recognizer.dominantLanguage!) // en

    - Specify language hints: [NLLanguage : Double]
    - Check dominant language: NLLanguage.english
  12. NLLanguageRecognizer: Specify language hints with factor

        let recognizer = NLLanguageRecognizer()
        recognizer.processString("Lorem ipsum dolor sit amet")
        recognizer.languageHints = [.french: 0.3, .english: 0.00001]
        print(recognizer.dominantLanguage!) // en

    - Specify language hints: [NLLanguage : Double]
    - Check dominant language: NLLanguage.french
  13. NLLanguageRecognizer: Specify language hints with factor

        let recognizer = NLLanguageRecognizer()
        recognizer.processString("Lorem ipsum dolor sit amet")
        recognizer.languageHints = [.french: 0.3, .english: 0.1]
        print(recognizer.dominantLanguage!) // en

    - Specify language probabilities: [NLLanguage : Double]
    - Check dominant language: NLLanguage.english
  14. NLLanguageRecognizer: Specify language hints with factor

        let recognizer = NLLanguageRecognizer()
        recognizer.processString("チャウダー食べたい！")
        recognizer.languageHints = [.french: 0.3, .english: 0.1]
        print(recognizer.dominantLanguage!) // en

    - Specify language probabilities: [NLLanguage : Double]
    - Check dominant language: NLLanguage.japanese
  15. NLLanguageRecognizer: Language hypotheses

        let recognizer = NLLanguageRecognizer()
        recognizer.processString("Lorem ipsum dolor sit amet")
        let hypotheses = recognizer.languageHypotheses(withMaximum: 2)
        print(hypotheses)
        // [pt: 0.32105109095573425, en: 0.4647941291332245]

    - Hypotheses with a maximum language count
    - The hypotheses say this is probably English
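
The recognizer calls from slides 11 to 15 can be combined into one function that returns both the dominant language and the ranked hypotheses. This is a hedged sketch, not the author's code: the helper name `detectLanguage(of:hints:maxHypotheses:)` and the sample sentence are assumptions. The sketch also sets `languageHints` before calling `processString`, whereas the slides set them afterwards. Whether that ordering changes the result is not something the deck states.

```swift
import NaturalLanguage

/// Hypothetical helper (not from the deck): returns the dominant language and
/// up to `maxHypotheses` candidate languages with their probabilities.
func detectLanguage(of text: String,
                    hints: [NLLanguage: Double] = [:],
                    maxHypotheses: Int = 2)
    -> (dominant: NLLanguage?, hypotheses: [NLLanguage: Double]) {
    let recognizer = NLLanguageRecognizer()
    // Assumption: hints are applied before processing here.
    recognizer.languageHints = hints
    recognizer.processString(text)
    return (recognizer.dominantLanguage,
            recognizer.languageHypotheses(withMaximum: maxHypotheses))
}

// Illustrative sample sentence (an assumption, not taken from the slides).
let result = detectLanguage(of: "I could not go to WWDC this year.",
                            hints: [.english: 0.3, .french: 0.1])

// dominantLanguage is optional, so unwrap it instead of force-unwrapping.
if let language = result.dominant {
    print("dominant:", language.rawValue)
} else {
    print("dominant: undetermined")
}

// Print the hypotheses from most to least probable.
for (language, probability) in result.hypotheses.sorted(by: { $0.value > $1.value }) {
    print(language.rawValue, probability)
}
```

Unwrapping `dominantLanguage` avoids the crash that `dominantLanguage!` would cause on input too short or ambiguous to classify.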
  16. NLTagger: Analyzes natural language text

    Input: チャウダー食べたい！
    Tokens: チャウダー / 食べ / たい

        let text = "チャウダー食べたい！"
        let tagger = NLTagger(tagSchemes: [.lexicalClass])
        tagger.string = text
        tagger.tags(in: text.startIndex..<text.endIndex, unit: .word, scheme: .lexicalClass)
            .forEach { (tag, range) in
                if let tag = tag {
                    print(tag.rawValue, text[range])
                }
            }
  17. NLTagger: Analyzes natural language text

    Input: チャウダー食べたい！

        let text = "チャウダー食べたい！"
        let tagger = NLTagger(tagSchemes: [.lexicalClass])
        tagger.string = text
        tagger.tags(in: text.startIndex..<text.endIndex, unit: .word, scheme: .lexicalClass)
            .forEach { (tag, range) in
                if let tag = tag {
                    print(tag.rawValue, text[range])
                }
            }

    Tags: チャウダー = OtherWord, 食べ = OtherWord, たい = OtherWord
    The standard model cannot tag Japanese text.
  18. NLTagger: Analyzes natural language text

    Input: Chowder time

        let text = "Chowder time"
        let tagger = NLTagger(tagSchemes: [.lexicalClass])
        tagger.string = text
        tagger.tags(in: text.startIndex..<text.endIndex, unit: .word, scheme: .lexicalClass)
            .forEach { (tag, range) in
                if let tag = tag {
                    print(tag.rawValue, text[range])
                }
            }

    Tags: Chowder = Noun, " " = Whitespace, time = Noun
  19. NLTagger: Analyzes natural language text

        let text = "Chowder time"
        let tagger = NLTagger(tagSchemes: [.lexicalClass])
        tagger.string = text
        tagger.tags(in: text.startIndex..<text.endIndex, unit: .word, scheme: .lexicalClass)
            .forEach { (tag, range) in
                if let tag = tag {
                    print(tag.rawValue, text[range])
                }
            }

    Tags: Chowder = Noun, " " = Whitespace, time = Noun
  20. NLTagger: Analyzes natural language text

        let text = "Chowder time"
        let tagger = NLTagger(tagSchemes: [.lexicalClass])
        tagger.string = text
        tagger.tags(in: text.startIndex..<text.endIndex, unit: .word, scheme: .lexicalClass)
            .forEach { (tag, range) in
                if let tag = tag {
                    print(tag.rawValue, text[range])
                }
            }

    - Pass the tag schemes you need to tagSchemes
    - Set the text on tagger.string
    - Get the tags for a range, unit and scheme

    Tags: Chowder = Noun, " " = Whitespace, time = Noun
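
The three steps on slide 20 can be turned into a function that returns token and tag pairs for any scheme. This is a sketch under stated assumptions: the helper name `tagWords(in:scheme:options:)`, the `"(no tag)"` placeholder and the default options are illustrative choices, not part of the deck. The helper requests the same scheme it configured the tagger with.

```swift
import NaturalLanguage

/// Hypothetical helper (not from the deck): returns (token, tag) pairs for `text`.
func tagWords(in text: String,
              scheme: NLTagScheme,
              options: NLTagger.Options = [.omitWhitespace, .omitPunctuation])
    -> [(token: String, tag: String)] {
    // Pass only the tag schemes that are needed.
    let tagger = NLTagger(tagSchemes: [scheme])
    // Set the text on tagger.string.
    tagger.string = text

    var pairs: [(token: String, tag: String)] = []
    // Request tags for the same scheme the tagger was created with.
    tagger.enumerateTags(in: text.startIndex..<text.endIndex,
                         unit: .word,
                         scheme: scheme,
                         options: options) { tag, range in
        // The tag is optional; the placeholder string is an illustrative choice.
        pairs.append((token: String(text[range]), tag: tag?.rawValue ?? "(no tag)"))
        return true // keep enumerating
    }
    return pairs
}

let sentence = "I could not go WWDC in this year"
for scheme in [NLTagScheme.lexicalClass, .lemma, .nameType] {
    print("----", scheme.rawValue, "----")
    for pair in tagWords(in: sentence, scheme: scheme) {
        print(pair.token, pair.tag)
    }
}
```

Taking the scheme as a single parameter removes the chance of creating the tagger with one scheme and querying another.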
  21. NLTagger: Analyzes natural language text - Lexical class

        let text = "I could not go WWDC in this year"
        let tagger = NLTagger(tagSchemes: [.lexicalClass])
        tagger.string = text
        tagger.tags(in: text.startIndex..<text.endIndex, unit: .word, scheme: .lexicalClass)
            .forEach { (tag, range) in
                if let tag = tag {
                    print(tag.rawValue, text[range])
                }
            }

    Tags: I = Pronoun, could = Verb, not = Adverb, go = Verb, WWDC = Noun, in = Preposition, this = Determiner, year = Noun
  22. NLTagger: Analyzes natural language text - Lemma

        let text = "I could not go WWDC in this year"
        let tagger = NLTagger(tagSchemes: [.lemma])
        tagger.string = text
        // Request the same scheme the tagger was created with.
        tagger.tags(in: text.startIndex..<text.endIndex, unit: .word, scheme: .lemma)
            .forEach { (tag, range) in
                if let tag = tag {
                    print(tag.rawValue, text[range])
                }
            }

    Lemmas: I = I, could = can, not = not, go = go, WWDC = WWDC, in = in, this = this, year = year
  23. NLTagger: Analyzes natural language text - Name type

        let text = "I could not go WWDC in this year"
        let tagger = NLTagger(tagSchemes: [.nameType])
        tagger.string = text
        // Request the same scheme the tagger was created with.
        tagger.tags(in: text.startIndex..<text.endIndex, unit: .word, scheme: .nameType)
            .forEach { (tag, range) in
                if let tag = tag {
                    print(tag.rawValue, text[range])
                }
            }

    Tags: I = OtherWord, could = OtherWord, not = OtherWord, go = OtherWord, WWDC = OrganizationName, in = OtherWord, this = OtherWord, year = OtherWord
  24. NLTagger: Analyzes natural language text - Name type or lexical class

        let text = "I could not go WWDC in this year"
        let tagger = NLTagger(tagSchemes: [.nameTypeOrLexicalClass])
        tagger.string = text
        // Request the same scheme the tagger was created with.
        tagger.tags(in: text.startIndex..<text.endIndex, unit: .word, scheme: .nameTypeOrLexicalClass)
            .forEach { (tag, range) in
                if let tag = tag {
                    print(tag.rawValue, text[range])
                }
            }

    Tags: I = Pronoun, could = Verb, not = Adverb, go = Verb, WWDC = OrganizationName, in = Preposition, this = Determiner, year = Noun
  25. NSLinguisticTagger: Analyzes natural language text - Name type or lexical class

        let text = "I could not go WWDC in this year"
        let tagger = NSLinguisticTagger(tagSchemes: [.nameTypeOrLexicalClass], options: 0)
        tagger.string = text
        // NSLinguisticTagger works with NSRange, so convert in both directions.
        let fullRange = NSRange(text.startIndex..<text.endIndex, in: text)
        tagger.enumerateTags(in: fullRange, unit: .word, scheme: .nameTypeOrLexicalClass, options: []) { (tag, tokenRange, _) in
            if let tag = tag, let range = Range(tokenRange, in: text) {
                print(tag.rawValue, text[range])
            }
        }

    Tags: I = Pronoun, could = Verb, not = Adverb, go = Verb, WWDC = OrganizationName, in = Preposition, this = Determiner, year = Noun
  26. NSLinguisticTagger: Analyzes natural language text - Name type or lexical class

        let text = "I could not go WWDC in this year"
        let tagger = NSLinguisticTagger(tagSchemes: [.nameTypeOrLexicalClass], options: 0)
        tagger.string = text
        // NSLinguisticTagger works with NSRange, so convert in both directions.
        let fullRange = NSRange(text.startIndex..<text.endIndex, in: text)
        tagger.enumerateTags(in: fullRange, unit: .word, scheme: .nameTypeOrLexicalClass, options: []) { (tag, tokenRange, _) in
            if let tag = tag, let range = Range(tokenRange, in: text) {
                print(tag.rawValue, text[range])
            }
        }

    Tags: I = Pronoun, could = Verb, not = Adverb, go = Verb, WWDC = OrganizationName, in = Preposition, this = Determiner, year = Noun

    Same?
  27. Compare NLTagger and NSLinguisticTagger

        let text = "I could not go WWDC in this year."

        func testNLTagIsEqualToNSLinguisticTag() {
            let tSchemes: [NLTagScheme] = [.language, .lemma, .lexicalClass, .nameTypeOrLexicalClass, .nameType, .tokenType, .script]
            let lSchemes: [NSLinguisticTagScheme] = [.language, .lemma, .lexicalClass, .nameTypeOrLexicalClass, .nameType, .tokenType, .script]
            let tTagger = NLTagger(tagSchemes: tSchemes)
            let lTagger = NSLinguisticTagger(tagSchemes: lSchemes, options: 0)
            zip(tSchemes, lSchemes).forEach { (tScheme, lScheme) in
                print("---- \(tScheme.rawValue) ----")
                XCTAssertEqual(tScheme.rawValue, lScheme.rawValue)
                // tags(text:unit:scheme:options:) is not a framework API;
                // it is presumably a helper extension defined by the author.
                let tags = tTagger.tags(text: text, unit: .word, scheme: tScheme, options: [.omitPunctuation, .omitWhitespace])
                let lTags = lTagger.tags(text: text, unit: .word, scheme: lScheme, options: [.omitPunctuation, .omitWhitespace])
                zip(tags, lTags).forEach({ (tTag, lTag) in
                    print(tTag.0.rawValue, lTag.0.rawValue, tTag.0.rawValue == lTag.0.rawValue)
                    XCTAssertEqual(tTag.0.rawValue, lTag.0.rawValue)
                    XCTAssertEqual(text[tTag.1], text[Range(lTag.1, in: text)!])
                })
            }
        }
  29. Contractions recognition

    Actual: "I / ca / n't / go / WWDC / in / this / year."
    Ideal: "I / can't / go / WWDC / in / this / year."
  30. Contractions recognition with .joinContractions

    Actual: "I / can't / go / WWDC / in / this / year."
    Ideal: "I / can't / go / WWDC / in / this / year."
  31. NLTagger with joining contractions

        let text = "I could not go WWDC in this year"
        let tagger = NLTagger(tagSchemes: [.nameTypeOrLexicalClass])
        tagger.string = text
        // Request the same scheme the tagger was created with.
        tagger.tags(in: text.startIndex..<text.endIndex, unit: .word, scheme: .nameTypeOrLexicalClass)
            .forEach { (tag, range) in
                if let tag = tag {
                    print(tag.rawValue, text[range])
                }
            }

    Tags: I = Pronoun, could = Verb, not = Adverb, go = Verb, WWDC = OrganizationName, in = Preposition, this = Determiner, year = Noun

    Only available in the Natural Language framework.
  32. NLTagger with joining contractions

        let text = "I couldn't go WWDC in this year"
        let tagger = NLTagger(tagSchemes: [.nameTypeOrLexicalClass])
        tagger.string = text
        // Request the same scheme the tagger was created with.
        tagger.tags(in: text.startIndex..<text.endIndex, unit: .word, scheme: .nameTypeOrLexicalClass, options: [.joinContractions])
            .forEach { (tag, range) in
                if let tag = tag {
                    print(tag.rawValue, text[range])
                }
            }

    Tags: I = Pronoun, couldn't = Verb, go = Verb, WWDC = OrganizationName, in = Preposition, this = Determiner, year = Noun

    Only available in the Natural Language framework.
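
To see the effect of `.joinContractions` directly, the same sentence can be tokenized twice, once without and once with the option. This is a minimal sketch, not code from the deck: the helper name `wordTokens(in:joinContractions:)` and the sample sentence are assumptions. How a given contraction is split depends on the system's language model, so no specific output is claimed here.

```swift
import NaturalLanguage

/// Hypothetical helper (not from the deck): word tokens as NLTagger reports them,
/// optionally joining contractions into a single token.
func wordTokens(in text: String, joinContractions: Bool) -> [String] {
    let scheme = NLTagScheme.nameTypeOrLexicalClass
    let tagger = NLTagger(tagSchemes: [scheme])
    tagger.string = text

    var options: NLTagger.Options = [.omitWhitespace, .omitPunctuation]
    if joinContractions {
        options.insert(.joinContractions)
    }

    // tags(in:unit:scheme:options:) returns (tag, range) pairs; only the ranges are used.
    return tagger
        .tags(in: text.startIndex..<text.endIndex, unit: .word, scheme: scheme, options: options)
        .map { String(text[$0.1]) }
}

// Illustrative sample sentence (an assumption, modeled on slides 29 and 30).
let contracted = "I can't go WWDC in this year."
print(wordTokens(in: contracted, joinContractions: false).joined(separator: " / "))
print(wordTokens(in: contracted, joinContractions: true).joined(separator: " / "))
```

Comparing the two printed lines shows whether the contraction is reported as one token or as separate pieces on the current system.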
  33. NLModel

    A custom model trained to classify or tag natural language text.

    Diagram labels: Create ML, User Data, NLModel, NLTagger
  34. NLModel

    A custom model trained to classify or tag natural language text.

    WIP: waiting for a sample from Apple.
    See: Introducing Create ML, WWDC18
  35. Summary

    Natural Language
    • Tokenization: NLTokenizer can separate text into tokens.
    • Language recognition: NLLanguageRecognizer can detect the language of a text.
    • Tagging: NLTagger can tag tokens with their lexical class, name type and so on.
    • NLTagger and NSLinguisticTagger behave the same, but only NLTagger can join contractions into one token.
    • Custom model: NLModel can be used for tagging by setting it on a tagger. It can also be used on its own.
  36. Introducing Natural Language WWDC18 Natural Language Processing and your apps

    WWDC17 Introducing Core ML WWDC17 Core ML in depth WWDC17 Introducing Create ML WWDC18 What’s new in Core ML, Part 1 WWDC18 iOS 11 Programming, ୈ3ষ PEAKS Resources