Current research in automatic poetry generation in Basque using Natural Language Generation

Manex Agirrezabal

October 13, 2014

Transcript

  1. Poetry generation in Basque

    Current research in automatic poetry generation in Basque using Natural Language Generation
    Manex Agirrezabal
    Euskal Herriko Unibertsitatea / University of the Basque Country (UPV/EHU)
    [email protected]
    October 13th, 2014
  2. About myself

    Undergraduate degree in Computer Science (UPV/EHU) (2006-2011)
    M.S. in Natural Language Processing (UPV/EHU) (2011-2012)
    PhD student at UPV/EHU
    Visiting student in the Dept. of Linguistics & Cognitive Sciences
    Research interests: NLP, Computational Linguistics, Computational Creativity, Natural Language Generation, Computational Phonology, Finite-State Technology, Machine Learning, Singing Synthesis
  3. Outline

    1 Our goal
    2 Completed works
    3 Verse-making in the Basque Country
    4 Basque Language
    5 Poetry generation: Architecture
        Content determination
        Document structuring
        Lexicalization
        Aggregation + Referring expression generation
        Surface realization
    6 Verse-singing robot
  4. Our goal

    The development of an environment for the analysis and generation of poetry
  6. Completed works

    Two main features make verses unique: phonetics and meaning.
    Phonetics:
    - We can get the metrical structure for verses in Basque and English
    - We can generate verses that follow a specified meter in Basque
    Meaning:
    - We could use topic modelling to get the possible topic of the verse
    - But how can we generate a verse about a specific topic?
  13. Completed works

    What we have done so far:
    - Basque verse corpora: TXT → TEI P5
    - Semantic searcher for Basque verses (mg4j IR system)
    Analysis:
    - Metrical analysis for verses in Basque (Agirrezabal et al., 2012)
    - Statistical analysis of Basque verses (Agirrezabal et al., 2013a)
    - ZeuScansion: A tool for scansion of English poetry (Agirrezabal et al., 2013b)
    - Three approaches for stress assignment to out-of-vocabulary words in English (Agirrezabal et al., 2014)
    Generation:
    - Surface realization using POS-tag templates from Basque verse corpora (Agirrezabal et al., 2013c)
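As an illustration of the TXT → TEI P5 step, here is a minimal sketch that wraps a plain-text verse in TEI `<lg>` (line group) and `<l>` (verse line) elements. The function name `verse_to_tei`, the `type="stanza"` value and the one-line-per-text-line input layout are assumptions made for this example; the slides do not describe the actual encoding used for the corpora.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"  # TEI P5 namespace


def verse_to_tei(text: str) -> ET.Element:
    """Wrap a plain-text verse (one verse line per text line) in <lg>/<l> elements."""
    ET.register_namespace("", TEI_NS)
    # Hypothetical choice: mark each verse as a stanza-type line group.
    lg = ET.Element(f"{{{TEI_NS}}}lg", {"type": "stanza"})
    for n, line in enumerate(text.strip().splitlines(), start=1):
        l = ET.SubElement(lg, f"{{{TEI_NS}}}l", {"n": str(n)})
        l.text = line.strip()
    return lg


# Placeholder text, not taken from any corpus.
sample = "first line of the verse\nsecond line of the verse"
print(ET.tostring(verse_to_tei(sample), encoding="unicode"))
```

A full TEI P5 document would also need a `<teiHeader>` and a `<text>` wrapper, which this sketch leaves out.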
  18. Verse-making in the Basque Country

    - It is a long-standing tradition
    - Championships, get-togethers and performances are quite common
    - Basque verse-making championship:
      - Held every 4 years
      - Draws around 15,000 people
  23. Verse-making in the Basque Country

    One typical task: given a topic, verse-makers have to sing three verses extempore.
    - They compose the verses in a limited time
    - Each verse has constraints, such as:
      - Fixed number of syllables
      - Fixed number of lines
      - Some lines have to rhyme
    - Then, they sing the verse
    ...but there are more tasks
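The three constraints listed above (number of lines, syllables per line, rhyming lines) can be checked mechanically. The following is a minimal sketch under strong assumptions: syllables are estimated by counting runs of vowels, and two lines are said to rhyme when their last two letters match. Neither heuristic is a real Basque syllabifier or rhyme model, and all function names are hypothetical.

```python
import re

VOWEL_RUN = re.compile(r"[aeiou]+")


def count_syllables(line: str) -> int:
    """Naive estimate: one syllable per run of vowels (assumption, not a real syllabifier)."""
    return len(VOWEL_RUN.findall(line.lower()))


def line_ending(line: str, n: int = 2) -> str:
    """Last n letters of the line, ignoring spaces and punctuation."""
    return re.sub(r"[^a-z]", "", line.lower())[-n:]


def check_verse(lines, syllables, rhyme_groups):
    """Return a list of constraint violations; an empty list means the verse passes."""
    if len(lines) != len(syllables):
        return [f"expected {len(syllables)} lines, got {len(lines)}"]
    problems = []
    for i, (line, expected) in enumerate(zip(lines, syllables)):
        found = count_syllables(line)
        if found != expected:
            problems.append(f"line {i}: expected {expected} syllables, found {found}")
    for group in rhyme_groups:
        first = group[0]
        for j in group[1:]:
            if line_ending(lines[first]) != line_ending(lines[j]):
                problems.append(f"line {j} does not rhyme with line {first}")
    return problems


# Nonsense syllables, used only to exercise the checks.
toy_verse = ["ta ta ta ra", "la la la", "pa pa pa ra", "la la ti"]
print(check_verse(toy_verse, syllables=[4, 3, 4, 3], rhyme_groups=[(0, 2)]))
```

Returning a list of violations rather than a single boolean makes it easier to see which constraint a generated candidate breaks.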
  26. Basque Language

    Basque language, euskara
    - It is spoken across the Basque Country
    - Approximately 700,000 speakers
    - Morphologically rich language (multiple declension cases)
    - There is a standardized form, but also 4-5 dialects
  29. Poetry generation: Architecture

    ...Automatic poetry generation...
    Input: Topic
    Output: Verse about the topic with metrical constraints
    How are we going to do it?
  31. Poetry generation: Architecture

    We are going to follow the architecture for Natural Language Generation proposed in (Reiter and Dale, 2000).
    Six steps for the generation process:
    - Content determination
    - Document structuring
    - Lexicalization
    - Aggregation
    - Referring expression generation
    - Surface realization
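The six steps can be read as a chain of stages, each taking the output of the previous one. The sketch below shows only that chaining; every stage body is a placeholder, and the `Draft` class, the stage functions and the `generate` entry point are hypothetical names, not part of any existing system. Aggregation and referring expression generation are left as pass-through stages.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Draft:
    topic: str
    content: List[str] = field(default_factory=list)     # candidate content items
    plan: List[List[str]] = field(default_factory=list)  # items grouped per verse line
    lines: List[str] = field(default_factory=list)       # realized text


def content_determination(d: Draft) -> Draft:
    # Placeholder: a real system would look up content related to the topic.
    d.content = [d.topic, "word_a", "word_b", "word_c"]
    return d


def document_structuring(d: Draft) -> Draft:
    # Placeholder plan: two content items per verse line.
    d.plan = [d.content[i:i + 2] for i in range(0, len(d.content), 2)]
    return d


def lexicalization(d: Draft) -> Draft:
    # Placeholder: content items are already treated as words.
    return d


def aggregation(d: Draft) -> Draft:
    return d  # pass-through


def referring_expression_generation(d: Draft) -> Draft:
    return d  # pass-through


def surface_realization(d: Draft) -> Draft:
    d.lines = [" ".join(chunk) for chunk in d.plan]
    return d


PIPELINE: List[Callable[[Draft], Draft]] = [
    content_determination,
    document_structuring,
    lexicalization,
    aggregation,
    referring_expression_generation,
    surface_realization,
]


def generate(topic: str) -> List[str]:
    draft = Draft(topic)
    for stage in PIPELINE:
        draft = stage(draft)
    return draft.lines


print(generate("sea"))
```

Keeping the stages in a list makes it straightforward to swap one implementation for another without touching the rest of the chain.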
  33. Poetry generation: Architecture (Content determination)

    Input: One topic
    Output: Set of possible content to be included in the verse
    1 Lexical resources (some dictionaries or WordNet)
    2 Machine learning (content word mappings from topic to verses)
    3 Topic modelling (closest words according to some distance, like the cosine distance)
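For the third option, finding the closest words to a topic comes down to ranking word vectors by cosine similarity (cosine distance is one minus this value, so the most similar words are the closest ones). The sketch below assumes that each word already has a vector; the three-dimensional `toy_vectors` are made up for illustration and do not come from any trained model.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def closest_words(topic, vectors, k=3):
    """Return the k words whose vectors are most similar to the topic's vector."""
    target = vectors[topic]
    scored = [
        (cosine_similarity(target, vec), word)
        for word, vec in vectors.items()
        if word != topic
    ]
    return [word for _, word in sorted(scored, reverse=True)[:k]]


# Invented vectors, for illustration only.
toy_vectors = {
    "sea": [1.0, 0.0, 0.0],
    "wave": [0.9, 0.1, 0.0],
    "boat": [0.7, 0.3, 0.0],
    "mountain": [0.0, 1.0, 0.0],
}
print(closest_words("sea", toy_vectors, k=2))
```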
  35. Poetry generation: Architecture (Document structuring)

    Input: Set of content to be included in the verse
    Output: Content to be included and structure of the verse (plan of the actions or chunks)
    - Use existing AI planning systems (verbs as states)
    - Take vector representations of words and produce similar verb paths
  37. Poetry generation: Architecture (Lexicalization)

    Input: Content and plan of the verse
    Output: Specific words for each action or chunk in the plan
    Although we have not done much work on this step yet, we intend to use the results of the statistical verse analysis that we did last year:
    - The most used rhyme patterns
    - The use of informal language
    - The use of words borrowed from Spanish (not Basque words)
    - The use of standardized language
  38. Poetry generation: Architecture (Aggregation + Referring expression generation)

    In principle, we are not going to focus on these two steps.
  40. Poetry generation: Architecture (Surface realization)

    Input: Structured content and specific words to be included in each chunk
    Output: Actual text for each action
    We are going to insert each chunk's information into existing POS-tag templates.
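One way to picture template-based surface realization: a template is a sequence of POS tags, and a chunk is a set of words with their tags, which are placed into the matching slots. The sketch below is an illustration under assumptions only. The tag names, the English toy words and the function `fill_template` are hypothetical, and real Basque templates would also have to handle word order and case suffixes.

```python
from collections import defaultdict, deque


def fill_template(template, chunk):
    """Place the chunk's (word, tag) pairs into the template's POS slots.

    Returns the realized string, or None if some slot cannot be filled.
    """
    by_tag = defaultdict(deque)
    for word, tag in chunk:
        by_tag[tag].append(word)
    words = []
    for tag in template:
        if not by_tag[tag]:
            return None  # the chunk has no word left for this slot
        words.append(by_tag[tag].popleft())
    return " ".join(words)


def realize(templates, chunk):
    """Try each template in order and return the first one the chunk can fill."""
    for template in templates:
        text = fill_template(template, chunk)
        if text is not None:
            return text
    return None


# Hypothetical tag set and toy words.
templates = [["DET", "ADJ", "NOUN", "ADV", "VERB"], ["DET", "ADJ", "NOUN", "VERB"]]
chunk = [("sea", "NOUN"), ("sings", "VERB"), ("the", "DET"), ("old", "ADJ")]
print(realize(templates, chunk))
```

In this sketch a template is accepted even if some chunk words remain unused; a stricter version could reject templates that leave words over.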
  43. Verse-singing robot

    In 2012 we tried to create a verse-singing robot by combining some simple verse-making algorithms (which included strophe combining or IR) with a modified TTS system (adapted for singing) and a robot.
    This was a collaboration between our research group, which works on NLP, and two other research groups (Robotics lab and Signal processing lab), all of them from the University of the Basque Country.
  45. Thanks to...

    University of Delaware
    - Department of Linguistics & Cognitive Sciences
    - Department of Computer & Information Sciences