
Dynamic Language Embedding With Homogeneous Tool Support

Domain-specific languages (DSLs) are increasingly used as embedded languages within general-purpose host languages. DSLs provide a compact, dedicated syntax for specifying parts of an application related to specialized domains. Unfortunately, such language extensions typically do not integrate well with existing development tools. Editors, compilers and debuggers are either unaware of the extensions, or must be adapted at a non-trivial cost. Furthermore, these embedded languages typically conflict with the grammar of the host language and make it difficult to write hybrid code; few mechanisms exist to control the scope and usage of multiple tightly interconnected embedded languages.

In this dissertation we present Helvetia, a novel approach to embed languages into an existing host language by leveraging the underlying representation of the host language used by these tools. We introduce Language Boxes, an approach that offers a simple, modular mechanism to encapsulate (i) compositional changes to the host language, (ii) transformations to address various concerns such as compilation and syntax highlighting, and (iii) scoping rules to control visibility of fine-grained language changes. We describe the design and implementation of Helvetia and Language Boxes, discuss the required infrastructure of a host language enabling language embedding, and validate our approach by case studies that demonstrate different ways to extend or adapt the host language syntax and semantics.

Lukas Renggli

October 04, 2011

Transcript

  1. General Purpose Host Language. Embedded query: SELECT email FROM users WHERE username = 'lr'. Diagram labels: Syntax SQL, Semantics SQL, ?; Syntax Host, Semantics Host, Tools Host.
  6. Thesis: To support seamless integration of context-dependent languages without breaking the tools, we need (1) a host-language grammar that can be changed by language extensions, (2) a first-class language description used by the development environment, and (3) a transformation mechanism of the embedded language into a common executable representation.
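
The third requirement, transforming embedded code into a common executable representation, can be illustrated with a minimal sketch. It is written in Python rather than the Smalltalk host used by Helvetia, and it is not Helvetia's implementation. It borrows the "Roman Numbers" case study named later in the deck; the function names `roman_to_host_ast` and `evaluate` are hypothetical. The embedded literal is turned into an ordinary host AST node, so the unmodified host compiler handles the hybrid expression.

```python
import ast

ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_host_ast(source: str) -> ast.expr:
    """Parse a Roman numeral and return a host-language (Python) AST constant."""
    total = 0
    for index, symbol in enumerate(source):
        value = ROMAN_VALUES[symbol]
        following = ROMAN_VALUES[source[index + 1]] if index + 1 < len(source) else 0
        # A smaller value in front of a larger one is subtracted (IV = 4).
        total += -value if value < following else value
    return ast.Constant(value=total)


def evaluate(node: ast.expr) -> object:
    """Compile a host AST with the ordinary host compiler and run it."""
    tree = ast.Expression(body=node)
    ast.fix_missing_locations(tree)
    return eval(compile(tree, "<embedded>", "eval"))


# Hybrid expression: the embedded literal XIV plus the host literal 1.
hybrid = ast.BinOp(
    left=roman_to_host_ast("XIV"),
    op=ast.Add(),
    right=ast.Constant(value=1),
)
result = evaluate(hybrid)
```

Because the embedded literal ends up as a plain host AST node, any tool that works on the host AST could in principle process it without knowing about Roman numerals.
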
  7. Syntax / Vocabulary / Semantics: ◦◦◦; ◦◦•; ◦•◦; ◦••; •◦◦; •◦•; ••◦; •••
  8. Syntax / Vocabulary / Semantics: Host Language ◦◦◦; ◦◦•; ◦•◦; ◦••; •◦◦; •◦•; ••◦; •••
  9. Syntax / Vocabulary / Semantics: Host Language ◦◦◦; ◦◦•; Internal Language ◦•◦; ◦••; •◦◦; •◦•; ••◦; •••
  10. Syntax / Vocabulary / Semantics: Host Language ◦◦◦; ◦◦•; Internal Language ◦•◦; Pidgin ◦••; •◦◦; •◦•; ••◦; •••
  11. Syntax / Vocabulary / Semantics: Host Language ◦◦◦; ◦◦•; Internal Language ◦•◦; Pidgin ◦••; •◦◦; •◦•; ••◦; Creole •••
  12. Syntax / Vocabulary / Semantics: Host Language ◦◦◦; Argot ◦◦•; Internal Language ◦•◦; Pidgin ◦••; •◦◦; •◦•; ••◦; Creole •••
  13. Syntax / Vocabulary / Semantics: Host Language ◦◦◦; Argot ◦◦•; Internal Language ◦•◦; Pidgin ◦••; (none) •◦◦; (none) •◦•; (none) ••◦; Creole •••
  14. Syntax / Vocabulary / Semantics: Pidgin ◦••; Creole •••; Argot ◦◦•
  15. Source Code → Syntactic Analysis → Semantic Analysis → Code Generation → Executable Code
  16. <parse>, <transform>, <attribute>. Source Code → Syntactic Analysis → Semantic Analysis → Code Generation → Executable Code
  17. Rules: <parse>, <transform>, <attribute>. Source Code → Syntactic Analysis → Semantic Analysis → Code Generation → Executable Code
  18. Rules: <parse>, <transform>, <attribute>, <highlight>, <error>, <refactor>. Source Code → Syntactic Analysis → Semantic Analysis → Code Generation → Executable Code
  19. Rules: <parse>, <transform>, <attribute>, <highlight>, <error>, <refactor>. Source Code → Syntactic Analysis → Semantic Analysis → Code Generation → Executable Code. Pidgin, Creole, Argot
  20. [Figure: grid layout. Labels: Package Name; (2, 1), (2, 2), (1, 2); x = 1, y = 1, x = 2, y = 2.]
  21. [Figure: grid layout. Labels: Package Name; (2, 1), (2, 2), (1, 2); x = 1, y = 1, x = 2, y = 2.] Code:

    aBuilder row grow.
    aBuilder row fill.
    aBuilder column grow.
    aBuilder column fill.
    aBuilder x: 1 y: 1 add: (LabelShape new
        text: [ :each | each name ];
        borderColor: #black;
        borderWidth: 1;
        yourself).
    aBuilder x: 1 y: 2 w: 2 h: 1 add: (RectangleShape new
        borderColor: #black;
        borderWidth: 1;
        width: 200;
        height: 100;
        yourself)
  22. [Figure: grid layout. Labels: Package Name; (2, 1), (2, 2), (1, 2); x = 1, y = 1, x = 2, y = 2.] Code:

    row = grow.
    row = fill.
    column = grow.
    column = fill.
    (1 , 1) = label
        text: [ :each | each name ];
        borderColor: #black;
        borderWidth: 1.
    (1 , 2) - (2 , 1) = rectangle
        borderColor: #black;
        borderWidth: 1;
        width: 200;
        height: 100.
  23. Rules: <parse>, <transform>, <attribute>, <highlight>, <error>, <refactor>. Source Code → Syntactic Analysis → Semantic Analysis → Code Generation → Executable Code
  24. [Figure: grid layout. Labels: Package Name; (2, 1), (2, 2), (1, 2); x = 1, y = 1, x = 2, y = 2.] Code:

    shape {
        cols: #grow, #fill;
        rows: #grow, #fill;
    }
    label {
        position: 1 , 1;
        text: [ :each | each name ];
        borderColor: #black;
        borderWidth: 1;
    }
    rectangle {
        position: 1 , 2;
        colspan: 2;
        borderColor: #black;
        borderWidth: 1;
        width: 200;
        height: 100;
    }
  25. Rules: <parse>, <transform>, <attribute>, <highlight>, <error>, <refactor>. Source Code → Syntactic Analysis → Semantic Analysis → Code Generation → Executable Code
  26. Rules: <parse>, <transform>, <attribute>, <highlight>, <error>, <refactor>. Source Code → Syntactic Analysis → Semantic Analysis → Code Generation → Executable Code. Conventional Language
  27. Rules: <parse>, <transform>, <attribute>, <highlight>, <error>, <refactor>. Source Code → Syntactic Analysis → Semantic Analysis → Code Generation → Executable Code. Conventional Language; Context Specific
  28. Rules: <parse>, <transform>, <attribute>, <highlight>, <error>, <refactor>. Source Code → Syntactic Analysis → Semantic Analysis → Code Generation → Executable Code. Conventional Language; Context Specific; Homogeneous Code & Data
  29. Rules: <parse>, <transform>, <attribute>, <highlight>, <error>, <refactor>. Source Code → Syntactic Analysis → Semantic Analysis → Code Generation → Executable Code. Conventional Language; Context Specific; Homogeneous Code & Data; Homogeneous Tool Support
  30. Rules: <parse>, <transform>, <attribute>, <highlight>, <error>, <refactor>. Source Code → Syntactic Analysis → Semantic Analysis → Code Generation → Executable Code. Pidgin, Creole, Argot. Conventional Language; Context Specific; Homogeneous Code & Data; Homogeneous Tool Support. Also shown: <parse>; Source Code; Pidgin, Creole, Argot
  31. | r | r := SELECT * FROM users. ^ User fromRow: r.
  32. Language Concern: Context Menus, Navigation, Search, Code Expansion, Code Completion, Error Correction, Custom Inspector, Refactorings, Code Folding, Highlighting
  34. scanIdentifier

    self step.
    ((currentCharacter between: $A and: $Z) or: [ currentCharacter between: $a and: $z ]) ifTrue: [
        [ self recordMatch: #IDENTIFIER.
          self step.
          (currentCharacter between: $0 and: $9) or: [
              (currentCharacter between: $A and: $Z) or: [
                  currentCharacter between: $a and: $z ] ] ] whileTrue.
        ^ self reportLastMatch ]
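
For readers less familiar with Smalltalk, the hand-written scanning logic on this slide can be sketched in Python. This is a loose transliteration, not the original scanner: the name `scan_identifier` and the string-plus-index interface are assumptions made for illustration. The character ranges are hard-coded in the control flow, so allowing a new character means editing the scanner itself.

```python
from typing import Optional


def is_letter(char: str) -> bool:
    return "A" <= char <= "Z" or "a" <= char <= "z"


def is_digit(char: str) -> bool:
    return "0" <= char <= "9"


def scan_identifier(text: str, start: int = 0) -> Optional[str]:
    """Return the longest identifier beginning at `start`, or None."""
    position = start
    # An identifier must begin with a letter.
    if position >= len(text) or not is_letter(text[position]):
        return None
    position += 1
    # Keep consuming letters and digits, remembering the last match.
    while position < len(text) and (is_letter(text[position]) or is_digit(text[position])):
        position += 1
    return text[start:position]


token = scan_identifier("abc123 := 4")
```
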
  35. #( #[1 0 9 0 25 0 13 0 34

    0 17 0 40 0 21 0 41] #[1 0 9 0 25 0 13 0 34 0 93 0 76 0 157 0 112] #[1 2 38 0 21 2 38 0 25 2 38 0 26 0 13 0 34] #[0 1 154 0 16 0 21 0 25 0 26 0 34 0 40 0 41] #[0 1 210 0 76 0 81] #[0 1 214 0 76 0 81] #[1 0 173 0 76 0 177 0 81] #[0 1 134 0 16 0 21 0 25 0 26 0 34 0 40 0 41] #[1 1 46 0 21 1 46 0 25 1 46 0 26 1 69] #[1 1 54 0 21 1 54 0 25 1 54 0 26 1 54 0 34] #[0 2 102 0 21 0 25 0 26 0 34 0 40 0 41 0 76] #[0 2 50 0 21 0 25 0 26 0 76 0 79] #[1 1 13 0 76 2 85 0 124 1 21 0 125] #[1 2 89 0 17 2 30 0 21 2 30 0 82] #[1 2 93 0 21 2 97 0 82] )
  36. [Figure: parser graph with the nodes sequence, many, choice, letter and digit; a second choice node combines "_" and letter.] Grammar change: letter → letter | "_"
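
The grammar change on this slide (letter becomes letter | "_") can be sketched with a few parser combinators. The sketch is in Python, not the Smalltalk used by Helvetia, and every class name here (`Parser`, `Char`, `Sequence`, `Choice`, `Many`, `Rule`) is hypothetical. The point it tries to show is that when a grammar is a graph of objects, one production can be redefined at run time and every parser that refers to it picks up the change.

```python
from typing import Callable, Optional


class Parser:
    """Base class: parse() returns the position after a match, or None."""

    def parse(self, text: str, pos: int) -> Optional[int]:
        raise NotImplementedError

    def matches(self, text: str) -> bool:
        return self.parse(text, 0) == len(text)


class Char(Parser):
    """Consumes one character that satisfies a predicate."""

    def __init__(self, predicate: Callable[[str], bool]) -> None:
        self.predicate = predicate

    def parse(self, text, pos):
        if pos < len(text) and self.predicate(text[pos]):
            return pos + 1
        return None


class Sequence(Parser):
    def __init__(self, *parts: Parser) -> None:
        self.parts = parts

    def parse(self, text, pos):
        for part in self.parts:
            pos = part.parse(text, pos)
            if pos is None:
                return None
        return pos


class Choice(Parser):
    """Ordered choice: the first alternative that matches wins."""

    def __init__(self, *alternatives: Parser) -> None:
        self.alternatives = alternatives

    def parse(self, text, pos):
        for alternative in self.alternatives:
            result = alternative.parse(text, pos)
            if result is not None:
                return result
        return None


class Many(Parser):
    """Zero or more repetitions of the inner parser."""

    def __init__(self, inner: Parser) -> None:
        self.inner = inner

    def parse(self, text, pos):
        while True:
            following = self.inner.parse(text, pos)
            if following is None or following == pos:
                return pos
            pos = following


class Rule(Parser):
    """Named production whose definition can be swapped at run time."""

    def __init__(self, definition: Parser) -> None:
        self.definition = definition

    def parse(self, text, pos):
        return self.definition.parse(text, pos)


letter = Rule(Char(lambda c: c.isascii() and c.isalpha()))
digit = Rule(Char(lambda c: c.isascii() and c.isdigit()))
identifier = Sequence(letter, Many(Choice(letter, digit)))

before = identifier.matches("foo_bar")  # "_" is not yet a letter

# Language extension: letter -> letter | "_"
letter.definition = Choice(letter.definition, Char(lambda c: c == "_"))

after = identifier.matches("foo_bar")
```

The `identifier` parser is never rebuilt; only the `letter` rule is redefined, which is the property a hard-coded scanner or a generated parse table does not offer.
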
  37. | r | r := SELECT * FROM users. ^ User fromRow: r.
  38. Assignments and Swapping, Asynchronous Messages, Automaton, Brainfuck Language, Functional Pattern Matching, Grammar Definition, Message Pipes, Mondrian, Object Relationships, Positional Arguments, Program Checking, Quasiquoting, Regular Expression, Roman Numbers, SPath Expression, SQL, Schematic Tables, String Interpolation, Transactional Memory, Tuple Space [http://scg.unibe.ch/research/helvetia/examples]
  39. Assignments and Swapping, Asynchronous Messages, Automaton, Brainfuck Language, Functional Pattern Matching, Grammar Definition, Message Pipes, ✓ Mondrian, Object Relationships, Positional Arguments, Program Checking, Quasiquoting, Regular Expression, Roman Numbers, SPath Expression, ✓ SQL, Schematic Tables, String Interpolation, Transactional Memory, Tuple Space [http://scg.unibe.ch/research/helvetia/examples]
  40. Assignments and Swapping, Asynchronous Messages, Automaton, Brainfuck Language, Functional Pattern Matching, Grammar Definition, Message Pipes, ✓ Mondrian, Object Relationships, Positional Arguments, ‣ Program Checking, ‣ Quasiquoting, Regular Expression, Roman Numbers, SPath Expression, ✓ SQL, Schematic Tables, String Interpolation, ‣ Transactional Memory, Tuple Space [http://scg.unibe.ch/research/helvetia/examples]
  41. Transactional Memory [class diagram]. Change: object; apply, hasChanged, hasConflict. ObjectChange: previousCopy, workingCopy. CustomChange: applyBlock, conflictTestBlock. Transaction: escapeContext; do: aBlock, retry: aBlock, checkpoint, abort: anObject; changes (*). Process: currentTransaction (0..1).
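
The class diagram on this slide can be read as a copy-based transaction scheme. The following Python sketch is loosely modeled on the names in the diagram (`ObjectChange`, `Transaction`, `checkpoint`, previous and working copies). The method semantics, the `read`/`write` interface and `ConflictError` are assumptions made for illustration, not the behavior of the Smalltalk implementation.

```python
class ConflictError(Exception):
    """Raised when an object was modified outside the transaction."""


class ObjectChange:
    """Tracks one object: a snapshot taken on first access and a working copy."""

    def __init__(self, obj: object) -> None:
        self.object = obj
        self.previous_copy = dict(vars(obj))  # state when first touched
        self.working_copy = dict(vars(obj))   # state seen inside the transaction

    def has_changed(self) -> bool:
        return self.working_copy != self.previous_copy

    def has_conflict(self) -> bool:
        # Conflict: the real object no longer matches the snapshot.
        return vars(self.object) != self.previous_copy

    def apply(self) -> None:
        vars(self.object).update(self.working_copy)


class Transaction:
    def __init__(self) -> None:
        self.changes = {}  # id(obj) -> ObjectChange

    def _change_for(self, obj: object) -> ObjectChange:
        key = id(obj)
        if key not in self.changes:
            self.changes[key] = ObjectChange(obj)
        return self.changes[key]

    def read(self, obj: object, name: str):
        return self._change_for(obj).working_copy[name]

    def write(self, obj: object, name: str, value) -> None:
        self._change_for(obj).working_copy[name] = value

    def checkpoint(self) -> None:
        """Assumed semantics: validate all changes, then apply them."""
        changed = [c for c in self.changes.values() if c.has_changed()]
        if any(c.has_conflict() for c in changed):
            raise ConflictError("object modified outside the transaction")
        for change in changed:
            change.apply()
        self.changes.clear()


class Account:
    def __init__(self, balance: int) -> None:
        self.balance = balance


account = Account(100)
transaction = Transaction()
transaction.write(account, "balance", transaction.read(account, "balance") - 30)
visible_before = account.balance  # writes stay private until checkpoint
transaction.checkpoint()
visible_after = account.balance
```

The sketch takes shallow copies of instance attributes only, which keeps it short but would not isolate changes to nested mutable values.
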
  42. Language Boxes; Host Language; Dynamic Grammars; Language and Tool Extensions. Publications: Renggli et al. CLSS 2009; Renggli et al. IWST 2009; Nierstrasz et al. LNCS 2009; Renggli et al. TOOLS 2010.
  43. Language Boxes; Host Language; Dynamic Grammars; Language and Tool Extensions. To support seamless integration of context-dependent languages without breaking the tools, we need (1) a host-language grammar that can be changed by language extensions, (2) a first-class language description used by the development environment, and (3) a transformation mechanism of the embedded language into a common executable representation.