Building a Programming Language 101

This is a presentation about building a programming language for GeekCampID 2017

Giovanni Sakti

July 15, 2017

Transcript

  1. Building a Programming Language 101 @giosakti

  2. Hi, My name is Gio @giosakti giosakti

  3. My Company

  4. Communities (1) P.S. We’re looking for speakers for the next meetup on July 25th! Mention me
  5. Communities (2) http://ruby.id

  6. Communities (2) October 6th-7th. CFP open: ruby.id/cfp. Support our cause: ruby.id/perkumpulan
  7. Building a Programming Language 101 @giosakti

  8. None
  9. How many of you know & understand 5 or more of these terms: Turing completeness, lexer, parser, AST, IR, DSL, LLVM, covfefe, compiled vs. interpreted? Raise your hands!
  10. The Truth Is…

  11. So many things to discuss, so little time... Intro, compiled vs. interpreted, lexer, Lex, Yacc, parser, AST, IR, LLVM, GPL vs. DSL. 25 mins …
  12. I’ll give pointers on where and what to start with instead. We can discuss later, or you can come to the id-ruby meetup :D
  13. Why is all of this important? The $1,000,000 question

  14. Immersion

  15. Immersion Deeper understanding + learning experience

  16. Reduce the ‘magical’ feeling A.K.A unexpected behavior

  17. None
  18. Build your own: OK for learning purposes, but don’t reinvent the wheel
  19. Most popular programming languages are very old and/or have considerable backing: 34 years, 21 years, 22 years, 3 years (but backed by Apple)
  20. Contribute to Open Source

  21. DSL: this is why I learned this topic. I’ll explain in detail later
  22. So.. Languages..

  23. “Languages can shape the way we think” (Sapir-Whorf hypothesis)

  24. In the case of programming languages

  25. Gap: what you want to do vs. what the computer will actually do
  26. Natural Language Processing: Hi computer, I want you to say “ABC” → “ABC”
  27. In natural language processing, the computer understands you! Hi computer, I want you to say “ABC” → “ABC”
  28. Structured Language Processing: def start; puts "ABC"; end → “ABC”

  29. Now you’re the one who tries harder to understand the computer: def start; puts "ABC"; end → “ABC”
  30. Building a structured (programming) language

  31. There’s actually still a considerable gap even if you talk to the computer using a structured language
  32. At the very lowest level (the CPU), actual computer hardware only understands very basic instructions
  33. mov eax, ebx ; copy the value in ebx into eax
    mov byte ptr [var], 5 ; store the value 5 into the byte at location var
    push eax ; push eax on the stack
    push [var] ; push the 4 bytes at address var onto the stack
  34. None
  35. What’s even more perplexing, different hardware understands different instruction sets

  36. Structured language to instruction sets: as good engineers, we will now try to break down this complex problem
  37. Steps

  38. Generally, there are two approaches to tackle this problem: compiling or interpreting
  39. Compiling: you yourself use a dictionary to convert a manuscript into another language that you understand better (for example, from English to Bahasa)
  40. def start; puts "ABC"; end → [:trace, 1], [:putnil], [:putstring, "hello world"], [:send, :puts, 1, nil, 8, 0], [:leave]]] → “ABC”
  41. Interpreting: you hire a translator to read and understand the manuscript, then you ask him/her to explain it to you
  42. def start; puts "ABC"; end → “ABC”. Hey bro, say “abc”

  43. But both approaches start the same way

  44. Lexing Lexical analysis

  45. None
  46. Lexing by hand
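As a rough illustration of lexing by hand, here is a minimal sketch in Python. The `tokenize` function and the token kind names (`number`, `operator`, `paren`) are assumptions made for this example, chosen to match the number/operator labels used for `7 + 3 * 5 - 2` in the parsing slides; they do not come from the talk itself.

```python
def tokenize(source):
    """Split an arithmetic expression into (kind, text) tokens by hand."""
    tokens = []
    i = 0
    while i < len(source):
        ch = source[i]
        if ch.isspace():
            i += 1  # skip whitespace between tokens
        elif ch.isdigit():
            start = i
            while i < len(source) and source[i].isdigit():
                i += 1  # consume the whole run of digits
            tokens.append(("number", source[start:i]))
        elif ch in "+-*/":
            tokens.append(("operator", ch))
            i += 1
        elif ch in "()":
            tokens.append(("paren", ch))
            i += 1
        else:
            raise SyntaxError(f"unexpected character {ch!r} at position {i}")
    return tokens


print(tokenize("7 + 3 * 5 - 2"))
```

A tool such as Lex generates this kind of character-scanning loop from a list of patterns instead of having you write it manually.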

  47. Lexing with tools: Lex

  48. Parsing Syntactic analysis

  49. 7 (number), + (operator), 3 (number), * (operator), 5 (number), - (operator), 2 (number)
  50. Parsing without tools https://gist.github.com/ascv/5022712
    """
    exp ::= term | exp + term | exp - term
    term ::= factor | factor * term | factor / term
    factor ::= number | ( exp )
    """
  51. https://gist.github.com/ascv/5022712
    exp ::= term | exp + term | exp - term
    term ::= factor | factor * term | factor / term
    factor ::= number | ( exp )
    7 (number), + (operator), 3 (number), * (operator), 5 (number), - (operator), 2 (number)
  54. https://gist.github.com/ascv/5022712
    exp ::= term | exp + term | exp - term
    term ::= factor | factor * term | factor / term
    factor ::= number | ( exp )
    7 (number), + (operator), 3 (number), * (operator), 5 (number), - (operator), 2 (number)
    7
  56. https://gist.github.com/ascv/5022712
    exp ::= term | exp + term | exp - term
    term ::= factor | factor * term | factor / term
    factor ::= number | ( exp )
    7 (number), + (operator), 3 (number), * (operator), 5 (number), - (operator), 2 (number)
    7 3 +
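The grammar on these slides can be parsed by hand with a recursive-descent parser, one method per rule. The following is a minimal sketch, not the code from the linked gist: the `Parser` class, the `parse` helper, and the nested-tuple AST shape are assumptions made for illustration. The left-recursive `exp` rule is written as a loop, while `term` follows the grammar's right recursion as written, so chained division groups to the right.

```python
import re


def tokenize(source):
    # Numbers, operators, and parentheses; anything else is silently
    # skipped in this sketch.
    return re.findall(r"\d+|[-+*/()]", source)


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def exp(self):
        # exp ::= term | exp + term | exp - term  (left recursion as a loop)
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = (op, node, self.term())
        return node

    def term(self):
        # term ::= factor | factor * term | factor / term
        node = self.factor()
        if self.peek() in ("*", "/"):
            op = self.advance()
            node = (op, node, self.term())
        return node

    def factor(self):
        # factor ::= number | ( exp )
        token = self.advance()
        if token == "(":
            node = self.exp()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token is not None and token.isdigit():
            return int(token)
        raise SyntaxError(f"unexpected token {token!r}")


def parse(source):
    parser = Parser(tokenize(source))
    tree = parser.exp()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected trailing token {parser.peek()!r}")
    return tree


print(parse("7 + 3 * 5 - 2"))
# ('-', ('+', 7, ('*', 3, 5)), 2)
```

The multiplication ends up deepest in the tree, which is how the grammar encodes operator precedence without any special-case code.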
  57. Parsing with tools: Yacc or Bison

  58. None
  59. …
    if_stmt: 'if' test ':' suite ('elif' test ':' suite)* ['else' ':' suite]
    while_stmt: 'while' test ':' suite ['else' ':' suite]
    for_stmt: 'for' exprlist 'in' testlist ':' suite ['else' ':' suite]
    …
  60. Optimization (optional)

  61. JVM Warmup time

  62. None
  63. Lexing Parsing Optimizing Executing Compiling

  64. Generally, you use a “faster” language or a lower-level environment to do everything above
  65. Basically, after the AST is created, you can directly execute the nodes, and the process is finished
  66. Operator (*)

  67. Eval left-hand * eval right-hand

  68. Operator (+)

  69. 7 + 3

  70. And so on…
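Executing AST nodes directly, as these slides describe, can be sketched as a small recursive function: evaluate the left-hand side, evaluate the right-hand side, then apply the operator. The nested-tuple node shape `(operator, left, right)` and the `evaluate` name are assumptions made for this example.

```python
import operator

# Map each operator symbol to the function that implements it.
OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def evaluate(node):
    """Walk the tree: leaves are numbers, inner nodes are (op, left, right)."""
    if isinstance(node, (int, float)):
        return node
    op, left, right = node
    # Evaluate the left-hand side, then the right-hand side, then combine.
    return OPERATORS[op](evaluate(left), evaluate(right))


# Hand-built AST for 7 + 3 * 5 - 2.
tree = ("-", ("+", 7, ("*", 3, 5)), 2)
print(evaluate(tree))  # 20
```

This is the tree-walking style of interpreter: no further translation step happens after parsing.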

  71. Lexing Parsing Optimizing Executing Compiling

  72. public JavapTest();
      Code:
        0: aload_0
        1: invokespecial #1 // Method java/lang/Object."<init>":()V
        4: return
    public static void main(java.lang.String[]);
      Code:
        0: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream;
        3: bipush 20
        5: invokevirtual #3 // Method java/io/PrintStream.println:(I)V
        8: return
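To give a feel for what compiling an AST into stack-based instructions like the bytecode above involves, here is a minimal sketch. The instruction names (`push`, `add`, `sub`, `mul`, `div`) and the tiny `run` loop are hypothetical and made up for this example; they are not JVM bytecode.

```python
# Hypothetical instruction names for each operator symbol.
OPCODES = {"+": "add", "-": "sub", "*": "mul", "/": "div"}


def compile_node(node, code=None):
    """Emit stack-machine instructions for a nested-tuple AST (post-order)."""
    if code is None:
        code = []
    if isinstance(node, int):
        code.append(("push", node))
    else:
        op, left, right = node
        compile_node(left, code)   # operands first...
        compile_node(right, code)
        code.append((OPCODES[op],))  # ...then the operation that uses them
    return code


def run(code):
    """Execute the instructions on a simple operand stack."""
    stack = []
    for instruction in code:
        name = instruction[0]
        if name == "push":
            stack.append(instruction[1])
            continue
        right = stack.pop()
        left = stack.pop()
        if name == "add":
            stack.append(left + right)
        elif name == "sub":
            stack.append(left - right)
        elif name == "mul":
            stack.append(left * right)
        elif name == "div":
            stack.append(left / right)
        else:
            raise ValueError(f"unknown instruction {name!r}")
    return stack.pop()


program = compile_node(("-", ("+", 7, ("*", 3, 5)), 2))
print(program)
print(run(program))  # 20
```

The compile step runs once and produces a flat instruction list; the execution step no longer needs the tree at all.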
  73. DSL

  74. DSL Domain Specific Language

  75. DSL vs. GPL: GPL is General-Purpose Language

  76. None
  77. SELECT * FROM users WHERE status = 'ACTIVE'

  78. Turing completeness: a programming language is called Turing complete when it can simulate a Turing machine
  79. The ability to read and write "variables" (or arbitrary data)
    The ability to simulate moving the read/write head
    The ability to simulate a finite state machine
    A "halt" state
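The four ingredients listed above map directly onto a tiny Turing machine simulator. This is a minimal sketch: the `run_turing_machine` function, the rule-table format, and the bit-flipping example machine are all assumptions made for illustration.

```python
def run_turing_machine(tape, rules, state="start", blank="_", max_steps=1000):
    """Simulate a one-tape Turing machine and return the final tape contents.

    rules maps (state, symbol) -> (symbol_to_write, move, next_state),
    where move is "L" or "R". Stops early after max_steps if the machine
    never reaches the "halt" state.
    """
    cells = dict(enumerate(tape))  # readable and writable storage
    head = 0                       # position of the read/write head
    for _ in range(max_steps):
        if state == "halt":        # the "halt" state
            break
        symbol = cells.get(head, blank)
        write, move, state = rules[(state, symbol)]  # finite state machine
        cells[head] = write
        head += 1 if move == "R" else -1             # move the head
    return "".join(cells[i] for i in sorted(cells)).strip(blank)


# Example machine: flip every bit, then halt at the first blank cell.
FLIP_BITS = {
    ("start", "0"): ("1", "R", "start"),
    ("start", "1"): ("0", "R", "start"),
    ("start", "_"): ("_", "R", "halt"),
}

print(run_turing_machine("1011", FLIP_BITS))  # 0100
```

A language that can express a loop like this one, with unbounded storage, has everything on the list.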
  80. A DSL is most likely not Turing complete

  81. Xtext

  82. Summary

  83. If you think this is your cup of tea…

  84. If you want to clearly understand the concepts first, build from scratch: http://kanaka.github.io/lambdaconf/#/ https://github.com/kanaka/mal
  85. If you can create an interpreter for Lisp using language X, then you will understand (almost) all of language X's functionality
  86. None
  87. None
  88. If you want a taste of creating a programming language using modern tools, learn lex + yacc/bison or ANTLR
  89. If you think that a DSL suits your use case, learn to create a DSL: learn Xtext
  90. Thanks! Enjoy your lunch.. Let’s discuss @giosakti