Type Checking Ruby Programs with Annotations

My presentation at RubyKaigi 2017.

Soutaro Matsumoto

September 19, 2017

Transcript

  4. Soutaro Matsumoto
    • GitHub, Twitter @soutaro
    • I have implemented four type checkers for Ruby
    • 2005, Type inference, structural subtyping
    • 2007, Type inference, polymorphic record types
    • 2009, Control flow analysis
    • 2017, Local type inference, structural subtyping [New]
  5. Benefits
    • Find bugs
    • Verifiable documentation
    • Auto completion
    • Easier refactoring
    • For advanced program analysis
  6. Type Checking for Ruby
    • People have been trying this for at least 12 years
    • Static Type Inference for Ruby [Furr, 2009]
    • Type Inference for Ruby Programs Based on Polymorphic Record Types [Matsumoto, 2007]
    • They tried to infer the types of Ruby programs, because Ruby is an untyped language
  7. Static Type Inference for Ruby
    • Implementation is available as Diamondback Ruby
    • Based on structural subtyping
    • This means it cannot infer polymorphic types
  8. Type Inference for Ruby Programs Based on Polymorphic Record Types
    • RubyKaigi 2008
    • Based on ML type inference and polymorphic record types
    • Infers polymorphic types
    • Cannot give types to some Ruby builtins
        • Polymorphic recursion (Array cannot be polymorphic)
        • Non-regular types (Array#map)
  9. Type Checking for Ruby
    Comparison of [Furr, 2009] and [Matsumoto, 2007], in that order:
    • Type System: Structural Subtyping / Polymorphic Record Types
    • Type Inference: Constraint based / ML Type Inference
    • Correctness: Maybe (not proved)
    • Limitations: Cannot infer polymorphic types / Cannot type some builtins
  11. The Conclusion
    • We cannot construct type inference for Ruby programs
    • If we choose subtyping, no polymorphic types are inferred
    • If we choose polymorphic type inference, some builtins cannot be typed
  12. Requirements
    • Correctness: if the type checker says OK, no type error occurs during execution
    • Static: without execution
    • No annotation: type inference
  13. Relaxing Requirements
    • Correctness → Forget correctness
    • Static → Defer type checking to runtime
    • No annotation → Let programmers write types
  14. Forget Correctness
    • Incorrect type checking may still help programmers
    • TypeScript accepts unsound co-variant subtyping on function parameters
    • Lint tools
        • RuboCop, Brakeman, Querly
        • A set of ad-hoc bad program patterns, but it helps detect bugs
  15. Type Checking at Runtime
    • Just-in-Time Static Type Checking for Dynamic Languages [Ren, 2016]
    • Runs the type check for a method body at the beginning of the execution of the method
    • Not before starting execution
    • Before Ruby raises NoMethodError
    • Supports meta-programming

        def foo(x)
          "".bar if x
        end

        foo(false)
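To see why a check on method entry can report an error before Ruby raises NoMethodError, the slide's example can be run as plain Ruby. This is a minimal sketch of the underlying Ruby behavior only; it does not reproduce the checker from [Ren, 2016], and the helper raises_no_method_error? is a hypothetical name introduced for illustration.

```ruby
# The method body contains a call to String#bar, which does not exist.
def foo(x)
  "".bar if x
end

# Hypothetical helper: reports whether the given block raises NoMethodError.
def raises_no_method_error?
  yield
  false
rescue NoMethodError
  true
end

# With a falsy argument the faulty branch never runs, so plain Ruby
# reports nothing. A check performed on entry to foo would inspect the
# whole body and could flag the call even on this path.
falsy_path = raises_no_method_error? { foo(false) }
puts falsy_path   # prints "false"

# Only the truthy path reaches "".bar and fails at runtime.
truthy_path = raises_no_method_error? { foo(true) }
puts truthy_path  # prints "true"
```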
  16. Key Ideas
    • Gradual Typing
    • If you don't annotate your program, it won't type check
    • Programmers annotate their Ruby programs
    • Local type inference to minimize annotation effort
    • Another language to define types
  17. Example
    • Type annotations are given as comments
    • ClassName is the type of instances of that class
    • ClassName.class is the type of the class itself
    • Local variable types can be inferred from their values

        # @type var x: String
        # @type const Pathname: Pathname.class
        path = Pathname.new(x)
  21. Annotating Constants?
    • In Ruby, constants are similar to methods
    • Inheritance
    • Module nest
    • Dynamic

        class Foo
          def foo
            p Foo
          end
        end

        Foo.new.foo # => Foo
        Foo::Foo = "Hello World"
        Foo.new.foo # => "Hello World"
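The three properties listed on this slide (inheritance, module nesting, dynamic definition) can each be observed in plain Ruby. The sketch below uses hypothetical names (Base, Derived, Outer, Inner, GREETING, NAME, LATE) that are not from the talk.

```ruby
# Inheritance: a constant defined in a superclass is visible in subclasses.
class Base
  GREETING = "from Base"
end

class Derived < Base
  def greeting
    GREETING
  end
end

# Module nesting: lookup searches the lexically enclosing modules.
module Outer
  NAME = "from Outer"

  class Inner
    def name
      NAME
    end
  end
end

# Dynamic: a constant can be defined after the method that refers to it.
class Derived
  def late
    LATE
  end
end
Derived.const_set(:LATE, "defined later")

puts Derived.new.greeting   # prints "from Base"
puts Outer::Inner.new.name  # prints "from Outer"
puts Derived.new.late       # prints "defined later"
```

Because the value of a constant depends on where and when it is referenced, a checker cannot always read its type off the source, which is why the talk annotates constants explicitly.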
  25. Type Definitions
    • Types are defined in another language by programmers
    • Not extracted from Ruby programs
    • Like C headers
  26. Type Definition

        interface _StringConvertible
          def to_str: -> String
        end

        class String
          def split: (_StringConvertible, ?Integer) -> Array<String>
                   | (Regexp, ?Integer) -> Array<String>
          ...
        end

        class Person <: Object
          def initialize: (name: String) -> any
          def name: -> String
        end
  30. Type Definition

        interface _WithEach<'a>
          def each: { ('a) -> any } -> instance
        end

        module Enumerable<'a> : _WithEach<'a>
          def map: <'b> { ('a) -> 'b } -> Array<'b>
          ...
        end

        class Array<'a>
          include Enumerable<'a>
          def each: { ('a) -> any } -> instance
          ...
        end
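The signature states that Enumerable only requires its host to satisfy _WithEach, that is, to provide each. The Ruby-side counterpart can be sketched as follows; Countdown is a hypothetical class used only for illustration.

```ruby
# Any class that defines each can include Enumerable and gain map.
class Countdown
  include Enumerable

  def initialize(from)
    @from = from
  end

  # Yields Integers, so this class plays the role of _WithEach<Integer>.
  def each
    @from.downto(1) { |n| yield n }
    self
  end
end

# map turns the Integer elements into Strings, which corresponds to
# Array<String> in the notation of the signature.
labels = Countdown.new(3).map { |n| "T-#{n}" }
p labels  # prints ["T-3", "T-2", "T-1"]
```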
  35. Open Class
    • Use another signature construct called extension
    • An extension adds methods to an existing class/module
    • The name comes from C#, Swift, and Objective-C
    • This is also flow insensitive

        extension Object (try)
          def try: (Symbol) -> any
                 | <'a> { (instance) -> 'a } -> 'a
        end
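On the Ruby side, the extension on this slide would correspond to a reopened Object class. The following is a simplified sketch written for illustration, not the ActiveSupport implementation; it covers only the two overloads in the signature (a Symbol argument, or a block that receives the receiver).

```ruby
# Reopen Object to add try, mirroring the two overloads in the signature.
class Object
  def try(name = nil)
    if block_given?
      # <'a> { (instance) -> 'a } -> 'a
      yield self
    else
      # (Symbol) -> any
      public_send(name)
    end
  end
end

p "steep".try(:upcase)          # prints "STEEP"
p "steep".try { |s| s.length }  # prints 5
```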
  36. What is a Signature?
    • Interfaces are the core of the type system
    • Classes and modules are utility constructs
    • They define interfaces by expanding inheritance and mixins
    • The Person class signature is not the type of Person.new
    • Specify the type of the Person constant explicitly by annotation
  37. Signature Code Separation
    • There are at least two Object class definitions
    • If the types are Ruby classes, when you write the type Object, which one does that mean?
    • To avoid the confusion, Steep uses a separate signature language

        class Object
          ...
        end

        class Object
          def try(...)
            ...
          end
          ...
        end
  41. Steep
    • Local type inference & structural subtyping (from TypeScript)
    • Polymorphism
    • Union types
    • Method overloading
    • Signature code separation & extension (from Objective-C)
    • No need to type meta-programming
    • Monkey patching
  42. Conclusion
    • Type inference for Ruby is impossible
    • I'm working on a type checker with type annotations
    • Local type inference & structural subtyping
    • I hope Steep will be good material for exploring static type checking for Ruby
  43. References
    • [Matsumoto, 2007] S. Matsumoto and Y. Minamide. Type Inference for Ruby Programs Based on Polymorphic Record Types
    • [Furr, 2009] M. Furr, J.-h. (David) An, J. S. Foster, and M. Hicks. Static Type Inference for Ruby
    • [Ren, 2016] B. M. Ren and J. S. Foster. Just-in-Time Static Type Checking for Dynamic Languages
  44. Using Steep
    1. Declare types
    2. Implement and annotate the Ruby program
    3. Run the type checker
    You can find examples in the Steep source code and its tests
  45. Declare Types

        class Contact
          def initialize: (name: String, address: Address) -> any
          def name: -> String
          def address: -> Address
        end

        class ContactList
          def contacts: -> Array<Contact>
          def filter: { (Contact) -> _Boolean } -> ContactList
        end
  46. Declare Types

        # @type const ContactList: ContactList.class
        list = ContactList.new
        list.contacts << 3
        list = list.contacts.filter {|contact| contact.name == "Matsumoto" }

        $ steep check -I contact.rbi test.rb
        test.rb:9:0: ArgumentTypeMismatch: type=Array<Contact>, method=<<
        test.rb:10:0: IncompatibleAssignment: lhs_type=ContactList, rhs_type=Array<Contact>
  49. Type Check

        ...
        8
        9  list.contacts << 3
        10 list = list.contacts.filter {|contact| contact.name == "Matsumoto" }

        $ steep check -I address.rbi test.rb
        test.rb:9:0: ArgumentTypeMismatch: type=Array<Contact>, method=<<
        test.rb:10:0: IncompatibleAssignment: lhs_type=ContactList, rhs_type=Array<Contact>
  50. Annotate Ruby Program

        class Contact
          # @implements Contact
          attr_reader :name
          attr_reader :address

          def initialize(name:, address:)
            @name = name
            @address = address
          end
        end
  51. Type Check
    • There are no defs in the class for name or address
    • Add annotations to tell Steep that it does not have to check for the existence of those method definitions

        $ steep check -I contact.rbi contact.rb
        contact.rb:1:0: MethodDefinitionMissing: module=Contact, method=name
        contact.rb:1:0: MethodDefinitionMissing: module=Contact, method=address
  52. Annotate Ruby Program

        class Contact
          # @implements Contact

          # @dynamic name
          attr_reader :name

          # @dynamic address
          attr_reader :address

          def initialize(name:, address:)
            @name = name
            @address = address
          end
        end
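The @dynamic annotation is needed because attr_reader defines methods at runtime without any def appearing in the source. This can be confirmed in plain Ruby; in the sketch below the Address value is replaced by a plain String for brevity, which is an assumption made for illustration.

```ruby
class Contact
  attr_reader :name
  attr_reader :address

  def initialize(name:, address:)
    @name = name
    @address = address
  end
end

# No "def name" or "def address" appears above, yet both methods exist.
p Contact.public_instance_methods(false).sort  # prints [:address, :name]

contact = Contact.new(name: "Matsumoto", address: "Tokyo")
p contact.name  # prints "Matsumoto"
```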
  53. Annotate Ruby Programs

        class ContactList
          # @implements ContactList

          # @dynamic contacts
          attr_reader :contacts

          def initialize; @contacts = []; end

          def filter
            contacts.select do |contact|
              yield contact
            end
          end
        end
  54. Type Check

        $ steep check -I contact.rbi contact.rb
        contact.rb:25:2: MethodBodyTypeMismatch: method=filter, expected=ContactList, actual=Array<Contact>

        def filter
          contacts.select do |contact|
            yield contact
          end
        end

    contacts is Array<Contact> and #select returns Array<Contact>
  55. Fix Implementation

        class ContactList
          # @implements ContactList
          ...
          def filter
            copy = ContactList.new
            contacts.each do |contact|
              copy.contacts << contact if yield(contact)
            end
            copy
          end
        end
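The difference between the two implementations of filter can also be observed at runtime. The sketch below keeps both versions side by side; filter_with_select is a hypothetical name given here to the original body, and contacts are plain Strings to keep the example short.

```ruby
class ContactList
  attr_reader :contacts

  def initialize
    @contacts = []
  end

  # Original body: Array#select returns an Array, not a ContactList.
  def filter_with_select
    contacts.select { |contact| yield contact }
  end

  # Fixed body: build and return a new ContactList.
  def filter
    copy = ContactList.new
    contacts.each do |contact|
      copy.contacts << contact if yield(contact)
    end
    copy
  end
end

list = ContactList.new
list.contacts << "Matsumoto" << "Furr"

p list.filter_with_select { |c| c == "Matsumoto" }.class  # prints Array
p list.filter { |c| c == "Matsumoto" }.class              # prints ContactList
```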
  56. Type Check
    • Is that certain?
    • I'm afraid there may be an unexpected fallback to the any type

        $ steep check -I contact.rbi contact.rb
        $
  57. Future Work
    • Support typical Ruby programming styles
    • Instead of adding annotations to all constants, infer their types from signatures
    • String::String=3, ([Integer, String].sample)::Foo=3
    • Type system improvements
    • Typing rule enhancements & bug fixes
    • Access control (public/private)
    • Integration with Ruby
  58. Integration with Ruby
    • Some features cannot be implemented without extending Ruby
    • Integrating annotations into Ruby syntax
    • Update: Matz rejected adding typing syntax yesterday
    • Dynamic type testing
    • Ruby only has is_a?, which tests the inheritance relation
    • Want an operator that tests the structural subtyping relation
    • Not Module#conform (because the params and return type should be checked)
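The gap in dynamic type testing can be seen with Ruby's built-in checks. The sketch below uses a hypothetical helper, responds_to_all?, which tests method names only; as the slide notes, a real structural test would also have to check parameter and return types, which this helper cannot see. Duck and Robot are likewise made-up classes.

```ruby
# Hypothetical helper: a names-only approximation of a structural test.
def responds_to_all?(object, *method_names)
  method_names.all? { |name| object.respond_to?(name) }
end

class Duck
  def quack
    "Quack"
  end
end

class Robot
  # Same method name, different parameters: structurally not a Duck.
  def quack(volume)
    "Beep" * volume
  end
end

p Duck.new.is_a?(Duck)                 # prints true
p Robot.new.is_a?(Duck)                # prints false (no inheritance)
p responds_to_all?(Robot.new, :quack)  # prints true despite the mismatch
p Duck.new.method(:quack).arity        # prints 0
p Robot.new.method(:quack).arity       # prints 1
```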
  59. Structural Subtyping Failure

        class A
          def initialize: (name: String) -> any
        end

        class B <: A
          def initialize: (year: Integer) -> any
          def print: () -> any
        end

        A.class == { new: (name: String) -> A, ... }
        B.class == { new: (year: Integer) -> B, ... }
        A == { class: -> A.class, ... }
        B == { class: -> B.class, ... }
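The Ruby-level consequence is that B's class object cannot safely be used where A's class object is expected, even though B inherits from A. A minimal sketch, where build_with_name is a hypothetical helper and the print method from the signature is omitted:

```ruby
class A
  def initialize(name:)
    @name = name
  end
end

class B < A
  # Overrides initialize with an incompatible keyword argument.
  def initialize(year:)
    @year = year
  end
end

# Hypothetical helper that expects a class whose new accepts name:.
def build_with_name(klass)
  klass.new(name: "Matsumoto")
end

p build_with_name(A).class  # prints A

begin
  build_with_name(B)
rescue ArgumentError => e
  # B.new rejects name: even though B is a subclass of A.
  puts e.class  # prints ArgumentError
end
```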
  62. Array#zip
    • Tuple types ('a * 'b)
    • Type variable extraction difficulty

        def zip: <'b> (Array<'b>) -> Array<'a * 'b>

        # @type var x: Array<Integer>
        ["a"].zip(x)   -> Array<String * Integer>

        # @type var y: Foo
        ["a"].zip(y)   -> Array<String * Symbol> ???
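At runtime, Array#zip accepts any argument that responds to each, which is what makes 'b hard to extract from an arbitrary type such as Foo. The sketch below defines a hypothetical Foo whose each yields Symbols; the talk does not show Foo's definition, so this is an assumption made for illustration.

```ruby
# Hypothetical Foo: not an Array, but it responds to each.
class Foo
  def each
    yield :x
    yield :y
  end
end

# With an Array of Integers as the argument, 'b is simply Integer.
p ["a"].zip([1])      # prints [["a", 1]]

# With Foo, 'b has to be recovered from the block parameter of Foo#each.
p ["a"].zip(Foo.new)  # prints [["a", :x]]
```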