Type Checking Ruby Programs with Annotations

My presentation at RubyKaigi 2017.

Soutaro Matsumoto

September 19, 2017

Transcript

  1. Type Checking Ruby Programs with Annotations • Soutaro Matsumoto (@soutaro)

  7. Soutaro Matsumoto
    • GitHub, Twitter @soutaro
    • I have implemented four type checkers for Ruby
      • 2005, Type inference, structural subtyping
      • 2007, Type inference, polymorphic record types
      • 2009, Control flow analysis
      • 2017, Local type inference, structural subtyping [New]
  8. Why do we want a type checker?

  9. Benefits
    • Find bugs
    • Verifiable documentation
    • Auto completion
    • Easier refactoring
    • For advanced program analysis
  10. Type Checking for Ruby
    • People have been trying this for at least 12 years
      • Static Type Inference for Ruby [Furr, 2009]
      • Type Inference for Ruby Programs Based on Polymorphic Record Types [Matsumoto, 2007]
    • They tried to infer the types of Ruby programs, because Ruby is an untyped language
  11. Static Type Inference for Ruby
    • Implementation is available as Diamondback Ruby
    • Based on structural subtyping
    • This means it cannot infer polymorphic types
  12. Type Inference for Ruby Programs Based on Polymorphic Record Types
    • RubyKaigi 2008
    • Based on ML type inference and polymorphic record types
    • Infers polymorphic types
    • Cannot give types to some Ruby builtins
      • Polymorphic recursion (Array cannot be polymorphic)
      • Non-regular types (Array#map)
  13. Type Checking for Ruby

                        Furr, 2009                        Matsumoto, 2007
        Type System     Structural Subtyping              Polymorphic Record Types
        Type Inference  Constraint based                  ML Type Inference
        Correctness     Maybe (not proved)
        Limitations     Cannot infer polymorphic types    Cannot type some builtins
  15. The Conclusion
    • We cannot construct type inference for Ruby programs
      • If we choose subtyping, no polymorphic types are inferred
      • If we choose polymorphic type inference, some builtins cannot be typed
  16. Requirements
    • Correctness: if the type checker says ok, no type error during execution
    • Static: without execution
    • No annotation: type inference
  17. Relaxing Requirements
    • Correctness → Forget correctness
    • Static → Defer type checking to runtime
    • No annotation → Let programmers write types
  18. Forget Correctness
    • Incorrect type checking may still help programmers
    • TypeScript accepts unsound covariant subtyping on function parameters
    • Lint tools
      • RuboCop, Brakeman, Querly
      • A set of ad-hoc bad program patterns, but it helps detect bugs
  19. Type Checking at Runtime
    • Just-in-Time Static Type Checking for Dynamic Languages [Ren, 2016]
    • Runs the type check for a method body when the method starts executing
      • Not before the program starts executing
      • Before Ruby raises NoMethodError
    • Supports meta-programming

        def foo(x)
          "".bar if x
        end

        foo(false)
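
The timing point on this slide can be observed in plain Ruby, with no type checker involved. The following is only a sketch of Ruby's own behavior, not of the system described in [Ren, 2016]: the call `"".bar` is looked up only when that branch actually runs, so `foo(false)` returns normally and the error appears only for `foo(true)`.

```ruby
# Plain Ruby: a method call is resolved only when it is executed.
def foo(x)
  "".bar if x
end

# The faulty branch is skipped, so nothing is raised and nil is returned.
foo(false)

# The faulty branch runs, and only now does Ruby raise NoMethodError.
begin
  foo(true)
rescue NoMethodError => e
  puts "raised at call time: #{e.class}"
end
```

A checker that runs when `foo` is entered can report the problem for both calls, before Ruby itself gets as far as raising.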
  20. Annotate Ruby Programs
    • This is my latest static type checker
  21. Steep
    • Gradual typing for Ruby
    • https://github.com/soutaro/steep

        $ gem install steep --pre
  22. Key Ideas
    • Gradual Typing
      • If you don't annotate your program, it won't be type checked
    • Programmers annotate their Ruby programs
      • Local type inference to minimize annotation effort
    • Another language to define types
  23. Example
    • Type annotations are given as comments
    • ClassName is the type of instances of that class
    • ClassName.class is the type of the class itself
    • Local variable types can be inferred from their values

        # @type var x: String
        # @type const Pathname: Pathname.class
        path = Pathname.new(x)
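
Because the annotations are ordinary comments, an annotated file is still plain Ruby and runs unchanged. A minimal sketch of that point, wrapping the slide's snippet in a hypothetical method `annotated_path` (the method name and the sample argument are assumptions for illustration, and the placement of the comments is illustrative rather than a statement of Steep's rules):

```ruby
require "pathname"

# Hypothetical wrapper around the slide's snippet.
def annotated_path(x)
  # @type var x: String
  # @type const Pathname: Pathname.class
  path = Pathname.new(x)
  path
end

# Ruby ignores the annotation comments; only a type checker reads them.
puts annotated_path("/tmp/example.txt").basename
```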
  27. Annotating Constants?
    • In Ruby, constants are similar to methods
      • Inheritance
      • Module nest
      • Dynamic

        class Foo
          def foo
            p Foo
          end
        end

        Foo.new.foo   # => Foo
        Foo::Foo = "Hello World"
        Foo.new.foo   # => "Hello World"
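
The slide's code shows the dynamic case. The other two bullets can be sketched the same way; all class, module, and constant names below are hypothetical and chosen only for illustration.

```ruby
class Base
  GREETING = "from Base"
end

class Child < Base
  def greeting
    GREETING            # found through the superclass chain
  end
end

module Outer
  LABEL = "from Outer"

  class Inner
    def label
      LABEL             # found through the lexical module nesting
    end
  end
end

Child.new.greeting      # => "from Base"
Outer::Inner.new.label  # => "from Outer"

# Dynamic: a constant defined later on Child shadows the inherited one.
Child.const_set(:GREETING, "from Child")
Child.new.greeting      # => "from Child"
```

Since the value a constant name resolves to can change while the program runs, a static checker cannot simply read it off the source, which is why the slides annotate constants explicitly.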
  31. Type Definitions
    • Types are defined in another language by programmers
    • Not extracted from Ruby programs
    • Like C headers
  32. Type Definition

        interface _StringConvertible
          def to_str: -> String
        end

        class String
          def split: (_StringConvertible, ?Integer) -> Array<String>
                   | (Regexp, ?Integer) -> Array<String>
          ...
        end

        class Person <: Object
          def initialize: (name: String) -> any
          def name: -> String
        end
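
The two overloads of `split` in the signature above correspond to calls that plain Ruby already accepts. A small sketch, where `Separator` is a hypothetical class that is not a String but responds to `to_str`, which is what the `_StringConvertible` interface describes:

```ruby
# Hypothetical class: not a String, but convertible through to_str.
class Separator
  def to_str
    ","
  end
end

csv = "a,b,c"

csv.split(",")            # => ["a", "b", "c"]  String pattern
csv.split(/,/)            # => ["a", "b", "c"]  Regexp pattern
csv.split(",", 2)         # => ["a", "b,c"]     optional Integer limit
csv.split(Separator.new)  # => ["a", "b", "c"]  any object with to_str
```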
  36. Type Definition

        interface _WithEach<'a>
          def each: { ('a) -> any } -> instance
        end

        module Enumerable<'a> : _WithEach<'a>
          def map: <'b> { ('a) -> 'b } -> Array<'b>
          ...
        end

        class Array<'a>
          include Enumerable<'a>
          def each: { ('a) -> any } -> instance
          ...
        end
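
The `Enumerable<'a> : _WithEach<'a>` line says that the module can be mixed into anything that provides `each`. On the Ruby side this looks as follows; `Countdown` is a hypothetical class used only to illustrate the relationship.

```ruby
# Hypothetical class: it defines only each and mixes in Enumerable.
class Countdown
  include Enumerable

  def initialize(from)
    @from = from
  end

  # Yields every element and returns the receiver, matching "-> instance".
  def each
    @from.downto(1) { |n| yield n }
    self
  end
end

# map comes from Enumerable and is implemented on top of each.
Countdown.new(3).map { |n| "T-#{n}" }   # => ["T-3", "T-2", "T-1"]
```

Here `'a` is Integer and `'b` is String, so the signature gives `map` the result type `Array<String>`.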
  41. Open Class
    • Use another signature construct called extension
    • Extension adds methods to an existing class/module
    • The name is from C#, Swift, or Objective-C
    • This is also flow insensitive

        extension Object (try)
          def try: (Symbol) -> any
                 | <'a> { (instance) -> 'a } -> 'a
        end
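
For context, this is roughly the kind of Ruby code that such an extension describes. It is a simplified sketch of an open-class `try` with the two call shapes from the signature, not the definition used by any particular library.

```ruby
# Reopen Object and add a method to every object (monkey patching).
class Object
  # Symbol form: call the named method.
  # Block form: yield the receiver and return the block's result.
  def try(name = nil)
    if block_given?
      yield self
    else
      public_send(name)
    end
  end
end

"steep".try(:upcase)           # => "STEEP"
"steep".try { |s| s.length }   # => 5
```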
  42. What is a Signature?
    • Interface is the core of the type system
    • Classes and modules are utility constructs
      • They define interfaces by expanding inheritance and mixins
    • The Person class signature is not a type of Person.new
      • Specify the type of the Person constant explicitly by annotation
  43. Signature Code Separation
    • There are at least two Object class definitions
    • If the types are Ruby classes, when you write a type Object, which one does that mean?
    • To avoid the confusion, Steep uses another signature language

        class Object
          ...
        end

        class Object
          def try(...) ... end
          ...
        end
  47. Steep
    • Local type inference & structural subtyping (from TypeScript)
      • Polymorphism
      • Union types
      • Method overloading
    • Signature code separation & extension (from Objective-C)
      • No need to type meta-programming
      • Monkey patching
  48. Conclusion
    • Type inference for Ruby is impossible
    • I'm working on a type checker with type annotations
      • Local type inference & structural subtyping
    • I hope Steep will be good material for exploring static type checking for Ruby
  49. Slides below were skipped in my presentation

  50. References
    • [Matsumoto, 2007] S. Matsumoto and Y. Minamide. Type Inference for Ruby Programs based on Polymorphic Record Types
    • [Furr, 2009] M. Furr, J. (David) An, J. S. Foster, and M. Hicks. Static Type Inference for Ruby
    • [Ren, 2016] B. M. Ren and J. S. Foster. Just-in-Time Static Type Checking for Dynamic Languages
  51. Using Steep
    1. Declare types
    2. Implement and annotate the Ruby program
    3. Run the type checker
    • You can find examples in the Steep source code and its tests
  52. Declare Types

        class Contact
          def initialize: (name: String, address: Address) -> any
          def name: -> String
          def address: -> Address
        end

        class ContactList
          def contacts: -> Array<Contact>
          def filter: { (Contact) -> _Boolean } -> ContactList
        end
  53. Declare Types

        # @type const ContactList: ContactList.class
        list = ContactList.new
        list.contacts << 3
        list = list.contacts.filter {|contact| contact.name == "Matsumoto" }

        $ steep check -I contact.rbi test.rb
        test.rb:9:0: ArgumentTypeMismatch: type=Array<Contact>, method=<<
        test.rb:10:0: IncompatibleAssignment: lhs_type=ContactList, rhs_type=Array<Contact>
  56. Type Check

        ...
         8
         9  list.contacts << 3
        10  list = list.contacts.filter {|contact| contact.name == "Matsumoto" }

        $ steep check -I address.rbi test.rb
        test.rb:9:0: ArgumentTypeMismatch: type=Array<Contact>, method=<<
        test.rb:10:0: IncompatibleAssignment: lhs_type=ContactList, rhs_type=Array<Contact>
  57. Annotate Ruby Program

        class Contact
          # @implements Contact
          attr_reader :name
          attr_reader :address

          def initialize(name:, address:)
            @name = name
            @address = address
          end
        end
  58. Type Check
    • There are no defs in the class for name or address
    • Add an annotation to tell Steep that it does not have to check that those method definitions exist

        $ steep check -I contact.rbi contact.rb
        contact.rb:1:0: MethodDefinitionMissing: module=Contact, method=name
        contact.rb:1:0: MethodDefinitionMissing: module=Contact, method=address
  59. Annotate Ruby Program

        class Contact
          # @implements Contact

          # @dynamic name
          attr_reader :name

          # @dynamic address
          attr_reader :address

          def initialize(name:, address:)
            @name = name
            @address = address
          end
        end
  60. Annotate Ruby Programs

        class ContactList
          # @implements ContactList

          # @dynamic contacts
          attr_reader :contacts

          def initialize; @contacts = []; end

          def filter
            contacts.select do |contact|
              yield contact
            end
          end
        end
  61. Type Check

        $ steep check -I contact.rbi contact.rb
        contact.rb:25:2: MethodBodyTypeMismatch: method=filter, expected=ContactList, actual=Array<Contact>

        def filter
          contacts.select do |contact|
            yield contact
          end
        end

    • contacts is Array<Contact> and #select returns Array<Contact>
  62. Fix Implementation

        class ContactList
          # @implements ContactList
          ...
          def filter
            copy = ContactList.new
            contacts.each do |contact|
              copy.contacts << contact if yield(contact)
            end
            copy
          end
        end
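
The difference between the two `filter` implementations is also visible at runtime. Below is a self-contained sketch that puts both side by side. The method name `filter_with_select`, the sample contacts, and the use of plain strings for addresses (the slides use an `Address` type) are assumptions made only for this illustration.

```ruby
class Contact
  attr_reader :name, :address

  def initialize(name:, address:)
    @name = name
    @address = address
  end
end

class ContactList
  attr_reader :contacts

  def initialize
    @contacts = []
  end

  # First version from the slides: select returns a plain Array.
  def filter_with_select
    contacts.select { |contact| yield contact }
  end

  # Fixed version: build and return a new ContactList.
  def filter
    copy = ContactList.new
    contacts.each do |contact|
      copy.contacts << contact if yield(contact)
    end
    copy
  end
end

list = ContactList.new
list.contacts << Contact.new(name: "Matsumoto", address: "Tokyo")
list.contacts << Contact.new(name: "Foster", address: "Maryland")

list.filter_with_select { |c| c.name == "Matsumoto" }.class  # => Array
list.filter { |c| c.name == "Matsumoto" }.class              # => ContactList
```

Only the second version matches the declared return type `ContactList`, which is what the MethodBodyTypeMismatch error points out.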
  63. Type Check
    • Are we sure?
    • I'm afraid there may be an unexpected fallback to the any type

        $ steep check -I contact.rbi contact.rb
        $
  64. Type Check

        $ steep check --fallback-any-is-error -I contact.rbi contact.rb
        contact.rb:26:11: FallbackAny

        26  copy = ContactList.new
  65. Annotate Ruby Programs

        class ContactList
          # @implements ContactList
          # @type const ContactList: ContactList.class
          ...
        end
  66. Type Check

        $ steep check --fallback-any-is-error -I contact.rbi contact.rb
        $

  67. Future Work
    • Support typical Ruby programming styles
      • Instead of adding annotations to all constants, infer their types from signatures
      • String::String=3, ([Integer, String].sample)::Foo=3
    • Type system improvements
      • Typing rule enhancements & bug fixes
      • Access control (public/private)
    • Integration with Ruby
  68. Integration with Ruby
    • Some features cannot be implemented without extending Ruby
    • Integrating annotations into Ruby syntax
      • Update: Matz rejected adding typing syntax yesterday
    • Dynamic type testing
      • Ruby only has is_a?, which tests the inheritance relation
      • Want an operator that tests the structural subtyping relation
      • Not Module#conform (because the params and return type should be checked)
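
The gap described in the last bullets can be sketched with what Ruby offers today. `is_a?` tests inheritance only, and the closest built-in structural test, `respond_to?`, looks at method names but not at parameters or return types. The classes `Duck` and `Robot` and the helper `quacks?` are hypothetical names for this illustration.

```ruby
class Duck
  def quack(times)
    "quack " * times
  end
end

class Robot
  # Same method name, but a different parameter list and return type.
  def quack
    42
  end
end

# Hypothetical helper: a name-only structural test built on respond_to?.
def quacks?(object)
  object.respond_to?(:quack)
end

Duck.new.is_a?(Duck)    # => true
Robot.new.is_a?(Duck)   # => false, there is no inheritance relation
quacks?(Duck.new)       # => true
quacks?(Robot.new)      # => true, although the signatures differ
```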
  69. Structural Subtyping Failure

        class A
          def initialize: (name: String) -> any
        end

        class B <: A
          def initialize: (year: Integer) -> any
          def print: () -> any
        end

        A.class == { new: (name: String) -> A, ... }
        B.class == { new: (year: Integer) -> B, ... }
        A == { class: -> A.class, ... }
        B == { class: -> B.class, ... }
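
In Ruby terms, the problem is that B inherits from A, yet `B.new` cannot be used where `A.new` is expected. A minimal sketch of the two classes, with a hypothetical helper `build` that is written against the shape of `A.class` (the helper and the sample name are assumptions for illustration):

```ruby
class A
  attr_reader :name

  def initialize(name:)
    @name = name
  end
end

class B < A
  def initialize(year:)
    @year = year
  end

  def print
    puts @year
  end
end

# Hypothetical helper that assumes new: (name: String) -> A.
def build(klass)
  klass.new(name: "Soutaro")
end

build(A)   # works, returns an A

begin
  build(B) # B.new takes year:, not name:
rescue ArgumentError => e
  puts "B.class is not usable as A.class: #{e.class}"
end
```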
  70. Array#zip

        class Array<'a>
          ...
          def zip: <'b> (Array<'b>) -> Array<any>
        end
  73. Array#zip
    • Tuple types ('a * 'b)
    • Type variable extraction difficulty

        def zip: <'b> (Array<'b>) -> Array<'a * 'b>

        # @type var x: Array<Integer>
        ["a"].zip(x)   -> Array<String * Integer>

        # @type var y: Foo
        ["a"].zip(y)   -> Array<String * Symbol> ???
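
The slides do not define `Foo`. The sketch below assumes it is a class that is not an Array but responds to `each` and yields Symbols, which Ruby's `Array#zip` accepts at runtime. Under that assumption, the element type Symbol appears nowhere in the name `Foo`, and the checker would have to extract it from the signature of `each`.

```ruby
# Hypothetical class: not an Array, but it responds to each.
class Foo
  def each
    yield :a
    yield :b
  end
end

x = [1, 2]
["a", "b"].zip(x)        # => [["a", 1], ["b", 2]]
["a", "b"].zip(Foo.new)  # => [["a", :a], ["b", :b]]
```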