Type Checking Ruby Programs with Annotations

My presentation at RubyKaigi 2017.

Soutaro Matsumoto

September 19, 2017

Transcript

  1. Type Checking Ruby Programs with Annotations
    Soutaro Matsumoto (@soutaro)

  7. Soutaro Matsumoto
    • GitHub, Twitter @soutaro
    • I have implemented four type checkers for Ruby
    • 2005, Type inference, structural subtyping
    • 2007, Type inference, polymorphic record types
    • 2009, Control flow analysis
    • 2017, Local type inference, structural subtyping [New]

  8. Why do we want a type checker?

  9. Benefits
    • Find bugs
    • Verifiable documentation
    • Auto completion
    • Easier refactoring
    • For advanced program analysis

  10. Type Checking for Ruby
    • People have been trying this for at least 12 years
    • Static Type Inference for Ruby [Furr, 2009]
    • Type Inference for Ruby Programs Based on Polymorphic Record Types [Matsumoto, 2007]
    • Both tried to infer the types of Ruby programs, because Ruby is an untyped language

  11. Static Type Inference for Ruby
    • Implementation is available as Diamondback Ruby
    • Based on structural subtyping
    • This means it cannot infer polymorphic types

  12. Type Inference for Ruby Programs Based on
    Polymorphic Record Types
    • RubyKaigi 2008
    • Based on ML type inference and polymorphic record types
    • Infers polymorphic types
    • Cannot give types to some Ruby builtins
    • Polymorphic recursion (Array cannot be polymorphic)
    • Non-regular types (Array#map)

  13. Type Checking for Ruby
                    Furr, 2009                      Matsumoto, 2007
    Type System     Structural Subtyping            Polymorphic Record Types
    Type Inference  Constraint based                ML Type Inference
    Correctness     Maybe (not proved)
    Limitations     Cannot infer polymorphic types  Cannot type some builtins

  15. The Conclusion
    • We cannot construct type inference for Ruby programs
    • If we choose subtyping, no polymorphic types are inferred
    • If we choose polymorphic type inference, some builtins cannot be typed

  16. Requirements
    • Correctness: if type checker says ok, no type error during execution
    • Static: without execution
    • No annotation: type inference

  17. Relaxing Requirements
    • Correctness → Forget correctness
    • Static → Defer type checking to runtime
    • No annotation → Let programmers write types

  18. Forget Correctness
    • Incorrect type checking may still help programmers
    • TypeScript accepts unsound covariant subtyping on function parameters
    • Lint tools
    • RuboCop, Brakeman, Querly
    • A set of ad hoc bad-program patterns, but they help detect bugs

  19. Type Checking at Runtime
    • Just-in-Time Static Type Checking for Dynamic Languages [Ren, 2016]
    • Run the type check for a method body at the beginning of the execution of the method
    • Not before starting execution
    • Before Ruby raises NoMethodError
    • Support meta-programming
    def foo(x)
      "".bar if x
    end
    foo(false)
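
To see why the timing of the check matters, the slide's example can be run as plain Ruby. This is only a sketch of the unchecked behaviour, not of the checker described in [Ren, 2016]: the call with false succeeds because the faulty branch never runs, and the error appears only when a truthy argument is passed.

```ruby
# The method from the slide: String has no method named bar.
def foo(x)
  "".bar if x
end

# The faulty call is never reached, so this returns nil without an error.
result = foo(false)

# Only a truthy argument reaches "".bar and raises NoMethodError.
error = begin
  foo(true)
  nil
rescue NoMethodError => e
  e
end

puts result.inspect
puts error.class
```

A check that runs when foo is first entered would report the problem on the foo(false) call, before Ruby itself has any reason to raise.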

  20. Annotate Ruby Programs
    • This is my latest static type checker

  21. Steep
    • Gradual typing for Ruby
    • https://github.com/soutaro/steep
    $ gem install steep --pre

  22. Key Ideas
    • Gradual Typing
    • If you don't annotate your program, it is not type checked
    • Programmers annotate their Ruby programs
    • Local type inference to minimize annotation effort
    • Another language to define types

  23. Example
    • Type annotations are given as comments
    • ClassName is the type of instances of that class
    • ClassName.class is the type of the class itself
    • The type of a local variable can be inferred from its value
    # @type var x: String
    # @type const Pathname: Pathname.class
    path = Pathname.new(x)
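
Because the annotations are ordinary comments, an annotated file is still a plain Ruby program. A minimal sketch, assuming a hypothetical value for x that is not in the slides:

```ruby
require "pathname"

# @type var x: String
# @type const Pathname: Pathname.class
x = "lib/steep.rb"        # hypothetical value, chosen only for illustration
path = Pathname.new(x)    # no annotation on path: its type follows from the value

puts path.basename
```

Ruby ignores the two @type lines at run time; they only matter to the type checker.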

  27. Annotating Constants?
    • In Ruby, constants are similar to methods
    • Inheritance
    • Module nesting
    • Dynamic
    class Foo
      def foo
        p Foo
      end
    end
    Foo.new.foo # => Foo
    Foo::Foo = "Hello World"
    Foo.new.foo # => "Hello World"
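
The three bullets can each be seen in a few lines of plain Ruby. The class and constant names below are hypothetical and chosen only for illustration.

```ruby
class Base
  GREETING = "from Base"
end

class Child < Base
  def greeting
    GREETING            # resolved through inheritance
  end
end

module Outer
  NAME = "from Outer"

  class Inner
    def name
      NAME              # resolved through the lexical module nesting
    end
  end
end

before = Child.new.greeting
nested = Outer::Inner.new.name

# Dynamic: a constant defined later changes what the same method body sees.
Child.const_set(:GREETING, "from Child")
after = Child.new.greeting

puts before, nested, after
```

The same expression GREETING gives two different results, which is why a checker cannot read a constant's type off the source text alone.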

  31. Type Definitions
    • Types are defined in another language by programmers
    • Not extracted from Ruby programs
    • Like C headers

  32. Type Definition
    interface _StringConvertible
      def to_str: -> String
    end
    class String
      def split: (_StringConvertible, ?Integer) -> Array
               | (Regexp, ?Integer) -> Array
      ...
    end
    class Person <: Object
      def initialize: (name: String) -> any
      def name: -> String
    end
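
A small Ruby sketch of the kind of program these signatures describe. Separator is a hypothetical class, not part of the slides; it is not a String, but it has the to_str method that _StringConvertible asks for.

```ruby
# Hypothetical class that satisfies _StringConvertible structurally.
class Separator
  def to_str
    ","
  end
end

# Matches the Person signature: a keyword argument and a name reader.
class Person
  attr_reader :name

  def initialize(name:)
    @name = name
  end
end

by_convertible = "a,b,c".split(Separator.new)  # first overload of split
by_regexp      = "a,b,c".split(/,/, 2)         # second overload, with the optional Integer
person         = Person.new(name: "Matsumoto")

p by_convertible, by_regexp, person.name
```

The two calls to split correspond to the two alternatives separated by | in the signature.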

  36. Type Definition
    interface _WithEach<'a>
      def each: { ('a) -> any } -> instance
    end
    module Enumerable<'a> : _WithEach<'a>
      def map: <'b> { ('a) -> 'b } -> Array<'b>
      ...
    end
    class Array<'a>
      include Enumerable<'a>
      def each: { ('a) -> any } -> instance
      ...
    end
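
The constraint on Enumerable mirrors how the module works in Ruby itself: any class with a suitable each can include it. A minimal sketch with a hypothetical Countdown class, where 'a would be Integer and 'b would be String:

```ruby
# Hypothetical class that provides each and therefore can include Enumerable.
class Countdown
  include Enumerable

  def initialize(from)
    @from = from
  end

  # Yields each element and returns the receiver, like "-> instance".
  def each
    @from.downto(1) { |n| yield n }
    self
  end
end

countdown = Countdown.new(3)
labels = countdown.map { |n| n.to_s }   # block is Integer -> String

p labels
```

map comes from Enumerable and is implemented only in terms of each, which is what the _WithEach<'a> constraint expresses.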

  41. Open Class
    • Use another signature construct called extension
    • An extension adds methods to an existing class/module
    • The name is from C#, Swift, or Objective-C
    • This is also flow insensitive
    extension Object (try)
      def try: (Symbol) -> any
             | <'a> { (instance) -> 'a } -> 'a
    end
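
On the Ruby side, the extension corresponds to reopening the class. The implementation below is a simplified, hypothetical try written only to match the two overloads in the signature; it is not the ActiveSupport method.

```ruby
# Reopen Object and add a method, as an open class allows.
class Object
  # (Symbol) -> any, or a block that receives the object itself.
  def try(name = nil)
    if block_given?
      yield self
    else
      public_send(name)
    end
  end
end

upcased = "steep".try(:upcase)        # first overload: a Symbol
length  = "steep".try { |s| s.size }  # second overload: a block

p upcased, length
```

Every object in the program gains try once this file is loaded, so the signature has to be added to Object as a whole rather than from some point in the control flow onward.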

  42. What is a Signature?
    • Interfaces are the core of the type system
    • Classes and modules are utility constructs
    • They define interfaces by expanding inheritance and mixins
    • The Person class signature is not the type of Person.new
    • Specify the type of the Person constant explicitly with an annotation

  43. Signature Code Separation
    • There are at least two Object class definitions
    • If the types are Ruby classes, when you write a type Object, which one does that mean?
    • To avoid the confusion, Steep uses a separate signature language
    class Object ... end
    class Object
      def try(...) ... end
      ...
    end

  47. Steep
    • Local type inference & structural subtyping (from TypeScript)
    • Polymorphism
    • Union types
    • Method overloading
    • Signature code separation & extension (from Objective-C)
    • No need to type meta-programming
    • Monkey patching

  48. Conclusion
    • Type inference for Ruby is impossible
    • I'm working on a type checker with type annotations
    • Local type inference & structural subtyping
    • I hope Steep will be good material for exploring static type checking for Ruby

  49. Slides below were skipped in my presentation

  50. References
    • [Matsumoto, 2007] S. Matsumoto and Y. Minamide. Type Inference for Ruby Programs Based on Polymorphic Record Types
    • [Furr, 2009] M. Furr, J.-h. (David) An, J. S. Foster, and M. Hicks. Static Type Inference for Ruby
    • [Ren, 2016] B. M. Ren and J. S. Foster. Just-in-Time Static Type Checking for Dynamic Languages

  51. Using Steep
    1. Declare types
    2. Implement and annotate the Ruby program
    3. Run the type checker
    You can find examples in Steep's source code and its tests

  52. Declare Types
    class Contact
      def initialize: (name: String, address: Address) -> any
      def name: -> String
      def address: -> Address
    end
    class ContactList
      def contacts: -> Array
      def filter: { (Contact) -> _Boolean } -> ContactList
    end

  53. Declare Types
    # @type const ContactList: ContactList.class
    list = ContactList.new
    list.contacts << 3
    list = list.contacts.filter {|contact| contact.name == "Matsumoto" }
    $ steep check -I contact.rbi test.rb
    test.rb:9:0: ArgumentTypeMismatch: type=Array, method=<<
    test.rb:10:0: IncompatibleAssignment: lhs_type=ContactList, rhs_type=Array

  56. Type Check
    ...
    8
    9 list.contacts << 3
    10 list = list.contacts.filter {|contact| contact.name == "Matsumoto" }
    $ steep check -I address.rbi test.rb
    test.rb:9:0: ArgumentTypeMismatch: type=Array, method=<<
    test.rb:10:0: IncompatibleAssignment: lhs_type=ContactList, rhs_type=Array

  57. Annotate Ruby Program
    class Contact
      # @implements Contact
      attr_reader :name
      attr_reader :address
      def initialize(name:, address:)
        @name = name
        @address = address
      end
    end

  58. Type Check
    • There are no defs in the class for name or address
    • Add annotations to tell Steep that it does not have to check that those method definitions exist
    $ steep check -I contact.rbi contact.rb
    contact.rb:1:0: MethodDefinitionMissing: module=Contact, method=name
    contact.rb:1:0: MethodDefinitionMissing: module=Contact, method=address

  59. Annotate Ruby Program
    class Contact
      # @implements Contact
      # @dynamic name
      attr_reader :name
      # @dynamic address
      attr_reader :address
      def initialize(name:, address:)
        @name = name
        @address = address
      end
    end

  60. Annotate Ruby Programs
    class ContactList
      # @implements ContactList
      # @dynamic contacts
      attr_reader :contacts
      def initialize; @contacts = []; end
      def filter
        contacts.select do |contact|
          yield contact
        end
      end
    end

  61. Type Check
    $ steep check -I contact.rbi contact.rb
    contact.rb:25:2: MethodBodyTypeMismatch: method=filter, expected=ContactList, actual=Array
    def filter
      contacts.select do |contact|
        yield contact
      end
    end
    • contacts is an Array and #select returns an Array

  62. Fix Implementation
    class ContactList
      # @implements ContactList
      ...
      def filter
        copy = ContactList.new
        contacts.each do |contact|
          copy.contacts << contact if yield(contact)
        end
        copy
      end
    end
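
The difference between the two versions of filter can also be observed at run time. The sketch below fills in the elided parts with the definitions from the earlier slides and, as an assumption, uses plain strings in place of Address objects.

```ruby
class Contact
  attr_reader :name, :address

  def initialize(name:, address:)
    @name = name
    @address = address
  end
end

class ContactList
  attr_reader :contacts

  def initialize
    @contacts = []
  end

  # Array#select would return an Array, so a new ContactList is filled instead.
  def filter
    copy = ContactList.new
    contacts.each do |contact|
      copy.contacts << contact if yield(contact)
    end
    copy
  end
end

list = ContactList.new
list.contacts << Contact.new(name: "Matsumoto", address: "address 1")  # placeholder addresses
list.contacts << Contact.new(name: "Someone", address: "address 2")

filtered = list.filter { |contact| contact.name == "Matsumoto" }
selected = list.contacts.select { |contact| contact.name == "Matsumoto" }

puts filtered.class
puts selected.class
```

filtered is a ContactList, as the signature requires, while the select-based version only produces an Array.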

  63. Type Check
    • Are we sure?
    • There may be an unexpected fallback to the any type
    $ steep check -I contact.rbi contact.rb
    $

  64. Type Check
    $ steep check --fallback-any-is-error -I contact.rbi contact.rb
    contact.rb:26:11: FallbackAny
    26 copy = ContactList.new

  65. Annotate Ruby Programs
    class ContactList
      # @implements ContactList
      # @type const ContactList: ContactList.class
      ...
    end

  66. Type Check
    $ steep check --fallback-any-is-error -I contact.rbi contact.rb
    $

  67. Future Work
    • Support typical Ruby programming styles
    • Instead of adding annotations to all constants, infer their types
    from signatures
    • String::String=3, ([Integer, String].sample)::Foo=3
    • Type system improvements
    • Typing rule enhancements & bug fixes
    • Access control (public/private)
    • Integration with Ruby

  68. Integration with Ruby
    • Some features cannot be implemented without extending Ruby
    • Integrating annotations into Ruby syntax
    • Update: Matz rejected adding typing syntax yesterday
    • Dynamic type testing
    • Ruby only has is_a?, which tests the inheritance relation
    • We want an operator that tests the structural subtyping relation
    • Not Module#conform (because the params and return type should be checked)
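
The gap between the two kinds of test can be sketched in plain Ruby. responds_to_all? is a hypothetical helper, not an existing Ruby or Steep API, and it shows the limitation mentioned above: it compares method names only and knows nothing about parameter or return types.

```ruby
class Duck
  def quack
    "Quack"
  end
end

# Robot has the same method but is unrelated to Duck by inheritance.
class Robot
  def quack
    "Beep"
  end
end

# Hypothetical helper: checks method names only, not their types.
def responds_to_all?(object, method_names)
  method_names.all? { |name| object.respond_to?(name) }
end

robot = Robot.new
nominal    = robot.is_a?(Duck)                  # inheritance relation
structural = responds_to_all?(robot, [:quack])  # method names only

p nominal, structural
```

A real structural subtyping test would also have to compare the signatures of quack, which is information Ruby does not keep at run time.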

  69. Structural Subtyping Failure
    class A
      def initialize: (name: String) -> any
    end
    class B <: A
      def initialize: (year: Integer) -> any
      def print: () -> any
    end
    A.class == { new: (name: String) -> A, ... }
    B.class == { new: (year: Integer) -> B, ... }
    A == { class: -> A.class, ... }
    B == { class: -> B.class, ... }
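
The same mismatch can be reproduced in running Ruby. In this sketch build is a hypothetical helper that is written against A.class and therefore calls new with name:.

```ruby
class A
  def initialize(name:)
    @name = name
  end
end

class B < A
  def initialize(year:)
    @year = year
  end

  def print
    puts @year
  end
end

# Code written against A.class expects new to accept name:.
def build(klass)
  klass.new(name: "Matsumoto")
end

a = build(A)

# B inherits from A, but B.new does not accept name:.
error = begin
  build(B)
  nil
rescue ArgumentError => e
  e
end

puts a.class
puts error.class
```

B is a subclass of A, yet B.class cannot be used where A.class is expected, so the class objects are not in a structural subtyping relation even though the instances may be.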

  70. Array#zip
    class Array<'a>
      ...
      def zip: <'b> (Array<'b>) -> Array
    end

  73. Array#zip
    • Tuple types ('a * 'b)
    • Type variable extraction difficulty
    def zip: <'b> (Array<'b>) -> Array<'a * 'b>
    # @type var x: Array
    ["a"].zip(x) -> Array
    # @type var y: Foo
    ["a"].zip(y) -> Array
    ???
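
At run time Array#zip is more permissive than the signature on this slide: it accepts an Array, but also any object that responds to each. The sketch below uses a hypothetical Foo class to show the second case, which may be the kind of argument the question marks refer to.

```ruby
# Hypothetical class: not an Array, but it responds to each.
class Foo
  def each
    yield 1
    yield 2
  end
end

x = [1, 2]
with_array = ["a", "b"].zip(x)
with_foo   = ["a", "b"].zip(Foo.new)

p with_array
p with_foo
```

For x the element type can be read directly from Array<'b>. For Foo there is no type argument to read, so a checker would have to find 'b some other way, for example from the block type of each.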
