Slide 1

Slide 1 text

Using F# in Production. Functional Conf, 2016. Ankit Solanki, cleartax.in, @_anks

Slide 2

Slide 2 text

About me • Co-founder at ClearTax • Lifetime functional programming novice • Still learning!

Slide 3

Slide 3 text

Outline • History: the product and why we picked F# • Wins & Losses • Scaling-up with the language • Learnings

Slide 4

Slide 4 text

The Product • cleartds.com • SAAS product for preparing TDS Returns • Quarterly deadlines for submission of TDS Returns • Launched in early 2014 (Pre-YC)

Slide 5

Slide 5 text

Product Requirements • Accuracy • Cannot afford to make any mistakes • 1,000s of people in every single TDS return • Constraints: • Arcane file format used by the department • Lack of documentation

Slide 6

Slide 6 text

Product Requirements • Flexibility • Rules change every quarter • File format changes every quarter • Constraints: • Changes announced with no warning • Updated formats applicable from the day of release

Slide 7

Slide 7 text

Product Requirements • Simplicity / Expressiveness / Speed • Need ability to launch fast • Needed to be able to run the product on auto-pilot for large chunks of time (after launch) • Ability to quickly dip-in every quarter and make changes • Constraints: • Lack of engineering resources • Lack of time

Slide 8

Slide 8 text

File Format • Flat file format, like CSV files • Different types of lines • Mode-based, hierarchical format • Large number of fields and line types • Definition changes over time • Field means X in 2015 but Y in 2016 • Different variants based on use case (submission to the department, download from the department) and time (quarter, year, form type) • Delta-submissions in case of filing a revised return • Multiple modes of correction

Slide 9

Slide 9 text

• File format main blocker • Experimented with a few approaches • Procedural code, DSLs, Mapping tools • Found F# Type Providers

Slide 10

Slide 10 text

Type Providers • Compile-time ability to deal with structured data formats • Generate types & metadata based on sample data, at compile time • Parser to read input data of that type

Slide 11

Slide 11 text

Type Providers Demo

Slide 12

Slide 12 text

Using type providers drastically simplified our actual parsing, and made our code super-readable
 deduction.Tds <- record.``TDS / TCS -Income Tax for the period``
 deduction.Surcharge <- record.``TDS / TCS -Surcharge for the period``
 deduction.EducationCess <- record.``TDS / TCS -Cess``

Slide 13

Slide 13 text

• We tested the file parser / generator • Simple strategy: • Parse sample file into your data model • Generate it • Do line-by-line, field-by-field comparison • Automated a large part of this with F#
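The round-trip strategy above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `^`-delimited toy format, the `Line` record, and the functions `parseLine`, `generateLine`, `compareFields` and `roundTripDiffs` are hypothetical stand-ins, not the actual TDS format or the ClearTax code.

```fsharp
// Hypothetical data model: one record per line, fields separated by '^'.
type Line = { LineType : string; Fields : string list }

// Parse a raw line into the data model.
let parseLine (raw : string) : Line =
    match raw.Split('^') |> List.ofArray with
    | lineType :: fields -> { LineType = lineType; Fields = fields }
    | [] -> { LineType = ""; Fields = [] }

// Generate the raw line back from the data model.
let generateLine (line : Line) : string =
    String.concat "^" (line.LineType :: line.Fields)

// Field-by-field comparison: returns (field number, expected, actual) per mismatch.
let compareFields (expected : string) (actual : string) =
    let e = expected.Split('^')
    let a = actual.Split('^')
    [ for i in 0 .. (max e.Length a.Length) - 1 do
        let ev = if i < e.Length then e.[i] else "<missing>"
        let av = if i < a.Length then a.[i] else "<missing>"
        if ev <> av then yield (i + 1, ev, av) ]

// Line-by-line round trip: parse, regenerate, compare against the sample.
let roundTripDiffs (sample : string list) =
    sample
    |> List.mapi (fun i raw ->
        let regenerated = raw |> parseLine |> generateLine
        compareFields raw regenerated
        |> List.map (fun (fieldNo, e, a) -> (i + 1, fieldNo, e, a)))
    |> List.concat
```

An empty result from `roundTripDiffs` means the generator reproduced the sample exactly; any entry points at the line and field that differ.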

Slide 14

Slide 14 text

• We ended up writing the whole product in F# • But we did not focus on making it ‘functional’ • Using F# as a better C# or Java (at first) • Core components (business logic) written in mostly functional style • Glue (controllers) written more in an imperative style

Slide 15

Slide 15 text

Our philosophy with FP • Be pragmatic • Grow with the language • Experiment, bring in parts that you feel are valuable

Slide 16

Slide 16 text

Product launch • We wanted to launch in 6-8 weeks • Very focused execution • We learned parts of the language and limited ourselves to only those parts • Started out with mostly the basics: pattern matching, currying, pipelines

Slide 17

Slide 17 text

Initial period

Slide 18

Slide 18 text

Pipelines
 // Pipeline operator itself is very simple
 
 let (|>) x fn = fn x
 
 // Usage is natural
 
 let square x = x * x
 let double x = 2 * x
 
 [1 ; 2; 3 ] |> List.map square |> List.map double
 // [2 ; 8; 18 ]
 // A little more natural than composition, at least for beginners
 // Can be arbitrarily long
 users
 |> List.map validateUsers
 |> List.filter isValid
 |> List.map getUserId

Slide 19

Slide 19 text

Pattern Matching • Since you can pattern match on a tuple with n elements, this results in the ability to flatten nested logic into a simple decision table • Huge win for readability • We might have gone over-board with this
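A minimal sketch of the "decision table" idea, matching on a 3-tuple instead of nesting if/else. The types and the rules themselves are invented for illustration and are not actual TDS validation rules.

```fsharp
// Hypothetical types, for illustration only.
type ReturnType = Original | Revised
type Validation = Accept | Warn of string | Reject of string

// Each row of the match reads like one row of a decision table:
// return type, PAN present?, amount positive?  ->  outcome
let checkDeduction (returnType : ReturnType) (hasPan : bool) (amount : decimal) =
    match returnType, hasPan, amount > 0m with
    | _,        _,     false -> Reject "Amount must be positive"
    | Original, false, true  -> Warn "PAN missing"
    | Revised,  false, true  -> Reject "PAN is required for a revised return"
    | _,        true,  true  -> Accept
```

The compiler checks the table for completeness, so a forgotten combination shows up as an incomplete-match warning rather than as a silent fall-through.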

Slide 20

Slide 20 text

Partial Application • Ability to encapsulate in a functional manner • Big aha moment when I finally understood this
 // Validations would be context dependent
 let validateDeduction year quarter returnType deduction = …
 // But at a higher level, you want a simpler signature
 type ValidateFn<'A> = 'A -> ValidationResult
 // So you can just freeze the context sensitive parameters
 // to get the specific validator you want
 let currentValidator = validateDeduction 2016 Q4 Original

Slide 21

Slide 21 text

Code as Data
 // Type definitions
 type IsColumnVisible = (TdsReturn -> bool) list
 type IsColumnEditable<'T> = ('T -> bool) list
 type ColumnSpecification<'T> = (Quotations.Expr * IsColumnVisible * IsColumnEditable<'T>) list
 // Column visibility specifiers
 let showAlways = ...
 let showForRevised = ...
 let showForGovernment = ...
 // Editing specifiers
 let editAlways = fun (d : Deduction) -> true
 let editWhenNil = fun (d : Deduction) -> d.Amount = 0
 let editWhenDateIsPresent = fun (d : Deduction) -> d.Date |> Option.isSome
 // UI Specification
 let (deductionColumns : ColumnSpecification<Deduction>) = [
     <@ fun (c : Deduction) -> c.Date @> , [ showAlways ] , [ editAlways ]
     <@ fun (c : Deduction) -> c.SectionCode @> , [ showForRevised ] , [ editWhenNil ]
     <@ fun (c : Deduction) -> c.Amount @> , [ showForRevised ; showForGovernment ] , [ editWhenNil ; editWhenDateIsPresent ]
 ]

Slide 22

Slide 22 text

ORM • ORM – we picked a C#-specific ORM [ServiceStack.OrmLite] early on • ORM was ideal for the product use case (bulk inserts, updates, simple conceptual model) • It actually worked pretty well with F# (with a minimal wrapper) • Had some issues, will go into detail later
 let loadDeductionsByName name =
     // Where clause
     let condition = <@ fun (d : Deduction) -> d.Name = name @>
     // Order-by clause
     let ordering = <@ fun (d : Deduction) -> d.CreateTimestamp @>
     // Pagination
     let currentPage = { page = 1 ; pageSize = 10 }
     // This executes:
     // select * from deduction where name = ? order by create_timestamp limit 10
     DbHelper.LoadPageWhen currentPage condition ordering

Slide 23

Slide 23 text

Mistakes were made though.

Slide 24

Slide 24 text

Design Issue: Not leveraging types • One of the main mistakes we made • ORM layer was unable to deal with F# specific types (records, tuples, discriminated unions) • This resulted in nullable values introduced in the data model • Polluted the whole codebase • Right solution would have been to isolate this in a data layer
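One way the "isolate this in a data layer" idea might look, as a sketch only. `DeductionRow`, `Deduction` and `toDomain` are hypothetical names; the row class merely imitates the kind of nullable, mutable shape a C#-oriented ORM tends to require.

```fsharp
open System

// Hypothetical row type as an ORM would see it: a plain class with nullable members.
type DeductionRow() =
    member val Name : string = null with get, set
    member val CreditDate : Nullable<DateTime> = Nullable() with get, set

// Domain type used by the rest of the code base: absence is explicit.
type Deduction = { Name : string option; CreditDate : DateTime option }

// The only function that has to know about nulls.
let toDomain (row : DeductionRow) : Deduction =
    { Name = Option.ofObj row.Name
      CreditDate = Option.ofNullable row.CreditDate }
```

With a boundary like this, code past the data layer pattern matches on `option` values and never sees `null` or `Nullable`.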

Slide 25

Slide 25 text

More type problems • Same base types (example: Tax Deduction) used throughout the product • In some ways, made things simpler • But also led to unnecessary complexity • UI may not need to know about some fields, but it still gets them • Too much capability stuffed into a single entity • We should have defined more granular types • If I started over, would have spent more time getting the types right, building a layered architecture

Slide 26

Slide 26 text

Performance: Lazy Evaluation
F# Sequences and their transformations are lazy
 let squares =
     [1 ; 2 ]
     |> Seq.map (fun i ->
         printfn "%d" i
         i * i )
 printfn "%A" squares
 // 1
 // 2
 // seq [1; 4]
 printfn "%A" squares // prints again
 // 1
 // 2
 // seq [1; 4]
Usually, laziness is what we want. But responsibility lies with caller. Sometimes, this can lead to very expensive operations being repeated.
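Two common ways a caller can avoid the repeated work are to force the sequence once into a list, or to wrap it in `Seq.cache`. The sketch below counts evaluations with a ref cell; the counter exists only for illustration.

```fsharp
// Counts how many times the mapping function runs.
let evaluations = ref 0

let squares =
    [1; 2]
    |> Seq.map (fun i ->
        evaluations.Value <- evaluations.Value + 1
        i * i)

// Each enumeration of a lazy sequence re-runs the mapping function.
let first = Seq.sum squares
let second = Seq.sum squares

// Option 1: force the sequence once into an eager list and reuse the list.
let squaresList = List.ofSeq squares
let fromList = List.sum squaresList

// Option 2: Seq.cache remembers elements the first time they are computed.
let cached = Seq.cache squares
let third = Seq.sum cached    // computes and stores the elements
let fourth = Seq.sum cached   // served from the cache, no re-evaluation
```

`List.ofSeq` pays the full cost up front, while `Seq.cache` stays lazy and only stores what has actually been enumerated.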

Slide 27

Slide 27 text

Performance: Expression Trees • Or “Code Quotations” – language level feature of F# • Expression trees that you can work with programmatically • Example:
 let ordering = <@ fun (d : Deduction) -> d.CreateTimestamp @>
• Possible to inspect this expression and do code generation, evaluate it, get the property name it refers to, etc

Slide 28

Slide 28 text

Performance: Expression Trees (continued) • These are actually fairly expensive to build • Relatively slow, even when compared to operations like creating a new function • First version of our application used quotations while generating the final TDS return • One quotation per field, per line • Profiler said that 99.99% time was spent building the trees or traversing these trees • Refactored the code to get a 100x speed improvement
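The slides do not say how the code was refactored. One plausible shape, shown here purely as an assumption, is to build and inspect each quotation once and reuse the result, instead of doing so per field and per line. The `Deduction` record and the functions `propertyName`, `slowLabels` and `fastLabels` are hypothetical.

```fsharp
open Microsoft.FSharp.Quotations
open Microsoft.FSharp.Quotations.Patterns

// Hypothetical record, for illustration only.
type Deduction = { Name : string; Amount : decimal }

// Extract the property name from a quotation of the form `fun d -> d.Prop`.
let propertyName (q : Expr) : string option =
    match q with
    | Lambda (_, PropertyGet (_, prop, _)) -> Some prop.Name
    | _ -> None

// Slow shape: a quotation is built and traversed again for every record.
let slowLabels (deductions : Deduction list) =
    deductions
    |> List.map (fun d -> (propertyName <@@ fun (x : Deduction) -> x.Amount @@>, d.Amount))

// Faster shape: build and inspect the quotation once, then reuse the result.
let amountColumn = propertyName <@@ fun (x : Deduction) -> x.Amount @@>

let fastLabels (deductions : Deduction list) =
    deductions |> List.map (fun d -> (amountColumn, d.Amount))
```

Both functions return the same data; only the number of quotations constructed differs.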

Slide 29

Slide 29 text

Data Structure Selection • This was a relatively minor issue, but still painful • F# has its own parallel data structures (List, Map, etc) • Different from standard C# structures • Immutable in design • Picking the right data structure was tricky • Libraries would work with C# lists, not F# lists, we had to convert • Also, default list in F# is a linked-list, not an array list
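A short sketch of the conversions involved, using only core library functions. `ResizeArray<'T>` is F#'s alias for `System.Collections.Generic.List<'T>`, the type C# calls `List<T>`.

```fsharp
// F# list: immutable, singly linked.
let fsharpList : int list = [1; 2; 3]

// F# list -> C#-style List<T>, for libraries that expect it.
let forCSharp : ResizeArray<int> = ResizeArray<int>(fsharpList)

// C#-style List<T> -> F# list, on the way back.
let backToFSharp : int list = List.ofSeq forCSharp

// Arrays give constant-time indexed access; indexing a linked list walks it.
let asArray : int[] = Array.ofList fsharpList
let thirdFromArray = asArray.[2]
let thirdFromList = List.item 2 fsharpList
```

Each conversion copies the elements, so converting back and forth inside a hot loop is worth avoiding.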

Slide 30

Slide 30 text

Some tooling issues

Slide 31

Slide 31 text

Tooling Issues: IDE • Visual Studio tooling for F# was not great in the beginning • We had crashes, slowdowns, etc • Example: • F# does not allow cyclic dependencies • Order of files in the project matters • Visual Studio (2013) actually did not have options to insert file at a particular location or re-order files • We hand-edited project files for a long time

Slide 32

Slide 32 text

IDE (continued) • F# support in Visual Studio is better now • Still not ideal, not on-par with C# • F# compiler much slower than C# compiler, for example • Will take some time to catch up • F# story outside Windows is also good, now

Slide 33

Slide 33 text

Tooling Issues: Language / Compiler Versions • Hit by this several times • Subtle differences in compiler versions or language level support caused compilation to fail during deployment • "Works on my machine", though • Resolving this was very painful

Slide 34

Slide 34 text

Maturing with the language

Slide 35

Slide 35 text

Computation Expressions • Syntactic sugar for monads • (Let's not talk about monads) • Will let you design 'workflows', flatten your logic even further • Simplified our business logic

Slide 36

Slide 36 text

 // Sugared syntax using the maybe computation expression
 // 'maybe' is not a built-in
 let lateFine = maybe {
     // CreditDate, TdsDate are option types, can be None if not entered
     let! creditDate = deduction.CreditDate
     let! deductionDate = deduction.TdsDate
     let diffInMonths = getDifferenceInMonth creditDate deductionDate
     return (calculateFine deduction.taxDeducted diffInMonths)
 }
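Since `maybe` is not built in, it has to be defined somewhere. A minimal builder that supports `let!` and `return` could look like the sketch below; `MaybeBuilder` and `addOptions` are illustrative names, and this is not necessarily the definition used in the product.

```fsharp
type MaybeBuilder() =
    // let! : continue with the rest of the block only when the option has a value
    member this.Bind (value : 'a option, next : 'a -> 'b option) : 'b option =
        Option.bind next value
    // return : wrap a plain value back into an option
    member this.Return (value : 'a) : 'a option =
        Some value

let maybe = MaybeBuilder()

// Hypothetical usage with two optional inputs.
let addOptions (a : int option) (b : int option) : int option =
    maybe {
        let! x = a
        let! y = b
        return x + y
    }
```

If either input is `None`, the block stops at that `let!` and the whole expression evaluates to `None`.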

Slide 37

Slide 37 text

 // De-sugared
 let lateFine =
     match deduction.CreditDate with
     | None -> None
     | Some c ->
         match deduction.TdsDate with
         | None -> None
         | Some d ->
             let diffInMonths = getDifferenceInMonth c d
             Some (calculateFine deduction.taxDeducted diffInMonths)

Slide 38

Slide 38 text

Big win with 'glue' code • The ‘glue’ logic in controllers (written in a mostly imperative style) ended up being more and more complex • Needed to handle different use cases, UI states • Things became difficult to reason about • We started to use computation expressions to simplify it (where suitable)

Slide 39

Slide 39 text

Type annotations are a win • Initially, we omitted type annotations in most places • “Compiler is smart enough to infer types, why do I need to mention them?” • Code was difficult to read, and error messages were a little cryptic at times • Type inference would pick up the wrong type for the function if you made a mistake • We started adding annotations to functions • Also started defining type aliases for common combinations
 
 let d : ColumnSpecification<'T> = …
 is more readable than
 let d : ((Quotations.Expr * IsColumnVisible * IsColumnEditable<'T>) list)

Slide 40

Slide 40 text

Union types • If doing a do-over, would especially focus on algebraic types / union types • Data model would be richer if we had made proper use of them • Mostly ended up using union types at leaf level (individual fields), did not use at higher levels (entity level, composition of entities)
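To make the leaf-level versus entity-level distinction concrete, here is a sketch of what an entity-level union might look like. Every name, case and field below is hypothetical and chosen only to echo terms used earlier in the talk; it does not describe the real data model.

```fsharp
// Leaf-level union: a single field with a closed set of values.
type Quarter = Q1 | Q2 | Q3 | Q4

// Hypothetical correction modes for a revised return.
type CorrectionMode =
    | UpdateDeductee
    | AddChallan
    | CancelReturn

// Entity-level union: the shape of the whole return depends on its kind,
// so a revised return cannot exist without a correction mode.
type TdsReturn =
    | Original of year : int * quarter : Quarter
    | Revised of year : int * quarter : Quarter * correction : CorrectionMode

let describe (r : TdsReturn) : string =
    match r with
    | Original (year, quarter) ->
        sprintf "Original return for %A %d" quarter year
    | Revised (year, quarter, mode) ->
        sprintf "Revised return (%A) for %A %d" mode quarter year
```

Modelling the entity this way moves checks such as "a revised return must carry a correction mode" from runtime validation into the type itself.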

Slide 41

Slide 41 text

Final thoughts

Slide 42

Slide 42 text

On-boarding engineers • We thought F# would be hard for people to learn • Current team: One person with prior Haskell experience, one JavaScript programmer and one fresher • Everyone was able to ramp-up and start using F# very quickly • In-depth understanding takes time though • Basics are really easy to pick up

Slide 43

Slide 43 text

Design matters • Language will not solve problems for you • No substitute for good design • The areas where we took shortcuts came back to haunt us • Functional languages like F# do help, if you are willing to listen • Not a silver bullet

Slide 44

Slide 44 text

F# on the CLR • When we started, fewer F# specific libraries • Ability to use any C# library, but they often didn't feel natural • We made compromises here. Used some sub-optimal tools • Situation is better now. F# ecosystem is vibrant and growing!

Slide 45

Slide 45 text

F# at ClearTax right now • TDS Product fully built & running on F# • Some features of ClearTax built in F# • Majority of ClearTax is still C# • Mostly because of tooling issues: not possible to mix and match languages within a single 'project' • Browser based testing – Canopy, an F# DSL over Selenium

Slide 46

Slide 46 text

Q&A

Slide 47

Slide 47 text

Thank you! You can contact me on Twitter: @_anks

Slide 48

Slide 48 text

Resources for F# • Online • F# For Fun And Profit • F# Programming WikiBook • Books • F# Deep Dives • F# Applied