
Taking back control of your code by Nickolas Grigoriadis

Pycon ZA
October 06, 2017


Managing your code is like managing anything else.
To take back control of your code, you need to measure everything you can!

This talk is about my experience of wrestling an unruly codebase into a well-behaved one.
Well, better behaved, at least...

In this talk I'll cover the following, and how they affect your Python application:

Complexity: and how it differs from ease
Risk: not all code is equally important
Static analysis: don't fear the pylint, whilst typing + mypy can save the day
Testing: why & when to write tests (hypothesis is awesome)
Formally design your internal data format: whilst it seems obvious, internal data formats are often left unchecked
Profiling: a good programmer with a profiler is better than one without

Transcript

  1. Taking back control of your code-base. Well, better behaved, at least...
    By Nickolas Grigoriadis, a lead developer at Feersum (www.feersum.io) / Praekelt (www.praekelt.com).
  2. Why? I had to maintain a code base whose requirements could not be pinned down. Hence, with very little time, it got incredibly hairy! Here I am sharing some strategies and tools that I used to help me take back control of my own code base. “If you can’t measure it you can’t manage it.”
  3. Agenda
    • Preamble
    • Simplicity
    • Risk
    • Static analysis
    • Testing
    • Design internal data format
    • Profiling
  4. Preamble
    Common strategies to manage code bases:
    • Static analysers
    • Monitoring & Logging
    • Testing
    Some under-utilised strategies:
    • Loose coupling
    • Complexity management
    • Performance management
  5. Have a dev/QA environment that mirrors your production environment closely.
    • Prepare your environment to record unexpected failures.
    • Log everything feasible; this helps in finding out what went wrong — use raven & Sentry.
    • Set up monitoring.
    Always automate provisioning & deployment!! (at least as much as is feasible)
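
    A minimal sketch of the raven/Sentry wiring the slide refers to; the DSN and the do_risky_thing() call are placeholders. Once the handler is attached, error-level log records carry their tracebacks into Sentry:

        import logging

        from raven import Client
        from raven.conf import setup_logging
        from raven.handlers.logging import SentryHandler

        # Placeholder DSN -- use the one from your Sentry project settings.
        client = Client("https://<key>@sentry.example.com/<project-id>")

        # Forward error-level (and above) log records to Sentry.
        handler = SentryHandler(client)
        handler.setLevel(logging.ERROR)
        setup_logging(handler)

        log = logging.getLogger(__name__)
        try:
            do_risky_thing()  # hypothetical application call
        except Exception:
            # exc_info=True attaches the traceback so Sentry can group the event.
            log.error("unexpected failure in message handler", exc_info=True)
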
  6. Use the really nice, new features of Python 3:
    • Type annotations
    • First-class dict merge operations: e = {"a": "a", "b": "b", **c, **d}
    • super() actually works as you expect
    • Distinct separation between text and binary. I find it helps differentiate human content and machine content.
    • yield from, async & await make reasoning about async applications easy
    • etc...
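
    A quick illustration of the features listed above (all names are arbitrary):

        from typing import Dict

        def merge_config(c: Dict[str, str], d: Dict[str, str]) -> Dict[str, str]:
            # First-class dict merging: later mappings win on key clashes.
            return {"a": "a", "b": "b", **c, **d}

        class Base:
            def greet(self) -> str:
                return "hello"

        class Child(Base):
            def greet(self) -> str:
                # Zero-argument super() resolves the MRO for you.
                return super().greet() + " world"

        # Distinct text (str) vs binary (bytes): machine content stays bytes.
        payload: bytes = "hello world".encode("utf-8")

        def first_three():
            # yield from delegates to a sub-generator without manual looping.
            yield from range(3)
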
  7. Simplicity
    There is often a big difference between what is simple and what is easy. If you want more predictable outcomes, you should favour simplicity over ease of implementation. e.g. SOAP vs JSON-REST
  8. Assuming “perfect” implementations of both:
    SOAP is “easy” to use: it provides service discovery; it provides data-types and validation rules.
    JSON-REST is “hard” to use: you have to go read docs! You have to trust that your implementation is to spec.
  9. SOAP is “complicated”: it is an XML document (data + formatting) describing a series of endpoints that uses an XML document query wrapping an XML document payload containing data. Many types, customisable.
    JSON-REST is “simple”: it uses HTTP path and verb to access an endpoint. Data is encoded via JSON. Few types.
  10. Due to the complexity of SOAP, almost no implementation is actually to spec, meaning that one always has to deal with something unexpected. In contrast, JSON-REST's simplicity allows it to be used much more reliably. It also tends to perform significantly better.
  11. Risk
    Not all code is equally important. Consider marking the code in levels of risk:
    Low: Things that are run rarely, or only ever under supervision (setup/migration related).
    High: Anything that is absolutely core to your code base, involves money, safety, or has a tendency to break.
    Medium: Everything else.
  12. Allows you to focus on what is important.
    Low:
    • Linters & type checkers
    • Depend on a skilled person to supervise
    Medium:
    • “Spec”/“Integration”-level unit tests
    • Behavioural testing (BDD)
    • Fuzzers, e.g. hypothesis
    High:
    • Full unit testing aiming for 100% coverage
    • Force yourself to re-think the critical details
  13. Static analysis
    By using and taking some static analysers seriously, you catch real bugs before you even get to writing tests for them. And it helps to keep a consistent minimum bar of quality. Yes, you probably use something like flake8 for style checking, but that isn’t what I’m talking about.
  14. Pylint:
    • Often too strict out of the box
    • But its suggestions really are quite sensible most of the time.
    • Some of the style checks are arguably silly, or slightly different to flake8, if you use it.
    At least run it as pylint -E. Best to start a project with it, but generally applying it to a legacy code base actually finds bugs...
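
    A rough illustration (module and names are made up) of the kind of definite error pylint -E reports while skipping the style complaints:

        # toy_module.py
        def total_price(items):
            total = 0
            for item in items:
                total += item.price
            return totl  # E0602 undefined-variable: a NameError waiting to happen

        # Run only the error-class checks:
        #   pylint -E toy_module.py
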
  15. Mypy (and import typing):
    • You define types in a similar way to type-strict languages
    • But optionally, with configurable strictness
    • Guido endorses it
    It finds where you passed in an incorrect, but almost compatible, parameter. Great for helping you refactor large swaths of code at once. Helps you get better code-completion in an IDE that supports it, e.g. PyCharm.
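
    A small made-up example of the “almost compatible parameter” case; mypy flags both calls below even though the module imports cleanly (the file name handlers.py is arbitrary):

        from typing import Optional

        def send_reply(session_id: str, message: str, retries: int = 3) -> None:
            ...

        def handle_message(session_id: Optional[int], text: str) -> None:
            send_reply(session_id, text)   # error: Optional[int] where str expected
            send_reply(text, session_id)   # error: arguments swapped

        # Check with:
        #   mypy handlers.py
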
  16. Testing
    If you can, have a manual tester. It’s their job to break things in ways you can't comprehend. It’s important that you encourage them to do that. Even if it makes you cry. Of course automated testing is critical as well. Use different testing tools for different things in the same project where they make sense.
  17. Why should you write tests?
    Ensure future control:
    • Protect against accidental breakage or regressions
    • Makes refactoring easier
    Improve confidence in your code base:
    • Validating that your product does on a high level what you expect it to
    • Forces yourself to re-think many critical tiny details
  18. I’m a great fan of behave, a BDD implementation:
    • BDD tools are state-machines tailored for testing.
    • With this great Given, When, Then interface.
    • That is very close to manual test plans.
    • Unfortunately they don’t interpret things like humans, so a developer is still required to automate.
    Often used for automating Selenium, but I found it works fantastically for testing any work flow or API that expects dependent interactions.
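
    A sketch of what behave step definitions look like; the feature wording, step names and the make_test_client() helper are invented for illustration:

        # features/steps/session_steps.py
        #
        # Matching feature file (features/session.feature):
        #   Given a new user "alice"
        #   When the user sends the message "hi"
        #   Then the reply contains "Welcome"

        from behave import given, when, then

        @given('a new user "{username}"')
        def step_new_user(context, username):
            context.client = make_test_client()   # hypothetical test helper
            context.user = context.client.create_user(username)

        @when('the user sends the message "{text}"')
        def step_send_message(context, text):
            context.reply = context.client.send(context.user, text)

        @then('the reply contains "{expected}"')
        def step_check_reply(context, expected):
            assert expected in context.reply
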
  19. Another fantastic testing tool is hypothesis. It allows you to build a strategy for input data, from which it can then generate syntactically correct, fuzzer-like data. It is especially useful in these scenarios:
    • Robustness of data handling.
    • Conformance to some spec.
    • Tests that do symmetric data transformations.
    It really finds bugs!
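
    A minimal symmetric-transformation test: hypothesis builds JSON-ish payloads from the strategy and asserts the encode/decode round trip is lossless (plain json stands in for the real serialiser here):

        import json

        from hypothesis import given
        from hypothesis import strategies as st

        payloads = st.dictionaries(
            keys=st.text(min_size=1),
            values=st.one_of(st.integers(), st.text(), st.booleans()),
        )

        @given(payloads)
        def test_encode_decode_roundtrip(payload):
            # If decode(encode(x)) != x for any generated payload,
            # hypothesis shrinks it to a minimal failing example.
            assert json.loads(json.dumps(payload)) == payload
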
  20. Use coverage, with branch tracking. Use it as a tool to write tests for the things you missed. But don’t always REQUIRE 100% coverage. Except for High-risk code sections, of course. Generally getting ~80% test coverage is good enough to help you during large refactoring.
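
    A toy example (names invented) of why branch tracking matters: line coverage reports 100% even though the non-member arm of the if is never exercised, while branch coverage flags it:

        # pricing.py
        def apply_discount(total, is_member):
            if is_member:
                total *= 0.9
            return total

        # test_pricing.py -- only ever tests the member path.
        def test_member_discount():
            assert apply_discount(100, True) == 90

        # Line coverage: 100%. Branch coverage shows the untested False arm:
        #   coverage run --branch -m pytest
        #   coverage report -m
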
  21. Design internal data format
    Why should I formally design my data interchange?
    • Gives a bird's-eye view of the relevant sections
    • Allows loosely connected components
    • Helps to make dependencies non-cyclic
    • Makes it easier to refactor your code base
    One could use logic-less, slotted attrs classes, or any common schema such as JSONSchema.
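
    One possible shape for such a logic-less, slotted attrs value type (the field names are illustrative, not the actual Feersum schema):

        import attr

        @attr.s(slots=True, frozen=True)
        class InboundMessage:
            session_id = attr.ib(validator=attr.validators.instance_of(str))
            channel = attr.ib(validator=attr.validators.instance_of(str))
            text = attr.ib(validator=attr.validators.instance_of(str))
            timestamp = attr.ib(validator=attr.validators.instance_of(str))  # ISO8601 string

        # Components only exchange InboundMessage instances (or attr.asdict(msg)),
        # so malformed data fails loudly at the component boundary.
        msg = InboundMessage(session_id="abc123", channel="whatsapp",
                             text="hi", timestamp="2017-10-06T09:00:00+00:00")
        payload = attr.asdict(msg)
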
  22. How does this help refactoring?
    • When the unexpected happens, you can get a bird's-eye view of the breaking changes early on, and therefore potentially avoid a bad decision.
    • Loose coupling allows you to refactor a now-isolated portion of the code base at a time.
    • Moves unexpected errors to component boundaries, as it protects other components from unexpected data (the schema won't allow it).
    • You can feed your schema into hypothesis :-)
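
    One way to “feed your schema into hypothesis”, hand-rolled here with st.builds rather than a schema-to-strategy library; the message type and the handle_inbound() entry point are illustrative:

        import attr
        from hypothesis import given
        from hypothesis import strategies as st

        @attr.s(slots=True, frozen=True)
        class InboundMessage:
            session_id = attr.ib()
            channel = attr.ib()
            text = attr.ib()

        # st.builds() combines the class with per-field strategies, giving a
        # generator of valid instances to throw at the component boundary.
        messages = st.builds(
            InboundMessage,
            session_id=st.uuids().map(str),
            channel=st.sampled_from(["ussd", "whatsapp", "web"]),
            text=st.text(max_size=160),
        )

        @given(messages)
        def test_handler_never_crashes(message):
            handle_inbound(message)   # hypothetical entry point under test
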
  23. Profiling
    A good programmer with a profiler is better than one without. Enter vmprof, a great profiling tool for Python. Herewith follows a real-world sample of optimising Feersum Engine's “message-in” handler, including the stupid stuff I did because I made baseless assumptions...
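
    A sketch of wrapping a hot path with vmprof's in-process API (handle_message_in() and sample_payload are hypothetical); the resulting file can be inspected with vmprofshow or the vmprof web UI:

        import vmprof

        with open("message_in.prof", "w+b") as profile_file:
            vmprof.enable(profile_file.fileno())
            try:
                for _ in range(1000):
                    handle_message_in(sample_payload)   # hypothetical hot path
            finally:
                vmprof.disable()

        # Or profile a whole script without touching the code:
        #   python -m vmprof --web my_script.py
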
  24. Baseline: ~12.24ms (81.72/s)... WHAAAAT? Why so slow? I decided that 400/s would be a good target.
    After removing generic Django model caching “optimisation”: ~9.21ms (108.55/s) (Hahaha!)
    After meticulously optimising code that I expected to take the majority of time: ~7.23ms (138.23/s) (Disappointing)
  25. On the first use of vmprof, I spotted that schema validation (jsonschema) was taking half the time.
    Changed to a faster schema validator, fastjsonschema: 4.47ms (223.70/s) (See, profilers make you a better programmer)
    Next, vmprof pointed out dateparse as a time-hog. Since all the times are in ISO8601 format, changed to the iso8601 module: 3.50ms (285.64/s) (Wow, date parsing sucks!)
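
    A sketch of both swaps (the schema and field names are toy examples): fastjsonschema compiles the schema to a plain function once, and iso8601 parses only ISO 8601 instead of guessing formats:

        import fastjsonschema
        import iso8601

        SCHEMA = {
            "type": "object",
            "properties": {
                "text": {"type": "string"},
                "timestamp": {"type": "string"},
            },
            "required": ["text", "timestamp"],
        }

        # Compile once at import time; validation is then a cheap function call.
        validate_message = fastjsonschema.compile(SCHEMA)

        def parse_inbound(payload):
            validate_message(payload)                      # raises on bad data
            when = iso8601.parse_date(payload["timestamp"])
            return payload["text"], when
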
  26. Changed logging from Django-ORM to a direct-SQL logger: 3.04ms (328.92/s) (See how a nice fancy ORM can get in the way of good performance?)
    Changed session store from a Django-ORM model to Redis: 1.42ms (703.57/s) (Yay!!)
    After this, vmprof showed nothing obvious left.
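
    A minimal sketch of a Redis-backed session store of the kind that replaced the ORM model (the key format, TTL and JSON encoding are assumptions):

        import json
        import redis

        _redis = redis.StrictRedis(host="localhost", port=6379, db=0)
        SESSION_TTL = 60 * 60  # assumed: sessions expire after an hour

        def save_session(session_id, state):
            # One round trip instead of an ORM SELECT-then-UPDATE.
            _redis.setex("session:%s" % session_id, SESSION_TTL, json.dumps(state))

        def load_session(session_id):
            raw = _redis.get("session:%s" % session_id)
            return json.loads(raw) if raw is not None else None
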
  27. Then, after some manual testing, everything was a bit wonky. Fixing the transient storage bug reduced performance to: 1.82ms (555.79/s) (Ah well)
    This all took less than 2 days...
  28. The last two changes could only happen because dependencies were flowing in one direction only; that was the case because, a few months earlier in the project, we had put a formal internal interface in the way. Without the formal internal interface, this would not have been possible in my time-box of 3 days. Without all the tests, it probably would have been flaky for the following month.
  29. After teasing apart the work flow engine (and then running it through a profiler, of course), that same metric is now ~1280/s. This was largely possible due to the clear separation of concerns via the schema persistence interface, which significantly simplified caching. I could probably do even more, e.g. I'm now spending a disproportionate amount of time logging...
  30. [Chart: “Performance over time”. Throughput per version on a log scale, roughly 92.52, 122.3, 555.79 and 1287.49 msgs/s across versions v0.5, v0.6, v0.8 and v0.10, with series for Existing User, New User, No-Channel, Continue, Create Work flow.]
  31. Phew!! All done!! Thanks for listening :-)
    Nickolas Grigoriadis ([email protected]), developer at Feersum (www.feersum.io) and Praekelt (www.praekelt.com). Github user: grigi