Slide deck presented at PyCon Korea 2018.
- Why and when to refactor
- How to approach refactoring
- Good and bad ways of refactoring
- Python design patterns and library features
for six or more months might as well have been written by someone else.” -- Eagleson's Law

Let’s start!
• What is refactoring?
• How does it affect code quality?
• Do you need to refactor?
  ◦ When do you need to refactor?
  ◦ Setting goals
  ◦ Refactoring timeline
• Refactoring approach
  ◦ Looking for affected code
  ◦ Common design patterns and solutions
  ◦ Single Responsibility Principle
• Results assessment
  ◦ Is it effective?
  ◦ Showing the improvements
  ◦ Future implications of refactoring
the mess of the first developers (junior devs)
• Jumping into the abyss (senior dev)
• Just don’t break any existing tests (QA)
• Do we have time for that? (project managers/leads)
• It’s okay, I guess, ermmm, can you add this new feature? (product owners)
• Re... what??!! (executives)
code quality - makes it easier for new people to understand the code
• Makes testing easier
• Makes integrations faster and more efficient
• Makes the system more modular and flexible
with leads/PM/PO for possible time allotment
• Do not refactor too early! (Usually a big mistake; it wastes a lot of time)
• Do not refactor too late! (It builds up a lot of technical debt)
• Small, incremental refactoring is ideal.
Class Usage

    class Rick:
        name = 'Rick'
        dimension = 'c137'

        def wubulubudubdub(self):
            print('I am not okay')

        # show_dimension() is duplicated in both classes
        def show_dimension(self):
            print('I am {} from dimension {}'.format(self.name, self.dimension))


    class Morty:
        name = 'Morty'
        dimension = 'c137'

        def awwwww_geeez(self):
            print('Geeeezz Rick!')

        def show_dimension(self):
            print('I am {} from dimension {}'.format(self.name, self.dimension))


    if __name__ == '__main__':
        Rick().show_dimension()
        Morty().show_dimension()
Data formatting

        if row['team'] not in temp_dict:
            temp_dict[row['team']] = {
                'members': [],
                'member_count': 0
            }
        temp_dict[row['team']]['members'].append(row)
        temp_dict[row['team']]['member_count'] += 1

    for team in temp_dict.values():
        teams.append(team)

    print(teams)
similar kinds of data handling, it is important to make assumptions about the data. Always consider the following:
• Data structure
• Ordering of data
• Type of data
• Size of data
class, use Exception as the parent class
2) Flask provides the @app.errorhandler(ErrorResponse) decorator to catch custom exceptions
3) Raise the custom exception wherever an error needs to be handled.
Solution - Flask App

    from flask import Flask, jsonify, make_response

    app = Flask(__name__)


    class ErrorResponse(Exception):
        status_code = 400  # default when no status code is given

        def __init__(self, message, status_code=None, payload=None):
            super().__init__(message)
            self.message = message
            if status_code is not None:
                self.status_code = status_code
            self.payload = payload

        def to_dict(self):
            return dict(code=self.status_code, message=self.message, data=self.payload)


    @app.errorhandler(ErrorResponse)
    def exception_encountered(error):
        error = error.to_dict()
        # You can modify this to return any kind of error
        # You can return an error page
        # You can use json return for pure API
        return make_response(jsonify(error), error['code'])
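Step 3, raising the custom exception, can be sketched without a running Flask app. This is a minimal, framework-free sketch: ErrorResponse mirrors the class on the slide, while USERS, find_user and handle are hypothetical names introduced only for illustration. handle stands in for what the @app.errorhandler hook does, so it is an approximation of the flow, not the deck's actual application code.

```python
class ErrorResponse(Exception):
    status_code = 400

    def __init__(self, message, status_code=None, payload=None):
        super().__init__(message)
        self.message = message
        if status_code is not None:
            self.status_code = status_code
        self.payload = payload

    def to_dict(self):
        return dict(code=self.status_code, message=self.message, data=self.payload)


# Hypothetical in-memory data, for illustration only.
USERS = {1: 'Rick', 2: 'Morty'}


def find_user(user_id):
    """Business logic raises the error instead of building a response itself."""
    if user_id not in USERS:
        raise ErrorResponse('User not found', status_code=404, payload={'id': user_id})
    return USERS[user_id]


def handle(user_id):
    """Stand-in for the error handler: one place turns errors into a response body."""
    try:
        return {'code': 200, 'message': 'OK', 'data': find_user(user_id)}
    except ErrorResponse as error:
        return error.to_dict()


if __name__ == '__main__':
    print(handle(1))
    print(handle(3))
```

The design point is that find_user never formats an error response; it only raises, and a single handler decides how the error is presented.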
of different classes by creating a parent class that has them as methods
2) Utilize class features such as interfaces, subclasses and singleton patterns. Singletons, for example, are very useful for storing state in any kind of application.
3) Use method overriding and method overloading
Solution - Class Usage

    class InfoFormatter:
        # Shared behaviour lives in the parent class
        def show_dimension(self):
            print('I am {} from dimension {}'.format(self.name, self.dimension))


    class Rick(InfoFormatter):
        name = 'Rick'
        dimension = 'c137'

        def wubulubudubdub(self):
            print('I am not okay')


    class Morty(InfoFormatter):
        name = 'Morty'
        dimension = 'c137'

        def awwwww_geeez(self):
            print('Geeeezz Rick!')


    if __name__ == '__main__':
        Rick().show_dimension()
        Morty().show_dimension()
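The singleton mentioned in point 2 can be sketched as follows. This is one common way to write it in Python (overriding __new__), not necessarily the approach used in the talk; AppState and its counter attribute are hypothetical names.

```python
class AppState:
    """Hypothetical singleton that keeps shared application state."""
    _instance = None

    def __new__(cls):
        # Create the instance only once; later calls return the same object.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.counter = 0
        return cls._instance

    def increment(self):
        self.counter += 1
        return self.counter


if __name__ == '__main__':
    first = AppState()
    second = AppState()
    first.increment()
    print(first is second)   # both names refer to one object
    print(second.counter)    # state changed through `first` is visible here
```

In many Python programs a module-level object serves the same purpose with less machinery, so the class-based form is mainly worth it when the state needs methods or lazy creation.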
lambda functions needed for each data type.
2) Utilize dictionaries and mapping for faster access and filtering. In Python, there are several “hashable” data types that you can use as dictionary keys.
This is a basic example, but it has big potential, especially when dealing with much more complex problems.
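The dictionary-based grouping described above can be sketched with collections.defaultdict. The sample rows and the group_by_team function are hypothetical, since the original input data is not shown; the output shape (members plus member_count per team) follows the Data formatting slide.

```python
from collections import defaultdict


def group_by_team(rows):
    """Group row dicts by their 'team' key in a single pass."""
    grouped = defaultdict(lambda: {'members': [], 'member_count': 0})
    for row in rows:
        team = grouped[row['team']]
        team['members'].append(row)
        team['member_count'] += 1
    # Dicts keep insertion order (Python 3.7+), so teams come out in first-seen order.
    return list(grouped.values())


if __name__ == '__main__':
    # Hypothetical sample data.
    rows = [
        {'name': 'Rick', 'team': 'science'},
        {'name': 'Morty', 'team': 'adventure'},
        {'name': 'Summer', 'team': 'adventure'},
    ]
    print(group_by_team(rows))
```

Using defaultdict removes the explicit "is this key present yet?" check, and each row costs one hash lookup regardless of how many teams exist.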
the data it handles
• Data structure, arrangement and data type affect the speed of your code
• Know the best data structures for different types of tasks, e.g. trees for search
• Sometimes there is a trade-off between speed and aesthetics
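How the choice of data structure affects speed can be sketched with a membership test on a list versus a set. The data size and the names below are hypothetical, and the timings printed will vary by machine; the sketch only illustrates that the same question ("is this value present?") costs a linear scan on a list but an average constant-time hash lookup on a set.

```python
import timeit

items = list(range(100_000))   # hypothetical data set
as_list = items
as_set = set(items)
target = 99_999                # worst case for the list: the last element


def in_list():
    return target in as_list   # linear scan over the list


def in_set():
    return target in as_set    # hash lookup, constant time on average


if __name__ == '__main__':
    print('list:', timeit.timeit(in_list, number=100))
    print('set: ', timeit.timeit(in_set, number=100))
```

The set costs extra memory and loses ordering and duplicates, which is one form of the trade-off mentioned above.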
Else on loops

    'n'

    # Usual implementation
    found = False
    for c in chars:
        if look_for == c:
            print('Found you!')
            found = True
            break
    if not found:
        print('Not found "{}"'.format(look_for))

    # Use else: it runs only when the loop finishes without hitting break
    for c in chars:
        if look_for == c:
            print('Found you!')
            break
    else:
        print('Not found "{}"'.format(look_for))
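as much as possible. It is good for performance.

List operations

    # Using map() with a lambda (in Python 3 this returns a lazy iterator, not a list)
    num = [1, 2, 3]
    doubles = map(lambda x: x * 2, num)

    # Using a list comprehension
    num = [1, 2, 3]
    doubles = [x * 2 for x in num]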
responsibility. E.g. an add() function should only do addition
- Keep each responsibility/use case as small as possible
- Each function/class should be able to integrate properly. E.g. add(), subtract() and multiply() should combine into a calculator() function
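The calculator example above can be sketched as follows. The function names add, subtract, multiply and calculator come from the slide; the OPERATIONS mapping, the operator symbols and the error handling are assumptions made for illustration.

```python
def add(a, b):
    """Only does addition."""
    return a + b


def subtract(a, b):
    """Only does subtraction."""
    return a - b


def multiply(a, b):
    """Only does multiplication."""
    return a * b


# Hypothetical mapping from operator symbol to the function responsible for it.
OPERATIONS = {'+': add, '-': subtract, '*': multiply}


def calculator(operator, a, b):
    """Only responsible for picking the right operation and delegating to it."""
    try:
        operation = OPERATIONS[operator]
    except KeyError:
        raise ValueError('Unsupported operator: {}'.format(operator))
    return operation(a, b)


if __name__ == '__main__':
    print(calculator('+', 2, 3))
    print(calculator('*', 4, 5))
```

Because each function has one job, adding a new operation means writing one small function and one mapping entry, without touching calculator() itself.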
quality improved?
• Is it much more readable?
• Is your testing a lot better? No tests should break randomly.
• Is it easier to expand and integrate your code?
• Is the overall system more modular?
• Create benchmarks to show quantitative data on the improvements. You can use http://pyperformance.readthedocs.io
• Ask your leads/PM to inform middle managers about the improvements
• If you created a new library as a result of refactoring, do not forget to inform other developers that the library/tool exists.
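For a quick before/after comparison of a single function, the standard library timeit module is one lightweight option alongside the tool linked above. The sketch below is hypothetical: doubles_before and doubles_after stand in for the old and refactored versions of some code, and the measured numbers will differ between machines and runs.

```python
import timeit


def doubles_before(numbers):
    """Hypothetical 'before' version: explicit loop with append."""
    result = []
    for x in numbers:
        result.append(x * 2)
    return result


def doubles_after(numbers):
    """Hypothetical 'after' version: list comprehension."""
    return [x * 2 for x in numbers]


def benchmark(func, numbers, repeat=5, number=200):
    """Return the best total time in seconds over several repeated runs."""
    return min(timeit.repeat(lambda: func(numbers), repeat=repeat, number=number))


if __name__ == '__main__':
    data = list(range(10_000))
    # Check that the refactored version behaves the same before timing it.
    assert doubles_before(data) == doubles_after(data)
    print('before: {:.4f}s'.format(benchmark(doubles_before, data)))
    print('after:  {:.4f}s'.format(benchmark(doubles_after, data)))
```

Taking the minimum over several repeats reduces noise from other processes, which makes the numbers easier to defend when showing them to leads and managers.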
Dedicate time once a month.
• Make sure your leads and managers understand the importance of a clean codebase.
• Early refactoring wastes a lot of time.
• Refactoring too late incurs a lot of technical debt.
• Use language features and design patterns as much as possible.
write stuff at least once a month.

All code examples can be found at https://github.com/pprmint/pycon_kr

If you are interested in AI and healthcare technology, you can check us out at Hacarus. We use Sparse Modeling instead of the traditional ML/Deep Learning approaches.
https://hacarus.com/
https://hacarus.com/information/tech/sparse-modeling-for-it-engineers/