
"Automating infrastructure at SA Home Loans with Python (and friends)" by Kim van Wyk

Pycon ZA
October 10, 2019


SA Home Loans develops most of its business software in-house, with 5 agile teams of developers, database specialists and testers. Each team is provided with an isolated virtualised clone of the production environment, comprising more than 20 Windows servers, 5 Linux servers and various databases and supporting infrastructure. A combination of Python, Node.js, bash and Powershell is used to glue a variety of open-source apps together and spin up new labs in as automated a fashion as possible. A main focus of this talk is illustrating the tools and methods SAHL uses or has written to do this work, replacing the week of error-prone manual effort previously required to create each lab.

This talk will aim to show that this kind of automation need not be complex or require a large team. The easy-to-learn nature of Python has allowed SAHL's existing developers and devops engineers to work on the above systems after about 2 days of internally-developed Python training and a bit of at-desk assistance when needed.

The talk will also discuss some of the lessons learned while moving away from manual methods to an automated approach.


Transcript

  1. PROD SERVERS
     - ±40 virtualised Windows servers
     - Internally-developed Windows services
     - Windows infrastructure: Active Directory, Exchange, IIS
     - 7 Ubuntu servers
     - Docker hosts

  2. LABS AND TEAMS
     - 18 sandboxed virtualised clones of the prod environment, serving: 5 Agile development teams; front-line support; DBAs; Platforms/Devops
     - Additional infrastructure: NATing; DNS entries; consistent internal host names

  3. PROBLEMS WITH LABS
     - Labs built manually
     - Labour-intensive: 1 week per lab
     - Error-prone: consistency almost impossible
     - No DIY capacity: teams reliant on Devops availability

  4. Adopted several on-premise Open Source tools:
     - Machine Image Creator: Packer
     - Orchestration: Rundeck
     - Virtualisation: OpenStack
     - Containerisation: Docker
     - Container Registry: Harbor
     - Git and Continuous Integration Tooling: Gitlab
     - Project generator: Yeoman
     - Secret Management: Vault
     - Key/Value Store: etcd
     - Web Server: NGINX
     A large portion of the infrastructure is executed on VMWare hosts.

  5. PACKER TEMPLATES
     - Almost all servers in prod and labs derived from source-controlled Packer templates
     - OpenSSH installed on Windows boxes
     - Monitoring and logging tools (Filebeat) installed, feeding into Logstash for log shipping: Docker metrics, Windows events
     - Common public SSH key added to every server
     - Python and some useful libraries installed

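The deck doesn't show how these templates are driven, but a small Python wrapper is one plausible approach. The `packer build`, `-var` and `-var-file` flags below are standard Packer CLI usage; the template and variable names are invented for illustration:

```python
import subprocess

def packer_build_cmd(template_path, var_file=None, variables=None):
    """Build the argv for a `packer build` invocation.

    The template filename and variable names are illustrative only;
    real templates and variables will differ per environment.
    """
    cmd = ["packer", "build"]
    if var_file:
        cmd += ["-var-file", var_file]
    for key, value in (variables or {}).items():
        cmd += ["-var", f"{key}={value}"]
    cmd.append(template_path)
    return cmd

def build_image(template_path, **kwargs):
    # Run Packer and raise if the image build fails.
    subprocess.run(packer_build_cmd(template_path, **kwargs), check=True)

# packer_build_cmd("windows-base.json", variables={"lab": "team-a"})
# -> ['packer', 'build', '-var', 'lab=team-a', 'windows-base.json']
```

Separating command construction from execution makes the wrapper easy to test without actually invoking Packer.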
  6. RUNDECK
     - Orchestration jobs defined in YAML
     - Used to execute scripts on specific targets: Python or Bash in most cases; Powershell to control VMWare and Windows hosts
     - Comprehensive scheduling and threading
     - Job maintenance and configuration via API
     - Full history aids with audit trail

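As a minimal sketch of driving Rundeck via its API from the standard library: the `/api/{version}/job/{id}/run` endpoint and `X-Rundeck-Auth-Token` header follow Rundeck's documented web API, while the server URL, token and job id below are placeholders:

```python
import json
import urllib.request

def run_job_request(base_url, api_token, job_id, options=None):
    """Build an authenticated POST request to trigger a Rundeck job.

    base_url, api_token and job_id are placeholders; the endpoint and
    auth header follow Rundeck's API conventions.
    """
    url = f"{base_url}/api/41/job/{job_id}/run"
    body = json.dumps({"options": options or {}}).encode()
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "X-Rundeck-Auth-Token": api_token,
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )

# To actually trigger the job:
# with urllib.request.urlopen(run_job_request(...)) as resp:
#     execution = json.load(resp)
```

The JSON response from a real call contains the execution id, which is what feeds the audit-trail history mentioned above.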
  7. WHY RUNDECK?
     - Could have used Puppet, Ansible, Chef etc.
     - GUI easy for non-development teams to drive
     - YAML-based config fairly easy to understand
     - Does the job well enough to allow moving on to the next problem

  8. RUNDECK USAGE
     - Internal Python/Node.js tool builds Rundeck job files from a simplified YAML config
     - Rundeck jobs use internal Python/Bash tooling to upload new jobs from an internal Git repo
     - Developers can add or modify jobs without needing to understand the full complexity
     - Git branching allows teams to develop specific jobs without affecting other teams

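The internal generator isn't shown in the deck; a sketch of the idea is a function that expands a simplified job description into a full Rundeck-style job map. The simplified schema (name/target/script) is invented here, and the output keys only approximate Rundeck's YAML job format — a real generator would cover many more fields and dump the result as YAML:

```python
def expand_job(simple):
    """Expand a simplified job description into a Rundeck-style job map.

    Input keys (name/target/script) are an invented simplified schema;
    output keys loosely follow Rundeck's YAML job definition format.
    """
    return {
        "name": simple["name"],
        "group": simple.get("group", "labs"),
        "description": simple.get("description", ""),
        "nodefilters": {"filter": simple["target"]},
        "sequence": {
            "keepgoing": False,
            "commands": [{"exec": simple["script"]}],
        },
    }

# A real tool would serialise [expand_job(j) for j in jobs] to YAML
# and upload it via the Rundeck API.
```

The point of the indirection is that developers only ever touch the small simplified schema, not the full Rundeck job format.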
  9. VALIDATION, MONITORING & CONTROL
     - Infrastructure validation via InSpec
     - Monitoring tools deployed via Docker container to each lab: Prometheus, ELK Stack, Grafana
     - Rundeck and the above services all deployed as Docker containers
     - Internally developed tooling also Dockerised
     - Docker control via Portainer

  10. CONTAINER DEVELOPMENT & DEPLOYMENT
      - Third-party and internally-developed Docker images served by Harbor
      - Teams can upload images to their own libraries as they wish
      - Promotion to the prod library and subsequent deployment controlled via ticketing; applied to both third-party and internal images
      - Gitlab CI tooling ensures consistent and functional images
      - Yeoman templating aids greatly in eliminating common mistakes

  11. EXISTING SCRIPTS
      - Docker containers useful to wrap a consistent interface around existing scripts
      - "Black box" nature allows support teams to execute jobs without needing to know several languages
      - Rundeck deployment adds a level of auditing that is otherwise manually tracked

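One way to picture that consistent interface is a helper that hides every wrapped script behind the same `docker run` call, whatever language the script inside is written in. The image name and environment variables below are invented examples:

```python
import subprocess

def docker_run_cmd(image, args=(), env=None):
    """Build a uniform `docker run` argv for a wrapped script image.

    The image name and env vars are whatever the wrapped script needs;
    support staff only ever see this one interface.
    """
    cmd = ["docker", "run", "--rm"]
    for key, value in (env or {}).items():
        cmd += ["-e", f"{key}={value}"]
    cmd.append(image)
    cmd.extend(args)
    return cmd

def run_wrapped(image, **kwargs):
    # Execute the container and surface a non-zero exit as an error.
    subprocess.run(docker_run_cmd(image, **kwargs), check=True)
```

Because the container is a black box, the same call shape works whether the script inside is Python, Bash or Powershell.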
  12. ADVANTAGES
      - Rundeck, Portainer and monitoring tools allow teams to solve ±80% of day-to-day issues in their labs without DevOps team support
      - Implemented over 6 months by a 3-member team of senior developers
      - Supported by a 6-member team of various experience levels
      - Python and Node training provided in-house over 2 weeks was sufficient to enable this support

  13. Backups performed for SQL Server and Postgres databases
      - Originally a collection of SQL Server jobs and Bash
      - Moved to Python-based Docker containers, scheduled via a dedicated DBA Rundeck
      - Same host can be used to interact with all databases
      - Consistent interface across all database types and schemas
      - Different operations all handled in the same way: backup to local storage; copy of backup files to other hosts; restoring backup files; standardised logshipping

  14. Common behaviour baked into all the images:
      - Vault stores DB and file host access credentials
      - pywinrm for file system operations on Windows hosts
      - pymssql to execute SQL on SQL Server instances
      - etcd to store current state of local backup, copy and restore to DR servers
      - YAML config served from internal Git server, pulled directly by the image
      - Explanatory commandline parser via argparse

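The argparse part of that common behaviour might look roughly like the skeleton below. The subcommand and option names are illustrative, not SAHL's actual interface, and the Vault/etcd/pymssql/pywinrm plumbing is omitted:

```python
import argparse

def build_parser():
    """Commandline parser shared by the backup images (a sketch).

    Subcommands mirror the operations named on the previous slide:
    backup, copy and restore; option names are invented examples.
    """
    parser = argparse.ArgumentParser(
        description="Database backup/copy/restore container entrypoint"
    )
    sub = parser.add_subparsers(dest="operation", required=True)

    backup = sub.add_parser("backup", help="Back up a database to local storage")
    backup.add_argument("--database", required=True)

    copy = sub.add_parser("copy", help="Copy backup files to another host")
    copy.add_argument("--destination", required=True)

    restore = sub.add_parser("restore", help="Restore backup files")
    restore.add_argument("--backup-file", required=True)

    return parser
```

Because argparse generates `--help` output for every subcommand, the parser is self-documenting — the "explanatory" part mentioned on the slide.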
  15. Avoid manual fixes if at all possible
      - Will initially take longer, but the pay-off should come quickly
      - Worth asking individual developers about manual steps in their workflow
      - Descriptive naming of automated jobs cuts down on support requirements
      - Easy to underestimate the number of processes that aren't written down
      - Consistent look-and-feel of related tasks eases learning
      - Allow jobs to be re-run without negative consequences
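The last point is the classic idempotency pattern: check the current state before acting, so a rerun is a no-op. A minimal sketch, where the in-memory `records` dict stands in for real state such as DNS entries or an etcd store:

```python
def ensure_dns_entry(records, hostname, address):
    """Idempotent job step: safe to re-run with no extra effect.

    `records` is a stand-in for real state (DNS, etcd, a database);
    the check-then-act shape makes a second run a no-op.
    """
    if records.get(hostname) == address:
        return False  # already in the desired state; nothing to do
    records[hostname] = address
    return True
```

Returning whether anything changed also makes the job's history easier to audit: reruns show up as no-ops rather than repeated changes.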