C++ Programmer for 5+ years, • Worked at two startups and HP, • Enjoy reading and making software that people use, • Homepage: http://uptosomething.in • Email: [email protected]
Data Explosion – HP Case Study, 4) Multiprocessing – Introduction and Capabilities, 5) Multiprocessing – Live Code Review, 6) Gearman – Introduction and Capabilities, 7) Gearman – How-to and Code Review, 8) Logging, Debugging, Monitoring.
a long-term trend in the history of computing hardware: the number of transistors that can be placed inexpensively on an integrated circuit doubles approximately every two years.” Today, Moore's law reaches us in the form of multi-core CPUs. Yet many developers still write code as if they had access to a single core, even for embarrassingly parallel problems (search Wikipedia for "embarrassingly parallel"). Python developers are at a particular disadvantage because of the Global Interpreter Lock (GIL) and because the Python standard library offers no data structures comparable to those in Intel Threading Building Blocks.
worth of instrumentation and performance data received from a cluster of storage (SAN) boxes deployed globally, b) If a box goes down (downtime), the SLA and monetary consequences are dire, c) The best-case solution is to analyze the data in near real time, finding problems waiting to happen and dispatching an automated email with data showing what needs to be corrected in the box and how, d) An existing codebase for passive analysis of this data already exists, written in Perl, sed, awk, shell script, Tcl, and C, e) Python (Multiprocessing, Django) and Gearman save the day: i) Python ensured we got version 1.0 working in no time, ii) Multiprocessing ensured we used all cores on a given machine, iii) Gearman ensured that we could not only reuse the diversified codebase but also scale our software to run across more machines with minimal refactoring of version 1.0.
supports spawning processes using an API similar to the threading module. The multiprocessing package offers both local and remote concurrency, effectively side-stepping the Global Interpreter Lock by using subprocesses instead of threads. Due to this, the multiprocessing module allows the programmer to fully leverage multiple processors on a given machine. It runs on both Unix and Windows.” Source: http://docs.python.org/library/multiprocessing.html a) Shields programmers from the chores of IPC by offering Pipes, Queues, process Pools, and Managers for shared data, b) Its API is similar to the threading API, c) Multiprocessing can scale over a farm of machines using remote Managers, d) Multiprocessing doesn't suffer from the GIL, because each process has its own instance of the Python interpreter and is responsible for its own memory management.
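
To make points a) and b) above concrete, here is a minimal sketch, assuming Python 3. The names `square_worker` and `split_evenly` are hypothetical helpers invented for illustration; only `multiprocessing.Process` and `multiprocessing.Queue` come from the standard library. It is not taken from the talk's downloadable code.

```python
import multiprocessing


def square_worker(numbers, results):
    """Square each number and put (n, n * n) pairs on the shared queue."""
    for n in numbers:
        results.put((n, n * n))


def split_evenly(items, parts):
    """Split a list into `parts` interleaved slices of near-equal size."""
    return [items[i::parts] for i in range(parts)]


if __name__ == "__main__":
    numbers = list(range(8))
    results = multiprocessing.Queue()  # IPC is handled by the Queue

    # Process is created, started and joined just like threading.Thread.
    workers = [
        multiprocessing.Process(target=square_worker, args=(chunk, results))
        for chunk in split_evenly(numbers, 2)
    ]
    for worker in workers:
        worker.start()

    # Drain the queue before joining so no child blocks on a full pipe.
    squares = dict(results.get() for _ in numbers)
    for worker in workers:
        worker.join()

    print(sorted(squares.items()))
```

Swapping `multiprocessing.Process` for `threading.Thread` (and the queue for `queue.Queue`) would leave the structure unchanged, which is the API similarity the slide refers to.
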
process ) b) Factorial example using multiprocessing to employ multiple cores (2 * number of cores). Code can be downloaded from http://uptosomething.in/scipy/code.tar.gz (link will be valid from 4th December onwards)
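
The downloadable code is not reproduced here. As a stand-in, the following is a rough sketch of how a multi-core factorial could look, assuming Python 3 and assuming that "2 * number of cores" refers to the size of the process pool. The helpers `split_range`, `partial_product` and `parallel_factorial` are hypothetical names, not the author's.

```python
import math
import multiprocessing
from functools import reduce
from operator import mul


def split_range(n, parts):
    """Split 1..n into at most `parts` contiguous, inclusive (start, stop) ranges."""
    parts = max(1, min(parts, n))
    step, extra = divmod(n, parts)
    ranges, start = [], 1
    for i in range(parts):
        stop = start + step - 1 + (1 if i < extra else 0)
        ranges.append((start, stop))
        start = stop + 1
    return ranges


def partial_product(bounds):
    """Multiply all integers in the inclusive range (start, stop)."""
    start, stop = bounds
    return reduce(mul, range(start, stop + 1), 1)


def parallel_factorial(n, processes=None):
    """Compute n! by multiplying sub-ranges in a pool of worker processes."""
    if n < 2:
        return 1
    # Assumption: pool size of 2 * cores, following the slide's hint.
    processes = processes or 2 * multiprocessing.cpu_count()
    with multiprocessing.Pool(processes) as pool:
        partials = pool.map(partial_product, split_range(n, processes))
    return reduce(mul, partials, 1)


if __name__ == "__main__":
    n = 5000
    result = parallel_factorial(n)
    # Cross-check against the single-process standard library version.
    print(result == math.factorial(n))
```

Each worker multiplies one contiguous sub-range, and the parent combines the partial products. Whether this beats a single process depends on `n`, since pickling large integers between processes has its own cost.
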
framework to farm out work to other machines or processes that are better suited to do the work. It allows you to do work in parallel, to load balance processing, and to call functions between languages.” Source: http://gearman.org What SQLite is to relational databases, Gearman is to distributed job queues.
usage of Gearman, b) Factorial example using Gearman (demo uses multiple machines). Code can be downloaded from http://uptosomething.in/scipy/code.tar.gz (link will be valid from 4th December onwards)
e.g. SocketHandler for a distributed application, or syslogd for an application using multiple processes on one machine, b) For debugging, attach cProfile to each process and dump its stats output; the outputs of all processes can then be joined. Yes, it's cumbersome and painful, especially when hunting resource deadlocks, c) Proactive monitoring of distributed Python software is important if it is to run 24/7: you have to know if and when it goes down. Explore Monit (http://mmonit.com/monit/) or Nagios (http://www.nagios.org/). Code samples for logging and monitoring can be downloaded from http://uptosomething.in/scipy/code.tar.gz (link will be valid from 4th December onwards)
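
Point b) above can be sketched roughly as follows, assuming Python 3. Each process profiles its own work with `cProfile` and dumps a stats file, and the parent then joins the files with `pstats.Stats`. The functions `busy_work`, `profile_to_file` and `merge_stats` and the file names are hypothetical and stand in for real application code.

```python
import cProfile
import multiprocessing
import os
import pstats
import tempfile


def busy_work(n):
    """Placeholder workload standing in for real application code."""
    return sum(i * i for i in range(n))


def profile_to_file(stats_path, n):
    """Profile busy_work(n) in the current process and dump stats to a file."""
    profiler = cProfile.Profile()
    profiler.runcall(busy_work, n)
    profiler.dump_stats(stats_path)


def merge_stats(stats_paths):
    """Join several per-process stats files into one pstats.Stats object."""
    return pstats.Stats(*stats_paths)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        paths = [os.path.join(tmp, "worker-%d.prof" % i) for i in range(2)]
        workers = [
            multiprocessing.Process(target=profile_to_file, args=(path, 100000))
            for path in paths
        ]
        for worker in workers:
            worker.start()
        for worker in workers:
            worker.join()

        # One combined report covering every worker process.
        merge_stats(paths).sort_stats("cumulative").print_stats(5)
```

The merged report aggregates call counts and timings across processes, so it shows where time went overall but not which process spent it. Keeping the per-process files around helps when that distinction matters.
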