data from users per day • 400 GB of data from services per day • 64 TB of data generated in Hadoop each day • 590 node Hadoop cluster • 7,500 Hadoop jobs per day
very comprehensive • MIME, HTTP, SMTP, FTP • Interprocess communication & networking • Data persistence, archiving, file I/O • Unit testing, doc testing, logging, and CLI parsing • PyPI - over 38,000 packages • pip install $packagename • Networking libraries like gevent, Twisted • Web frameworks like Django, Flask, Pyramid • Test frameworks & doc tools like pytest, Read the Docs, and Sphinx
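A minimal sketch of the "unit testing, doc testing, logging, and CLI parsing" bullet, using only standard-library modules (`doctest`, `logging`, `argparse`). The function `parse_size`, the `main` entry point, and the decimal GB/TB conversion are hypothetical names and assumptions chosen for illustration; they are not taken from the talk.

```python
import argparse
import doctest
import logging


def parse_size(text):
    """Convert a size such as '400 GB' into a number of gigabytes.

    The examples below double as doc tests.

    >>> parse_size("400 GB")
    400.0
    >>> parse_size("64 TB")
    64000.0
    """
    value, unit = text.split()
    # Assumption: decimal units, 1 TB = 1000 GB.
    factor = {"GB": 1, "TB": 1000}[unit]
    return float(value) * factor


def main(argv=None):
    # CLI parsing with argparse; argv=None falls back to sys.argv[1:].
    parser = argparse.ArgumentParser(description="Print a size in gigabytes.")
    parser.add_argument("size", help="a size such as '64 TB'")
    args = parser.parse_args(argv)

    # Logging with the standard logging module.
    logging.basicConfig(level=logging.INFO)
    logging.info("parsing %r", args.size)
    return parse_size(args.size)


if __name__ == "__main__":
    doctest.testmod()  # run the docstring examples
    print(main())
```

Running the file with an argument such as `"64 TB"` first checks the docstring examples and then prints the converted value; none of it needs a third-party package.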
A lot of home-grown solutions • No good HTTP client - every library is missing something • Memcache client doesn’t work well with our gevent setup • Debian packaging for everything
revisit Python 3 • Java’s popularity internally is growing because it is difficult to hire Python developers • We want to incorporate more Python into our desktop client