Robert King: Making a scalable course search engine with Python

= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
Robert King:
Making a scalable course search engine with Python
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
@ Kiwi PyCon 2014 - Sunday, 14 Sep 2014 - Track 2
http://kiwi.pycon.org/

**Audience level**

Experienced

**Description**

Creating a custom search engine with Python on Google App Engine that serves large spikes in search request traffic and lets students find course reviews across multiple universities and countries.

**Abstract**

- Introduction to the real-world problem: students need to be able to find university courses across multiple countries and universities.
- First: explore how to solve the problem. Collect course data and decide on a rough solution.
- Second: create a minimum viable product and see how people use it. Iteratively make it better.
- Second, continued: organise a big website launch event before you've created the website, then proceed to write 10K lines of code in the week before launch.
- Third: analyse the 50K most recent search terms and make a simple tree data structure to help improve search performance.
- Fourth: caching and cache invalidation.
- Fifth: maybe I'll do an online marketing campaign halfway through the talk and show graphs of the app responding in real time.
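
Judging by the first slide of the deck, the "simple tree data structure" in the third step appears to be a trie. The following is a minimal sketch of prefix lookup over search terms, not the speaker's implementation: the class names, method names, and sample terms are all hypothetical.

```python
class TrieNode:
    """One node per character; children are keyed by the next character."""

    def __init__(self):
        self.children = {}
        self.is_end = False  # True when a complete term ends at this node


class Trie:
    """Hypothetical prefix index over search terms (illustration only)."""

    def __init__(self):
        self.root = TrieNode()

    def insert(self, term):
        node = self.root
        for ch in term.lower():
            node = node.children.setdefault(ch, TrieNode())
        node.is_end = True

    def starts_with(self, prefix):
        """Return every stored term beginning with prefix, sorted."""
        prefix = prefix.lower()
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        # Walk the subtree below the prefix and collect complete terms.
        results = []
        stack = [(node, prefix)]
        while stack:
            current, text = stack.pop()
            if current.is_end:
                results.append(text)
            for ch, child in current.children.items():
                stack.append((child, text + ch))
        return sorted(results)


trie = Trie()
for term in ["comms", "comp", "law"]:  # made-up sample terms
    trie.insert(term)
matches = trie.starts_with("com")
```

Reaching the prefix node takes time proportional to the length of the prefix, regardless of how many terms are stored; collecting the matches then costs time proportional to the size of the subtree below it.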

- Covers data analysis with Python (csv, matplotlib, networkx, collections.Counter, logfile parsing)
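
As an illustration of logfile parsing with `collections.Counter`, here is a minimal sketch that counts the words used in search queries. The log line format, the `q` parameter name, and the sample data are assumptions made for this example; the talk description does not specify them.

```python
import re
from collections import Counter
from urllib.parse import unquote_plus

# Hypothetical request log; the real log format is not given in the source.
SAMPLE_LOG = """\
2014-09-01 10:00:01 GET /search?q=comp+101
2014-09-01 10:00:02 GET /search?q=law
2014-09-01 10:00:03 GET /static/site.css
2014-09-01 10:00:04 GET /search?q=COMP+202&page=2
"""

# Assumes the search query is carried in a "q" URL parameter.
QUERY_RE = re.compile(r"[?&]q=([^&\s]+)")


def count_search_terms(lines):
    """Count how often each word appears across all search queries."""
    counts = Counter()
    for line in lines:
        match = QUERY_RE.search(line)
        if match:
            query = unquote_plus(match.group(1)).lower()
            counts.update(query.split())
    return counts


counts = count_search_terms(SAMPLE_LOG.splitlines())
top_terms = counts.most_common(1)
```

`most_common(n)` returns the `n` most frequent terms with their counts, which is a natural starting point for deciding which queries are worth precomputing or caching.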

- Covers "Futures": doing RPC calls in parallel.
- Unit testing and simulating everything, to see how adjusting search functionality affects query times and the quality of results.
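
The idea behind futures can be sketched with the standard library. This example uses `concurrent.futures.ThreadPoolExecutor` as a stand-in rather than whatever App Engine API the talk uses; `fake_rpc`, `search_all`, and the index names are hypothetical, and the sleep merely simulates network latency.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_rpc(index_name, query):
    """Stand-in for a remote call; sleeps to simulate network latency."""
    time.sleep(0.1)
    return f"{index_name}: results for {query!r}"


def search_all(query, index_names):
    """Issue one simulated RPC per index in parallel, then gather results."""
    workers = max(1, len(index_names))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # submit() returns immediately with a Future for each call.
        futures = [pool.submit(fake_rpc, name, query) for name in index_names]
        # result() blocks until that call finishes; input order is preserved.
        return [future.result() for future in futures]


results = search_all("comp", ["nz", "au"])
```

Because every call is started before any result is awaited, the total wait is roughly that of the slowest call rather than the sum of all of them.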

- Some tasteful jokes to keep things entertaining ;)

**YouTube**

https://www.youtube.com/watch?v=568mFzqsjqk

New Zealand Python User Group

September 14, 2014

Transcript

  1. TRIE TREE DATA STRUCTURE
     [Diagram: a trie with one letter per node]