Robert King: Making a scalable course search engine with Python

= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
Robert King:
Making a scalable course search engine with Python
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
@ Kiwi PyCon 2014 - Sunday, 14 Sep 2014 - Track 2
http://kiwi.pycon.org/

**Audience level**

Experienced

**Description**

Creating a custom search engine with Python on Google App Engine that can serve large spikes in search request traffic and lets students find course reviews across multiple universities and countries.

**Abstract**

- Introduction to the real-world problem: students need to be able to find university courses across multiple countries and universities.
- First: explore how to solve the problem. Collect course data and decide on a rough solution.
- Second: create a minimum viable product and see how people use it. Iteratively make it better.
- Second, continued: organise a big website launch event before you've created the website, then proceed to write 10K lines of code in the week before launch.
- Third: analyse the 50K most recent search terms and make a simple tree data structure to help improve search performance.
- Fourth: caching and cache invalidation.
- Fifth: maybe I'll run an online marketing campaign halfway through the talk and show graphs of the app responding in real time.
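
The third step (counting recent search terms and loading them into a simple tree) can be sketched roughly as follows. This is a minimal illustration, not the talk's actual code: the names `TrieNode`, `SearchTrie`, `build_trie` and `suggest`, and the sample terms, are all assumptions made for the example.

```python
from collections import Counter


class TrieNode:
    def __init__(self):
        self.children = {}  # next character -> TrieNode
        self.count = 0      # times a search term ending at this node was seen


class SearchTrie:
    """Hypothetical prefix tree over search terms, weighted by popularity."""

    def __init__(self):
        self.root = TrieNode()

    def add(self, term, count=1):
        node = self.root
        for ch in term:
            node = node.children.setdefault(ch, TrieNode())
        node.count += count

    def suggest(self, prefix, limit=5):
        # Walk down to the node for the prefix, if it exists.
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        # Collect every complete term below that node.
        found = []
        stack = [(node, prefix)]
        while stack:
            current, text = stack.pop()
            if current.count:
                found.append((text, current.count))
            for ch, child in current.children.items():
                stack.append((child, text + ch))
        # Most popular first; ties broken alphabetically.
        found.sort(key=lambda pair: (-pair[1], pair[0]))
        return [text for text, _ in found[:limit]]


def build_trie(search_terms):
    # Normalise and count the raw terms before inserting them.
    counts = Counter(t.strip().lower() for t in search_terms if t.strip())
    trie = SearchTrie()
    for term, count in counts.items():
        trie.add(term, count)
    return trie


if __name__ == "__main__":
    recent = ["comp", "comms", "law", "comp", "COMP ", "commerce"]
    print(build_trie(recent).suggest("com"))
```

Storing a count at each terminal node lets prefix lookups return popular terms first, which is one plausible way such a tree could help search performance.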

- Covers data analysis with Python (csv, matplotlib, networkx, collections.Counter, log file parsing)

- Covers "Futures": making RPC calls in parallel.
- Unit testing and simulating all things: being able to see how adjusting search functionality affects query times and quality of results.

- Some tasteful jokes to keep things entertaining ;)
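
The "Futures" point above, issuing several RPC calls at once and collecting the results afterwards, can be illustrated with the standard library. This sketch assumes `concurrent.futures` as a stand-in for whatever futures API the talk actually used; `fetch_reviews`, `search_all` and the university names are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_reviews(university, query):
    """Hypothetical stand-in for a remote call to one university's data."""
    time.sleep(0.05)  # simulate network latency
    return ["{}: {}".format(university, query)]


def search_all(universities, query, max_workers=8):
    # Submit every call first so they run in parallel, then gather results.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(fetch_reviews, uni, query) for uni in universities]
        results = []
        for future in futures:  # iterate in submission order
            results.extend(future.result())
    return results


if __name__ == "__main__":
    print(search_all(["uni_a", "uni_b", "uni_c"], "comp"))
```

Because all calls are submitted before any result is awaited, the total wait is closer to the slowest single call than to the sum of all calls.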

**YouTube**

https://www.youtube.com/watch?v=568mFzqsjqk

New Zealand Python User Group

September 14, 2014

Transcript

  1. Looking into the Matrix: Building a Course Search Engine with Python
  2. None
  3. None
  4. That moment when you realise you don’t understand the code you’re trying to explain
  5. None
  6. None
  7. None
  8. None
  9. Build lots of things from scratch and get good at refactoring
  10. None
  11. What’s Student Course Review?

  12. Video

  13. Popular search terms

  14. Trie (tree data structure). [Diagram: letter nodes C, O, M, M, S, C, P, L, A, W, N, G, I]
  15. None
  16. None
  17. None
  18. None
  19. None
  20. but you don’t have to scale yourself

  21. None
  22. None

  23. Caching all the things

  24. None
  25. None
  26. Did you know Harry Potter was a code wizard? He could speak Parseltongue.
  27. Sharded counter: like counting the votes on election night.

  28. Who wants to be Kermit?

  29. None
  30. None
  31. None
  32. None
  33. And in Java?

  34. Conclusions: If your architecture is language agnostic then you’re safer. Python > Java
  35. www.google.com/+robertking, kingrobertking at gmail dot com, robert-king.com, http://www.studentcoursereview.co.nz/feedback
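
Slide 27 compares a sharded counter to counting votes on election night: many tally points are updated independently and summed only when a total is needed. The sketch below is an in-memory analogue of that idea, not the talk's implementation. The class name `ShardedCounter`, the shard count and the per-shard locks are assumptions for illustration. In a datastore setting each shard would typically be a separate stored record.

```python
import random
import threading


class ShardedCounter:
    """Hypothetical counter that spreads writes across several shards."""

    def __init__(self, num_shards=20):
        self._shards = [0] * num_shards
        self._locks = [threading.Lock() for _ in range(num_shards)]

    def increment(self, amount=1):
        # Pick a random shard so concurrent writers rarely contend.
        index = random.randrange(len(self._shards))
        with self._locks[index]:
            self._shards[index] += amount

    def total(self):
        # Reading sums every shard, like adding up all polling booths.
        return sum(self._shards)


if __name__ == "__main__":
    counter = ShardedCounter(num_shards=4)
    for _ in range(100):
        counter.increment()
    print(counter.total())
```

Writes become cheaper because they rarely touch the same shard, at the cost of a slightly more expensive read.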