Robert King: Making a scalable course search engine with Python

@ Kiwi PyCon 2014 - Sunday, 14 Sep 2014 - Track 2
http://kiwi.pycon.org/

**Audience level**

Experienced

**Description**

Creating a custom search engine with Python on Google App Engine that can serve large spikes in search request traffic and lets students find course reviews across multiple universities and countries.

**Abstract**

- Introduction to the real-world problem - students need to be able to find university courses across multiple countries and universities.
- First - explore how to solve the problem: collect course data and decide on a rough solution.
- Second - create a minimum viable product and see how people use it. Iteratively make it better.
- Second, continued - organise a big website launch event before you've created the website, then proceed to write 10K lines of code in the week before launch.
- Third - analyse the 50K most recent search terms and make a simple tree data structure to help improve search performance.
- Fourth - caching and cache invalidation.
- Fifth - maybe I'll do an online marketing campaign halfway through the talk and show graphs of the app responding in real time.
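
The "simple tree data structure" built from recent search terms could look roughly like the prefix tree (trie) below. This is only a sketch under assumptions: the class names, the ranking by search frequency, and the sample terms are all hypothetical and are not taken from the talk.

```python
from collections import Counter


class TrieNode:
    def __init__(self):
        self.children = {}  # maps one character to the next TrieNode
        self.count = 0      # how many logged searches ended exactly here


class SearchTrie:
    """Prefix tree over search terms (hypothetical names, illustration only)."""

    def __init__(self):
        self.root = TrieNode()

    def insert(self, term, count=1):
        node = self.root
        for ch in term.upper():
            node = node.children.setdefault(ch, TrieNode())
        node.count += count

    def complete(self, prefix, limit=5):
        """Return up to `limit` stored terms starting with `prefix`, most searched first."""
        prefix = prefix.upper()
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        found = []
        stack = [(node, prefix)]
        while stack:
            current, text = stack.pop()
            if current.count:
                found.append((text, current.count))
            for ch, child in current.children.items():
                stack.append((child, text + ch))
        # Sort by descending frequency, then alphabetically for stable output.
        found.sort(key=lambda item: (-item[1], item[0]))
        return [text for text, _ in found[:limit]]


def build_trie(terms):
    """Count normalised terms once, then insert each distinct term with its count."""
    counts = Counter(t.strip().upper() for t in terms if t.strip())
    trie = SearchTrie()
    for term, count in counts.items():
        trie.insert(term, count)
    return trie


# Made-up sample data standing in for the logged search terms.
sample_terms = ["COMPSCI", "COMPSCI", "COMMS", "LAW", "COMPSCI", "LAW", "ENGGEN"]
trie = build_trie(sample_terms)
suggestions = trie.complete("com")
```

With the sample data above, `trie.complete("com")` walks three nodes and then collects only the terms under that prefix, so lookup cost depends on the prefix length and the size of the matching subtree, not on the total number of logged terms.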

- Covers data analysis with Python (csv, matplotlib, networkx, collections.Counter, log file parsing)
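
As a rough illustration of the `csv` plus `collections.Counter` combination listed above, the sketch below counts the most popular queries in a search log. The log layout (a `timestamp` and a `query` column) and the sample rows are assumptions made for the example, not the format used in the talk.

```python
import csv
import io
from collections import Counter

# Hypothetical log export: one row per search request.
SAMPLE_LOG = "\n".join([
    "timestamp,query",
    "2014-09-01T10:00:00,compsci 101",
    "2014-09-01T10:00:05,LAW 121",
    "2014-09-01T10:00:09,compsci 101 ",
    "2014-09-01T10:01:00,law 121",
    "2014-09-01T10:02:30,compsci 101",
    "2014-09-01T10:03:10,stats 108",
])


def top_search_terms(csv_file, n=3):
    """Return the n most common queries, normalised to lower case."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        query = (row.get("query") or "").strip().lower()
        if query:
            counts[query] += 1
    return counts.most_common(n)


top = top_search_terms(io.StringIO(SAMPLE_LOG), n=2)
```

The same function accepts a real file object opened with `open(path, newline="")` in place of the in-memory `io.StringIO` sample.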

- Covers "Futures" - doing RPC calls in parallel.
- Unit testing and simulating all things - being able to see how adjusting search functionality affects query times and quality of results.
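
The idea of using futures to run remote calls in parallel can be sketched with the standard library's `concurrent.futures`. This is a stand-in only: App Engine's own asynchronous APIs are not shown, and `fetch_courses`, the university names, and the simulated delay are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_courses(university, query):
    """Hypothetical stand-in for a remote call to one backend."""
    time.sleep(0.05)  # simulate network latency
    return ["{}: {}".format(university, query)]


def search_all(universities, query):
    """Start one call per university, then gather results in submission order."""
    workers = max(1, len(universities))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # submit() returns immediately with a Future; the calls run concurrently.
        futures = [pool.submit(fetch_courses, u, query) for u in universities]
        results = []
        for future in futures:
            # result() blocks until that particular call has finished.
            results.extend(future.result())
    return results


results = search_all(["Auckland", "Otago", "Canterbury"], "compsci")
```

Because all calls are started before any result is awaited, the total wait is close to the slowest single call, not the sum of all of them.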

- Some tasteful jokes to keep things entertaining ;)

**YouTube**

https://www.youtube.com/watch?v=568mFzqsjqk

New Zealand Python User Group

September 14, 2014

Transcript

  1. Looking into the Matrix
    Building a Course Search Engine with Python

  4. that moment when
    you realise - you don’t understand the code
    you’re trying to explain

  9. Build lots of things from
    scratch
    and get good at refactoring

  11. What’s Student Course Review?

  12. Video

  13. Popular search terms

  14. Trie tree data structure
    (diagram of trie nodes labelled C, O, M, M, S, C, P, L, A, W, N, G, I)

  20. but you don’t have to
    scale yourself

  23. Caching all the things

  26. Did you know Harry Potter
    was a code wizard?
    He could speak Parseltongue

  27. Sharded counter
    like counting the votes during an election night.

  28. Who wants to be Kermit

  33. And in Java?

  34. Conclusions
    If your architecture is language agnostic then
    you’re safer
    Python > Java

  35. www.google.com/+robertking
    kingrobertking at gmail dot com
    robert-king.com
    http://www.studentcoursereview.co.nz/feedback

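
The sharded counter from slide 27 can be pictured as follows: each increment goes to one randomly chosen shard, and the total is the sum of all shards, much like tallying votes from many polling places. This is an in-memory sketch under assumptions; the class name and shard count are hypothetical, and in a datastore each shard would be a separate record so that concurrent writes rarely contend on the same one.

```python
import random


class ShardedCounter:
    """In-memory illustration of a sharded counter (hypothetical class)."""

    def __init__(self, num_shards=20, rng=None):
        self.shards = [0] * num_shards
        self.rng = rng or random.Random()

    def increment(self, amount=1):
        # Pick one shard at random so writers spread across shards.
        index = self.rng.randrange(len(self.shards))
        self.shards[index] += amount

    def total(self):
        # Reading the counter means summing every shard.
        return sum(self.shards)


counter = ShardedCounter(num_shards=8, rng=random.Random(2014))
for _ in range(1000):
    counter.increment()
total = counter.total()
```

The trade-off is that writes get cheaper under contention while reads get slightly more expensive, which is why the total is a natural candidate for caching.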