what
pyvanot
● Python package and app that creates
● a 12.7-billion-row, 448-column open dataset, updated in real time,
addressing the question of our age:
is there a difference between Paul Ivanov and David Nicholson?
Slide 4
why
Paul David
Slide 5
why
Paul                     | David
background in neuro / AI | background in neuro / AI
industry gig             | industry gig
Python-based blog        | Python-based blog
can ride a bike          | can ride a bike
makes terrible puns      | makes terrible puns
Slide 6
why
Slide 7
how
pyvanot
project and package that leverages the power and flexibility of
the Python data science stack to provide insight into this
question that has troubled humanity since time immemorial
Slide 8
how
pyvanot
built on cookiecutter-pyopensci
follows best practices established in the pyOpenSci dev guide:
https://www.pyopensci.org/contributing-guide
Slide 9
how
● scrape the web for the latest Paul Ivanov and David
Nicholson data
○ using requests, Beautiful Soup, and luigi
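A minimal sketch of what the scrape-and-count step could look like. This is not pyvanot's actual code: the function names, the "count name mentions" metric, and the parser class are assumptions made for illustration. The standard library's `html.parser` stands in for Beautiful Soup so the parsing part runs without extra installs, and luigi's task orchestration is left out entirely.

```python
from collections import Counter
from html.parser import HTMLParser

# The two names come from the slides; everything else is illustrative.
NAMES = ("Paul Ivanov", "David Nicholson")


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def fetch_html(url):
    """Download a page with requests (imported here; needs network access)."""
    import requests

    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text


def count_mentions(html, names=NAMES):
    """Return how often each name appears in the page's visible text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so names split across lines still match.
    text = " ".join(" ".join(parser.chunks).split())
    return Counter({name: text.count(name) for name in names})
```

In use, `count_mentions(fetch_html(url))` would give one row's worth of counts per page; with Beautiful Soup installed, `BeautifulSoup(html, "html.parser").get_text()` could replace the `TextExtractor` class.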
Slide 10
how
● build and serve the `pyvanot` open dataset as a table with
billions of rows
○ using Dask and datasette
■ inspired by this blog post from Sergio Sanchez:
https://towardsdatascience.com/making-open-data-more-accessible-with-datasette-480a1de5e919
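A rough sketch of the "build and serve" step, under stated assumptions. Datasette serves SQLite database files, so the sketch writes rows into a SQLite table with the standard library. The table schema (`person`, `source_url`, `mentions`) and the function name are hypothetical, since the slides give the dataset's size but not its columns. The Dask part, which would presumably produce the rows in parallel, is not shown.

```python
import sqlite3


def build_database(path, rows, batch_size=10_000):
    """Write (person, source_url, mentions) rows to SQLite in batches.

    Returns the total number of rows in the table afterwards.
    """
    conn = sqlite3.connect(path)
    try:
        # Hypothetical three-column schema, for illustration only.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS pyvanot "
            "(person TEXT, source_url TEXT, mentions INTEGER)"
        )
        batch = []
        for row in rows:
            batch.append(row)
            if len(batch) >= batch_size:
                conn.executemany("INSERT INTO pyvanot VALUES (?, ?, ?)", batch)
                batch.clear()
        if batch:  # flush the final partial batch
            conn.executemany("INSERT INTO pyvanot VALUES (?, ?, ?)", batch)
        conn.commit()
        return conn.execute("SELECT COUNT(*) FROM pyvanot").fetchone()[0]
    finally:
        conn.close()
```

After something like `build_database("pyvanot.db", rows)`, running `datasette serve pyvanot.db` would expose the table through Datasette's web interface and JSON API.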