How to create your own streaming torrent client with Python (for learning purposes)

A few years ago I created TouchAndGo [0] with Felipe Lerena and we learned a lot! I'll share some of that knowledge with you:
- What is BitTorrent and how does it work?
- Which libs can you use to find magnet links?
- How to handle magnet links.
- How to use libtorrent (API, tips, misc).
- How to download fast and be friendly with the torrent swarm.
- How to stream a video with Python.
- How to download subtitles.
- How to integrate all the parts together.

Disclaimer:
This talk is a rehash of one we did in 2014 with L1pe: https://www.youtube.com/watch?v=oMl1UiF_2Ss (The project is not maintained anymore).

[0] https://github.com/touchandgo-devs/touchandgo

Nicolás Demarchi

April 22, 2020

Transcript

  1. How to create your own
    streaming torrent client
    with Python.
    (for learning purposes)
    Nicolás Demarchi - @gilgamezh
    22/04/2020 - Python Ireland Meetup (online)
    https://www.meetup.com/pythonireland/events/270053115/

  2. Farewell
    Marcos
    Mundstock!
    Les Luthiers

  3. About me: Nicolás Demarchi
    https://py.amsterdam
    https://python.org.ar

  4. Disclaimer
    ● This talk is a rehash of one we gave in 2014
    with L1pe. I had to read all the code and
    translate it last weekend.
    ● I’m not interested in having a philosophical
    discussion about torrents. This is a tech talk.
    ● I want to discuss how to build VOD (video on
    demand) using OSS.
  5. What’s BitTorrent?
    ● It’s a decentralized, peer-to-peer (p2p)
    protocol for file sharing.
    ● Each client is a Peer that connects to a
    Tracker to find other Peers, in order to
    download Pieces of a Torrent from them.
    More about it

  6. Torrent & Magnet
    ● The first step to share a file using BitTorrent is to generate a
    .torrent file. This file contains all the information a Peer needs
    to find other Peers who are sharing the same file: hash values,
    size, filename(s), Tracker URL(s), size of the Pieces.
    ● A Magnet link contains the information required to request and
    download the contents of the .torrent file from other clients. You
    don’t have to store/download the .torrent file yourself.
    ● Each torrent is identified by a unique 20-byte SHA-1 hash
    (the info-hash).
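
As a rough illustration of what a Magnet link carries, the sketch below pulls the info-hash, display name and tracker list out of a link using only the standard library. The function name `parse_magnet` and the returned dictionary layout are made up for this example; the `xt`, `dn` and `tr` parameters are the ones commonly found in BitTorrent magnet links.

```python
import base64
from urllib.parse import parse_qs, urlparse


def parse_magnet(uri):
    """Extract the info-hash, name and trackers from a magnet link.

    Hypothetical helper for illustration; a real client would hand the
    link to libtorrent instead of parsing it by hand.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "magnet":
        raise ValueError("not a magnet link")

    params = parse_qs(parsed.query)
    xt = params.get("xt", [""])[0]
    prefix = "urn:btih:"
    if not xt.startswith(prefix):
        raise ValueError("magnet link has no BitTorrent info-hash")

    raw = xt[len(prefix):]
    if len(raw) == 40:      # hex-encoded SHA-1
        info_hash = bytes.fromhex(raw)
    elif len(raw) == 32:    # base32-encoded SHA-1
        info_hash = base64.b32decode(raw.upper())
    else:
        raise ValueError("unexpected info-hash length")

    return {
        "info_hash": info_hash.hex(),        # always 20 bytes, shown as hex
        "name": params.get("dn", [None])[0],  # optional display name
        "trackers": params.get("tr", []),     # zero or more tracker URLs
    }
```

Calling `parse_magnet(link)["info_hash"]` gives the 20-byte identifier as a 40-character hex string, whichever encoding the link used.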

  7. Peer
    ● It’s an instance of a BitTorrent client that
    transfers data from and to other clients.
    ● Seed: a Peer with 100% of the Pieces.
    ● Leech: a Peer with less than 100% of the Pieces.

  8. Tracker
    ● It’s what the Peers use to get the initial list of
    other Peers sharing a file.
    ● A Peer “announces” to a Tracker that it’s
    ready to exchange a torrent, and reports how
    much of it is still left to download.
    ● It’s the entry point to a Swarm (the set of
    Peers sharing the same torrent).
    ● It’s possible to skip it using DHT.
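
To make the "announce" step more concrete, here is a minimal sketch of how the query string of an HTTP tracker announce could be built. The parameter names (`info_hash`, `peer_id`, `port`, `uploaded`, `downloaded`, `left`, `compact`, `event`) follow the BitTorrent protocol specification; the function name, the tracker URL and the peer id used below are placeholders, and the sketch only builds the URL without sending anything.

```python
from urllib.parse import urlencode


def build_announce_url(tracker_url, info_hash, peer_id, port, left,
                       event="started"):
    """Build the URL a Peer would request to announce itself.

    info_hash and peer_id are raw 20-byte values; urlencode takes care
    of percent-encoding them.
    """
    if len(info_hash) != 20 or len(peer_id) != 20:
        raise ValueError("info_hash and peer_id must be 20 bytes each")

    query = urlencode({
        "info_hash": info_hash,
        "peer_id": peer_id,
        "port": port,
        "uploaded": 0,      # bytes uploaded so far in this session
        "downloaded": 0,    # bytes downloaded so far in this session
        "left": left,       # bytes still missing (0 for a Seed)
        "compact": 1,       # ask for the compact peer list format
        "event": event,
    })
    return f"{tracker_url}?{query}"
```

The Tracker answers such a request with a bencoded list of Peers, which is the initial list mentioned above. In practice libtorrent does all of this for you.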

  9. Piece
    ● A Piece is the exchange unit of a torrent.
    ● A .torrent file has all the information about the
    size and number of Pieces to download.
    ● Common sizes are between 64 KB and 4 MB.
    ● Each Piece has its own SHA-1 hash, which is
    used to verify the downloaded data.
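
The relation between Pieces and their SHA-1 hashes can be sketched in a few lines. The helpers below are illustrative only (the names `piece_hashes` and `verify_piece` are invented here): they split a byte string into fixed-size Pieces, hash each one, and check a received Piece against the expected hash.

```python
import hashlib


def piece_hashes(data, piece_length):
    """Return the SHA-1 digest of every Piece of `data`.

    The last Piece may be shorter than piece_length.
    """
    return [
        hashlib.sha1(data[offset:offset + piece_length]).digest()
        for offset in range(0, len(data), piece_length)
    ]


def verify_piece(piece, expected_digest):
    """Check a downloaded Piece against the hash from the .torrent."""
    return hashlib.sha1(piece).digest() == expected_digest
```

A client discards any Piece that fails this check and downloads it again, which is why corrupted data from one Peer does not spoil the whole file.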

  11. How to get the magnet links
    https://github.com/harshanas/Py1337x

  12. How to download the torrent
    ● tl;dr → libtorrent
    ○ A C++ implementation of BitTorrent.
    ○ Focused on performance and usability.
    ○ Good docs.
    ○ PYTHON BINDINGS!!
    ○ Easy to use.
    ○ Available on any respectable OS (...and
    Windows).

  14. ● Session: the main libtorrent object. It
    contains the main loop that controls all the
    torrents we are downloading.
    ● Torrent Handle: the handle used to query and
    control one particular torrent.
    ● Torrent Status: a snapshot with all the
    information about the state of a torrent.
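
The three objects above fit together roughly as in the following sketch. It assumes the libtorrent Python bindings in version 1.2 or newer, where `parse_magnet_uri()` returns an object that can be passed to `session.add_torrent()`; older releases differ. The helper names `describe` and `download` are made up for this example, and libtorrent is imported lazily so the formatting helper can be used without it.

```python
import time


def describe(progress, download_rate, num_peers):
    """Format a one-line report from a few Torrent Status fields."""
    return (f"{progress * 100:.1f}% done, "
            f"{download_rate / 1000:.1f} kB/s, {num_peers} peers")


def download(magnet_uri, save_path=".", poll_seconds=1.0):
    """Download a torrent and print its status until it is complete."""
    import libtorrent as lt  # assumption: libtorrent >= 1.2 is installed

    session = lt.session()                    # the Session
    params = lt.parse_magnet_uri(magnet_uri)
    params.save_path = save_path
    handle = session.add_torrent(params)      # the Torrent Handle

    while True:
        status = handle.status()              # a Torrent Status snapshot
        print(describe(status.progress, status.download_rate,
                       status.num_peers))
        if status.is_seeding:
            break
        time.sleep(poll_seconds)
    return handle
```

Note that every call to `handle.status()` returns a fresh snapshot, so the status has to be requested again on each iteration of the loop.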

  15. Some tips to understand how it works.
    ● status.pieces exposes a bitmask representing
    all the Pieces (True if a Piece has been
    downloaded).
    ● It’s possible to get/set the priority of a Piece
    using handle.piece_priority().
    ● The download queue is exposed by
    handle.get_download_queue(); the transfer rates
    are in status.download_rate and status.upload_rate.
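
For streaming, the interesting question about the Pieces bitmask is how much of the file is available without gaps. The sketch below works on any sequence of booleans, which is assumed here to be what the Python bindings return for the Pieces bitmask; the function names are invented for this example.

```python
def contiguous_pieces(pieces, start=0):
    """Count the downloaded Pieces in a row, beginning at `start`.

    This is how far a player could read from `start` without hitting
    a Piece that is still missing.
    """
    count = 0
    for have in pieces[start:]:
        if not have:
            break
        count += 1
    return count


def missing_pieces(pieces):
    """Return the indexes of the Pieces that are not downloaded yet."""
    return [index for index, have in enumerate(pieces) if not have]
```

For instance, with the bitmask `[True, True, False, True]` only the first two Pieces are playable from the start, even though three of the four are already on disk.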

  17. Strategy

  18. Piece picker
    ● It decides which Pieces are added to the download
    queue.
    ● It has different strategies; rarest first is the default. In this
    mode it always gives the highest priority to the Pieces with the
    lowest availability in the swarm. It’s the most cooperative one.
    ● Each Piece has a priority from 0 to 7 (0 means “don’t download”,
    7 is the highest) and an optional deadline
    (handle.set_piece_deadline()). Both options affect the moment
    the Piece picker adds it to the download queue.
    ● TIP: A Piece with a very short deadline will be downloaded
    ASAP.
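
One possible way to use deadlines for streaming is a sliding window: the Pieces just ahead of the playback position get short, increasing deadlines, and everything else is left to the default picker. This is only a sketch of that idea, not necessarily what TouchAndGo does. `plan_deadlines` and `apply_plan` are hypothetical names, the window size and step are arbitrary, and `handle` is assumed to be a libtorrent Torrent Handle whose `set_piece_deadline(index, deadline_ms)` takes the deadline in milliseconds.

```python
def plan_deadlines(pieces, playhead, window=5, step_ms=1000):
    """Return (piece_index, deadline_ms) pairs for the missing Pieces
    in the window that starts at the playback position.

    Pieces closer to the playhead get shorter deadlines.
    """
    plan = []
    end = min(playhead + window, len(pieces))
    for offset, index in enumerate(range(playhead, end)):
        if not pieces[index]:
            plan.append((index, offset * step_ms))
    return plan


def apply_plan(handle, plan):
    """Send the planned deadlines to a Torrent Handle."""
    for index, deadline_ms in plan:
        handle.set_piece_deadline(index, deadline_ms)
```

Keeping the window small leaves most of the torrent to the rarest-first strategy, so the client stays reasonably friendly with the swarm while still getting the next seconds of video in time.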

  19. Video

  20. Torrent Streaming
    ● VOD with BitTorrent.
    ● How? Downloading a video from the torrent
    network and serving it over HTTP to a local
    player.
    ● How to serve it over HTTP?
    SimpleHTTPServer (http.server in Python 3).
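
A minimal Python 3 version of the HTTP part could look like the sketch below, which serves the download directory on localhost from a background thread. The function name `serve_directory` is made up for this example. One caveat: the stock `SimpleHTTPRequestHandler` does not handle HTTP Range requests, so seeking in the player would need a custom handler.

```python
import functools
import threading
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def serve_directory(directory, port=0):
    """Serve `directory` over HTTP on localhost in a background thread.

    With port=0 the OS picks a free port; read the real one from
    server.server_address[1]. Call server.shutdown() to stop serving.
    """
    handler = functools.partial(SimpleHTTPRequestHandler,
                                directory=directory)
    server = ThreadingHTTPServer(("127.0.0.1", port), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

The player is then pointed at `http://127.0.0.1:<port>/<video file name>` while libtorrent keeps writing Pieces into the same directory.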

  22. Subtitles: subliminal and guessit
    https://pypi.python.org/pypi/subliminal/
    Download subtitles in lots of languages using different
    providers
    https://pypi.python.org/pypi/guessit
    GuessIt is a python library that extracts as much
    information as possible from a video filename.
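
A hedged sketch of how the two libraries can be combined is shown below. It assumes recent releases of guessit, subliminal and babelfish are installed, and follows the usage documented in their READMEs; the wrapper names `guess_video_info` and `fetch_subtitles` are invented for this example, and the imports are done lazily inside the functions.

```python
def guess_video_info(filename):
    """Return what guessit can infer from a video file name
    (title, year, season, episode and so on) as a plain dict."""
    from guessit import guessit  # assumption: guessit is installed

    return dict(guessit(filename))


def fetch_subtitles(video_name, language_code="eng"):
    """Download the best matching subtitles for a video name.

    The subtitles are saved next to the video path, and the list of
    downloaded subtitles is returned.
    """
    from babelfish import Language
    from subliminal import (Video, download_best_subtitles, region,
                            save_subtitles)

    # subliminal needs a cache region; configure it once per process.
    region.configure("dogpile.cache.memory")

    video = Video.fromname(video_name)
    found = download_best_subtitles([video], {Language(language_code)})
    save_subtitles(video, found[video])
    return found[video]
```

subliminal relies on guessit internally to work out what the video is from its file name, which is why the two libraries are usually mentioned together.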

  24. How to put all the parts
    together?
    We did it (in 2014):
    TouchAndGo.
    pip install touchandgo
    https://github.com/touchandgo-devs/touchandgo

  25. Last but not least
    This talk and TouchAndGo are really good
    examples of the most important piece of Python:
    the community.
    60% of the work was already done ;-)

  26. Questions?
    [email protected]
    @gilgamezh
