Writing file system in CPython

Delimitry
August 16, 2018

A presentation from the PiterPy meetup about writing a simple file system, PEPFS, in CPython using FUSE (File System In Userspace).

Transcript

  1. Writing file system
    in CPython
    Dmitry Alimov
    2018

  2. File System
    Diagram of the storage stack layers, top to bottom:
    User Space: Application
    Kernel Space: Syscall Interface
    Virtual File System (VFS)
    File systems: ext2, ext3, ext4, btrfs; NFS, smbfs; FUSE; procfs, tmpfs
    Page Cache, Direct I/O
    Block Layer
    Device Driver
    Hardware: Device (HDD, SSD, CD-ROM, NIC, etc.)

  3. FUSE (File System In Userspace)
    Available for Linux, FreeBSD, OpenBSD, NetBSD (as puffs), OpenSolaris, Minix 3, Android and macOS [2]
    Part of the Linux kernel since version 2.6.14
    Windows compatibility is provided by libraries and ports [3, 4, 5]

  4. FUSE
    Diagram showing how FUSE works [2]

  5. Example uses
    GlusterFS: Clustered Distributed Filesystem
    GmailFS: Filesystem which stores data as mail in Gmail
    SSHFS: Provides access to a remote filesystem through SSH
    WikipediaFS: View and edit Wikipedia articles as if they were real files
    πfs: A file system that stores all files in the digits of Pi

  6. libfuse API
    struct fuse_operations {
        int (*getattr) (const char *path, struct stat *stbuf);
        ...
        int (*readdir) (const char *path, void *buf,
                        fuse_fill_dir_t filler, off_t offset,
                        struct fuse_file_info *fi);
        ...
        int (*read) (const char *path, char *buf, size_t size,
                     off_t offset, struct fuse_file_info *fi);
        ...
    };

  7. Python interface to FUSE
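    fusepy module [7], a simple interface to FUSE and MacFUSE:
        import errno
        from stat import S_IFDIR
        from fuse import FuseOSError
        # Methods of a fuse.Operations subclass:
        def getattr(self, path, fh=None):
            # Only the root directory exists in this example
            if path != '/':
                raise FuseOSError(errno.ENOENT)
            return {'st_mode': (S_IFDIR | 0o755), 'st_nlink': 2}
        def readdir(self, path, fh=None):
            return ['.', '..']
        def read(self, path, size, offset, fh=None):
            # Return at most `size` bytes starting at `offset`
            return self.data[path][offset:offset + size]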
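
The three methods above can be combined into one small class. The following is a minimal sketch, not PEPFS itself: the class name HelloFS, the in-memory `files` mapping and the fallback stand-ins for FuseOSError and Operations are assumptions made so that the example also runs where fusepy or libfuse is not installed.

```python
import errno
import os
import stat

try:
    from fuse import FuseOSError, Operations  # fusepy [7]
except (ImportError, OSError):
    # Assumed stand-ins so the sketch runs without fusepy/libfuse.
    class FuseOSError(OSError):
        def __init__(self, code):
            super().__init__(code, os.strerror(code))

    Operations = object


class HelloFS(Operations):
    """Hypothetical read-only file system with a flat root directory."""

    def __init__(self, files):
        # Maps absolute paths such as '/hello.txt' to bytes.
        self.data = dict(files)

    def getattr(self, path, fh=None):
        if path == '/':
            return {'st_mode': stat.S_IFDIR | 0o755, 'st_nlink': 2}
        if path in self.data:
            return {'st_mode': stat.S_IFREG | 0o444, 'st_nlink': 1,
                    'st_size': len(self.data[path])}
        raise FuseOSError(errno.ENOENT)

    def readdir(self, path, fh=None):
        # Directory entries are names without the leading slash.
        return ['.', '..'] + [name.lstrip('/') for name in self.data]

    def read(self, path, size, offset, fh=None):
        # Return at most `size` bytes starting at `offset`.
        return self.data[path][offset:offset + size]


fs = HelloFS({'/hello.txt': b'Hello, FUSE!\n'})
print(fs.readdir('/'))              # ['.', '..', 'hello.txt']
print(fs.read('/hello.txt', 5, 0))  # b'Hello'
```

With fusepy installed, an instance of such a class would be passed to fuse.FUSE together with a mount point; here the methods are called directly, so nothing is mounted.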

  8. PEPFS
    Read-only file system with PEPs as the files [8]
    Uses the GitHub repository with PEPs to get the current PEPs [9]
    Implemented in Python using the fusepy module
    Lazy reading of PEP files (a specific PEP is downloaded on demand)
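
Lazy reading can be sketched as a cache that fetches a file's content only on first access. This is an illustration under stated assumptions, not the code of PEPFS: the class LazyPEPStore and the injected `fetch` callable are hypothetical, and a fake fetcher replaces the real download so the example runs offline.

```python
class LazyPEPStore:
    """Hypothetical cache that loads a PEP's text on first read."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable: PEP number -> bytes (assumed)
        self._cache = {}

    def read(self, number, size, offset):
        # Fetch the content only the first time this PEP is read.
        if number not in self._cache:
            self._cache[number] = self._fetch(number)
        return self._cache[number][offset:offset + size]


calls = []


def fake_fetch(number):
    # Stand-in for a download; records which PEPs were requested.
    calls.append(number)
    return ('PEP: %d\n' % number).encode()


store = LazyPEPStore(fake_fetch)
store.read(8, 3, 0)
store.read(8, 3, 5)
print(calls)  # [8]
```

The second read is served from the cache, so the fetcher is called once per PEP no matter how many times the file is read.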

  9. PEPFS mount
    $ ./pepfs.py /tmp/pepfs/
    $ mount
    ...
    PEPFS on /tmp/pepfs type fuse (rw,nosuid,nodev,relatime,user_id=1000,group_id=1000)
    ...

  10. PEPFS example
    $ ls -la /tmp/pepfs/
    total 4
    drwxr-xr-x 2 root root 6943251 Jul 25 00:07 .
    drwxrwxrwt 35 root root 4096 Jul 25 00:07 ..
    -rw-r--r-- 1 root root 29582 Jul 25 00:07 pep-0001.txt
    -rw-r--r-- 1 root root 8214 Jul 25 00:07 pep-0002.txt
    -rw-r--r-- 1 root root 2229 Jul 25 00:07 pep-0003.txt
    ...
    -rw-r--r-- 1 root root 81947 Jul 25 00:07 pep-3333.txt

  11. Page Cache
    To enable the page cache, set the keep_cache flag in the open() method:
        def open(self, path, flags):
            # With raw_fi=True, `flags` is the raw file info structure
            flags.keep_cache = 1
            return 0
    Also set raw_fi to True in FUSE(PEPFS(), ..., raw_fi=True)
    NB: The cache is invalidated and the data is updated only when the file size changes

  12. Questions
    https://t.me/spbpython
    https://t.me/piterpy_meetup

  13. References:
    1. https://en.wikibooks.org/wiki/The_Linux_Kernel/Storage
    2. https://en.wikipedia.org/wiki/Filesystem_in_Userspace
    3. https://en.wikipedia.org/wiki/Dokan_Library
    4. https://github.com/crossmeta/cxfuse
    5. http://www.secfs.net/winfsp/
    6. https://github.com/libfuse/libfuse
    7. https://github.com/fusepy/fusepy
    8. https://github.com/delimitry/pepfs
    9. https://github.com/python/peps/
