Writing file system in CPython

Delimitry
August 16, 2018

Presentation from the PiterPy meetup on writing PEPFS, a simple file system, in CPython using FUSE (Filesystem in Userspace).

Transcript

  1. Writing file system in CPython. Dmitry Alimov, 2018

  2. File System (layer diagram)
    User Space: Application
    Kernel Space: Syscall Interface; Virtual File System (VFS); file systems (ext2, ext3, ext4, btrfs; NFS, smbfs; FUSE; procfs, tmpfs); Page Cache; Direct I/O; Block Layer; Device Driver
    Hardware: Device (HDD, SSD, CD-ROM, NIC, etc.)
  3. FUSE (Filesystem in Userspace)
    Available for Linux, FreeBSD, OpenBSD, NetBSD (as puffs), OpenSolaris, Minix 3, Android and macOS [2]
    In the Linux kernel since version 2.6.14
    Windows compatibility is provided by libraries and ports [3, 4, 5]
  4. FUSE Diagram showing how FUSE works [2]

  5. Example uses
    GlusterFS: Clustered Distributed Filesystem
    GmailFS: Filesystem which stores data as mail in Gmail
    SSHFS: Provides access to a remote filesystem through SSH
    WikipediaFS: View and edit Wikipedia articles as if they were real files
    πfs: A file system that stores all files in the digits of Pi
  6. libfuse API
        struct fuse_operations {
            int (*getattr) (const char *path, struct stat *stbuf);
            ...
            int (*readdir) (const char *path, void *buf, fuse_fill_dir_t filler,
                            off_t offset, struct fuse_file_info *fi);
            ...
            int (*read) (const char *path, char *buf, size_t size, off_t offset,
                         struct fuse_file_info *fi);
            ...
        };
  7. Python interface to FUSE
    fusepy module [7], a simple interface to FUSE and MacFUSE:
        def getattr(self, path, fh=None):
            if path != '/':
                raise FuseOSError(errno.ENOENT)
            return {'st_mode': (S_IFDIR | 0o755), 'st_nlink': 2}

        def readdir(self, path, fh=None):
            return ['.', '..']

        def read(self, path, size, offset, fh=None):
            return self.data[path][offset:offset + size]
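
The three handlers on the slide can be put together into one small class. The following is a minimal sketch, not the code of PEPFS itself: `MemoryFS` and its sample file are hypothetical, and the class deliberately avoids importing fusepy so that it runs anywhere. With fusepy installed, it would presumably subclass `fuse.Operations` and raise `fuse.FuseOSError` where this sketch raises a plain `OSError`.

```python
import errno
import os
from stat import S_IFDIR, S_IFREG


class MemoryFS:
    """In-memory, read-only tree with fusepy-shaped handlers.

    Assumption: with fusepy this would subclass fuse.Operations and
    raise fuse.FuseOSError instead of the plain OSError used here.
    """

    def __init__(self, files):
        # files maps a file name to its bytes content (hypothetical data);
        # paths are stored with a leading '/' as FUSE passes them.
        self.data = {'/' + name: content for name, content in files.items()}

    def getattr(self, path, fh=None):
        # The root is the only directory; every stored path is a regular file.
        if path == '/':
            return {'st_mode': S_IFDIR | 0o755, 'st_nlink': 2}
        if path in self.data:
            return {'st_mode': S_IFREG | 0o444, 'st_nlink': 1,
                    'st_size': len(self.data[path])}
        raise OSError(errno.ENOENT, os.strerror(errno.ENOENT))

    def readdir(self, path, fh=None):
        # Directory entries are names without the leading '/'.
        return ['.', '..'] + [name[1:] for name in sorted(self.data)]

    def read(self, path, size, offset, fh=None):
        # Return at most `size` bytes starting at `offset`.
        return self.data[path][offset:offset + size]


fs = MemoryFS({'hello.txt': b'Hello, FUSE!\n'})
print(fs.readdir('/'))
print(fs.read('/hello.txt', 5, 0))
```

Once such a class subclasses `Operations`, it could be mounted along the lines of `FUSE(MemoryFS(...), mountpoint, foreground=True)`, as in the fusepy examples.
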
  8. PEPFS
    Read-only file system with PEPs as the files [8]
    Uses the GitHub repository with PEPs to get the current PEPs [9]
    Implemented in Python and uses the fusepy module
    Lazy reading of PEP files (a specific PEP is downloaded on demand)
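
The lazy reading mentioned above can be sketched as a small cache that fetches a PEP only the first time it is read. This is an illustration under assumptions rather than the PEPFS source: `LazyPEPStore` is a hypothetical name, and the download step is replaced by an injected `fetch` callable (here a fake one that records its calls), so the example needs no network access.

```python
class LazyPEPStore:
    """Caches PEP texts, fetching each one only on first access."""

    def __init__(self, fetch):
        # fetch: hypothetical callable that takes a PEP number and returns
        # its text as bytes; a real one would download from the repository.
        self.fetch = fetch
        self.cache = {}

    def read(self, number, size, offset):
        # Download on demand, then serve every later read from the cache.
        if number not in self.cache:
            self.cache[number] = self.fetch(number)
        return self.cache[number][offset:offset + size]


calls = []


def fake_fetch(number):
    # Stand-in for the download step; records which PEPs were requested.
    calls.append(number)
    return b'PEP: %d\nTitle: placeholder\n' % number


store = LazyPEPStore(fake_fetch)
first = store.read(8, 6, 0)
again = store.read(8, 6, 0)
print(first, again, calls)
```

Listing a directory does not have to trigger any download in this design; only `read` does, and the second read of the same PEP is answered from memory.
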
  9. PEPFS mount
        $ ./pepfs.py /tmp/pepfs/
        $ mount
        ...
        PEPFS on /tmp/pepfs type fuse (rw,nosuid,nodev,relatime,user_id=1000,group_id=1000)
        ...
  10. PEPFS example
        $ ls -la /tmp/pepfs/
        total 4
        drwxr-xr-x  2 root root 6943251 Jul 25 00:07 .
        drwxrwxrwt 35 root root    4096 Jul 25 00:07 ..
        -rw-r--r--  1 root root   29582 Jul 25 00:07 pep-0001.txt
        -rw-r--r--  1 root root    8214 Jul 25 00:07 pep-0002.txt
        -rw-r--r--  1 root root    2229 Jul 25 00:07 pep-0003.txt
        ...
        -rw-r--r--  1 root root   81947 Jul 25 00:07 pep-3333.txt
  11. Page Cache
    To enable the page cache, set the keep_cache flag in the open() method:
        def open(self, path, flags):
            flags.keep_cache = 1
            return 0
    Also set raw_fi to True: FUSE(PEPFS(), ..., raw_fi=True)
    NB: The cache is invalidated and the data is updated only if the file size changes
  12. Questions https://t.me/spbpython https://t.me/piterpy_meetup

  13. References:
    1. https://en.wikibooks.org/wiki/The_Linux_Kernel/Storage
    2. https://en.wikipedia.org/wiki/Filesystem_in_Userspace
    3. https://en.wikipedia.org/wiki/Dokan_Library
    4. https://github.com/crossmeta/cxfuse
    5. http://www.secfs.net/winfsp/
    6. https://github.com/libfuse/libfuse
    7. https://github.com/fusepy/fusepy
    8. https://github.com/delimitry/pepfs
    9. https://github.com/python/peps/