Slide 1

Slide 1 text

importlib.resources
If you can import it, you can read it*
PyCon 2018, Cleveland, Ohio, May 2018
Barry Warsaw, Python Foundation @ LinkedIn

Slide 2

Slide 2 text

My code needs some static files. How hard can it be to read them at run time?

Slide 3

Slide 3 text

Types of static files
- Templates
- Sample data
- Certificates
- gettext translation catalogs

Slide 4

Slide 4 text

File system layout
    thepkg/
        __init__.py
        a.py
        b.py
        data/
            sample.dat

Slide 5

Slide 5 text

Naive approach
    import thepkg
    from pathlib import Path

    pkg = Path(thepkg.__file__).parent
    path = pkg / 'data' / 'sample.dat'
    with open(path, 'rb') as fp:
        contents = fp.read()

Slide 6

Slide 6 text

Done! Right?
What's the problem?

Slide 7

Slide 7 text

Things get complicated
    thepkg/
        __init__.py
        a.py
        b.py
        data/
            sample.dat

Slide 8

Slide 8 text

Zip files and zipapps
    pkg = Path(thepkg.__file__).parent
    path = pkg / 'data' / 'sample.dat'
    with open(path, 'rb') as fp:
        contents = fp.read()

    Traceback (most recent call last):
      File "run.py", line 7, in <module>
        with open(path, 'rb') as fp:
    NotADirectoryError: [Errno 20] Not a directory: '.../thepkg.zip/thepkg/data/sample.dat'
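The failure on this slide can be reproduced with a small sketch. It builds the same layout twice in a temporary directory, once as a plain directory and once inside a zip file, then applies the naive `__file__`-based read to both. The package names `naive_fs_pkg` and `naive_zip_pkg` and the helper `naive_read` are hypothetical, chosen only for this illustration.

```python
import importlib
import sys
import tempfile
import zipfile
from pathlib import Path

# Hypothetical package names, chosen so the demo cannot clash with real packages.
FS_PKG = "naive_fs_pkg"
ZIP_PKG = "naive_zip_pkg"
PAYLOAD = b"sample bytes\n"


def naive_read(package_name):
    """The naive approach: derive a path from __file__ and open() it."""
    module = importlib.import_module(package_name)
    path = Path(module.__file__).parent / "data" / "sample.dat"
    with open(path, "rb") as fp:
        return fp.read()


# The temporary directory is left in place so the packages stay importable.
workdir = Path(tempfile.mkdtemp())

# Case 1: the package is a real directory on disk.
pkg_dir = workdir / "src" / FS_PKG
(pkg_dir / "data").mkdir(parents=True)
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "data" / "sample.dat").write_bytes(PAYLOAD)
sys.path.insert(0, str(workdir / "src"))

# Case 2: the same layout, stored inside a zip file on sys.path.
zip_path = workdir / "bundle.zip"
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr(f"{ZIP_PKG}/__init__.py", "")
    zf.writestr(f"{ZIP_PKG}/data/sample.dat", PAYLOAD)
sys.path.insert(0, str(zip_path))
importlib.invalidate_caches()

# Works: the path points at a real file.
fs_contents = naive_read(FS_PKG)

# Fails: the "path" runs through the zip file, which open() cannot traverse.
try:
    naive_read(ZIP_PKG)
    zip_error = None
except OSError as exc:
    zip_error = exc

print(fs_contents)
print(type(zip_error).__name__)
```

The exact exception class depends on the platform, so the sketch catches the common base class `OSError` rather than `NotADirectoryError`.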

Slide 9

Slide 9 text

pkg_resources
Basic Resource Access
    from pkg_resources import \
        resource_string as resource_bytes
    contents = resource_bytes(
        'thepkg', 'data/sample.dat')
Works for both file system paths and zip file paths

Slide 10

Slide 10 text

Done! Right?
What's the problem?

Slide 11

Slide 11 text

pkg_resources
- has import-time side-effects
- is slow
- tries to do too much
- has funky APIs
- is everywhere
- still supports Python 2

Slide 12

Slide 12 text

We can do better!
Because we have Python's import machinery to help us

Slide 13

Slide 13 text

importlib.resources
    from importlib.resources import read_binary
    contents = read_binary(
        'thepkg.data', 'sample.dat')

    import thepkg.data
    contents = read_binary(
        thepkg.data, 'sample.dat')

Slide 14

Slide 14 text

File system layout
    thepkg/
        __init__.py
        a.py
        b.py
        data/
            sample.dat

Slide 15

Slide 15 text

File system layout
    thepkg/
        __init__.py
        a.py
        b.py
        data/
            __init__.py
            sample.dat
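With `data/` turned into a package by its own `__init__.py`, the same `read_binary()` call should work whether the package lives in a directory or in a zip file. The sketch below checks that under stated assumptions: the package names `rb_fs_pkg` and `rb_zip_pkg` are hypothetical, and the warning filter is there only because some Python releases flag these functions as deprecated.

```python
import importlib
import sys
import tempfile
import warnings
import zipfile
from importlib.resources import read_binary
from pathlib import Path

PAYLOAD = b"sample bytes\n"
workdir = Path(tempfile.mkdtemp())

# Directory-based package (hypothetical name); data/ is itself a package.
data_dir = workdir / "src" / "rb_fs_pkg" / "data"
data_dir.mkdir(parents=True)
(data_dir.parent / "__init__.py").write_text("")
(data_dir / "__init__.py").write_text("")
(data_dir / "sample.dat").write_bytes(PAYLOAD)

# Zip-based package (hypothetical name) with the same layout.
zip_path = workdir / "bundle.zip"
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr("rb_zip_pkg/__init__.py", "")
    zf.writestr("rb_zip_pkg/data/__init__.py", "")
    zf.writestr("rb_zip_pkg/data/sample.dat", PAYLOAD)

sys.path[:0] = [str(workdir / "src"), str(zip_path)]
importlib.invalidate_caches()

with warnings.catch_warnings():
    # Some Python releases emit a DeprecationWarning for this call.
    warnings.simplefilter("ignore", DeprecationWarning)
    from_fs = read_binary("rb_fs_pkg.data", "sample.dat")
    from_zip = read_binary("rb_zip_pkg.data", "sample.dat")

# The caller never needs to know which kind of package it is reading from.
print(from_fs == from_zip)
```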

Slide 16

Slide 16 text

Terminology
Access a "resource" in a "package"
Q: What's a "package"?
A: Any importable module with a __path__ attribute
   E.g. a directory containing an __init__.py
   Namespace packages cannot contain resources
Q: What's a "resource"?
A: Any readable object contained in a package
   E.g. a file inside a package
   Subdirectories/subpackages are not resources!

Slide 17

Slide 17 text

Packages and resources
    thepkg/
        __init__.py
        a.py
        b.py
        data/
            __init__.py
            sample.dat
Package: thepkg

Slide 18

Slide 18 text

Packages and resources
    thepkg/
        __init__.py
        a.py
        b.py
        data/
            __init__.py
            sample.dat
Package: thepkg.data

Slide 19

Slide 19 text

importlib.resources API
Types
    Package = Union[str, ModuleType]
    Resource = Union[str, os.PathLike]

Slide 20

Slide 20 text

importlib.resources API
Get the contents of a resource
    read_binary(
        package: Package,
        resource: Resource) -> bytes

    read_text(
        package: Package,
        resource: Resource,
        encoding: str = 'utf-8',
        errors: str = 'strict') -> str
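As an illustration of `read_text()`, here is a minimal sketch that loads a template (one of the static file types listed earlier) and fills it in. The package name `tmpl_pkg` and the file `greeting.txt` are hypothetical and created on the fly; the default `encoding='utf-8'` from the signature above is relied on.

```python
import importlib
import sys
import tempfile
import warnings
from importlib.resources import read_text
from pathlib import Path
from string import Template

# Build a throwaway package (hypothetical name) holding one text resource.
pkg_dir = Path(tempfile.mkdtemp()) / "tmpl_pkg"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "greeting.txt").write_text("Hello, $name!\n", encoding="utf-8")
sys.path.insert(0, str(pkg_dir.parent))
importlib.invalidate_caches()

with warnings.catch_warnings():
    # Some Python releases emit a DeprecationWarning for this call.
    warnings.simplefilter("ignore", DeprecationWarning)
    # Decoded with the default encoding, utf-8.
    template_source = read_text("tmpl_pkg", "greeting.txt")

message = Template(template_source).substitute(name="PyCon")
print(message)
```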

Slide 21

Slide 21 text

importlib.resources API
Get a file-like object open for reading
    open_text(
        package: Package,
        resource: Resource,
        encoding: str = 'utf-8',
        errors: str = 'strict') -> TextIO

    open_binary(
        package: Package,
        resource: Resource) -> BinaryIO
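The file-like variants are handy when another library wants a stream rather than a string. A minimal sketch, assuming a hypothetical package `csv_pkg` with a resource `sample.csv`, hands the stream from `open_text()` straight to `csv.DictReader` and peeks at the raw bytes with `open_binary()`.

```python
import csv
import importlib
import sys
import tempfile
import warnings
from importlib.resources import open_binary, open_text
from pathlib import Path

# Build a throwaway package (hypothetical name) holding one CSV resource.
pkg_dir = Path(tempfile.mkdtemp()) / "csv_pkg"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "sample.csv").write_text(
    "city,state\nCleveland,Ohio\n", encoding="utf-8")
sys.path.insert(0, str(pkg_dir.parent))
importlib.invalidate_caches()

with warnings.catch_warnings():
    # Some Python releases emit a DeprecationWarning for these calls.
    warnings.simplefilter("ignore", DeprecationWarning)

    # Text stream: pass it to anything that expects an open text file.
    with open_text("csv_pkg", "sample.csv") as fp:
        rows = [dict(row) for row in csv.DictReader(fp)]

    # Binary stream: read only as much as is needed.
    with open_binary("csv_pkg", "sample.csv") as fb:
        first_bytes = fb.read(4)

print(rows)
print(first_bytes)
```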

Slide 22

Slide 22 text

importlib.resources API
Get a concrete file system path
    path(
        package: Package,
        resource: Resource) -> Iterator[Path]

    with path(
            thepkg, 'foo.cpython-37m-darwin.so'
            ) as lib:
        import_shared_library(lib)
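`path()` matters most when the resource does not live on the file system at all. The sketch below, using a hypothetical zipped package `path_zip_pkg`, shows that inside the `with` block the caller still gets a real `pathlib.Path` it can hand to code that insists on a file name.

```python
import importlib
import sys
import tempfile
import warnings
import zipfile
from importlib.resources import path
from pathlib import Path

PAYLOAD = b"sample bytes\n"

# A zipped package (hypothetical name) with one resource.
zip_path = Path(tempfile.mkdtemp()) / "bundle.zip"
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr("path_zip_pkg/__init__.py", "")
    zf.writestr("path_zip_pkg/sample.dat", PAYLOAD)
sys.path.insert(0, str(zip_path))
importlib.invalidate_caches()

with warnings.catch_warnings():
    # Some Python releases emit a DeprecationWarning for this call.
    warnings.simplefilter("ignore", DeprecationWarning)
    with path("path_zip_pkg", "sample.dat") as concrete:
        # Inside the block, `concrete` names a real file on disk.
        is_path_object = isinstance(concrete, Path)
        existed_inside = concrete.exists()
        data = concrete.read_bytes()

print(is_path_object, existed_inside, data)
```

The path should be treated as valid only inside the `with` block, since a resource extracted from a zip file may be backed by a temporary file.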

Slide 23

Slide 23 text

importlib.resources API
List what's in a package*
    contents(
        package: Package) -> Iterable[str]
* Items are not guaranteed to be resources!
    >>> print(sorted(contents(
    ...     'thepkg.data')))
    ['__init__.py', '__pycache__', 'sample.dat']

Slide 24

Slide 24 text

importlib.resources API
Is a thing a resource?
    is_resource(
        package: Package,
        name: str) -> bool
* Use this with contents() to iterate over resources in a package
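Combining the two calls might look like the sketch below. The package `list_pkg` and its subpackage `sub` are hypothetical; the point is that `contents()` returns every name, while `is_resource()` filters out directories and subpackages. Note that `__init__.py` is a file, so it counts as a resource too.

```python
import importlib
import sys
import tempfile
import warnings
from importlib.resources import contents, is_resource
from pathlib import Path

# A throwaway package (hypothetical name) with one data file and a subpackage.
pkg_dir = Path(tempfile.mkdtemp()) / "list_pkg"
(pkg_dir / "sub").mkdir(parents=True)
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "sub" / "__init__.py").write_text("")
(pkg_dir / "sample.dat").write_bytes(b"sample bytes\n")
sys.path.insert(0, str(pkg_dir.parent))
importlib.invalidate_caches()

with warnings.catch_warnings():
    # Some Python releases emit a DeprecationWarning for these calls.
    warnings.simplefilter("ignore", DeprecationWarning)
    names = sorted(contents("list_pkg"))
    # Keep only the names that are actual resources.
    resources = [name for name in names if is_resource("list_pkg", name)]

print(names)
print(resources)
```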

Slide 25

Slide 25 text

API for loaders
- Low level API for custom loaders
- Built-in support for file system and zips
    loader.get_resource_reader(
        package_name: str
        ) -> importlib.abc.ResourceReader

Slide 26

Slide 26 text

importlib.abc.ResourceReader
    open_resource(resource: str) -> BytesIO
    resource_path(resource: str) -> str
    is_resource(name: str) -> bool
    contents() -> Iterable[str]
- FileNotFoundError raised when resource doesn't exist
- resource_path() requires a concrete file system path
- contents() can return non-resources
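To make the four methods concrete, here is a minimal sketch of a reader that serves resources from an in-memory dict. The class name `DictResourceReader` is hypothetical, and the sketch exercises the reader directly rather than wiring it into a loader's `get_resource_reader()`. The import fallback is an assumption about where the abstract base class lives in different Python releases.

```python
import io

try:
    # Newer Python releases expose the ABC here.
    from importlib.resources.abc import ResourceReader
except ImportError:
    # Older releases only have the location named on the slide.
    from importlib.abc import ResourceReader


class DictResourceReader(ResourceReader):
    """Hypothetical reader that serves resources from a name -> bytes dict."""

    def __init__(self, resources):
        self._resources = dict(resources)

    def open_resource(self, resource):
        # Return a binary file-like object, or fail like a missing file would.
        try:
            return io.BytesIO(self._resources[resource])
        except KeyError:
            raise FileNotFoundError(resource) from None

    def resource_path(self, resource):
        # Nothing here exists on the file system, so there is never a path.
        raise FileNotFoundError(resource)

    def is_resource(self, name):
        # Simplification: unknown names are reported as "not a resource".
        return name in self._resources

    def contents(self):
        return iter(self._resources)


reader = DictResourceReader({"sample.dat": b"sample bytes\n"})
data = reader.open_resource("sample.dat").read()
listing = list(reader.contents())

try:
    reader.resource_path("sample.dat")
    has_concrete_path = True
except FileNotFoundError:
    has_concrete_path = False

print(data, listing, has_concrete_path)
```

Raising `FileNotFoundError` from `resource_path()` is how a reader signals that no concrete file exists for a resource.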

Slide 27

Slide 27 text

Performance
CLIs start up 25-50% faster
- importlib.resources
- shiv (new open source replacement for pex)
  http://shiv.readthedocs.io/en/latest/

Slide 28

Slide 28 text

importlib_resources
Backport of resource reading for Python 2.7, 3.4-3.6 (works as a shim for 3.7)
importlib-resources.rtfd.org
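One common way to support both the standard library module and the backport is a guarded import. This is a sketch of that pattern, not something shown on the slide: it prefers `importlib.resources` and falls back to the `importlib_resources` package only when the standard library module is missing.

```python
try:
    # Python 3.7 and later: use the standard library.
    from importlib.resources import read_binary
except ImportError:
    # Older interpreters: requires the importlib_resources backport
    # to be installed separately.
    from importlib_resources import read_binary

# Either way, the rest of the code uses the same name.
print(read_binary.__module__)
```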

Slide 29

Slide 29 text

Give it up for Brett Cannon
First of hopefully many great collaborations between the LinkedIn and Microsoft Python teams

Slide 30

Slide 30 text

Barry Warsaw
barry@python.org
bwarsaw@linkedin.com
@pumpichank
github.com/warsaw
gitlab.com/warsaw
importlib-resources.rtfd.org