How Import Does Its Thing

PyCon 2008


Brett Cannon

March 14, 2008


  1. HOW IMPORT DOES ITS THING: Brett Cannon, PyCon 2008

  2. About This Talk

    The goal is for you to understand how import finds a module to import. How to go from source code to a module object is beyond the scope of this talk. I will cover the (planned) Python 3.0 semantics.
  3. from __future__ import absolute_import

    Allows leading dots to specify where to look for a module relative to your current location in a package. A relative import cannot escape out of its package. No dots means a top-level import (e.g., one found directly through sys.path).
  4. Examples

    Assume a foo/bar/spam/bruce directory hierarchy, with the import happening from foo/bar/spam. from . import bruce: current level. from .. import spam: parent level.
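The two examples above can be checked without creating any directories. The sketch below is not from the talk: it assumes a modern Python 3 and uses `importlib.util.resolve_name` from the standard library, with the caller's package hard-coded as `foo.bar.spam`. The helper names `resolve` and `escapes_package` are made up for illustration.

```python
import importlib.util

# The importing code is assumed to live in the package foo.bar.spam.
CALLER_PACKAGE = "foo.bar.spam"


def resolve(relative_name, package=CALLER_PACKAGE):
    """Turn a dotted relative name into an absolute module name."""
    return importlib.util.resolve_name(relative_name, package)


# What `from . import bruce` asks for: a name at the current level.
current_level = resolve(".bruce")

# What `from .. import spam` asks for: a name at the parent level.
parent_level = resolve("..spam")


def escapes_package(relative_name, package=CALLER_PACKAGE):
    """Return True if the relative name climbs above the top-level package."""
    try:
        resolve(relative_name, package)
    except (ImportError, ValueError):
        # Older Python 3 releases raise ValueError, newer ones ImportError.
        return True
    return False
```

Four dots from inside a three-level package would climb past `foo`, which is the "cannot escape out of a package" rule from the previous slide.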
  5. STEP 1

  6. Statement to Function Call

    The import statement is syntactic sugar. It calls the built-in function __import__(), which you can override if so desired.
  7. __import__(name, globals={}, locals={}, fromlist=[], level=0)

  8. __import__

    name: Name of the module being requested. Could be relative. globals: The caller’s (i.e., the module requesting the import) globals. locals: The caller’s locals.
  9. fromlist: The fromlist in "from module import fromlist". level: Number of dots in an import statement.
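A small sketch of the statement-to-function-call mapping, assuming a current CPython 3. It calls `__import__` directly and then temporarily overrides it to record what an import statement passes in. The wrapper `recording_import` and the list `requested` are made-up names. The fact that a dotted import with an empty fromlist returns the top-level package is standard behaviour, although the slides do not spell it out.

```python
import builtins

# `import json.decoder` binds the top-level package, so an empty fromlist
# makes __import__ return `json` ...
top_level = __import__("json.decoder", globals(), locals(), [], 0)

# ... whereas `from json.decoder import JSONDecoder` needs the submodule,
# so a non-empty fromlist makes __import__ return `json.decoder`.
submodule = __import__("json.decoder", globals(), locals(), ["JSONDecoder"], 0)

# Overriding __import__: record each request, then defer to the original.
requested = []
original_import = builtins.__import__


def recording_import(name, globals=None, locals=None, fromlist=(), level=0):
    requested.append((name, tuple(fromlist or ()), level))
    return original_import(name, globals, locals, fromlist, level)


builtins.__import__ = recording_import
try:
    from json import decoder  # goes through recording_import
finally:
    builtins.__import__ = original_import  # always restore the built-in
```

After this runs, the first entry in `requested` shows the name, fromlist and level that the `from json import decoder` statement supplied.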
  10. STEP 2

  11. Import Lock

    Acquired before each import. Prevents parallel imports of the same module from different threads, since module state is shared between interpreters.
  12. STEP 3

  13. Name Resolution

    We want to resolve relative import names to absolute ones, based on name, globals and level. We care whether the caller is a package’s __init__ or not.
  14. Example to Use

    The caller is foo.bar, which is a package (foo/bar/__init__.py). The statement is from ..ni import shrubbery, so level == 2 and name == ni. We want to end up with foo.ni.
  15. Flowchart: name, globals, level

    Inputs are name, globals and level. Set caller's path = globals['__path__'] and caller's name = globals['__name__']. If caller's path is set (True), level -= 1. Set base = caller's name trimmed from the right by level number of dots. If name is given (True), result = base + '.' + name; otherwise (False), result = base. Return result.

    Worked example: level = 2, caller’s path = [‘/foo/bar’], caller’s name = ‘foo.bar’. The caller has a path, so level = 1. Then base = ‘foo’, name = ‘ni’ and result = ‘foo.ni’.
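The name-resolution flowchart can be written as a short function. This is a sketch, not the talk's code. It assumes the caller's globals are passed as a plain dict, uses `.get()` for `__path__` because only packages define it, and adds two guards the flowchart does not show: an early return for level 0 (an absolute import) and an error when the dots climb above the top-level package.

```python
def resolve_name(name, caller_globals, level):
    """Resolve a possibly relative module name to an absolute one."""
    if level == 0:
        # Assumption: no leading dots means the name is already absolute.
        return name

    caller_name = caller_globals["__name__"]
    # Only a package's __init__ has __path__ in its globals.
    caller_path = caller_globals.get("__path__")

    if caller_path is not None:
        # Inside a package's __init__, the first dot refers to the package
        # itself, so one level fewer has to be trimmed.
        level -= 1

    if caller_name.count(".") < level:
        # Assumption: this guard enforces "cannot escape out of a package".
        raise ValueError("relative import goes beyond the top-level package")

    # Trim `level` trailing components from the caller's name.
    base = caller_name.rsplit(".", level)[0]
    return f"{base}.{name}" if name else base


# The example from the slides: foo/bar/__init__.py runs
# `from ..ni import shrubbery`.
example = resolve_name(
    "ni", {"__name__": "foo.bar", "__path__": ["/foo/bar"]}, 2
)
```

With the inputs from the slides, `example` comes out as `foo.ni`.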
  16. STEP 4

  17. sys.modules

    If the module is in sys.modules, use it and be done searching for the module.
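One way to see the sys.modules shortcut is to plant a module object in the cache under a name that has no file behind it. The sketch below assumes a current Python 3. The module name `hypothetical_cached_module` is invented for the example.

```python
import importlib
import sys
import types

# A made-up name: no file or directory with this name needs to exist.
MODULE_NAME = "hypothetical_cached_module"

fake = types.ModuleType(MODULE_NAME)
fake.answer = 42

sys.modules[MODULE_NAME] = fake
try:
    # The import succeeds because the cache is consulted before any
    # importer is asked to find the module.
    found = importlib.import_module(MODULE_NAME)
finally:
    # Leave the cache as it was.
    del sys.modules[MODULE_NAME]
```

`found` is the very object that was placed in the cache, not a copy.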
  18. STEP 5

  19. sys.meta_path

    Handles importing modules that have no “location”: built-in modules and frozen modules. It is a sequence of objects called “importers”.
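  20. Flowchart: sys.meta_path

    Start. for importer in sys.meta_path: loader = importer.find_module(name, path). If a loader is returned (True), return loader.load_module(name). Otherwise (False), try the next importer; when none is left, continue (...) with the next step.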
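A sketch of an importer on sys.meta_path for a module that has no location on disk. Note the assumption: the slides describe the PEP 302 methods `find_module()` and `load_module()`, while current Python 3 releases use `find_spec()` and `exec_module()` instead, so the sketch uses the newer names. The class `InMemoryImporter` and the module name `hypothetical_shrubbery` are invented for illustration.

```python
import importlib
import importlib.abc
import importlib.util
import sys


class InMemoryImporter(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Finds and loads modules whose source is held in a dict."""

    def __init__(self, sources):
        self.sources = sources  # maps module name -> source text

    def find_spec(self, fullname, path=None, target=None):
        # Returning None tells the import machinery to try the next importer.
        if fullname not in self.sources:
            return None
        return importlib.util.spec_from_loader(fullname, self)

    def create_module(self, spec):
        return None  # let the machinery create a default module object

    def exec_module(self, module):
        exec(self.sources[module.__name__], module.__dict__)


importer = InMemoryImporter({"hypothetical_shrubbery": "height = 3\n"})
sys.meta_path.insert(0, importer)
try:
    shrubbery = importlib.import_module("hypothetical_shrubbery")
finally:
    # Undo the changes to interpreter-wide state.
    sys.meta_path.remove(importer)
    sys.modules.pop("hypothetical_shrubbery", None)
```

The importer is asked first because it sits at the front of sys.meta_path; for every other name it returns None and the usual importers take over.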
  21. STEP 6

  22. sys.path

    Sequence of objects representing “locations”. Pretty much always contains strings. Similar to sys.meta_path in terms of importers/loaders, but has a caching mechanism for importers. For backwards-compatibility, entries on sys.path are not replaced with their representative importer. sys.path is not searched if the parent module defines __path__; __path__ is used instead.
  23. Flowchart: choosing the search path

    (...) Does the parent module have __path__? If True, search = parent's __path__. If False, search = sys.path. Continue with search.
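The choice between the parent's `__path__` and sys.path can be sketched as below. The function name `search_path_for` is made up, and the sketch assumes the parent package has already been imported and therefore sits in sys.modules. `json` is used only because it is a package that ships with Python.

```python
import json  # any already-imported package works for the demonstration
import sys


def search_path_for(absolute_name):
    """Return the list of locations to search for `absolute_name`."""
    parent_name, _, _ = absolute_name.rpartition(".")
    if parent_name:
        # A submodule is looked up on its parent package's __path__.
        # (A parent that is not a package has no __path__, so this line
        # would raise AttributeError in that case.)
        return sys.modules[parent_name].__path__
    # A top-level module is looked up on sys.path.
    return sys.path


submodule_search = search_path_for("json.decoder")
top_level_search = search_path_for("json")
```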
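  24. Flowchart: searching

    for entry in search: if entry is in sys.path_importer_cache (True), importer = sys.path_importer_cache[entry]; otherwise (False), search the path hooks for an importer. If there is an importer (True), loader = importer.find_module(name). If there is a loader (True), return loader.load_module(name); otherwise move on to the next entry. When no entry is left, raise ImportError.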
  25. Flowchart: path hooks

    for hook in sys.path_hooks: importer = hook(entry). If the hook accepts the entry (True), sys.path_importer_cache[entry] = importer. If no hook accepts it (False), sys.path_importer_cache[entry] = dummy importer.
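The two flowcharts above (searching the entries, then falling back to the path hooks) can be sketched together. Assumptions: the code targets a current Python 3, so it calls `find_spec()` where the slides say `find_module()`, and it caches `None` where the slides cache a dummy importer. The names `importer_for`, `find_on_path` and `hypothetical_ni` are invented.

```python
import os
import sys
import tempfile


def importer_for(entry):
    """Return the importer for one path entry, consulting the cache first."""
    if entry in sys.path_importer_cache:
        return sys.path_importer_cache[entry]
    importer = None
    for hook in sys.path_hooks:
        try:
            importer = hook(entry)
            break  # the first hook that accepts the entry wins
        except ImportError:
            continue  # this hook cannot handle the entry
    # Cache the result even when no hook accepted the entry.
    sys.path_importer_cache[entry] = importer
    return importer


def find_on_path(name, search):
    """Search each entry in turn; raise ImportError if nothing is found."""
    for entry in search:
        importer = importer_for(entry)
        if importer is None:
            continue
        spec = importer.find_spec(name)
        if spec is not None:
            return spec
    raise ImportError(f"no module named {name!r}")


# Usage: look for a freshly written module in a temporary directory.
with tempfile.TemporaryDirectory() as directory:
    with open(os.path.join(directory, "hypothetical_ni.py"), "w") as handle:
        handle.write("shrubbery = 'a nice one'\n")
    spec = find_on_path("hypothetical_ni", [directory])
    found_file = os.path.basename(spec.origin)
    # Do not leave a cache entry for a directory that is about to vanish.
    sys.path_importer_cache.pop(directory, None)
```

The sketch stops at finding the module; loading it would be the loader's job.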

  27. Overview

    1. Function call. 2. Import lock. 3. Resolve name to be absolute. 4. sys.modules. 5. sys.meta_path. 6. Parent’s __path__ or sys.path.
  28. The Big Picture ... Literally

    __import__(name, globals, locals, fromlist, level). Set caller_name = globals['__name__'] and caller_path = globals['__path__']. If caller_path (True), level -= 1. base = caller_name.rsplit(".", level)[0]. If name (True), absolute_name = base + '.' + name; otherwise (False), absolute_name = base.

    If absolute_name in sys.modules (True), return sys.modules[absolute_name].

    for importer in sys.meta_path: loader = importer.find_module(absolute_name, caller_path). If loader (True), return loader.load_module(absolute_name).

    If caller_path (True), search = caller_path; otherwise (False), search = sys.path.

    for entry in search: if entry in sys.path_importer_cache (True), importer = sys.path_importer_cache[entry]. Otherwise (False), for hook in sys.path_hooks: importer = hook(entry); if no exception is raised, sys.path_importer_cache[entry] = importer; if every hook raises ImportError, sys.path_importer_cache[entry] = default_importer. Then loader = importer.find_module(absolute_name). If loader (True), return loader.load_module(absolute_name).

    If no entry produces a loader, raise ImportError.
  29. More Information

    PEP 302: http://www.python.org/dev/peps/pep-0302/
    importlib: http://svn.python.org/projects/sandbox/import_in_py (large flowchart in the docs directory)
    import.c: Python/import.c

  31. Importer

    1. Take the tail-end name of the module to be imported. 2. Join it with the directory being searched in. 3. Look for an __init__.py file. 4. Otherwise look for a file ending in .py, .pyc, .pyo, or .pyw.
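The four importer steps translate almost directly into a file lookup. This is a simplified sketch: the function name `find_module_file` is made up, the suffix list is copied from the slide rather than taken from the running interpreter, and real importers handle more cases (case sensitivity, extension modules, bytecode directories).

```python
import os

# Suffixes exactly as listed on the slide.
SUFFIXES = (".py", ".pyc", ".pyo", ".pyw")


def find_module_file(fullname, directory):
    """Return the file that would be loaded for `fullname`, or None."""
    # 1. Take the tail-end name of the module to be imported.
    tail = fullname.rpartition(".")[2]
    # 2. Join it with the directory being searched in.
    candidate = os.path.join(directory, tail)
    # 3. A directory containing __init__.py is a package.
    init_file = os.path.join(candidate, "__init__.py")
    if os.path.isfile(init_file):
        return init_file
    # 4. Otherwise look for a plain module file with a known suffix.
    for suffix in SUFFIXES:
        path = candidate + suffix
        if os.path.isfile(path):
            return path
    return None
```

For example, `find_module_file("foo.bar.ni", "/foo/bar")` would check `/foo/bar/ni/__init__.py` first and then `/foo/bar/ni.py`, `/foo/bar/ni.pyc`, and so on.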
  32. Loader

    1. Create a module object and set __name__, __file__, __loader__ (and __path__ if needed). 2. Get data from the file found by the importer. 3. Create a code object. 4. Execute the code object in the module namespace.
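The loader steps can be sketched for the source-file case. The class name `SketchLoader` and its `is_package` flag are made up for illustration. The sketch also leaves out one thing PEP 302 requires of `load_module()`, namely registering the module in sys.modules before executing it, so that the example has no side effects on the interpreter.

```python
import os
import types


class SketchLoader:
    """Loads one source file into a fresh module object."""

    def __init__(self, path, is_package=False):
        self.path = path
        self.is_package = is_package

    def load_module(self, fullname):
        # 1. Create a module object and set its attributes.
        module = types.ModuleType(fullname)  # sets __name__
        module.__file__ = self.path
        module.__loader__ = self
        if self.is_package:
            # Packages also get __path__, used to find their submodules.
            module.__path__ = [os.path.dirname(self.path)]
        # 2. Get the data from the file found by the importer.
        with open(self.path, "rb") as handle:
            source = handle.read()
        # 3. Create a code object.
        code = compile(source, self.path, "exec")
        # 4. Execute the code object in the module's namespace.
        exec(code, module.__dict__)
        return module
```

Given a file found by an importer, `SketchLoader(path).load_module("foo.ni")` returns a populated module object named `foo.ni`.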
  33. Bytecode or Source?

    Bytecode must have a matching magic number. If source is also available, the timestamp stored in the bytecode must match the last modification time of the source file. Otherwise use the source if available.
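The bytecode checks can be sketched against the .pyc layout used by current CPython, which is an assumption relative to the 2008 talk: since Python 3.7 the header is 16 bytes (magic number, flags, source modification time, source size), and CPython treats the bytecode as stale unless the recorded time equals the source file's. The function name `bytecode_is_usable` is made up, and hash-based .pyc files are simply rejected to keep the sketch short.

```python
import importlib.util
import os


def bytecode_is_usable(pyc_path, source_path):
    """Apply the magic-number and timestamp checks to a .pyc file."""
    with open(pyc_path, "rb") as handle:
        header = handle.read(16)
    # The magic number must match the running interpreter's.
    if len(header) < 16 or header[:4] != importlib.util.MAGIC_NUMBER:
        return False
    flags = int.from_bytes(header[4:8], "little")
    if flags != 0:
        # Assumption: hash-based .pyc files are outside this sketch.
        return False
    if not os.path.exists(source_path):
        # No source to compare against, so only the magic number matters.
        return True
    # The recorded timestamp must equal the source's modification time,
    # truncated to 32 bits as it is when the .pyc is written.
    recorded = int.from_bytes(header[8:12], "little")
    actual = int(os.stat(source_path).st_mtime) & 0xFFFFFFFF
    return recorded == actual
```

If the function returns False and the source file exists, the importer falls back to compiling the source.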