The Long Hello World

Hello World is often used as a stand-in for "the simplest program", great for teaching and getting people started on their coding journey. But what _really_ happens in that one line? This talk will be a deep dive into the Python interpreter, C libraries, Linux kernel, and beyond. We will poke holes in every abstraction and learn about our computers like never before, both to aid in debugging and to pick the best level of tooling for future development projects.

Noah Kantrowitz

October 18, 2025

Transcript

  1. Noah Kantrowitz

    • He/him
    • coderanger.net | cloudisland.nz/@coderanger
    • Kubernetes and Python
    • SRE/Platform for Geomagical Labs, part of IKEA
    • We do CV/AR for the home

    PyBay 2025 – Noah Kantrowitz – @coderanger@cloudisland.nz
  2. The Setup

    $ python
    Python 3.14.0 (main) [Clang 17.0.0] on darwin
    Type "help", "credits" or "license" for more information.
    >>> print("Hello world")
  3. Lib/code.py - InteractiveConsole

    def interact(self, banner=None, exitmsg=None):
        # ...
        while True:
            line = input(prompt)
            more = self.push(line)
            # ...
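
The loop on this slide can be tried without a terminal. The following is a minimal sketch, assuming a made-up list of typed lines in place of `input(prompt)`; it feeds each line to `InteractiveConsole.push` and captures what the console prints.

```python
import code
import contextlib
import io

# Lines a user might type at the >>> prompt (made up for this sketch).
typed_lines = ['print("Hello world")', "x = 1 + 1", "print(x)"]

console = code.InteractiveConsole()
captured = io.StringIO()

with contextlib.redirect_stdout(captured):
    for line in typed_lines:
        # push() returns True while a statement is still incomplete and
        # False once the buffered source has been compiled and run.
        more = console.push(line)
        assert more is False

output = captured.getvalue()
print(output, end="")
```

Each call to `push` here plays the role of one trip around the `while True` loop in `interact`.
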
  4. Lib/_pyrepl/console.py - InteractiveColoredConsole

    def runcode(self, code):
        try:
            exec(code, self.locals)
        except SystemExit:
            raise
        except BaseException:
            self.showtraceback()
            return self.STATEMENT_FAILED
        return None
  5. Lexing and Parsing

    • Bytes - 0x31
    • Decoding
    • Characters - '1'
    • Tokenizing / Lexing
    • Tokens - NUMBER "1"
    • Parsing
    • Syntax - Constant(value=1)
    • Compiling
    • Opcodes - LOAD_CONST 0 (1)
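
The stages listed on this slide can be walked through from Python itself. This is a rough sketch using the standard `tokenize`, `ast`, and `dis` modules rather than the C code paths the talk follows; the exact opcode names printed at the end vary between Python versions, so they may not match the `LOAD_CONST` shown on the slide.

```python
import ast
import dis
import io
import tokenize

raw = bytes([0x31])                 # Bytes: 0x31
text = raw.decode("utf-8")          # Decoding gives the character '1'

# Tokenizing / lexing: the first token is NUMBER '1'.
tokens = list(tokenize.generate_tokens(io.StringIO(text).readline))
first = tokens[0]

# Parsing: the expression becomes a Constant node.
tree = ast.parse(text, mode="eval")

# Compiling: the AST becomes a code object holding the VM instructions.
code_obj = compile(tree, "<sketch>", "eval")
opnames = [ins.opname for ins in dis.get_instructions(code_obj)]

print(tokenize.tok_name[first.type], repr(first.string))
print(ast.dump(tree.body))
print(opnames)
```
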
  6. • Python/bltinmodule.c - builtin_exec_impl
    • Python/pythonrun.c - _PyRun_StringFlagsWithName
    • Parser/peg_api.c - _PyParser_ASTFromString
    • Parser/pegen.c - _PyPegen_run_parser_from_string
    • Parser/tokenizer/string_tokenizer.c - _PyTokenizer_FromString
    • Parser/lexer/lexer.c - tok_get_normal_mode

    if (Py_ISDIGIT(c)) {
        if (c == '0') {
            /* Hex, octal or binary -- maybe. */
            c = tok_nextc(tok);
            if (c == 'x' || c == 'X') {
    // ... elided
    return MAKE_TOKEN(NUMBER);
  7. Token Types

    • NAME, NUMBER, STRING
    • LPAR, RPAR, PLUS, COLON
    • COMMENT, NEWLINE
    • INDENT, DEDENT
    • import token
  8. Our Tokens

    $ echo 'print("Hello world")' | python -m tokenize -e
    1,0-1,5:     NAME       'print'
    1,5-1,6:     LPAR       '('
    1,6-1,19:    STRING     '"Hello world"'
    1,19-1,20:   RPAR       ')'
    1,20-1,21:   NEWLINE    '\n'
    2,0-2,0:     ENDMARKER  ''
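
The same token listing can be produced from inside Python. A small sketch, assuming the standard `tokenize` module and using `exact_type` to get the specific operator names that the `-e` flag prints:

```python
import io
import tokenize

source = 'print("Hello world")\n'

# Collect (start, end, exact token name, text) for every token.
tokens = [
    (tok.start, tok.end, tokenize.tok_name[tok.exact_type], tok.string)
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
]

for start, end, name, string in tokens:
    print(f"{start[0]},{start[1]}-{end[0]},{end[1]}: {name} {string!r}")
```
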
  9. Grammar/python.gram

    t_primary[expr_ty]:
        | a=t_primary '.' b=NAME { _PyAST_Attribute(a, b->v.Name.id) }
        | a=t_primary '[' b=slices ']' { _PyAST_Subscript(a, b) }
        | a=t_primary '(' b=[arguments] ')' {
            _PyAST_Call(a,
                (b) ? ((expr_ty) b)->v.Call.args : NULL,
                (b) ? ((expr_ty) b)->v.Call.keywords : NULL,
            ) }
        | a=atom { a }
  10. $ echo -n 'print("Hello world")' | python -m ast

    Module(
        body=[
            Expr(
                value=Call(
                    func=Name(id='print', ctx=Load()),
                    args=[
                        Constant(value='Hello world')]))])
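
The tree printed above can also be inspected node by node. A brief sketch using `ast.parse`, pulling out the pieces the dump shows (the `Expr` statement, the `Call`, the function name, and the constant argument):

```python
import ast

tree = ast.parse('print("Hello world")')

statement = tree.body[0]        # Expr(...)
call = statement.value          # Call(...)

func_name = call.func.id                        # the Name node's identifier
arg_values = [arg.value for arg in call.args]   # values of the Constant args

print(type(statement).__name__, type(call).__name__, func_name, arg_values)
```
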
  11. Interpreted Language Compiled Language

    • Back to _PyRun_StringFlagsWithName

    mod = _PyParser_ASTFromString(str, name, start, flags, arena);
    if (mod != NULL) {
        ret = run_mod(mod, name, globals, locals, flags, arena, source, generate_new_source);
    }
  12. Virtual Machine

    • An abstract stack
    • PUSH 1
    • PUSH 2
    • ADD
    • POP
    • Call frame metadata
    • Thread state
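
To make the abstract stack concrete, here is a toy stack machine that runs the four instructions from this slide. It is only an illustration: the `run` function and the `(opname, arg)` tuple format are invented for this sketch and are not how CPython encodes bytecode.

```python
def run(program):
    """Run a list of (opname, arg) pairs on a tiny stack machine."""
    stack = []
    popped = []
    for opname, arg in program:
        if opname == "PUSH":
            stack.append(arg)
        elif opname == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opname == "POP":
            popped.append(stack.pop())
        else:
            raise ValueError(f"unknown opcode: {opname}")
    return stack, popped


program = [("PUSH", 1), ("PUSH", 2), ("ADD", None), ("POP", None)]
stack, popped = run(program)
print(stack, popped)  # [] [3]
```
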
  13. Opcodes

    • IMPORT_NAME – Import a module and push it
    • DICT_UPDATE – Pop two dicts and push a single merged dict
    • LOAD_CONST – Push a constant value
    • JUMP_BACKWARD – Move the execution pointer backwards
    • RETURN_GENERATOR – Create a generator object for this function and push it
    • NOP – Do nothing, be chill
  14. • Python/pythonrun.c - run_mod
    • Python/compile.c - _PyAST_Compile
    • Python/compile.c - compiler_codegen
    • Python/codegen.c - _PyCodegen_Module
    • Python/codegen.c - codegen_body

    for (Py_ssize_t i = first_instr; i < asdl_seq_LEN(stmts); i++) {
        VISIT(c, stmt, (stmt_ty)asdl_seq_GET(stmts, i));
    }
  15. $ echo 'print("Hello world")' | python -m dis

      0    RESUME        0

      1    LOAD_NAME     0 (print)
           PUSH_NULL
           LOAD_CONST    0 ('Hello world')
           CALL          1
           POP_TOP
           LOAD_CONST    1 (None)
           RETURN_VALUE
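
The disassembly can be reproduced programmatically as well. A short sketch with `compile` and `dis.get_instructions`; the exact instruction list depends on the Python version, so treat the slide's output as one example (3.14) and not a fixed result.

```python
import dis

code_obj = compile('print("Hello world")', "<sketch>", "exec")

instructions = list(dis.get_instructions(code_obj))
opnames = [ins.opname for ins in instructions]

for ins in instructions:
    print(ins.offset, ins.opname, ins.argrepr)

# The names and constants the instructions index into live on the code object.
print(code_obj.co_names, code_obj.co_consts)
```
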
  16. • Python/pythonrun.c - run_mod
    • Python/pythonrun.c - run_eval_code_obj
    • Python/ceval.c - PyEval_EvalCode
    • include/internal/pycore_ceval.h - _PyEval_EvalFrame
    • Python/ceval.c - _PyEval_EvalFrameDefault

    TARGET(LOAD_NAME) {
        frame->instr_ptr = next_instr;
        next_instr += 1;
        _PyStackRef v;
        PyObject *name = GETITEM(FRAME_CO_NAMES, oparg);
        _PyFrame_SetStackPointer(frame, stack_pointer);
        PyObject *v_o = _PyEval_LoadName(tstate, frame, name);
        stack_pointer[0] = v;
        stack_pointer += 1;
  17. What even is print?

    if (PyMapping_GetOptionalItem(frame->f_locals, name, &value) < 0) {
        return NULL;
    }
    if (value != NULL) {
        return value;
    }
    if (PyDict_GetItemRef(frame->f_globals, name, &value) < 0) {
        return NULL;
    }
    if (value != NULL) {
        return value;
    }
    if (PyMapping_GetOptionalItem(frame->f_builtins, name, &value) < 0) {
        return NULL;
    }
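
The lookup order in this C excerpt (locals, then globals, then builtins) can be mimicked in a few lines of Python. The `load_name` helper below is hypothetical and only mirrors the order of the checks; it is not CPython's implementation.

```python
import builtins


def load_name(name, local_ns, global_ns, builtin_ns):
    """Search locals, then globals, then builtins, like the excerpt above."""
    for namespace in (local_ns, global_ns, builtin_ns):
        if name in namespace:
            return namespace[name]
    raise NameError(f"name {name!r} is not defined")


builtin_ns = vars(builtins)

# With empty locals and globals, "print" falls through to the builtins.
found = load_name("print", {}, {}, builtin_ns)

# A global with the same name shadows the builtin.
shadowed = load_name("print", {}, {"print": "not the builtin"}, builtin_ns)

print(found is builtins.print, shadowed)
```
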
  18. • {"print", _PyCFunction_CAST(builtin_print),
    • Python/bltinmodule.c - builtin_print_impl

    if (file == Py_None) {
        file = PySys_GetAttr(&_Py_ID(stdout));
    }
    for (i = 0; i < objects_length; i++) {
        err = PyFile_WriteObject(objects[i], file, Py_PRINT_RAW);
        if (err) {
            Py_DECREF(file);
            return NULL;
        }
    }
  19. What is stdout?

    Python/pylifecycle.c - init_sys_streams

    extern FILE *stdout;

    /* Set sys.stdout */
    fd = fileno(stdout);
    std = create_stdio(config, iomod, fd, 1, "<stdout>",
                       config->stdio_encoding,
                       config->stdio_errors);
    if (std == NULL)
        goto error;
    PySys_SetObject("__stdout__", std);
    _PySys_SetAttr(&_Py_ID(stdout), std);
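
A stream like `sys.stdout` can be assembled by hand from the `io` layers that sit on top of a file descriptor. This is a sketch under assumptions: a pipe stands in for the terminal so the bytes can be read back, and the layering (`FileIO`, `BufferedWriter`, `TextIOWrapper`) is a simplified approximation of what `create_stdio` sets up on a POSIX system.

```python
import io
import os

# A pipe stands in for the terminal so the written bytes can be read back.
read_fd, write_fd = os.pipe()

raw = io.FileIO(write_fd, "w")              # thin wrapper over the fd
buffered = io.BufferedWriter(raw)           # byte buffering
text = io.TextIOWrapper(                    # str -> bytes encoding
    buffered, encoding="utf-8", newline="\n", line_buffering=True
)

print("Hello world", file=text)
text.flush()

received = os.read(read_fd, 1024)
print(received)  # b'Hello world\n'

text.close()        # also closes the buffered and raw layers
os.close(read_fd)
```
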
  20. Aside: Pythonic C

    writer = PyObject_GetAttr(f, &_Py_ID(write));
    value = PyObject_Str(v);
    result = PyObject_CallOneArg(writer, value);

    Basically the same as ...

    writer = f.write
    value = str(v)
    result = writer(value)
  21. • Objects/fileobject.c - PyFile_WriteObject
    • Lib/_pyio.py - TextIOWrapper.write
    • Lib/_pyio.py - BufferedWriter.write
    • Modules/_io/fileio.c - _io_FileIO_write_impl
    • Python/fileutils.c - _Py_write_impl

    do {
        errno = 0;
        n = write(fd, buf, count);
        err = errno;
    } while (n < 0 && err == EINTR);
  22. Libc – A Standard Library for C

    POSIX too?
    But Really Glibc

    ssize_t write(int fildes, const void* buf, size_t nbyte);
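
The libc `write` function declared on this slide can be called directly from Python with `ctypes`, skipping the `io` layers entirely. A sketch assuming a POSIX system where `ctypes.CDLL(None)` exposes the C library's symbols; a pipe again stands in for the terminal.

```python
import ctypes
import os

# Load the symbols already linked into the process, including libc's write().
libc = ctypes.CDLL(None, use_errno=True)
libc.write.argtypes = [ctypes.c_int, ctypes.c_char_p, ctypes.c_size_t]
libc.write.restype = ctypes.c_ssize_t

read_fd, write_fd = os.pipe()
message = b"Hello world\n"

# ssize_t write(int fildes, const void *buf, size_t nbyte);
written = libc.write(write_fd, message, len(message))
if written < 0:
    err = ctypes.get_errno()
    raise OSError(err, os.strerror(err))

received = os.read(read_fd, 1024)
print(written, received)

os.close(write_fd)
os.close(read_fd)
```
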
  23. sysdeps/unix/sysv/linux/write.c - __libc_write

    /* Write NBYTES of BUF to FD. Return the number written, or -1. */
    ssize_t
    __libc_write (int fd, const void *buf, size_t nbytes)
    {
      return SYSCALL_CANCEL (write, fd, buf, nbytes);
    }
  24. • sysdeps/unix/sysv/linux/write.c - __libc_write
    • nptl/cancellation.c - __internal_syscall_cancel
    • sysdeps/unix/sysv/linux/syscall_cancel.c - __syscall_cancel_arch
    • sysdeps/unix/sysdep.h - INTERNAL_SYSCALL_NCS_CALL
    • sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S

    movq %rdi, %rax      /* Syscall number (1) -> rax. */
    movq %rsi, %rdi      /* shift arg1 - arg5. */
    movq %rdx, %rsi
    movq %rcx, %rdx
    movq %r8, %r10
    movq %r9, %r8
    movq 8(%rsp),%r9     /* arg6 is on the stack. */
    syscall              /* Do the system call. */
  25. • arch/x86/entry/entry_64.S - entry_SYSCALL_64
    • arch/x86/entry/syscall_64.c - x64_sys_call
    • fs/read_write.c - ksys_write

    struct files_struct *files = current->files;
    if (likely(atomic_read_acquire(&files->count) == 1)) {
        file = files_lookup_fd_raw(files, fd);
        if (!file || unlikely(file->f_mode & mask))
            return EMPTY_FD;
        return BORROWED_FD(file);

    if (!fd_empty(f)) {
        loff_t pos, *ppos = file_ppos(fd_file(f));
        ret = vfs_write(fd_file(f), buf, count, ppos);
  26. • arch/x86/entry/entry_64.S - entry_SYSCALL_64
    • arch/x86/entry/syscall_64.c - x64_sys_call
    • fs/read_write.c - ksys_write
    • fs/read_write.c - vfs_write

    if (file->f_op->write)
        ret = file->f_op->write(file, buf, count, pos);
    else if (file->f_op->write_iter)
        ret = new_sync_write(file, buf, count, pos);
    else
        ret = -EINVAL;
  27. >>> import os, stat
    >>> os.fstat(1).st_mode
    8592
    >>> stat.S_ISCHR(os.fstat(1).st_mode)
    True
    >>> os.fstat(1).st_rdev >> 8
    136
    >>> os.fstat(1).st_rdev & 0xFF
    0
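
The numbers in this session can be decoded without a terminal attached. The sketch below reuses the `st_mode` value from the slide and builds a stand-in device number with `os.makedev`, since the real `st_rdev` depends on the machine. `os.major` and `os.minor` are the portable way to split a device number; the shift and mask on the slide match the Linux layout only for small values.

```python
import os
import stat

st_mode = 8592  # value shown on the slide

# 0o20620: file type bits say "character device", permission bits are 620.
print(oct(st_mode), stat.filemode(st_mode), stat.S_ISCHR(st_mode))

# Stand-in device number built from the slide's major 136 and minor 0.
rdev = os.makedev(136, 0)
major, minor = os.major(rdev), os.minor(rdev)
print(major, minor)  # 136 0
```
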
  28. // include/uapi/linux/major.h
    #define UNIX98_PTY_MASTER_MAJOR 128
    #define UNIX98_PTY_SLAVE_MAJOR 136

    // drivers/tty/tty_io.c - tty_register_driver
    dev = MKDEV(driver->major, driver->minor_start);
    error = register_chrdev_region(dev, driver->num, driver->name);

    // drivers/tty/tty_io.c - tty_init
    static const struct file_operations tty_fops = {
        .read_iter = tty_read,
        .write_iter = tty_write,
        .splice_read = copy_splice_read
    };
    cdev_init(&tty_cdev, &tty_fops);
  29. TTY - Teletype Terminal

    PTY - Pseudo TTY
    • A fancy pipe
    • A shared buffer
    • Bidirectional (but ignoring that today)

    static const struct tty_operations ptm_unix98_ops = {
        .open = pty_open,
        .close = pty_close,
        .write = pty_write,
    };
  30. • arch/x86/entry/entry_64.S - entry_SYSCALL_64
    • arch/x86/entry/syscall_64.c - x64_sys_call
    • fs/read_write.c - ksys_write
    • fs/read_write.c - vfs_write
    • drivers/tty/tty_io.c - tty_write
    • drivers/tty/tty_io.c - iterate_tty_write
    • drivers/tty/pty.c - pty_write
    • drivers/tty/tty_buffer.c - __tty_insert_flip_string_flags

    memcpy(char_buf_ptr(tb, tb->used), chars, space);
  31. arch/x86/lib/memcpy_64.S - memcpy_orig

    SYM_FUNC_START_LOCAL(memcpy_orig)
    .Lless_16bytes
        /*
         * Move data from 8 bytes to 15 bytes.
         */
        movq 0*8(%rsi), %r8
        movq -1*8(%rsi, %rdx), %r9
        movq %r8, 0*8(%rdi)
        movq %r9, -1*8(%rdi, %rdx)
        RET
  32. We Rise

    • Execute RET instruction
    • Commit TTY write
    • Release VFS write lock
    • Return from syscall and then libc
    • Complete CALL opcode
    • Run LOAD_CONST None and RETURN_VALUE opcodes
    • Return from exec()
    • Loop and wait for input in InteractiveConsole.interact
  33. // xterm

    size = (int) read(screen->respond, (char *) data->last, (size_t)
    // ...
    IChar *str = xw->work.write_text + offset;
    chars = ld->charData + screen->cur_col;
    for (n = 0; n < length; ++n) {
        chars[n] = str[n];
    }
  34. $ python
    Python 3.14.0 (main) [Clang 17.0.0] on darwin
    Type "help", "credits" or "license" for more information.
    >>> print("Hello world")
    Hello world
    >>>