Enclose.IO: current cutting-edges and the future work

Minqi Pan
November 24, 2016


Transcript

  1. We made some modifications to …
    • Node.js Compile-time
    • Node.js Link-time / Test-time
    • Node.js Run-time
  2. • Serialize project files into structs of {path, *source, source_len}
    • Serialize raw file contents into an array of bytes
    • Bonus of compiler hints: the compiler can tell us the offsets of those JS scripts
    • (TODO) make better use of the compiler's resource handling
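
The serialization on this slide can be modelled in a few lines of JavaScript. This is only a sketch of the idea, not Enclose.IO's actual generated C: the names `serializeProject` and `readEntry` and the fields `offset` and `sourceLen` are hypothetical stand-ins for the {path, *source, source_len} struct, with an offset into one byte array playing the role of the `*source` pointer.

```javascript
// Hypothetical model of the {path, *source, source_len} table.
// `files` maps a virtual path to its contents (string or Buffer).
function serializeProject(files) {
  const chunks = [];
  const table = [];
  let offset = 0;
  for (const path of Object.keys(files).sort()) {
    const bytes = Buffer.from(files[path]); // raw file contents as bytes
    table.push({ path, offset, sourceLen: bytes.length });
    chunks.push(bytes);
    offset += bytes.length;
  }
  // One contiguous byte array, standing in for the array of bytes in the image.
  return { table, blob: Buffer.concat(chunks) };
}

// Look a path up and return a view (no copy) of its bytes, or null.
function readEntry(image, path) {
  const entry = image.table.find((e) => e.path === path);
  if (!entry) return null;
  return image.blob.subarray(entry.offset, entry.offset + entry.sourceLen);
}

const image = serializeProject({
  '/__enclose_io_memfs__/index.js': 'console.log("hi");\n',
  '/__enclose_io_memfs__/lib/a.js': 'module.exports = 1;\n',
});
```

Keeping offsets next to the paths mirrors the "compiler can tell us the offsets" remark: once the table exists, a lookup never has to scan the byte array itself.
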
  3. • TODO: maybe {path, *source, source_len, atime, mtime, ctime, birthtime} for fs.stat()? Does anybody use that info?
    • TODO: record directories better, for fs.readdir
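
One way the directory TODO could be approached is to derive directory listings from the flat list of file paths. The sketch below is an assumption about a possible design, not something the slides describe: `buildDirIndex` and `memfsReaddir` are hypothetical names, and timestamps for fs.stat() are left out entirely.

```javascript
// Hypothetical sketch: derive directory listings from the flat list of
// file paths, so that a readdir-like call can be answered from memory.
function buildDirIndex(paths) {
  const index = new Map(); // directory path -> Set of child names
  for (const p of paths) {
    const parts = p.split('/').filter(Boolean);
    let dir = '/';
    for (const name of parts) {
      if (!index.has(dir)) index.set(dir, new Set());
      index.get(dir).add(name);
      dir = dir === '/' ? '/' + name : dir + '/' + name;
    }
  }
  return index;
}

// readdir-like lookup: sorted child names, or null for unknown directories.
function memfsReaddir(index, dir) {
  const children = index.get(dir);
  return children ? [...children].sort() : null;
}

const dirIndex = buildDirIndex([
  '/__enclose_io_memfs__/index.js',
  '/__enclose_io_memfs__/lib/a.js',
  '/__enclose_io_memfs__/lib/b.js',
]);
```

Here `memfsReaddir(dirIndex, '/__enclose_io_memfs__/lib')` lists the two files under `lib`, while a path that is a file rather than a directory yields `null`.
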
  4. • skip ParseArgs
    • (cf. in Start(), ParseArgs is called before StartNodeInstance and CreateEnvironment)
  5. • checks for memfs in process.argv[1]
    • customized entrance via automatically setting process.argv[1]
    • e.g. process.argv[1]="/__enclose_io_memfs__/node_modules/coffee-script/bin/coffee"
    • (cf. ExecuteString called on bootstrap_node.js during LoadEnvironment)
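
The argv[1] handling can be illustrated with a small sketch. The prefix is taken from the example path on the slide; the helpers `isMemfsPath` and `withEntrance` are hypothetical, and the slide does not say whether the real implementation inserts a new argv[1] or overwrites an existing one. This sketch inserts, so that the user's own arguments survive.

```javascript
// Assumption: the prefix is taken from the example path on the slide.
const MEMFS_PREFIX = '/__enclose_io_memfs__/';

// True when a path points into the in-memory filesystem.
function isMemfsPath(p) {
  return typeof p === 'string' && p.startsWith(MEMFS_PREFIX);
}

// Hypothetical: return a copy of argv whose argv[1] is the packaged
// entrance, so the bootstrap code sees it as the script to run.
function withEntrance(argv, entrance) {
  if (!isMemfsPath(entrance)) {
    throw new Error('entrance must live inside the memfs');
  }
  return [argv[0], entrance, ...argv.slice(1)];
}

const argv = withEntrance(
  ['/usr/local/bin/coffee', '--version'],
  MEMFS_PREFIX + 'node_modules/coffee-script/bin/coffee'
);
```
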
  6. • (TODO) simulate file descriptors
    • (TODO) Libmemfs: a portable, pure C implementation of a Unix-style, read-only, in-memory filesystem.
  7. • Enforces static linking
    • copies routines of openssl, libuv, … into the executable image
    • (TODO) copy C extensions used by the project, like fsevents, into the executable image
    • problem of GLIBC and GLIBCXX
  8. • the program is loaded into memory
    • your serialized project files are initially stored in .text, and are then “copied” by the OS into the data segment (.data) of the program's virtual address space during start-up
    • (cf.) the text segment: the “REAL” executable instructions; read-only, with a fixed size
    • (TODO) read-only data segment (.rodata)
  9. – Wikipedia: Loader (computing) “In the case of operating systems that support virtual memory, the loader may not actually copy the contents of executable files into memory, but rather may simply declare to the virtual memory subsystem that there is a mapping between a region of memory allocated to contain the running program's code and the contents of the associated executable file.”
  10. – Wikipedia: Loader (computing) “The virtual memory subsystem is then made aware that pages with that region of memory need to be filled on demand if and when program execution actually hits those areas of unfilled memory. This may mean parts of a program's code are not actually copied into memory until they are actually used, and unused code may never be loaded into memory at all.”
  11. • String::NewFromUtf8 during start-up
    • need to copy memory when passed to V8
    • from data segment to the heap, what a PAIN!
  12. • (TODO) to speed up startup, use zero-copy external string resources; V8 no longer requires that strings are aligned if they are one-byte strings (the string data must be immutable, and it must be Latin-1 rather than UTF-8, which would require special treatment internally in the engine and does not allow efficient indexing)
    • V8 3.20.9 enforces that external pointers are aligned on a two-byte boundary (cf. tagged pointers: objects are always at least 4-byte aligned, and we can never have a pointer to the middle of an object in JavaScript)
    • (TODO) maybe zero-copy buffers?
  13. • (TODO) to speed up startup, delay String::NewExternalOneByte until the string is actually accessed, instead of running literally millions of String::NewFromUtf8/String::NewExternalOneByte calls at start-up
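
The "create only when accessed" idea can be shown with a JavaScript analogue. This is a sketch of the access pattern only: `lazySource` and the `materialized` counter are hypothetical, and `Buffer#toString` copies the bytes, whereas the external one-byte strings named on the slide are meant to avoid that copy on the C++ side.

```javascript
let materialized = 0; // how many strings have actually been created

// Hypothetical: wrap raw bytes so the string is built on first access only.
function lazySource(bytes) {
  let cached; // undefined until first access
  return {
    get text() {
      if (cached === undefined) {
        cached = bytes.toString('latin1'); // the deferred conversion
        materialized += 1;
      }
      return cached;
    },
  };
}

const sources = new Map([
  ['/__enclose_io_memfs__/a.js', lazySource(Buffer.from('module.exports = 1;', 'latin1'))],
  ['/__enclose_io_memfs__/b.js', lazySource(Buffer.from('module.exports = 2;', 'latin1'))],
]);

// Only a.js is touched, twice; b.js is never converted.
const first = sources.get('/__enclose_io_memfs__/a.js').text;
const again = sources.get('/__enclose_io_memfs__/a.js').text;
```

With this shape, start-up cost is proportional to the number of files actually required rather than the number of files packaged.
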
  14. • fs.fstat, fs.lstat, fs.lstatSync, fs.statSync: first check for a memfs path
    • fs.watch, fs.watchFile: do not watch memfs paths
    • fs.realpathSync: first checks for a memfs path
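
The "first check for a memfs path" pattern might look roughly like the wrapper below. It is a minimal sketch under stated assumptions: `wrapStatSync` and the `memfsFiles` table are hypothetical, the returned object is a small stand-in for `fs.Stats`, and the real project patches Node.js internals rather than wrapping the public API.

```javascript
const fs = require('fs');

const MEMFS_PREFIX = '/__enclose_io_memfs__/'; // prefix from the slides
const memfsFiles = new Map([                   // hypothetical in-memory table
  [MEMFS_PREFIX + 'index.js', Buffer.from('console.log("hi");\n')],
]);

// Wrap a stat-like function so memfs paths are answered from memory and
// everything else falls through to the real implementation.
function wrapStatSync(realStatSync) {
  return function statSync(p, ...rest) {
    if (typeof p === 'string' && p.startsWith(MEMFS_PREFIX)) {
      const bytes = memfsFiles.get(p);
      if (!bytes) {
        const err = new Error(`ENOENT: no such file or directory, stat '${p}'`);
        err.code = 'ENOENT';
        throw err;
      }
      // Minimal stand-in for fs.Stats; a real one has many more fields.
      return { size: bytes.length, isFile: () => true, isDirectory: () => false };
    }
    return realStatSync(p, ...rest);
  };
}

const statSync = wrapStatSync(fs.statSync);
```

Paths outside the prefix, such as `process.cwd()`, still reach the real `fs.statSync`, so ordinary file access is unaffected.
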
  15. • child_process.fork needs the original argv[1] semantics
    • will set __enclose_io_fork__ as argv[1] in child_process.fork
  16. Future Work
    • see previous TODOs
    • only include JS scripts that can be referenced
    • only include symbols that can be referenced
    • generate unoptimized machine code at compile time via V8, so no sources are required at run time