https://www.collaboraonline.com/about-us/ • Google Summer of Code 2010 / 2011 • Rewrite of the Writer RTF import/export • Then a full-time LibreOffice developer for SUSE • Finally a contractor at Collabora since 2013
Do you think it’s latency? Speed of things: • [Figure: Sample latencies - Milliseconds - linear plot. Bars ranging from 1 ms to 700 ms compare hardware latency (hard disk seek time, wired and bluetooth keyboards, 60Hz frame time, good and bad screen render), network latency from Frankfurt (Milan, London, US East, US West, Hong Kong), human time scales (eye blink, time per key for pro, average, Meeks and keyboard-mashing typing) and web start/render times.] • Thanks to: RTINGS - hardware latency • Cloudping – network latency • Web latency • JsFiddle – typing latency
entire document, to track invalidations • LOK view-port: visible in the browser window • Sync layout for the latter • Async layout for the former • 446ms → 19ms for a document of 300 pages: about 23x faster • [Flame graph: stack of function callers; width is proportional to time]
the document • View-port is a setting on the loaded document • We do a sync full layout on the document as part of load • Set a rough viewport as a load parameter • 540 ms → 112 ms speedup
bounding box of all painted areas • When there is a clean-start, or a full invalidation: reset to empty • Send a new tile → grow that bbox • When we get an invalidate → crop it against that bbox • Take the canonical view ID into account when doing this (dark mode, spellcheck, etc) • 477 → 8 invalidates during file load (source)
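The bounding-box bookkeeping above can be sketched as follows. This is a minimal illustration, not the Collabora Online code: `Rect`, `PaintedAreaTracker` and their method names are hypothetical, and plain integers stand in for the canonical view ID.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Rect:
    x1: int
    y1: int
    x2: int  # exclusive
    y2: int  # exclusive

    def union(self, other: "Rect") -> "Rect":
        return Rect(min(self.x1, other.x1), min(self.y1, other.y1),
                    max(self.x2, other.x2), max(self.y2, other.y2))

    def intersect(self, other: "Rect") -> Optional["Rect"]:
        x1, y1 = max(self.x1, other.x1), max(self.y1, other.y1)
        x2, y2 = min(self.x2, other.x2), min(self.y2, other.y2)
        if x1 >= x2 or y1 >= y2:
            return None  # no overlap
        return Rect(x1, y1, x2, y2)


class PaintedAreaTracker:
    """Hypothetical helper: one painted-area bbox per canonical view ID."""

    def __init__(self) -> None:
        self._bbox: Dict[int, Rect] = {}

    def reset(self, view_id: int) -> None:
        # Clean start or full invalidation: the bbox becomes empty again.
        self._bbox.pop(view_id, None)

    def tile_sent(self, view_id: int, tile: Rect) -> None:
        # Sending a new tile grows the bbox.
        current = self._bbox.get(view_id)
        self._bbox[view_id] = tile if current is None else current.union(tile)

    def crop_invalidate(self, view_id: int, area: Rect) -> Optional[Rect]:
        # Crop the invalidation against the bbox; None means the client
        # has nothing painted there, so the invalidate can be dropped.
        bbox = self._bbox.get(view_id)
        return None if bbox is None else bbox.intersect(area)
```

An invalidation that falls entirely outside the painted area returns `None` and never needs to reach the browser, which is the kind of saving behind the drop in invalidates during load.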
job takes >100ms on the main loop • Watchdog case: trigger something on “stalls” • Specialized watchdog thread which the event loop pings when there is some activity • If nothing updates the watchdog in a reasonable time then the watchdog fires something that perf can detect • We use an obscure syscall, “futimesat”, that basically nothing else uses, and profile with perf using: • perf record -e syscalls:sys_enter_futimesat
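The ping-and-fire logic of such a watchdog can be sketched like this. It is an illustration under assumptions: the `Watchdog` class and its method names are hypothetical, and a Python callback stands in for the `futimesat` call that the real watchdog issues so that perf can record the stall.

```python
import threading
import time
from typing import Callable, Optional


class Watchdog:
    """Hypothetical stall detector: the event loop pings, a thread checks."""

    def __init__(self, timeout_s: float, on_stall: Callable[[float], None]) -> None:
        self._timeout_s = timeout_s
        self._on_stall = on_stall  # stand-in for the traceable syscall
        self._lock = threading.Lock()
        self._last_ping = time.monotonic()
        self._reported = False
        self._stop = threading.Event()

    def ping(self, now: Optional[float] = None) -> None:
        # Called by the event loop whenever there is some activity.
        with self._lock:
            self._last_ping = time.monotonic() if now is None else now
            self._reported = False

    def check(self, now: Optional[float] = None) -> bool:
        # Fire on_stall at most once per stall; return True if it fired.
        now = time.monotonic() if now is None else now
        with self._lock:
            stalled_for = now - self._last_ping
            if stalled_for < self._timeout_s or self._reported:
                return False
            self._reported = True
        self._on_stall(stalled_for)
        return True

    def run(self, poll_s: float = 0.01) -> None:
        # Body of the watchdog thread.
        while not self._stop.wait(poll_s):
            self.check()

    def stop(self) -> None:
        self._stop.set()
```

In use, `run` would be started with `threading.Thread(target=wd.run, daemon=True).start()` and the main loop would call `wd.ping()` on each iteration; the optional `now` argument exists only so the logic can be exercised without real sleeps.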
make it back to the main loop • Thanks to Chris Lord • No warnings → avoids unresponsive UI • Also: more aggressive JavaScript tile caching • Reusing tiles is better than nothing (avoids destruct + create) • 150 - 250 tiles as canvases (30-60 MB) • Manage canvas memory better • JS ‘GC’ is not your friend; need to explicitly memory manage these. • Store & manage zstd compressed tiles • Creating an actual ImageBitmap from data: hydrate the tile
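The split between cheap compressed tiles and a small, explicitly managed set of hydrated ones can be sketched as below. This is only an illustration: the `TileCache` class is hypothetical, `zlib` from the standard library stands in for zstd, and raw bytes stand in for the canvas or ImageBitmap that the browser would hold.

```python
import zlib
from collections import OrderedDict
from typing import Dict, List, Optional


class TileCache:
    """Hypothetical cache: every tile is kept compressed, few are hydrated."""

    def __init__(self, max_hydrated: int) -> None:
        self._max_hydrated = max_hydrated
        self._compressed: Dict[str, bytes] = {}
        # Least recently used hydrated tile comes first.
        self._hydrated: "OrderedDict[str, bytes]" = OrderedDict()

    def store(self, key: str, pixels: bytes) -> None:
        self._compressed[key] = zlib.compress(pixels)
        self._hydrated.pop(key, None)  # drop any stale decoded copy

    def hydrate(self, key: str) -> Optional[bytes]:
        # Return decoded pixels, decompressing on demand.
        if key in self._hydrated:
            self._hydrated.move_to_end(key)
            return self._hydrated[key]
        blob = self._compressed.get(key)
        if blob is None:
            return None
        pixels = zlib.decompress(blob)
        self._hydrated[key] = pixels
        # Evict explicitly rather than waiting for a garbage collector.
        while len(self._hydrated) > self._max_hydrated:
            self._hydrated.popitem(last=False)
        return pixels

    def hydrated_keys(self) -> List[str]:
        return list(self._hydrated)
```

Evicted tiles stay available in compressed form, so showing them again costs a decompression rather than a round trip to the server.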
Previous / Next Slide in direction of movement – 100ms after switch • Tracking global invalidations to manage larger cache properly • Scrolling: fetching and caching around the view area • Take the direction into account
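Taking the direction of movement into account might look like the sketch below. The function names and the `ahead` / `behind` multipliers are assumptions made for illustration, and the 100ms delay after a slide switch is not modeled.

```python
from typing import Optional, Tuple


def prefetch_area(view_top: int, view_bottom: int, doc_height: int,
                  direction: int, ahead: float = 2.0,
                  behind: float = 0.5) -> Tuple[int, int]:
    """Vertical range to fetch and cache around the view area.

    direction: +1 scrolling down, -1 scrolling up, 0 unknown.
    ahead / behind are hypothetical multiples of the view height.
    """
    height = view_bottom - view_top
    if direction > 0:
        above, below = behind, ahead
    elif direction < 0:
        above, below = ahead, behind
    else:
        above = below = behind
    top = max(0, view_top - int(above * height))
    bottom = min(doc_height, view_bottom + int(below * height))
    return top, bottom


def slide_to_prefetch(current: int, previous: int,
                      slide_count: int) -> Optional[int]:
    """Index of the slide to pre-render after moving previous -> current."""
    step = 1 if current >= previous else -1
    candidate = current + step
    return candidate if 0 <= candidate < slide_count else None
```

With a 500-unit-high view scrolling down, the range extends 1000 units below the view and only 250 above it, so most of the cache budget goes where the user is heading.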
shows up as flicker • Instead: LayoutingService • Queue of jobs to be executed inside a single time slot • Similar to a rendering transaction • Meant to produce perfect frames
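The idea of queuing jobs and running them together in one time slot can be sketched as below. This is not the actual LayoutingService: the class body and method names are hypothetical, and what triggers the time slot (for example the browser's frame callback) is left to the caller.

```python
from typing import Callable, List


class LayoutingService:
    """Hypothetical sketch: batch UI jobs so they land in the same frame."""

    def __init__(self) -> None:
        self._jobs: List[Callable[[], None]] = []

    def schedule(self, job: Callable[[], None]) -> None:
        # Nothing runs yet; the job waits for the next time slot.
        self._jobs.append(job)

    def has_pending(self) -> bool:
        return bool(self._jobs)

    def run_time_slot(self) -> int:
        """Run every queued job back to back; return how many ran."""
        jobs, self._jobs = self._jobs, []
        for job in jobs:
            job()
        # Jobs scheduled while the slot was running wait for the next one.
        return len(jobs)
```

Because the queue is swapped out before the jobs run, a job that schedules follow-up work cannot stretch the current slot, which keeps each frame's worth of changes together like a transaction.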
• Interactivity / latency: server-side watchdog • Memory usage • Perfect frames in the browser: client-side watchdog • Demo servers → long-running CPU profiling • Multi-user testing • How does it “feel” in our community call with ~20 people • Profiling interactive stress testing