Big Code: algorithms, a more practical approach

unionx
February 25, 2014

Runchao's algorithms talk, seriously awesome (even though I didn't attend)


Transcript

  1. Code can be reordered

    int doMath(int x, int y, int z) {
        int a = x + y;
        int b = x - y;
        int c = z + x;
        return a + b;
    }

    int doMath(int x, int y, int z) {
        int c = z + x;
        int b = x - y;
        int a = x + y;
        return a + b;
    }

    Saturday, August 31, 13
  2. Dead code can be removed

    int doMath(int x, int y, int z) {
        int a = x + y;
        int b = x - y;
        int c = z + x;
        return a + b;
    }

    int doMath(int x, int y, int z) {
        int a = x + y;
        int b = x - y;
        return a + b;
    }
  3. Values can be propagated

    int doMath(int x, int y, int z) {
        int a = x + y;
        int b = x - y;
        int c = z + x;
        return a + b;
    }

    int doMath(int x, int y, int z) {
        return x + y + x - y;
    }
  4. Math can be simplified

    int doMath(int x, int y, int z) {
        int a = x + y;
        int b = x - y;
        int c = z + x;
        return a + b;
    }

    int doMath(int x, int y, int z) {
        return x + x;
    }
  5. So why does this matter?

    largestValueLog = Math.log(largestValueWithSingleUnitResolution);
    magnitude = (int) Math.ceil(largestValueLog / Math.log(2.0));
    subBucketMagnitude = (magnitude > 1) ? magnitude : 1;
    subBucketCount = (int) Math.pow(2, subBucketMagnitude);
    subBucketMask = subBucketCount - 1;

    Hard enough to follow as it is
    No value in “optimizing” human-readable meaning away
    Compiled code will end up the same anyway
  6. Reads can be cached

    int distanceRatio(Object a) {
        int distanceTo = a.getX() - start;
        int distanceAfter = end - a.getX();
        return distanceTo / distanceAfter;
    }

    int distanceRatio(Object a) {
        int x = a.getX();
        int distanceTo = x - start;
        int distanceAfter = end - x;
        return distanceTo / distanceAfter;
    }
  7. Reads can be cached

    void loopUntilFlagSet(Object a) {
        while (!a.flagIsSet()) {
            loopcount++;
        }
    }

    void loopUntilFlagSet(Object a) {
        boolean flagIsSet = a.flagIsSet();
        while (!flagIsSet) {
            loopcount++;
        }
    }
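The cached-read form on slide 7 never re-checks the flag, so it only behaves like the original when no other thread changes it. As a minimal sketch (the `FlagHolder` class and its method names are hypothetical, not from the talk), declaring the field `volatile` is one way in Java to forbid the JIT from hoisting the read out of the loop:

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration class; not part of the original slides.
final class FlagHolder {
    // volatile: each read must observe the most recent write by any thread,
    // so the read cannot be cached in a local across loop iterations.
    private volatile boolean flag;
    private long loopCount;

    void setFlag() { flag = true; }

    boolean flagIsSet() { return flag; }

    // Spins until another thread calls setFlag(); returns the iteration count.
    long loopUntilFlagSet() {
        while (!flagIsSet()) {
            loopCount++;
        }
        return loopCount;
    }

    // Starts a spinning thread, sets the flag, and reports whether the spinner exited.
    static boolean spinnerStops() {
        FlagHolder holder = new FlagHolder();
        Thread spinner = new Thread(holder::loopUntilFlagSet);
        spinner.setDaemon(true);
        spinner.start();
        try {
            TimeUnit.MILLISECONDS.sleep(50);
            holder.setFlag();
            spinner.join(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !spinner.isAlive();
    }
}
```

Without `volatile`, the compiler is allowed to produce the second form from the slide, and the spinner may never observe the write.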
  8. Writes can be eliminated

    Intermediate values might never be visible

    void updateDistance(Object a) {
        int distance = 100;
        a.setX(distance);
        a.setX(distance * 2);
        a.setX(distance * 3);
    }

    void updateDistance(Object a) {
        a.setX(300);
    }
  9. Writes can be eliminated++

    Intermediate values might never be visible

    void updateDistance(Object a) {
        a.setVisibleValue(0);
        for (int i = 0; i < 1000000; i++) {
            a.setInternalValue(i);
        }
        a.setVisibleValue(a.getInternalValue());
    }

    void updateDistance(Object a) {
        // The loop's last write is i = 999999, so that is the surviving value.
        a.setInternalValue(999999);
        a.setVisibleValue(999999);
    }
  10. Inlining

    public class Thing {
        private int x;
        public final int getX() { return x; }
    }
    ...
    myX = thing.getX();

    class Thing {
        int x;
    }
    ...
    myX = thing.x;
  11. Adaptive compilation makes cleaner code practical

    Reduces the need to trade off clean design against speed
    E.g. “final” should be used on methods only when you want to prohibit extension or overriding. It has no effect on speed.
    E.g. branching can be written “naturally”
  12. Why should you care about GC?

    A good architect must, first and foremost, be able to impose their architectural choices on the project.
    Find the root cause.
  13. Trying to solve GC problems in application architecture is like throwing knives

    It takes practice and understanding to get it right
    You can get very good at it, but do you really want to?
    Will all the code you leverage be as good as yours?
  14. Most of what people seem to "know" about Garbage Collection is wrong

    In many cases, it’s much better than you may think
        GC is extremely efficient. Much more so than malloc()
        Dead objects cost nothing to collect
        GC will find all the dead objects (including cyclic graphs)
        ...
    In many cases, it’s much worse than you may think
        Yes, it really does stop for ~1 sec per live GB (in most JVMs)
        No, GC does not mean you can’t have memory leaks
        No, those pauses you eliminated from your 20 minute test are not gone
        ...
  15. A basic terminology example: what is a concurrent collector?

    A Concurrent Collector performs garbage collection work concurrently with the application’s own execution
    A Parallel Collector uses multiple CPUs to perform garbage collection
  16. Garbage

    Objects that are not live, but are not free either, are called garbage. With explicit deallocation, garbage cannot be reused: its space has leaked away.
  17. Garbage Collection

    Garbage collection (GC) is a form of automatic memory management. The garbage collector attempts to reclaim garbage, that is, memory occupied by objects that are no longer in use by the program.
  18. Finalizer/Finalization

    A finalizer is a special method that is executed when an object is garbage collected. It is similar in function to a destructor.
    Finalizers are usually not deterministic: a finalizer is executed when the internal garbage collection system frees the object.
  19. Resurrection

    Resurrection occurs when an object's finalizer causes the object to become reachable (that is, not garbage). The garbage collector must determine whether the object has been resurrected by the finalizer, or risk creating a dangling reference.
    The JVM will not invoke the finalize() method again after resurrection.
  20. Heap (data structure)

    In computer science, a heap is a specialized tree-based data structure that satisfies the heap property: if A is a parent node of B, then key(A) is ordered with respect to key(B), with the same ordering applying across the heap.
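The heap property from slide 20 can be checked mechanically. The sketch below assumes a min-heap stored in an array, where the children of index i sit at 2i+1 and 2i+2; the `HeapDemo` class and its method names are hypothetical. It also uses `java.util.PriorityQueue`, which the JDK documents as a priority heap whose head is the least element.

```java
import java.util.PriorityQueue;

// Hypothetical illustration class; not part of the original slides.
final class HeapDemo {
    // Returns true if every parent key is <= each of its child keys (min-heap order).
    static boolean isMinHeap(int[] keys) {
        for (int child = 1; child < keys.length; child++) {
            int parent = (child - 1) / 2;
            if (keys[parent] > keys[child]) {
                return false;
            }
        }
        return true;
    }

    // Inserts the keys into a PriorityQueue and returns the head, i.e. the smallest key.
    // Assumes at least one key is supplied.
    static int smallest(int... keys) {
        PriorityQueue<Integer> queue = new PriorityQueue<>();
        for (int key : keys) {
            queue.add(key);
        }
        return queue.poll();
    }
}
```

Note that the property only orders parents against children; siblings may appear in any order, which is why `{1, 3, 2, 7, 4}` is a valid min-heap.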
  21. Heap (memory block)

    In dynamic memory allocation, memory requests are satisfied by allocating portions from a large pool of memory called the heap.
  22. Reference Type

    A reference type is a data type that can only be accessed by references. Objects of reference types cannot be directly embedded into composite objects and are always dynamically allocated. They are usually destroyed automatically after they become unreachable.
  23. Value Type

    Types of values, or types of objects with deep copy semantics.
    Some languages use the term value type to refer to the types of objects for which assignment has deep copy semantics (as opposed to reference types, which have shallow copy semantics).
  24. Immutable Objects

    A reference type variable to an immutable object behaves with the same semantics as a value type variable.
  25. Strong reference

    A strong reference is a normal reference that protects the referred object from collection by a garbage collector. The term is used to distinguish the reference from weak references.
  26. Weak reference

    A weak reference is a reference that does not protect the referenced object from collection by a garbage collector, unlike a strong reference. An object referenced only by weak references is considered unreachable (or weakly reachable) and so may be collected at any time.
    Some garbage-collected languages feature or support various levels of weak references.
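In Java, the distinction between strong and weak references is exposed through `java.lang.ref.WeakReference`. The following is a minimal sketch; the `WeakDemo` class is hypothetical. Only the first method has a guaranteed result, because `System.gc()` is merely a request and the JVM decides when a weakly reachable object is actually cleared.

```java
import java.lang.ref.WeakReference;

// Hypothetical illustration class; not part of the original slides.
final class WeakDemo {
    // While a strong reference exists, the weak reference must still see the object.
    static boolean visibleWhileStronglyHeld() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        System.gc();                      // a hint only; cannot clear a strongly held object
        return weak.get() == strong;      // 'strong' is still in use here, so this is true
    }

    // After the last strong reference is dropped, the object MAY be collected.
    // The result depends on the JVM, so callers should not rely on it.
    static boolean clearedAfterRelease() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        strong = null;                    // object is now only weakly reachable
        System.gc();
        return weak.get() == null;
    }
}
```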
  27. Weak reference distilled

    Weak references (references which are not counted in reference counting) may be used to solve the problem of circular references: cycles are avoided by using weak references for some of the references within the group.
    For example, Apple's Cocoa framework recommends this approach, using strong references for parent-to-child references and weak references for child-to-parent references, thus avoiding cycles.
  28. Weak reference distilled

    In the case of C++, normal pointers are weak and smart pointers are strong, although normal pointers are not true weak references, as weak references are supposed to know when the object becomes unreachable.
  29. In languages such as Python and Ruby, all types are reference types, including those that appear as primitive types.

    On the Java platform, all composite and user-defined types are reference types. Only primitive types are value types.
    The .NET Framework makes a clear distinction between value and reference types, and allows creation of user-defined types of both kinds.
  30. Dispose pattern

    The dispose pattern is a design pattern which is used to handle resource cleanup in runtime environments that use automatic garbage collection.
    The fundamental problem that the dispose pattern aims to solve is that, because objects in a garbage-collected environment have finalizers rather than destructors, there is no guarantee that an object will be destroyed at any deterministic point in time.
    The dispose pattern works around this by giving an object a method (usually called Dispose or similar) which frees any resources the object is holding onto.
  31. Dispose pattern effects on programming languages

    One disadvantage of this approach is that it requires the programmer to explicitly add cleanup code in a finally block. This leads to code size bloat, and failure to do so will lead to resource leakage in the program.
  32. On Java

    The Java language introduced a new syntax called try-with-resources in Java version 7.
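A short sketch of try-with-resources follows. The `Resource` class and the log strings are hypothetical and exist only to make the ordering observable. Any type implementing `java.lang.AutoCloseable` can be declared in the `try` header, and `close()` is called automatically, in reverse declaration order, when the block exits.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of the original slides.
final class TryWithResourcesDemo {
    // A stand-in for a file, socket, or other non-memory resource.
    static final class Resource implements AutoCloseable {
        private final String name;
        private final List<String> log;

        Resource(String name, List<String> log) {
            this.name = name;
            this.log = log;
            log.add("open " + name);
        }

        @Override
        public void close() {
            log.add("close " + name);
        }
    }

    // Opens two resources, does some work, and returns the recorded event order.
    static List<String> run() {
        List<String> log = new ArrayList<>();
        try (Resource a = new Resource("a", log);
             Resource b = new Resource("b", log)) {
            log.add("work");
        }   // b.close() then a.close() run here, even if the body had thrown
        return log;
    }
}
```

Compared with the explicit `finally` block described on the previous slide, the cleanup call cannot be forgotten, because the compiler generates it.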
  33. Classifying a collector’s operation

    A Concurrent Collector performs garbage collection work concurrently with the application’s own execution
    A Parallel Collector uses multiple CPUs to perform garbage collection
    A Stop-the-World collector performs garbage collection while the application is completely stopped
    An Incremental collector performs a garbage collection operation or phase as a series of smaller discrete operations with (potentially long) gaps in between
    “Mostly” means sometimes it isn’t (usually means a different fallback mechanism exists)
  34. Precise vs. Conservative Collection

    A Collector is Conservative if it is unaware of some object references at collection time, or is unsure about whether a field is a reference or not
    A Collector is Precise if it can fully identify and process all object references at the time of collection
        A collector MUST be precise in order to move objects
        The COMPILERS need to produce a lot of information (oopmaps)
    All commercial server JVMs use precise collectors
    All commercial server JVMs use some form of a moving collector
  35. Memory Use

    How many of you use heap sizes of:
        more than 1/2 GB?
        more than 1 GB?
        more than 2 GB?
        more than 4 GB?
        more than 10 GB?
        more than 20 GB?
        more than 50 GB?
  36. Pros

    Garbage collection eliminates or substantially reduces:
        Dangling pointer bugs
        Double free bugs
        Certain kinds of memory leaks
    It also allows efficient implementations of persistent data structures
  37. Cons

    GC consumes computing resources.
    The moment when the garbage is actually collected can be unpredictable, resulting in stalls scattered throughout a session (that is, Stop-the-World pauses).
    Non-deterministic GC is incompatible with RAII-based management of non-GCed resources. As a result, the need for explicit manual resource management (release/close) for non-GCed resources becomes transitive to composition.
    Garbage collection is rarely used on embedded or real-time systems because of the perceived need for very tight control over the use of limited resources. However, garbage collectors compatible with such limited environments have been developed.
  38. Reference counting

    It is also the method used by many operating systems to determine whether a file may be deleted from the file-store.
  39. Reference counting

    Management of active and garbage cells is interleaved with the execution of the user program. Reference counting may therefore be a suitable method if a smoother response time is important.
    A cell whose reference count becomes zero can be reclaimed without access to cells in other pages of the heap.
    The disadvantage is the high processing cost paid to update counters to maintain the reference count invariant.
  40. Q: Cyclic data structures

    Many implementations of lazy functional languages based on graph reduction use cycles to handle recursion.
  41. Tradeoffs

    Since Python makes heavy use of malloc() and free(), it needs a strategy to avoid memory leaks as well as the use of freed memory. The chosen method is called reference counting.
    The principle is simple: every object contains a counter, which is incremented when a reference to the object is stored somewhere, and which is decremented when a reference to it is deleted. When the counter reaches zero, the last reference to the object has been deleted and the object is freed.
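The counting principle, and the cyclic-structure problem raised on the earlier slide, can be sketched in a few lines. This is a toy model in Java, not how CPython is implemented; the `RefCounted` class and its methods are hypothetical, and "freeing" is simulated with a flag.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of a reference-counted object; not CPython's implementation.
final class RefCounted {
    private int count;
    private boolean freed;
    private final List<RefCounted> fields = new ArrayList<>();

    // Called whenever a new reference to this object is stored somewhere.
    void incRef() {
        count++;
    }

    // Called whenever a reference is deleted; frees the object at zero.
    void decRef() {
        if (--count == 0) {
            freed = true;
            // Freeing an object deletes the references it was holding.
            for (RefCounted child : fields) {
                child.decRef();
            }
            fields.clear();
        }
    }

    // Stores a reference to target inside this object.
    void store(RefCounted target) {
        target.incRef();
        fields.add(target);
    }

    boolean isFreed() { return freed; }

    int count() { return count; }
}
```

If `x` and `y` store references to each other and both external references are then dropped, each count settles at 1 rather than 0, so neither object is ever freed. That is the leak that weak references or a tracing collector are meant to address.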
  42. Warm-up for the code

    There are two macros, Py_INCREF(x) and Py_DECREF(x), which increment and decrement the reference count. Py_DECREF() also frees the object when the count reaches zero.
    For flexibility, it doesn't call free() directly; rather, it makes a call through a function pointer in the object's type object. For this purpose (and others), every object also contains a pointer to its type object.
    http://docs.python.org/release/2.5.2/ext/refcounts.html
  43. When to use Py_INCREF(x)/Py_DECREF(x)?

    Nobody "owns" an object; however, you can own a reference to an object. An object's reference count is now defined as the number of owned references to it.
    The owner of a reference is responsible for calling Py_DECREF() when the reference is no longer needed. Ownership of a reference can be transferred.
    There are three ways to dispose of an owned reference: pass it on, store it, or call Py_DECREF(). Forgetting to dispose of an owned reference creates a memory leak.
  44. Mark-Sweep/Scan Algorithm

    The Mark-Sweep algorithm relies on a global traversal of all live objects to determine which cells are available for reclamation.
  45. Mark-Sweep Algorithm

    The mark phase identifies all active cells.
    The sweep phase returns garbage cells to the free pool.
  46. Mark (aka "Trace")

    Start from "roots" (thread stacks, statics, etc.)
    "Paint" anything you can reach as "live"
    At the end of a mark pass:
        all reachable objects will be marked "live"
        all non-reachable objects will be marked "dead" (aka "non-live")
    Note: work is generally linear to "live set"
  47. Sweep

    Scan through the heap, identify "dead" objects and track them somehow (usually in some form of free list)
    Note: work is generally linear to heap size
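The mark and sweep phases described on the last few slides can be sketched over a toy object graph. This is an illustrative model only: the `ToyHeap` and `Cell` names are hypothetical, the "heap" is a list of cells, and reclaiming a cell simply removes it from that list where a real collector would thread it onto a free list.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Hypothetical toy heap; not a real JVM collector.
final class ToyHeap {
    static final class Cell {
        final String name;
        final List<Cell> refs = new ArrayList<>();   // outgoing references
        boolean marked;

        Cell(String name) { this.name = name; }
    }

    final List<Cell> cells = new ArrayList<>();      // every allocated cell
    final List<Cell> roots = new ArrayList<>();      // stand-in for thread stacks and statics

    Cell allocate(String name) {
        Cell cell = new Cell(name);
        cells.add(cell);
        return cell;
    }

    // Mark: paint everything reachable from the roots. Work is linear to the live set.
    void mark() {
        Deque<Cell> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Cell cell = pending.pop();
            if (cell.marked) {
                continue;
            }
            cell.marked = true;
            pending.addAll(cell.refs);
        }
    }

    // Sweep: scan the whole heap and reclaim unmarked cells. Work is linear to heap size.
    int sweep() {
        int reclaimed = 0;
        for (Iterator<Cell> it = cells.iterator(); it.hasNext(); ) {
            Cell cell = it.next();
            if (cell.marked) {
                cell.marked = false;   // reset for the next cycle
            } else {
                it.remove();           // a real collector would add this cell to a free list
                reclaimed++;
            }
        }
        return reclaimed;
    }

    // One full collection cycle; returns the number of cells reclaimed.
    int collect() {
        mark();
        return sweep();
    }
}
```

Because marking starts from the roots, an unreachable cycle is never painted and is reclaimed by the sweep, unlike under pure reference counting.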
  48. Why do garbage collection?

    One of the biggest sources of bookkeeping in systems programs is memory management. We feel it's critical to eliminate that programmer overhead, and advances in garbage collection technology in the last few years give us confidence that we can implement it with low enough overhead and no significant latency.
    http://golang.org/doc/faq#garbage_collection
  49. Copy GC

    A copying collector moves all live objects from a "from" space to a "to" space and reclaims the "from" space
    At start of copy, all objects are in "from" space and all references point to "from" space
    Start from "root" references, copy any reachable object to "to" space, correcting references as we go
    At end of copy, all objects are in "to" space, and all references point to "to" space
    Note: work is generally linear to "live set"
  50. Pros and cons

    Allocation costs are extremely low: the out-of-space check is a simple pointer comparison; new memory is acquired simply by incrementing the free space pointer.
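The allocation path described here is often called bump-pointer allocation. Below is a minimal sketch under the assumption that the space is modelled as integer offsets rather than real memory; the `BumpAllocator` class is hypothetical.

```java
// Hypothetical model of bump-pointer allocation over integer offsets.
final class BumpAllocator {
    private final int limit;   // end of the space
    private int free;          // offset of the next unallocated byte

    BumpAllocator(int capacity) {
        this.limit = capacity;
    }

    // Returns the offset of the new block, or -1 when the space is exhausted
    // (the point at which a real runtime would trigger a collection).
    int allocate(int size) {
        if (size > limit - free) {   // out-of-space check: a single comparison
            return -1;
        }
        int address = free;
        free += size;                // allocation: bump the free pointer
        return address;
    }
}
```

There is no free list to search and no per-block header to maintain, which is why allocation in a copying collector's space is so cheap.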
  51. Generational/ephemeral GC

    Generational Hypothesis: most objects die young
    Focus collection efforts on young generation:
        Use a moving collector: work is linear to the live set
        The live set in the young generation is a small % of the space
        Promote objects that live long enough to older generations
    Only collect older generations as they fill up
        “Generational filter” reduces rate of allocation into older generations
    Tends to be (order of magnitude) more efficient
        Great way to keep up with high allocation rate
        Practical necessity for keeping up with processor throughput
  52. Incremental Compaction

    Track cross-region remembered sets (which region points to which)
    To compact a single region, only need to scan regions that point into it to remap all potential references
    Identify region sets that fit in limited time
        Each such set of regions is a Stop-the-World increment
        Safe to run application between (but not within) increments
    Note: work can grow with the square of the heap size
        The number of regions pointing into a single region is generally linear to the heap size (the number of regions in the heap)
  53. Garbage Collector in V8

    V8 compiles JavaScript to native machine code (IA-32, x86-64, ARM, or MIPS CPUs) before executing it, instead of more traditional techniques such as executing bytecode or interpreting it.
    Optimization techniques used include inlining, elision of expensive runtime properties, and inline caching, among many others.
    The garbage collector of V8 is a generational incremental collector.
  54. Comparison

    Copy requires 2x the max. live set to be reliable
    Mark/Compact [typically] requires 2x the max. live set in order to fully recover garbage in each cycle
    Mark/Sweep/Compact only requires 1x (plus some)
    Copy and Mark/Compact are linear only to live set
    Mark/Sweep/Compact linear (in sweep) to heap size
    Mark/Sweep/(Compact) may be able to avoid some moving work
    Copying is [typically] "monolithic"
  55. Intuitive

    If we had exactly 1 byte of empty memory at all times, the collector would have to work “very hard”, and GC would take 100% of the CPU time
    If we had infinite empty memory, we would never have to collect, and GC would take 0% of the CPU time
    GC CPU % will follow a rough 1/x curve between these two limit points, dropping as the amount of memory increases.
  56. Empty memory needs (empty memory == CPU power)

    The amount of empty memory in the heap is the dominant factor controlling the amount of GC work
    For both Copy and Mark/Compact collectors, the amount of work per cycle is linear to live set
    The amount of memory recovered per cycle is equal to the amount of unused memory: (heap size) - (live set)
    The collector has to perform a GC cycle when the empty memory runs out
    A Copy or Mark/Compact collector’s efficiency doubles with every doubling of the empty memory
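The doubling claim follows from simple arithmetic: work per cycle is proportional to the live set, while memory recovered per cycle equals heap size minus live set. A small sketch, assuming a proportionality constant of 1 and a hypothetical `GcEfficiency` helper:

```java
// Hypothetical helper for the arithmetic on this slide; units are arbitrary.
final class GcEfficiency {
    // Collector work per unit of memory recovered, for a collector whose
    // per-cycle work is linear to the live set (Copy, Mark/Compact).
    // Assumes heapSize > liveSet and a proportionality constant of 1.
    static double workPerUnitRecovered(double heapSize, double liveSet) {
        double empty = heapSize - liveSet;   // memory recovered per cycle
        return liveSet / empty;
    }
}
```

With a live set of 1 GB, a 2 GB heap (1 GB empty) costs 1.0 units of work per unit recovered, and a 3 GB heap (2 GB empty) costs 0.5. Doubling the empty memory halves the cost, which matches the rough 1/x curve from the previous slide.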
  57. What empty memory controls

    Empty memory controls efficiency (amount of collector work needed per amount of application work performed)
    Empty memory controls the frequency of pauses (if the collector performs any Stop-the-World operations)
    Empty memory DOES NOT control pause times (only their frequency)
    In Mark/Sweep/Compact collectors that pause for sweeping, more empty memory means less frequent but LARGER pauses
  58. G1GC (aka “Garbage First”)

    Monolithic Stop-the-World copying NewGen
    Mostly concurrent OldGen marker
        Mostly concurrent marking
        Stop-the-World to catch up on mutations, ref processing, etc.
        Tracks inter-region relationships in remembered sets
    Stop-the-World, mostly incremental compacting old gen
        Objective: “Avoid, as much as possible, having a Full GC...”
        Compact sets of regions that can be scanned in limited time
        Delay compaction of popular objects, popular regions
    Fallback to Full Collection (monolithic Stop-the-World)
        Used for compacting popular objects, popular regions, etc.
  59. Why does Clojure hang for 3 seconds when starting nREPL?

    My guess: Stop-the-World, and then flush...
  60. Special thanks

    Using Princeton/Coursera slides as skeleton materials
    Design Patterns
    Algorithms (4th ed.)
    Gil Tene’s presentation on InfoQ
    Garbage Collection
    Code Complete (2nd ed.)