Skip to content

JVM Memory Model

JDK 8

GC

Garbage Collection Algorithm

  • Mark-Sweep

    • Performance issue (stop the world)
    • Space fragmentation(the free space is not continuous)
  • Copying

    To solve the problem of efficiency, a "copy" collection algorithm has emerged. It divides the memory into two blocks of the same size and uses one of them at a time. When this piece of memory is cleaned up, the surviving objects are copied to another piece, and then the used space is cleaned up. In this way, each time the memory is collected, half of the memory range is collected.

  • Mark-Compact

    A tagging algorithm based on the characteristics of the old generation. The tagging process is still the same as the "mark-sweep" algorithm, but the subsequent steps are not to directly recycle the recyclable objects, but to move all surviving objects to one end and then clean up directly. Memory beyond the end boundary. No division.

  • Generational Collection

    The current generation of garbage collection for virtual machines uses a generational collection algorithm. This algorithm has no new ideas, but only divides the memory into several blocks according to the different life cycle of the object. The java heap is generally divided into new generation and old generation, so we can choose the appropriate garbage collection algorithm according to the characteristics of each generation.

    For example, in the new generation, a large number of objects die every time you collect, so you can choose a replication algorithm, and you only need to pay a small amount of object replication costs to complete each garbage collection. The old generation object has a higher probability of survival, and there is no extra space to guarantee its allocation, so we must choose "mark-clear" or "mark-finish" algorithms for garbage collection.

Why Young Generation & Old Generation (Tenured Generation)

The only reason is to optimize GC performance.

The HotSpot VM divides the young generation into three parts. One Eden and two Survivors and the default ration are 8:1. The garbage in the young generation is Minor GC. On the contrary, the garbage in the old generation is Major GC or Full GC. The frequency of Minor GC is much higher than the Major GC.

Garbage collector

  • Serial

    • Young Generation

      Copying

    • Old Generation

      Mark-Compact

    • Suit for client mode

  • PairNew

    It's the same as the Serial collector except it uses multiple GC threads.

    It suits for server-side mode.

  • Parallel Scavenge

    It likes the same as the PairNew collector. But it cares about the throughput, efficient use of CPU.

    The so-called throughput is the ratio of the time in the CPU used to run user code to the total CPU consumption time. The Parallel Scavenge collector provides many parameters for users to find the most suitable pause time or maximum throughput. If you do not know much about the collector operation, if manual optimization is difficult, you can choose to hand over memory management optimization to the virtual machine to complete it.

  • CMS

    • Initial Mark

      Pause all other threads and record the objects directly connected to root, very fast

    • Concurrent Mark

      Turn on the GC and user threads at the same time, and use a closure structure to record reachable objects. But at the end of this phase, this closure structure is not guaranteed to contain all currently reachable objects. Because the user thread may continuously update the reference domain, the GC thread cannot guarantee the real-time performance of the reachability analysis. So this algorithm keeps track of where these reference updates occur.

    • Remark

      The remark phase is to modify the marking record of the part of the object whose marking changes due to the user program continuing to run during the concurrent marking period. The pause time in this phase is generally slightly longer than the initial marking phase, which is much longer than the concurrent marking phase. short

    • Concurrent Sweep

      The user thread is started, and the GC thread starts cleaning the marked area.

    • Concurrent Reset

    Advantages & Disadvantages

    • Advantages

      Concurrent collection, low pause

    • Disadvantage

      Sensitive to CPU resources.

      Unable to handle floating garbage.

      The mark-clear algorithm it uses will cause a large amount of space debris to be generated at the end of the collection.

  • G1

    G1 is a concurrent, generational garbage collector that works by dividing the heap into multiple regions. Garbage collection is performed by copying live objects from one region to another, and the regions that are collected are then compacted. G1 is designed to achieve high throughput and low pause times, and it is well-suited for large heaps.

    The older garbage collectors, the heap are structured into there sections with fixed memory size,

    • Young Generation
    • Old Generation
    • Permanent Generation

    The G1 takes a different approach.

    Note

    Humongous regions is special region in G1 for saving big object.

    • The heap is partitioned into a set of equal-sized heap regions, each a contiguous range of virtual memory

      Each size of the region is a multiple of 2, and ranges from 1Mb to 32Mb.

    • Certain region sets ares assigned the same roles(eden, survivor, old), but not fixed size for them

      • Young Generation

        The heap is split into approximately 2000 regions.

        Note

        The regions are not required to be contiguous like the older garbage collectors..

        A young GC in G1 is the following

        • Live objects are evacuated(copied or moved) to one or more survivor region

          If the aging threshold is met, the objects are promoted to the old generation regions.

        • STW

          Eden size and survivor size is calculated for the next young GC. Accounting information is kept to help calculate the size. Things like the pause time goal are taken into consideration.

          This approach makes it very easy to resize regions, making them bigger or smaller as needed.

        • End

          Live objects have been evacuated to survivor regions or to old generation regions.

        Note

        The young GC is done in parallel using multiple threads.

      • Old Generation

        Like the CMS collector, the G1 collector is designed to be a low pause collector for old generation objects. The following table describes the G1 collection phases on old generation.

        Phase Description
        (1) Initial Mark (Stop the World Event) This is a stop the world event. With G1, it is piggybacked on a normal young GC. Mark survivor regions (root regions) which may have references to objects in old generation.
        (2) Root Region Scanning Scan survivor regions for references into the old generation. This happens while the application continues to run. The phase must be completed before a young GC can occur.
        (3) Concurrent Marking Find live objects over the entire heap. This happens while the application is running. This phase can be interrupted by young generation garbage collections.
        (4) Remark (Stop the World Event) Completes the marking of live object in the heap. Uses an algorithm called snapshot-at-the-beginning (SATB) which is much faster than what was used in the CMS collector.
        (5) Cleanup (Stop the World Event and Concurrent) Performs accounting on live objects and completely free regions. (Stop the world)Scrubs the Remembered Sets. (Stop the world)Reset the empty regions and return them to the free list. (Concurrent)
        () Copying (Stop the World Event)* These are the stop the world pauses to evacuate or copy live objects to new unused regions. This can be done with young generation regions which are logged as [GC pause (young)]. Or both young and old generation regions which are logged as [GC Pause (mixed)].

    In summary, the Garbage-First (G1) collector is a server-style garbage collector, targeted for multi-processor machines with large memories. It meets garbage collection (GC) pause time goals with a high probability, while achieving high throughput.

    Reference https://www.oracle.com/technetwork/tutorials/tutorials-1876574.html