Wednesday, February 3, 2016

How Java garbage collection works

As Java developers, we all know that the JVM provides automatic garbage collection (GC), so we don't need to worry about freeing memory by hand as we do in C. But how does GC work behind the scenes? Understanding that helps us write much better Java applications.

There are many articles on the web that dive deep into this topic; I will only cover some GC basics in this post. First, you may have heard the term "stop-the-world". What does it mean? It means the JVM stops running the application in order to execute a GC. During a stop-the-world pause, every application thread stops its work until the GC threads complete theirs.

JVM Generations

In Java, we don't explicitly free memory in our code. The GC finds unreferenced objects and removes them. According to an article by Sangmin Lee [1], the GC was designed around the following two hypotheses.
  • Most objects soon become unreachable.
  • References from old objects to young objects only exist in small numbers.
Therefore, the heap is divided into segments, which Java calls generations.

Young Generation: All new objects are allocated in Young Generation. When this area is full, GC removes unreachable objects from it. This is called "minor garbage collection" or "minor GC".

Old Generation: Objects that survive long enough in the Young Generation are moved to the Old Generation, also called the Tenured Generation. The Old Generation is larger, and GC runs on it less frequently. A GC that removes objects from the Old Generation is called a "major garbage collection" or "major GC".

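Permanent Generation: The Permanent Generation holds class and method metadata, so it is also known as the "method area". Objects that survive in the Old Generation are not moved here. A GC in this area is also counted as a "major GC", and some sources call a GC a "full GC" if it covers the Permanent Generation. Note that Java 8 removed the Permanent Generation and keeps class metadata in Metaspace (native memory) instead, which is why the sample output later in this post shows a Metaspace pool.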

The Young Generation is further divided into an Eden space and two Survivor spaces. They are used to track the age of objects and to decide when to move them to the Old Generation.
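To see these areas on a running JVM, the standard java.lang.management API can list its memory pools. The following is a minimal sketch; the class name MemoryPoolLister is made up for illustration, and the pool names printed depend on the JVM version and the collector in use (for example, "Eden Space" versus "G1 Eden Space"), so no particular output is assumed here.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the original post.
class MemoryPoolLister {

    /** Returns one human-readable line per memory pool of the running JVM. */
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
            long usedKb = (usage == null) ? -1 : usage.getUsed() / 1024;
            String kind = (pool.getType() == MemoryType.HEAP) ? "heap" : "non-heap";
            lines.add(String.format("%-30s %-8s used=%dK", pool.getName(), kind, usedKb));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describePools()) {
            System.out.println(line);
        }
    }
}
```

Eden, Survivor and Old (Tenured) pools are reported as heap pools, while areas such as Metaspace and the code cache are reported as non-heap.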

Generational Garbage Collection

Now, how does the GC work with these generations in the heap?
1. Newly created objects are allocated in the Eden space. Both Survivor spaces are empty at the beginning.
2. When the Eden space is full, a minor GC occurs. It deletes all unreferenced objects from the Eden space and moves referenced objects to the first survivor space (S0). The Eden space is then empty, and new objects can be allocated in it.
3. When the Eden space is full again, another minor GC occurs. It again deletes all unreferenced objects from the Eden space, but this time the referenced objects are moved to the second survivor space (S1). In addition, referenced objects in the first survivor space (S0) are moved to S1 and have their age incremented, and unreferenced objects in S0 are deleted. So one survivor space is always empty.
4. The same process repeats in subsequent minor GCs, with the survivor spaces switching roles each time.
5. When objects in the survivor spaces reach an age threshold, they are moved to the Old Generation.
6. When the Old Generation is full, a major GC is performed to delete the unreferenced objects in the Old Generation and compact the referenced ones.
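
The steps above can be sketched as a toy simulation. This is only an illustration of the bookkeeping, not how a JVM is implemented: the class GenerationalHeapSketch, the Obj type, the referenced flag and the threshold of 3 are all hypothetical, and a real collector finds live objects by tracing references from GC roots rather than by reading a flag. For simplicity, the sketch increments an object's age on every minor GC it survives.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of steps 1 to 5; all names and values are made up for illustration.
class GenerationalHeapSketch {

    /** A toy object: a name, an age, and a flag saying whether it is still referenced. */
    static final class Obj {
        final String name;
        int age = 0;
        boolean referenced = true;
        Obj(String name) { this.name = name; }
    }

    static final int TENURING_THRESHOLD = 3; // hypothetical value chosen for the demo

    final List<Obj> eden = new ArrayList<>();
    List<Obj> fromSurvivor = new ArrayList<>(); // survivor space currently holding objects
    List<Obj> toSurvivor = new ArrayList<>();   // survivor space kept empty between GCs
    final List<Obj> old = new ArrayList<>();

    /** Step 1: new objects always start in Eden. */
    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);
        return o;
    }

    /** Steps 2 to 5: one minor GC over Eden and the occupied survivor space. */
    void minorGc() {
        List<Obj> candidates = new ArrayList<>(eden);
        candidates.addAll(fromSurvivor);
        for (Obj o : candidates) {
            if (!o.referenced) continue;      // unreferenced objects are dropped
            o.age++;                          // survived one more minor GC
            if (o.age >= TENURING_THRESHOLD) {
                old.add(o);                   // step 5: promote to the Old Generation
            } else {
                toSurvivor.add(o);            // copy into the empty survivor space
            }
        }
        eden.clear();
        fromSurvivor.clear();
        // Step 4: the survivor spaces switch roles, so one is always empty.
        List<Obj> tmp = fromSurvivor;
        fromSurvivor = toSurvivor;
        toSurvivor = tmp;
    }

    public static void main(String[] args) {
        GenerationalHeapSketch heap = new GenerationalHeapSketch();
        heap.allocate("cache");                     // stays referenced
        heap.allocate("temp").referenced = false;   // becomes garbage immediately
        for (int i = 1; i <= 3; i++) {
            heap.minorGc();
            System.out.printf("after minor GC %d: eden=%d survivor=%d old=%d%n",
                    i, heap.eden.size(), heap.fromSurvivor.size(), heap.old.size());
        }
    }
}
```

In this run, "temp" disappears in the first minor GC, while "cache" moves between the survivor spaces twice and is promoted to the old list on the third minor GC.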

The steps above give a quick overview of GC in the Young Generation. How a major GC works differs from one GC type to another. Basically, there are 5 GC types.
1. Serial GC
2. Parallel GC
3. Parallel Compacting GC
4. CMS GC
5. G1 GC

You select a GC type with a JVM command-line option. For example, -XX:+UseG1GC selects G1 GC.
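
To check which collector a JVM actually ended up with, one option is to ask the JVM itself. The sketch below is an illustration, not something from the original post: the class ActiveCollectors and its method names are hypothetical, and the prefix/suffix match on "-XX:+Use...GC" is a simple heuristic I am assuming is good enough for the selection flags. The collector names reported by GarbageCollectorMXBean vary by JVM version and GC type.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class for illustration only.
class ActiveCollectors {

    /** Names of the collectors registered by the running JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    /** Picks GC selection flags such as -XX:+UseG1GC out of a list of JVM arguments. */
    static List<String> gcSelectionFlags(List<String> jvmArgs) {
        List<String> flags = new ArrayList<>();
        for (String arg : jvmArgs) {
            // Heuristic (an assumption): selection flags look like -XX:+Use<Name>GC.
            if (arg.startsWith("-XX:+Use") && arg.endsWith("GC")) {
                flags.add(arg);
            }
        }
        return flags;
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("GC flags passed:   " + gcSelectionFlags(jvmArgs));
        System.out.println("Collectors in use: " + collectorNames());
    }
}
```

Running the class with and without -XX:+UseG1GC on the java command line should show different collector names, which is a quick way to confirm the option took effect.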

Monitor Java Garbage Collection

There are several ways to monitor GC. I will list some of the most commonly used ones below.

jstat

jstat is in $JAVA_HOME/bin. You can run it with "jstat -gc <vmid> 1000". vmid is the virtual machine identifier, which is normally the process ID of the JVM. 1000 is the sampling interval in milliseconds, so the GC data is printed every second. The meaning of the output columns can be found here.
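
If you need the vmid of an application from inside that application, for example to log the jstat command to run against it, a small sketch like the one below may help. It relies on the HotSpot convention that RuntimeMXBean.getName() returns "pid@hostname"; that format is not guaranteed by the API, so treat it as an assumption. The class name VmidFinder is hypothetical.

```java
import java.lang.management.ManagementFactory;

// Hypothetical helper class for illustration only.
class VmidFinder {

    /** Extracts the part before '@' from a runtime name such as "12345@myhost". */
    static String pidFromRuntimeName(String runtimeName) {
        int at = runtimeName.indexOf('@');
        // If there is no '@', fall back to the whole name rather than guessing.
        return (at > 0) ? runtimeName.substring(0, at) : runtimeName;
    }

    public static void main(String[] args) {
        String name = ManagementFactory.getRuntimeMXBean().getName();
        // Assumption: on HotSpot the name has the form pid@hostname.
        System.out.println("jstat -gc " + pidFromRuntimeName(name) + " 1000");
    }
}
```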

VisualVM

VisualVM is a GUI tool provided by Oracle. It can be downloaded from here.

GarbageCollectorMXBean and GarbageCollectionNotificationInfo

GarbageCollectorMXBean and GarbageCollectionNotificationInfo can be used to collect GC data programmatically. An example can be found here in my GitHub. You can use "mvn jetty:run" to start a Jetty server and observe GC information like the following.
Minor GC: - 61 (Allocation Failure) start: 2016-02-03 22:22:17.784, end: 2016-02-03 22:22:17.789
        [Eden Space] init:4416K; used:19.2%(13440K) -> 0.0%(0K); committed: 19.2%(13440K) -> 19.2%(13440K)
        [Code Cache] init:160K; used:14.7%(4823K) -> 14.7%(4823K); committed: 14.7%(4832K) -> 14.7%(4832K)
        [Survivor Space] init:512K; used:16.7%(1456K) -> 13.3%(1162K); committed: 19.1%(1664K) -> 19.1%(1664K)
        [Metaspace] init:0K; used:19393K -> 19393K); committed: 19840K -> 19840K)
        [Tenured Gen] init:10944K; used:18.6%(32621K) -> 19.2%(33563K); committed: 19.0%(33360K) -> 19.2%(33616K)
duration:5ms, throughput:99.9%, collection count:61, collection time:213

Major GC: - 6 (Allocation Failure) start: 2016-02-03 22:22:17.789, end: 2016-02-03 22:22:17.839
        [Eden Space] init:4416K; used:0.0%(0K) -> 0.0%(0K); committed: 19.2%(13440K) -> 19.2%(13440K)
        [Code Cache] init:160K; used:14.7%(4823K) -> 14.7%(4823K); committed: 14.7%(4832K) -> 14.7%(4832K)
        [Survivor Space] init:512K; used:13.3%(1162K) -> 0.0%(0K); committed: 19.1%(1664K) -> 19.1%(1664K)
        [Metaspace] init:0K; used:19393K -> 19393K); committed: 19840K -> 19840K)
        [Tenured Gen] init:10944K; used:19.2%(33563K) -> 14.0%(24559K); committed: 19.2%(33616K) -> 19.2%(33616K)
duration:50ms, throughput:99.6%, collection count:6, collection time:228

Alternatively, you can run the GCMonitor class as a Java application. It may take a long time to finish, because it keeps running until a major GC occurs.
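
For readers who just want the core idea, here is a minimal sketch of subscribing to GC notifications. It is not the GCMonitor class mentioned above; the class name GcNotificationSketch and its output format are my own, and it prints far less detail than the sample output. It assumes a HotSpot-style JVM where the GC beans implement NotificationEmitter and com.sun.management is available.

```java
import com.sun.management.GarbageCollectionNotificationInfo;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicInteger;
import javax.management.Notification;
import javax.management.NotificationEmitter;
import javax.management.NotificationListener;
import javax.management.openmbean.CompositeData;

// Hypothetical class for illustration; not the author's GCMonitor.
class GcNotificationSketch {

    static final AtomicInteger seen = new AtomicInteger();

    /** Subscribes to every collector that can emit notifications; returns how many. */
    static int install() {
        NotificationListener listener = (Notification n, Object handback) -> {
            if (!GarbageCollectionNotificationInfo.GARBAGE_COLLECTION_NOTIFICATION
                    .equals(n.getType())) {
                return; // ignore anything that is not a GC notification
            }
            GarbageCollectionNotificationInfo info =
                    GarbageCollectionNotificationInfo.from((CompositeData) n.getUserData());
            seen.incrementAndGet();
            System.out.printf("%s - %s (%s) duration:%dms%n",
                    info.getGcAction(), info.getGcName(), info.getGcCause(),
                    info.getGcInfo().getDuration());
        };
        int subscribed = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            if (gc instanceof NotificationEmitter) {
                ((NotificationEmitter) gc).addNotificationListener(listener, null, null);
                subscribed++;
            }
        }
        return subscribed;
    }

    public static void main(String[] args) throws InterruptedException {
        install();
        System.gc();       // only a request; the JVM may ignore it
        Thread.sleep(500); // notifications are delivered asynchronously
        System.out.println("notifications seen: " + seen.get());
    }
}
```

Per-pool figures like the ones in the sample output can be read from info.getGcInfo().getMemoryUsageBeforeGc() and getMemoryUsageAfterGc(), which map pool names to MemoryUsage values.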

References:
[1] http://www.cubrid.org/blog/dev-platform/understanding-java-garbage-collection/
[2] http://www.oracle.com/webfolder/technetwork/tutorials/obe/java/gc01/index.html
