Showing posts with label Java. Show all posts

Wednesday, February 3, 2016

How Java garbage collection works

As Java developers, we all know that the JVM provides an automatic garbage collection (GC) mechanism, so we don't need to worry about memory allocation and deallocation as we do in C. But how does GC work behind the scenes? Understanding that helps us write much better Java applications.

You can find many articles on Google that dive deep into this topic; I will only cover some GC basics in this post. First, you may have heard the term "stop-the-world". What does it mean? It means the JVM stops running the application in order to execute a GC. During the stop-the-world time, every application thread stops its work until the GC threads complete theirs.

JVM Generations

In Java, we don't explicitly manage memory allocation and deallocation in code. The GC finds unreferenced objects and removes them. According to an article by Sangmin Lee [1], the GC was designed around the two hypotheses below.
  • Most objects soon become unreachable.
  • References from old objects to young objects only exist in small numbers.
Therefore, the heap is split into several segments, which Java calls generations.
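To make "unreferenced" a bit more concrete, here is a small sketch of my own (not taken from the referenced article) that uses a WeakReference as a probe. The class name ReachabilityDemo and the strongRef field are made up for illustration. Keep in mind that System.gc() is only a request, so the second result depends on the JVM and its flags.

```java
import java.lang.ref.WeakReference;

/** Sketch: an object only becomes eligible for collection once no strong reference reaches it. */
final class ReachabilityDemo {
    // Hypothetical holder, used only in this demo to keep one object strongly reachable.
    static Object strongRef;

    /** While strongRef still points at the object, the GC must not reclaim it. */
    static boolean survivesWhileReferenced() {
        strongRef = new byte[1024];
        WeakReference<Object> probe = new WeakReference<>(strongRef);
        System.gc(); // only a request; the JVM is free to ignore it
        return probe.get() != null;
    }

    /** Once the last strong reference is dropped, the GC is free to reclaim the object. */
    static boolean collectedAfterDroppingReference() throws InterruptedException {
        strongRef = new byte[1024];
        WeakReference<Object> probe = new WeakReference<>(strongRef);
        strongRef = null; // the object is now unreachable through strong references
        for (int i = 0; i < 10 && probe.get() != null; i++) {
            System.gc();
            Thread.sleep(50);
        }
        return probe.get() == null;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("alive while referenced: " + survivesWhileReferenced());
        System.out.println("collected after dropping reference: " + collectedAfterDroppingReference());
    }
}
```

The first method should always report true. The second usually reports true on HotSpot with default settings, but that is not guaranteed.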

Young Generation: All new objects are allocated in the Young Generation. When this area fills up, the GC removes unreachable objects from it. This is called "minor garbage collection" or "minor GC".

Old Generation: Objects that survive in the Young Generation are moved to the Old Generation, also called the Tenured Generation. The Old Generation is larger, and the GC collects it less frequently. A collection of the Old Generation is called "major garbage collection" or "major GC".

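Permanent Generation: The Permanent Generation holds class and method metadata, so it is also known as the "method area". It does not store objects that survived the Old Generation. A GC in this area is also considered a "major GC". Some sources call a GC a "full GC" if it covers the Permanent Generation. Note that since Java 8 the Permanent Generation has been replaced by Metaspace, which lives in native memory; that is why the sample output later in this post shows a Metaspace pool.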

The Young Generation is divided into an Eden space and two Survivor spaces. They are used to track the age of objects and decide when to move them to the Old Generation.
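If you want to see these spaces in a running JVM, the standard java.lang.management API exposes them as memory pools. The sketch below is my own illustration (the class name MemoryPools is made up); the pool names it prints depend on the JVM version and the selected collector, so don't expect exactly "Eden Space" everywhere.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

/** Sketch: list the memory pools of the current JVM and their usage. */
final class MemoryPools {
    /** Names of the pools that belong to the Java heap (Eden, Survivor, Old/Tenured, ...). */
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // may be null if the pool is no longer valid
            if (usage == null) {
                continue;
            }
            System.out.printf("[%s] type:%s used:%dK committed:%dK%n",
                    pool.getName(), pool.getType(),
                    usage.getUsed() / 1024, usage.getCommitted() / 1024);
        }
    }
}
```

Non-heap pools such as Metaspace and the code cache show up in the same list with type NON_HEAP.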

Generational Garbage Collection

Now, how does the GC work across these generations in the heap?
1. Newly created objects are allocated in the Eden space. Both Survivor spaces are empty at the beginning.
2. When the Eden space is full, a minor GC occurs. It deletes all unreferenced objects from the Eden space and moves referenced objects to the first Survivor space (S0). The Eden space is now empty, so new objects can be allocated in it.
3. When the Eden space is full again, another minor GC occurs. It again deletes all unreferenced objects from the Eden space and moves the referenced ones, but this time they go to the second Survivor space (S1). In addition, referenced objects in the first Survivor space (S0) are moved to S1 and have their age incremented, and unreferenced objects in S0 are deleted. As a result, one Survivor space is always empty.
4. The same process repeats in subsequent minor GCs, with the Survivor spaces switching roles each time.
5. When the age of an object in a Survivor space reaches a threshold, the object is moved to the Old Generation.
6. When the Old Generation is full, a major GC is performed to delete the unreferenced objects in the Old Generation and compact the referenced ones.
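The steps above can be sketched as a toy model. This is only my own simulation of the bookkeeping, not how HotSpot is implemented: the class ToyGenerationalHeap, the reachable flag and the tenuringThreshold parameter are all made up for illustration, and for simplicity the model increments an object's age on every copy.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the minor GC steps; a simulation, not a real collector. */
final class ToyGenerationalHeap {
    /** A simulated object: only its age and reachability matter here. */
    static final class Obj {
        final String name;
        boolean reachable = true;
        int age = 0;
        Obj(String name) { this.name = name; }
    }

    private final int tenuringThreshold;          // hypothetical knob for this sketch
    final List<Obj> eden = new ArrayList<>();
    List<Obj> fromSurvivor = new ArrayList<>();   // Survivor space currently holding objects
    List<Obj> toSurvivor = new ArrayList<>();     // Survivor space kept empty between GCs
    final List<Obj> old = new ArrayList<>();

    ToyGenerationalHeap(int tenuringThreshold) {
        this.tenuringThreshold = tenuringThreshold;
    }

    /** Step 1: new objects always start in Eden. */
    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);
        return o;
    }

    /** Steps 2-5: copy live objects out of Eden and the occupied Survivor space. */
    void minorGc() {
        copyLive(eden);
        copyLive(fromSurvivor);
        eden.clear();           // everything left behind is garbage
        fromSurvivor.clear();
        List<Obj> tmp = fromSurvivor;   // the Survivor spaces switch roles
        fromSurvivor = toSurvivor;
        toSurvivor = tmp;
    }

    private void copyLive(List<Obj> space) {
        for (Obj o : space) {
            if (!o.reachable) {
                continue;       // unreferenced objects are simply not copied
            }
            o.age++;
            if (o.age >= tenuringThreshold) {
                old.add(o);     // promoted to the Old Generation
            } else {
                toSurvivor.add(o);
            }
        }
    }

    public static void main(String[] args) {
        ToyGenerationalHeap heap = new ToyGenerationalHeap(3);
        heap.allocate("cache");                 // stays reachable
        heap.allocate("temp").reachable = false; // becomes garbage right away
        for (int i = 1; i <= 3; i++) {
            heap.minorGc();
            System.out.printf("after minor GC %d: eden=%d survivor=%d old=%d%n",
                    i, heap.eden.size(), heap.fromSurvivor.size(), heap.old.size());
        }
    }
}
```

With a threshold of 3, the reachable object sits in a Survivor space after the first two collections and is promoted to the old list on the third, while the unreachable one disappears in the first collection.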

The steps above are a quick overview of GC in the Young Generation. The major GC process differs among GC types. Basically, there are 5 GC types.
1. Serial GC
2. Parallel GC
3. Parallel Compacting GC
4. CMS GC
5. G1 GC

You can switch between the 5 GC types with command-line options; for example, -XX:+UseG1GC selects G1 GC.
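One way to check which collector a JVM actually ended up with is to ask the standard GarbageCollectorMXBean API. The sketch below is my own (the class name ActiveCollectors is made up). On HotSpot the bean names typically hint at the collector, for example "Copy" and "MarkSweepCompact" for Serial GC or "G1 Young Generation" and "G1 Old Generation" for G1, but the exact names vary between JVM versions.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch: show the collectors registered in the current JVM and the pools they manage. */
final class ActiveCollectors {
    static List<String> describe() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName() + " -> " + Arrays.toString(gc.getMemoryPoolNames()));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Try running with different flags, e.g. java -XX:+UseG1GC ActiveCollectors.java
        describe().forEach(System.out::println);
    }
}
```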

Monitor Java Garbage Collection

There are several ways to monitor GC. I will list some of the most commonly used ones below.

jstat

jstat is in $JAVA_HOME/bin. You can run it with "jstat -gc <vmid> 1000". vmid is the virtual machine identifier, which is normally the process ID of the JVM. 1000 means the GC data is displayed every 1000 ms (1 second). The meaning of the output columns can be found here.
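If you are not sure what the process ID of your application is, the application can print it itself. This small sketch is my own (the class name JstatCommand is made up) and assumes Java 9 or later, because it relies on ProcessHandle.

```java
/** Sketch: build the jstat command line for the JVM this code is running in. */
final class JstatCommand {
    static String forCurrentJvm(int intervalMillis) {
        long pid = ProcessHandle.current().pid(); // requires Java 9+
        return "jstat -gc " + pid + " " + intervalMillis;
    }

    public static void main(String[] args) {
        // Copy the printed command into another terminal while the application is running.
        System.out.println(forCurrentJvm(1000));
    }
}
```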

VisualVM

VisualVM is a GUI tool provided by Oracle. It can be downloaded from here.

GarbageCollectorMXBean and GarbageCollectionNotificationInfo

GarbageCollectorMXBean and GarbageCollectionNotificationInfo can be used to collect GC data programmatically. An example can be found here in my GitHub. You can use "mvn jetty:run" to start a Jetty server and observe GC information like the output below.
Minor GC: - 61 (Allocation Failure) start: 2016-02-03 22:22:17.784, end: 2016-02-03 22:22:17.789
        [Eden Space] init:4416K; used:19.2%(13440K) -> 0.0%(0K); committed: 19.2%(13440K) -> 19.2%(13440K)
        [Code Cache] init:160K; used:14.7%(4823K) -> 14.7%(4823K); committed: 14.7%(4832K) -> 14.7%(4832K)
        [Survivor Space] init:512K; used:16.7%(1456K) -> 13.3%(1162K); committed: 19.1%(1664K) -> 19.1%(1664K)
        [Metaspace] init:0K; used:19393K -> 19393K); committed: 19840K -> 19840K)
        [Tenured Gen] init:10944K; used:18.6%(32621K) -> 19.2%(33563K); committed: 19.0%(33360K) -> 19.2%(33616K)
duration:5ms, throughput:99.9%, collection count:61, collection time:213

Major GC: - 6 (Allocation Failure) start: 2016-02-03 22:22:17.789, end: 2016-02-03 22:22:17.839
        [Eden Space] init:4416K; used:0.0%(0K) -> 0.0%(0K); committed: 19.2%(13440K) -> 19.2%(13440K)
        [Code Cache] init:160K; used:14.7%(4823K) -> 14.7%(4823K); committed: 14.7%(4832K) -> 14.7%(4832K)
        [Survivor Space] init:512K; used:13.3%(1162K) -> 0.0%(0K); committed: 19.1%(1664K) -> 19.1%(1664K)
        [Metaspace] init:0K; used:19393K -> 19393K); committed: 19840K -> 19840K)
        [Tenured Gen] init:10944K; used:19.2%(33563K) -> 14.0%(24559K); committed: 19.2%(33616K) -> 19.2%(33616K)
duration:50ms, throughput:99.6%, collection count:6, collection time:228

Alternatively, you can run the GCMonitor class as a Java application. It can take a long time to finish, because it keeps running until a major GC occurs.
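For readers who don't want to check out the project, here is a minimal sketch of the notification approach. It is not the GCMonitor class from my GitHub example; the class name GcNotificationSketch and its output format are made up for illustration. It assumes a HotSpot-based JDK, where the com.sun.management classes are available and the GC beans emit notifications.

```java
import com.sun.management.GarbageCollectionNotificationInfo;
import com.sun.management.GcInfo;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.util.Map;
import javax.management.Notification;
import javax.management.NotificationEmitter;
import javax.management.NotificationListener;
import javax.management.openmbean.CompositeData;

/** Sketch: print a short report every time the JVM finishes a garbage collection. */
final class GcNotificationSketch {
    /** Registers the listener on every GC bean and returns how many beans accepted it. */
    static int install() {
        NotificationListener listener = (Notification n, Object handback) -> {
            if (!GarbageCollectionNotificationInfo.GARBAGE_COLLECTION_NOTIFICATION.equals(n.getType())) {
                return;
            }
            GarbageCollectionNotificationInfo info =
                    GarbageCollectionNotificationInfo.from((CompositeData) n.getUserData());
            GcInfo gc = info.getGcInfo();
            System.out.printf("%s: %s (%s) duration:%dms%n",
                    info.getGcAction(), info.getGcName(), info.getGcCause(), gc.getDuration());
            Map<String, MemoryUsage> after = gc.getMemoryUsageAfterGc();
            for (Map.Entry<String, MemoryUsage> before : gc.getMemoryUsageBeforeGc().entrySet()) {
                MemoryUsage usageAfter = after.get(before.getKey());
                if (usageAfter != null) {
                    System.out.printf("        [%s] used:%dK -> %dK%n", before.getKey(),
                            before.getValue().getUsed() / 1024, usageAfter.getUsed() / 1024);
                }
            }
        };
        int registered = 0;
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            if (bean instanceof NotificationEmitter) {
                ((NotificationEmitter) bean).addNotificationListener(listener, null, null);
                registered++;
            }
        }
        return registered;
    }

    public static void main(String[] args) throws InterruptedException {
        install();
        for (int i = 0; i < 1000; i++) {
            byte[] garbage = new byte[1024 * 1024]; // short-lived allocations to fill Eden
        }
        System.gc();       // only a request, but it usually triggers a collection
        Thread.sleep(500); // notifications are delivered asynchronously
    }
}
```

The listener receives one notification per finished collection, so whether and how often it prints depends on the heap size and the collector in use.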

Reference:
[1] http://www.cubrid.org/blog/dev-platform/understanding-java-garbage-collection/
[2] http://www.oracle.com/webfolder/technetwork/tutorials/obe/java/gc01/index.html

Sunday, October 18, 2015

Jackson serialization of Map Polymorphism with Spring MVC

I came across a problem with polymorphic serialization of Map objects while re-engineering some legacy code. Spring MVC and Jackson are used in the RESTful API implementation. The problem is that I have a list of Map objects with different Map implementations, and I want to serialize and deserialize the list while preserving the actual type of each Map instance. For example, I have a list of maps as below. One map is a HashMap and the other is a Hashtable.


List<Map<String, String>> maps = new LinkedList<>();

Map<String, String> map1 = new HashMap<>();
Map<String, String> map2 = new Hashtable<>();

maps.add(map1);
maps.add(map2);

With Jackson's default settings, the type information of map1 and map2 is lost after serialization, and both become LinkedHashMap after deserialization. That makes sense from Jackson's point of view, because it doesn't know the actual types of map1 and map2 during deserialization. Jackson does provide a @JsonTypeInfo annotation to solve the polymorphism problem, but it only applies to the values of the map, not the map itself.


After several days of searching online, the best solution I have found so far is to customize the TypeResolverBuilder used by Jackson's ObjectMapper instance. However, it requires both the server and the client to set the customized TypeResolverBuilder on their ObjectMapper instances, which means that if your RESTful API is exposed to the public, you have to provide your clients with the customized ObjectMapper class. I know this is not ideal, so if you have a better solution, please let me know.

Now, the solution!

First, we write our own TypeResolverBuilder class. The important part is the useForType method, which returns true if the type is a map-like type (or java.lang.Object).


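import com.fasterxml.jackson.databind.DeserializationConfig;
import com.fasterxml.jackson.databind.JavaType;
import com.fasterxml.jackson.databind.SerializationConfig;
import com.fasterxml.jackson.databind.jsontype.NamedType;
import com.fasterxml.jackson.databind.jsontype.TypeDeserializer;
import com.fasterxml.jackson.databind.jsontype.TypeSerializer;
import com.fasterxml.jackson.databind.jsontype.impl.StdTypeResolverBuilder;

import java.util.Collection;

public class MapTypeIdResolverBuilder extends StdTypeResolverBuilder {

    public MapTypeIdResolverBuilder() {
    }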

    @Override
    public TypeDeserializer buildTypeDeserializer(DeserializationConfig config,
                                                  JavaType baseType, Collection<NamedType> subtypes) {
        return useForType(baseType) ? super.buildTypeDeserializer(config, baseType, subtypes) : null;
    }

    @Override
    public TypeSerializer buildTypeSerializer(SerializationConfig config,
                                              JavaType baseType, Collection<NamedType> subtypes) {
        return useForType(baseType) ? super.buildTypeSerializer(config, baseType, subtypes) : null;
    }

    /**
     * Method called to check if the default type handler should be
     * used for given type.
     * Note: "natural types" (String, Boolean, Integer, Double) will never
     * use typing; that is both due to them being concrete and final,
     * and since actual serializers and deserializers will also ignore any
     * attempts to enforce typing.
     */
    public boolean useForType(JavaType t) {
        return t.isMapLikeType() || t.isJavaLangObject();
    }
}


Then we need to set it on the ObjectMapper instance used by Jackson. We also have to call the init and inclusion methods; otherwise exceptions will be thrown at runtime. You are not required to use JsonTypeInfo.Id.CLASS and JsonTypeInfo.As.PROPERTY; you can use any of the options provided by the JsonTypeInfo annotation.

ObjectMapper objectMapper = new ObjectMapper();
MapTypeIdResolverBuilder mapResolverBuilder = new MapTypeIdResolverBuilder();
mapResolverBuilder.init(JsonTypeInfo.Id.CLASS, null);
mapResolverBuilder.inclusion(JsonTypeInfo.As.PROPERTY);
objectMapper.setDefaultTyping(mapResolverBuilder);


As I said earlier, both the client side and the server side of our RESTful API need to use an ObjectMapper configured as above for serialization and deserialization. Because I am using Spring MVC, I have to register the ObjectMapper instance with the MappingJackson2HttpMessageConverter used by Spring. If you are using a different framework with Jackson, it should hopefully provide a way to set a customized ObjectMapper instance.

I will use Java config instead of XML config in Spring. If you are using XML config, you can set the customized ObjectMapper instance as below, but calling the init and inclusion methods on the ObjectMapper bean would be a bit tricky.

<mvc:annotation-driven>
        <mvc:message-converters>
            <bean class="org.springframework.http.converter.json.MappingJackson2HttpMessageConverter">
                <property name="objectMapper" ref="customObjectMapper"/>
            </bean>
        </mvc:message-converters>
</mvc:annotation-driven>


The Java config I am using on the server side is shown below.

@Configuration
@EnableWebMvc
@ComponentScan("com.geekspearls.mvc.jackson.server")
public class AppConfig extends WebMvcConfigurerAdapter {

    @Override
    public void configureDefaultServletHandling(DefaultServletHandlerConfigurer configurer) {
        configurer.enable();
    }

    @Override
    public void configureMessageConverters(List<HttpMessageConverter<?>> converters) {
        converters.add(converter());
    }

    @Bean
    public MappingJackson2HttpMessageConverter converter() {
        MappingJackson2HttpMessageConverter converter = new MappingJackson2HttpMessageConverter();
        converter.setObjectMapper(objectMapper());
        return converter;
    }

    @Bean
    public ObjectMapper objectMapper() {
        ObjectMapper objectMapper = new ObjectMapper();
        MapTypeIdResolverBuilder mapResolverBuilder = new MapTypeIdResolverBuilder();
        mapResolverBuilder.init(JsonTypeInfo.Id.CLASS, null);
        mapResolverBuilder.inclusion(JsonTypeInfo.As.PROPERTY);
        objectMapper.setDefaultTyping(mapResolverBuilder);
        return objectMapper;
    }
}


The client-side class is shown below. I am using RestTemplate to call the RESTful service for simplicity.

public class ServiceConsumer {

    private static final String REST_ENDPOINT = "http://localhost:8080/rest/api";

    public InStock getInStock() {

        ObjectMapper objectMapper = new ObjectMapper();
        MapTypeIdResolverBuilder mapResolverBuilder = new MapTypeIdResolverBuilder();
        mapResolverBuilder.init(JsonTypeInfo.Id.CLASS, null);
        mapResolverBuilder.inclusion(JsonTypeInfo.As.PROPERTY);
        objectMapper.setDefaultTyping(mapResolverBuilder);

        List<HttpMessageConverter<?>> converters = new ArrayList<>();
        MappingJackson2HttpMessageConverter jackson2HttpMessageConverter = new MappingJackson2HttpMessageConverter();
        jackson2HttpMessageConverter.setObjectMapper(objectMapper);
        converters.add(jackson2HttpMessageConverter);
        RestOperations operations = new RestTemplate(converters);
        InStock s = operations.getForObject(REST_ENDPOINT + "/book/in_stock", InStock.class);
        return s;
    }
}


The complete code example can be found in my GitHub in the mvc.jackson package. The example can be run on a Jetty server with the 'mvn jetty:run' command. You will get the following JSON message when you open the URL 'http://localhost:8080/rest/api/book/in_stock' in the browser. As you can see, it contains the type information of the maps: `"@class": "java.util.Hashtable"` and `"@class": "java.util.HashMap"`.

{
  "store": "Los Angeles Store",
  "books": [
    {
      "@class": "com.geekspearls.mvc.jackson.server.model.ChildrenBook",
      "title": "Giraffes Can't Dance",
      "isbn": "1-84356-568-3",
      "properties": {
        "@class": "java.util.Hashtable",
        "Price": [
          "java.lang.Float",
          4.42
        ],
        "Type": "Board book",
        "Currency": "USD",
        "Pages": 10
      },
      "minAge": 3,
      "maxAge": 0
    },
    {
      "@class": "com.geekspearls.mvc.jackson.server.model.TextBook",
      "title": "Database Systems",
      "isbn": "1-84356-028-3",
      "properties": {
        "@class": "java.util.HashMap",
        "Pages": 560,
        "Type": "HardCover",
        "Price": [
          "java.lang.Float",
          146.16
        ],
        "Currency": "USD"
      },
      "subject": "Computer Science"
    }
  ]
}


If you run the RestTest unit test provided in the example, you will get the following result. The first properties map is a Hashtable and the second one is a HashMap.

Store ->Los Angeles Store
book@com.geekspearls.mvc.jackson.server.model.ChildrenBook
Title: Giraffes Can't Dance
ISBN: 1-84356-568-3
Properties@java.util.Hashtable
Price -> 4.42@java.lang.Float
Currency -> USD@java.lang.String
Type -> Board book@java.lang.String
Pages -> 10@java.lang.Integer
Min Age: 0
Max Age: 3
=======================================
book@com.geekspearls.mvc.jackson.server.model.TextBook
Title: Database Systems
ISBN: 1-84356-028-3
Properties@java.util.HashMap
Pages -> 560@java.lang.Integer
Type -> HardCover@java.lang.String
Price -> 146.16@java.lang.Float
Currency -> USD@java.lang.String
Subject: Computer Science
=======================================