Sunday, October 18, 2015

Jackson serialization of Map Polymorphism with Spring MVC

I come across a problem of serializing Map type objects polymorphism when I re-engineering a legacy code. Spring MVC and Jackson are used in the RESTful API implementation. The problem is I have a list of Map type objects and they are in different implementation of Map. I want to serialize and deserialize the list with the actual type of each Map instance. For example, I have a list of map as below. One map is a HashMap and the other map is Hashtable.


List<Map> maps = new LinkedList<>();

Map<String, String> map1 = new HashMap();
Map<String, String> map2 = new Hashtable();

maps.add(map1);
maps.add(map2);

With Jackson's default settings, the type information of map1 and map2 will be lost after serialization. And they both will be LinkedHashMap after deserialization which makes sense to Jackson because it doesn't know the actual type of map1 and map2 in deserilaization. Jackson does provide a @JsonTypeInfo annotation to resolve the polymorphism problem, but it only applies to the values of the map, not the map itself.


After several days search online, the best solution I found so far is to customize the TypeResolverBuilder class used by Jackson's ObjectMapper instance. However, it requires both server and client side to set the customized TypeResolverBuilder to the ObjectMapper instance, which means if your RESTful API is exposed to the public, you have to provide your clients with the customized ObjectMapper class. I know it is not ideal, so if you have a better solution, please let me know. 

Now, the solution!

Firstly, write our own TypeResolverBuilder class. The important part is the useForType method. We override the method to return true if the type is a map like type.


public class MapTypeIdResolverBuilder extends StdTypeResolverBuilder {

    public MapTypeIdResolverBuilder() {
    }

    @Override
    public TypeDeserializer buildTypeDeserializer(DeserializationConfig config,
                                                  JavaType baseType, Collection<NamedType> subtypes) {
        return useForType(baseType) ? super.buildTypeDeserializer(config, baseType, subtypes) : null;
    }

    @Override
    public TypeSerializer buildTypeSerializer(SerializationConfig config,
                                              JavaType baseType, Collection<namedtype> subtypes) {
        return useForType(baseType) ? super.buildTypeSerializer(config, baseType, subtypes) : null;
    }

    /**
     * Method called to check if the default type handler should be
     * used for given type.
     * Note: "natural types" (String, Boolean, Integer, Double) will never
     * use typing; that is both due to them being concrete and final,
     * and since actual serializers and deserializers will also ignore any
     * attempts to enforce typing.
     */
    public boolean useForType(JavaType t) {
        return t.isMapLikeType() || t.isJavaLangObject();
    }
}


Then, we need to set it to the ObjectMapper instance used by Jackson. We will also have to call init and inclusion methods, otherwise exceptions will be thrown at runtime. It is not required to use JsonTypeInfo.Id.CLASS and JsonTypeInfo.As.PROPERTY, you can use whatever you want provided by JsonTypeInfo annotation.

ObjectMapper objectMapper = new ObjectMapper();
MapTypeIdResolverBuilder mapResolverBuilder = new MapTypeIdResolverBuilder();
mapResolverBuilder.init(JsonTypeInfo.Id.CLASS, null);
mapResolverBuilder.inclusion(JsonTypeInfo.As.PROPERTY);
objectMapper.setDefaultTyping(mapResolverBuilder);


As I said earlier, both client side and server side of our RESTful API need to use the above ObjectMapper instance to do the serialization and deserialization. Because I am using Spring MVC. So I have to register the ObjectMapper instance to the MappingJackson2HttpMessageConverter used by Spring. If you are using a different framework with Jackson, it should provide a way to set the customized ObjectMapper instance, hopefully. 

I will use the Java config instead of XML config in Spring. If you are using XML config, you can set the customized ObjectMapper instance as below, but it would be a bit tricky of how to call the init and inclusion methods in the ObjectMapper bean.

<mvc:annotation-driven>
        <mvc:message-converters>
            <bean class="org.springframework.http.converter.json.MappingJackson2HttpMessageConverter">
                <property name="objectMapper" ref="customObjectMapper"/>
            </bean>
        </mvc:message-converters>
</mvc:annotation-drive>


The Java config I am using at server side is as below.

@Configuration
@EnableWebMvc
@ComponentScan("com.geekspearls.mvc.jackson.server")
public class AppConfig extends WebMvcConfigurerAdapter {

    @Override
    public void configureDefaultServletHandling(DefaultServletHandlerConfigurer configurer) {
        configurer.enable();
    }

    @Override
    public void configureMessageConverters(List<HttpMessageConverter<?>> converters) {
        converters.add(converter());
    }

    @Bean
    public MappingJackson2HttpMessageConverter converter() {
        MappingJackson2HttpMessageConverter converter = new MappingJackson2HttpMessageConverter();
        converter.setObjectMapper(objectMapper());
        return converter;
    }

    @Bean
    public ObjectMapper objectMapper() {
        ObjectMapper objectMapper = new ObjectMapper();
        MapTypeIdResolverBuilder mapResolverBuilder = new MapTypeIdResolverBuilder();
        mapResolverBuilder.init(JsonTypeInfo.Id.CLASS, null);
        mapResolverBuilder.inclusion(JsonTypeInfo.As.PROPERTY);
        objectMapper.setDefaultTyping(mapResolverBuilder);
        return objectMapper;
    }
}


Then the client side class is as below. I am using the RestTemplate to call the RESTful service for simplicity.

public class ServiceConsumer {

    private static final String REST_ENDPOINT = "http://localhost:8080/rest/api";

    public InStock getInStock() {

        ObjectMapper objectMapper = new ObjectMapper();
        MapTypeIdResolverBuilder mapResolverBuilder = new MapTypeIdResolverBuilder();
        mapResolverBuilder.init(JsonTypeInfo.Id.CLASS, null);
        mapResolverBuilder.inclusion(JsonTypeInfo.As.PROPERTY);
        objectMapper.setDefaultTyping(mapResolverBuilder);

        List<HttpMessageConverter<?>> converters = new ArrayList<>();
        MappingJackson2HttpMessageConverter jackson2HttpMessageConverter = new MappingJackson2HttpMessageConverter();
        jackson2HttpMessageConverter.setObjectMapper(objectMapper);
        converters.add(jackson2HttpMessageConverter);
        RestOperations operations = new RestTemplate(converters);
        InStock s = operations.getForObject(REST_ENDPOINT + "/book/in_stock", InStock.class);
        return s;
    }
}


The complete code example can be found in my GitHub in the mvc.jackson package. The example can be run in jetty server via 'mvn jetty:run' command. And you will get the following JSON message when hit the server with URL 'http://localhost:8080/rest/api/book/in_stock in the browser. As you can see, it contains the type information of the maps `"@class": "java.util.Hashtable"` and `"@class": "java.util.HashMap"`.

{
  "store": "Los Angeles Store",
  "books": [
    {
      "@class": "com.geekspearls.mvc.jackson.server.model.ChildrenBook",
      "title": "Giraffes Can't Dance",
      "isbn": "1-84356-568-3",
      "properties": {
        "@class": "java.util.Hashtable",
        "Price": [
          "java.lang.Float",
          4.42
        ],
        "Type": "Board book",
        "Currency": "USD",
        "Pages": 10
      },
      "minAge": 3,
      "maxAge": 0
    },
    {
      "@class": "com.geekspearls.mvc.jackson.server.model.TextBook",
      "title": "Database Systems",
      "isbn": "1-84356-028-3",
      "properties": {
        "@class": "java.util.HashMap",
        "Pages": 560,
        "Type": "HardCover",
        "Price": [
          "java.lang.Float",
          146.16
        ],
        "Currency": "USD"
      },
      "subject": "Computer Science"
    }
  ]
}


By running the RestTest unit test provided in the example, you will get the following result. The first properties map is in Hashtable type and the second one is in HashMap type.

Store ->Los Angeles Store
book@com.geekspearls.mvc.jackson.server.model.ChildrenBook
Title: Giraffes Can't Dance
ISBN: 1-84356-568-3
Properties@java.util.Hashtable
Price -> 4.42@java.lang.Float
Currency -> USD@java.lang.String
Type -> Board book@java.lang.String
Pages -> 10@java.lang.Integer
Min Age: 0
Max Age: 3
=======================================
book@com.geekspearls.mvc.jackson.server.model.TextBook
Title: Database Systems
ISBN: 1-84356-028-3
Properties@java.util.HashMap
Pages -> 560@java.lang.Integer
Type -> HardCover@java.lang.String
Price -> 146.16@java.lang.Float
Currency -> USD@java.lang.String
Subject: Computer Science
=======================================

Saturday, August 29, 2015

Dynamic programming - The Longest Common Subsequence Problem

The longest common subsequence problem is a simple problem that illustrates dynamic programming.

Longest Common Subsequence Problem

The problem is given two strings A and B of length m and n respectively, determine the length of the longest subsequence that is common in A and B. For example, A = xyxzx and B = zxyzy, then the longest common subsequence of A and B is xyz and the length is 3.

Brute-force Method

One way to solve the problem is to enumerate all the \(2^m\) subsequences of A, and for each subsequence check if it is also a subsequence in B. The time complexity of this method is \(\Theta(n2^m)\).

Dynamic Programming Method

Another way to solve this problem is using dynamic programming. It is easy to observe the formula below. \(a_i\) is the character at index i in A. \(b_j\) is the character at index j in B. L[i, j] is the length of the longest common subsequence in the substring of A[0..i] and the substring of B[0..j]. 
  • If \(a_i = b_j\), L[i, j] = L[i - 1, j - 1] + 1.
  • If \(a_i \neq b_j\), L[i, j] = max{L[i, j - 1], L[i - 1, j]}.
For example, A = xyxzx, B = zxyzy. Consider substring of A[0..2] = xyx and substring of B[0..2] = zxy. The L[2, 2] = 2 and the longest common subsequence is xy. Then consider A[0..3] = xyxz and B[0..3]=zxyz. Because \(a_3\) = z == \(b_3\) = z, it is clear that the L[3, 3] = 3 = L[2, 2] + 1 and the longest common subsequence is xyz. Then let's consider  A[0..3] = xyxz and B[0..4] = zxyzy. Because \(a_3\) = z \(\neq\) \(b_4\) = y, the longest common subsequence should still be xyz and the L[3, 4] = L[3, 3] = 3.

We can use an (m + 1) \(\times\) (n + 1) table to store the values of L[i, j]. And we fill the table row by row using the above observation. Then L[m, n] has the length of the longest common subsequence in A and B. The pseudo code is described in LCS.

Algorithm 6 LCS
Input: Two strings A and B of length m and n respectively.
Output: The length of the longest common subsequence of A and B.
  1. for i = 0 to m
  2.     for j = 0 to n
  3.         if (i == 0 and j == 0) then L[i, j] = 0
  4.         else if (\(a_i\) == \(b_j\)) then L[i, j] = L[i - 1, j - 1] + 1
  5.         else L[i, j] = max{L[i, j - 1], L[i - 1, j]}
  6.         end if
  7.     end for
  8. end for
  9. return L[m, n]
A JAVA implementation of this algorithm can be found on GitHub LCS.

An Example

Running LCS algorithm with input A = "abcdeabcd" and B = "acebde" will generate the table below. The result is 5 and the longest common subsequence is "acebd".

       a  c  e  b  d  e
  [0, 0, 0, 0, 0, 0, 0]
a[0, 1, 1, 1, 1, 1, 1]
b[0, 1, 1, 1, 2, 2, 2]
c[0, 1, 2, 2, 2, 2, 2]
d[0, 1, 2, 2, 2, 3, 3]
e[0, 1, 2, 3, 3, 3, 4]
a[0, 1, 2, 3, 3, 3, 4]
b[0, 1, 2, 3, 4, 4, 4]
c[0, 1, 2, 3, 4, 4, 4]
d[0, 1, 2, 3, 4, 5, 5]


Time Complexity

Filling each entry in the table requires \(\Theta(1)\) time. The size of the table is \(\Theta(mn)\), so the time complexity of LCS algorithm is \(\Theta(mn)\), which is the total time of filling the whole table.

Sunday, August 16, 2015

Spring + Quartz integration example (3) - inject Spring bean into Quartz job instance

In my work of integrating Quartz scheduler with Spring. There is a requirement to pass an object into the Quartz job instance. The object is a Spring bean. I found because the Quartz job instance is created by Quartz thread and it doesn't aware of Spring context, so I couldn't use the Spring bean in the job class directly, it is always null. After googling it a bit, I found three ways to do that. 1. Pass the object into JobDataMap, 2. get it from Spring application context and 3. use Spring to inject the object. Personally, I vote for the third way as the best practice.


1. Passing the object into JobDataMap

This is the easiest way, what we have to do is letting our class implement Serializeble interface. Then the object can be simply passed into the Quartz JobDataMap. However, there is a flaw in this method. If you use a persistent JobStore, the object in it will be serialized. And you will get a class-versioning error if your class definition is changed in future. Quartz recommends only pass primitive and String into JobDataMap. For details, see Quartz lesson 3.

Put object into JobDataMap

   <bean id="object" class="com.package.MyObject"/>  
   <bean class="org.springframework.scheduling.quartz.JobDetailFactoryBean">  
     <property name="jobClass" value="com.package.MyJob"/>  
     <property name="jobDataAsMap">  
       <map>  
         <entry key="myObject" value-ref="object"/>  
       </map>  
     </property>  
     <property name="durability" value="true"/>  
   </bean>

Retrieve object in Quartz job class

...
MyObject obj = jobExecutionContext.getMergedJobDataMap().getString("myObject");
...
>

2. Using Spring application context

In this way, we have to define an application context key in the scheduler bean.
<bean id="scheduler"  
      class="org.springframework.scheduling.quartz.SchedulerFactoryBean">  
     <property name="applicationContextSchedulerContextKey" value="applicationContext"/>  
     <property name="jobDetails">  
       <list>
          ...  
       </list>  
     </property>  
     <property name="triggers">  
       <list>
         ...  
       </list>  
     </property>  
   </bean>
Then in the job class, we can get the Spring application context as below and retrieve any bean via the bean name.
try {
            ApplicationContext context =
                    (ApplicationContext) jobExecutionContext.getScheduler().getContext().get("applicationContext");
            if (context != null) {
                InjectObject obj = (InjectObject) context.getBean("geeksPearls");
                if (obj != null) {
                    System.out.println("[" + new Date() + "] I am saying hello from a Quartz job, " + obj);
                }
            }
        } catch (SchedulerException e) {
            e.printStackTrace();
        }
The completed example can be found on my GitHub Example3. Have a look at InjectTask class and the Spring context XML.


3. Using Spring injection

We can also use Spring injection to inject any object into our job. However, we have to do a bit more work. I found this way in this blog. And I think it's elegant and the Spring guys should use it as the default way.

Firstly, extend the SpringBeanJobFactory class as below.
public class AutowiringSpringBeanJobFactory extends SpringBeanJobFactory implements ApplicationContextAware {

    private transient AutowireCapableBeanFactory beanFactory;

    public void setApplicationContext(final ApplicationContext applicationContext) throws BeansException {
        beanFactory = applicationContext.getAutowireCapableBeanFactory();
    }

    @Override
    protected Object createJobInstance(final TriggerFiredBundle bundle) throws Exception {
        final Object job = super.createJobInstance(bundle);
        beanFactory.autowireBean(job);
        return job;
    }
}
Secondly, override the default jobFactory in Spring's SchedulerFactoryBean.
<bean id="autoWireScheduler"  
      class="org.springframework.scheduling.quartz.SchedulerFactoryBean">  
     <property name="jobFactory" >  
       <bean class="com.geekspearls.quartz.example3.AutowiringSpringBeanJobFactory"/>  
     </property>  
     <property name="jobDetails">  
       <list>  
         <ref bean="injectJob2"/>  
       </list>  
     </property>  
     <property name="triggers">  
       <list>  
         <ref bean="simpleTrigger2"/>  
       </list>  
     </property>  
   </bean>
Then we can autowire any object into Quartz job class.
public class AnotherInjectTask extends QuartzJobBean {

    @Autowired
    private InjectObject injectObject;

    @Override
    protected void executeInternal(JobExecutionContext jobExecutionContext) throws JobExecutionException {
        System.out.println("[" + new Date() + "] I am saying another hello from a Quartz job, " +
                injectObject);
    }

    public void setInjectObject(InjectObject injectObject) {
        this.injectObject = injectObject;
    }
}
Be aware that in order to use the autowire annotation of Spring. Add either <context:annotation-config> or <context:componentscan> in the Spring context XML file. A completed example can be found on my GitHub Example3. Have a look at AnotherInjectTask and the Spring context XML.

In the example, if we run Example3 class as java application. We can see the following output in the console.
...
[Sun Aug 16 16:46:12 AEST 2015] I am saying hello from a Quartz job, GeeksPearls
[Sun Aug 16 16:46:12 AEST 2015] I am saying another hello from a Quartz job, null
[Sun Aug 16 16:46:12 AEST 2015] I am saying another hello from a Quartz job, GeeksPearls
...
The first line is output by the job using the Spring application context way. The second line is output by the job using the default Spring SchedulerFactoryBean. You can see the object is not injected so it outputs null. The third line is output by the job using our AutowiringSpringBeanJobFactory, you can see the object is injected successfully. Enjoy!

Sunday, August 9, 2015

Spring + Quartz integration example (2) - schedule jobs dynamically

Sometimes, we need schedule jobs dynamically. This requires us to use Quartz scheduler in a programming way rather than a static XML configuration way. I'll show in this post about how to do it with a simple example.


Environment

Java 8
Maven 3
Spring 3.2.3
Quartz 2.2.2
SFL4J 1.7.9 (to show Quartz start up information)


Scenario

We need schedule a report generation task for different departments in an organisation. The task is doing the same thing for different departments but needs to be run at different time, and it collects different data from database for different departments. Suppose sales department's report is to be generated at 7:00 am everyday, marketing department's report is to be generated at 8:00 am on Monday every week and engineering department's report is to be generated at 6:00 pm on the last Friday of every month.

It would be a little difficult to configure the tasks in Spring XML file, because it would be tricky to tell the XML file which department it is. Fortunately, Quartz allows us to use a programming way to schedule the task dynamically.


Implementation

The project configuration is as below.
   <dependencies>  
     <dependency>  
       <groupId>org.springframework</groupId>  
       <artifactId>spring-core</artifactId>  
       <version>${spring.version}</version>  
     </dependency>  
     <dependency>  
       <groupId>org.springframework</groupId>  
       <artifactId>spring-context-support</artifactId>  
       <version>${spring.version}</version>  
     </dependency>  
     <!-- Transaction dependency is required with Quartz integration -->  
     <dependency>  
       <groupId>org.springframework</groupId>  
       <artifactId>spring-tx</artifactId>  
       <version>${spring.version}</version>  
     </dependency>  
     <!-- Quartz framework -->  
     <dependency>  
       <groupId>org.quartz-scheduler</groupId>  
       <artifactId>quartz</artifactId>  
       <version>${quartz.version}</version>  
     </dependency>  
     <dependency>  
       <groupId>org.slf4j</groupId>  
       <artifactId>slf4j-api</artifactId>  
       <version>${slf4j.version}</version>  
     </dependency>  
     <dependency>  
       <groupId>org.slf4j</groupId>  
       <artifactId>slf4j-simple</artifactId>  
       <version>${slf4j.version}</version>  
     </dependency>  
   </dependencies>  
   <properties>  
     <spring.version>3.2.3.RELEASE</spring.version>  
     <quartz.version>2.2.1</quartz.version>  
     <slf4j.version>1.7.9</slf4j.version>  
   </properties>
To make it simple here, the task is just print out an information on the console. The task class is as below. the department name is retrieved from the JobDataMap and can be used in the report generation.
public class GenerateReportTask extends QuartzJobBean {

    public static final String DEPARTMENT = "DEPARTMENT";

    @Override
    protected void executeInternal(JobExecutionContext jobExecutionContext) throws JobExecutionException {
        JobDataMap dataMap = jobExecutionContext.getMergedJobDataMap();
        String department = dataMap.getString(DEPARTMENT);
        System.out.println("[" + new Date() + "] Generating report for department " + department);
    }
}
A QuartzScheduler class is used to schedule the task dynamically. The schedule CRON expressions are stored in a schedules map and the map is initialised in init() method. The init() method will be called when Spring initialise the QuartzScheduler bean.

public class QuartzScheduler {

    private Department[] departments = {Department.ENGINEER, Department.MARKETING, Department.SALES};
    private Map schedules = new HashMap();

    public void init() {
        schedules.put(Department.ENGINEER, "0 0 6 ? * 6L");
        schedules.put(Department.MARKETING, "0 0 8 ? * 2");
        schedules.put(Department.SALES, "0 0 7 * * ?");

        for (Department department : departments) {
            schedule(department);
        }
    }

    private enum Department {
        SALES, MARKETING, ENGINEER
    }
}
In the QuartzScheduler class, an instance of org.quartz.Scheduler class will be injected by Spring, it is the real Quartz scheduler to schedule the tasks. And the schedule() method is using Quartz's static method newJob() and newTrigger() to create the jobDetail and Trigger instances, then add the trigger to the scheduler. You might notice that we put the department name into the JobDataMap. As mentioned above, the department name will be used in the GenerateReportTask class.

public class QuartzScheduler {
    private Scheduler scheduler;    
    ...
    private void schedule(Department department) {
        JobDetail job = newJob(GenerateReportTask.class)
                .withIdentity(department.toString(), "reportgen")
                .build();
        job.getJobDataMap().put(GenerateReportTask.DEPARTMENT, department.toString());
        Trigger trigger = newTrigger()
                .withIdentity(department.toString(), "reportgen")
                .withSchedule(cronSchedule(schedules.get(department)))
                .forJob(job).build();
        try {
            scheduler.scheduleJob(job, trigger);
        } catch (SchedulerException e) {
            e.printStackTrace();
        }
    }
    ...
}
Finally, the Spring XML configuration file contains the bean definition of the QuartzScheduler class and the Spring SchedulerFactoryBean bean to create the real Quartz scheduler instance.
<beans xmlns="http://www.springframework.org/schema/beans"  
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"  
     xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-3.2.xsd">  
   <bean id="quartzScheduler"  
      class="com.geekspearls.quartz.example2.QuartzScheduler"  
       init-method="init">  
     <property name="scheduler" ref="scheduler"/>  
   </bean>  
   <bean id="scheduler"  
      class="org.springframework.scheduling.quartz.SchedulerFactoryBean"/>  
 </beans>
And the Java application class to test our code.
public class Example2 {

    public static void main(String[] args) {
        ApplicationContext context = new ClassPathXmlApplicationContext("applicationContext-quartz2.xml");
        System.out.println("Application started [" + new Date() + "] ");
    }
}
A complete code can be found on my GitHub here. In order to demo the example, I put the schedules a much shorter period as below.
        schedules.put(Department.ENGINEER, "10 * * * * ?");
        schedules.put(Department.MARKETING, "30 * * * * ?");
        schedules.put(Department.SALES, "50 * * * * ?");
Run the Example2 class as a Java application and you can see the tasks are running at scheduled time on the console. I hope you enjoy it.

Saturday, July 25, 2015

Spring 3.2.3 + Quartz 2.2.1 integration example (1) - configuration in Spring context XML

Recently, I am assigned a task of integrating Quartz into a Spring project. I find it's quite an interesting job and would like to share some of my findings.

Quartz is a very popular scheduling library and is used in many Java applications to run jobs. It is open source, simple to use and most importantly, it is free. Here is the Quartz website Quartz Scheduler.

In this post, I'd share how to integrate Quartz into Spring. And how to schedule some simple jobs using Spring's JobDetailFactory and MethodInvokingJobDetailFactory in a simple trigger and CRON trigger respectively.

Environment

Java 8
Maven 3
Spring 3.2.3
Quartz 2.2.2
SFL4J 1.7.9 (to show Quartz start up information)

Project configuration

Create a new Maven project and add the following dependencies to the pom file (Assume Java 8 and Maven 3 have already been installed).
   <dependencies>  
     <dependency>  
       <groupId>org.springframework</groupId>  
       <artifactId>spring-core</artifactId>  
       <version>${spring.version}</version>  
     </dependency>  
     <dependency>  
       <groupId>org.springframework</groupId>  
       <artifactId>spring-context-support</artifactId>  
       <version>${spring.version}</version>  
     </dependency>  
     <!-- Transaction dependency is required with Quartz integration -->  
     <dependency>  
       <groupId>org.springframework</groupId>  
       <artifactId>spring-tx</artifactId>  
       <version>${spring.version}</version>  
     </dependency>  
     <!-- Quartz framework -->  
     <dependency>  
       <groupId>org.quartz-scheduler</groupId>  
       <artifactId>quartz</artifactId>  
       <version>${quartz.version}</version>  
     </dependency>  
     <dependency>  
       <groupId>org.slf4j</groupId>  
       <artifactId>slf4j-api</artifactId>  
       <version>${slf4j.version}</version>  
     </dependency>  
     <dependency>  
       <groupId>org.slf4j</groupId>  
       <artifactId>slf4j-simple</artifactId>  
       <version>${slf4j.version}</version>  
     </dependency>  
   </dependencies>  
   <properties>  
     <spring.version>3.2.3.RELEASE</spring.version>  
     <quartz.version>2.2.1</quartz.version>  
     <slf4j.version>1.7.9</slf4j.version>  
   </properties>

Quartz job and trigger configuration

The first job class simply print a greeting message. I print the date time at the beginning of the greeting so we can know when the job is triggered by Quartz.

public class HelloWorldTask {

    public void sayHello() {
        System.out.println("[" + new Date() + "] Hello from Quartz! World");
    }
}
Then let's schedule the job into Quartz. Firstly, we need configure the JobDetail bean in Spring context file. We'll use the MethodInvokingJobDetailFacrotyBean, it allows to specify which class and which method in the class to run when the job is triggered. For details about MethodInvokingJobDetailFacrotyBean, please read the Spring docs.

   <bean id="helloWorldTask" class="com.geekspearls.quartz.task.HelloWorldTask"/>
   
   <bean id="helloWorldJob"  
      class = "org.springframework.scheduling.quartz.MethodInvokingJobDetailFactoryBean">  
     <property name="targetObject" ref="helloWorldTask"/>  
     <property name="targetMethod" value="sayHello"/>  
   </bean>

Secondly, we need to add our job to a trigger. We will use the SimpleTriggerFactoryBean to schedule our first job and set the job to be run every 5 seconds and delay 1 second to start. For details about SimpleTriggerFactoryBean, please read the Spring docs.

  <bean id="simpleTrigger"  
      class="org.springframework.scheduling.quartz.SimpleTriggerFactoryBean">  
     <property name="jobDetail" ref="helloWorldJob" />  
     <property name="repeatInterval" value="5000" />  
     <property name="startDelay" value="1000" />  
   </bean>  

Then let's create our second job. The second job class extends QuartzJobBean interface. When it is triggered, executeInternal method will be invoked. Inside the method, it takes the value of 'name' from JobDataMap which we set in the jobDetails configuration and print a greeting message. The date time will also be printed out so we can know when the job is triggered.

public class SayHelloTask extends QuartzJobBean {
    @Override
    protected void executeInternal(JobExecutionContext jobExecutionContext) throws JobExecutionException {
        String name = jobExecutionContext.getMergedJobDataMap().getString("name");
        System.out.println("[" + new Date() + "] Hello from Quartz! " + name);
    }
}

For our second job, we will use the JobDetailFactoryBean to configure it in Spring context file and pass in a value of 'name' into jobDataMap.
  <bean id="sayHelloJob"  
      class="org.springframework.scheduling.quartz.JobDetailFactoryBean">  
     <property name="jobClass" value="com.geekspearls.quartz.task.SayHelloTask"/>  
     <property name="jobDataAsMap">  
       <map>  
         <entry key="name" value="Geek's pearls reader"/>  
       </map>  
     </property>  
     <property name="durability" value="true"/>  
   </bean> 

Then we schedule this job with CronTriggerFactoryBean. It is scheduled to run every 10 seconds start from every 0 second. For details about CRON, please read Quartz CronTrigger Tutorial
  <bean id="cronTrigger"  
      class="org.springframework.scheduling.quartz.CronTriggerFactoryBean">  
     <property name="jobDetail" ref="sayHelloJob" />  
     <property name="cronExpression" value="0/10 * * * * ?"/>  
   </bean> 

Now, we have our jobs and triggers ready. Let's add them into Quartz scheduler.
   <bean class="org.springframework.scheduling.quartz.SchedulerFactoryBean">  
     <property name="jobDetails">  
       <list>  
         <ref bean="helloWorldJob" />  
         <ref bean="sayHelloJob" />  
       </list>  
     </property>  
     <property name="triggers">  
       <list>  
         <ref bean="simpleTrigger" />  
         <ref bean="cronTrigger" />  
       </list>  
     </property>  
   </bean>

Finally, everything is ready. The completed Spring context XML file is below.
   <beans xmlns="http://www.springframework.org/schema/beans"  
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"  
     xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-3.2.xsd">  
   <bean id="sayHelloJob"  
      class="org.springframework.scheduling.quartz.JobDetailFactoryBean">  
     <property name="jobClass" value="com.geekspearls.quartz.task.SayHelloTask"/>  
     <property name="jobDataAsMap">  
       <map>  
         <entry key="name" value="Geek's pearls reader"/>  
       </map>  
     </property>  
     <property name="durability" value="true"/>  
   </bean>  
   <bean id="helloWorldTask" class="com.geekspearls.quartz.task.HelloWorldTask"/>  
   <bean id="helloWorldJob"  
      class = "org.springframework.scheduling.quartz.MethodInvokingJobDetailFactoryBean">  
     <property name="targetObject" ref="helloWorldTask"/>  
     <property name="targetMethod" value="sayHello"/>  
   </bean>  
   <bean id="simpleTrigger"  
      class="org.springframework.scheduling.quartz.SimpleTriggerFactoryBean">  
     <property name="jobDetail" ref="helloWorldJob" />  
     <property name="repeatInterval" value="5000" />  
     <property name="startDelay" value="1000" />  
   </bean>  
   <bean id="cronTrigger"  
      class="org.springframework.scheduling.quartz.CronTriggerFactoryBean">  
     <property name="jobDetail" ref="sayHelloJob" />  
     <property name="cronExpression" value="0/10 * * * * ?"/>  
   </bean>  
   <bean class="org.springframework.scheduling.quartz.SchedulerFactoryBean">  
     <property name="jobDetails">  
       <list>  
         <ref bean="helloWorldJob" />  
         <ref bean="sayHelloJob" />  
       </list>  
     </property>  
     <property name="triggers">  
       <list>  
         <ref bean="simpleTrigger" />  
         <ref bean="cronTrigger" />  
       </list>  
     </property>  
   </bean>  
 </beans>

Now, let's test it! Run the following class as Java application.
public class HelloWorld {
    public static void main(String[] args) {
        ApplicationContext context = new ClassPathXmlApplicationContext("applicationContext.xml");
        System.out.println("Application started [" + new Date() + "]");
    }
}

In the log and console output, we can see the information below. Firstly, it tells us a Quartz scheduler with instanceId 'NON_CLUSTER' running locally with 10 threads. And it doesn't support persistence and cluster. I will share how to use clustered Quartz scheduler in my next post.
[main] INFO org.quartz.core.QuartzScheduler - Scheduler meta-data: Quartz Scheduler (v2.2.1) 'org.springframework.scheduling.quartz.SchedulerFactoryBean#0' with instanceId 'NON_CLUSTERED'
  Scheduler class: 'org.quartz.core.QuartzScheduler' - running locally.
  NOT STARTED.
  Currently in standby mode.
  Number of jobs executed: 0
  Using thread pool 'org.quartz.simpl.SimpleThreadPool' - with 10 threads.
  Using job-store 'org.quartz.simpl.RAMJobStore' - which does not support persistence. and is not clustered.

Then we can see the application is started at 17:14:23. 1 second later, our first job is triggered and repeated every 5 seconds. At 17:14:30, our second job is triggered and repeated every 10 seconds. That's what we expected! The completed example can be found on my GitHub here. I hope you will enjoy it!
Application started [Sat Jul 25 17:14:23 AEST 2015]
[Sat Jul 25 17:14:24 AEST 2015] Hello from Quartz! World
[Sat Jul 25 17:14:29 AEST 2015] Hello from Quartz! World
[Sat Jul 25 17:14:30 AEST 2015] Hello from Quartz! Geek's pearls reader
[Sat Jul 25 17:14:34 AEST 2015] Hello from Quartz! World
[Sat Jul 25 17:14:39 AEST 2015] Hello from Quartz! World
[Sat Jul 25 17:14:40 AEST 2015] Hello from Quartz! Geek's pearls reader
[Sat Jul 25 17:14:44 AEST 2015] Hello from Quartz! World
[Sat Jul 25 17:14:49 AEST 2015] Hello from Quartz! World
[Sat Jul 25 17:14:50 AEST 2015] Hello from Quartz! Geek's pearls reader
[Sat Jul 25 17:14:54 AEST 2015] Hello from Quartz! World
[Sat Jul 25 17:14:59 AEST 2015] Hello from Quartz! World
[Sat Jul 25 17:15:00 AEST 2015] Hello from Quartz! Geek's pearls reader
[Sat Jul 25 17:15:04 AEST 2015] Hello from Quartz! World
[Sat Jul 25 17:15:09 AEST 2015] Hello from Quartz! World
[Sat Jul 25 17:15:10 AEST 2015] Hello from Quartz! Geek's pearls reader
[Sat Jul 25 17:15:14 AEST 2015] Hello from Quartz! World
[Sat Jul 25 17:15:19 AEST 2015] Hello from Quartz! World
[Sat Jul 25 17:15:20 AEST 2015] Hello from Quartz! Geek's pearls reader
[Sat Jul 25 17:15:24 AEST 2015] Hello from Quartz! World
[Sat Jul 25 17:15:29 AEST 2015] Hello from Quartz! World
[Sat Jul 25 17:15:30 AEST 2015] Hello from Quartz! Geek's pearls reader
[Sat Jul 25 17:15:34 AEST 2015] Hello from Quartz! World
[Sat Jul 25 17:15:39 AEST 2015] Hello from Quartz! World
[Sat Jul 25 17:15:40 AEST 2015] Hello from Quartz! Geek's pearls reader
...

Monday, June 29, 2015

Git with Eclipse (4) - pushing local repositories to remote repositories

This post describes how to create a remote repository at GitHub and how to push a local repository to remote repository at GitHub using EGit.


Create a repository at GitHub

Firstly, we need a remote repository to start. I will use GitHub to host our remote repository. Sign up from here to get an account at GitHub. After sign in, click the  icon and select New repository.
In the next screen, give a name "HelloWorld" to our repository. You can also write a brief description about the repository, so other people will know what is your repository about. Choose Public as the repository type, so it can be accessed by anyone with restriction on modification. Private repository is available to paid users. If you tick Initialize this repository with a README, GitHub will create a read me file in the repository. We leave it as it is so we will have a clean repository. Then click the Create repository button and our HelloWorld repository is created.

In the next screen, you can see the HelloWorld repository is empty and GitHub provides several suggestions to start with. We will leave it for now and go back to HelloWold project in Eclipse.

SSH configuration

If you are going to use SSH protocol instead of HTTP for read and write access to the remote repository at GitHub, you have to configure the SSH key in Eclipse and GitHub.

Eclipse SSH configuration

Open Eclipse Preferences via Windows -> Preferences. Then go to General -> Network Connections -> SSH2. Select Key Management tab and click the Generate RSA Key... button. Eclipse will generate a public key (Copy the public key to a text file and it will be uploaded to GitHub later). Then enter a pass phrase in Passphrase: and Confirm passphrase: fields, it will be used to protect the private key. Click the Save Private Key... button to save the private key in your computer (usually it is saved in C:\Users\<username>\.ssh). Then go to General tab and enter the path of the private key (usually is C:\Users\<username>\.ssh) in SSH2 home: field. Click Apply button to save the configuration.

GitHub SSH configuration

Log in to GitHub. Then click the icon on the top right of the screen and select Settings in the menu. Choose SSH Keys in the navigator on the left. Then click Add SSH Key button and paste the Eclipse generated public key into the Key text box. Title field is optional.

Click Add key button and you will be asked to confirm your password. After the password is confirmed, the public key is added to GitHub. Now you are able to use SSH protocol to access your repositories at GitHub.

Push the local repository to remote repository

Firstly, we need a local repository to push. Create the HelloWorld project in Eclipse and share it in a local Git repository. Please refer to Git with Eclipse (3) - basic use cases in EGit with local repository for how to share project in local repository.

Secondly, because we created an empty repository at GitHub in previous step, we need to create a branch in the remote repository. Open Git perspective in Eclipse by selecting Windows -> Open Perspective -> Other....in the menu, then choose Git in the opened dialog.

In the Git perspective, expand HelloWorld repository, right click on Remotes and select Create Remote... In the menu.

Give a remote name in the opened dialog (Usually is origin by default) and select Configure push, then click OK.



The Configure Push dialog asks for the URI of the remote repository. we can copy the URI from HelloWorld repository at GitHub (we will use SSH URI in our example). Log in to GitHub and open HelloWorld repository. In the top section of the page, click on SSH and click theicon to copy the SSH URI.

Then go back to Eclipse. Click the Change... button in Configure Push dialog and paste the SSH URI into the URI: field in the opened Destination Git Repository dialog. Select ssh in the Protocol: field. Authentication Users: is git. Leave the rest as what it is and click Finish. You will be asked for the pass phrase. Enter the pass phrase you have set up in the SSH configuration and click OK.

Then click Save and Push button in Configure Push dialog.


A confirmation window will be shown once the repository is pushed to GitHub successfully.

Click OK to close the window. Log in to GitHub and open HelloWorld repository. You will see the HelloWorld project is now available at GitHub.

Now the HelloWorld project is ready to be forked or collaborated by other contributors. Also, you can fetch any updates in the HelloWorld project from GitHub and push your changes in HelloWorld project to GitHub. For how to clone a remote repository and fetch/push changes. Please read Git with Eclipse (5) - work with remote repositories (coming soon...).


Reference:




Friday, June 19, 2015

Sorting algorithm - Heap Sort

Recall that selection sort algorithm which sorts by finding the minimum element from the unsorted part and put it at the front (Selection Sort). Similarly, it will work if we find the maximum element from the unsorted part and put it at the end. However, the time complexity of selection sort is \(\Theta(n^2)\). Interestingly, the time complexity can be improved substantially if a right data structure is used. Binary heap is a such data structure.


Binary heap

Binary heap is an almost-complete binary tree with each node satisfying the heap property: If v and p(v) are a node and its parent, then the key of the item stored in p(v) is not less (or greater) than the key of the item stored in v. The former is called as max heap and the latter is called as min heap.

A binary heap T with n nodes can be represented by an array H[1..n] as following:

  • The root of T is stored in H[1].
  • The left child of a node H[i] is stored in H[2j] and the right child is stored in H[2j+1].
  • The parent of a node H[i] is stored in \(H[\lfloor \frac j 2 \rfloor]\).
Below is a max binary heap and its array representation.


Operations on heap

A heap data structure normally supports the following operations:

  • peek(): Retrieve, but don't remove the root element of the heap.
  • poll(): Retrieve and remove the root element of the heap.
  • insert(e): Insert an element into the heap.
  • delete(i): Delete the ith element from the heap.
  • makeHeap(A): Transform array A into a heap.
And there are two basic operations that are used to implement the above operations:
  • shiftUp(i): Move the ith element up in the heap if the key of the element is greater than the key of its parent.
  • shiftDown(i): Move the ith element down in the heap if the key of the element is less than the key of its children.
note: The above operations are for max heap. It is easy to modify them to be used for min heap.

To implement heap sort algorithm, only shiftDown(i) and makeHeap(A) operations are needed. The other operations will not be described here. The pseudo code of the two operations are described in ShiftDown and MakeHeap. (Note: for simplicity, instead of 0 to n-1, we use 1 to n as the indices of the heap)

Algorithm 5.1 ShiftDown
Input: An array A[1..n] and an index i between 1 and n.
Output: A[i] is moved down so it is not smaller than its children.
  1. done = false
  2. if 2i > n then exit {comment: node i has no children}
  3. repeat
  4.     i = 2i
  5.     if \({i+1} \le n\) and key(A[i+1]) > key(A[i]) then
  6.         i = i + 1 {comment: use right child if its key is greater than the left child}
  7.     end if
  8.     if key(\(A[\lfloor \frac i 2 \rfloor]\)) < key(A[i]) then
  9.         swap \(A[\lfloor \frac i 2 \rfloor]\) and A[i]
  10.     else done = true
  11.     end if
  12. until 2i > n or done
Algorithm 5.2 MakeHeap
Input: An array A[1..n] of n elements
Output: A[1..n] is transformed into a heap

  1. for i = \(\lfloor \frac n 2 \rfloor\) to 1
  2.     ShiftDown(A, i)
  3. end for

Heap Sort

As in the heap data structure, the maximum (or minimum) element is the root of the heap. If we choose to use the min heap, poll() operation will give us the root of the heap which is the minimum element. So given an array A[1..n], we can sort it in non-decreasing order as follows. First, transform the array into a min heap. Next, for i = 1 to n, call the poll() operation on the min heap and store the element in A[i]. However, we need extra space to store the min heap in this approach. As stated earlier, we can use ShiftDown and MakeHeap to sort the array in place. First, we transform the array A[1..n] into a max heap, then the maximum element is in A[1]. Next, swap A[1] and A[n] so that A[n] is the maximum element in the array. Now, A[1] may be smaller than its children. Therefore, call ShiftDown(A[1..n-1], 1) to transform A[1..n-1] to a max heap. Next, swap A[1] and A[n-1] and transform A[1..n-2] into a max heap. Repeat the procedure until the heap size becomes to 1 which A[1] is the minimum element. The pseudo code is described in HeapSort.

Algorithm 5 HeapSort
Input: An array A[1..n] of n elements.
Output: A[1..n] sorted in non-decreasing order.

  1. MakeHeap(A)
  2. for i = n to 1
  3.     swap A[1] and A[i]
  4.     ShiftDown(A[1..i-1], 1)
  5. end for
A JAVA implementation of this algorithm can be found on GitHub HeapSort.

Time complexity

Let T be the heap of the array A[1..n]. Because T is an almost-complete binary tree, the height of the tree \(h = \lfloor logn \rfloor\). It is easy to observe that the maximum number of comparison performed by ShiftDown happens when move the root element down to the leaf level, so the number of comparison is at most 2(h-1) = 2logn - 2. So the time complexity of ShiftDown is O(logn).

If we move down the ith element in heap A[1..n], the time complexity of ShiftDown is analysed as following. The level of ith element in the tree \(l = \lfloor logi \rfloor\). It is easy to observe that the number of comparison performed by ShiftDown is at most 2(h - l). Since there are \(2^l\) nodes on level l, \(0 \le l \lt h\), the total number of comparison performed by MakeHeap is 
$$\begin{align} \sum\limits_{l=0}^{h-1}(h-l){2^l} & =2^0(h) + 2^1(h-1) + 2^2(h-2) + \cdots + 2^{k-1}(1) \\ & = 1(2^{h-1}) + 2(2^{h-2}) + \cdots + h(2^{h-h}) \\ & = \sum\limits_{i=1}^{h}i2^{h-i} \\ & =2^h\sum\limits_{i=1}^{h} \frac i {2^i} \\ & \le n\sum\limits_{i=1}^{h} \frac i {2^i} \\ & \lt 2n \\ \end{align}$$
The last second equality is because \(2^h \le n\) in an almost-complete binary tree. The last equality follows the geometric series
$$\sum\limits_{i=0}^n c^i = \frac {c^{n+1}-1} {c-1}$$
Differentiating both side of above equation and multiplying by c yields
$$\sum\limits_{i=0}^n ic^i = \frac {nc^{n+2}-nc^{n+1}-c^{n+1}+c} {(c-1)^2}$$ 
Let c = 1/2 yieds
$$\sum\limits_{i=0}^n \frac i {2^i} = 2 - \frac {n+2}{2^n} \lt 2$$
Thus, the time complexity of MakeHeap is O(n).

Based on the above, the time complexity of HeapSort algorithm is calculated as following. The number of comparison performed by MakeHeap at step 1 is \(\le 2n\). The number of comparison performed by ShiftDown at step 4 is \(\lt (n-1)logn\). Thus, the time complexity of HeapSort is
$$C(n) \lt 2n + (n-1)logn = O(nlogn)$$


Reference:
Algorithm Design Techniques and Analysis by M.H. Alsuwaiyel

Sunday, June 14, 2015

Sorting algorithm - Merge Sort and Quick Sort

In this post, I will introduce two divide-and-conquer sorting algorithms - Merge Sort and Quick Sort.


Merge Sort
The merge sort algorithm divides input array in two halves and sort the two sub arrays respectively, then merge them to get the desired sorted array. As to the sorting method used for each half, merge sort algorithm use itself recursively.

Let A[0..n] be an array of n elements. The merge sort algorithm divides A to A[0..\(\frac n 2\)] and A[\(\frac n 2+1\)..n]. And it performs itself on the two sub arrays to make them sorted. Then it merges two sub arrays to get A[0..n] sorted. The diagram from Wikipedia illustrates the process of merge sort algorithm.


The pseudo code is described in MergeSort.

Algorithm 3 MergeSort
Input: An array A[0..n] of n elements.
Output: A[0..n] sorted in non-decreasing order.

  1. mergesort(A, 0, n)
Function mergesort(A, low, high)


  1. if low < high
  2.     mid = (low + high)/2
  3.     mergesort(A, low, mid)
  4.     mergesort(A, mid+1, high)
  5.     merge(A, low, mid, high)
  6. end if
Function merge(A, p, q, r)
Input: An array A[0..n] and three indices p, q and r with \(0 \le p \le q \lt r \le n\). Both sub arrays A[p..q] and A[q+1..r] are sorted in non-decreasing order.
Output: A[p..r] contains the elements from A[p..q] and A[q+1..r] sorted in non-decreasing order.
  1. B[p..r] is an auxiliary array
  2. i = p, j = q+1, k = p
  3. while (\(i \le q\)) and (\(j \le r\))
  4.     if A[i] < A[j] then
  5.         B[k] = A[i]
  6.         i = i + 1
  7.     else
  8.         B[k] = A[j]
  9.         j = j + 1
  10.     end if
  11.     k = k + 1
  12. end while
  13. if \(i \le q\) then
  14.     B[k..r] = A[i..q]
  15. end if
  16. if \(j \le r\) then
  17.     B[k..r] = A[j..r]
  18. end if
  19. A[p..r] = B[p..r]

A JAVA implementation of this algorithm can be found on GitHub Merge Sort.

Time complexity
The basic operation of merge sort algorithm is element comparison. Let C(n) be the number of element comparisons. For simplicity, assume n is a power of 2. If n = 1, C(n) = 0 because the algorithm does not perform any element comparison. If n > 1, in function mergesort(A, low, high), step 3 and step 4 perform C(\(\frac n 2\)) element comparisons. And step 5 calls function merge(A, p, q, r). It is easy to see the number of element comparison performed by function merge is between \(\frac n 2\) and n - 1. Thus, the minimum number of comparison is 

$$C(n) = \begin{cases}0 & n=1 \\ 2C(\frac n 2) + \frac n 2 & n \ge 2 \end{cases}$$

Proceed to solve the recurrence by expansion as follows. Note: we assume \(n=2^k\).

$$\begin{align}C(n) & = 2C(\frac n 2) + \frac n 2 \\ & = 2^2C(\frac n {2^2}) + 2\frac n 2 \\ & = ... \\ & = 2^kC(\frac n {2^k}) + k\frac n 2 \\ & = 2^kC(1) + k\frac n 2 \\ & = \frac {nlogn} 2\end{align}$$

The maximum number of comparison is 

$$C(n) = \begin{cases}0 & n=1 \\ 2C(\frac n 2) + n - 1 & n \ge 2 \end{cases}$$

Proceed to solve the recurrence by expansion as follows. Note: we assume \(n=2^k\).

$$\begin{align}C(n) & = 2C(\frac n 2) + n - 1 \\ & = 2^2C(\frac n {2^2}) + 2n - 2 - 1 \\ & = ... \\ & = 2^kC(\frac n {2^k}) + kn - 2^k-1 - 2^k-2 - ... - 2 - 1 \\ & = 2^kC(1) + kn - \sum\limits_{i=0}^{k-1} 2^i \\ & = kn - 2^k + 1 \\ & = nlogn - n + 1\end{align}$$


According to the analysis above, The time complexity of merge sort algorithm is \(C(n)=\Omega(nlogn) = O(nlogn) = \Theta(nlogn)\)

Quick Sort
The quick sort algorithm is another divide-and-conquer sorting algorithm. It divides the input array by a chosen pivot, so that all elements smaller than the pivot are at the left of it and all greater elements are at the right. This procedure is called partitioning or splitting. An important observation is that after splitting the array by the pivot, the pivot is in its correct position. Recursively perform the same splitting procedure on each side of the pivot until all elements are in their correct position, the array therefore is sorted. There are many different ways to choose a pivot in the array, here I will pick the first element as the pivot. However, it leads to quadratic running time in the worst case by always picking the first element as the pivot. This will be discussed later in the time complexity analysis of quick sort algorithm. The animation below from Wikipedia illustrates the idea of quick sort algorithm.



The pseudo code is described in QuickSort.

Algorithm 4 QuickSort
Input: An array A[0..n] of n elements.
Output: A[0..n] sorted in non-decreasing order.

  1. quicksort(A, 0, n)
Function quicksort(A, low, high)
  1. if low < high then
  2.     w = split(A, low, high)
  3.     quicksort(A, low, w-1)
  4.     quicksort(A, w+1, high)
  5. end if
Function split(A, low, high)
Input: An array A[0..n] and two indices low and high with \(0 \le low < high \le n\).
Output: An index w with \(low \le w \le high\). And w is the correct position of A[low] in A[low..high].
  1. i = low
  2. x = A[low]
  3. for j = i + 1 to high
  4.     if A[j] < x then
  5.         i = i + 1
  6.         if \(i \ne j\) then swap A[i] and A[j] comment: move the smaller element to the left
  7.     end if
  8. end for
  9. swap A[low] and A[i] comment: i is the correct position of A[low]
  10. return i

A JAVA implementation of this algorithm can be found on GitHub Quick Sort.

Time complexity

The average case scenario
The time complexity of quick sort algorithm depends on the relative order of the elements in the input array. To analyse the average behavior of the algorithm, assume the distribution of each element in the input array is equally likely. Let C(n) be the number of elements comparisons performed by the quick sort algorithm. It is easy to see the number of elements comparison performed by split in step 2 is high - low - 1, which is n - 1 for high = n, low = 0. The recursive call in steps 3 and 4 cost C(w-1) and C(n-w). Based on the assumption of each element is equally likely to be the first element of the input array and thus chosen as the pivot, the probability that any element be the pivot is 1/n. Hence, the total number of comparison is
$$C(n) = (n-1) + \frac {1} {n} \sum\limits_{w=1}^n(C(w-1) + C(n-w))$$
Since
$$\sum\limits_{w=1}^nC(n-w)=C(n-1)+C(n-2)+...+C(0)=\sum\limits_{w=1}^nC(w-1)$$
Thus
$$C(n)=(n-1) + \frac {2} {n} \sum\limits_{w=1}^nC(w-1)$$
Multiply the equation by n
$$nC(n)=n(n-1) + 2\sum\limits_{w=1}^{n}C(w-1)$$
Replace n by n-1
$$(n-1)C(n-1)=(n-1)(n-2) + 2\sum\limits_{w=1}^{n-1}C(w-1)$$
Subtracting the above two equations
$$nC(n) - (n-1)C(n-1)=2(n-1) + 2C(n-1)$$
Rearrange  the equation yields
$$\frac {C(n)} {n+1} = \frac {C(n-1)} {n} + \frac {2(n+1)} {n(n+1)}$$
Let \(D(n) = \frac {C(n)} {n+1}\)
$$D(n) = D(n-1) + \frac {2(n+1)} {n(n+1)}$$
Resolve the above equation
$$D(n) = 2\sum\limits_{i=1}^n\frac{i-1}{i(i+1)}=2\sum\limits_{i=1}^n\frac 1 j - \frac {4n} {n+1}=2lnn - \Theta(1)=\frac 2 {log e}log n-\Theta(1) \approx 1.44log n$$
Consequently,
$$C(n) = (n+1)D(n) \approx 1.44nlogn = \Theta(nlogn)$$

The worst case scenario
Suppose that in every call to quicksort, the pivot which is A[low], is the smallest element in the array. It results the w returned by split is equal to low. Then the next recursive calls will be quicksort(A, 0, -1) and quicksort(A, 1, n), and quicksort(A, 0, -1) will do nothing. Thus, if the input array is already sorted in non-decreasing order, the call sequence to quicksort is below:

quicksort(A, 0, n), quicksort(A, 1, n), ... , quicksort(A, n-1, n), quicksort(A, n, n)

These result in calls to split as following:

split(A, 0, n), split(A, 1, n), ... , split(A, n-1, n), split(A, n, n)

The number of elements comparison performed by split is high - low - 1. Thus, the total number of comparison done by quick sort algorithm in the worst case is
$$(n-1) + (n-2) + ... + 1 + 0 = \frac {n(n-1)} 2 = \Theta(n^2)$$
The scenario above however is not the only case that results in quadratic running time. If the algorithm always picks the smallest k elements with k is sufficiently small relative to n, the running time of quick sort algorithm is also quadratic. This can be improved to \(\Theta(nlogn)\) by always picking the median element as the pivot which splits the input array in balance. But the time complexity of the median finding algorithm needs to be considered carefully. Another approach to improve the quick sort algorithm is to choose the pivot randomly which will result in a very efficient algorithm. It will be discussed in my future post.


Reference:
Algorithm Design Techniques and Analysis by M.H. Alsuwaiyel