Sunday, October 28, 2012

A simple way to set up a Java application with an external configuration file

Many Java applications are deployed and run with some kind of external configuration file. It's typical to want one set of config files per environment, such as DEV, QA and PROD. There are many options for tackling this problem, especially in a Java app, but keeping it simple and easy to maintain takes some discipline.
Here I lay out a simple way to deploy a typical Java application. The concept is simple, and you can easily apply it to a standalone, web, or even a JEE application.

Use an env System Property per environment

Java allows you to invoke any program with extra System Properties added. When launching a Java application, you should set an env property. For example:
bash> java -Denv=prod myapp.Main
This property lets your Main application identify which environment it is running against. You should read it like this inside your code:
String env = System.getProperty("env", "dev");
This way you always have an environment to work with. Even if the user doesn't supply one, it defaults to the dev env.
Another useful System Property to set is app.home. You want to set this value relative to where your application is deployed, so you can reference any files (e.g. data) easily. To do this, you can use a wrapper script to calculate the path automatically. See the section below for an example.
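
Here is a minimal sketch of reading both properties in one place. The class name AppEnvironment, the list of allowed environment names, and the fallback of app.home to the working directory are assumptions made for illustration; they are not part of any library.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Properties;

// Hypothetical helper: resolves the "env" and "app.home" settings from a Properties object.
class AppEnvironment {

    // Environment names used in this post; adjust to your own set (assumption).
    static final List<String> KNOWN_ENVS = Arrays.asList("dev", "qa", "prod");

    // Returns the normalized env name, defaulting to "dev" when none is supplied.
    public static String resolveEnv(Properties props) {
        String env = props.getProperty("env", "dev").trim().toLowerCase(Locale.ROOT);
        if (!KNOWN_ENVS.contains(env)) {
            throw new IllegalArgumentException("Unknown env '" + env + "', expected one of " + KNOWN_ENVS);
        }
        return env;
    }

    // Returns app.home, falling back to the working directory (an assumption of this sketch).
    public static String resolveAppHome(Properties props) {
        return props.getProperty("app.home", props.getProperty("user.dir", "."));
    }

    public static void main(String[] args) {
        Properties sys = System.getProperties();
        System.out.println("env=" + resolveEnv(sys) + ", app.home=" + resolveAppHome(sys));
    }
}
```

Failing fast on an unknown env name is a design choice: a typo such as -Denv=prd stops the application at startup instead of silently loading the wrong settings.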

Use a config dir prefix in CLASSPATH

Instead of passing an explicit config file to your application as an argument, another flexible way to load a configuration file is to add an extra config folder to your CLASSPATH. For example, you can easily create a startup wrapper script myapp.sh like this:
#!/usr/bin/env bash
# Resolve the application home as the parent of the directory holding this script.
APP_HOME=$(cd "$(dirname "$0")/.." && pwd)
# The config folder comes first in the classpath, ahead of the jars in lib.
java $JAVA_OPTS -Dapp.home="$APP_HOME" -cp "$APP_HOME/config:$APP_HOME/lib/*" myapp.Main "$@"
From this, you can set up the application packaging layout this way:
myapp
    +- bin
        +- myapp.sh
    +- config
        +- dev.properties
        +- qa.properties
        +- prod.properties
    +- data
        +- myrecords.data
    +- lib
        +- myapp-1.0.0.jar
        +- slf4j-1.7.1.jar
You would typically invoke the application like this:

bash> JAVA_OPTS='-Denv=prod' myapp/bin/myapp.sh

The above gives you a good foundation for loading a single config properties file per env. For example, you can read your properties file like this somewhere in your code.

// Needs: java.io.FileNotFoundException, java.io.InputStream, java.util.Properties
// Get appHome and data dir.
String appHome = System.getProperty("app.home");
String dataDir = appHome + "/data";

// Get env value to load config parameters
String env = System.getProperty("env", "dev");
String config = env + ".properties";
Properties configProps = new Properties();
InputStream inStream = null;
try {
    // Looks up e.g. prod.properties in the config folder on the CLASSPATH.
    inStream = getClass().getClassLoader().getResourceAsStream(config);
    // getResourceAsStream returns null when the resource is missing.
    if (inStream == null)
        throw new FileNotFoundException("Config not found in CLASSPATH: " + config);
    configProps.load(inStream);
} finally {
    if (inStream != null)
        inStream.close();
}
// Now load any config parameters from configProps map.
Now you have the configProps object at your disposal to read any configuration keys and values set per environment.
NOTE: If you want a more flexible Java wrapper script, see my old post on run-java wrapper.

Do not abuse CLASSPATH

Now, just because you have set up config as part of your CLASSPATH, I have to caution you not to abuse it. What I mean is: do not go wild and load all your application resources from that folder! If you have that many resources that the user MUST edit and configure, then you should rethink your application design! A simple interface, or configuration in this case, is always a win. Do not bother users with complexity just because your application can support a gazillion configuration combinations. If you can keep it to one config file, it will make users very happy.
Also, this doesn't mean you have to put the entire world inside prod.properties either. In the real world, an application is likely to have only a handful of user-tunable parameters, and many other resources are used less frequently. I would recommend putting only the most frequently used parameters in these config properties. Everything else should go inside your myapp.jar. For example, most of the Spring context files in an application do not belong at a typical user's config level; they are developer-level config files. In other words, changing these files could have a catastrophic effect on your application!
You might ask, 'Oh, but what happens if I must override one of the resources in the jar?' Even in that very unusual case, you still have an option to override! You have config as a prefix in CLASSPATH, remember? Even when you have nested resources inside a package inside the jar, you can still override them by simply creating the same directory structure inside config. You would probably only do this in an emergency, and infrequently, anyway.
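
The override works because a class loader searches its classpath entries in order and returns the first match. Here is a small, self-contained sketch of that behavior. It uses two temporary folders to stand in for the config folder and the unpacked jar content; the class name ClasspathOverrideDemo, the resource name myapp/defaults.properties, and the greeting key are all made up for the demo.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical demo: the first classpath entry that contains a resource wins.
class ClasspathOverrideDemo {

    // Creates a temp folder holding one properties file with a single line.
    public static Path folderWith(String relativeName, String line) {
        try {
            Path dir = Files.createTempDirectory("cp-demo");
            Path file = dir.resolve(relativeName);
            Files.createDirectories(file.getParent());
            Files.write(file, line.getBytes(StandardCharsets.ISO_8859_1));
            return dir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Loads a properties resource, searching the given folders in order,
    // similar to what -cp "config:lib/*" does for the real application.
    public static Properties load(String resource, Path... searchPath) {
        try {
            URL[] urls = new URL[searchPath.length];
            for (int i = 0; i < searchPath.length; i++) {
                urls[i] = searchPath[i].toUri().toURL();
            }
            // Parent is null so the application classpath does not interfere.
            try (URLClassLoader loader = new URLClassLoader(urls, null);
                 InputStream in = loader.getResourceAsStream(resource)) {
                if (in == null) {
                    throw new FileNotFoundException("Resource not found: " + resource);
                }
                Properties props = new Properties();
                props.load(in);
                return props;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String name = "myapp/defaults.properties";
        Path config = folderWith(name, "greeting=from-config");
        Path jarContent = folderWith(name, "greeting=from-jar");
        // config is listed first, so its copy shadows the one from the jar content.
        System.out.println(load(name, config, jarContent).getProperty("greeting")); // from-config
        System.out.println(load(name, jarContent, config).getProperty("greeting")); // from-jar
    }
}
```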

Feedback

So what are some clever ways you have seen or done with your application configuration? I hope to hear from you and share.

Thursday, October 25, 2012

Simple Variable Substitution in Java String

When I wrote about how to improve the Java Properties class using Props, I showed a feature where you can use variable substitution such as mypath=${user.home} in your config file. The implementation underneath uses the Apache Commons Lang library with org.apache.commons.lang.text.StrSubstitutor. There is nothing wrong with this, but I was curious how hard it would be to remove that dependency, so that Props can be more standalone.

Here is a quick implementation in Groovy, but you should be able to translate it to Java easily.

// String variable substitutions
def parseVariableNames(String text) {
    def names = []
    def pos = 0, max = text.length()
    while (pos < max) {
        pos = text.indexOf('${', pos)
        if (pos == -1)
            break
        def end = text.indexOf('}', pos + 2)
        if (end == -1)
            break
        def name = text.substring(pos + 2, end)
        names.add(name)
        pos = end + 1
    }
    return names
}
def replaceVariable(String key, String value, String text) {
    //println "DEBUG: Replacing '${key}' with '${value}'"
    // Plain (non-regex) replace, so special characters in key or value are taken literally.
    def result = text.replace('${' + key + '}', value)
    return result
}

Probably not the most efficient thing, but it should work. Let's have some tests.

// Test
def map = ["name": "Zemian", "id": "1001"]
def inputs  = [
    'Hello ${name}',
    'My id is ${id}',
    '${name} is a good programmer.',
    '${name}\'s id is ${id}.'
]

result = inputs.collect{ line ->
    def names = parseVariableNames(line)
    names.each{ key ->
        line = replaceVariable(key, map.get(key), line) 
    }
    line
}
assert result == [
    'Hello Zemian',
    'My id is 1001',
    'Zemian is a good programmer.',
    'Zemian\'s id is 1001.'
]

The script should print nothing when the test passes. What do you think?
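
For reference, here is one way the same idea could look in plain Java. This is a sketch rather than a line-by-line translation of the Groovy above: it replaces the variables in a single pass, and it leaves unknown names and an unclosed ${ in the text untouched, which is a choice of this sketch. The class name VariableSubstitution is made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical Java counterpart of the Groovy functions above.
class VariableSubstitution {

    // Replaces each ${name} in text with vars.get(name).
    // Unknown names and an unclosed "${" are copied to the result as-is.
    public static String substitute(String text, Map<String, String> vars) {
        StringBuilder out = new StringBuilder();
        int pos = 0;
        while (pos < text.length()) {
            int start = text.indexOf("${", pos);
            if (start == -1) {
                break;
            }
            int end = text.indexOf('}', start + 2);
            if (end == -1) {
                break;
            }
            String name = text.substring(start + 2, end);
            String value = vars.get(name);
            // Copy the literal text before the variable, then the value or the raw placeholder.
            out.append(text, pos, start);
            out.append(value != null ? value : text.substring(start, end + 1));
            pos = end + 1;
        }
        // Copy whatever is left after the last variable.
        out.append(text.substring(pos));
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> vars = new LinkedHashMap<>();
        vars.put("name", "Zemian");
        vars.put("id", "1001");
        System.out.println(substitute("${name}'s id is ${id}.", vars)); // Zemian's id is 1001.
    }
}
```

Because the values are appended directly instead of going through a regex replacement, a value that contains $ or a backslash needs no escaping.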

Tuesday, October 23, 2012

Exploring different scheduling types with Quartz 2

We often think of Cron when we want to schedule a job. Cron is very flexible at expressing a repeating occurrence of an event/job in a very compact expression. However, it's not the answer for everything, as I often see from people asking for help in the Quartz user forum. Did you know that the popular Quartz 2 library provides many other schedule types (called Triggers) besides cron? I will show you each of the Quartz 2 built-in schedule types here, each within a complete, standalone Groovy script that you can run and test. Let's start with a simple one.

@Grab('org.quartz-scheduler:quartz:2.1.6')
@Grab('org.slf4j:slf4j-simple:1.7.1')
import org.quartz.*
import org.quartz.impl.*
import org.quartz.jobs.*

import static org.quartz.DateBuilder.*
import static org.quartz.JobBuilder.*
import static org.quartz.TriggerBuilder.*
import static org.quartz.SimpleScheduleBuilder.*

def trigger = newTrigger()
    .withSchedule(
        simpleSchedule()
        .withIntervalInSeconds(3)
        .repeatForever())
    .startNow()
    .build()
dates = TriggerUtils.computeFireTimes(trigger, null, 20)
dates.each{ println it }

This is Quartz's SimpleTrigger, and it lets you create a fixed-rate repeating job. You can even limit it to a certain repeat count if you like. I have imported all the necessary classes the script needs, and I use the latest Quartz 2.x builder API to create an instance of the trigger.

The quickest way to explore and test whether a schedule fits your needs is to print out its future execution times. Hence you see me using TriggerUtils.computeFireTimes in the script. Run the above and you should get the datetimes at which the job is scheduled to run, in this case every 3 seconds.

bash> groovy simpleTrigger.groovy
    Tue Oct 23 20:28:01 EDT 2012
    Tue Oct 23 20:28:04 EDT 2012
    Tue Oct 23 20:28:07 EDT 2012
    Tue Oct 23 20:28:10 EDT 2012
    Tue Oct 23 20:28:13 EDT 2012
    Tue Oct 23 20:28:16 EDT 2012
    Tue Oct 23 20:28:19 EDT 2012
    Tue Oct 23 20:28:22 EDT 2012
    Tue Oct 23 20:28:25 EDT 2012
    Tue Oct 23 20:28:28 EDT 2012
    Tue Oct 23 20:28:31 EDT 2012
    Tue Oct 23 20:28:34 EDT 2012
    Tue Oct 23 20:28:37 EDT 2012
    Tue Oct 23 20:28:40 EDT 2012
    Tue Oct 23 20:28:43 EDT 2012
    Tue Oct 23 20:28:46 EDT 2012
    Tue Oct 23 20:28:49 EDT 2012
    Tue Oct 23 20:28:52 EDT 2012
    Tue Oct 23 20:28:55 EDT 2012
    Tue Oct 23 20:28:58 EDT 2012

The most frequently used scheduling type is the CronTrigger, and you can test it out in a similar way.

@Grab('org.quartz-scheduler:quartz:2.1.6')
@Grab('org.slf4j:slf4j-simple:1.7.1')
import org.quartz.*
import org.quartz.impl.*
import org.quartz.jobs.*

import static org.quartz.DateBuilder.*
import static org.quartz.JobBuilder.*
import static org.quartz.TriggerBuilder.*
import static org.quartz.CronScheduleBuilder.*

def trigger = newTrigger()
    .withSchedule(cronSchedule("0 30 08 * * ?"))
    .startNow()
    .build()
dates = TriggerUtils.computeFireTimes(trigger, null, 20)
dates.each{ println it }

The javadoc for CronExpression is very good, and you should definitely read it thoroughly to use it effectively. With the script, you can easily explore all the combinations you want and verify future fire times before your job is invoked.

Now, if you have some odd scheduling needs, such as running a job every 30 minutes from MON to FRI and only between 8:00AM and 10:00AM, then don't try to cram all that into a Cron expression. Quartz 2.x has a dedicated trigger type just for this use, and it's called DailyTimeIntervalTrigger! Check this out:

@Grab('org.quartz-scheduler:quartz:2.1.6')
@Grab('org.slf4j:slf4j-simple:1.7.1')
import org.quartz.*
import org.quartz.impl.*
import org.quartz.jobs.*

import static org.quartz.DateBuilder.*
import static org.quartz.JobBuilder.*
import static org.quartz.TriggerBuilder.*
import static org.quartz.DailyTimeIntervalScheduleBuilder.*
import static java.util.Calendar.*

def trigger = newTrigger()
    .withSchedule(
        dailyTimeIntervalSchedule()
        .startingDailyAt(TimeOfDay.hourMinuteAndSecondOfDay(8, 0, 0))
        .endingDailyAt(TimeOfDay.hourMinuteAndSecondOfDay(10, 0, 0))
        .onDaysOfTheWeek(MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY)
        .withInterval(30, IntervalUnit.MINUTE))
    .startNow()
    .build()
dates = TriggerUtils.computeFireTimes(trigger, null, 20)
dates.each{ println it }

Another lesser-known Trigger type in Quartz is CalendarIntervalTrigger. You would use it when you need to repeat a job at every interval of a calendar period, such as every year or every month, where the interval is not a fixed length of time but depends on the calendar. Here is a test script for that.

@Grab('org.quartz-scheduler:quartz:2.1.6')
@Grab('org.slf4j:slf4j-simple:1.7.1')
import org.quartz.*
import org.quartz.impl.*
import org.quartz.jobs.*

import static org.quartz.DateBuilder.*
import static org.quartz.JobBuilder.*
import static org.quartz.TriggerBuilder.*
import static org.quartz.CalendarIntervalScheduleBuilder.*
import static java.util.Calendar.*

def trigger = newTrigger()
    .withSchedule(
        calendarIntervalSchedule()
        .withInterval(2, IntervalUnit.MONTH))
    .startAt(futureDate(10, IntervalUnit.MINUTE))
    .build()
dates = TriggerUtils.computeFireTimes(trigger, null, 20)
dates.each{ println it }
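
To see why a calendar-based interval is not a fixed length of time, here is a small sketch that uses only the JDK's java.time classes, not Quartz. It steps a date forward two calendar months at a time and prints how many days each step actually covers. The class name CalendarIntervalSketch and the start date are chosen for illustration, and the sketch does not try to model how Quartz itself handles time zones or end-of-month dates.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of calendar-based intervals using only the JDK.
class CalendarIntervalSketch {

    // Returns `count` dates, each `months` calendar months after the previous one.
    public static List<LocalDate> everyMonths(LocalDate start, int months, int count) {
        List<LocalDate> dates = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            dates.add(start.plusMonths((long) months * i));
        }
        return dates;
    }

    public static void main(String[] args) {
        List<LocalDate> dates = everyMonths(LocalDate.of(2012, 10, 23), 2, 4);
        for (int i = 1; i < dates.size(); i++) {
            long days = ChronoUnit.DAYS.between(dates.get(i - 1), dates.get(i));
            System.out.println(dates.get(i - 1) + " -> " + dates.get(i) + " = " + days + " days");
        }
        // 2012-10-23 -> 2012-12-23 = 61 days
        // 2012-12-23 -> 2013-02-23 = 62 days
        // 2013-02-23 -> 2013-04-23 = 59 days
    }
}
```

A fixed-rate trigger with an interval expressed in seconds or hours cannot produce these dates, because the number of days between them changes from step to step.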

I hope these will help you get started on most of your scheduling needs with Quartz 2. Trying these out and seeing your future fire times before even scheduling a job into the scheduler should save you some time and trouble.

Monday, October 15, 2012

Running Maven commands in a multi-module project

Have you ever tried running Maven commands inside a sub-module of a multi-module Maven project, and gotten a Could not resolve dependencies for project error message? And when you checked, the missing dependencies were sister modules within the same project! So what gives? It turns out you have to give a few more options to get this running correctly, and you have to remember to always stay in the parent pom directory to run it!

For example, if you check out the TimeMachine scheduler project, you can invoke the timemachine-hibernate module with Maven commands like this (-pl selects the module to build, and -am also builds the sister modules it depends on):

bash> hg clone https://bitbucket.org/timemachine/scheduler
bash> cd scheduler
bash> mvn -pl timemachine-hibernate -am clean test-compile

You can start the scheduler using Maven like this (remember to stay in the parent pom directory!):

bash> mvn -pl timemachine-scheduler exec:java -Dexec.mainClass=timemachine.scheduler.tool.SchedulerServer -Dexec.classpathScope=test

I have added -Dexec.classpathScope=test so you will see logging output, because there is a log4j.properties in the test classpath.

Alternatively, you can always run mvn install in the project root directory first, and then cd into any sub-module and run Maven commands there. However, you will have to keep tabs on what changed in the dependencies, even if they are in sister modules.

You can read more in this article from Sonatype.

Sunday, October 7, 2012

What's up with the JUnit and Hamcrest dependencies?

It's awesome that JUnit recognizes the usefulness of Hamcrest, because I use these two a lot. However, I find the way JUnit packages its dependencies odd, and it can cause class loading problems if you are not careful.

Let's take a closer look. If you look at junit:junit:4.10 from Maven Central, you will see that it has this dependency graph:

+- junit:junit:jar:4.10:test
|  \- org.hamcrest:hamcrest-core:jar:1.1:test

This is great, except that inside junit-4.10.jar you will also find the contents of hamcrest-core-1.1.jar embedded!

But why???

I suppose it's a convenience for folks who use Ant, so that they have one less jar to package in their lib folder, but it's not very Maven friendly. And you can expect classloading trouble if you want to upgrade Hamcrest or use extra Hamcrest modules.
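
If you suspect that the same class is packaged in more than one jar, you can ask the class loader for every copy it can see. Here is a minimal sketch; the class name DuplicateClassFinder is made up, and org.hamcrest.Matcher is used only as a default example of a class that lives in hamcrest-core.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical diagnostic: lists every classpath location that contains a given class.
class DuplicateClassFinder {

    // Returns the URLs of all copies of the class file visible to the given loader.
    public static List<URL> copiesOf(String className, ClassLoader loader) {
        String resource = className.replace('.', '/') + ".class";
        try {
            return Collections.list(loader.getResources(resource));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String className = args.length > 0 ? args[0] : "org.hamcrest.Matcher";
        List<URL> copies = copiesOf(className, Thread.currentThread().getContextClassLoader());
        System.out.println(copies.size() + " copies of " + className);
        for (URL url : copies) {
            System.out.println("  " + url);
        }
    }
}
```

Run it with your test classpath. More than one URL for the same class means two jars ship it, and the one listed first on the classpath is the one that gets loaded.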

Now if you have used Hamcrest long enough, you know that most of its goodies are in the second module, named hamcrest-library, which JUnit didn't package. JUnit however chose to include some JUnit+Hamcrest extensions of its own. Duplicated classes in jars are real troublemakers, so JUnit has a separate module, junit-dep, that doesn't embed the Hamcrest core package and helps you avoid this issue. So if you have a Maven project, you should use this instead.

<dependency>
    <groupId>junit</groupId>
    <artifactId>junit-dep</artifactId>
    <version>4.10</version>
    <scope>test</scope>
    <exclusions>
        <exclusion>
            <groupId>org.hamcrest</groupId>
            <artifactId>hamcrest-core</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.hamcrest</groupId>
    <artifactId>hamcrest-library</artifactId>
    <version>1.2.1</version>
    <scope>test</scope>
</dependency>

Notice that it's junit-dep, and also how I have to exclude hamcrest-core from it. This is needed if you want a hamcrest-library with a higher version than the one JUnit comes with, which is 1.1.

Interestingly enough, Maven is sensitive to the order of dependencies in the pom when it resolves conflicting versions automatically. When two conflicting versions sit at the same depth in the dependency tree, it picks the first one declared and ignores the rest. So you can shorten the above and drop the exclusion if, and only if, you place Hamcrest before JUnit like this:

<dependency>
    <groupId>org.hamcrest</groupId>
    <artifactId>hamcrest-library</artifactId>
    <version>1.2.1</version>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>junit</groupId>
    <artifactId>junit-dep</artifactId>
    <version>4.10</version>
    <scope>test</scope>
</dependency>

This should make Maven use the following dependencies:

+- org.hamcrest:hamcrest-library:jar:1.2.1:test
|  \- org.hamcrest:hamcrest-core:jar:1.2.1:test
+- junit:junit-dep:jar:4.10:test

However, I think using the exclusion tag gives you a more stable build that does not rely on Maven's implicit ordering rule, and it avoids an easy mistake for Maven beginners. I still wish JUnit would do a better job at packaging and remove the duplicated classes from its jar. I personally think it would be more productive for JUnit to also include hamcrest-library instead of just the hamcrest-core jar.

What do you think?