Monday, December 17, 2007

Impala and OSGi

The project most closely related to Impala is Spring OSGi, so it's tempting to try to make some comparisons between the two projects, or at least between their approaches to solving the problems they are tackling. I should preface this by saying that I am talking from a certain position of relative ignorance: my understanding of Spring and OSGi is based mostly on what I have read in the documentation and a small amount of playing with samples some time ago. I've since been to a talk on OSGi at JavaWUG in London, and read a fair portion of the OSGi spec. I hope this has given me sufficient understanding to make a few remarks which reflect my impressions of the differences between Impala and the world of Spring OSGi.

Spring OSGi and Impala are tackling an overlapping set of problems

What Impala and Spring OSGi have in common is that they are both dynamic module based systems.

Impala provides a developer productivity solution through build support, an interactive test runner, and support for efficient, fast running integration tests.

OSGi, on the other hand, tackles a wider set of problems relating to the visibility of classes. OSGi allows you to run multiple versions of third party libraries within the same JVM, which can prevent the problems that occur when different libraries depend on incompatible versions of the same third party library. You can think of OSGi as defining multiple class spaces, for third party libraries as well as application code.

Impala makes a clear distinction between application code and third party libraries. There is only a single class space for third party libraries. If there is a clash between third party libraries, then you are no better off (or worse off, for that matter) than you would be in a standard Java/pre-OSGi world. You would still need to resort to whatever workaround would apply in that case.

While third party library clashes are certainly a theoretical possibility, in my experience it is quite rare to actually be a victim of such problems in a way which couldn't be addressed relatively easily through a simple workaround. Others may have different experiences in this regard. However, there does seem to be quite a bit of overhead required to make sure that the jars you use fit in with OSGi requirements, so that they "play nicely" in the OSGi world. It's the kind of overhead that I think most developers would look for ways to avoid for as long as they are able.

Impala has more of a focus on developer productivity

The purpose of Spring OSGi seems primarily to be to bring the benefits of OSGi to Java, via Spring. In that sense, it is OSGi centric. The fundamental trade-off with OSGi is one of productivity: are you prepared to go to quite a lot more trouble defining quite precisely the nature of dependencies between libraries, with the benefit that you will never get ClassCastExceptions due to classloader issues, and that you will be able to dynamically update library versions? Make no mistake, these benefits come at a cost. One very experienced OSGi developer described working with library dependencies in OSGi as a "pain in the arse", something you wouldn't want to do without tool support.
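The classloader issue behind those ClassCastExceptions is easy to demonstrate in plain Java, without OSGi at all. The following sketch (class and method names are my own illustration, not from either project, and it assumes the class is loaded from a normal classpath location) loads the same class through two sibling classloaders, and shows that the JVM treats the results as distinct types:

```java
import java.net.URL;
import java.net.URLClassLoader;

public class ClassSpaceDemo {

    // loads this very class through two sibling loaders and reports
    // whether the JVM considers the two results to be the same class
    static boolean sameClassInBothSpaces() throws Exception {
        URL location = ClassSpaceDemo.class.getProtectionDomain()
                .getCodeSource().getLocation();

        // two independent "class spaces", neither delegating to the other
        ClassLoader a = new URLClassLoader(new URL[] { location }, null);
        ClassLoader b = new URLClassLoader(new URL[] { location }, null);

        Class<?> classA = a.loadClass("ClassSpaceDemo");
        Class<?> classB = b.loadClass("ClassSpaceDemo");

        // same name, same bytecode - but not the same class
        return classA == classB;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sameClassInBothSpaces()); // prints false

        // casting an instance across class spaces is exactly what
        // triggers the dreaded ClassCastException
        URL location = ClassSpaceDemo.class.getProtectionDomain()
                .getCodeSource().getLocation();
        ClassLoader other = new URLClassLoader(new URL[] { location }, null);
        Object foreign = other.loadClass("ClassSpaceDemo")
                .getDeclaredConstructor().newInstance();
        try {
            ClassSpaceDemo demo = (ClassSpaceDemo) foreign;
        } catch (ClassCastException expected) {
            System.out.println("ClassCastException, as predicted");
        }
    }
}
```

OSGi avoids this by defining precisely which loader exports which packages; the price is the dependency metadata described above.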

Impala, by contrast, does not impose these kinds of constraints. In fact, its very essence is about productivity. Dynamic module reloading can benefit productivity, because it means that integration tests can be written on the fly against a running application.

Impala Support for OSGi

Impala has all the internal interfaces necessary for seamlessly supporting OSGi, probably via Spring OSGi. I'm definitely considering adding this in the future. For the time being, that's not on the immediate horizon - getting the project into a publicly releasable form is a higher priority right now.

Sunday, November 25, 2007

Impala being used on a "real" project

I'm pleased to say that I am now using Impala for what may, and should, become a core and major development at the company where I am working. The results have been great. Progress has been more rapid than I had expected, and a lot of this is down to the capabilities provided by Impala.

I haven't run into any major obstacles that couldn't be overcome with relatively little effort. Dynamic reloading is working well. Test execution speed - a bugbear of the previous version - is quick. There are now around 200 tests, of which around 30 or so are integration tests which load up various bits of the application. The application currently consists of 21 Hibernate entities. Total execution time of all the tests - integration included - is about six to eight seconds for initial startup, plus the same again to run the tests.

It's a major moment for any open source project when it gets its first real commercial user. In this case, the actual user is still primarily myself. That's a good thing - it means that on a day-by-day basis I get to validate the features and functionality, discover bugs when they appear, and get ideas for improvements and new features.

Successful open source projects are often extractions from existing successful commercial projects. Others are developed from a vision of the technology the developer would like to work with on his or her next project. Impala started off this way. The danger for the latter kind of project is that opportunities to use the technology in anger on real commercial projects may not present themselves that readily, so you never get the chance to "eat your own dog food". At the moment, I am eating my own dog food by the spadeful, and it tastes good!

Tuesday, November 6, 2007

New package and domain names

I've been spending a bit of time working on the package layout of Impala classes. While the project is still at an early stage, it makes sense to try to get this right.

The first thing to do was to change the root package name. I've had to do this after moving the code to Google Code. The Impala root package is now org.impalaframework. It would have been nice to use org.impala, but I'm politely sticking to the convention that you should own the domain that you use (yes, I have actually bought the domain!). I certainly didn't want to use com.googlecode; if I need to move the project again, then I don't want to have to do another root package rename. org.impalaframework is a bit wordy, but at least there is a precedent in the form of the root package name used for Spring: org.springframework. Hope this doesn't make me too much of a copycat.

All this being said, the convention that you should use a package name which corresponds to a domain name that you own kinda sucks. Of course, this is only a convention, and you can choose to ignore it. I may choose to do so, but for now, I'm more interested in getting the structure of the packages and classes right. That means getting the names of classes and packages right, getting their locations correct and eliminating package cycles. This process should settle down soon. It will need to before I'm ready to do the first public release.

Monday, October 22, 2007

Grails Exchange

Last month I did a talk on behalf of the Java Web User Group (JavaWUG) at the Grails Exchange. Unfortunately, I did not go to the full conference, but it was nice to pop in and attend a couple of the evening sessions, do my bit, and join in for a beer and chat afterwards.

The thrust of my talk was on embedding Grails into Java applications. Grails is a full-stack framework which aims to solve a set of problems overlapping with those Impala addresses. I feel lured towards Groovy, but every time I get put off by my perception that Groovy is too slow. I'm not talking about execution speed so much as compile speed.

Anyway, there could be a nice synergy with Impala and Grails - Impala for a standard Spring-based back end, and Grails for the front end. That's certainly a part of the reason for my interest in embedding Grails in a typical Java application.

Impala, of course, works out of the box with Spring MVC, but it would be great to get integration going with other frameworks. One of the key requirements for a web framework integration with Impala is that the web application is itself dynamically reloadable. Of course, you could integrate Impala with a web framework that isn't dynamically reloadable, but it seems a bit pointless having dynamically reloadable back-end functionality while still having to restart the application every time you want to make a change to one of your presentation classes. Grails seems to fit the dynamic reloading requirement quite well. However, unpicking its functionality so that you can reuse it outside of the Grails environment is not trivial.

Here's some code which goes part of the way to achieving this task.

public static void main(String[] args) throws IOException {
    MetaClassRegistry registry = GroovySystem.getMetaClassRegistry();

    // make sure Groovy's ExpandoMetaClass handling is in place, as Grails expects
    if (!(registry.getMetaClassCreationHandler()
            instanceof ExpandoMetaClassCreationHandle)) {
        registry.setMetaClassCreationHandle(new ExpandoMetaClassCreationHandle());
    }

    FileSystemXmlApplicationContext cpc =
        new FileSystemXmlApplicationContext("web-app/WEB-INF/grails-context.xml");
    DefaultRuntimeSpringConfiguration springConfig =
        new DefaultRuntimeSpringConfiguration(cpc);

    GrailsApplication application = (GrailsApplication) cpc.getBean("grailsApplication");
    application.registerArtefactHandler(new ControllerArtefactHandler());
    application.registerArtefactHandler(new TagLibArtefactHandler());
    application.registerArtefactHandler(new DomainClassArtefactHandler());

    // restrict the plugins loaded to the web-specific ones
    application.getConfig().put("plugin.includes", "filter, controllers, taglib");

    DefaultGrailsPluginManager pluginManager =
        new DefaultGrailsPluginManager(new Class[] {}, application);

    // load the plugins and let them contribute their bean definitions
    pluginManager.loadPlugins();
    pluginManager.doRuntimeConfiguration(springConfig);

    // the unrefreshed context, then the fully refreshed one
    WebApplicationContext context = springConfig.getUnrefreshedApplicationContext();
    WebApplicationContext ctx = springConfig.getApplicationContext();
}

Note that this code will fire up Grails and load its web-specific plugins. The trick now is to get the rest of the environment working, so that a full Grails web application can slot neatly into Impala's environment. That's something I'll be trying in the next few weeks.

Thursday, October 4, 2007

Impala moved to Google code

I've just moved Impala to Google Code.

Here's the new link:

Here's why.

I originally set Impala up with the previous host because I wanted project hosting which would be simpler, quicker and easier to maintain than Sourceforge, where I had previously hosted projects. On the whole, things did work out that way, but there are a few things about it that I don't like.

First, and this is a biggie, there doesn't appear to be anonymous SVN source access. This means that people who want to check out the source code need to sign up, and for some people this is likely to be a barrier to trying out the project, and hence to its adoption.

On a related issue, there doesn't seem to be direct web-based SVN browsing available. I can't send a link which points directly to a source file on the SVN repository.

Second, it seems to have quite a complicated roles and permissions system that I didn't immediately understand, and still don't. Frankly, the only roles or permissions I'd be interested in are project owner, committers and everyone else. Anyone should be able to see anything; only developers should be able to change things.

Third, while everything seems nicely packaged, the tools seem a bit dated. The bug tracker, which I hadn't started using, looked like a wrapper around Bugzilla.

On the whole, the site has a bit of a clunky feel - I'm not that big a fan of the CollabNet software. I wouldn't expect this to change dramatically too soon, because I can just imagine the pain that would be involved with upgrading.

Finally, setting up the project took a bit too long. It took a couple of weeks before it was put in the Enterprise Incubator category. There's just a bit too much process around project setup. I want plain and simple hosting. The users and community as a whole, and not some Sun employees, will decide if there is any merit in the project.

Google Code, on the other hand, was really no frills in its setup. The tools seem very simple to use. Everything is laid out clearly. I didn't need to spend ages figuring out how the site works. Google generally does a good job at creating usable applications, and this one doesn't seem to be any different.

Of course, there is anonymous SVN access for casual users. I like the tabbed layout of the project home page. The bug tracker looks simple but very functional, and the same goes for the wiki. And, as you'd expect with Google, there is good indexing and search capability, so that documentation and source code are searchable.

All in all, it made sense to change sooner rather than later, as the project is in a very early, pre-adoption phase. The job would be much more painful later on. Now I've got to rename the packages!

Saturday, September 22, 2007

I've added some timing information for running Impala in interactive test mode, and the numbers are quite interesting.

I start by firing up a JUnit test class, called WineDAOTest, which looks like the following:

public class WineDAOTest extends BaseDataTest {

    public static void main(String[] args) {
        // fires up the interactive test runner (exact method name may differ)
        PluginTestRunner.run(WineDAOTest.class);
    }

    public void testDAO() {
        WineDAO dao = DynamicContextHolder.getBean(this, "wineDAO", WineDAO.class);

        Wine wine = new Wine();
        wine.setVineyard("Chateau X");
        wine.setVintage(1996);
        dao.save(wine); // save so the wine can be queried below

        Collection winesOfVintage = dao.getWinesOfVintage(1996);
        System.out.println("Wines of vintage 1996: " + winesOfVintage.size());
        assertEquals(1, winesOfVintage.size());

        wine.setVintage(2000);
        dao.update(wine); // update before re-reading

        Wine updated = dao.findById(wine.getId());
        assertEquals(2000, updated.getVintage());
    }

    public PluginSpec getPluginSpec() {
        return new PluginSpec("parent-context.xml", new String[] { "dao", "hibernate", "merchant" });
    }
}

The Impala features have been added by

  • exposing a main method which calls PluginTestRunner
  • obtaining the Spring context bean using DynamicContextHolder.getBean(...)
  • providing an implementation of getPluginSpec(), which tells the PluginTestRunner which plugins to include as part of the test. Note that we have plugins for the basic Hibernate configuration, for the DAO implementations, and for the service classes (the merchant plugin)

We now run the class in interactive mode by running it as a Java application in Eclipse (instead of running it as a JUnit test). Here's the output.

Enter u to show usage
l [testClass] to load test class
[testName] to run test
reload [plugin name] to reload plugin
reload to reload parent context
s to show test methods
r to rerun last command
r to rerun last run test
e to exit
Enter u to show usage
Available test methods:
Enter u to show usage
Running test tests.WineDAOTest
.Wines of vintage 1996: 1
Time: 3.264
OK (1 test)

Enter u to show usage
Parent context loaded in 0.551 seconds
Used memory: 3.4MB
Max available memory: 63.6MB

Enter u to show usage
Parent context loaded in 0.34 seconds
Used memory: 4.2MB
Max available memory: 63.6MB

Enter u to show usage
Parent context loaded in 0.521 seconds
Used memory: 3.8MB
Max available memory: 63.6MB

Enter u to show usage
>reload hibernate
Plugin hibernate loaded in 0.21 seconds
Used memory: 4.3MB
Max available memory: 63.6MB

Enter u to show usage
>reload merchant
Plugin merchant loaded in 0.16 seconds
Used memory: 4.6MB
Max available memory: 63.6MB

Enter u to show usage
>reload dao
Plugin dao loaded in 0.141 seconds
Used memory: 4.5MB
Max available memory: 63.6MB

Enter u to show usage
Parent context loaded in 0.511 seconds
Used memory: 4.1MB
Max available memory: 63.6MB

Enter u to show usage
Parent context loaded in 0.43 seconds
Used memory: 3.9MB
Max available memory: 63.6MB

Enter u to show usage
Running test tests.WineDAOTest
.Wines of vintage 1996: 1
Time: 0.081
OK (1 test)

Enter u to show usage
Running test tests.WineDAOTest
.Wines of vintage 1996: 1
Time: 0.07
OK (1 test)

Enter u to show usage

Notice how the initial test run took 3.2 seconds, which is how long we'd expect a small Spring/Hibernate test run to take. After that it gets more interesting.

Subsequent reloads of the application context (shown by the reload calls) take between 0.34 and 0.55 seconds - only a fraction of the time it takes to load the application context the first time.

Reloading individual plugins is much quicker still: hibernate took 0.21 seconds, dao took 0.14 and merchant took 0.16 seconds. This means, at least for a small application, we can reflect changes almost instantly. Even for an application which loads 10 times more slowly, the numbers are still quite acceptable.

Running the test without doing the reload is also extremely fast: compare 0.07 and 0.08 seconds with the original 3.2 seconds.

Wednesday, September 19, 2007

The Rationale of Micro Hot Deployment

A big part of Impala's reason for being is that it supports micro hot deployment of Java applications. So what is micro hot deployment, and why is it a valuable concept?

Traditional hot deployment in Java is all about being able to update an application on the fly, typically a WAR or EAR on an application server. In reality, this kind of hot deployment is not terribly useful. Firstly, it often comes with memory leaks, which mean that after a couple of redeployments you may end up running out of memory. Secondly, because it is a coarse-grained redeployment, it can take quite a long time. For example, if the server itself only takes two or three seconds to start up, and the application takes 20 to 30 seconds to load, then in terms of downtime you are little worse off doing a full server restart.

Micro hot deployment involves hot (re)deployment of parts of an application. Java Servlet containers such as Tomcat already support this in a very limited sense. For example, JSPs, which are compiled into Servlets, can typically be updated without an application restart. This is because it is safe to associate a JSP with its own class loader: the class which the JSP compiles to will never be referenced by another application class. It is very much at the end of the application dependency chain.

The only other form of hot deployment considered safe by application servers is redeployment of full applications. This is a fairly brute force tactic for getting around the limitations and pitfalls of Java classloaders.

Other technologies do a much better job of implementing micro hot deployment. The likes of PHP and Ruby on Rails, for example. Even within the Java camp, scripting based solutions such as Grails, based on Groovy, have tackled this problem head-on.

Unfortunately, Java frameworks have been pretty slow to follow suit. Tapestry 5 now promises that application classes will be reloadable on the fly in production, not just development, and Wicket has a reloading filter which can be used to hot-redeploy pages. But these are the exception, not the rule, and most Java frameworks are less ambitious in this department.

Solving the hot redeployment is something that Java application frameworks need to get right if they want developers to stay with the platform in the long term. This means working with classloaders, which can be a tricky business. But tricky does not mean impossible.

Impala tackles the problem of micro hot deployment for Spring-based applications. It allows for the division of the application into modules which can be reloaded separately. One of the important principles that it recognises is that in terms of frequency of change, not all application artifacts are created equal. Let's start with the most frequently changed parts of an application:
  • configuration flags: application specific properties which allow for switchable behaviour of the system at runtime. A trivial example would be a flag testMode which would be switched off in production.
  • UI templates: without any changes to the structure of the application, changing these can change the way the application appears.
  • infrastructure configuration: here we're talking about resources such as database connection pools, which exist independently of any application classes.
  • business rules: parts of the application which carry out the business logic of the application. These can, for example, be changed without having to change the domain model of the application.
  • domain model: we're moving closer to the root of the dependency graph here - changes to domain model objects typically can have downstream effects on all of the items listed above.
  • shared utility classes: these are units of code which are shared by different parts of the application, that don't relate directly to the domain model or business processes of the application. Since they haven't been packaged into separate third party libraries, they are technically still part of the application.
  • third party libraries: these tend to change much less often than the artifacts of the application itself.
Impala recognises the different life cycles of the different types of artifacts within an application. For example:
  • it is possible to reload the core of the application (domain model plus shared utility classes) and all of the business components without reloading any of the third party classes. This is important, because one of the things that makes Java apps take so long to start is the need to load classes from third party libraries. Typically, the number of third party classes used, directly or indirectly, is much larger than the number of application classes.
  • it is possible to reload one of the business components without reloading the application core or any other business components
  • it will be possible to reload infrastructure configurations without reloading the core of the application, and vice versa. Note that the latter is possible because infrastructure components don't depend directly on the core application classes.
  • it is possible to reload tests without having to reload any of the application classes they are testing. This can dramatically cut down the time to write integration tests.
  • configuration flags and UI templates are less of a challenge to reload dynamically. The former can be done via reloadable configuration files, while the latter is usually supported by good web frameworks or servlet containers.
Micro hot deployment is about finding ways to minimise the granularity of artifact reloading, so that only artifacts that have changed, and those which depend on them, have to be reloaded when changes are made. The benefits for developer productivity are obvious, and there are important potential benefits for live updates in deployed environments, too.
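To make the classloader mechanics underlying all of this concrete, here's a self-contained sketch (my own illustration, not Impala code) of the basic trick: compile a class, load it through a throwaway classloader, then recompile and load the new version through a fresh loader, all without restarting the JVM. It assumes a JDK (not just a JRE) is available, for the in-process compiler.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.ToolProvider;

public class HotReloadDemo {

    // writes, compiles and runs one "version" of a tiny module class,
    // using a fresh throwaway classloader each time
    static String deployAndRun(Path dir, String body) throws Exception {
        Path src = dir.resolve("");
        Files.write(src, ("public class Greeting { public String speak() { "
                + body + " } }").getBytes("UTF-8"));

        // compile in-process; requires a JDK
        ToolProvider.getSystemJavaCompiler().run(null, null, null, src.toString());

        // a brand new classloader picks up the freshly compiled class
        try (URLClassLoader loader =
                new URLClassLoader(new URL[] { dir.toUri().toURL() }, null)) {
            Class<?> clazz = loader.loadClass("Greeting");
            Object instance = clazz.getDeclaredConstructor().newInstance();
            return (String) clazz.getMethod("speak").invoke(instance);
        }
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("module");
        System.out.println(deployAndRun(dir, "return \"version 1\";"));
        // "redeploy": same class name, new behaviour, no JVM restart
        System.out.println(deployAndRun(dir, "return \"version 2\";"));
    }
}
```

A framework like Impala adds to this the hard parts: deciding module boundaries, sharing a parent loader for stable classes, and rewiring the Spring contexts that reference the reloaded beans.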

Friday, September 14, 2007

Impala's non-Maven approach to simpler builds

One of the goals of Impala is to have a pure work out-of-the-box feeling. If you download the distribution, you should be able to set up a new project with just one or two commands. Once you've done this, the project structure should be ready for you. The build infrastructure should just be there. As long as you obey the project structure conventions, you should be able to plug an existing build system into your new project.

All of this is in line with the ideas that drive Maven. Maven defines standardised folder locations and an existing build infrastructure which you can just plug in and use. All of these ideas are great, but I don't want the project to depend on Maven. I'm still trying to decide whether I should make the project structure conventions conform to those of Maven. The advantage is that Maven users would be able to simply Mavenize their project by adding a POM XML file. The disadvantage is personal - I don't particularly like the Maven project structure conventions. I wouldn't have chosen them for myself.

Right now I'm pretty close to having a pure out-of-the-box ANT based build system ready for Impala. It's taken quite a lot of time, but it's starting to feel much more right. Basically, the build will be enabled using the following combination:
  • a build.xml in the project root directory. The build.xml needs to have a property called impala.home, which defines where the Impala install files have been dumped to on the file system
  • a set of other project-specific properties which need to be specified, either within the build.xml itself or in a properties file
  • a set of imports of build scripts sitting in the impala.home folder
Here's an example:

<?xml version="1.0"?>
<project name="Build" basedir=".">

<property name = "workspace.root" location = ".."/>
<property name = "impala.home" location = "${workspace.root}/../impala"/>
<property file = ""/>
<import file = "${impala.home}/project-build.xml"/>
<import file = "${impala.home}/download-build.xml"/>

</project>
Note that the clean, compile, jar and test targets typical in a build system are found in project-build.xml. This file in turn relies on your project structure conventions to find the resources it needs. Similarly, adding download-build.xml adds support for obtaining dependencies, for example from a Maven ibiblio repository. You can make this build file the master build file for a multi-project build, simply by adding an import to shared-build.xml and adding the project.list property, as shown in this example:
<?xml version="1.0"?>
<project name="Build" basedir=".">

<property name = "workspace.root" location = ".."/>
<property name = "impala.home" location = "${workspace.root}/../impala"/>

<echo>Project using workspace.root: ${workspace.root}</echo>
<echo>Project using impala home: ${impala.home}</echo>

<property file = ""/>
<import file = "${impala.home}/project-build.xml"/>
<import file = "${impala.home}/shared-build.xml"/>
<import file = "${impala.home}/download-build.xml"/>
<import file = "${impala.home}/repository-build.xml"/>

<target name = "get" depends = "shared:get"/>
<target name = "fetch" depends = "repository:fetch-impala"/>
<target name = "clean" depends = "shared:clean"/>
<target name = "dist" depends = "shared:all-no-test"/>
<target name = "test" depends = "shared:test"/>

with the extra project.list property specified in the properties file.
I'm looking forward to getting all of this work done, so I can get back to what Impala is really supposed to be doing, which is supporting dynamic Spring modules. But all of this functionality is terribly important for making the whole experience as painless as possible for end users.

Thursday, September 13, 2007

Simple dependency management, Impala style

I've been spending a bit of time working on a simple dependency management system for Impala. You're probably asking: why don't you just use something like Maven or Ivy? Aren't they supposed to be doing that job for you?

Well, the problem is that I've decided I don't want transitive dependency management. What I do want is a simple way for users of the project (including myself, of course) to easily find and download the dependencies they do need. In a funny way, Maven does come to the rescue. It defines a standard format for repository jar entries, which goes like this:
[base url]/[organisation name]/[artifact name]/[version]/[artifact name]-[version].jar
For example, I can get hold of the Spring framework jar using the URL

Thankfully, Maven also defines a standard format for source jars.
[base url]/[organisation name]/[artifact name]/[version]/[artifact name]-[version]-sources.jar
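As a quick sketch of the two formats above (the class name and base URL here are my own illustration; note also that Maven 2 style repositories replace the dots in the organisation name with slashes, which is assumed here):

```java
public class RepoUrlBuilder {

    // builds a repository URL following the layout described above;
    // the dot-to-slash conversion of the organisation name is an assumption
    // based on Maven 2 repository conventions
    static String artifactUrl(String baseUrl, String organisation,
            String artifact, String version, boolean sources) {
        String suffix = sources ? "-sources.jar" : ".jar";
        return baseUrl + "/" + organisation.replace('.', '/') + "/" + artifact
                + "/" + version + "/" + artifact + "-" + version + suffix;
    }

    public static void main(String[] args) {
        // hypothetical repository base URL, for illustration only
        String base = "http://repository.example.com/maven2";
        System.out.println(artifactUrl(base, "org.springframework", "spring", "2.0.6", false));
        // → http://repository.example.com/maven2/org/springframework/spring/2.0.6/spring-2.0.6.jar
        System.out.println(artifactUrl(base, "org.springframework", "spring", "2.0.6", true));
        // → http://repository.example.com/maven2/org/springframework/spring/2.0.6/spring-2.0.6-sources.jar
    }
}
```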
Okay, so if I don't want transitive dependencies, what do I want?
  • an easy, reliable way to get hold of dependencies without having to go to each and every individual project web site
  • source jars that match the downloaded binaries, so that I can easily attach source for debugging
  • a simple way of saying where the downloaded files should go
Impala uses the concept of a project-specific repository. A project-specific repository has a simple structure consisting of parent folders (such as main and test) and subfolders.

Now, all I need is a simple mechanism which says how individual artifacts are added to this repository. The mechanism Impala uses is a simple text file, which has entries such as below:

main from commons-logging:commons-logging:1.1
main from commons-io:commons-io:1.3
main from log4j:log4j:1.2.13
main from org.springframework:spring:2.0.6 source=true
main from cglib:cglib-nodep:2.1_3
test from junit:junit:3.8.1
test from org.easymock:easymock:2.2
test from org.easymock:easymockclassextension:2.2
For each artifact, I can optionally say whether I want to include source (overriding whatever the default setting happens to be).
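Parsing these entries is simple enough that a sketch fits here (the class and field names are mine, and the real Impala parser may well differ):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DependencyEntry {

    final String target;        // e.g. "main" or "test"
    final String organisation;
    final String artifact;
    final String version;
    final boolean source;       // whether to fetch the -sources.jar too

    DependencyEntry(String target, String organisation, String artifact,
            String version, boolean source) { = target;
        this.organisation = organisation;
        this.artifact = artifact;
        this.version = version;
        this.source = source;
    }

    // parses a line such as "main from org.springframework:spring:2.0.6 source=true"
    static DependencyEntry parse(String line) {
        String[] tokens = line.trim().split("\\s+");
        if (tokens.length < 3 || !"from".equals(tokens[1])) {
            throw new IllegalArgumentException("Unparseable entry: " + line);
        }
        String[] coords = tokens[2].split(":");

        // trailing key=value options; only source is interpreted here
        Map<String, String> options = new LinkedHashMap<String, String>();
        for (int i = 3; i < tokens.length; i++) {
            String[] pair = tokens[i].split("=", 2);
            options.put(pair[0], pair.length > 1 ? pair[1] : "");
        }
        boolean source = Boolean.parseBoolean(options.get("source"));
        return new DependencyEntry(tokens[0], coords[0], coords[1], coords[2], source);
    }

    public static void main(String[] args) {
        DependencyEntry e = parse("main from org.springframework:spring:2.0.6 source=true");
        System.out.println( + " " + e.organisation + " " + e.artifact
                + " " + e.version + " " + e.source);
        // → main org.springframework spring 2.0.6 true
    }
}
```

The point of the format is that each line carries everything needed to build the download URL in the repository layout described in the previous post.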

I still need to do the work of figuring out what the dependencies are, but once I've got there, life is pretty peachy. You get the best of both worlds - simplicity and control.

I still need to do some tweaks to get the mechanism fully ship shape, but it's basically working pretty well. Another feature is that you can specify multiple source locations, including your local Maven repository, so if the artifact you need happens to be lurking on your file system from a previous download, you don't need to waste time trying to get it from the net.

Wednesday, September 5, 2007

Impala, ANT and transitive dependencies

Any project needs a build environment. For a project like Impala, which is trying to provide a simpler environment for Spring development, the choice of build tool is important.

ANT has been around for an awful long time, but it's not exactly the most sexy software around these days. For a while it was the only show in town. Nobody likes writing build scripts, and the one thing that ANT does make you do is write build scripts for your projects.

These days, the obvious other choice is Maven. Maven's two selling points are these: it saves you from having to write build scripts, and it is supposed to handle dependency management for you. Of course, it cannot do everything. You need to tell Maven what libraries you want to use. Maven in turn will find a whole bunch of other libraries that you don't necessarily know about, which the libraries that you do want to use themselves depend on. This is called transitive dependency management. The problem, though, is twofold. First, Maven relies on dependency information put in POM XML files. This information is only as good as the person editing the POM. It's not foolproof. Secondly, even if the POM is accurate, it may be more complete than it needs to be. How does it deal with optional dependencies? If these are included in the POM, but not needed for the parts of the library that you are using, then you get a bloated repository. And the one thing I can't stand is bloat.

There are other things I don't like about Maven. Frankly, it tries to do too much, without necessarily doing anything very well. Lots of people I speak to have complained about this. It entails a lack of control over your build environment, which scares me. The horror stories you hear about don't seem to happen with ANT. It's a pain, but one way or another, you know you are going to be able to get the job done.

A second option is Gant, based on Groovy. Gant is built on ANT too, but uses a Groovy DSL to make ANT scripting easier. It seems much better for complex scripting, because you can use proper language features such as iteration and conditionals much more easily than with ANT. Two problems with Gant, though. There's more setup involved, because it requires Groovy and Gant to be installed (as well as ANT, perhaps). Also, it's a bit slow. Groovy is much slower to compile than Java, which has a noticeable impact on build times.

For these reasons, I'm back to ANT. If I need to do anything complex, like handle dependency management, or iterate through projects, I'll write a custom task for that. It's easy enough. ANT is not pretty, but it is effective.