
Going Traveling

I'll be traveling to South-East Asia for eight weeks, leaving this Saturday at 10am. When I get a chance I'll post updates on the blog.

The plan is to meet up with my friend Marc in Vancouver and then fly to Hong Kong. There we will meet our friend Elise. She's studying law in Hong Kong and will be our tour guide for three days. After that all of us are heading to Hanoi, Vietnam. From there on we don't have a set schedule and will make our way down south to Cambodia. We have four weeks for that. Then Marc will be heading home and Elise will head north to Laos. I'm going to head south and find a nice beach in Thailand to finish the vacation on a relaxing note. At the very end I might go and visit my friend Markus who is living in Tokyo. It should be an awesome trip! :-)

Impressions from BioIT World

I attended BioIT World last week as part of the GenoLogics crew that traveled out there. From a marketing and sales perspective it was an excellent show. Traffic at our booth was steady and it was very busy at times.

I thought the technology side of things was a little disappointing. I walked away with very little new, concrete information. Most of the talks focused on what we "should" be doing, especially with regard to the semantic web and RDF technology. This is quite interesting, but there is nothing new here, and I'm sure most of the audience has already heard it many times. I would have been much more interested in concrete examples of how this was actually implemented and put to use. I suppose the problem is that big pharma (which has the money and resources to actually do this) isn't interested in sharing its "secrets", since they are considered a competitive advantage.

Personally, I'm still very skeptical about the semantic web and its feasibility in practice. While the technology certainly makes sense, the manual effort of unifying the many different systems and mapping them to an established common vocabulary seems almost insurmountable. This is made even more difficult by the fact that a large number of small to mid-sized labs in academia do not have a proper data management system and are just working with Excel files stored in some sort of directory structure. Good luck indexing that and mapping the contents to an ontology.

The most interesting talks were about IT infrastructure for next-generation sequencing. The talks from the Broad Institute and Harvard were great. Some takeaways:

  • One next-generation 454 sequencer generates as much data as 399 current ABI 3730s!
  • 1 gigabit networks are barely adequate for the data; 10 gigabit is the way to go.
  • In general, though, moving that much data around the network is impractical, so they just swap disks and move them between computers. This was termed "SneakerNet". :-)
  • Even if the raw numbers add up, your infrastructure might fail due to secondary effects. For example, the Broad Institute's disk array was large enough for the data, but the processing software kept hitting the same areas of the disk, which caused the disk to fail. They then had to switch to clustered storage.
  • This is as much a "social" problem as a technology problem. Researcher expectations have to be reset, since we realistically cannot keep all the data around forever and the data will not always be available in an instant.
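
To put the network numbers in perspective, here is a back-of-the-envelope sketch of how long it takes to move a data set over a 1 gigabit versus a 10 gigabit link. The 1 TB data set size is an assumed figure for illustration only, not a number from the talks, and the class and method names are made up. The calculation ignores protocol overhead and disk speed, so real transfers will be slower.

```java
// Back-of-the-envelope transfer times. The data set size is an assumed
// figure for illustration only; protocol overhead and disk speed are ignored.
class TransferTime
{
    /** Seconds needed to move the given number of bytes over a link of the given speed. */
    static double seconds(double bytes, double bitsPerSecond)
    {
        return bytes * 8.0 / bitsPerSecond;
    }

    public static void main(String[] args)
    {
        double oneTerabyte = 1e12;   // assumed data set size (decimal terabyte)
        double gigabit = 1e9;        // 1 gigabit per second
        double tenGigabit = 10e9;    // 10 gigabits per second

        System.out.printf("1 Gbit/s:  %.1f hours%n", seconds(oneTerabyte, gigabit) / 3600.0);
        System.out.printf("10 Gbit/s: %.1f minutes%n", seconds(oneTerabyte, tenGigabit) / 60.0);
    }
}
```

Under these assumptions a single terabyte ties up a 1 gigabit link for a bit over two hours, while a 10 gigabit link needs roughly 13 minutes.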

Net Neutrality in Canada

I used to have a link to Neutrality.ca on my blog. It was a great website outlining the net neutrality issues in Canada. But I just noticed that the site has been taken down due to "legal concerns". This is a real shame and I wonder what the concerns are.

Net neutrality is an issue that has received a lot of attention in the US, but sadly has been mostly ignored in Canada up until recently. If you care about your rights as a consumer or simply about free speech, you should be taking notice. For more information just search on Google. It will turn up plenty of sites that do an excellent job of explaining what is at stake.

In other news, I'll be attending BioIT World in Boston from April 30 to May 2. I'm going primarily to attend the lectures, but I'll also be hanging around the GenoLogics booth quite a bit. Something tells me having an IT-savvy person at the booth during an IT trade show is probably a good idea. ;-)

Unsign a JAR with Ant

I was working on the build system today and came across this Ant macro I wrote a while ago. It unsigns a JAR by unpacking it, leaving out the signature files in META-INF, and re-jarring it. This is useful if you want to deploy your application via Web Start and include all JARs in one JNLP file, instead of using JNLP extensions for JARs with different signatures. Simply unsign all JARs and then re-sign them with your own signature.

Since there is no straightforward way to do this in Ant, I had to write my own macro. It might be useful to other people, so I decided to post it:

<macrodef name="unsignjar">

    <attribute name="jar"/>

    <sequential>
        <!-- Remove any existing signatures from a JAR file. -->
        <tempfile prefix="unsignjar-" destdir="${java.io.tmpdir}" property="temp.file"/>
        <echo message="Removing signatures from JAR: @{jar}"/>
        <mkdir dir="${temp.file}"/>

        <!-- Unpack everything except the signature files. -->
        <unjar src="@{jar}" dest="${temp.file}">
            <patternset>
                <include name="**"/>
                <exclude name="META-INF/*.SF"/>
                <exclude name="META-INF/*.DSA"/>
                <exclude name="META-INF/*.RSA"/>
            </patternset>
        </unjar>

        <delete file="@{jar}" failonerror="true"/>

        <!-- Touch it in case the file didn't have a manifest.
             Otherwise the JAR task below will fail if the manifest
             file doesn't exist. -->
        <mkdir dir="${temp.file}/META-INF"/>
        <touch file="${temp.file}/META-INF/MANIFEST.MF"/>

        <!-- Re-create the JAR under its original name. -->
        <jar destfile="@{jar}"
            basedir="${temp.file}"
            includes="**"
            manifest="${temp.file}/META-INF/MANIFEST.MF"/>

        <delete dir="${temp.file}" failonerror="true"/>
    </sequential>
</macrodef>

To use the macro:

  <unsignjar jar="/some/location/file.jar"/>
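
If you want to double-check that the macro did its job, a small Java sketch like the following lists any signature files still left in a JAR. The class and method names are made up for this example and are not part of Ant. It only relies on java.util.jar.JarFile from the standard library and applies the same META-INF/*.SF, *.DSA and *.RSA patterns as the macro.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper, not part of Ant: lists signature files left in a JAR.
class SignatureCheck
{
    private static final String META_INF = "META-INF/";

    /** True if the entry sits directly in META-INF and ends with .SF, .DSA or .RSA. */
    static boolean isSignatureFile(String entryName)
    {
        if (!entryName.startsWith(META_INF) || entryName.indexOf('/', META_INF.length()) != -1)
        {
            return false;
        }
        return entryName.endsWith(".SF") || entryName.endsWith(".DSA") || entryName.endsWith(".RSA");
    }

    /** Returns the names of all signature files found in the given JAR. */
    static List<String> findSignatureFiles(File jar) throws IOException
    {
        List<String> found = new ArrayList<String>();
        // verify=false so that a JAR with broken signatures can still be inspected.
        JarFile jarFile = new JarFile(jar, false);
        try
        {
            Enumeration<JarEntry> entries = jarFile.entries();
            while (entries.hasMoreElements())
            {
                String name = entries.nextElement().getName();
                if (isSignatureFile(name))
                {
                    found.add(name);
                }
            }
        }
        finally
        {
            jarFile.close();
        }
        return found;
    }

    public static void main(String[] args) throws IOException
    {
        List<String> leftovers = findSignatureFiles(new File(args[0]));
        System.out.println(leftovers.isEmpty() ? "No signature files found." : "Still signed: " + leftovers);
    }
}
```

Run it with the path to the JAR as the only argument, before and after calling the macro.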

Embedded Tomcat Class Loading Trickery

Recently I embedded Tomcat directly within our application. The idea is that web application extensions to the system can easily share the core of our platform and the root Spring application context. This worked well until yesterday, when I ran into some weird class loading issues with Tomcat.

One issue was that if an application included a JAR file that is also included in the core platform, Tomcat would still load the class from the system class loader, instead of loading it from the WAR file using the Tomcat web app class loader.

For example, this is a problem when using the Wicket web framework. Wicket will load a class for a page from inside one of the Wicket core classes using getClass().getClassLoader().loadClass(XYZ). However, since Wicket was loaded using the system class loader it cannot see the web app's classes and this will result in a ClassNotFoundException.

Another problem showed up with CGLib. When CGLib tried to instantiate a class using ClassLoader.defineClass(), it would result in a NoClassDefFoundError. Again, the problem here is that CGLib was loaded from the system class loader and cannot resolve classes from the web app's WAR file.
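
When chasing problems like these, it helps to see which class loader actually loaded a class and what its parents are. Here is a minimal sketch that walks the parent chain. LoaderChain and chain() are names made up for this example, and it only uses ClassLoader.getParent() from the standard library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical debugging helper: lists a class loader and all of its parents.
class LoaderChain
{
    /** Returns the given loader followed by each of its parents in turn. */
    static List<ClassLoader> chain(ClassLoader start)
    {
        List<ClassLoader> result = new ArrayList<ClassLoader>();
        for (ClassLoader cl = start; cl != null; cl = cl.getParent())
        {
            result.add(cl);
        }
        return result;
    }

    public static void main(String[] args)
    {
        // Prints the loader that loaded this class, then its parents.
        // The JVM's built-in bootstrap loader is represented by null and is not listed.
        for (ClassLoader cl : chain(LoaderChain.class.getClassLoader()))
        {
            System.out.println(cl);
        }
    }
}
```

Calling chain(SomeLibraryClass.class.getClassLoader()) from inside a web app shows whether a library came from the web app class loader or from further up the chain. A class can only see classes from its own loader and the loaders above it.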

It took a while to find out why this was happening. According to the Tomcat 5 Class Loader HOW-TO, Tomcat should always load classes from the web app itself before loading them from the system class loader, except for special-case classes. The middle of that page, however, does say that the system class loader is used first. Unfortunately, for the longest time I was looking at the Tomcat 4 Class Loader HOW-TO, and that page still indicates that the web app class loader is always used first.

It turns out the system class loader is in fact always consulted first; see line 1267 of WebappClassLoader in the Tomcat 5.5.17 sources. However, the normal Catalina startup script resets the system class path to a minimal set of classes, so the system class loader cannot find application-specific classes. When Tomcat is started normally, those classes therefore always end up being loaded by the WebappClassLoader.
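
The difference between the two lookup orders can be sketched with a small custom class loader. This is not Tomcat's WebappClassLoader, just an illustration under simplified assumptions. ChildFirstClassLoader is a made-up name; it searches its own URLs before delegating to its parent, whereas a plain URLClassLoader always asks its parent first.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Illustrative only: a class loader that looks in its own URLs before asking
// its parent. A plain URLClassLoader does it the other way around.
class ChildFirstClassLoader extends URLClassLoader
{
    ChildFirstClassLoader(URL[] urls, ClassLoader parent)
    {
        super(urls, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException
    {
        // Core Java classes must always come from the parent chain.
        if (name.startsWith("java."))
        {
            return super.loadClass(name, resolve);
        }

        synchronized (getClassLoadingLock(name))
        {
            Class<?> c = findLoadedClass(name);
            if (c == null)
            {
                try
                {
                    // Look in our own URLs first.
                    c = findClass(name);
                }
                catch (ClassNotFoundException e)
                {
                    // Not found locally, so fall back to normal parent delegation.
                    c = super.loadClass(name, false);
                }
            }
            if (resolve)
            {
                resolveClass(c);
            }
            return c;
        }
    }
}
```

With a loader like this, a class that exists both in the loader's own URLs and on the system class path is taken from the loader's URLs. With parent-first delegation, the copy on the system class path wins, which is the behaviour described above.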

To get the same behaviour when embedding Tomcat in your own application, you have to create a bootstrap class. You only include the bootstrap class in the class path when loading your application. The bootstrap class then creates a URLClassLoader to load in the rest of your application classes. For example:

import java.io.File;
import java.lang.reflect.Method;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

public class Bootstrap
{
    public static void main(String[] args)
    {
        String root = args[0];

        try
        {
            List<URL> classpath = new ArrayList<URL>();
            classpath.add(new File(root, "conf").toURI().toURL());
            addJarFileUrls(classpath, new File(root, "libs"));

            ClassLoader cl = new URLClassLoader(classpath.toArray(new URL[0]));

            // Set the proper classloader for this thread.
            Thread.currentThread().setContextClassLoader(cl);

            // Use reflection to load the application class through the new class
            // loader. Everything the application references is then resolved through
            // that loader as well and therefore picks up the rest of our libraries.
            Class<?> appClass = cl.loadClass("com.neatstep.frank.Application");
            Object app = appClass.newInstance();

            Method m = appClass.getMethod("start");
            m.invoke(app);
        }
        catch (Exception ex)
        {
            ex.printStackTrace();
            System.exit(1);
        }
    }

    /**
     * Add JAR files found in the given directory to the list of URLs.
     * @param jarUrls the list to add URLs to
     * @param root the directory to recursively search for JAR files.
     */
    private static void addJarFileUrls(List<URL> jarUrls, File root) throws MalformedURLException
    {
        File[] children = root.listFiles();

        // listFiles() returns null if root is not a readable directory.
        if (children == null)
        {
            return;
        }

        for (int i = 0; i < children.length; i++)
        {
            File child = children[i];

            if (child.isDirectory() && child.canRead())
            {
                addJarFileUrls(jarUrls, child);
            }
            else if (child.isFile() && child.canRead() &&
                     child.getName().toLowerCase().endsWith(".jar"))
            {
                jarUrls.add(child.toURI().toURL());
            }
        }
    }
}

And in a separate class file that is included in one of the JAR files which we dynamically add to the URLClassLoader above:

public class Application
{
    public void start()
    {
        // Do whatever you want.

        // Initialize embedded Tomcat.
    }
}

After adding this bootstrap class to our application, everything worked fine.
