Monday, May 19, 2014

How to share a DocumentBuilder instance in Java

I tried to optimize the parsing of XML documents in JVoiceXML by keeping a DocumentBuilder as a static member:

    private static final DocumentBuilder BUILDER;

    static {
        final DocumentBuilderFactory factory =
                DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setIgnoringComments(true);
        try {
            // A single builder instance shared by every caller.
            BUILDER = factory.newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            // Fail class initialization instead of leaving BUILDER null.
            throw new ExceptionInInitializerError(e);
        }
    }

When multiple threads used this construct, I ran into the following exception:
 
    org.xml.sax.SAXException: FWK005 parse may not be called while parsing

A quick search on Google confirmed the bad news: a DocumentBuilder is not thread safe, so parse must not be called on a shared instance from two threads at the same time. An immediate solution is synchronization. However, synchronization forces all threads to parse one after another, and that cost is only justified when calls actually come from different threads. A better solution for me was to use ThreadLocal, which gives each thread its own DocumentBuilder. My current solution in JVoiceXML is the following:

    private static final ThreadLocal<DocumentBuilder> LOCAL_BUILDER
        = new ThreadLocal<DocumentBuilder>() {
        @Override
        protected DocumentBuilder initialValue() {
            // Called once per thread, on that thread's first get().
            final DocumentBuilderFactory factory =
                    DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setIgnoringComments(true);
            try {
                return factory.newDocumentBuilder();
            } catch (ParserConfigurationException e) {
                // Fail loudly instead of handing out a null builder.
                throw new IllegalStateException(
                        "unable to create a DocumentBuilder", e);
            }
        }
    };
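To make the usage concrete, here is a minimal, self-contained sketch of how such a thread-local builder might be called. The class name ThreadLocalXmlParser, the parse helper and the sample document are made up for illustration and are not the actual JVoiceXML code. Calling reset() before each reuse and wrapping the checked exceptions are assumptions on my side, not something the parser requires.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical helper for illustration, not part of JVoiceXML. */
final class ThreadLocalXmlParser {
    /** One lazily created DocumentBuilder per thread. */
    private static final ThreadLocal<DocumentBuilder> LOCAL_BUILDER =
            new ThreadLocal<DocumentBuilder>() {
        @Override
        protected DocumentBuilder initialValue() {
            final DocumentBuilderFactory factory =
                    DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setIgnoringComments(true);
            try {
                return factory.newDocumentBuilder();
            } catch (ParserConfigurationException e) {
                throw new IllegalStateException(
                        "unable to create a DocumentBuilder", e);
            }
        }
    };

    private ThreadLocalXmlParser() {
    }

    /** Parses the given XML text with the calling thread's own builder. */
    static Document parse(final String xml) {
        final DocumentBuilder builder = LOCAL_BUILDER.get();
        // Assumption: restore the builder's initial configuration before reuse.
        builder.reset();
        try {
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException | IOException e) {
            throw new IllegalArgumentException("unable to parse XML", e);
        }
    }

    public static void main(final String[] args) throws InterruptedException {
        // Both threads parse concurrently, each with its own builder.
        final Runnable task = () -> {
            final Document doc =
                    parse("<vxml version=\"2.1\"><form/></vxml>");
            System.out.println(Thread.currentThread().getName() + ": "
                    + doc.getDocumentElement().getNodeName());
        };
        final Thread first = new Thread(task, "parser-1");
        final Thread second = new Thread(task, "parser-2");
        first.start();
        second.start();
        first.join();
        second.join();
    }
}
```

Since every thread only ever touches the builder returned by its own get() call, no two threads can be inside parse on the same instance, which is the situation that triggered the FWK005 error.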

Access through ThreadLocal is very fast. However, creating new objects is also very fast in current JVMs. I still have to check whether caching really makes sense in this scenario, especially in light of the possible memory leaks pointed out by Nancy Deschenes.
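As a baseline for that check, a variant without any caching could look like the sketch below. It creates a fresh factory and builder for every call, so nothing is shared between threads and nothing stays attached to a thread afterwards. The class name FreshBuilderXmlParser and the parse helper are hypothetical, and the sketch makes no claim about which variant is faster; that would have to be measured.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical non-caching baseline, not part of JVoiceXML. */
final class FreshBuilderXmlParser {
    private FreshBuilderXmlParser() {
    }

    /** Parses the given XML text with a builder created just for this call. */
    static Document parse(final String xml) {
        final DocumentBuilderFactory factory =
                DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setIgnoringComments(true);
        try {
            // The builder is a local variable, so no other thread can see it.
            final DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(
                    "unable to create a DocumentBuilder", e);
        } catch (SAXException | IOException e) {
            throw new IllegalArgumentException("unable to parse XML", e);
        }
    }
}
```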