Create PDF files with Cocoon 3 and Struts 2

Finally I reached the point with Time & Bill where I needed to create invoices. Of course, invoices need to be downloadable as PDF. First I thought I would use Apache PDFBox. My idea was to upload some kind of template PDF and replace some strings with my content. But I failed; it seems that LibreOffice is not able to create a PDF which can be manipulated with PDFBox (or vice versa :-)). There is iText, but it is commercial. I prefer to use software from the Apache Software Foundation, and iText would be the only non-Apache exception in Time & Bill. Finally I looked at Apache FOP again. I had tried it a long time ago and it worked fine, but it felt very complex to me. But well, it is based on standards. Looking at it in detail, it does not implement everything the standard defines, but I was pretty sure I could do whatever I wanted with it.

So, now how should I integrate this into my Struts application?

I asked Simone Tripodi for advice. He told me about Cocoon 3. I had worked with Cocoon 2 before, which is/was highly complex and heavy. You needed a long time to get into it. In the end I loved the concepts behind Cocoon 2 and started working on the PIWI Framework, which borrows its terminology and some of its concepts and implements them in PHP. Ironically, some people now say PIWI has become complex too. But well, that’s another story.

Apache Cocoon 3 addresses this complexity. It is so easy to use and so lightweight that I was deeply impressed. If you had told me I should use Cocoon 2 in Struts, I would have laughed hysterically. But you really can’t compare the two versions with each other, except that the terminology and philosophy behind them are similar.

To use it together with FOP, you need to add these dependencies to your pom.xml:

<dependency>
    <groupId>org.apache.cocoon.pipeline</groupId>
    <artifactId>cocoon-pipeline</artifactId>
    <version>3.0.0-alpha-3</version>
</dependency>
<dependency>
    <groupId>org.apache.cocoon.optional</groupId>
    <artifactId>cocoon-optional</artifactId>
    <version>3.0.0-alpha-3</version>
</dependency>
<dependency>
    <groupId>org.apache.xmlgraphics</groupId>
    <artifactId>fop</artifactId>
    <version>1.0</version>
</dependency>

Then I wrote my Struts 2 Action class, as usual. Here is a shortened example:

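import java.io.InputStream;
import java.net.URL;

import com.opensymphony.xwork2.ActionSupport;

public class InvoiceAction extends ActionSupport {
    // Read by the Struts 2 "stream" result through getPdfStream()
    protected InputStream pdfStream;

    @Override
    public String execute() throws Exception {
        // … creating xml
        String xml = ...
        // The stylesheet which turns the invoice XML into XSL-FO
        URL xsltInput = InvoicePipeline.class.getResource("invoice2fo.xsl");
        InvoicePipeline pipeline = new InvoicePipeline(xsltInput);
        pdfStream = pipeline.generateInvoice(xml);
        return SUCCESS;
    }

    public InputStream getPdfStream() {
        return pdfStream;
    }
}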

Here is the connection between Struts 2 and Cocoon 3: I created an InvoicePipeline. The constructor argument is a URL to an XSLT file. Once it is constructed, I call a generation method and receive an InputStream. On a side note, the XML generation is done by some of my own classes. They generate XML out of a Java bean. I used the codebase of JJSON for this task and just rewrote the JSON output to be XML output. It is very basic and ugly, but unfortunately it works better for me than the complex Apache Commons Betwixt.
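To give an idea of what "XML out of a Java bean" can mean, here is a minimal sketch. It is not the JJSON-based code mentioned above; it simply assumes a flat bean and uses the JDK's java.beans.Introspector. The names BeanXmlSketch, Invoice, toXml and escape are made up for this illustration.

```java
import java.beans.BeanInfo;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.lang.reflect.Method;

/** Hypothetical helper: turns a flat Java bean into a small XML string. */
final class BeanXmlSketch {

    /** Example bean; the class and its properties are invented for illustration. */
    public static class Invoice {
        private final String customer;
        private final int number;

        public Invoice(String customer, int number) {
            this.customer = customer;
            this.number = number;
        }

        public String getCustomer() { return customer; }
        public int getNumber() { return number; }
    }

    /** Writes one child element per readable bean property. */
    static String toXml(Object bean, String rootName) throws Exception {
        // Stopping at Object.class skips the "class" property from getClass().
        BeanInfo info = Introspector.getBeanInfo(bean.getClass(), Object.class);
        StringBuilder xml = new StringBuilder();
        xml.append('<').append(rootName).append('>');
        for (PropertyDescriptor property : info.getPropertyDescriptors()) {
            Method getter = property.getReadMethod();
            if (getter == null) {
                continue; // write-only property, nothing to serialize
            }
            Object value = getter.invoke(bean);
            xml.append('<').append(property.getName()).append('>');
            xml.append(escape(String.valueOf(value)));
            xml.append("</").append(property.getName()).append('>');
        }
        xml.append("</").append(rootName).append('>');
        return xml.toString();
    }

    /** Escapes the characters that must not appear raw in XML text content. */
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

Calling BeanXmlSketch.toXml(new BeanXmlSketch.Invoice("Smith & Sons", 42), "invoice") yields an invoice element with one child per property. Nested beans, collections and attributes are deliberately left out of this sketch.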

I wanted the invoice as a download, therefore my struts.xml looks like this:

<action name="invoice" class="de.grobmeier.example.InvoiceAction">
    <result type="stream">
        <param name="contentType">application/pdf</param>
        <param name="inputName">pdfStream</param>
        <param name="contentDisposition">attachment; filename="invoice.pdf"</param>
        <param name="bufferSize">1024</param>
    </result>
</action>

As you can see, I use the Struts 2 ResultType “stream”, which will return the InputStream I store in pdfStream. The file name is dynamic in the actual application.

But now let’s look into Cocoon 3 itself.

The InvoicePipeline mentioned above is my entry point. As the name suggests, you might expect it to be a Pipeline. In simple terms, a Pipeline is an object which holds one or more components. A component, in turn, is a class which “does something”, like transforming data, generating new data or serializing data to HTML. Read more in the Cocoon docs.

But actually my own InvoicePipeline class is not of the type “Pipeline” or anything like that. It is just a POJO whose constructor takes a URL to an XSLT file. In addition it offers generateInvoice, a method which takes some XML as a String and returns an InputStream. My idea is to create a temporary file which is streamed back.

It is very appealing that I don’t need to run a server. A plain main method can execute my class and the customized Cocoon 3 pipeline. This is also great for unit testing. If you remember the days of Cocoon 2 with its so-called “blocks”: wow, what progress this project has made. Cocoon 3 is embeddable.

Let’s look at the important parts of my InvoicePipeline class.

public InputStream generateInvoice(String xmlInput) throws Exception {
    Pipeline<SAXPipelineComponent> pipeline = new NonCachingPipeline<SAXPipelineComponent>();
    pipeline.addComponent(new XMLGenerator(xmlInput));

    // xsltInput is my URL to my XSLT file
    pipeline.addComponent(new XSLTTransformer(xsltInput));
    pipeline.addComponent(new FopSerializer());
...
    File invoice = // …create temp file
...
    FileOutputStream fos = new FileOutputStream(invoice);
    pipeline.setup(fos);
    pipeline.execute();
    fos.flush();
...
    return new FileInputStream(invoice);
}

For the sake of readability I have left out try/catch blocks and other good practices. Basically, what you can see here is how I create a NonCachingPipeline. You could say this is a Pipeline without any tricks or treats. It is basic. The generic type parameter SAXPipelineComponent means that we can only put SAXPipelineComponent instances into this Pipeline. Good that we have a bunch of them!

Namely:

  • XMLGenerator
  • XSLTTransformer
  • FopSerializer

The XMLGenerator generates SAX events. As XSLTTransformer is of type SAXConsumer, it consumes the generated SAX events. Cocoon 3 checks for you that such a consumer follows and wires the components together, so you do not have to deal with the gory details. Finally I have added a serializer. The FOP serializer (surprise) creates the output stream from the SAX events (again, it implements the SAXConsumer interface).
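If you want to see the generator, transformer, serializer idea without Cocoon, it can be roughly sketched with plain JAXP from the JDK. This is only an analogy under my own assumptions, not how Cocoon 3 works internally: a SAX parser plays the generator, a TransformerHandler plays the XSLT transformer, and a StreamResult plays the serializer. There is no FOP step here, so the output is whatever the stylesheet produces. SaxChainSketch and transform are made-up names.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;

import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

/** Hypothetical sketch: a three-stage SAX chain built from JDK classes only. */
final class SaxChainSketch {

    static String transform(String xml, String xslt) throws Exception {
        // "Transformer" stage: an XSLT handler which consumes SAX events.
        // Assumption: the default factory supports SAX (true for the JDK's built-in one).
        SAXTransformerFactory factory =
                (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler handler =
                factory.newTransformerHandler(new StreamSource(new StringReader(xslt)));

        // "Serializer" stage: the transformation result is written to a stream.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        handler.setResult(new StreamResult(out));

        // "Generator" stage: a namespace-aware parser emits the SAX events.
        SAXParserFactory parserFactory = SAXParserFactory.newInstance();
        parserFactory.setNamespaceAware(true);
        XMLReader reader = parserFactory.newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(xml)));

        return out.toString("UTF-8");
    }
}
```

The wiring you have to do by hand here (setContentHandler, setResult) is the part that the Cocoon 3 pipeline takes over when you call addComponent.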

Now the fun can begin. Call setup with an OutputStream. With execute() you start the execution of the pipeline. Of course you can keep this stream in memory; you don’t need to write it to disk (as I do, for several reasons). As you can see, Cocoon 3 is all about streaming. This is pretty fast and easy to develop. Together with the stream result type of Struts, this was implemented in a pretty short time. And the good thing is, I was even able to write JUnit tests without any pain. Keep an eye on the development of Cocoon 3, as I have heard it aims to become a player especially in the REST world.
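The in-memory variant mentioned above could look roughly like this. It is a sketch under assumptions, not the code I actually use: InMemoryStreamSketch and StreamWriter are made-up names, and StreamWriter merely stands in for "something that writes its result to an OutputStream", so the example runs without Cocoon on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical sketch: buffer generated output in memory instead of a temp file. */
final class InMemoryStreamSketch {

    /** Stand-in for anything that writes its result to a stream, e.g. a pipeline run. */
    interface StreamWriter {
        void writeTo(OutputStream out) throws Exception;
    }

    /** Runs the writer against an in-memory buffer and exposes the bytes for reading. */
    static InputStream toInputStream(StreamWriter writer) throws Exception {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        writer.writeTo(buffer);
        // The complete result now lives on the heap; no file cleanup is needed.
        return new ByteArrayInputStream(buffer.toByteArray());
    }
}
```

With the pipeline from generateInvoice, the writer body would presumably just be pipeline.setup(out) followed by pipeline.execute(). The trade-off is that the whole PDF sits in memory, which is fine for a short invoice but less so for large documents.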

Tags: #Apache Cocoon #Java #Open Source