The Twelve Factor App Review – part 1

There is a site I’ve seen around lately called The Twelve Factor App which lists 12 rules for building a web application.

The goals are noble: to build scalable, portable, easily deployable web applications, while being language and framework agnostic. Unfortunately I found several of the points to run against my experience and personal opinion, so I wanted to go over the 12 tenets they recommend and offer my thoughts on each.  Please keep in mind that much of my experience is on ATG, J2EE, and large complex applications.

1) Codebase

Overall this is pretty normal, but I don’t completely agree that shared code can only be handled through independently built and packaged libraries pulled in via some sort of dependency manager.  I haven’t really used a good J2EE level dependency manager, perhaps Maven makes this easy?

2) Dependencies

I have several issues with this “factor”.  Firstly, Java doesn’t have a packaging system for distributing libraries and common app bundles.  Next, I’m not sure what a “dependency declaration manifest” actually DOES at the app level or what “dependency isolation tools” I should be using.  ATG has module dependencies defined within the MANIFEST.MF files which is great, but isn’t something you can use to handle extra-ear dependencies (JBoss, JDBC drivers, native apps, etc…).
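
To make the idea of a module-level dependency declaration concrete, here is a minimal sketch of what a tool could do with one: read the declared modules out of a MANIFEST.MF and fail fast if any are not installed.  The attribute name `ATG-Required`, the module names, and the class `ManifestDependencies` are assumptions made for illustration; this is not ATG’s actual loader.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.jar.Manifest;

/** Hypothetical helper; not part of ATG. */
final class ManifestDependencies {

    /** Returns the modules listed in the (assumed) ATG-Required main attribute. */
    static List<String> requiredModules(String manifestText) {
        try {
            Manifest mf = new Manifest(
                    new ByteArrayInputStream(manifestText.getBytes(StandardCharsets.UTF_8)));
            String value = mf.getMainAttributes().getValue("ATG-Required");
            if (value == null || value.trim().isEmpty()) {
                return Collections.emptyList();
            }
            // Module names are assumed to be whitespace separated.
            return Arrays.asList(value.trim().split("\\s+"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the declared modules that are not in the installed set. */
    static List<String> missing(List<String> required, Set<String> installed) {
        List<String> result = new ArrayList<>();
        for (String module : required) {
            if (!installed.contains(module)) {
                result.add(module);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up manifest content; note the trailing newline the format requires.
        String manifest = "Manifest-Version: 1.0\n"
                + "ATG-Required: DAS DPS MyStore.Commerce\n";
        Set<String> installed = new HashSet<>(Arrays.asList("DAS", "DPS"));
        System.out.println("Missing modules: "
                + missing(requiredModules(manifest), installed));
    }
}
```

Note that this only covers modules inside the EAR.  Nothing in it can say anything about JBoss, JDBC drivers, or native tools, which is exactly the gap described above.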

Further they expect the app to provide/package ALL system tools it may need.  The examples they use are ImageMagick and curl.  This is crazy for many reasons.  First, many of these tools are different on each platform; packaging, building, and installing them on each platform is a massive effort and not something easily bundled into your app, nor should your Java/Ruby/PHP developers have to deal with multi-platform C++ build issues.  Second, most platforms have their own package installation and dependency management system (yum, apt-get, etc…) which provides supported, platform-specific versions that not only will work, but may also be required to stay within vendor support on Enterprise Linux distros, for instance.

Also: where does it end?  It’s turtles all the way down.  ImageMagick on RHEL 5, installed via  yum, has 148 dependencies.  Curl has 310.  A J2EE application, in EAR form, may depend on: JBoss 5.1 EAP, JDK 1.6.0_27 (the correct version for the platform and 32 or 64 bit), JDK extended crypto policy files, Oracle JDBC driver jar file, various JBoss server config overrides, curl, NFS mounts, ImageMagick, mplayer, Apache front end configured with SSL certs, proxy configs, etc…  It’s insane to say all that needs to be handled within your EAR deployment mechanism.

3) Config

This starts off making sense: keep your configs separate from your code.  Don’t hardcode configs into your source code.  But then they say you should use environment variables to handle all of your configs….  They also say you shouldn’t “group” config values based on the environment (dev, stage, prod, etc…).  Both ATG and JBoss Seam have very sensible configuration setups with config files, environment level groupings for environment specific overrides, etc…  It works great.  Much better than trying to deal with env variables…
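
As a rough illustration of the layered approach, here is a minimal sketch: base defaults, then an environment-specific override file, and, for comparison, a twelve-factor style environment-variable layer on top.  The class `LayeredConfig`, the property keys, and the `APP_` naming convention are all made up for this example; it is not how ATG or Seam actually implement their configuration layers.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Locale;
import java.util.Map;
import java.util.Properties;

/** Hypothetical illustration of layered configuration. */
final class LayeredConfig {

    /** Parses properties-file text; a real app would read one file per layer. */
    static Properties parse(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return p;
    }

    /** Merges the layers in order, so later layers win. */
    static Properties resolve(Properties base, Properties envOverrides,
                              Map<String, String> envVars) {
        Properties merged = new Properties();
        merged.putAll(base);
        merged.putAll(envOverrides);
        // Environment-variable layer: APP_DB_URL overrides db.url
        // (this naming convention is an assumption for the sketch).
        for (String key : merged.stringPropertyNames()) {
            String envName = "APP_" + key.toUpperCase(Locale.ROOT).replace('.', '_');
            String value = envVars.get(envName);
            if (value != null) {
                merged.setProperty(key, value);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Properties base = parse("db.url=jdbc:oracle:thin:@localhost:1521:dev\n"
                + "db.pool.size=5\n");
        Properties prod = parse("db.pool.size=50\n");
        Properties cfg = resolve(base, prod, System.getenv());
        System.out.println("db.url = " + cfg.getProperty("db.url"));
        System.out.println("db.pool.size = " + cfg.getProperty("db.pool.size"));
    }
}
```

With file layers, the whole “prod” override set lives in one reviewable place; with the env-variable layer, each value has to be set individually on every host.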

4) Backing Services

No argument here.  Nothing that shouldn’t already be common sense either.

5) Build, release, run

No argument here.  Nothing that shouldn’t already be common sense either.

6) Processes

Heading into the deep end again:)  They push for stateless, share-nothing processes that rely on a database (or similar service) for all persistent or shared data/state.  That’s okay for very simple or very low traffic applications, but in my experience more complex and/or higher traffic web apps benefit hugely from having sticky sessions with local RAM based session state.
A user’s session on a serious eCommerce application contains a lot of data with somewhat complex relationships: basic user profile data (username, name, gender), extended user profile data (shopping preferences, shipping address(es), shopping history), current cart data (which could be as simple as a list of SKUs, or could be full copies of SKU data to avoid catalog data changes from impacting cart items, with cloned pricing to avoid pricing changes impacting the cart, coupons, promotions, shipping calculations, shipping methods, tax lookup data, and more), session history data (as simple as breadcrumbs or as complex as browsing history data being used to calculate recommendations, promotions, cross-sell, and up-sell), and checkout flow state and related data.

They seem to recommend that at the end of every page request, this complex relationship of Objects (Java components or beans or whatever you use) should be serialized or mapped back to database backing tables (manually or via an ORM), persisted in a series of inserts and updates (the most expensive type of operations) to a remote database across the network, and those objects should be cleaned up/purged/garbage collected.  Then at the start of every page request, a cookie identifier should be used to identify all of that data again, probably with multiple separate sequential lookups; you select it out of the database, parse it back into newly created Objects, and bind those Objects together based on their relationships, before you can service the request.

This is CPU, memory, and network intensive.  You’d be creating massive additional load on your database and dramatically increasing your app server CPU utilization and increasing the request response time.  It’s MUCH easier to build session state data once, keep it around for the life of the session, and just use intelligent sticky session routing on your load balancer/proxy.
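
To show the difference in per-request work, here is a minimal sketch contrasting the two approaches.  All names (`SessionStores`, `SessionState`, `InMemoryStore`, `ExternalStore`) are hypothetical, the session payload is far smaller than a real cart, and a byte-array map stands in for the remote database, so this illustrates the shape of the work rather than measuring it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class SessionStores {

    /** Hypothetical session payload. */
    static final class SessionState implements Serializable {
        private static final long serialVersionUID = 1L;
        final String username;
        final List<String> cartSkus = new ArrayList<>();
        SessionState(String username) { this.username = username; }
    }

    /** Sticky-session style: the object graph stays on this server's heap. */
    static final class InMemoryStore {
        private final Map<String, SessionState> sessions = new ConcurrentHashMap<>();
        void save(String id, SessionState s) { sessions.put(id, s); }
        SessionState load(String id) { return sessions.get(id); }
    }

    /** Share-nothing style: serialize on every save, rebuild on every load. */
    static final class ExternalStore {
        // Stand-in for a remote database table keyed by session id.
        private final Map<String, byte[]> rows = new ConcurrentHashMap<>();

        void save(String id, SessionState s) {
            try (ByteArrayOutputStream buf = new ByteArrayOutputStream();
                 ObjectOutputStream out = new ObjectOutputStream(buf)) {
                out.writeObject(s);
                out.flush();
                rows.put(id, buf.toByteArray());
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        SessionState load(String id) {
            byte[] row = rows.get(id);
            if (row == null) {
                return null;
            }
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(row))) {
                return (SessionState) in.readObject();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            } catch (ClassNotFoundException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        SessionState state = new SessionState("jdoe");
        state.cartSkus.add("sku-1001");

        InMemoryStore local = new InMemoryStore();
        local.save("abc", state);
        // Same object comes back; nothing is copied or rebuilt.
        System.out.println("in-memory returns same instance: " + (local.load("abc") == state));

        ExternalStore remote = new ExternalStore();
        remote.save("abc", state);
        // A fresh copy is rebuilt from bytes on every load.
        System.out.println("external returns same instance: " + (remote.load("abc") == state));
    }
}
```

In a real deployment the external store also pays a network round trip on each `save` and `load`, which this sketch leaves out.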

It’s late, so I’ll address the last 6 tenets in another post!  Thanks for reading!

7 thoughts on “The Twelve Factor App Review – part 1”

  1. We are currently using Maven to build and package a fairly big ATG application (more than 1500 Java source files). It works well now, but it took at least a couple of months for a very experienced engineer to get it to a state where it could reliably produce dependencies in ATG’s MANIFEST.MF files from Maven’s pom.xml files: all of this was done by creating a new Maven plugin and a new Maven archetype. Also, if you start using Maven, put a repository manager (such as Artifactory) in front of it to make your builds less dependent on Internet connectivity and repository servers all over the world.

    I agree with your observation about the app needing to bundle all of its dependencies; this just doesn’t apply to our world, where we don’t control the deployment environment. However, I’ve recently started to package JBoss and ATG as RPM packages, and being able to declare dependencies in the spec file helps a lot in achieving a reliable and repeatable installation.

    • Good to know about Maven. I’ve just never worked on any projects where people used it, so I don’t have any real hands-on experience with it. Do you find it’s easier to have Maven deal with the MANIFEST.MF files and other dependencies, versus just editing those files by hand when needed in SVN?

      I agree that if you’re able to manage an RPM repo, and your deployment environments are standardized around a specific Linux distro, then that can help immensely with setting up new environments. You can also use tools like Puppet and Chef in that arena. I just think that’s outside the core application developer’s area of responsibility.

      • I find dependency management to be Maven’s most useful feature. If I need to add a new dependency to a project, I can simply add the entry to the pom.xml and it is immediately available to everyone working on the project and to the continuous integration server, without having to add binaries to the source repo. Like Sebastiano, we have a simple custom Maven plugin that reads the dependencies from the pom and updates the ATG-Class-Path manifest entry accordingly.

      • From a developer point of view, editing Maven’s pom.xml files is not much different from editing MANIFEST.MF files. Thankfully adding dependencies is still something that occurs in the vast majority of cases in the first months of a project and less and less as the project matures.

        • I guess my question, not having any hands-on Maven experience, is what does Maven do better than just a “normal” ANT/Jenkins/ATG Manifest file based build system that’s worth the time and effort it seems to have taken to get working smoothly?

          Also would either of you be able to release the Maven customizations or would you be interested in writing up a guest blog post or something covering how you use Maven for ATG apps, etc…? I know I’d love to learn more about it, and I’m sure many others would too!

          • Sorry for the belated reply.

            I have appreciated the convention that Maven imposes on a project by forcing the layout of the sources (src/main/java, etc.) and of the generated classes (target/classes, target/test-classes). I have also appreciated being able to declare dependencies on jar files or on other ATG modules and having Maven generate Eclipse, IntelliJ, and ATG manifest files and also call runAssembler with all the correct modules. The generated EAR can then be uploaded as a Maven artifact to a repository where other tools can make use of it.

            In my opinion the result was worth the effort, and we’re looking forward to extending this way of working to our future ATG projects. The only objection I have is that there’s far too much XML to write, and this is why I’m inclined to try something like buildr, which can interact with Maven repositories and seems less rigid in enforcing its conventions (though picking up buildr and Ruby at the same time is not an easy task).

            I don’t think unfortunately that we will be able to release the Maven customizations, but I’ll talk to the engineer who worked on them and I’ll see if he is available to write up something.

            • Thanks for the info! Would love to see some info or code released so we don’t have to go through the same learning curve from ground zero, if possible.
