The Twelve Factor App Review - part 1

There is a site I've seen around lately called The Twelve Factor App which lists 12 rules for building a web application.

The goals are noble: to build scalable, portable, easily deployable web applications, while being language and framework agnostic. Unfortunately I found several of the points to run against my experience and personal opinion, so I wanted to go over the 12 tenets they recommend and offer my thoughts on each. Please keep in mind that much of my experience is on ATG, J2EE, and large complex applications.

1) Codebase

Overall this is pretty normal, but in some cases I don't completely agree with the idea that shared code can only be handled through independently built and packaged libraries included via some sort of dependency manager. I haven't really used a good J2EE-level dependency manager. Perhaps Maven makes this easy?

2) Dependencies

I have several issues with this "factor". Firstly, Java doesn't have a packaging system for distributing libraries and common app bundles. Next, I'm not sure what a "dependency declaration manifest" actually DOES at the app level or what "dependency isolation tools" I should be using. ATG has module dependencies defined within the MANIFEST.MF files, which is great, but that isn't something you can use to handle dependencies outside the EAR (JBoss, JDBC drivers, native apps, etc...).

Further, they expect the app to provide/package ALL system tools it may need. The examples they use are ImageMagick and curl. This is crazy for many reasons. First, many of these tools are different on each platform. Packaging, building, and installing them on each platform is a massive effort, not something easily bundled into your app, and your Java/Ruby/PHP developers shouldn't have to deal with multi-platform C++ build issues. Second, most platforms have their own package installation and dependency management system (yum, apt-get, etc...) which provides supported, platform-specific versions. Those versions not only work, but may also be required for vendor support, on Enterprise Linux distros for instance.

Also: where does it end? It's turtles all the way down. ImageMagick on RHEL 5, installed via yum, has 148 dependencies. Curl has 310. A J2EE application, in EAR form, may depend on: JBoss 5.1 EAP, JDK 1.6.0_27 (the correct version for the platform and 32 or 64 bit), JDK extended crypto policy files, Oracle JDBC driver jar file, various JBoss server config overrides, curl, NFS mounts, ImageMagick, mplayer, Apache front end configured with SSL certs, proxy configs, etc... It's insane to say all that needs to be handled within your EAR deployment mechanism.

3) Config

This starts off making sense: keep your configs separate from your code. Don't hardcode configs into your source code. But then they say you should use environment variables to handle all of your configs.... They also say you shouldn't "group" config values based on the environment (dev, stage, prod, etc...). Both ATG and JBoss Seam have very sensible configuration setups with config files, environment level groupings for environment specific overrides, etc... It works great. Much better than trying to deal with env variables...
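To make the comparison concrete, here is a minimal sketch of the layered approach in plain Java. It is not how ATG or Seam actually implement their config layers. The class name `LayeredConfig`, the keys `db.url` and `pool.size`, the JDBC URLs, and the `DB_URL` variable are all made up for illustration. The `fromEnv` method shows the flat environment-variable style for contrast.

```java
import java.util.Map;
import java.util.Properties;

// Hypothetical helper for illustration; not part of ATG or JBoss Seam.
final class LayeredConfig {
    private final Properties merged = new Properties();

    // Layers are applied in order, so later layers override earlier ones.
    public LayeredConfig(Properties... layers) {
        for (Properties layer : layers) {
            merged.putAll(layer);
        }
    }

    public String get(String key) {
        return merged.getProperty(key);
    }

    // The environment-variable style: one flat lookup per value.
    public static String fromEnv(Map<String, String> env, String name, String fallback) {
        String value = env.get(name);
        return value != null ? value : fallback;
    }

    public static void main(String[] args) {
        // Base layer: defaults shared by every environment (example values).
        Properties base = new Properties();
        base.setProperty("db.url", "jdbc:oracle:thin:@localhost:1521:dev");
        base.setProperty("pool.size", "10");

        // Environment layer: only the values that differ in production.
        Properties prod = new Properties();
        prod.setProperty("db.url", "jdbc:oracle:thin:@proddb:1521:prod");

        LayeredConfig config = new LayeredConfig(base, prod);
        System.out.println(config.get("db.url"));    // taken from the prod layer
        System.out.println(config.get("pool.size")); // inherited from the base layer
        System.out.println(fromEnv(System.getenv(), "DB_URL", "(unset)"));
    }
}
```

With layers, the production file only has to list what differs from the defaults. With environment variables, every value has to be set separately on every host.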

4) Backing Services

No argument here. Nothing that shouldn't already be common sense either.

5) Build, release, run

No argument here. Nothing that shouldn't already be common sense either.

6) Processes

Heading into the deep end again :) They push for stateless, share-nothing processes that rely on a database (or similar service) for all persistent or shared data/state. That's okay for very simple or very low traffic applications, but in my experience more complex and/or higher traffic web apps benefit hugely from having sticky sessions with local RAM based session state.

A user's session on a serious eCommerce application contains a lot of data with somewhat complex relationships: basic user profile data (username, name, gender), extended user profile data (shopping preferences, shipping address(es), shopping history), current cart data (which could be as simple as a list of SKUs, or could be full copies of SKU data to avoid catalog data changes from impacting cart items, with cloned pricing to avoid pricing changes impacting the cart, coupons, promotions, shipping calculations, shipping methods, tax lookup data, and more), session history data (as simple as breadcrumbs or as complex as browsing history data being used to calculate recommendations, promotions, cross-sell, and up-sell), and checkout flow state and related data.

They seem to recommend that at the end of every page request, this complex relationship of Objects (Java components or beans or whatever you use) should be serialized or mapped back to database backing tables (manually or via an ORM), persisted in a series of inserts and updates (the most expensive type of operations) to a remote database across the network, and those objects should be cleaned up/purged/garbage collected. Then at the start of every page request, a cookie identifier should be used to find all of that data again, probably with multiple separate sequential lookups, select it out of the database, parse it back into newly created Objects, and bind them back together based on their relationships, before you can service the request.
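The round trip described above can be sketched roughly as follows. This is an illustration under heavy assumptions, not anything the Twelve Factor site prescribes: `StatelessSessionStore` and `Cart` are hypothetical names, Java serialization stands in for whatever mapping or ORM you would really use, and an in-memory map stands in for the remote database, so none of the network cost shows up here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; the map below stands in for a remote database.
final class StatelessSessionStore {
    // Tiny stand-in for the much larger session object graph.
    static final class Cart implements Serializable {
        private static final long serialVersionUID = 1L;
        final List<String> skus = new ArrayList<>();
    }

    private final Map<String, byte[]> fakeDatabase = new HashMap<>();

    // End of a request: flatten the object graph and write it out.
    public void save(String sessionId, Cart cart) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(cart);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        fakeDatabase.put(sessionId, bytes.toByteArray());
    }

    // Start of the next request: read it back and rebuild brand new objects.
    public Cart load(String sessionId) {
        byte[] stored = fakeDatabase.get(sessionId);
        if (stored == null) {
            return new Cart(); // no saved state yet for this session
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(stored))) {
            return (Cart) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        StatelessSessionStore store = new StatelessSessionStore();
        Cart cart = new Cart();
        cart.skus.add("SKU-1"); // example value
        store.save("session-abc", cart);
        Cart rebuilt = store.load("session-abc");
        System.out.println(rebuilt.skus + " sameObject=" + (rebuilt == cart));
    }
}
```

Every request pays for one `save` and one `load`, and the objects that come back are always fresh copies, never the ones you built on the previous request.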

This is CPU, memory, and network intensive. You'd be creating massive additional load on your database and dramatically increasing your app server CPU utilization and increasing the request response time. It's MUCH easier to build session state data once, keep it around for the life of the session, and just use intelligent sticky session routing on your load balancer/proxy.
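For what "sticky" means in practice, here is a minimal sketch of the routing idea: the same session ID always lands on the same app server, so that server can keep the session in RAM. `StickyRouter` and the server names are hypothetical. Real load balancers typically do this with a cookie or a routing table rather than a plain hash, and with this simple version, changing the backend list would move existing sessions to different servers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of hash-based session affinity.
final class StickyRouter {
    private final List<String> backends;

    public StickyRouter(List<String> backends) {
        if (backends.isEmpty()) {
            throw new IllegalArgumentException("need at least one backend");
        }
        this.backends = new ArrayList<>(backends); // defensive copy
    }

    // The same session ID always maps to the same backend.
    public String route(String sessionId) {
        // floorMod keeps the index non-negative even for negative hash codes.
        int index = Math.floorMod(sessionId.hashCode(), backends.size());
        return backends.get(index);
    }

    public static void main(String[] args) {
        // Example server names only.
        StickyRouter router = new StickyRouter(Arrays.asList("app1", "app2", "app3"));
        String first = router.route("session-abc");
        String second = router.route("session-abc");
        System.out.println(first + " " + first.equals(second));
    }
}
```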

It's late, so I'll address the last 6 tenets in another post! Thanks for reading!
