Technical Blog

This blog will contain content related to Java, Seam, Security, my sites and projects, as well as other technical subjects I am interested in.

Comments and questions are welcome!

Archive for the ‘ATG’ Category

Make Google Ignore JSESSIONID

Tuesday, February 16th, 2010

Search engines like Google will often index content with params like JSESSIONID and other session or conversation scope params. This causes two problems. First, the links returned in the Google search results can have these parameters in them, resulting in “session not found” or other incompatible session state issues. Second, it can cause a single page of content to be indexed multiple times (with differing parameters), thus diluting your page’s rank.

I’ve posted two solutions to this issue in the past: Using Apache to ReWrite URLs to remove JSESSIONID and a more advanced solution of using a Servlet Filter to avoid adding JSESSIONID for GoogleBot Requests.
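As a rough illustration of what both of those approaches boil down to (this is not the code from either post), here is a minimal Java sketch that strips the session ID from a URL. It assumes the ID appears as a `;jsessionid=` path parameter, which is how servlet containers typically encode it; the class and method names are made up for this example.

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class SessionIdStripper {

    // Matches ";jsessionid=<value>", stopping at the next '?', '#', ';' or end of string.
    private static final Pattern JSESSIONID =
            Pattern.compile(";jsessionid=[^?#;]*", Pattern.CASE_INSENSITIVE);

    /** Returns the URL with any ";jsessionid=..." path parameter removed. */
    static String strip(final String url) {
        return JSESSIONID.matcher(url).replaceAll("");
    }

    public static void main(final String[] args) {
        System.out.println(strip("/store/product.jsp;jsessionid=ABC123?productId=42"));
    }
}
```

Whether this runs in Apache (as a rewrite rule) or in a Servlet Filter, the goal is the same: the bot only ever sees the clean URL.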

Now there’s an even better way to handle this. Google has added an amazing new feature to their Webmaster Tools which allows you to specify how the GoogleBot indexer should handle various parameters. You can ignore certain parameters such as JSESSIONID, cid, and others, and also specifically not ignore other parameters such as productId, skuId, etc…
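To make the effect concrete, here is a small Java sketch of what "ignoring" a parameter means for indexing: URLs that differ only in ignored parameters collapse to one canonical URL. This only illustrates the idea and says nothing about how Google actually implements it; the class name, method name, and parameter set are assumptions for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, for illustration only.
final class ParameterCanonicalizer {

    /** Returns the URL with every query parameter named in 'ignored' removed. */
    static String canonicalize(final String url, final Set<String> ignored) {
        final int queryStart = url.indexOf('?');
        if (queryStart < 0) {
            return url; // no query string, nothing to do
        }
        final String path = url.substring(0, queryStart);
        final List<String> kept = new ArrayList<>();
        for (final String pair : url.substring(queryStart + 1).split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            final int eq = pair.indexOf('=');
            final String name = eq < 0 ? pair : pair.substring(0, eq);
            if (!ignored.contains(name)) {
                kept.add(pair); // parameters such as productId survive
            }
        }
        return kept.isEmpty() ? path : path + "?" + String.join("&", kept);
    }

    public static void main(final String[] args) {
        final Set<String> ignored = new HashSet<>(Arrays.asList("JSESSIONID", "cid"));
        // Both lines print the same canonical URL.
        System.out.println(canonicalize("/product.jsp?productId=42&cid=7", ignored));
        System.out.println(canonicalize("/product.jsp?cid=9&productId=42", ignored));
    }
}
```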

Log into your Google Webmaster Tools, and select the site you wish to work with. Under “Site Configuration” -> “Settings” there is a new section at the bottom called “Parameter handling”. Click on “adjust parameter settings” to expand the parameter handling configuration for your site. Sometimes Google will suggest various parameters it has discovered while crawling your site, and other times you just enter the parameters you want Google to ignore or pay attention to.

Google Webmaster Tools Parameter Handling Interface

This is a much more elegant solution to the JSESSIONID problem, and also allows you to easily handle other parameters your site may use for either session state or dynamic content generation correctly. The only downside is that this only impacts Google, whereas with the correct configuration my older two solutions can handle any Search Engine Bot. Maybe other search providers will or do provide a similar feature.

Why ATG’s Core Based Licensing is Stupid

Tuesday, February 2nd, 2010

ATG, like most enterprise software companies, started by licensing their product based on how many CPUs you ran it on. Back in 1999 this was a pretty fair way to do things. It meant that big companies running a very high traffic site on a big Sun E4500 or E10k paid a lot more than a smaller company running on a pair of E450s. They handled more traffic and hence ideally made more money off of the site, and therefore paid more. Overall it was a decently fair model, and very easy to enforce in the software. Upon startup the software checks to see how many CPUs the server has, compares that against the license file, and only starts if the licensed CPU count matches or exceeds the detected CPU count. Makes sense, right? Most folks were doing the same thing at the time.
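A startup check of this kind can be sketched in a few lines of Java. This is only a guess at the general shape, not ATG's actual license code; the class name and the hard-coded licensed count are placeholders.

```java
// Hypothetical sketch of a CPU-count license check; not ATG's implementation.
final class CpuLicenseCheck {

    /** Returns true when the license covers at least as many CPUs as were detected. */
    static boolean mayStart(final int licensedCpus, final int detectedCpus) {
        return licensedCpus >= detectedCpus;
    }

    public static void main(final String[] args) {
        // Placeholder value; a real product would read this from its license file.
        final int licensedCpus = 2;
        // Number of processors the JVM reports as available on this machine.
        final int detectedCpus = Runtime.getRuntime().availableProcessors();

        if (mayStart(licensedCpus, detectedCpus)) {
            System.out.println("License OK: " + detectedCpus + " detected, " + licensedCpus + " licensed.");
        } else {
            System.out.println("Refusing to start: " + detectedCpus + " detected, only "
                    + licensedCpus + " licensed.");
        }
    }
}
```

Everything hinges on what the detection call counts as a "CPU".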

One aspect of this system is that year over year, as processors got faster and faster (from 250 MHz to 480 MHz for example), you got more power for the same licensing cost. In general this was partially or fully offset by the increasing complexity of the software you were running, but worst case scenario it kept you from LOSING request handling ability over time, and best case scenario you were able to increase your traffic handling ability a bit, as Moore’s law drove clock speeds up.

On the chart above you see the green line which is the number of transistors on a CPU growing just like you’d expect from Moore’s law. This line can be thought to translate roughly to performance, and has been on this trend for over 30 years.

However, the blue line, which is clock speed (MHz or GHz) does something very odd around 2003. It flattens out. What happened?

Processor design changed. Due to some limitations in our current chip technology, going faster and faster (and smaller and smaller) couldn’t keep happening due to some physical and quantum limitations. So instead, companies like Intel and AMD began designing and building CPUs that had multiple “cores”. They went wider instead of faster. Basically each core is like a mini-CPU that, if your software supports it, means you can get more work done per second without having to have a faster clock speed.

Instead of a CPU with a single 3.4 GHz core, we have a CPU with four 2.66 GHz cores. Now keep in mind that the actual useful performance of the CPU kept climbing as it has for the last 30 years. It’s just that instead of faster clock speeds, we moved into multiple cores.

The problem is that software reports each “core” as a CPU. That means a server with a single quad-core CPU appears to be a 4 CPU server. That means CPU/core based licensing now costs FOUR TIMES AS MUCH as it did last year for the same request handling ability. We’re not talking hundreds of dollars here. We’re talking hundreds of thousands or millions of dollars for each customer.

To make matters worse, the latest generation of CPUs, Intel’s Nehalems, uses something called HyperThreading to make each core do more work. The upside is this generation of chips performs better than the old ones, as it should. The downside is that they now report twice as many cores as they actually have, due to the HyperThreading. A quad core single CPU now reports as 8 CPUs. You can disable HyperThreading, but that actually introduces a 20-40% performance penalty (depending on how you benchmark it, etc…), so in most cases you’re actually getting LESS performance than you did from last year’s chips. At that point you can either cough up another $500,000 in license costs, or have your brand new server be slower than your old one. Great options.

Fortunately most companies saw this issue when it first reared its ugly head back in 2003, and have moved to socket based licensing. They are basically licensing the same way they always have, just redefining the CPU as the “socket” or the physical spot on the motherboard the CPU plugs into, and getting away from things like cores, hyperthreading, and that whole mess. Customers of companies which made that change (such as Oracle, JBoss, and many others) essentially end up paying the same as they always have, and everyone goes home happy. The licensing cost/performance curve for those folks has stayed pretty stable over the past 10-15 years.

Unfortunately ATG has not changed their licensing at all. This means that ATG customers are paying 4-8 times as much for licenses as they would have in 2002. And it’s only getting worse. Processor design is continuing to go wider not faster, and ATG customers will continue to be massively penalized by this CPU architecture trend.

I’ve spoken to many people at ATG, and the response is generally the same: “We understand what you’re saying, we are aware of CPU architecture changes. But changing our licensing is a big deal and takes time to do right.” Okay, I buy that. You’ve had SEVEN years so far! This has been a growing issue since ~2003 and one that pretty much all the other players in the space have handled since then.

I posted about this almost two years ago in my Rant About Core Based Licensing, but unfortunately nothing has changed on the ATG front.

It’s getting harder and harder to get dual core CPU servers, and pretty soon you won’t be able to get anything smaller than a Nehalem quad with HyperThreading. This means that out of the box, if you want two small servers (for redundancy) you will need 16 cores of ATG Commerce licensing. That’s millions of dollars. If you disable HyperThreading, and take the 20%+ performance penalty, you “only” need 8 cores of ATG Commerce licensing. That’s still probably close to a million dollars (I don’t have actual costs handy). Not only is ATG penalizing all of their existing customers, but they’re really forcing themselves out of the mid-market they are trying to target.
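The arithmetic behind those numbers is simple enough to write down. The following Java sketch just multiplies out the counts for the two-server example above, assuming HyperThreading doubles the reported CPUs per core; the class and method names are made up for illustration, and no prices are included since I don’t have them.

```java
// Illustrative arithmetic only; names are hypothetical.
final class LicenseMath {

    /** Licenses needed when every reported (logical) CPU must be licensed. */
    static int coreLicenses(final int servers, final int socketsPerServer,
            final int coresPerSocket, final boolean hyperThreading) {
        final int reportedPerCore = hyperThreading ? 2 : 1;
        return servers * socketsPerServer * coresPerSocket * reportedPerCore;
    }

    /** Licenses needed when only physical sockets are counted. */
    static int socketLicenses(final int servers, final int socketsPerServer) {
        return servers * socketsPerServer;
    }

    public static void main(final String[] args) {
        // Two single-socket quad core Nehalem servers.
        System.out.println("Core based, HyperThreading on:  " + coreLicenses(2, 1, 4, true));
        System.out.println("Core based, HyperThreading off: " + coreLicenses(2, 1, 4, false));
        System.out.println("Socket based:                   " + socketLicenses(2, 1));
    }
}
```

Same two boxes, and the license count ranges from 2 to 16 depending purely on what the vendor decides to count.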

The ATG “starter” bundles are becoming impossible to implement due to this as well. “Two cores of commerce” means you can run a single server, which doesn’t offer any redundancy. “Four cores of commerce” means that even if you can manage to find new servers that still have a single dual core proc available, you’re limited to really old and slow chips. For instance, looking at available single processor servers from one major hosting provider, the “best” dual core you can get is a Xeon 3060 dual core 2.4 GHz with a 4 MB cache and 667 MHz RAM bus speed. The best single processor available is a Nehalem 5570 with a quad 2.93 GHz HyperThreaded chip with 8 MB caches and 1333 MHz RAM bus speed. Real world I’d expect the Nehalem to deliver at least four times the request handling ability of the 3060, if not more. If you’re using Oracle, JBoss, or almost any other piece of enterprise commercial software out there, you, the customer, can leverage the best hardware and get more bang for your license buck. You can upgrade and quadruple your real world performance for free (like you’ve been able to do for years and years). If you’re on ATG, the modern server will quadruple your price instead.

So if you’re an ATG customer, ATG partner, or ATG employee, be aware of this issue, and try to get ATG to adopt socket based pricing. Thanks. Exponentially increasing software costs hurt the customers in the short term, and will hurt ATG in the long term.

Terrible Code

Monday, November 2nd, 2009

request.setParameter("qualifySkus", getSkusRepository(d, cItem));

  1. “qualifySkus” is confusing. Is it an array/list/collection of “qualifiedSKUs” or a flag that’s a result of “qualifyingSkus” or….
  2. “qualifySkus” should be a constant with a nice comment, not an in-line String.
  3. The method getSkusRepository seems like it would return a catalog repository, doesn’t it? Instead it takes in a List of String SkuIds, loads up the corresponding SKU RepositoryItems, removes any that have the property “isLive” set to false, and removes any that have a current inventory stock level of zero. It then returns an ArrayList of those filtered SKU RepositoryItems. Perhaps a better name might be “getLiveInStockSKUs”?
  4. What on earth is “d”? Even looking at the full code of this class, it’s very difficult to tell what d is meant to contain. It’s actually a List of Strings of SkuIds that are qualifying skus for a given promo. “qualifiedSkus” would be a better name.
  5. cItem is a commerce item. However it’s not actually used by the getSkusRepository method at all. There’s no reason to pass it in.
  6. This line is in an ATG droplet and shoves the result of the getSkusRepository method into a request param before servicing an oparam. However, as you can see, it doesn’t inspect the output of the method. As I explained above, the method actually filters a list of SKUs based on isLive and current inventory state. It’s very possible that there will be no live and in-stock SKUs, and the param’s value will be null or an empty list. In that case, we’d actually want to render a different oparam, which is defined and called elsewhere, but not here. Validate your output!

That’s six issues in one line. Please don’t write code like this.
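For contrast, here is a rough sketch of the shape I’d rather see. It is not the real droplet: to keep it self-contained it uses a stand-in `Sku` class and a `Map` instead of ATG RepositoryItems and a repository lookup, and the param and oparam names (`qualifiedSkus`, `output`, `empty`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative only; stand-in types replace the ATG repository and request APIs.
final class QualifiedSkuExample {

    /** Request param holding the live, in-stock SKUs that qualify for the promo. */
    static final String QUALIFIED_SKUS_PARAM = "qualifiedSkus";

    /** Hypothetical oparam rendered when at least one SKU survives filtering. */
    static final String OUTPUT_OPARAM = "output";

    /** Hypothetical oparam rendered when no SKU survives filtering. */
    static final String EMPTY_OPARAM = "empty";

    /** Stand-in for a SKU RepositoryItem. */
    static final class Sku {
        final String id;
        final boolean live;
        final long stockLevel;

        Sku(final String id, final boolean live, final long stockLevel) {
            this.id = id;
            this.live = live;
            this.stockLevel = stockLevel;
        }
    }

    /** Loads the given SKU ids and keeps only those that are live and in stock. */
    static List<Sku> getLiveInStockSkus(final List<String> qualifiedSkuIds,
            final Map<String, Sku> catalog) {
        final List<Sku> liveInStock = new ArrayList<>();
        for (final String skuId : qualifiedSkuIds) {
            final Sku sku = catalog.get(skuId);
            if (sku != null && sku.live && sku.stockLevel > 0) {
                liveInStock.add(sku);
            }
        }
        return liveInStock;
    }

    /** Inspects the filtered list and picks which oparam should be serviced. */
    static String chooseOparam(final List<Sku> liveInStockSkus) {
        return liveInStockSkus.isEmpty() ? EMPTY_OPARAM : OUTPUT_OPARAM;
    }
}
```

The names say what the data is, the unused argument is gone, the param name lives in a commented constant, and the result is checked before anything is rendered.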

Flush the Cache Droplet Upon CA Deployments

Monday, October 12th, 2009

Hopefully you’re already using the ATG Cache Droplet extensively in your ATG eCommerce application, as I recommended in my ATG Performance Tuning post on Improving JSP Serving Time for an ATG Application.  If you are, you’re probably using a smaller value for the cacheCheckSeconds parameter than you’d like, in order to prevent stale data after CA deployments update the catalog, media, or promo repositories.

You can solve this problem by using a component triggered by Deployment Events from the DeploymentAgent, which after a successful deployment flushes the Cache Droplet’s cache.  This should allow you to set a very long cache expiration time using the cacheCheckSeconds param, and not have to worry about displaying outdated data.

I’ve added this code into the open source ATG eCommerce framework Foundation, hosted by Spark::red, the best ATG Hosting Company :)

There are three parts to this solution: the Java class, its properties file, and adding it to the DeploymentAgent’s list of event listeners.

DeploymentEventCacheDropletInvalidator.java

/**
 * Copyright 2009 Devon Hillard (devon@digitalsanctuary.com)
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *       http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 *
 */
package org.foundation.deployment;

import atg.deployment.common.event.DeploymentEvent;
import atg.deployment.common.event.DeploymentEventListener;
import atg.droplet.Cache;
import atg.nucleus.GenericService;
import atg.nucleus.ServiceException;

/**
 * The Class DeploymentEventCacheDropletInvalidator will flush the Cache droplet's cache upon a successful CA
 * deployment.
 *
 * @author Devon Hillard
 */
public class DeploymentEventCacheDropletInvalidator extends GenericService implements DeploymentEventListener {

    /** The Cache Droplet. */
    private Cache mCacheDroplet;

    /** The event state. */
    private int mEventState;

    /** Whether this listener should flush the cache when the configured event fires. */
    private boolean mActive;

    /**
     * Deployment event handling method. This will flush the Cache droplet's cache after a CA deployment completes.
     *
     * @param pEvent the deployment event
     *
     * @see atg.deployment.common.event.DeploymentEventListener#deploymentEvent(atg.deployment.common.event.DeploymentEvent)
     */
    public void deploymentEvent(final DeploymentEvent pEvent) {
        if (isActive() && (pEvent.getNewState() == getEventState())) {
            if (isLoggingInfo()) {
                logInfo("DeploymentEventCacheDropletInvalidator.deploymentEvent:" + "Deployment has completed.");
            }
            getCacheDroplet().flushCache();
            if (isLoggingInfo()) {
                logInfo("DeploymentEventCacheDropletInvalidator.deploymentEvent:"
                        + "Cache droplet cache has been flushed.");
            }
        }
    }

    /**
     * Do start service.
     *
     * @throws ServiceException the service exception
     *
     * @see atg.nucleus.GenericService#doStartService()
     */
    @Override
    public void doStartService() throws ServiceException {
        if (isLoggingInfo()) {
            logInfo("DeploymentEventCacheDropletInvalidator.doStartService:" + "starting up.");
        }
        if (getCacheDroplet() == null) {
            throw new ServiceException("DeploymentEventCacheDropletInvalidator: cache droplet was not set.");
        }
        if (getEventState() == 0) {
            throw new ServiceException("DeploymentEventCacheDropletInvalidator: event state was not set.");
        }
    }

    /**
     * Do stop service.
     *
     * @throws ServiceException the service exception
     *
     * @see atg.nucleus.GenericService#doStopService()
     */
    @Override
    public void doStopService() throws ServiceException {
        if (isLoggingInfo()) {
            logInfo("DeploymentEventCacheDropletInvalidator.doStopService:" + "stopping.");
        }
    }

    /**
     * Gets the Cache Droplet.
     *
     * @return the cacheDroplet
     */
    public Cache getCacheDroplet() {
        return this.mCacheDroplet;
    }

    /**
     * Sets the Cache Droplet.
     *
     * @param pCacheDroplet the cacheDroplet to set
     */
    public void setCacheDroplet(final Cache pCacheDroplet) {
        this.mCacheDroplet = pCacheDroplet;
    }

    /**
     * Gets the event state.
     *
     * @return the event state
     */
    public int getEventState() {
        return this.mEventState;
    }

    /**
     * Sets the event state.
     *
     * @param pEventState the new event state
     */
    public void setEventState(final int pEventState) {
        this.mEventState = pEventState;
    }

    /**
     * Checks if is active.
     *
     * @return true, if is active
     */
    public boolean isActive() {
        return this.mActive;
    }

    /**
     * Sets whether this listener is active.
     *
     * @param pActive true to flush the cache on the configured deployment event
     */
    public void setActive(final boolean pActive) {
        this.mActive = pActive;
    }

}

DeploymentEventCacheDropletInvalidator.properties

$class=org.foundation.deployment.DeploymentEventCacheDropletInvalidator
$scope=global

# If true, this service will invalidate the Cache droplet's cache when the appropriate deployment event is fired.
active=true

# The cache droplet to invalidate.  Defaults to the standard Cache droplet.
cacheDroplet=/atg/dynamo/droplet/Cache

# The deployment event's newState to trigger a cache flush of the Cache droplet.
# IDLE = 1;
# DEPLOYMENT_COMPLETE = 2;
# DEPLOYMENT_DELETED = 7;
# EVENT_INTERRUPT = 6;
# ERROR = 3;
# BEGIN_LOCK = 201;
# DONE_LOCK = 202;
# ERROR_LOCK = 203;
# BEGIN_PREPARE = 301;
# DONE_PREPARE = 302;
# ERROR_PREPARE = 303;
# BEGIN_CREATE = 401;
# DONE_CREATE = 402;
# ERROR_CREATE = 403;
# BEGIN_INSTALL = 501;
# DONE_INSTALL = 502;
# ERROR_INSTALL = 503;
# BEGIN_LOAD = 601;
# DONE_LOAD = 602;
# ERROR_LOAD = 603;
# BEGIN_APPLY = 701;
# BEGIN_APPLY_COMMITTED = 702;
# DONE_APPLY = 703;
# ERROR_APPLY = 704;
# ERROR_APPLY_COMMITTED = 705;
# BEGIN_ACTIVATE = 801;
# DONE_ACTIVATE = 803;
# ERROR_ACTIVATE = 804;
# BEGIN_STOP = 901;
# DONE_STOP = 902;
# ERROR_STOP = 903;
eventState=1

/atg/epub/DeploymentAgent.properties

deploymentEventListeners+=\
	/foundation/deployment/DeploymentEventCacheDropletInvalidator

——————————
Edit: at least in our cluster, the DEPLOYMENT_COMPLETE (2) event is never triggered on the client side by the DeploymentAgent, so I’ve switched this to use the transition to IDLE (1).

ATG Commerce MC Edition Hosting Bundle

Monday, August 24th, 2009

ATG offers a “starter pack” of their eCommerce solution for smaller customers called the MC Edition. ATG Hosting provider Spark::red has just announced a special ATG eCommerce Hosting Bundle designed for customers with the ATG MC Edition eCommerce licenses.

Spark::red is pleased to offer a hosting package designed specifically for ATG MC4 customers. By providing an affordable solution tailored to the licenses and needs of the MC4 customer, and by delivering all of the required Oracle, JBoss, and RedHat licenses and support agreements, this package is the fastest way to launch an ATG Commerce application.

Spark::red is the only hosting provider with a package designed for ATG MC Edition customers. This should make it much easier for mid-market customers to get started with ATG’s eCommerce software.