performance


Improving ATG Performance With a CDN

Why use a CDN?

A Content Delivery Network, or CDN, is essentially a system of geographically distributed web servers which serve static content, typically images, video, and other bandwidth-intensive files. This serves two purposes: it keeps your servers from having to handle those requests, and it serves those files to the end user from a low-latency server closer to the user (network-wise). Both of these aspects improve the user’s perception of page and site performance. CDNs can also be extremely useful for things like streaming video or other very high-bandwidth uses.

How do CDNs work?

CDNs typically work in one of two ways. With some, you have to deploy the files to the CDN manually via FTP or a similar mechanism. Others work as a transparent proxy, automatically loading the files from the source, or origin (your servers), into the CDN as users request them. The latter is preferable because you don’t need to take the CDN into consideration when building your application’s pages and referencing media, which also makes handling non-production environments less complex. It also allows the media to be reloaded from the origin based on cache expiration headers, so you don’t need to do anything special during deployments of new media. However, those CDN solutions also seem to be more expensive, so it’s a trade-off you have to weigh yourself.
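
To make the transparent proxy approach concrete, here is a minimal sketch in Java of an edge cache that pulls a file from the origin on the first request and reloads it once its cache lifetime has passed. It is only a model under stated assumptions, not how any particular CDN is implemented: the class `PullThroughCache`, the `OriginResponse` type, and the one-hour max-age are all hypothetical names and values chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical model of an origin-pull CDN edge; all names here are illustrative. */
final class PullThroughCache {

    /** What the origin returns: the file body and the max-age from its cache headers. */
    static final class OriginResponse {
        final String body;
        final long maxAgeSeconds;

        OriginResponse(final String body, final long maxAgeSeconds) {
            this.body = body;
            this.maxAgeSeconds = maxAgeSeconds;
        }
    }

    /** A cached copy and the time (in seconds) at which it goes stale. */
    private static final class CachedFile {
        final String body;
        final long expiresAtSeconds;

        CachedFile(final String body, final long expiresAtSeconds) {
            this.body = body;
            this.expiresAtSeconds = expiresAtSeconds;
        }
    }

    // Not thread-safe; a single-threaded sketch only.
    private final Map<String, CachedFile> store = new HashMap<String, CachedFile>();
    private final Function<String, OriginResponse> origin;
    private int originFetches;

    PullThroughCache(final Function<String, OriginResponse> origin) {
        this.origin = origin;
    }

    /** Serves from the edge when the copy is fresh, otherwise pulls from the origin. */
    String get(final String path, final long nowSeconds) {
        CachedFile cached = this.store.get(path);
        if (cached == null || nowSeconds >= cached.expiresAtSeconds) {
            final OriginResponse response = this.origin.apply(path);
            this.originFetches++;
            cached = new CachedFile(response.body, nowSeconds + response.maxAgeSeconds);
            this.store.put(path, cached);
        }
        return cached.body;
    }

    int getOriginFetches() {
        return this.originFetches;
    }

    public static void main(final String[] args) {
        // Assumed origin: every file is cacheable for one hour.
        final PullThroughCache edge =
                new PullThroughCache(path -> new OriginResponse("bytes of " + path, 3600L));
        edge.get("/images/logo.gif", 0L);    // miss: pulled from the origin
        edge.get("/images/logo.gif", 1800L); // hit: served from the edge
        edge.get("/images/logo.gif", 3600L); // expired: pulled from the origin again
        System.out.println("origin fetches: " + edge.getOriginFetches()); // prints "origin fetches: 2"
    }
}
```

The point of the sketch is that the application never talks to the cache directly. It only sets cache headers on its responses, and the edge decides when to come back to the origin.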

Roll Your Own Apache Pseudo CDN

You can also roll a pseudo-CDN yourself using Apache. I call it a pseudo-CDN because unlike Akamai and other large providers you don’t get the advantages of hundreds or thousands of geographically distributed servers. You also don’t get lots of fancy math routing users’ requests to the quickest servers based on location, network congestion, and more. What you do get is transparent proxying and off-loading the request handling from your application servers.

This means you don’t have to do anything special or complex when coding your web application and your JSPs to accommodate the CDN. It also frees your application servers from handling requests for static media, large and small, which leaves them more CPU time for the real dynamic processing of your web application.

Apache makes this simple by way of the mod_disk_cache module. I’d recommend avoiding mod_mem_cache. Even though it sounds like it would be the preferred caching mechanism, I have had significant problems with mod_mem_cache and have abandoned it. If you’re using Linux (and you should be), the kernel’s ability to aggressively cache recently accessed files means that when you’re using mod_disk_cache, Apache will cache the files you specify on the local hard drive, and the kernel will use available RAM to keep those files in memory for rapid serving. If you plan on using mod_gzip and mod_disk_cache together, please read my post on the issues encountered using them together.

Improving Secondary Asset Loading Time for an ATG Application

Now that we’ve covered improving the performance of serving the HTML from the JSP, we need to tackle the bigger problem of all of the secondary assets and media that the page loads to display correctly. This includes images, JavaScript, CSS, Flash, videos, etc.

The reason that these secondary page assets are so critical for page performance is two-fold. First, there are many more of them for each page than the single HTML file. This means more HTTP connections have to be opened and closed, and more files have to be transferred from the server to the user’s computer. This takes a lot of time. Second, most of these assets tend to be static; they don’t change very often. This means we can cache them.

Reduce the number of assets

The first thing we need to do is to try to reduce the number of secondary assets which need to be loaded. You can try to simplify the page design to require fewer assets, reduce the number of images used, replace images with text (which is more accessible and search engine friendly anyhow), etc. Reducing the size of large files like videos and Flash files can also make a significant improvement in page load times. Personally, for things other than video players and similar things, I strongly dislike the use of Flash. There is an impressive amount of rich interface and interaction that can be created using DHTML and AJAX. It generally performs better, loads faster, and is easier to make search engine friendly.


Setting Cache Headers from JBoss

Having control over the HTTP response headers allows you to set cache related headers in responses for various content you’d like cached on the browser (or an intermediary proxy). I created the ATG Cache Control DAS pipeline Servlet a year ago, but when you’re using JBoss you need another solution.

Since the DAF pipeline is only executed for JSPs, the pipeline Servlet doesn’t allow you to set headers for the static media items you’re more likely to want to cache. I created a Servlet Filter which allows you to set cache headers in JBoss based on URI patterns. It doesn’t allow the same fine-grained control that the old pipeline Servlet does, but it should work for most situations.

Servlet Filters are very similar to ATG pipeline Servlets in that they are executed within the lifecycle of a request and can read the request and modify the response. The filter I created gets configured from the web.xml and sets the response headers relating to caching. You can configure different instances of the filter for each cache time you need, an hour, a day, or a week, and map different URL patterns to the appropriate instances.

I’ve added the filter into Foundation, the open source ATG e-commerce framework project, which is hosted at the Spark::red Development Community.

The Servlet filter code looks like this:

[java] package org.foundation.servlet.filter;

import java.io.IOException;
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletResponse;

/**
 * The Class CacheHeaderFilter.
 *
 * @author Devon Hillard
 */
public class CacheHeaderFilter implements Filter {

    /**
     * The Constant MILLISECONDS_IN_SECOND.
     */
    private static final int MILLISECONDS_IN_SECOND = 1000;

    /** The Constant POST_CHECK_VALUE. */
    private static final String POST_CHECK_VALUE = "post-check=";

    /** The Constant PRE_CHECK_VALUE. */
    private static final String PRE_CHECK_VALUE = "pre-check=";

    /** The Constant MAX_AGE_VALUE. */
    private static final String MAX_AGE_VALUE = "max-age=";

    /** The Constant ZERO_STRING_VALUE. */
    private static final String ZERO_STRING_VALUE = "0";

    /** The Constant NO_STORE_VALUE. */
    private static final String NO_STORE_VALUE = "no-store";

    /** The Constant NO_CACHE_VALUE. */
    private static final String NO_CACHE_VALUE = "no-cache";

    /** The Constant PRAGMA_HEADER. */
    private static final String PRAGMA_HEADER = "Pragma";

    /** The Constant CACHE_CONTROL_HEADER. */
    private static final String CACHE_CONTROL_HEADER = "Cache-Control";

    /** The Constant EXPIRES_HEADER. */
    private static final String EXPIRES_HEADER = "Expires";

    /** The Constant LAST_MODIFIED_HEADER. */
    private static final String LAST_MODIFIED_HEADER = "Last-Modified";

    /** The Constant CACHE_TIME_PARAM_NAME. */
    private static final String CACHE_TIME_PARAM_NAME = "CacheTime";

    /** The Static HTTP_DATE_FORMAT object. */
    private static final DateFormat HTTP_DATE_FORMAT = new SimpleDateFormat("EEE, dd MMM yyyy HH:mm:ss z", Locale.US);

    static {
        HTTP_DATE_FORMAT.setTimeZone(TimeZone.getTimeZone("GMT"));
    }

    /** The reply headers. */
    private String[][] mReplyHeaders = { {} };

    /** The cache time in seconds. */
    private Long mCacheTime = 0L;

    /**
     * Initializes the Servlet filter with the cache time and sets up the unchanging headers.
     *
     * @param pConfig the config
     *
     * @see javax.servlet.Filter#init(javax.servlet.FilterConfig)
     */
    public void init(final FilterConfig pConfig) {
        final ArrayList<String[]> newReplyHeaders = new ArrayList<String[]>();
        this.mCacheTime = Long.parseLong(pConfig.getInitParameter(CACHE_TIME_PARAM_NAME));
        if (this.mCacheTime > 0L) {
            newReplyHeaders.add(new String[] { CACHE_CONTROL_HEADER, MAX_AGE_VALUE + this.mCacheTime.longValue() });
            newReplyHeaders.add(new String[] { CACHE_CONTROL_HEADER, PRE_CHECK_VALUE + this.mCacheTime.longValue() });
            newReplyHeaders.add(new String[] { CACHE_CONTROL_HEADER, POST_CHECK_VALUE + this.mCacheTime.longValue() });
        } else {
            newReplyHeaders.add(new String[] { PRAGMA_HEADER, NO_CACHE_VALUE });
            newReplyHeaders.add(new String[] { EXPIRES_HEADER, ZERO_STRING_VALUE });
            newReplyHeaders.add(new String[] { CACHE_CONTROL_HEADER, NO_CACHE_VALUE });
            newReplyHeaders.add(new String[] { CACHE_CONTROL_HEADER, NO_STORE_VALUE });
        }
        this.mReplyHeaders = new String[newReplyHeaders.size()][2];
        newReplyHeaders.toArray(this.mReplyHeaders);
    }

    /**
     * Do filter.
     *
     * @param pRequest the request
     * @param pResponse the response
     * @param pChain the chain
     *
     * @throws IOException Signals that an I/O exception has occurred.
     * @throws ServletException the servlet exception
     *
     * @see javax.servlet.Filter#doFilter(javax.servlet.ServletRequest, javax.servlet.ServletResponse,
     * javax.servlet.FilterChain)
     */
    public void doFilter(final ServletRequest pRequest, final ServletResponse pResponse, final FilterChain pChain)
            throws IOException, ServletException {
        // Apply the headers
        final HttpServletResponse httpResponse = (HttpServletResponse) pResponse;
        for (final String[] replyHeader : this.mReplyHeaders) {
            final String name = replyHeader[0];
            final String value = replyHeader[1];
            httpResponse.addHeader(name, value);
        }
        if (this.mCacheTime > 0L) {
            final long now = System.currentTimeMillis();
            final DateFormat httpDateFormat = (DateFormat) HTTP_DATE_FORMAT.clone();
            httpResponse.addHeader(LAST_MODIFIED_HEADER, httpDateFormat.format(new Date(now)));
            httpResponse.addHeader(EXPIRES_HEADER, httpDateFormat.format(new Date(now
                    + (this.mCacheTime.longValue() * MILLISECONDS_IN_SECOND))));
        }
        pChain.doFilter(pRequest, pResponse);
    }

    /**
     * Destroy all humans!
     *
     * @see javax.servlet.Filter#destroy()
     */
    public void destroy() {
    }

}[/java]

And the web.xml configuration looks like this:

[xml] <filter>
    <filter-name>CacheFilterOneWeek</filter-name>
    <filter-class>org.foundation.servlet.filter.CacheHeaderFilter</filter-class>
    <init-param>
        <param-name>CacheTime</param-name>
        <param-value>604800</param-value>
    </init-param>
</filter>

<filter>
    <filter-name>CacheFilterOneDay</filter-name>
    <filter-class>org.foundation.servlet.filter.CacheHeaderFilter</filter-class>
    <init-param>
        <param-name>CacheTime</param-name>
        <param-value>86400</param-value>
    </init-param>
</filter>

<filter>
    <filter-name>CacheFilterOneHour</filter-name>
    <filter-class>org.foundation.servlet.filter.CacheHeaderFilter</filter-class>
    <init-param>
        <param-name>CacheTime</param-name>
        <param-value>3600</param-value>
    </init-param>
</filter>

…….

<filter-mapping>
    <filter-name>CacheFilterOneDay</filter-name>
    <url-pattern>*.js</url-pattern>
</filter-mapping>
<filter-mapping>
    <filter-name>CacheFilterOneDay</filter-name>
    <url-pattern>*.css</url-pattern>
</filter-mapping>
<filter-mapping>
    <filter-name>CacheFilterOneWeek</filter-name>
    <url-pattern>*.jpg</url-pattern>
</filter-mapping>
<filter-mapping>
    <filter-name>CacheFilterOneWeek</filter-name>
    <url-pattern>*.gif</url-pattern>
</filter-mapping>[/xml]
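
The CacheTime values are plain seconds, so it can help to see where they come from and which headers each one produces. The following is a rough sketch that runs without a Servlet container. The class `CacheHeaderPreview` and its methods are hypothetical helpers written for this illustration (they are not part of Foundation), and the header logic is copied by hand from the filter above, so treat it as an approximation of what the filter emits.

```java
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.Locale;
import java.util.TimeZone;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper (not part of Foundation) that previews the filter's headers. */
final class CacheHeaderPreview {

    /** The CacheTime values used in the web.xml, expressed in seconds. */
    static final long ONE_HOUR = TimeUnit.HOURS.toSeconds(1); // 3600
    static final long ONE_DAY = TimeUnit.DAYS.toSeconds(1);   // 86400
    static final long ONE_WEEK = TimeUnit.DAYS.toSeconds(7);  // 604800

    /** Formats a timestamp with the same pattern, locale, and time zone as the filter. */
    static String httpDate(final long epochMillis) {
        final SimpleDateFormat format = new SimpleDateFormat("EEE, dd MMM yyyy HH:mm:ss z", Locale.US);
        format.setTimeZone(TimeZone.getTimeZone("GMT"));
        return format.format(new Date(epochMillis));
    }

    /** Lists the "Name: value" header lines the filter would add at the given time. */
    static List<String> preview(final long cacheTimeSeconds, final long nowMillis) {
        final List<String> headers = new ArrayList<String>();
        if (cacheTimeSeconds > 0L) {
            headers.add("Cache-Control: max-age=" + cacheTimeSeconds);
            headers.add("Cache-Control: pre-check=" + cacheTimeSeconds);
            headers.add("Cache-Control: post-check=" + cacheTimeSeconds);
            headers.add("Last-Modified: " + httpDate(nowMillis));
            headers.add("Expires: " + httpDate(nowMillis + cacheTimeSeconds * 1000L));
        } else {
            // A CacheTime of zero or less tells browsers and proxies not to cache at all.
            headers.add("Pragma: no-cache");
            headers.add("Expires: 0");
            headers.add("Cache-Control: no-cache");
            headers.add("Cache-Control: no-store");
        }
        return headers;
    }

    public static void main(final String[] args) {
        // Print what a *.jpg response mapped to CacheFilterOneWeek would carry right now.
        for (final String header : preview(ONE_WEEK, System.currentTimeMillis())) {
            System.out.println(header);
        }
    }
}
```

Passing a fixed timestamp instead of the current time makes the output repeatable, which is handy when checking a new cache time before adding another filter instance.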

Improving JSP Serving Time for an ATG Application

Improving the performance of the JSPs that serve your HTML pages is the first step in improving the overall site performance. The user’s browser cannot start rendering the page or requesting the secondary media until it has received the HTML. Also, the faster the page request is completed, the sooner you have a thread free to handle the next request.

There are two parts to this: first, the time it takes the JSP servlet to generate the HTML response, and second, the time it takes to transmit that HTML response back to the user’s browser.

Caching content sections

The easiest way to reduce the time it takes for the JSP servlet to generate the response is by reducing the amount of dynamic content on the page or, more precisely, by reducing the amount of real-time or unique individual content on the page.

The Cache droplet is THE most under-utilized ATG droplet.

The Cache droplet caches the rendered output of the contents of the oparam based on a content key (such as category, user gender, logged in/logged out state, etc.) for a configured period of time. This can be very useful for things like navigation menus dynamically built based on the catalog. The catalog won’t change too often, so this dynamically generated menu can be safely cached for hours. It works equally well for some or all of a category or product page, when you set the key to the category id or product id.

Look at your pages and evaluate what parts of the page don’t change that frequently. Even if you can only cache the page or block for five minutes, that can be a huge performance win.
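
The idea behind caching a rendered block by key can be sketched in a few lines of plain Java. This is not the Cache droplet’s implementation, only a model of the behaviour described above: `FragmentCache`, its `render` method, and the injected clock are hypothetical names invented for this example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical keyed cache for rendered page fragments; a model, not ATG code. */
final class FragmentCache {

    /** Rendered HTML plus the time (in milliseconds) after which it must be rebuilt. */
    private static final class Entry {
        final String html;
        final long expiresAtMillis;

        Entry(final String html, final long expiresAtMillis) {
            this.html = html;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final ConcurrentHashMap<String, Entry> entries = new ConcurrentHashMap<String, Entry>();
    private final long ttlMillis;
    private final LongSupplier clock;

    /** The clock is injected so the expiry behaviour can be exercised without waiting. */
    FragmentCache(final long ttlSeconds, final LongSupplier clock) {
        this.ttlMillis = ttlSeconds * 1000L;
        this.clock = clock;
    }

    /** Returns the cached HTML for the key, rendering it only when missing or expired. */
    String render(final String key, final Supplier<String> renderer) {
        final long now = this.clock.getAsLong();
        Entry entry = this.entries.get(key);
        if (entry == null || now >= entry.expiresAtMillis) {
            // Two threads may both render on a miss; the result is the same either way.
            entry = new Entry(renderer.get(), now + this.ttlMillis);
            this.entries.put(key, entry);
        }
        return entry.html;
    }

    public static void main(final String[] args) {
        // Cache each fragment for five minutes, keyed by a made-up category id.
        final FragmentCache cache = new FragmentCache(300L, System::currentTimeMillis);
        final Supplier<String> expensiveMenu = () -> "<ul><li>built from the catalog</li></ul>";
        System.out.println(cache.render("category:cat100", expensiveMenu)); // rendered
        System.out.println(cache.render("category:cat100", expensiveMenu)); // served from cache
    }
}
```

The choice of key matters as much as the cache time. A key that is too coarse shows one user’s content to another, and a key that is too fine means almost nothing is ever reused.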
