
Technical Blog

This blog will contain content related to Java, Seam, Security, my sites and projects, as well as other technical subjects I am interested in.

Comments and questions are welcome!

Google Speed!

Saturday, August 1st, 2009

Google has a nice collection of tutorials, tech talks, etc… related to making websites run faster. You can find it here: Let’s make the web faster.

However, I strongly disagree with the first tutorial, CSS: Using every declaration just once.

The premise of the tutorial is that in order to make your website faster, you should make the code you send to the browser smaller (TRUE) and that one of the best ways to optimize CSS files is to use every declaration just once. This second part is the part I disagree with so strongly.

Basically the author recommends reversing the standard organization of the CSS declarations. Here are the before and after examples from the tutorial:

Before:

h1, h2, h3 { font-weight: normal; }
a strong { font-weight: normal !important; }
strong { font-style: italic; font-weight: normal; }
#nav { font-style: italic; }
.note { font-style: italic; }

After:

h1, h2, h3, strong { font-weight: normal; }
a strong { font-weight: normal !important; }
strong, #nav, .note { font-style: italic; }

So instead of having your selector/class/id contain all of the relevant styling declarations, you group the relevant selectors for each declaration. It’s changing the traditional one-to-many relationship of selectors and style declarations to a many-to-one relationship.

The upside, and the premise of the article, is that this reduces the size of the CSS by reducing duplicate lines of declarations. This is true, but is a very small upside compared with the downside.

The downside of this is that it makes writing, maintaining, and debugging your CSS MUCH harder. The styles applied to a given selector/element/class/id are now spread out over the entire CSS file.

If you’re a Java developer, this is akin to refactoring every property of every bean you have into its own interface (since Java doesn’t do multiple inheritance) and having each bean implement a huge number of these interfaces. It’s backwards, and makes maintaining your application amazingly difficult.

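The other problem is that the CSS file size savings are only 20-40%. If you’re gzipping your CSS files on your web server and setting a far-future expiration header (and you are doing both of those things, right?), then the savings are closer to 2-4%: gzip already compresses most of the repeated declarations away, and the expiration header means returning visitors don’t download the file again at all. Totally not worth it considering the downsides above.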
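If you want to check numbers like these against your own stylesheets, here is a minimal Java sketch that measures raw and gzipped sizes for the before and after snippets from the tutorial. The class and method names are my own invention, and two tiny snippets are far too small to reproduce the percentages above, so treat it as a measuring harness to point at real CSS files rather than as evidence either way.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: compares raw and gzipped sizes of two CSS strings.
final class CssSizeComparison {

    // The "before" snippet from the tutorial.
    static final String BEFORE = String.join("\n",
        "h1, h2, h3 { font-weight: normal; }",
        "a strong { font-weight: normal !important; }",
        "strong { font-style: italic; font-weight: normal; }",
        "#nav { font-style: italic; }",
        ".note { font-style: italic; }");

    // The "after" snippet from the tutorial.
    static final String AFTER = String.join("\n",
        "h1, h2, h3, strong { font-weight: normal; }",
        "a strong { font-weight: normal !important; }",
        "strong, #nav, .note { font-style: italic; }");

    // Size in bytes as sent uncompressed.
    static int rawSize(String css) {
        return css.getBytes(StandardCharsets.UTF_8).length;
    }

    // Size in bytes after gzip compression, headers included.
    static int gzippedSize(String css) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(css.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed here, so the gzip trailer has been written.
        return buffer.size();
    }

    // Percentage saved going from "before" bytes to "after" bytes.
    static double savingsPercent(int before, int after) {
        return 100.0 * (before - after) / before;
    }

    public static void main(String[] args) {
        int rawBefore = rawSize(BEFORE);
        int rawAfter = rawSize(AFTER);
        int gzBefore = gzippedSize(BEFORE);
        int gzAfter = gzippedSize(AFTER);
        System.out.printf("raw:     %d -> %d bytes (%.1f%% saved)%n",
            rawBefore, rawAfter, savingsPercent(rawBefore, rawAfter));
        System.out.printf("gzipped: %d -> %d bytes (%.1f%% saved)%n",
            gzBefore, gzAfter, savingsPercent(gzBefore, gzAfter));
    }
}
```

To measure a real stylesheet, read both versions of the file into strings and pass them to the same two methods.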

MSSQL, jTDS, NVARCHAR and Slow Indexes

Saturday, May 23rd, 2009

Mr. Slow by Vox

An application I’ve built is going into production soon. It’s the first application I’ve been involved with that will be using MSSQL Server in production, so I have some learning about MSSQL to do. After some research, I ended up using the jTDS JDBC driver instead of the Microsoft JDBC driver (which is feature incomplete and has some serious open issues).

We recently began performance testing and saw some odd behavior. Initially the application was performing well. However, after a few runs of the stress test the performance went from good to awful. The main web service call went from 600 ms to 23,000 ms. The database server’s CPU was pinned, and the app servers were barely loaded, spending all their time waiting for the database server to return queries. Stranger still, my local instance (running against PostgreSQL) performed well with the same code and same stress tests. Luckily a smart MSSQL DBA was able to figure out why the database was burning so much CPU and responding so slowly.

One of the primary queries is against a table which has been growing. The select query is simple and has an indexed column in the WHERE clause. Even as the table grew to over a million rows, it should have been a very quick query. Unfortunately it was taking several seconds to complete. My local instance had over 30 million rows in the same table in PostgreSQL and the query was lightning fast. The DBA discovered that the query execution was converting the indexed varchar column into nvarchar values for comparison with the query parameter used in the WHERE clause, which was inexplicably coming over as an nvarchar. This datatype mismatch between the column definition and the query parameter meant that MSSQL was doing a scan of the million+ record index instead of the almost instant seek it should have been doing.

It turns out that jTDS sends String parameters over as Unicode (nvarchar) by default. It’s easily fixed by adding this property to your connection-url:

sendStringParametersAsUnicode=false

That immediately fixed the performance issues.

So, if you’re using jTDS and are using indexed varchar columns in your queries, you should add the property above to avoid your indexes being wasted and your queries running slowly.
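For reference, here is a minimal sketch of what a jTDS connection URL looks like with the property appended. The host, port, and database name are placeholders I made up, and the helper class is hypothetical; only the property itself comes from the fix above.

```java
// Hypothetical helper that builds a jTDS URL with Unicode string parameters disabled.
final class JtdsUrlExample {

    // jTDS URLs take the form jdbc:jtds:sqlserver://host:port/database;prop=value
    static String jtdsUrl(String host, int port, String database) {
        return "jdbc:jtds:sqlserver://" + host + ":" + port + "/" + database
            + ";sendStringParametersAsUnicode=false";
    }

    public static void main(String[] args) {
        // Placeholder values, not a real server.
        System.out.println(jtdsUrl("dbhost.example.com", 1433, "appdb"));
    }
}
```

The resulting string is what goes into the connection-url of your datasource, or into DriverManager.getConnection if you manage connections yourself. Keep in mind that with this setting String parameters are sent as varchar, so if you also have nvarchar columns holding real Unicode data you will want to test those queries too.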

Smush.it Shrinks Your Images

Tuesday, March 10th, 2009

In my post on Improving Secondary Asset Loading Time in order to increase the performance of your web application I briefly touched on shrinking your media files to the smallest size that will give you acceptable quality.

Rather than waste your time playing with Photoshop export settings for each image, you can now use Smush.it, a web application built to shrink your image files for you:

Performance just got a little bit easier. Optimizing images by hand is time consuming and painful. Smush it does it for you.

I’ve used it on a few images, and so far I’m impressed with the results.

Do you have any handy tricks for optimizing the size of your media files?

Additional CDN Information

Friday, January 23rd, 2009

A few popular commercial CDN services are:

End user performance is typically increased roughly 20-25% by the addition of a good CDN solution. This does not count the reduced capacity requirements of your server farm, which is an added bonus.

Photo by jpctalbot

Improving ATG Performance With a CDN

Monday, January 19th, 2009

Why use a CDN?

A Content Delivery Network, or CDN, is essentially a system of geographically distributed web servers which serve static content, typically images, video, and other bandwidth intensive files. This serves two purposes: it keeps your servers from having to handle those requests and it serves those files to the end user from a low latency server closer to the user (network-wise). Both of these aspects improve the user’s perception of page and site performance. CDNs can also be extremely useful for things like streaming video or other very high bandwidth uses.

How do CDNs work?

CDNs typically work in one of two ways: for some you have to deploy the files to the CDN manually via FTP or some similar mechanism, while others work as a transparent proxy, automatically loading the files from the source or origin (your servers) into the CDN as users request them. The latter is preferable as you don’t need to take the CDN into consideration when building your application’s pages and referencing media. This also makes handling non-production environments less complex. It also allows the media to be reloaded from the origin based on cache expiration headers, so you don’t need to do anything special during deployments of new media. However, those CDN solutions also seem to be more expensive, so it’s a balance you have to weigh yourself.
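The transparent proxy approach boils down to a pull-through cache: serve from the cache while the entry is fresh, otherwise go back to the origin and store the result. Here is a minimal Java sketch of that idea. It is my own illustration and not how any particular CDN is implemented; the class name is hypothetical, it uses one fixed lifetime for every file where a real CDN would honor each response’s cache headers, and it is not thread-safe.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical pull-through cache: loads from the origin on a miss or after expiry.
final class PullThroughCache {

    // A cached response body together with the time it stops being fresh.
    private static final class Entry {
        final byte[] body;
        final long expiresAtMillis;

        Entry(byte[] body, long expiresAtMillis) {
            this.body = body;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final Function<String, byte[]> origin; // stands in for an HTTP request to your servers
    private final long ttlMillis;                  // assumption: one lifetime for every file
    private int originFetches = 0;

    PullThroughCache(Function<String, byte[]> origin, long ttlMillis) {
        this.origin = origin;
        this.ttlMillis = ttlMillis;
    }

    // Returns the file for a path, contacting the origin only when needed.
    byte[] get(String path, long nowMillis) {
        Entry cached = entries.get(path);
        if (cached != null && nowMillis < cached.expiresAtMillis) {
            return cached.body; // still fresh: the origin never sees this request
        }
        byte[] body = origin.apply(path); // miss or expired: reload from the origin
        originFetches++;
        entries.put(path, new Entry(body, nowMillis + ttlMillis));
        return body;
    }

    // How many requests actually reached the origin.
    int originFetches() {
        return originFetches;
    }

    public static void main(String[] args) {
        // The "origin" here just echoes the path back as bytes.
        PullThroughCache cache = new PullThroughCache(
            path -> path.getBytes(StandardCharsets.UTF_8), 60_000L);
        cache.get("/images/logo.png", 0L);
        cache.get("/images/logo.png", 30_000L); // served from the cache
        cache.get("/images/logo.png", 60_000L); // expired, fetched again
        System.out.println("origin fetches: " + cache.originFetches());
    }
}
```

Notice that the application never pushes anything into the cache. New media shows up on its own once the old entry expires, which is why deployments need nothing special.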

Roll Your Own Apache Pseudo CDN

You can also roll a pseudo-CDN yourself using Apache. I call it a pseudo-CDN because unlike Akamai and other large providers you don’t get the advantages of hundreds or thousands of geographically distributed servers. You also don’t get lots of fancy math routing users’ requests to the quickest servers based on location, network congestion, and more. What you do get is transparent proxying and off-loading the request handling from your application servers.

This means you don’t have to do anything special or complex when coding your web application and your JSPs to accommodate the CDN. It also frees your application servers from handling the requests for static media, large and small, which leaves them more CPU time for the real dynamic processing of your web application.

Apache makes this simple by way of the mod_disk_cache module. I’d recommend avoiding mod_mem_cache. Even though it sounds like it would be the preferred caching mechanism, I have had significant problems with it and have abandoned it. If you’re using Linux (and you should be), the kernel aggressively caches recently accessed files, so when you’re using mod_disk_cache, Apache will cache the files you specify on the local hard drive and the kernel will use available RAM to keep those files in memory for rapid serving. If you plan on using mod_gzip and mod_disk_cache together, please read my post on the issues encountered using them together.