Search engines like Google will often index content with parameters such as JSESSIONID and other session- or conversation-scoped parameters. This causes two problems. First, the links returned in Google search results can contain these parameters, resulting in “session not found” errors or other incompatible session state issues. Second, a single page of content can be indexed multiple times (with differing parameters), thus diluting your page’s rank.
I’ve posted two solutions to this issue in the past: Using Apache to ReWrite URLs to remove JSESSIONID and a more advanced solution of using a Servlet Filter to avoid adding JSESSIONID for GoogleBot Requests.
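Neither of those earlier posts is reproduced here, but as a rough illustration of what “removing JSESSIONID” means at the URL level, here is a minimal sketch in plain Java. It assumes the session ID appears in the usual servlet-container form, as a `;jsessionid=...` path parameter; the class name `JsessionIdStripper` and the example URL are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not taken from the earlier posts.
class JsessionIdStripper {
    // Matches ";jsessionid=<value>" up to the next '?', '#' or ';' (case-insensitive).
    private static final Pattern JSESSIONID =
            Pattern.compile("(?i);jsessionid=[^?#;]*");

    // Returns the URL with any ;jsessionid=... path parameter removed.
    static String strip(String url) {
        return JSESSIONID.matcher(url).replaceAll("");
    }

    public static void main(String[] args) {
        // Example URL is an assumption, chosen only to show the shape of the input.
        System.out.println(
                strip("http://example.com/catalog.jsp;jsessionid=ABC123?productId=42"));
    }
}
```

Two URLs that differ only in their session ID come out identical after this step, which is why removing the ID stops the same page from being indexed several times.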
Now there’s an even better way to handle this. Google has added an amazing new feature to their Webmaster Tools that lets you specify how the GoogleBot indexer should handle various parameters. You can tell it to ignore certain parameters such as JSESSIONID, cid, and others, and explicitly tell it not to ignore parameters such as productId, skuId, etc.
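To make the ignore/keep distinction concrete, here is a small sketch of the effect such a setting has on a URL. This is not how Google implements it, just an illustration under simple assumptions: parameters are ordinary `name=value` pairs in the query string, names are matched exactly, and fragments are not handled. The class name `ParameterHandlingSketch` and the example URL are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of "ignore these parameters, keep the rest".
class ParameterHandlingSketch {
    // Drops every query parameter whose name is in 'ignored'; keeps the others in order.
    static String canonicalize(String url, Set<String> ignored) {
        int q = url.indexOf('?');
        if (q < 0) {
            return url; // no query string, nothing to do
        }
        String base = url.substring(0, q);
        List<String> kept = new ArrayList<>();
        for (String pair : url.substring(q + 1).split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            if (!ignored.contains(name)) {
                kept.add(pair);
            }
        }
        return kept.isEmpty() ? base : base + "?" + String.join("&", kept);
    }

    public static void main(String[] args) {
        // Ignore session-style parameters; productId and skuId are left alone.
        Set<String> ignored = new HashSet<>(Arrays.asList("JSESSIONID", "cid"));
        System.out.println(
                canonicalize("http://example.com/product?productId=42&cid=7", ignored));
    }
}
```

With `cid` ignored, every variant of the product URL collapses to the same address, while `productId` still distinguishes one product page from another.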
Log in to Google Webmaster Tools and select the site you wish to work with. Under “Site Configuration” -> “Settings” there is a new section at the bottom called “Parameter handling”. Click “adjust parameter settings” to expand the parameter handling configuration for your site. Sometimes Google will suggest parameters it has discovered while crawling your site; otherwise you simply enter the parameters you want Google to ignore or pay attention to.
This is a much more elegant solution to the JSESSIONID problem, and it also lets you correctly handle other parameters your site may use for session state or dynamic content generation. The only downside is that it affects only Google, whereas my two older solutions, correctly configured, can handle any search engine bot. Other search providers may already offer a similar feature, or may add one.