Make Google Ignore JSESSIONID

Search engines like Google will often index URLs that carry JSESSIONID and other session- or conversation-scoped parameters. This causes two problems. First, the links returned in the Google search results can have these parameters in them, resulting in “session not found” errors or other incompatible session state issues. Second, it can cause a single page of content to be indexed multiple times (with differing parameters), thus diluting your page’s rank.

I’ve posted two solutions to this issue in the past: Using Apache to ReWrite URLs to remove JSESSIONID and a more advanced solution of using a Servlet Filter to avoid adding JSESSIONID for GoogleBot Requests.
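The code from those earlier posts is not reproduced here. As a rough illustration of the step both approaches share, here is a minimal Java sketch that removes the `;jsessionid=...` path parameter from a URL and does a crude check for Googlebot. The class and method names are hypothetical, and matching the User-Agent on the substring "googlebot" is an assumption for illustration, not a robust way to identify crawlers.

```java
import java.util.Locale;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not the code from the earlier posts. */
final class SessionIdStripper {
    // Matches the ";jsessionid=..." path parameter, stopping at the next
    // path parameter, the query string, or the fragment.
    private static final Pattern JSESSIONID =
            Pattern.compile(";jsessionid=[^?#;]*", Pattern.CASE_INSENSITIVE);

    /** Returns the URL with any ;jsessionid=... path parameter removed. */
    static String strip(String url) {
        return JSESSIONID.matcher(url).replaceAll("");
    }

    /** Crude check: assumes the crawler names itself in the User-Agent header. */
    static boolean isGooglebot(String userAgent) {
        return userAgent != null
                && userAgent.toLowerCase(Locale.ROOT).contains("googlebot");
    }

    public static void main(String[] args) {
        // The session id is dropped; the query string is left untouched.
        System.out.println(strip("/store/product.jsp;jsessionid=ABC123?productId=42"));
    }
}
```

In a real Servlet Filter the same check would decide whether to leave session IDs out of encoded URLs for that request; that wiring is omitted here.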

Now there’s an even better way to handle this. Google has added an amazing new feature to their Webmaster Tools which allows you to specify how the GoogleBot indexer should handle various parameters. You can tell it to ignore certain parameters such as JSESSIONID, cid, and others, and to explicitly keep paying attention to other parameters such as productId, skuId, etc…
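To make the idea concrete, here is a small Java sketch of what ignoring a parameter means in practice: two URLs whose query strings differ only in ignored parameters are treated as the same content. This only models the concept. It is not how Google implements the feature, and the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical model of "ignored" versus "significant" query parameters. */
final class ParameterHandling {
    /** Parses "a=1&b=2" into a sorted map, skipping names in the ignore set. */
    static Map<String, String> significantParams(String query, Set<String> ignored) {
        Map<String, String> kept = new TreeMap<>();
        if (query == null || query.isEmpty()) {
            return kept;
        }
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            if (!ignored.contains(name)) {
                kept.put(name, value);
            }
        }
        return kept;
    }

    /** True when both query strings agree on every non-ignored parameter. */
    static boolean sameContent(String queryA, String queryB, Set<String> ignored) {
        return significantParams(queryA, ignored)
                .equals(significantParams(queryB, ignored));
    }

    public static void main(String[] args) {
        Set<String> ignored = Set.of("JSESSIONID", "cid");
        // Differ only in ignored parameters: same content.
        System.out.println(sameContent(
                "productId=42&cid=7&JSESSIONID=ABC", "JSESSIONID=XYZ&productId=42", ignored));
        // Differ in productId: different content.
        System.out.println(sameContent("productId=42", "productId=43", ignored));
    }
}
```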

Log in to your Google Webmaster Tools, and select the site you wish to work with. Under “Site Configuration” -> “Settings” there is a new section at the bottom called “Parameter handling”. Click on “adjust parameter settings” to expand the parameter handling configuration for your site. Sometimes Google will suggest various parameters it has discovered while crawling your site, and other times you just enter the parameters you want Google to ignore or pay attention to.

Google Webmaster Tools Parameter Handling Interface

This is a much more elegant solution to the JSESSIONID problem, and it also lets you correctly handle other parameters your site may use for either session state or dynamic content generation. The only downside is that it affects only Google, whereas, with the correct configuration, my two older solutions can handle any search engine bot. Maybe other search providers will or do provide a similar feature.

By | 2017-05-18T15:15:57+00:00 February 16th, 2010|ATG, JBoss, Seam|4 Comments

4 Comments

  1. Pete Coultas February 17, 2010 at 3:34 am

    Hi,

    We used canonical tagging, something like <link rel="canonical" href="…">. See http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html

    This not only gets around issues like JSESSIONID being indexed by Google, but also helps massively with duplicate content. It should also work on other search engines as it is not Google specific.

    Worked like a dream for us!

    Pete.

    • Devon February 17, 2010 at 6:42 am

      Pete,

      very good point, and thanks for bringing it up. Canonical tagging is very powerful and I’m hoping to see it used more and more often. It’s still “relatively” new to the scene and I haven’t seen a ton of sites making use of it. I’m glad to see you have adopted it and found it useful.

      However, at least for many ATG based sites, many pages may take in a large number of content altering parameters (productId, skuId, reviewsPaginationIndex, etc…) so it may be much simpler to just ignore the “bad” parameters, rather than writing the code to dynamically construct the correct canonical link. It would be pretty easy to make a droplet which has a list of “bad” params, and generates the canonical link just by “cleaning” the requested URI. Not perfect, but simpler to implement.

      Devon
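The droplet idea in the reply above can be sketched in plain Java, outside ATG: keep a list of “bad” parameters, clean the requested URL, and emit a canonical link tag. This is a minimal sketch under assumptions. The class name, the contents of the bad-parameter list, and the example URL are hypothetical, and the ATG droplet wiring is left out entirely.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of "cleaning" a requested URL into a canonical link. */
final class CanonicalLink {
    // Assumed list of "bad" parameters; adjust for your own site.
    static final Set<String> BAD_PARAMS = Set.of("JSESSIONID", "cid");

    /** Drops the ;jsessionid path parameter and any "bad" query parameters. */
    static String clean(String requestUrl) {
        // Assumes no #fragment, since browsers do not send one to the server.
        int q = requestUrl.indexOf('?');
        String path = q < 0 ? requestUrl : requestUrl.substring(0, q);
        path = path.replaceAll("(?i);jsessionid=[^;]*", "");
        if (q < 0) {
            return path;
        }
        List<String> kept = new ArrayList<>();
        for (String pair : requestUrl.substring(q + 1).split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            if (!BAD_PARAMS.contains(name)) {
                kept.add(pair); // content-altering parameters keep their order
            }
        }
        return kept.isEmpty() ? path : path + "?" + String.join("&", kept);
    }

    /** Builds the link tag, escaping the characters that matter in an attribute. */
    static String linkTag(String requestUrl) {
        String href = clean(requestUrl).replace("&", "&amp;").replace("\"", "&quot;");
        return "<link rel=\"canonical\" href=\"" + href + "\" />";
    }

    public static void main(String[] args) {
        System.out.println(linkTag(
                "http://www.example.com/product.jsp;jsessionid=ABC?productId=42&cid=9&skuId=7"));
    }
}
```

As the reply notes, this is not perfect: it only removes known-bad parameters and does not decide which of the remaining ones truly define the page.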

  2. Eduard Roth October 3, 2014 at 3:34 am

    Use of canonical doesn’t help too much in our case. Google has indexed several pages with JSESSIONID in the url.

    • Devon October 6, 2014 at 8:08 am

      Eduard,

      is the issue that canonical tagging didn’t prevent Google from indexing pages with JSESSIONID on them, or that you already had URLs like that in the Google index, and now switching to canonical doesn’t remove them?
