MSSQL, jTDS, NVARCHAR and Slow Indexes

[Image: Mr. Slow by Vox]

An application I’ve built is going into production soon. It’s the first application I’ve been involved with that will be using MSSQL Server in production, so I have some learning about MSSQL to do. After some research, I ended up using the jTDS JDBC driver instead of the Microsoft JDBC driver (which is feature-incomplete and has some serious open issues).

We recently began performance testing and saw some odd behavior. Initially the application was performing well. However, after a few runs of the stress test, performance went from good to awful. The main web service call went from 600 ms to 23,000 ms. The database server’s CPU was pinned, while the app servers were barely loaded, spending all their time waiting for the database to return queries. Stranger still, my local instance (running against PostgreSQL) performed well with the same code and the same stress tests. Luckily a smart MSSQL DBA was able to figure out why the database was burning so much CPU and responding so slowly.

One of the primary queries is against a table that has been growing. The select is simple and filters on an indexed column in the WHERE clause, so even as the table grew to over a million rows it should have been a very quick query. Unfortunately it was taking several seconds to complete. My local PostgreSQL instance had over 30 million rows in the same table and the query was lightning fast. The DBA discovered that during execution MSSQL was converting the indexed varchar column to nvarchar in order to compare it against the WHERE clause parameter, which was inexplicably coming over as an nvarchar. That datatype mismatch between the column definition and the query parameter meant MSSQL was scanning the million+ record index instead of doing the almost instant seek it should have been.
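
To make the mismatch concrete, it’s roughly the difference between the two queries below (the table and column names are made up). The first compares varchar to varchar and can seek the index; the second makes MSSQL convert the varchar column to nvarchar for the comparison, which in our case turned the seek into a scan:

SELECT * FROM orders WHERE order_code = 'ABC123';   -- varchar parameter: index seek
SELECT * FROM orders WHERE order_code = N'ABC123';  -- nvarchar parameter: index scan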

It turns out that jTDS sends String parameters over as Unicode (nvarchar) by default. It’s easily fixed by adding this property to your connection URL:

sendStringParametersAsUnicode=false

That immediately fixed the performance issues.
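
For reference, jTDS properties just get appended to the connection URL with semicolons, so the full URL ends up looking something like this (the host, port, and database name here are placeholders):

jdbc:jtds:sqlserver://dbhost:1433/mydb;sendStringParametersAsUnicode=false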

So, if you’re using jTDS and are using indexed varchar columns in your queries, you should add the property above to avoid your indexes being wasted and your queries running slowly.

Why I love Debian (and PostgreSQL)

I woke up this morning, got online through my new UMTS/HSDPA modem, and discovered that one of my servers had a load average of 239+.

Not the best way to start the day.

Turns out an rsync backup job between two servers had gone nuts and was driving the load on both of them through the roof. A couple of

sudo killall rsync

commands, and everything started settling down.

What’s important to note is that while the CPU was pegged and the load average was over 200, I was still able to ssh in, run top, ps, and netstat, and navigate around looking at log files with very little delay. It was a little slow, but never more than about a second’s wait for anything.

Also, all of my apps that used MySQL as their backend were dead, with “database refused connection” errors. All of my apps that used PostgreSQL as their backend were a little slow, but were still up and functioning without errors.

So, Debian and PostgreSQL for the win!

Oracle Export (exp) and Initial Extent Size Issues

If you have a large database in Oracle, with a tablespace holding say 2 gigabytes worth of data, and you then go in and delete a large number of rows from a large number of tables, shrinking it down to about 300 megabytes worth of data, and then you create an Oracle export using exp, you might expect that you could import this dump file into another database and have it take up 300 MB.

You’d be wrong.

The dump file ends up with all of the create table and create index commands using an INITIAL extent storage setting based on the size of the old table at its fullest. So when you run the import of the dump file, it basically eats up 2 gigabytes of tablespace for 300 MB of data. You can’t edit the INITIAL values in the dump file, since it’s binary, and if you edit it, you corrupt it. Oracle doesn’t seem to have any great ways to fix this, so here’s my hack:

  1. Do the full export, with compress=n (this is useful regardless; a command sketch follows the list).
  2. Generate a create tables script (I used my SQLDeveloper GUI client) that just creates the tables (no INITIAL settings)
  3. Generate a create constraints script (I used my SQLDeveloper GUI client) that just creates constraints/indexes
  4. Run the create tables script on the new database
  5. Run the import with these options: ignore=yes constraints=no indexes=no
  6. Run the create constraints script
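
Roughly, the exp and imp ends of that (steps 1 and 5) look like the commands below. The usernames, passwords, connect strings, and file name are made up, and whether you export full=y or per schema depends on your setup:

exp system/password@olddb full=y file=export.dmp compress=n
imp system/password@newdb full=y file=export.dmp ignore=yes constraints=no indexes=no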

Now you have a 300 MB database. If you export from this, you end up with an export file that will create other 300 MB databases and you can share it with your friends.

Good luck!

P.S. Oracle DBAs might have a better way of doing this. I don’t know.

Why Oracle makes me nuts

When left to my own devices I tend to use PostgreSQL or MySQL. They’re both free, open source, run on every platform I use, and I understand how to use them, tune them, etc. However, when I’m building an application and database scripts that will be deployed on Oracle, using an application which doesn’t really support PostgreSQL or MySQL for production anyhow, I will use Oracle. I don’t have much against Oracle really; mostly I just don’t understand its inner workings very well, and usually I only deal with it when it breaks.

Installing Oracle on Mac OS X has been possible for a while, but has been FAR from easy. More often than not the installers would throw fatal errors and die horrible deaths. However, you can now get a pre-packaged Oracle 10 XE installation running on Linux as a downloadable Parallels image.

http://ptn.parallels.com/en/ptn/desc/?VA_id=169

That takes a lot of the pain away. Just download, run, and poof, you have a running Oracle 10 XE instance with a web admin console, and you can plug your SQLDeveloper, JBoss, ATG, etc. into it.
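
For what it’s worth, pointing a JDBC client at the image is just the standard Oracle thin URL against the default XE SID; the host name below is whatever your Parallels VM answers to:

jdbc:oracle:thin:@vmhost:1521:XE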

Until it starts throwing errors. Not Oracle; Oracle just sits there happily, but your applications start blowing up. Not at first, but when you start trying to DO stuff, like say running a bunch of install scripts and having them start failing halfway through. I finally tracked it down to a known issue, mentioned on a few blogs like this one:

http://it.newinstance.it/2007/06/01/ora-12519-tnsno-appropriate-service-handler-found/

So there’s a hack/fix at least, but come on, Oracle! When your database falls over faster than any open source database, you have issues.
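
If I remember right, the fix described there boils down to raising XE’s very low default processes limit and restarting the instance; something like this from a sysdba session (the value 150 is just an example):

-- connect as sysdba (e.g. sqlplus / as sysdba), then:
ALTER SYSTEM SET processes=150 SCOPE=SPFILE;
-- restart the XE instance so the new limit takes effect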

This isn’t an “I hate Oracle” rant. When it comes to huge production systems I’m not 100% sure I’d pick anything other than Oracle on Veritas. But I just wish they made it easier to use all around. Has anyone seen a very serious production system using PostgreSQL? I’m sure they exist, but I haven’t seen one. By serious I mean 8,000+ simultaneous JDBC connections, tens of thousands of transactions per minute, tables with 10+ million rows, queries that do multi-table joins across tables of that size, and tested disk, hardware, and server failover. I’m not doubting that it can be done, but who’s doing it?