Category: SQL

Rant

Sql Reference Tables

Reference Book Image from http://www.sxc.hu/photo/1022436
Disclaimer: Personally, I prefer to use uniqueidentifier as my primary key data type. It cuts down on improper joins (similar looking values aren’t in every table, ie 1, 2, 3) and is infinitely portable among environments. I think the extra space and overhead is worth it.

We have some reference tables in our database at work. This itself isn’t abnormal, everyone has these. What made these tables special was that the IDENTITY Primary Key for the table started at 0. At first I thought that this was awesome. I know it is a debate for another day, but I love that my arrays, counters, etc start at 0 in C#. I know it may not be necessary anymore or whatever, but it comforts me, okay? 🙂

Things soon changed for the worse when I realized that 0 being a significant data value was a problem. “Why?”, you may ask. Well, when you just generically “new up” an int, what is the default value? That’s right, 0. So, if someone doesn’t set a property when trying to identify some reference value, in a 1-based table (or a GUID one) you will get an error because that isn’t a valid value. However, in a 0-based table, that “forgetfulness” will match right up with a value and go in no questions asked. You can’t validate it, because 0 is perfectly fine to have.

Lesson Learned: Zeroes are fine for arrays and loops, but keep them out of reference tables!

Code Optimization

What Cost, Functions?

This post is about Sql, specifically in the Sql Server RDBMS, but I believe that the principles will hold up across platforms.

If you are using a function in a WHERE clause in your SQL, like the following:

SELECT a.id, a.name
FROM some_table a
WHERE a.hash_value = fn_hash_function('value')

The function will be evaluated for every row in your table. This will result in performance decreasing exponentially as your table grows.

A better way to accomplish this query is to do the following:

DECLARE @input_hash varchar(1000)
SET @input_hash = fn_hash_function('value')

SELECT a.id, a.name
FROM some_table a
WHERE a.hash_value = @input_hash

This code will only evaluate the function one time and as long as you have the comparison column indexed properly, you will see very impressive results. On one query on my local machine, Sql Profiler showed that query that was taking over 7000ms begin to return in about 70ms.

The performance gains in that exact instance were less drastic on a Sql Server running on an actual server (it had just been powering its way through), but in time it would have gotten unbearably slow, as well.

Hope that helps someone.