Over the years, I have learned that one thing that I really love to do is performance tuning… especially on a database. I started learning database design and development before I even stepped foot into indexes and I really wish that I had really learned how a database worked before I just jumped into SQL. I have done a lot of research and testing across several different databases and have learned a lot of ways to speed up queries. Database design is, of course, essential. However, if I had to choose the number 1 way to effectively speed up your database and properly handle your data, I would definitely choose to employ great index design (and maintenance).

Indexes are often overlooked, but are crucial when it comes to database performance. Anyone who has ran a query that takes more than a second should look into indexes. I have had pages go from being over 3,500ms to less that 50ms all due to proper index implementation.

Consider, you were looking for a check that you wrote recently. You need to know the check number, the date that it was written, the recipient, memo line and the total amount. We can assume that the unique identifier for this situation would be your check number. Without proper indexes, you are telling your database to start from the first check that you have ever written your entire life. No one would look for a check in this manner. Also, you would typically start looking at more recent checks and not checks from 2-3 years ago. The order in which you retrieve data is also important when you are creating your indexes. So, since we know we will pull check information based on the most recent check numbers first, it only makes sense to have one index on the checkNumber column, remembering to select descending order (by default, most databases will create ascending which is rarely how most data-driven systems need to access data).

Now, I want to consider another situation where we may want a new index on our check book information. Let’s assume we received our bank statement for last month and it is not balancing correctly. We want to look through our checks for that month to see what the problem is. With our last index (descending CheckNumbers), we we would actually start looking at our last check first. So, if we had written 100 checks since the last bank statement, we would be looking through 100 records that we knew we didn’t need anyway. This is why it would be good to put an index on our date field. Now, assuming that we would be more interested in more current data than data that could be decades old, I would create the date index descending as well. There is actually nothing wrong with having multiple indexes on the same column (one ascending and one descending). It all depends on how you plan on using your data. I know that most databases will typically choose the appropriate index (please comment if you know otherwise), but you can also force your sql statement to use the index you know will be accurate. Keep in mind that with growing data, you also have growing indexes… meaning an index that is great today may not be so great a year from now.

You should always consider what data you will filter on and that is what you will most likely need indexes on. I understand that you should always be aware of your environment and application; there is truly no need to have a dozen indexes on a table that is extremely small or not used frequently. I think the best rule of thumb is to simply be on the lookout for issues and understand when implementing the index could resolve your issue.

Another good way to look at indexes is to think of a book. The database is a book full of pages. The indexes separate a column into pages. For example, say you have the letters of the alphabet to look through. That would be 26 records. Say you create an index descending and an index ascending on the alphabet. For this example, let’s assume that we will store 10 records on each index page. So, it would have 3 pages (26/10 = 2 full pages + 1 page with 6 records). It would look something like this:


Index Page 1 Index Page 2 Index Page 3
A
B
C
D
E
F
G
H
I
J

K
L
M
N
O
P
Q
R
S
T

U
V
W
X
Y
Z

So, in our example, let’s say that we want to look for the letter Q. Without an index, our database would look through every record to find what it needed, even after it located Q. With the proper data type and the index created above, the database would read the pages similar to a person flipping through a dictionary. If you knew you were looking for “quick” in a dictionary, you wouldn’t start on the page that has A – J (like Index Page 1), you would skip to the page that had K – T (like Index Page 2) and look for the Q’s. You would skip any other pages after you find the word “quick”, because you know that “quick” will never be found ascending (or alphabetically in this example) again.

Another thing to keep in mind is that records are stored from first to last. So, consider you are filtering queries based on the last name of a customer. Those last names could be in any kind of order in your table. Let’s assume we have a Customers table similar to this:

Last Name First Name
Jones Jim
Cook Debra
Smith James
Jones Timothy
West Joe

Notice, they are stored from top to bottom as Jones, Cook, Smith, Jones, West. If your database has to look through those records from beginning to end it will have to go through every single record in the database. So, if you were looking for Mr. Cook, the database would look at all 5 records above. Starting at Jones, then Cook, Smith, Jones and finally West, even though it already found Cook in the second record. Would you look through an entire address book looking for a contact that you already found on the first page of your address book?

Now, consider we have all last names in an index organized ascending. Now, they will look more like this:

Last Name First Name
Cook Debra
Jones Jim
Jones Timothy
Smith James
West Joe

Now, when the database scans those pages you can see that it will only have to look at the first record. As soon as it found Cook and realized that Jones was the next in the list it would know that it had found the only Cook even though it hadn’t even looked through the rest of the records.

There are other factors in some database index design, such as padding (which would basically just leave room on the page), that I did not cover in this post. Hopefully, I was able to give a few examples that may be helpful for you. The simplest guidance I could probably offer is to think of index pages the same way you would think of flipping through pages of a book. When you know how a book is organized you know exactly where to find what you are looking for. A database is the same way. No matter how much data it holds, as long as it knows where to look it can be unbelievably “quick”. :-)

Always consider using databases for YOUR CLIENTS/CUSTOMERS and not for YOURSELF. It doesn’t matter what you think about them, if it can help your customers application or your own application’s users, then I would suggest using them. I once programmed an application that had years worth of data. I had implemented indexes on the development database server and on the testing database server, however, when the application launched it was SUPER SLOW! The first thing I considered was it had to be an index issue. I wondered what I did wrong and when I went to look at the production/live database, I realized that the indexes didn’t exist. I questioned the “DBA” about the indexes and they said “Well, they are searching for two years worth of data… they should know it is going to take long!!” Of course, they would expect it to take long, but my philosophy is that your job is not to meet their expectations… it is to exceed them!

There are other factors in some database index design, such as padding (which would basically just leave room on the page), that I did not cover in this post. In the future, I will try to go into index maintenance. Please feel free to write or comment with any questions, corrections, better methods, etc.

Goodnight!