Wednesday, January 28, 2004

A Few Tips About SQL Joins

First of all, let me apologize for being gone for a few days. Things have been hectic at work. We just demoed our new product and the demo went -great-.

Today I thought I would write about the various kinds of joins available to programmers using SQL. It's a bit of a beginners blog, really. But an important one if you're not familiar with the details of how joins work.

Most databases using SQL support three kinds of joins:


  1. Inner Join
  2. Left Join
  3. Right Join


I'll talk a little about each one.

Inner Join


The inner join is the most efficient join. It pulls two tables together based on key values. Only records that actually contain the same key value in both tables are pulled into the result set. For instance, if I have a customers table and a customer with a customerid of 1, and a purchases table with 10 purchases for customer id 1, then doing an Inner Join on the customers and purchases table using the customerid would return 10 records. If I have a customer id of 2, but no purchases records, then doing an Inner Join on the two tables would return 0 records.

Left Join


Left Join is in some SQL syntax called Left Outer Join. It takes all of the records from the table on the left and joins it with all matching records on the table to the right, filling the field values with Null if there is no matching record. So, for example, a left join statement like...

Select * from customers left join purchases on customers.customerid = puchases.cusomterid where customers.customerid = 1

...would return at least one record for customer 1, even if the customer had no purchases. If the customer did have purchases, all of those purchase records would be returned.

Right Join


Right Join is in some SQL syntax called Right Outer Join. It takes all of the records from the table on the right and joins it with all matching records on the table to the left, filling the field values with Null if there is no matching record. Right Joins aren't used very often since they aren't as intuitive as left joins. So, for example, a right join statement like...

Select * from purchases right join customers on customers.customerid = puchases.cusomterid where customers.customerid = 1

...would accomplish the same thing as the Left Join example above.

Join Performance


As stated early, the Inner Join gives you the best performance, since only matching records are joined. This allows efficient use of the table indexes. Right and Left joins join all records regardless of whether or not they match, and so the indexes don't help as much. There's also the issue of Nulls, which different databases handle differently, and different programming languages handle differently. Nulls are messy and require extra code to deal with in VB.Net, so I would suggest avoiding Right and Left joins unless you just absolutely need them. Don't use them as a short cut.

NOTE: Always join on columns that are indexed! Joining on unindexed columns for tables with large amounts of records is horribly inefficient and slow.