Sometimes Bad Things Happen
Ive seen it many times and im sure i will continue to. But sometimes, people do bad things in SQL Server. JOINS using ORs are one of them.
People are inheritently lazy and i certainly fall under this umberella at times. However, being lazy rarely helps the optimiser do its best work. Using OR conditions in your JOINS is bad news bears for the optimiser most the time. Lets see what happens…
Show and Tell
Let’s take a relatively simple query and run it against the StackOverflow2010 database.
For transparency purposes, I have trimmed down the size of the two tables in question so their row counts here are:
- Users table – 19000
- Posts table – 59000
SELECT U.Id, P.OwnerUserId, U.Reputation, P.OwnerUserId FROM dbo.Users AS U INNER JOIN dbo.Posts AS P ON U.Id = P.OwnerUserId OR U.Reputation = P.OwnerUserId
The keenest eye will see that this query isn’t actually useful at all really considering what we’re joining on, however, that’s not important for this task. You will notice that were joining on ID OR the Reputation field from the Users table. I know, it’s not pretty.
How does this query run? Well lets have a look.
In Ran Fine In Development
When we run this query, you might as well go get yourself a coffee and do a few chores around the house, as its not quick.
Finally after 3 minutes 40 seconds, the query finished. What took so long? Its such a simple query, right?
Well, looking at the execution plan, we can see that we took a whopping 2 minutes plus spooling data to TempDB:
This is explained a little more when we look at our logical reads in the messages tab. We did a huge amount of reads against our worktable within TempDB:
Why so many reads though? Well, again, the eagle eyed amongst you will have noticed that our friendly operator the Table Spool is utilised by a nested loop operator. Dare we look at the number of executions on the Table Spool?
So this operator executed 19000 times? Sad trombone.
Looking For A Way Out
So, we now know how ‘not’ to write our
silly demo query, but how can we improve it?
I have had to overcome this sort of logic several times now at work, despite feeding back into the development teams, I still see it every few months. The way I find most efficient is to utilise our friend the UNION ALL function.
See, the problem with our original query is obviously the OR operator within our JOIN. This can quite frankly confuse SQL and cause it to do some really bad decisions regarding how is best to run our query.
Of course, we want to remove that nasty table spool all together if possible. That took over 50% of our total run time.
Lets split that OR up into two seperate queries first. Here is the first half
SELECT U.Id, P.OwnerUserId, U.Reputation, P.OwnerUserId FROM dbo.Users AS U INNER JOIN dbo.Posts AS P ON U.Id = P.OwnerUserId
The first query simply joins the Users.Id on the Posts.OwnerUserID. Nothing else. This query runs in less than a second (78ms CPU time to be precise). Our plan is also looking healthier now. The nested loop operator has gone and it’s been replaced by a Hash Match join, which is better suited when looking at the record counts passing through the plan.
Lets check on those reads next…
Ok then, so far, so good. We are in a much better place. Next let’s do the same for our other half of the query
SELECT U.Id, P.OwnerUserId, U.Reputation, P.OwnerUserId FROM dbo.Users AS U INNER JOIN dbo.Posts AS P ON U.Reputation = P.OwnerUserId
The other half of our query runs in about the same time as the other half. So, we now have two very quick queries. Right, so lets put them together using the UNION ALL operator
SELECT U.Id, P.OwnerUserId, U.Reputation, P.OwnerUserId FROM dbo.Users AS U INNER JOIN dbo.Posts AS P ON U.Id = P.OwnerUserId UNION ALL SELECT U.Id, P.OwnerUserId, U.Reputation, P.OwnerUserId FROM dbo.Users AS U INNER JOIN dbo.Posts AS P ON U.Reputation = P.OwnerUserId
How does it run now I hear you ask? Insanely quickly, that’s how it runs. We clock up just 208ms elapsed time now. Looking at our plan, we have a nice plan shape which looks sensible. Not perfect, but a lot better than it did.
Looking at our reads in the messages tab, the biggest hitter is against our Posts table with 25,970, which for now is totally acceptable.
So, as we file this one under ‘how to confuse SQL Server’, what have we learned?
Using the logic of OR within our join made SQL do some silly things. Using this logic we invoked a table spool. We then spent a long two minutes writing the Posts.OwnerUserID to disk.
Once SQL had done that, we then looped through 1900 times matching up those OwnerUserIDs.
Even writing the process feels long and laborious.
Simplifying the query to do one of those joins at a time and then finally bringing those two result sets together at the end, totally removed the Table Spool operator. You are then left with a very fast and efficient query.
Thanks for reading.