I will be honest here, at one point I had become a little blind to seeing CXPACKET waits on my SQL instances. As a DBA having to work across lots of different instances, you will see this quite a bit (or at least I have).
Countless times I had heard phrases such as “Yeah, nothing to worry about there” and “that’s just indicating parallelism is happening”.
But, dear reader, that’s not exactly true. It took me far too long to realise the following concept and so I thought I would write a post about it.
Parallel queries are good, right?
As with most questions around SQL Server, the answer here is ‘it depends’. I’m going to try and demonstrate how it’s not always the best thing you want to see within your execution plans, in certain scenarios.
For this demo, we will be using an instance with 4 cores. The database we’re going to use will be the StackOverflow database (the 140GB one).
First, for the purposes of this demo, we will set the defaults for SQL Server regarding the MAXDOP and Cost Threshold For Parallelism. Out of the box, SQL Server will set these as:
- MAXDOP = 0
- Cost Threshold For Parallelism = 5
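Before changing anything, it’s worth checking what the instance is currently running with. A quick sketch using the standard sys.configurations catalog view:

```sql
-- Check the currently configured and in-use values
SELECT name, value, value_in_use
FROM sys.configurations
WHERE name IN (N'max degree of parallelism',
               N'cost threshold for parallelism');
```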
EXEC sys.sp_configure N'show advanced options', N'1';
RECONFIGURE WITH OVERRIDE;
EXEC sys.sp_configure N'cost threshold for parallelism', N'5';
EXEC sys.sp_configure N'max degree of parallelism', N'0';
RECONFIGURE WITH OVERRIDE;
GO
Next, we will set up a supporting index for our demo on the Users table
CREATE INDEX Location ON dbo.Users (Location);
GO
So that’s the setup sorted. Now we’re going to run a simple query to find which users reside in a specific location, from the Users table
SELECT u.Id, u.DisplayName FROM dbo.Users u WHERE u.Location = N'Netherlands'
The plan for this query looks as you would expect. We are seeking into our Location index to find the matched locations first. After that, we are performing a key lookup to go and find the DisplayNames associated with each user found. Nothing surprising here, yet.
As you will note, we have our parallel operator indicating that everything to the right of said operator went parallel. This is surely a good thing right?
Let’s take a look just how much this query went parallel. Digging into the properties for our index seek on Location, we will click into the Actual I/O Statistics > Actual Logical Reads:
Ok, so we went parallel, great. But hold on a second: only one thread actually read any rows (ignore Thread 0, that’s our coordinating thread managing all the others). Thread 2 was the only one doing any work here; the rest of our threads went and took a nap.
If we check the key lookup operator, what do you know, the same thread did all the work here too:
You may be thinking that it’s not that much of a big deal and within the context of this one tiny query, I would agree. This query still finishes in less than a second.
However, we have to be mindful when seeing that parallel operator that all may not be as it first seems. Sure, this query tells us that it went parallel across 4 threads, but only one of those threads read any rows; the rest simply had nothing to do and took a nap. Remember that for the next part.
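If you want to see how much of this your own instance has accrued, the wait stats DMV is the place to look. Note that on newer versions (SQL Server 2017 CU3 onwards) the benign “consumer” portion is split out into CXCONSUMER:

```sql
-- Cumulative parallelism-related waits since the last restart/clear
SELECT wait_type, waiting_tasks_count, wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type IN (N'CXPACKET', N'CXCONSUMER');
```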
Using the same query, we’re going to add an operator which requires a memory grant: a Sort.
SELECT u.Id, u.DisplayName FROM dbo.Users u WHERE u.Location = N'Netherlands' ORDER BY u.DisplayName;
As you would expect, our plan looks pretty similar, apart from the fact we now have a Sort operator in the mix
However, you may notice something else too. We also have that horrid yellow bang on the sort operator. Looking at the tooltip for the sort operator, we see this
This query had to spill to disk so that it could complete the operation. What? Why? This query returns just 8952 rows, why did we have to spill to disk?
The More You Know
The total memory grant for this entire query was 4.1MB.
We know that everything to the right of our parallelism operator went ‘parallel’ and that the parallelism was grossly uneven (waves at Mr. CXPACKET).
Let’s look at the memory allocation for the sort operator.
The memory grant was split perfectly evenly across ‘all’ four threads, which works out at just 936KB allocated per thread. Remember our diligent thread number 2 from earlier? That poor guy got just 936KB to handle ALL of the work, and we wasted 2.8MB of memory on the workers that were taking a nap (Threads 1, 3 and 4). This is why we had to spill to disk.
Rows read per thread is not linked to the amount of memory each thread gets. Memory was shared out across all threads evenly despite the fact that only thread 2 did all the work.
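You can watch this happening live while the query runs; the memory grants DMV shows how big the grant is, and the dop column confirms how many threads it is being carved up across. A rough sketch:

```sql
-- Active memory grants, including the degree of parallelism (dop)
SELECT session_id, dop, requested_memory_kb, granted_memory_kb, used_memory_kb
FROM sys.dm_exec_query_memory_grants;
```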
A Bad Work-Around
For the purposes of this demo, we can ‘correct’ this by doing the following, but who would do this, seriously. Disclaimer: Seriously, don’t do this as a solution
SELECT u.Id, u.DisplayName FROM dbo.Users u WHERE u.Location = N'Netherlands' ORDER BY u.DisplayName OPTION (MAXDOP 1);
Using the hint MAXDOP 1 within our query forces the plan to go serial. Let’s take another look at the thread distribution for rows read on the index seek
As expected, just one thread does all the work, but the difference is this time that SQL Server knows this plan did not run parallel because we told it not to. Let’s take a look at that memory distribution now on the Sort operator
Perfect, so now all the memory was allocated to the one thread, the only thread. Because of this, our Sort operator no longer spills to disk
Best Practice Is Normally For The Best
One of the first things I remember being taught as a DBA was to ensure you change SQL Server’s defaults for MAXDOP and Cost Threshold For Parallelism. However, at the time I wasn’t quite sure why.
I hope this post illustrates one of the many reasons why you should change these settings from their defaults to suit your workloads accordingly.
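As a sketch only (the right values depend entirely on your core count and workload, so the numbers below are illustrative, not a recommendation), raising the defaults might look something like this:

```sql
-- Illustrative values only: tune these for your own workload
EXEC sys.sp_configure N'cost threshold for parallelism', N'50';
EXEC sys.sp_configure N'max degree of parallelism', N'4';
RECONFIGURE;
```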
Parallel queries are not always for the best and we can see how having queries go parallel can sometimes bring about its own issues.
So, the next time you see that CXPACKET wait type, don’t just disregard it like I did so many times, it just might be hiding a nasty little surprise.
Thanks for reading.
Despite being a DBA for a few years now, I have only just come to appreciate the benefits indexed views can bring. If you have commonly aggregated data sets which are constantly requested, indexed views could be your new best friend.
Show Me What You Can Do
Let’s take a rather messy query, which has been completely and utterly fabricated by yours truly, purely for this post
USE StackOverflow2010;
GO
SELECT DAY(P.CreationDate),
       MONTH(P.CreationDate),
       YEAR(P.CreationDate),
       'http://stackoverflow.com/questions/' + RTRIM(P.Id) AS URL
FROM dbo.Posts P
GROUP BY DAY(P.CreationDate), MONTH(P.CreationDate), YEAR(P.CreationDate),
         'http://stackoverflow.com/questions/' + RTRIM(P.Id);
GO
As you can imagine, this is hardly Mr Speedy when it comes to run time.
The plan isn’t great, we have some stinky things going on here such as implicit conversions and expensive parallel operators (of course this was intentional for the purpose of the post):
So currently, we have a bit of a stinky query which clocks up around 15,923 milliseconds CPU time and 800,000 logical reads.
Let’s say, dear reader, for explanation purposes, that this query is linked to a front-end process: simply by clicking a button in our hypothetical application, the end-user can call this dataset anytime they wish. Scary, huh? I know it is, and it recently happened to me at my place of work.
View From The Afternoon
Something which I recently implemented at work was an indexed view, for such a requirement. Of course, the query was not based on the StackOverflow database, you’re just going to have to use your imagination.
What we can do, in times like these is utilise the power of an indexed view. What indexed views give us is a ‘materialised’ version of that dataset, anytime we need it. Of course, the data will get updated to reflect any changes to the underlying datasets.
Show Me All The Things
So let’s see what difference implementing an indexed view can do for us in this situation.
Let’s take our nasty little query and create a view out of it. In order for us to put an index on it, however, we’re going to have to create it with the option ‘WITH SCHEMABINDING’ which will lock this object down and not allow us to amend any of the associated tables without dropping this view first.
CREATE VIEW dbo.MyPointOfView
WITH SCHEMABINDING
AS
SELECT DAY(P.CreationDate) AS [DAY],
       MONTH(P.CreationDate) AS [MONTH],
       YEAR(P.CreationDate) AS [YEAR],
       'http://stackoverflow.com/questions/' + RTRIM(P.Id) AS URL,
       COUNT_BIG(*) AS [Big Old Count]
FROM dbo.Posts P
GROUP BY DAY(P.CreationDate), MONTH(P.CreationDate), YEAR(P.CreationDate),
         'http://stackoverflow.com/questions/' + RTRIM(P.Id)
GO
Now we have our view, let’s add an index to the shiny new view:
CREATE UNIQUE CLUSTERED INDEX IX_MyFavouriteIndex ON dbo.MyPointOfView(URL);
Right, so new view and a new index. Shall we see what happens when we try and retrieve that dataset?
SELECT [Day], [Month], [Year], [URL] FROM dbo.MyPointOfView
Now, just check out the CPU time and logical reads!
The plan shows that we are able to simply scan our dataset directly from the view we created:
The results speak for themselves, CPU time and logical reads are seriously reduced after utilising our indexed view. The before and after figures are quite impressive:
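One thing to be aware of: outside of Enterprise edition, the optimizer will not automatically match queries against an indexed view. You have to reference the view directly with the NOEXPAND hint to read from its clustered index rather than expanding the view definition:

```sql
-- On Standard edition, NOEXPAND is needed to read the materialised data
SELECT [Day], [Month], [Year], [URL]
FROM dbo.MyPointOfView WITH (NOEXPAND);
```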
As with most things in life, there are caveats. Would I blindly recommend using indexed views throughout your databases? Nope. There are quite a few gotchas with using indexed views. The most important ones, in my opinion, include:
- You cannot alter the underlying tables without dropping the view first
- Insert, update and delete activity on the underlying tables will take a hit. At the end of the day, we’re adding an index.
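Another gotcha worth mentioning: creating the index on the view (and modifying the underlying tables afterwards) requires a specific set of session SET options, otherwise the statements fail:

```sql
-- Required session settings for creating/maintaining indexed views
SET ANSI_NULLS ON;
SET ANSI_PADDING ON;
SET ANSI_WARNINGS ON;
SET ARITHABORT ON;
SET CONCAT_NULL_YIELDS_NULL ON;
SET QUOTED_IDENTIFIER ON;
SET NUMERIC_ROUNDABORT OFF;
```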
However, when the time is right, they can be a great tool to utilise. Try them out and test, test, test.
Thanks for reading.
Sometimes Bad Things Happen
I’ve seen it many times and I’m sure I will continue to. But sometimes, people do bad things in SQL Server. JOINs using ORs are one of them.
People are inherently lazy and I certainly fall under this umbrella at times. However, being lazy rarely helps the optimiser do its best work. Using OR conditions in your JOINs is bad news bears for the optimiser most of the time. Let’s see what happens…
Show and Tell
Let’s take a relatively simple query and run it against the StackOverflow2010 database.
For transparency purposes, I have trimmed down the size of the two tables in question so their row counts here are:
- Users table – 19000
- Posts table – 59000
SELECT U.Id, P.OwnerUserId, U.Reputation, P.OwnerUserId
FROM dbo.Users AS U
INNER JOIN dbo.Posts AS P
    ON U.Id = P.OwnerUserId
    OR U.Reputation = P.OwnerUserId;
The keenest eye will see that this query isn’t actually useful at all considering what we’re joining on; however, that’s not important for this task. You will notice that we’re joining on Id OR the Reputation field from the Users table. I know, it’s not pretty.
How does this query run? Well, let’s have a look.
It Ran Fine In Development
When we run this query, you might as well go get yourself a coffee and do a few chores around the house, as it’s not quick.
Finally, after 3 minutes 40 seconds, the query finished. What took so long? It’s such a simple query, right?
Well, looking at the execution plan, we can see that we took a whopping 2 minutes plus spooling data to TempDB:
This is explained a little more when we look at our logical reads in the messages tab. We did a huge amount of reads against our worktable within TempDB:
Why so many reads though? Well, again, the eagle eyed amongst you will have noticed that our friendly operator the Table Spool is utilised by a nested loop operator. Dare we look at the number of executions on the Table Spool?
So this operator executed 19000 times? Sad trombone.
Looking For A Way Out
So, we now know how ‘not’ to write our silly demo query, but how can we improve it?
I have had to overcome this sort of logic several times now at work; despite feeding back to the development teams, I still see it every few months. The approach I find most efficient is to utilise our friend the UNION ALL operator.
See, the problem with our original query is obviously the OR condition within our JOIN. This can quite frankly confuse SQL Server and cause it to make some really bad decisions regarding how best to run our query.
Of course, we want to remove that nasty table spool altogether if possible. That took over 50% of our total run time.
Let’s split that OR up into two separate queries first. Here is the first half:
SELECT U.Id, P.OwnerUserId, U.Reputation, P.OwnerUserId
FROM dbo.Users AS U
INNER JOIN dbo.Posts AS P
    ON U.Id = P.OwnerUserId;
The first query simply joins the Users.Id on the Posts.OwnerUserID. Nothing else. This query runs in less than a second (78ms CPU time to be precise). Our plan is also looking healthier now. The nested loop operator has gone and it’s been replaced by a Hash Match join, which is better suited when looking at the record counts passing through the plan.
Let’s check on those reads next…
Ok then, so far, so good. We are in a much better place. Next let’s do the same for our other half of the query
SELECT U.Id, P.OwnerUserId, U.Reputation, P.OwnerUserId
FROM dbo.Users AS U
INNER JOIN dbo.Posts AS P
    ON U.Reputation = P.OwnerUserId;
The second half of our query runs in about the same time as the first. So, we now have two very quick queries. Right, so let’s put them together using the UNION ALL operator:
SELECT U.Id, P.OwnerUserId, U.Reputation, P.OwnerUserId
FROM dbo.Users AS U
INNER JOIN dbo.Posts AS P
    ON U.Id = P.OwnerUserId
UNION ALL
SELECT U.Id, P.OwnerUserId, U.Reputation, P.OwnerUserId
FROM dbo.Users AS U
INNER JOIN dbo.Posts AS P
    ON U.Reputation = P.OwnerUserId;
How does it run now I hear you ask? Insanely quickly, that’s how it runs. We clock up just 208ms elapsed time now. Looking at our plan, we have a nice plan shape which looks sensible. Not perfect, but a lot better than it did.
Looking at our reads in the messages tab, the biggest hitter is against our Posts table with 25,970, which for now is totally acceptable.
So, as we file this one under ‘how to confuse SQL Server’, what have we learned?
Using the logic of OR within our join made SQL do some silly things. Using this logic we invoked a table spool. We then spent a long two minutes writing the Posts.OwnerUserID to disk.
Once SQL had done that, we then looped through 19,000 times matching up those OwnerUserIds.
Even writing the process feels long and laborious.
Simplifying the query to do one of those joins at a time and then finally bringing those two result sets together at the end, totally removed the Table Spool operator. You are then left with a very fast and efficient query.
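One caveat with this rewrite is worth calling out: UNION ALL keeps duplicates, so any row that satisfies both join predicates now appears twice, whereas the OR version returned it once. If that matters for your data, plain UNION restores the original semantics at the cost of a de-duplication step:

```sql
-- UNION (rather than UNION ALL) removes the duplicates that the OR
-- version would not have produced, at the cost of a distinct sort
SELECT U.Id, P.OwnerUserId, U.Reputation
FROM dbo.Users AS U
INNER JOIN dbo.Posts AS P ON U.Id = P.OwnerUserId
UNION
SELECT U.Id, P.OwnerUserId, U.Reputation
FROM dbo.Users AS U
INNER JOIN dbo.Posts AS P ON U.Reputation = P.OwnerUserId;
```

Test both forms against your data before choosing; the UNION version may well still be far faster than the original OR join, but verify the row counts match.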
Thanks for reading.
Nuts and Bolts
Deep within my little toolbox is this little trick. Firstly, let’s just set the record straight here: I don’t like little tricks, hacks and especially quick wins. However, with that said, sometimes you’re in a bind and you’ve got to get that query over the finishing line, because that line at your desk isn’t getting any shorter (it’s actually Teams messages, but you get the point).
So, with that said, this is a little trick I once learnt from the T-SQL master Itzik Ben-Gan, whilst on a training course he delivered.
Batch Mode: ON
So what is batch mode? Batch mode sometimes turns the dial up to eleven on your queries, especially ones which have a bunch of analytical functions within them. Batch mode allows the operators within your execution plan to work on ‘batches’ of rows instead of on a row-by-row basis, which, of course, can do wonders for performance.
Normally, SQL Server would only tread these murky waters if columnstore indexes were present on any of the tables involved. However, we can fool SQL Server into performing the same behaviours in a number of ways.
Show Me All The Things
So let us see this thing in action. To ‘trick’ SQL Server here, we are going to use a columnstore index. However, unlike traditional columnstore indexes, this index is going to be tiny in terms of footprint and won’t even get used. (I know that sounds strange, bear with me.)
First, let us turn on STATS IO and TIME. This allows us to verify any improvements we have made in terms of IO and CPU time
SET STATISTICS IO, TIME ON;
We will now execute a stored procedure which I have created. The contents of the stored procedure are not important here for this post. Ultimately, it just contains a bunch of aggregation functions which aren’t really doing much, other than erm, aggregating against the dbo.Comments table from the StackOverflow database.
USE StackOverflow2010; EXEC dbo.AllTheAggregations
Much like the content of the procedure, the results aren’t really important here either. However, let us have a look at the execution plan for this procedure
Nothing too crazy going on there as you can see. I will ignore the elephant in the room and choose not to mention the spill happening on that Hash Match, oh I just did.
Anyway, if we check a couple of the operators involved, we will see that they are all operating in row mode
Now let us do something. Let’s create a columnstore index on the dbo.Comments table
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCX_ZoomZoom
ON dbo.Comments (Id)
WHERE Id = -1 AND Id = -2;
As you will see, something looks a bit strange there. In order for anything to get through to this index, the ID would need to equal both -1 and -2, which of course will never happen. However, that doesn’t matter.
Now we have a “Columnstore index” on the Comments table, let’s run our procedure again
USE StackOverflow2010; EXEC dbo.AllTheAggregations
Now, let’s take a look at the execution plan too. Of course, we can see a couple of new operators involved now.
However, let’s have a look at the execution mode now for a couple of the operators
Interesting: most of our operators are now executing in batch mode. This is pretty cool and brings a great improvement to the stored procedure we have been using. Let us compare the output of the statistics IO and TIME.
As you can see above, we have about a 40% improvement on CPU time alone, just from adding our columnstore index that will never even get populated or used. Not a bad improvement, in anyone’s opinion.
Having that columnstore index on the Comments table has enabled SQL Server to use batch mode execution across a few of the operators seen in our execution plan. This enables much more efficient processing of the data than the row mode executions we saw earlier: batches of rows (up to around 900 at a time) can be processed simultaneously.
We can verify just how cheap our decoy index is using sp_BlitzIndex:
EXEC sp_BlitzIndex @DatabaseName = 'StackOverflow2010', @SchemaName = 'dbo', @TableName = 'Comments';
As you can see above, our index NCCX_ZoomZoom has an insanely small footprint at 0.1MB. Additionally, we can see that it has 0 scans/seeks, 0 deletes and 0 updates. Nice.
When all is said and done
Of course, the differences in the above example are not gigantic; however, I hope it gives a picture of what enabling batch mode can sometimes achieve. I have had scenarios at work where enabling batch mode in such ways has taken literally an hour or two off query run times. Sure, this is not something I would advocate using often, but when you need to get something running better (or running at all) until you can implement a proper solution, it’s good to have a trick or two up your sleeve such as this one.
There are numerous ways to achieve such behaviour within SQL Server. For more examples, I would recommend checking out the excellent Kendra Little blog here
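Worth noting too: if you are on SQL Server 2019 or later, batch mode on rowstore can kick in without any columnstore trickery at all, provided the database is at compatibility level 150 or higher:

```sql
-- SQL Server 2019+: batch mode on rowstore requires compat level 150+
ALTER DATABASE StackOverflow2010 SET COMPATIBILITY_LEVEL = 150;
```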
Thanks for reading
What are extended events?
Extended Events first appeared with the release of SQL Server 2008.
As a successor to the bulky and somewhat resource-intensive SQL Profiler and SQL Trace tools, Extended Events is a lightweight framework which can be used as and when required to gather useful metrics and information in a far less intrusive manner, thanks to its architecture.
Typical use cases for Extended Events would include:
- Troubleshooting deadlocks
- Identifying poor performing queries
- Capturing key wait events
- Capturing TempDB spills
- Collecting general system activity
Extended Events Architecture
Extended Events use an asynchronous model to minimise the impact placed upon SQL Server. This means that the capture of the metrics is a separate process from the filtering and processing of the data. This is different from its predecessors, SQL Profiler and SQL Trace.
Extended Events can even be integrated with the ETW (Event Tracing for Windows) logging functionality to correlate with data outside of SQL Server. At a high level, Extended Events consists of the following components:
- Package – Container for Extended Events objects:
  - Package0 – Extended Events system objects. This is the default package
  - sqlos – SQL Server Operating System related objects
  - sqlserver – SQL Server related objects
- Event – Events are monitoring points that trigger when a point is reached within the database engine’s code base. At the point at which an event fires, it can contain useful information regarding the system state, which can then be used for tracing purposes and/or triggering further actions. Events have a specific categorisation, similar to ETW logging events, defined by the following properties:
  - Channel – The channel identifies the target audience for an event (Admin, Operational, Analytic and Debug)
  - Keyword – Application-specific categories that make it easier to specify and retrieve specific events
- Predicate – Predicates define the boolean conditions for when an event should fire
- Action – Actions allow us to collect additional information when an event is triggered. Actions only occur after a predicate has been evaluated and an event is going to fire. Actions can bring an overhead with them, as they function synchronously on the same thread as the event in question.
- Targets – Targets consume event data after the event has fired. Targets are used to store event data, analyse/aggregate it or start a task. The following targets exist:
  - Event Bucketing
  - Event File
  - Event Pairing
  - Event Tracing for Windows
  - Ring Buffer
  - Synchronous Event Counter
- Maps – Maps are collectors that convert integer values from an internal map into readable text values
- Types – Data dictionary of data types used by Extended Events
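The packages and events described above can all be browsed through the Extended Events DMVs, which is handy when hunting for an event name. For example, to find everything deadlock-related:

```sql
-- List all deadlock-related events and the package they ship in
SELECT p.name AS package_name, o.name AS event_name, o.description
FROM sys.dm_xe_objects AS o
JOIN sys.dm_xe_packages AS p ON o.package_guid = p.guid
WHERE o.object_type = N'event'
  AND o.name LIKE N'%deadlock%';
```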
Demo – Using Extended Events To Capture Deadlock Information
We will now go through a demonstration to show how we can use Extended Events to capture key deadlock information. Capturing detailed deadlock information is very much a common use case for Extended Events. Let’s give it a go ourselves.
The first step is to create ourselves a new session within Management > Extended Events > Sessions:
We are now presented with the new session window. Here, we have a few options which we need to be aware of. First, we need to give our session a relevant name. Additionally, we will select the options to start the session at server startup and to start the session immediately after the session has been created:
Next, we need to select Events from the left side menu options. This is where we specify which events we wish to capture. Within the Event Library search field, we will enter the text “deadlock” to filter down the events accordingly:
In order to select the events, we need to highlight the required one and click the arrow to move it into the Select events window. For this demo, we will select the three following events:
- lock_deadlock – This event fires when an attempt to acquire a lock is cancelled for the victim of a deadlock
- lock_deadlock_chain – This event fires when an attempt to acquire a lock generates a deadlock. It is raised for each participant in the deadlock
- xml_deadlock_report – Produces a deadlock report in XML format
Once we have our three events within the ‘Selected Events’ window, highlight all three and select the configure button. This will allow us to configure options for our events. On the Global fields (Action) tab, scroll down and tick the sql_text option:
We now need to select the Data Storage option on the left side menu. Within this section, we will specify our target(s) for the event data. For this example, we will select event_file from the drop down menu for the Type column. Additionally, we can browse to a file location to save the file to and give our file a relevant name:
We can now select the OK button to finalise the setup. Once we have done this, we will be able to see our newly created session listed under Sessions, along with event file which we chose as our target:
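Everything we just clicked through can also be scripted. A sketch of the equivalent T-SQL (the session name and file path below are my own placeholders, not values from the wizard):

```sql
-- Session name and filename are illustrative placeholders
CREATE EVENT SESSION [CaptureDeadlocks] ON SERVER
    ADD EVENT sqlserver.lock_deadlock (ACTION (sqlserver.sql_text)),
    ADD EVENT sqlserver.lock_deadlock_chain (ACTION (sqlserver.sql_text)),
    ADD EVENT sqlserver.xml_deadlock_report (ACTION (sqlserver.sql_text))
    ADD TARGET package0.event_file (SET filename = N'C:\Temp\CaptureDeadlocks.xel')
WITH (STARTUP_STATE = ON);

ALTER EVENT SESSION [CaptureDeadlocks] ON SERVER STATE = START;
```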
Now that we have our new session running, we need to generate a deadlock. We are not going to go into ‘how to create a deadlock’ here in this post, however, should you need an example of this, check out Brent Ozar’s post here which gives a great example.
After we have experienced (created, in this instance) a deadlock, we can use our session to identify what happened. Let’s open the session data by opening the event file:
Once we open the file, we will be presented with the window shown below. As you can see, we have many events detailed:
So what information have we been able to capture? Let’s take a look.
lock_deadlock_chain – As you can see, we have two entries for this. Each entry details the sessions which were involved in the deadlock, in our example we can see that both sessions 59 and 60 were involved in the deadlock:
lock_deadlock – this event details information related to the victim chosen when the deadlock occurred:
xml_deadlock_report – this event allows us to capture the xml data relating to the deadlock. Of course, using the captured xml data will prove very useful when trying to diagnose the cause of such deadlocks. Here is the xml data represented as a deadlock graph:
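If you prefer T-SQL over the viewer, the same event file can be read back with sys.fn_xe_file_target_read_file (the path below is a placeholder matching wherever you saved your target file):

```sql
-- Pull each captured event back out of the .xel file as XML
SELECT CAST(event_data AS XML) AS event_xml
FROM sys.fn_xe_file_target_read_file(N'C:\Temp\CaptureDeadlocks*.xel',
                                     NULL, NULL, NULL);
```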
Extended Events – Conclusions So Far
Of course, we have only briefly touched on Extended Events within this blog post. The sheer number of events which can be monitored and captured is breathtaking. Add in the long list of actions and targets to select from too, and it can all get a bit daunting. However, with that said, it’s all very exciting too. Once I get well versed with Extended Events, I’m sure they will quickly become a daily staple in my SQL toolbox. I know for many DBAs, myself included, they have been an area which has been ignored and almost forgotten about at times. However, after researching some of the topics discussed in this post, I know how useful they will prove to be. So, bye bye SQL Profiler, there is a new guy in town (at least for me).
p.s. I didn’t really use SQL Profiler
This year, I was lucky enough to attend SQLBits 2019 in Manchester, England.
This was my first time at SQLBits, and I was keen to see what it had to offer compared to other SQL events I have been to, such as Data Relay.
Through work, I signed up for the Thursday training day. The session I chose to attend was the brilliant Erik Darling’s session called Total Server Performance Tuning. Erik has recently departed Brent Ozar’s consultancy and started his own company, Erik Darling Data. His blog posts are incredible, and you can find his work here and here.
After the trek up to Manchester (hello roadworks!), we finally got to the venue, the Manchester Central Convention Complex.
We were first greeted in order to pick up our name badges; once that was done, we were handed an awesome goodie bag, full of promotional treats.
The general area for the stands was quite impressive, with all the big SQL Server names there to see: Microsoft (obviously), Redgate and Sentry One, to name a few.
After a quick browse of the stands and a chat to a few guys at the Redgate stand, we continued onto the training rooms.
I initially met Erik at the entrance, who was extremely welcoming and thankful that I had chosen his session to attend. For someone whose work I admire greatly, this was awesome.
The day’s session worked out like this:
- Hardware considerations from a performance point of view
- Resource waits, including parallelism and memory operators
- More query tuning
Overall, the training session was of an extremely high standard (duh), with relevant real-world solutions to real-world problems (there’s a tagline). Erik’s sense of humour also helped with the proceedings.
But seriously, the content was great and I’m looking forward to re-reviewing the slides and demo scripts which Erik kindly made available to everyone in the session. The query tuning sessions were the best, in my opinion. Some of the information presented gave me that ‘oh, now I get it’ moment countless times. Key principles concerning execution plans were presented in a really simple, straightforward way. Not an easy feat, given the subject matter.
I will hopefully do a bit more of a technical post regarding the demo scripts which Erik provided. I intend to have a deep dive into the examples he presented; I think trying to understand these fully would make a great blog post.
Thanks for reading.