Search This Blog

Friday, September 22, 2023

A Study in SQL Server Ad hoc Query Plans

A Study in SQL Server Ad hoc Query Plans

We often think of ad hoc query plans, perhaps due to a dismissive attitude, as single-use plans that won't be reused frequently or at least not in the very near future. By that time, the plan might have already been evicted from the cache by SQL Server. However, what if a significant portion of our overall workload consists of such queries? Let's say, for the sake of argument, more than 50%. In this scenario, caching these execution plans would simply waste SQL Server's memory, and the server's memory in general. Some might argue this can bloat the plan cache.

As an illustration, consider a real-world scenario where the memory consumed by these query plans is a staggering 30GB!


SELECT 
    instance_name AS name,
    cntr_value / 128 AS pages_mb,
    cntr_value / 128 / 1024 AS pages_gb
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Cache Pages'
ORDER BY pages_mb DESC;







Next, I fire up a slight variation of the query to see the type of workloads using the most memory:

;WITH cte_plan_cache_usage_by_obj_type AS 
(
	SELECT COALESCE(objtype, '-- ALL Plans') objtype,
           Sum(Cast(size_in_bytes AS BIGINT)) / 
		              1024 / 1024 size_mb
    FROM   sys.dm_exec_cached_plans
    GROUP  BY rollup ( objtype )
)
SELECT objtype,
       size_mb,
	   'percent' = 
		   ( size_mb * 100 ) / 
                       (SELECT size_mb
			FROM   cte_plan_cache_usage_by_obj_type
			WHERE  objtype = '-- ALL Plans'
		   ) 
FROM   cte_plan_cache_usage_by_obj_type
ORDER  BY size_mb DESC









Wow, the ad hoc queries are using a staggering 94%!

This workload is predominantly ad hoc. And no, I am not in a position to instruct the application team to modify their queries, at least not in the short term. To paraphrase a famous quote:

“In the long run we will be all dead”
- John Maynard Keynes

So, any improvements must be made on the backend SQL Server.

Fortunately, for these types of workloads, SQL Server offers a configurable setting called “optimize for ad hoc workloads”. This setting helps limit or reduce the memory used by single-use ad hoc queries. The key here is "single-use". How does it work? When this setting is enabled, the optimizer doesn't cache the full execution plan for an ad hoc query initially. Instead, it caches only a much smaller plan "stub" along with its query hash and plan hash. This plan "stub" isn't a full execution plan, so it's not reusable for subsequent runs of the same query. However, if that same query is executed again soon after, the optimizer will then cache the reusable, full execution plan. In environments where single-use, ad hoc queries constitute a large portion of the workload, this setting can save a significant amount of memory, freeing it up for other tasks.

Note: For two ad-hoc queries to be reused in SQL Server, they must have an exact text match. This match is sensitive to both case and spacing, even on servers that are case-insensitive.


By default, this setting is turned off. However, I occasionally encounter situations where I find it beneficial to enable it. Why? Because memory in SQL Server is valuable. Perhaps an even more compelling reason is that many SQL servers have a limited memory allocation. As much as I'd like to, I can't convince myself, let alone my manager, to allocate 2TB of RAM to every server.

In the example I presented, the 30GB plan cache might appear excessive without understanding the context. Heck, most of my SQL servers don't even have that much total memory. To shed some light, here's some context regarding this SQL Server:


SQL Version: 

Microsoft SQL Server 2019 (RTM-CU18) (KB5017593) - 15.0.4261.1 (X64)

Enterprise Edition: Core-based Licensing (64-bit)
Windows Server 2019 Standard 10.0 <X64> (Build 17763: ) (Hypervisor)

Database Size is 60TB

Application: Bigdata/Data Warehouse/Analytical Reports

CPU and Memory Configuration:

(MEMORY INFO IS IN GIGABYTES)




While a MAXDOP of 8 is the recommended best practice, I'm skeptical whether it's the optimal setting for this SQL Server, particularly considering its primarily analytical workload and 112 cores. It might be worthwhile to adjust this value dynamically based on the time of day and expected workload. For example, if there's an extensive ETL job running every night for 5 hours, designed for exclusive use of the entire server, thereby blocking or prohibiting other users, then that ETL job might benefit from a higher MAXDOP value. I'll explore this in more detail later on.


In this context, where the SQL instance has more than 2TB of RAM available, the 30GB cache size represents less than 1.5%. This percentage is, on average, lower than that of most other SQL Servers. However, in absolute terms, it's still significant, so it's worth investigating.

Let's first check what the current setting is for 'optimize for ad 'hoc workloads'.







It's not enabled, which is the default, so it remains in the turned-off state. Ordinarily, I wouldn't spend more time investigating this; I would simply enable the setting. However, in this case, I've been hired solely to investigate and make recommendations. I'm not authorized to implement any changes myself.

Therefore, I need to delve deeper into my investigation. But before proceeding, let's discuss: what precisely defines an execution plan as 'ad hoc'?

As I highlighted at the outset, some of us often view an ad hoc plan or query as something executed only once. This perception is only partially accurate, especially when considering how SQL Server categorizes a query plan as 'ad hoc' or otherwise in its cache: 
“Adhoc: Ad hoc query. Refers to Transact-SQL submitted as language events by using osql or sqlcmd instead of as remote procedure calls”

That description is straight from this MS article:



Even though it doesn't explicitly mention SSMS, queries executed from SSMS are also labeled as ad hoc by the optimizer.

It's not the frequency of a query's execution that determines its classification as an ad hoc query. However, the number of query executions does influence whether the "optimize for ad hoc workloads" setting caches the full execution plan or just a stub.

In practical terms, ad hoc queries are those that either lack parameters or cannot be auto-parameterized by the optimizer and are not part of a database object (like stored procedures, functions, views, etc.). Consider, for instance, this simple query:

SELECT NAME FROM SYS.DATABASES DBS WHERE  NAME = 'tempdb';


Observe that the query utilizes a constant value 'tempdb' in the WHERE clause, rather than a variable or parameter. Even so, the optimizer can—and often will—attempt to parameterize this query if it's included within a stored procedure or sent as a prepared SQL query. Interactive client tools like sqlcmd, osql, SSMS, and others don't execute such queries as prepared statements. As a result, the optimizer labels them as ad hoc queries and caches their execution plans accordingly. If you have processes that frequently run certain queries using such tools, consider using sp_executesql. This way, they're cached as prepared plans, which can be reused to enhance query performance.

Now, let's examine the subsequent example where the same SQL statement is executed in three distinct manners to obtain identical results:


-- Ad hoc query
SELECT NAME FROM sys.databases dbs WHERE  NAME = 'tempdb'; 
GO
-- Prepared SQL statement
EXEC Sp_executesql N'SELECT NAME FROM sys.databases dbs WHERE  NAME = @db_name',
                   N'@db_name sysname',
                   N'tempdb' 
GO
-- Stored Procedure
IF Object_id('sp_test_adhoc_plans_cache', 'P') IS NOT NULL
  DROP PROCEDURE sp_test_adhoc_plans_cache
GO
CREATE PROCEDURE sp_test_adhoc_plans_cache
(@db_name SYSNAME = 'tempdb')
AS
    SELECT NAME FROM sys.databases dbs WHERE  NAME = @db_name;
GO

EXEC sp_test_adhoc_plans_cache
GO


Now, observe how the optimizer processed and cached those queries:


SELECT cp.plan_handle,
       cp.objtype,
       cp.cacheobjtype,
       cp.size_in_bytes,
       cp.usecounts,
       qt.text,
       qs.query_hash,
       qs.query_plan_hash
FROM   sys.dm_exec_cached_plans cp
       inner join sys.dm_exec_query_stats qs on qs.plan_handle = cp.plan_handle
       CROSS apply sys.Dm_exec_sql_text(cp.plan_handle) qt
WHERE  qt.text LIKE '%SELECT NAME FROM sys.databases dbs WHERE  NAME =%'
       AND qt.text NOT LIKE 'SELECT cp.plan_handle%'


For clearer visibility of its content, I've divided the results into two screenshots:







Observe that the query hash and plan hash values for them are identical. However, the plan handles and objtype differ, even for the same input value, 'tempdb'. While the cached plan size for the ad hoc query is considerably smaller than for the other plans, it remains larger than it would be if 'optimize for ad hoc workloads' were enabled. Let's confirm that:

EXEC sp_configure 'optimize for ad hoc workloads', 1;

reconfigure with override;


Let's remove the already cached plans:

DBCC FREEPROCCACHE (0x060001002C0F151A10D0B0C3D30100000100000000000000000000000000000);
DBCC FREEPROCCACHE (0x060001008A0FCA2810A6C9B5D3010000010000000000000000000000000000);
DBCC FREEPROCCACHE (0x05000100D563663F50416C7CE30100000100000000000000000000000000000);


Rerun the same three queries and examine the cache information:



Note that the plan_handle values differ from those in the previous screenshot, as I cleared the earlier plans from the cache.


Observe the size of the ad hoc plan: merely 456 bytes, with the cacheobjtype listed as 'Compiled Plan Stub'. That's correct – it's only a stub. It doesn't contain an execution plan, so technically, there's no cached plan available for reuse. However, should the same ad hoc query be executed again in the near future, the optimizer will generate a full execution plan for it, which can be reused by subsequent invocations of the same query. Let's validate that:

-- Ad hoc query

SELECT NAME FROM sys.databases dbs WHERE  NAME = 'tempdb';
GO

Here is the info from sys.dm_exec_cached_plans after running the ad hoc query the second time:





The cacheobjtype for the Adhoc plan is no longer a tub.

In summary, when 'optimize for ad hoc workloads' is enabled:

  • The SQL Server optimizer doesn't immediately cache the full execution plan for ad hoc queries. To clarify, this setting only impacts queries that the optimizer classifies as Ad Hoc.

  • The optimizer will cache the plan stub, accompanied by the query and plan hash values. This allows it to recognize if the same query is re-executed. In such an instance, it will replace the stub with the complete execution plan.

Given this information, would I recommend enabling the 'optimize for ad hoc workloads' setting? Given that over 90% of the workload is ad hoc, the risks seem minimal. However, I'd like to examine one more piece of data. The following query displays the aggregate memory usage for:

  • Ad hoc query plans
  • Plans that are not stubs
  • Plans with a use count of 2 or fewer
  • Plans not used by a query in over 5 hours

;with cte_cached_adhoc_plans as
(
SELECT plan_handle,
       MAX(last_execution_time) last_execution_time,
       SUM(execution_count)     execution_count
FROM   sys.dm_exec_query_stats
GROUP  BY plan_handle 
)
SELECT	COUNT(*) [Plan Count],
		SUM(CAST(size_in_bytes AS BIGINT)) / 
		              1024 / 1024 [Size MB],
		MAX(cte_cached_adhoc_plans.execution_count) [Max Exec Count],
		SUM(cte_cached_adhoc_plans.execution_count) [Total Exec Count]

FROM       sys.dm_exec_cached_plans cp 
INNER JOIN cte_cached_adhoc_plans ON cte_cached_adhoc_plans.plan_handle = cp.plan_handle 
WHERE cp.objtype = 'Adhoc'
  AND cte_cached_adhoc_plans.last_execution_time < DATEADD(HOUR, -5, GETDATE())
  AND cp.objtype != 'Compiled Plan Stub'
  AND cte_cached_adhoc_plans.execution_count <= 2











That's nearly 28GB. In this case, I would recommend enabling the 'optimize for ad hoc workloads' setting. However, like any recommendation involving configuration changes, it comes with caveats. Exercising caution is essential, as any change can have potentially unintended and sometimes adverse effects. The organization should have a robust change control procedure in place, complete with a back-out plan.

As an alternative, the following query can be used to generate a script to remove individual ad hoc query plans from the cache. Please tailor the filter conditions to fit your needs. This script could be scheduled to run at appropriate times, perhaps after peak business hours and just before nightly ETL or maintenance tasks commence.


;with cte_cached_adhoc_plans as
(
SELECT plan_handle,
       MAX(last_execution_time) last_execution_time,
       SUM(execution_count)     execution_count
FROM   sys.dm_exec_query_stats
GROUP  BY plan_handle 
)
SELECT	TOP 1000
	[Remove Cached Plan] = 
	'DBCC FREEPROCCACHE (0x' + convert(varchar(max), cte_cached_adhoc_plans.plan_handle, 2) + ');' 
FROM       sys.dm_exec_cached_plans cp 
INNER JOIN cte_cached_adhoc_plans ON cte_cached_adhoc_plans.plan_handle = cp.plan_handle 
WHERE cp.objtype = 'Adhoc'
  AND cte_cached_adhoc_plans.last_execution_time < DATEADD(HOUR, -5, GETDATE())
  AND cp.objtype != 'Compiled Plan Stub'
  AND cte_cached_adhoc_plans.execution_count <= 2








In conclusion, as we're aware, SQL Server will evict cached plans and cached data pages from the buffer cache as necessary, using its continually evolving algorithm. So, why not trust its judgment? I generally concur, but with a caveat. The default settings and behaviors are effective for many SQL installations in a majority of scenarios. However, they don't cater to every SQL Server instance or every situation.


P.S.: As a huge fan of Sherlock Holmes stories, the title of this post pays homage to the famous 'A Study in Scarlet' novel.