We often think of ad hoc query plans, perhaps due to my own dismissive attitude towards it, as single-use plans that won't be reused frequently or at least not in the very near future. By that time, the plan might have already been evicted from the cache by SQL Server. However, what if a significant portion of our overall workload consists of such queries? Let's say, for the sake of argument, more than 50%. In this scenario, caching these execution plans would simply waste SQL Server's memory, and the server's memory in general. Some might argue this can bloat the plan cache.
As an illustration, consider a real-world scenario where the memory consumed by these query plans is a staggering 30GB!
SELECT
instance_name AS name,
cntr_value /128AS pages_mb,
cntr_value /128/1024AS pages_gb
FROM sys.dm_os_performance_counters
WHERE counter_name ='Cache Pages'ORDERBY pages_mb DESC;
Next, I fire up a slight variation of the query to see the type of workloads using the most memory:
;WITH cte_plan_cache_usage_by_obj_type AS
(
SELECT COALESCE(objtype, '-- ALL Plans') objtype,
Sum(Cast(size_in_bytes ASBIGINT)) /1024/1024 size_mb
FROM sys.dm_exec_cached_plans
GROUPBYrollup ( objtype )
)
SELECT objtype,
size_mb,
'percent'=
( size_mb *100 ) /
(SELECT size_mb
FROM cte_plan_cache_usage_by_obj_type
WHERE objtype ='-- ALL Plans'
)
FROM cte_plan_cache_usage_by_obj_type
ORDERBY size_mb DESC
Wow, the ad hoc queries are using a staggering 94%!
This workload is predominantly ad hoc. And no, I am not in a position to instruct the application team to modify their queries, at least not in the short term. To paraphrase a famous quote:
“In the long run we will be all dead”
- John Maynard Keynes
So for now, any improvements must be made on the backend SQL Server.
Fortunately, for these types of workloads, SQL Server offers a configurable setting called “optimize for ad hoc workloads”. This setting helps limit or reduce the memory used by single-use ad hoc queries. The key here is "single-use". How does it work? When this setting is enabled, the optimizer doesn't cache the full execution plan for an ad hoc query initially. Instead, it caches only a much smaller plan "stub" along with its query hash and plan hash. This plan "stub" isn't a full execution plan, so it's not reusable for subsequent runs of the same query. However, if that same query is executed again soon after, the optimizer will then cache the reusable, full execution plan. Now if you have a very long running analytical query that runs once a day, should its full execution plan be cached? My qualified answer would be no, because its very likely that cached plan would be cached out by next day anyways.
In environments where single-use, ad hoc queries constitute a large portion of the workload, this setting can save a significant amount of memory, freeing it up for other tasks.
Note: For two ad-hoc queries to be reused in SQL Server, they must have an exact text match. This match is sensitive to both case and spacing, even on servers that are case-insensitive.
By default, this setting is turned off. However, I occasionally encounter situations where I find it beneficial to enable it. Why? Because memory in SQL Server is valuable. Perhaps an even more compelling reason is that many SQL servers have a limited memory allocation. As much as I'd like to, I can't convince myself, let alone my manager, to allocate 2TB of RAM to every server.
In the example I presented, the 30GB plan cache might appear excessive without understanding the context. Heck, most of my SQL servers don't even have that much total memory. To shed some light, here's some context regarding this SQL Server:
SQL Version:
Microsoft SQL Server 2019 (RTM-CU18)
(KB5017593) - 15.0.4261.1 (X64)
Enterprise Edition: Core-based
Licensing (64-bit)
Windows Server 2019 Standard 10.0
<X64> (Build 17763: ) (Hypervisor)
While a MAXDOP of 8 is the recommended best practice, I'm skeptical whether it's the optimal setting for this SQL Server, particularly considering its primarily analytical workload and 112 cores. It might be worthwhile to adjust this value dynamically based on the time of day and expected workload. For example, if there's an extensive ETL job running every night for 5 hours, designed for exclusive use of the entire server, thereby blocking or prohibiting other users, then that ETL job might benefit from a higher MAXDOP value. I'll explore this in more detail later on.
In this context, where the SQL instance has more than 2TB of RAM available, the 30GB cache size represents less than 1.5%. This percentage is, on average, lower than that of most other SQL Servers. However, in absolute terms, it's still significant, so it's worth investigating.
Let's first check what the current setting is for 'optimize for ad 'hoc workloads'.
It's not enabled, which is the default, so it remains in the turned-off state. Ordinarily, I wouldn't spend more time investigating this; I would simply enable the setting. However, in this case, I've been hired solely to investigate and make recommendations. I'm not authorized to implement any changes myself.
Therefore, I need to delve deeper into my investigation. But before proceeding, let's discuss: what precisely defines an execution plan as 'ad hoc'?
As I highlighted at the outset, some of us often view an ad hoc plan or query as something executed only once. This perception is only partially accurate, especially when considering how SQL Server categorizes a query plan as 'ad hoc' or otherwise in its cache:
“Adhoc: Ad hoc query. Refers to Transact-SQL submitted as language events by using osql or sqlcmd instead of as remote procedure calls”
That description is straight from this MS article:
Even though it doesn't explicitly mention SSMS, queries executed from SSMS are also labeled as ad hoc by the optimizer.
It's not the frequency of a query's execution that determines its classification as an ad hoc query. However, the number of query executions does influence whether the "optimize for ad hoc workloads" setting caches the full execution plan or just a stub.
In practical terms, ad hoc queries are those that either lack parameters or cannot be auto-parameterized by the optimizer and are not part of a database object (like stored procedures, functions, views, etc.). Consider, for instance, this simple query:
Observe that the query utilizes a constant value 'tempdb' in the WHERE clause, rather than a variable or parameter. Even so, the optimizer can—and often will—attempt to parameterize this query if it's included within a stored procedure or sent as a prepared SQL query. Interactive client tools like sqlcmd, osql, SSMS, and others don't execute such queries as prepared statements. As a result, the optimizer labels them as ad hoc queries and caches their execution plans accordingly. If you have processes that frequently run certain queries using such tools, consider using sp_executesql. This way, they're cached as prepared plans, which can be reused to enhance query performance.
Now, let's examine the subsequent example where the same SQL statement is executed in three distinct manners to obtain identical results:
-- Ad hoc querySELECT NAME FROM sys.databases dbs WHERE NAME ='tempdb';
GO-- Prepared SQL statementEXEC Sp_executesql N'SELECT NAME FROM sys.databases dbs WHERE NAME = @db_name',
N'@db_name sysname',
N'tempdb'GO-- Stored Procedure
IF Object_id('sp_test_adhoc_plans_cache', 'P') ISNOTNULLDROPPROCEDURE sp_test_adhoc_plans_cache
GOCREATEPROCEDURE sp_test_adhoc_plans_cache
(@db_name SYSNAME ='tempdb')
ASSELECT NAME FROM sys.databases dbs WHERE NAME =@db_name;
GOEXEC sp_test_adhoc_plans_cache
GO
Now, observe how the optimizer processed and cached those queries:
SELECT cp.plan_handle,
cp.objtype,
cp.cacheobjtype,
cp.size_in_bytes,
cp.usecounts,
qt.text,
qs.query_hash,
qs.query_plan_hash
FROM sys.dm_exec_cached_plans cp
innerjoin sys.dm_exec_query_stats qs on qs.plan_handle = cp.plan_handle
CROSS apply sys.Dm_exec_sql_text(cp.plan_handle) qt
WHERE qt.textLIKE'%SELECT NAME FROM sys.databases dbs WHERE NAME =%'AND qt.textNOTLIKE'SELECT cp.plan_handle%'
For clearer visibility of its content, I've divided the results into two screenshots:
Observe that the query hash and plan hash values for them are identical. However, the plan handles and objtype differ, even for the same input value, 'tempdb'. While the cached plan size for the ad hoc query is considerably smaller than for the other plans, it remains larger than it would be if 'optimize for ad hoc workloads' were enabled. Let's confirm that:
EXECsp_configure'optimize for ad
hoc workloads', 1;
Rerun the same three queries and examine the cache information:
Note that the plan_handle values differ from those in the previous screenshot, as I cleared the earlier plans from the cache.
Observe the size of the ad hoc plan: merely 456 bytes, with the cacheobjtype listed as 'Compiled Plan Stub'. That's correct – it's only a stub. It doesn't contain an execution plan, so technically, there's no cached plan available for reuse. However, should the same ad hoc query be executed again in the near future, the optimizer will generate a full execution plan for it, which can be reused by subsequent invocations of the same query. Let's validate that:
Here is the info from sys.dm_exec_cached_plans after running the ad hoc query the second time:
The cacheobjtype for the Adhoc plan is no longer a tub.
In summary, when 'optimize for ad hoc workloads' is enabled:
The SQL Server optimizer doesn't immediately cache the full execution plan for ad hoc queries. To clarify, this setting only impacts queries that the optimizer classifies as Ad Hoc.
The optimizer will cache the plan stub, accompanied by the query and plan hash values. This allows it to recognize if the same query is re-executed. In such an instance, it will replace the stub with the complete execution plan.
Given this information, would I recommend enabling the 'optimize for ad hoc workloads' setting? Given that over 90% of the workload is ad hoc, the risks seem minimal. However, I'd like to examine one more piece of data. The following query displays the aggregate memory usage for:
Ad hoc query plans
Plans that are not stubs
Plans with a use count of 2 or fewer
Plans not used by a query in over 5 hours
;with cte_cached_adhoc_plans as
(
SELECT plan_handle,
MAX(last_execution_time) last_execution_time,
SUM(execution_count) execution_count
FROM sys.dm_exec_query_stats
GROUPBY plan_handle
)
SELECTCOUNT(*) [Plan Count],
SUM(CAST(size_in_bytes ASBIGINT)) /1024/1024 [Size MB],
MAX(cte_cached_adhoc_plans.execution_count) [MaxExecCount],
SUM(cte_cached_adhoc_plans.execution_count) [Total ExecCount]
FROM sys.dm_exec_cached_plans cp
INNERJOIN cte_cached_adhoc_plans ON cte_cached_adhoc_plans.plan_handle = cp.plan_handle
WHERE cp.objtype ='Adhoc'AND cte_cached_adhoc_plans.last_execution_time < DATEADD(HOUR, -5, GETDATE())
AND cp.objtype !='Compiled Plan Stub'AND cte_cached_adhoc_plans.execution_count <=2
That's nearly 28GB. In this case, I would recommend enabling the 'optimize for ad hoc workloads' setting. However, like any recommendation involving configuration changes, it comes with caveats. Exercising caution is essential, as any change can have potentially unintended and sometimes adverse effects. The organization should have a robust change control procedure in place, complete with a back-out plan.
As an alternative, the following query can be used to generate a script to remove individual ad hoc query plans from the cache. Please tailor the filter conditions to fit your needs. This script could be scheduled to run at appropriate times, perhaps after peak business hours and just before nightly ETL or maintenance tasks commence.
;with cte_cached_adhoc_plans as
(
SELECT plan_handle,
MAX(last_execution_time) last_execution_time,
SUM(execution_count) execution_count
FROM sys.dm_exec_query_stats
GROUPBY plan_handle
)
SELECT TOP 1000
[Remove Cached Plan] ='DBCC FREEPROCCACHE (0x'+convert(varchar(max), cte_cached_adhoc_plans.plan_handle, 2) +');'FROM sys.dm_exec_cached_plans cp
INNERJOIN cte_cached_adhoc_plans ON cte_cached_adhoc_plans.plan_handle = cp.plan_handle
WHERE cp.objtype ='Adhoc'AND cte_cached_adhoc_plans.last_execution_time < DATEADD(HOUR, -5, GETDATE())
AND cp.objtype !='Compiled Plan Stub'AND cte_cached_adhoc_plans.execution_count <=2
In conclusion, as we're aware, SQL Server will evict cached plans and cached data pages from the buffer cache as necessary, using its continually evolving algorithm. So, why not trust its judgment? I generally concur, but with a caveat. The default settings and behaviors are effective for many SQL installations in a majority of scenarios. However, they don't cater to every SQL Server instance or every situation.
P.S.: As a huge fan of Sherlock Holmes stories, the title of this post pays homage to the famous 'A Study in Scarlet' novel.