Tuesday, May 21, 2019

Find SQL Server Indexes With Heavy Maintenance Overhead

Everyone knows indexes are critical for query performance, but they come with overhead: for every single INSERT, UPDATE, and DELETE operation, SQL Server has to maintain every index on that table.

For example, when you INSERT a row, SQL Server:
  • Finds the right leaf page in every index 
  • Splits pages if needed (fragmentation!)
  • Tracks the row modifications that may later trigger statistics updates
  • Writes to transaction log for every index change
So for a table with 10 indexes, that's 10x the I/O, 10x the log writes, and 10x the CPU cycles, all for just one INSERT.
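If you want to watch that maintenance cost accumulate per index, sys.dm_db_index_operational_stats tracks leaf-level modifications. A minimal sketch (run it in the database you care about):

SELECT
    OBJECT_NAME(os.object_id) AS table_name,
    i.name                    AS index_name,
    os.leaf_insert_count,     -- rows inserted at the leaf level
    os.leaf_update_count,
    os.leaf_delete_count
FROM sys.dm_db_index_operational_stats(DB_ID(), NULL, NULL, NULL) AS os
INNER JOIN sys.indexes AS i
    ON i.object_id = os.object_id
   AND i.index_id  = os.index_id
ORDER BY os.leaf_insert_count + os.leaf_update_count + os.leaf_delete_count DESC;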

Usually, this overhead is worth the performance gain we get out of them. But some indexes create heavy maintenance overhead while sitting mostly unused for queries or constraints. I've seen this most often in massive data warehouses, but it can cripple OLTP performance too.

The following script finds exactly those culprits:

-- See blog post: https://sqlpal.blogspot.com/2019/05/do-you-have-rarely-used-indexes-that.html
SET NOCOUNT ON;
USE [Your Database Name];
/*==========================================================================================
  Script Name : Find-Heavy-Maintenance-Low-Usage-Indexes.sql

  Purpose:
      Identifies indexes that incur heavy maintenance overhead (lots of writes) 
      but provide little query benefit (few reads). These are prime candidates for removal.

  Logic:
      - Calculates total reads (seeks + scans + lookups) vs writes (updates) from dm_db_index_usage_stats
      - Filters for indexes with write_to_read_ratio > 10 (mostly write overhead)
      - Only non-unique indexes with significant activity (>1M writes, <1K reads)
      - Helps find indexes that hurt more than they help

  Thresholds (tune for your environment):
      - total_user_writes > 1,000,000  (significant maintenance cost)
      - total_user_reads < 1,000       (minimal query benefit)
      - write_to_read_ratio > 10       (writes >> reads)

  Prerequisites:
      - Run after server uptime of 24+ hours for reliable usage stats
      - dm_db_index_usage_stats resets on server restart/index rebuild

  SAFETY WARNING:
      - Stats reset on SQL restart, index rebuilds, or stats updates
      - Review execution plans before dropping ANY index
      - Test in dev first since some "write-heavy" indexes serve critical constraints

==========================================================================================*/

WITH index_usage AS
(
    SELECT 
        DB_NAME(iu.database_id) AS db_name,
        OBJECT_NAME(iu.object_id, iu.database_id) AS object_name,
        i.name AS index_name,
        i.type_desc AS index_type,
        
        -- Total read operations (query benefit)
        SUM(iu.user_seeks + iu.user_scans + iu.user_lookups) AS total_user_reads,
        
        -- Total write operations (maintenance cost)
        SUM(iu.user_updates) AS total_user_writes
        
    FROM sys.dm_db_index_usage_stats iu
    INNER JOIN sys.indexes i 
        ON i.object_id = iu.object_id 
        AND i.index_id = iu.index_id
    
    WHERE 
        iu.database_id = DB_ID() 
        AND i.index_id > 0
        AND i.is_unique = 0
        
    GROUP BY 
        iu.database_id,
        iu.object_id,
        i.name,
        i.type_desc
)

SELECT 
    *,
    
    -- Write-to-read ratio (higher = more maintenance overhead)
    total_user_writes * 1.0 / NULLIF(total_user_reads, 0) AS write_to_read_ratio

FROM index_usage

WHERE 
    -- High maintenance cost
    total_user_writes > 1000000                    -- 1M+ writes = significant overhead
    
    -- Low/no query benefit
    AND total_user_reads < 1000                    -- <1K reads = rarely used
    
    -- Mostly write overhead
    AND (
        total_user_writes * 1.0 / NULLIF(total_user_reads, 0) > 10  -- 10:1 write bias
        OR total_user_reads = 0                                    -- Never used
    )

ORDER BY 
    CASE WHEN total_user_reads = 0 THEN 1 ELSE 0 END DESC,  -- never-read indexes first (their ratio is NULL)
    write_to_read_ratio DESC;                               -- then worst offenders

/*
  USAGE TIPS:
  
  1. Run after 24+ hours uptime for reliable stats
  2. Higher thresholds = fewer but more certain candidates
  3. Check execution plans before dropping ANY index
  4. Consider business constraints (FKs, app assumptions)
  
  EXAMPLE THRESHOLD ADJUSTMENTS:
  -- More aggressive:
  -- total_user_writes > 500000 AND total_user_reads < 500
  
  -- More conservative:
  -- total_user_writes > 5000000 AND total_user_reads < 100
*/


Before dropping anything: Validate with execution plans and test workload impact first.
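If you want a reversible middle step, disabling a candidate index stops its maintenance while keeping the definition in metadata; a rebuild brings it back. The table and index names below are hypothetical:

-- Disable: maintenance stops, definition is kept (hypothetical names)
ALTER INDEX [IX_Candidate] ON [dbo].[YourTable] DISABLE;

-- If it turns out the index was needed, re-enable it with a rebuild
ALTER INDEX [IX_Candidate] ON [dbo].[YourTable] REBUILD;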



Thursday, May 16, 2019

Using Extended Events To Capture Backup and Restore Progress

If you are running a database BACKUP or RESTORE command manually, SQL Server will show you the progress at a specified percent-completion interval. For BACKUP, the default is approximately every 10%. You can change that interval by specifying the STATS [ = percentage ] option.


BACKUP DATABASE [AdminDBA]
TO  DISK = N'O:\MSSQL13.SQL2016AG01\MSSQL\Backup\AdminDBA.bak' WITH
NOFORMAT, NOINIT, 
NAME = N'AdminDBA-Full Database Backup',
SKIP,
NOREWIND,
NOUNLOAD, 
STATS = 10
GO
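
The STATS option works the same way for RESTORE; a minimal example using the same backup file:

RESTORE DATABASE [AdminDBA]
FROM DISK = N'O:\MSSQL13.SQL2016AG01\MSSQL\Backup\AdminDBA.bak'
WITH STATS = 5
GO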

But what if the backup/restore was started from a different session that you don't have access to (another DBA, a scheduled job, etc.), or you need more information to troubleshoot issues?

Here I should first mention that there are already a couple of options to track the progress.

You could review or query the SQL Server error logs (unless trace flag 3226 is enabled; it is disabled by default). You can check whether that trace flag is enabled using:

DBCC TRACESTATUS(3226);

If trace flag 3226 is enabled, the successful backup messages are suppressed in the error log.
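
If you need to toggle the trace flag globally:

DBCC TRACEON(3226, -1);   -- suppress successful backup messages in the error log
DBCC TRACEOFF(3226, -1);  -- log them again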

Or you could use one of the popular DMVs:

SELECT 
    r.session_id AS SPID, 
    r.command, 
    a.text AS Query, 
    r.start_time, 
    r.percent_complete, 
    DATEADD(SECOND, r.estimated_completion_time/1000, GETDATE()) AS estimated_completion_time
FROM sys.dm_exec_requests r 
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) a
WHERE r.command IN ('BACKUP DATABASE', 'RESTORE DATABASE');
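
And if you would rather search the error log itself from T-SQL instead of scrolling the log viewer, the undocumented but widely used xp_readerrorlog accepts filter strings; a sketch:

EXEC sys.xp_readerrorlog
    0,            -- log number (0 = current)
    1,            -- log type (1 = SQL Server, 2 = SQL Agent)
    N'Backup',    -- first search string
    N'AdminDBA';  -- second search string (database name)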


Starting with SQL Server 2016, you can turn on the backup_restore_progress_trace extended event to trace both backup and restore progress. It comes with the rich collection and diagnostic capabilities of extended events, which will give you great insight into backup/restore operations and help you troubleshoot issues.

Here is how to set up the trace using SSMS:

Connect to the SQL Server, expand Management -> Extended Events -> Sessions, then right-click Sessions and select New Session Wizard:

Click past the welcome page and, on the next page, give the session a name. If you choose to, you can check the box for Start the event session at server startup; I will leave it unchecked, though.

There is no built-in trace template, so leave Do not use a template selected and click Next.

On the next screen, type "backup" into the event library search box, select backup_restore_progress_trace, then click > to add it to the Selected Events box.

FYI: Here are the data fields that are specific to this event and are automatically captured.

The next screen displays the list of global fields you can optionally capture. For this example, I am not selecting any of them.

On the following screen you can add any filters you would like.

Just for the heck of it, I have added a filter here to exclude the system databases from the trace.
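
Scripted out, that wizard filter becomes a predicate on the event. Assuming the event exposes a database_id field (as the field list above suggests), the event clause in the CREATE EVENT SESSION statement would look roughly like this:

ADD EVENT sqlserver.backup_restore_progress_trace
(
    WHERE ([database_id] > (4))  -- skip master (1), tempdb (2), model (3), msdb (4)
)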

On the next screen, configure the data storage options. I generally prefer to store the trace data in a file.

Click Next and it will bring you to the Summary page.
You can click Script to generate the T-SQL that creates the event session.
Click Finish to create the session.

You can choose the option to start the trace immediately.
If not, right-click the newly created session and select Start Session to start the trace.
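
The same can be done in T-SQL (using the session name from the script later in this post):

ALTER EVENT SESSION [Monitor Backup Progress] ON SERVER STATE = START;

-- And to stop it later:
ALTER EVENT SESSION [Monitor Backup Progress] ON SERVER STATE = STOP;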

Once the trace is started, you can view the live data by right-clicking the session and selecting Watch Live Data.

Or expand the session, select the file target, right-click it, and click View Target Data.

Here is sample trace data.

Sample T-SQL code to create the extended event trace:

CREATE EVENT SESSION [Monitor Backup Progress] ON SERVER
ADD EVENT sqlserver.backup_restore_progress_trace
ADD TARGET package0.event_file(SET filename=N'Monitor Backup Progress',max_file_size=(10))
WITH
(
   MAX_MEMORY=4096 KB,
   EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,
   MAX_DISPATCH_LATENCY=30 SECONDS,
   MAX_EVENT_SIZE=0 KB,
   MEMORY_PARTITION_MODE=NONE,
   TRACK_CAUSALITY=OFF,
   STARTUP_STATE=OFF)
GO
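
To read the captured events back from the file target with T-SQL, you can shred the XML returned by sys.fn_xe_file_target_read_file. The trace_message field name here is an assumption based on the event's field list shown earlier:

;WITH xe AS
(
    SELECT CAST(event_data AS XML) AS x
    FROM sys.fn_xe_file_target_read_file(N'Monitor Backup Progress*.xel', NULL, NULL, NULL)
)
SELECT
    x.value('(event/@timestamp)[1]', 'datetime2') AS event_time,
    -- trace_message is an assumed field name; check the event's field list
    x.value('(event/data[@name="trace_message"]/value)[1]', 'nvarchar(max)') AS trace_message
FROM xe
ORDER BY event_time;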


Find identical duplicate indexes

Unlike some other RDBMSs, SQL Server does not stop you from creating duplicate indexes. Often, the developer or DBA adding a new index doesn’t check whether an index with the same key columns and order already exists. This situation is more common than many realize.

Duplicate indexes are not only unnecessary, they are harmful. They don't make any query faster, but they still have to be maintained. Every INSERT, UPDATE, and DELETE has to update every copy of the index, which increases write latency and transaction log usage. They also consume extra disk space and memory, and they lengthen index maintenance and backup operations without providing any additional benefit.

When I started looking for identical duplicate nonclustered indexes (same table, same columns, same order), I wanted something flexible enough to handle a few different scenarios. That's what this script is for: it helps you spot indexes that are essentially copies of each other so you can decide whether to keep or drop them.

There are two knobs you can tweak:

  • If you want to treat indexes as duplicates even when the key columns are in a different order, set @disregard_column_order = 1. Leave it at 0 if you only care about truly identical definitions, including column order.

  • By default, the script ignores cases where one index is clustered and the other is nonclustered. If you also want to see those pairs, set @include_clustered_indexes = 1.


-- See blog post: https://sqlpal.blogspot.com/2019/05/find-identical-duplicate-indexes-revised.html
SET NOCOUNT ON;
USE [Your Database Name];
/*==========================================================================================
  Script Name : Find_Identical_Duplicate_Indexes.sql
  Purpose:
      Find identical duplicate indexes on the same table (same key columns, same order
      unless configured otherwise), so you can evaluate and potentially drop redundant ones.
  High-level approach:
      - Build a comma-separated list of key columns for every index in the database.
      - Optionally:
          * Include clustered vs nonclustered index pairs.
          * Ignore column order when comparing indexes.
      - Group by (schema, object, index_columns) and return those with COUNT(*) > 1.

  Variables/Parameters:
      @include_clustered_indexes (bit)
          0 = Ignore duplicates where one index is clustered and the other is nonclustered
          1 = Include those cases as duplicates too.

      @disregard_column_order (bit)
          0 = Only consider indexes duplicates if key columns match AND are in the same order.
          1 = Consider indexes duplicates even when the same key columns are in a different order.

  Notes / Caveats:
      - This script does NOT:
          * Consider ASC/DESC sort order differences.
          * Consider included columns.
          * Distinguish filtered vs non-filtered indexes beyond exposing filter_definition.
      - Do NOT blindly drop indexes; review each case in the context of workload and plans.
==========================================================================================*/


/*
    Whether to include identical indexes where one is clustered and 
    the other one is non-clustered
*/
DECLARE @include_clustered_indexes bit = 0;


/*
    Whether to find duplicate indexes where all key columns are the same,
    but not necessarily in the same order. Typical usage:
      - 0 (default): require same column order (more strict, closer to “identical”).
      - 1: ignore key column order, treat any permutation of the same columns as duplicates.
*/
DECLARE @disregard_column_order bit = 0;


;WITH cte AS
(
    SELECT
        o.schema_id,
        o.type_desc,
        o.object_id,
        i.index_id,
        i.name AS index_name,

        /*
            Build a comma-separated list of key column names for this index.
              - We need a stable string representation of the key columns to compare indexes.
              - Ordering of columns in the list is controlled by @disregard_column_order:
                    @disregard_column_order = 0 → order by key_ordinal (true index order)
                    @disregard_column_order = 1 → order by column_id (logical column order)
        */
        index_columns =
            COALESCE(
                STUFF
                (
                    (
                        SELECT
                            CAST(',' + COL_NAME(object_id, column_id) AS varchar(max))
                        FROM sys.index_columns
                        WHERE object_id = i.object_id
                          AND index_id  = i.index_id
                          AND is_included_column = 0          -- Only key columns, not INCLUDE
                        ORDER BY 
                            object_id, 
                            index_id,
                            CASE 
                                WHEN @disregard_column_order = 1 
                                    THEN column_id           -- Ignore index key order
                                ELSE key_ordinal            -- Respect index key order
                            END
                        FOR XML PATH(''), TYPE
                    ).value('.', 'varchar(max)')
                    , 1, 1, ''
                )
            , ''),

        -- Index metadata for review and decision making
        i.type_desc       AS index_type,
        i.is_unique,
        i.data_space_id,
        i.ignore_dup_key,
        i.is_primary_key,
        i.is_unique_constraint,
        i.fill_factor,
        i.is_padded,
        i.is_disabled,
        i.is_hypothetical,
        i.allow_row_locks,
        i.allow_page_locks,
        i.has_filter,
        i.filter_definition

    FROM sys.indexes AS i
    INNER JOIN sys.objects AS o 
        ON o.object_id = i.object_id
    WHERE 
        OBJECTPROPERTY(o.object_id, 'IsMsShipped') = 0   -- Skip system objects
        AND i.index_id <> 0                              -- Skip the heap "index"
        /*
            Control whether clustered indexes participate:

            - If @include_clustered_indexes = 0:
                  i.index_id > 1 → only nonclustered indexes.
            - If @include_clustered_indexes = 1:
                  i.index_id > 0 → clustered + nonclustered.
        */
        AND i.index_id > CASE WHEN @include_clustered_indexes = 1 THEN 0 ELSE 1 END
)

-- Find indexes with identical index_columns on the same object and same type
SELECT
    SCHEMA_NAME(i1.schema_id) AS schema_name,
    i1.type_desc,
    OBJECT_NAME(i1.object_id) AS object_name,
    i1.index_name,
    i1.*  -- Includes index_columns + metadata for review
FROM cte AS i1
INNER JOIN 
(
    /*
        Identify combinations of (schema_id, type_desc, object_id, index_columns)
        that occur more than once → those represent duplicate index definitions.
    */
    SELECT 
        schema_id, 
        type_desc, 
        object_id, 
        index_columns
    FROM cte
    GROUP BY 
        schema_id, 
        type_desc, 
        object_id, 
        index_columns
    HAVING COUNT(*) > 1
) AS i2
    ON  i1.schema_id     = i2.schema_id
    AND i1.type_desc     = i2.type_desc
    AND i1.object_id     = i2.object_id
    AND i1.index_columns = i2.index_columns
ORDER BY 
    schema_name, 
    i1.type_desc, 
    object_name, 
    i1.index_name;



This query does not take the ASC or DESC sort direction into account. You might have two indexes with the same key columns where one is ASC and the other is DESC, and there can be perfectly valid reasons for that. It also doesn’t distinguish whether one or both indexes are filtered. I’d love to hear your feedback before I invest more effort covering every edge case. 
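
If you do want to check the sort direction manually for a pair this script flags, sys.index_columns exposes is_descending_key; a quick sketch (the table name is a placeholder):

SELECT
    i.name AS index_name,
    ic.key_ordinal,
    COL_NAME(ic.object_id, ic.column_id) AS column_name,
    ic.is_descending_key   -- 1 = DESC, 0 = ASC
FROM sys.indexes AS i
INNER JOIN sys.index_columns AS ic
    ON ic.object_id = i.object_id
   AND ic.index_id  = i.index_id
WHERE i.object_id = OBJECT_ID(N'dbo.YourTable')
  AND ic.is_included_column = 0
ORDER BY i.name, ic.key_ordinal;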

Download the SQL script from GitHub:

Wednesday, May 1, 2019

Capture Trend in Deadlock Occurrences

Deadlocks are usually detected and resolved automatically by SQL Server, which kills one of the SPIDs involved. You may have an alert set up to notify you, and/or users may approach you with the issue.

But this post is not about finding the root cause and fixing the issue for good.

Ideally, you don't want to see deadlocks occur at all, and especially not on a regular basis. But situations can arise where a database starts experiencing deadlocks. In this post I would like to share code I have used to capture the daily count of deadlocks for a trending report.

Note: This code uses the DMV sys.dm_os_performance_counters, so it will not work if the SQL Server performance counters are disabled for some reason.

You can use this query to check if the performance counters are enabled or disabled.

SELECT COUNT(*) FROM sys.dm_os_performance_counters;

If the query returns 0 rows, it most likely means the performance counters have been disabled.
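
You can also sanity-check the specific counter this post relies on. Note that despite the "/sec" suffix in the name, the value reported is cumulative since the SQL Server service started:

SELECT RTRIM(counter_name)  AS counter_name,
       RTRIM(instance_name) AS instance_name,
       cntr_value           -- cumulative deadlock count since service startup
FROM sys.dm_os_performance_counters
WHERE RTRIM(counter_name)  = 'Number of Deadlocks/sec'
  AND RTRIM(instance_name) = '_Total';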



Step 1: Create empty table to store the deadlock counts


USE [AdminDBA]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[deadlock_counter](
       [ID] [int] IDENTITY(1,1) NOT NULL,
       [record_created_date] [datetime] NOT NULL,
       [SQLStartedOn] [datetime] NOT NULL,
       [object_name] [nchar](128) NULL,
       [counter_name] [nchar](128) NULL,
       [instance_name] [nchar](128) NULL,
       [cntr_value] [bigint] NULL,
       [cntr_type] [int] NULL,
       [AveragePerDay] [bigint] NULL,
PRIMARY KEY CLUSTERED
(
       [ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

GO

Step 2: Create a SQL Server Agent Job


/*

Please update the email operator value for the @notify_email_operator_name parameter.

Feel free to change any of the other settings as well, including the schedule.

*/


USE [msdb]
GO
BEGIN TRANSACTION
DECLARE @ReturnCode INT
SELECT @ReturnCode = 0
IF NOT EXISTS (SELECT name FROM msdb.dbo.syscategories WHERE name=N'[Uncategorized (Local)]' AND category_class=1)
BEGIN
EXEC @ReturnCode = msdb.dbo.sp_add_category @class=N'JOB', @type=N'LOCAL', @name=N'[Uncategorized (Local)]'
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

END

DECLARE @jobId BINARY(16)
EXEC @ReturnCode =  msdb.dbo.sp_add_job @job_name=N'DBA - Deadlock Counter',
             @enabled=1,
             @notify_level_eventlog=0,
             @notify_level_email=2,
             @notify_level_netsend=0,
             @notify_level_page=0,
             @delete_level=0,
             @description=N'Populate deadlock performance counter values',
             @category_name=N'[Uncategorized (Local)]',
             @owner_login_name=N'sa',
             @notify_email_operator_name=N'DBA', @job_id = @jobId OUTPUT

IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_add_jobstep @job_id=@jobId, @step_name=N'deadlock performance counter values',
             @step_id=1,
             @cmdexec_success_code=0,
             @on_success_action=1,
             @on_success_step_id=0,
             @on_fail_action=2,
             @on_fail_step_id=0,
             @retry_attempts=0,
             @retry_interval=0,
             @os_run_priority=0, @subsystem=N'TSQL',
             @command=N'set nocount on
set transaction isolation level read uncommitted
go
use AdminDBA
go
-- the counter values are cumulative since SQL Server service was started, not per second
insert into deadlock_counter
SELECT getdate() record_created_date,
       d.create_date SQLStartedOn, p.*, AveragePerDay = CONVERT(BIGINT, (( 1.0 * p.cntr_value / NULLIF(Datediff(dd, d.create_date,CURRENT_TIMESTAMP), 0 ))))
-- INTO deadlock_counter
FROM   sys.dm_os_performance_counters p
       INNER JOIN sys.databases d ON d.NAME = ''tempdb''
WHERE  Rtrim(p.counter_name) = ''Number of Deadlocks/sec''   
--       AND cntr_value > 0
      AND Rtrim(p.instance_name) = ''_Total''
ORDER  BY cntr_value DESC

-- select * from deadlock_counter',
             @database_name=N'master',
             @flags=0
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_update_job @job_id = @jobId, @start_step_id = 1
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_add_jobschedule @job_id=@jobId, @name=N'Daily',
             @enabled=1,
             @freq_type=4,
             @freq_interval=1,
             @freq_subday_type=1,
             @freq_subday_interval=0,
             @freq_relative_interval=0,
             @freq_recurrence_factor=0,
             @active_start_date=20190228,
             @active_end_date=99991231,
             @active_start_time=80000,
             @active_end_time=235959
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_add_jobserver @job_id = @jobId, @server_name = N'(local)'
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
COMMIT TRANSACTION
GOTO EndSave
QuitWithRollback:
    IF (@@TRANCOUNT > 0) ROLLBACK TRANSACTION
EndSave:
GO


Step 3: Query to extract deadlock counts for each day/interval


-- Note: the counter values are cumulative since SQL Server service was started, not per second

set nocount on
set transaction isolation level read uncommitted
select * ,cntr_value - coalesce(lag(cntr_value) over (order by id), cntr_value) new_deadlocks
from AdminDBA..deadlock_counter
order by ID desc
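
One caveat with the LAG() approach above: the counter is cumulative since the service started, so after a SQL Server restart the value drops and the computed delta goes negative. A variant that treats any drop as a fresh baseline:

set nocount on
set transaction isolation level read uncommitted
select * ,
       case when cntr_value < lag(cntr_value) over (order by id)
            then cntr_value   -- counter was reset by a service restart; count from zero
            else cntr_value - coalesce(lag(cntr_value) over (order by id), cntr_value)
       end as new_deadlocks
from AdminDBA..deadlock_counter
order by ID desc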