Search This Blog

Monday, September 22, 2025

Set Theory vs. Batch Mode in SQL Server

Set Theory vs. Batch Mode in SQL Server

Not long ago, a colleague of mine was completely shocked when he first heard about SQL Server’s Batch Mode feature. His immediate reaction was: “Wait, but SQL is set-based. Doesn’t that mean it already processes everything as a set instead of row by row?”

That’s actually a really common assumption, and it makes sense at first glance. After all, people often describe relational databases as powerful because they “process sets of data at once.” But the reality is a little more nuanced. Let’s clear this up.

Set-Based SQL Is About How You Express Queries

When we say SQL is a set-based language, we’re talking about how you, the database professional, express what you want. SQL is declarative. You write what result set you need, and you leave it up to SQL Server to figure out how to produce it.

For instance:

  • “Give me the top-selling products this season.”

  • “Show me all customers in the West region.”

You usually don't specify whether SQL Server should use a particular index, scan all rows, or parallelize execution. That’s the optimizer’s job.

But, and here’s where the misconception comes in, asking for a set result doesn’t mean SQL Server executes it in one giant, magical sweep. Under the covers, the engine usually streams rows through operators in a plan, often one at a time, kind of like an assembly line.

Enter Batch Mode

That’s where Batch Mode changes the game. Instead of passing rows through operators one at a time, Batch Mode allows the engine to group rows together, typically around 900 at a time, and process them as a single unit. Think of it as SQL Server “vectorizing” execution.

Here’s a mental image:

  • Row Mode: You buy 10 apples at the grocery store, and the cashier scans each apple one by one.

  • Batch Mode: The 10 apples come in a bag, and the cashier just scans the bag once.

The result? Less overhead and faster performance, especially for analytic and large-scale queries.

Set Theory vs. Batch Mode

This is why it’s important not to conflate relational set theory with Batch Mode execution. They live in completely different layers:


Batch Mode

Set Theory / Relational Model

What is it?

Query execution engine optimization

Query Writing or formulation approach

How it works?

Rows, handled in chunks (~900) internally

Sets/relations (no row-by-row logic)

Why use it?

Improve efficiency of analytic queries

Promote declarative, efficient operations

Where it operates?

Physical (implementation/execution)

Logical (query semantics/design)

Example

using a vectorized batch operator

SELECT SUM(column) FROM table;


So, SQL being set-based tells us how we describe queries. Batch Mode tells us how the engine might execute them internally. Two different things, both powerful—but they shouldn’t be confused.

Batch Mode Through the Versions

Batch Mode wasn’t always around. It showed up with SQL Server 2012 alongside Columnstore indexes, which were built for data warehousing workloads. For years, Batch Mode was tied entirely to columnstore. Then SQL Server 2019 broadened access by making Batch Mode "available" on regular rowstore tables as well (as long as the database compatibility level is set to 150 or higher).

Here’s a quick version history:

SQL Server Version

Batch Mode on Columnstore

Batch Mode on Rowstore

Pre 2012

No

No

2012

Yes

No

2014

Yes

No

2016

Yes

No

2017

Yes

No

2019 and later

Yes

Yes


Of course, even when Batch Mode is available, it doesn’t mean every query will use it. The optimizer decides case by case which execution path is best. Sometimes, Row Mode is still the winner.

Wrapping Up

SQL’s set-based nature is about your interface with the language. Batch Mode, on the other hand, is an internal engine optimization that decides how the work actually gets done.



Demo


/* ** Demo: Batch Mode Vs Row Mode **

Let’s walk through a quick demo to see Batch Mode in practice. 
This works in SQL Server 2019 or later, since Batch Mode is available for 
rowstore tables (provided the database compatibility level is 150 or higher).

The script performs the following actions:

1. Creates a BatchModeDemoDB database 
2. Creates table BatchModeDemoTbl
3. Loads data into BatchModeDemoTbl table
4. Runs a query that is likely to use the Batch mode
5. Runs a second query against a smaller number of records to demo the row mode
6. Cleans up the BatchModeDemoDB database

*/


USE master;
GO
IF EXISTS(SELECT * FROM SYS.DATABASES WHERE NAME = 'master' AND compatibility_level < 150 )
BEGIN
	RAISERROR('Batch Mode feature is not available on your current version of SQL Server.', 16, 1)
END;
ELSE
BEGIN
	IF DB_ID('BatchModeDemoDB') IS NOT NULL DROP DATABASE BatchModeDemoDB;
	CREATE DATABASE BatchModeDemoDB;
	ALTER DATABASE [BatchModeDemoDB] SET RECOVERY SIMPLE WITH NO_WAIT;
	ALTER DATABASE [BatchModeDemoDB] MODIFY FILE ( NAME = N'BatchModeDemoDB', SIZE = 100MB)
	ALTER DATABASE [BatchModeDemoDB] MODIFY FILE ( NAME = N'BatchModeDemoDB_log', SIZE = 300MB)
END;
GO

USE BatchModeDemoDB;
ALTER AUTHORIZATION ON DATABASE::[BatchModeDemoDB] TO [sa];

-- Create a test table
DROP TABLE IF EXISTS dbo.BatchModeDemoTbl;
CREATE TABLE dbo.BatchModeDemoTbl
(
    Id       INT IDENTITY(1,1) PRIMARY KEY,
    Category INT NOT NULL,
    Amount   DECIMAL(10,2) NOT NULL
);

-- Insert a large amount of sample data, e.g.  1 million rows
INSERT INTO dbo.BatchModeDemoTbl (Category, Amount)
SELECT TOP (1000000)
    (ABS(CHECKSUM(NEWID())) % 10) AS Category,
    CAST(ABS(CHECKSUM(NEWID())) % 1000 AS DECIMAL(10,2))
FROM sys.objects a
CROSS JOIN sys.objects b
CROSS JOIN sys.objects c;

-- Lets run two sample queries
SET STATISTICS XML ON;

-- ** This query should use the Batch mode **
SELECT Category, SUM(Amount) AS TotalAmount
FROM dbo.BatchModeDemoTbl
GROUP BY Category;


/* ** This next query should use the row mode **
SQL Server is more likely to choose Row Mode 
for small result sets, since the overhead of 
batching isn’t worth it.
*/

SELECT Category, SUM(Amount) AS TotalAmount
FROM dbo.BatchModeDemoTbl
WHERE Id < 1000
GROUP BY Category;


SET STATISTICS XML OFF;
USE master;

-- Clean up
IF DB_ID('BatchModeDemoDB') IS NOT NULL DROP DATABASE BatchModeDemoDB;


After you run the above script, the results should look like this:

















  • In the execution plan (Ctrl+M in SSMS or via the XML output), look for operators like Hash Match, Clustered Index Scan or Aggregate.

  • Check the Actual Execution Mode property. If Batch Mode was used, you’ll see:

  • Actual Execution Mode = Batch
    Estimated Execution Mode = Batch










If the optimizer decided Row Mode was more efficient, you’ll see “Row.”

Next, try some demo queries on your own, compare Row vs. Batch, and look for execution plan changes.

Tip: For wider queries (more columns, larger aggregates), SQL Server is more likely to choose Batch Mode, since batching pays off with vectorized operators.