For a very important client, I currently manage over 200 SQL Servers (all VMware VMs) and over 500 SQL instances running on them, hosting more than 5,000 databases.
Except for about 50 large and critical databases, I don't really know what the rest of the databases are being used for, what type of applications they support, their workload characteristics or even which department/people use them.
Why? Because this organization's policy is to "buy", never build, the software applications they need. And they have a good reason for it: almost all of their apps come bundled with the hardware they purchase, which often includes several applications: 1) a UI for the administrators, 2) a UI for the employees, and 3) a UI for the customers.
The only reason I do know about the databases that are both large and critical is that being large and critical tends to make them stand out whenever there is an issue or outage.
Regarding SQL Server storage, it may be presumptuous to say, but I think most SQL DBAs configure their databases with the auto-growth setting ON and an unlimited maximum size. That is the default setting for SQL Server out of the box. There can be, and are, exceptions to this. Let me share an anecdote from the past:
Once upon a time, I supported a highly transactional database (the primary publisher database in a distributed replication topology) with more than 1000 end users whose job was to constantly add new data, 8 hours a day, 5 days a week. So, in order to maintain the performance of DML queries, primarily inserts and occasional updates, this database had a very limited number of indexes, and no end user queries could be run against it. The read/query workload was offloaded to other database servers located across the country. We had set up transactional replication with 3 subscriber SQL Servers in the local data center, which we then configured as publishers (more accurately, as republishers) to SQL Servers located in 3 different regions of the United States and 2 international locations. These regional and international SQL Servers were used by the end user applications to query the data, and this is where we added any additional indexes required by those applications. It was a very carefully planned and configured replication topology.
We could not afford to let anyone run a large transaction on the primary publisher database, as that would clog the global replication pipeline we had so carefully constructed. The stakes were high: this database generated hundreds of millions in profit; after all, this was a very high margin business, so most of the revenue was profit.
So, in order to control and limit the flow of replication traffic, one of the things we did was limit the maximum file size of the transaction log to 100MB. Any attempt to run a large transaction would fail as soon as it hit that limit.
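Just for illustration, here is a minimal sketch of how such a cap could be applied, assuming the SqlServer PowerShell module (Invoke-Sqlcmd) is available; the instance, database, and logical log file names are placeholders, not the actual system:

# A minimal sketch: cap the transaction log file at 100MB.
# 'SQLServer1\INSTANCE1', 'PubDB' and 'PubDB_log' are placeholders;
# requires the SqlServer module (Invoke-Sqlcmd).
Invoke-Sqlcmd -ServerInstance 'SQLServer1\INSTANCE1' -Query @"
ALTER DATABASE [PubDB]
MODIFY FILE (NAME = N'PubDB_log', MAXSIZE = 100MB);
"@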
However, in this current environment, I have so far had no need to change those two settings, so all databases are configured with auto-growth ON and an unlimited maximum size. This is fine, of course, until the disk runs out of free space.
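If you want to check how your databases are actually configured, a quick sketch along these lines lists the growth and maximum size settings per file. The instance name is a placeholder, and it assumes the SqlServer module (Invoke-Sqlcmd) is available:

# List auto-growth and max size settings for every database file on an instance.
# 'SQLServer1\INSTANCE1' is a placeholder; requires the SqlServer module.
Invoke-Sqlcmd -ServerInstance 'SQLServer1\INSTANCE1' -Query @"
SELECT DB_NAME(database_id) AS database_name,
       name                 AS logical_file_name,
       type_desc,
       growth,              -- in 8KB pages, or a percentage if is_percent_growth = 1
       is_percent_growth,
       max_size             -- -1 = unlimited
FROM   sys.master_files;
"@ | Format-Table -AutoSize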
We do monitor the SQL Servers for uptime, availability, performance, all the typical stuff. One of the things we monitor is disk space, and the email alert condition for our disk space monitoring is a simple one:
% free space falls below 15% AND the amount of free space is below 100GB
This has worked best for us in this environment, as it has practically eliminated false alerts. For example, free space on a 10TB drive can fall below 15%, say to 500GB, which is only 5% free space, and this would still not trigger an alert because the overall free space is still above 100GB. We think that "usually" gives us just enough of a heads-up before the disk completely fills up and causes an outage.
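Purely as an illustration (this is not our actual monitoring tool), the rule could be expressed in PowerShell roughly like this, using the thresholds from the text:

# Flag a local fixed drive only when BOTH conditions are true:
# free space below 15% AND free space below 100GB.
$percent_threshold = 15
$free_gb_threshold = 100

Get-WMIObject Win32_LogicalDisk -Filter "DriveType = 3" |
    Where-Object {
        ($_.FreeSpace / $_.Size * 100) -lt $percent_threshold -and
        ($_.FreeSpace / 1GB) -lt $free_gb_threshold
    } |
    Select-Object Name,
        @{n='FreeSpace (Gb)';e={[Math]::Round($_.FreeSpace / 1GB)}},
        @{n='PercentFree'   ;e={[Math]::Round($_.FreeSpace / $_.Size * 100)}}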
On average, I get about 5 disk space alerts a week. Then begins the troubleshooting, to fix the issue as soon as possible.
Fortunately, we have been successful in using a standardized configuration for SQL Server, and for the most part we use the following convention for the disk configuration:
| Drive Letter | Purpose |
| --- | --- |
| C: | OS/System drive |
| D: | SQL Server software/binaries |
| K: | Data files |
| L: | Transaction logs |
| O: | SQL Server database backups |
| T: | SQL Server temp files |
If the FILESTREAM feature is being used, that data usually goes on a separate drive, simply because FILESTREAM data tends to grow much faster in our environment.
So when I see an email alert come through for one (or more) disks on one or more SQL Servers, just by looking at the drive letter I can instantly run through some scenarios in my head as to why that particular drive could be getting low on disk space.
Now, this article is not about how to troubleshoot disk space issues, but to give some examples:
If it is our O drive (database backups), some of the reasons for it could be:
- The backup job is 1) failing, or 2) failing to automatically delete old backup files as it was designed to!
- A huge, one-off transaction was run that 1) created a very large transaction log backup or 2) increased the size of the nightly differential backups until the next weekly full backup
- A significant amount of data was changed, but not necessarily in one huge transaction. This would increase the size of the transaction log backups and differential backups
- Database(s) grew organically and significantly, causing the backup size to grow
- New databases were added, either through a restore of a large database or populated right away through a very large data load (or a series of small but frequent data loads)
If the L drive (transaction logs) is out of space:
- The transaction log is not being backed up, or the log backups are failing!
- A huge, one-off transaction was run. We do not have a job to automatically reclaim the space, so the transaction log will keep its auto-grown size until manually reclaimed
If the K drive (Data files) is out of space:
- This is usually because the database is growing organically at a much faster pace than we were told
- New databases added
Of course, you may want to reach out to the application folks to find out what is going on.
And so on...
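For the backup-drive (O:) scenarios above, one quick check I find useful is to look at the recent backup sizes recorded in msdb. A minimal sketch, assuming the SqlServer module (Invoke-Sqlcmd) and a placeholder instance name:

# Show the largest backups from the last 7 days to see which databases and
# backup types are driving the growth. 'SQLServer1\INSTANCE1' is a placeholder.
Invoke-Sqlcmd -ServerInstance 'SQLServer1\INSTANCE1' -Query @"
SELECT TOP (20)
       database_name,
       type,                -- D = full, I = differential, L = log
       backup_finish_date,
       CAST(backup_size / 1048576.0 AS DECIMAL(18,1)) AS backup_size_mb
FROM   msdb.dbo.backupset
WHERE  backup_finish_date > DATEADD(DAY, -7, GETDATE())
ORDER BY backup_size DESC;
"@ | Format-Table -AutoSize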
Before I look into SQL Server and/or RDP into the server, I have a PowerShell script that I run to get a quick disk space summary and the largest files on a disk. In most cases, that also tells me the type of file (backup, transaction log, data file, etc.), which SQL instance it belongs to, and which database. I then connect to the SQL instance in SSMS, do a quick analysis of the cause, fix the problem, and am done within a matter of 5 minutes or even less, all without having to log into the server.
Often, the solution involves:
- If the reason is organic growth of the database, or new/restored databases, then often it requires adding more disk space
- If the transaction log grew because of a one-off transaction, say from 5GB to 5TB, then it may just need to be shrunk back to 5GB (see the sketch after this list)
- Sometimes there are old files/backups lying around that are no longer needed and can be safely removed
- Fixing the failing jobs
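For the one-off transaction log growth case in the list above, a hedged sketch of the shrink step might look like this. The instance, database, and logical log file names are placeholders, and you should confirm the log has been backed up and the space inside it is reusable before shrinking:

# Shrink the transaction log file back to roughly its normal size (5GB = 5120MB).
# 'TESTDB1' and 'TESTDB1_log' are placeholders; requires the SqlServer module.
Invoke-Sqlcmd -ServerInstance 'SQLServer1\INSTANCE1' -Query @"
USE [TESTDB1];
DBCC SHRINKFILE (N'TESTDB1_log', 5120);  -- target size in MB
"@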
Now, about the PowerShell script, here are its parts:
Get the disk space info:
Remember to change the value for $computer_fqdn
This assumes that your currently logged-on user account has the necessary permissions on the remote computer. If not, you can provide different credentials by adding "-Credential (Get-Credential)" to the Get-WMIObject command.
Get-WMIObject -ComputerName $computer_fqdn Win32_LogicalDisk -ErrorVariable ErrVar -ErrorAction SilentlyContinue |
    Where-Object {$_.MediaType -eq 12} |
    Select-Object __SERVER,
        @{n='DriveLetter'   ;e={$_.Name}},
        VolumeName,
        @{n='Capacity (Gb)' ;e={"{0:n0}" -f ($_.size/1gb)}},
        @{n='FreeSpace (Gb)';e={"{0:n0}" -f ($_.freespace/1gb)}},
        @{n='PercentFree'   ;e={[Math]::round($_.freespace/$_.size*100)}} |
    Format-Table -AutoSize
Caveat: The above PowerShell command does not currently support mount points.
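If you do need mount points, one possible workaround, sketched here as an assumption rather than as part of my standard script, is to query Win32_Volume instead, which includes mounted volumes:

# Disk space including mount points, via Win32_Volume.
# Uses the same $computer_fqdn variable as the command above.
Get-WMIObject -ComputerName $computer_fqdn Win32_Volume -ErrorAction SilentlyContinue |
    Where-Object {$_.DriveType -eq 3 -and $_.Capacity -gt 0} |
    Select-Object __SERVER,
        @{n='MountPath'     ;e={$_.Name}},
        Label,
        @{n='Capacity (Gb)' ;e={"{0:n0}" -f ($_.Capacity/1gb)}},
        @{n='FreeSpace (Gb)';e={"{0:n0}" -f ($_.FreeSpace/1gb)}},
        @{n='PercentFree'   ;e={[Math]::Round($_.FreeSpace/$_.Capacity*100)}} |
    Format-Table -AutoSize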
Find files on remote computer:
More accurately, the script will search for files on a particular disk drive, not the entire computer; to cover the whole computer, you would need to run the script separately for each drive. Optionally, you can also enter a folder/subfolder if you don't want it to start from the root of the drive.
The following script is written to look for the largest files, and you can define what that means through the variable $filter_by_size. You can also specify a wildcard filter for the file names through $filter_by_name and a last-modified cutoff through $filter_by_last_modified_before_days.
In the example below, I am looking for the top 5 largest files in O:\ that contain *FULL*.BAK in the file name, are larger than 100MB, and were last modified more than 8 days ago. You can change the filter values to suit your needs, and if you don't want to use a filter, just comment it out by putting the hash sign (#) in front of it. And lastly, don't forget to change the value of the $computer_name variable to your server name. You can also enter multiple server names, separated by commas, for example:
$computer_name = @("Server1", "Server2", "Server3")
# Name of the remote computer
$computer_name = @("SQLServer1")

# Drive letter or a subfolder on a drive to search for the large files
$remote_folder = 'O:\'

# Add optional filters
$filter_by_name = '*FULL*.BAK'
$filter_by_size = 100*1024*1024                # 100*1024*1024 = 100MB
$filter_by_last_modified_before_days = -8      # Note the minus (-) sign

$top_n = 5                                     # Limit the results to the top n files by size

# Make sure the $filter_by_last_modified_before_days is a negative value
if($filter_by_last_modified_before_days -gt 0) {$filter_by_last_modified_before_days = $filter_by_last_modified_before_days * -1}

# Set the filters to default values if not already set by the caller
if($top_n -eq $null -or $top_n -eq 0 -or $top_n -eq '') {$top_n = 50}
if($filter_by_name -eq $null -or $filter_by_name -eq '') {$filter_by_name = '*'}
if($filter_by_size -eq $null -or $filter_by_size -eq '') {$filter_by_size = 0}
if($filter_by_last_modified_before_days -eq $null -or $filter_by_last_modified_before_days -eq '') {$filter_by_last_modified_before_days = 0}

# Let's get the FQDN for each remote computer
$computer_fqdn = @()
foreach($computer in $computer_name) {$computer_fqdn += @([System.Net.Dns]::GetHostEntry($computer).HostName)}

# Find the largest files on the remote computer(s)
$large_files = @(Invoke-Command -ComputerName $computer_fqdn -ScriptBlock {
        Get-ChildItem $Using:remote_folder -Filter $Using:filter_by_name -Recurse -Force -ErrorAction SilentlyContinue |
        Where-Object {$_.Length -gt $Using:filter_by_size} |
        Sort-Object Length -Descending |
        Select-Object FullName,
            @{Name="Size(GB)";Expression={[Math]::Round($_.Length / 1GB, 2)}},
            @{Name="Size(MB)";Expression={[Math]::Round($_.Length / 1MB, 0)}},
            LastWriteTime -First $Using:top_n
    })

# Display the results
$large_files | Sort-Object -Property 'Size(MB)' -Descending |
    Format-Table PSComputerName, 'Size(MB)', 'Size(GB)', LastWriteTime, FullName

#$files_to_delete = $large_files | Where-Object {$_.LastWriteTime -lt ((Get-Date).AddDays($filter_by_last_modified_before_days))} | Select-Object FullName

$large_files_count = $large_files.Count
"Number of files: $large_files_count"
You can download the above PowerShell script from GitHub at:
Delete files on remote computer:
Now to the final part, deleting files remotely. If you just need to delete a specific file on the remote computer:
Invoke-Command -ComputerName '<Computer Name>' -ScriptBlock {Remove-Item '<FileName>'}
To remove multiple files:
Remove-Item 'O:\MSSQL\BACKUPS\FULL\TESTDB1.BAK', 'O:\MSSQL\BACKUPS\FULL\TESTDB2.BAK'
OR, pass the list of files as an array:
Remove-Item @('O:\MSSQL\BACKUPS\FULL\TESTDB1.BAK', 'O:\MSSQL\BACKUPS\FULL\TESTDB2.BAK')
You can also use wildcards, like this example from the Microsoft documentation:
Remove-Item * -Include *.doc -Exclude *1*
However, I usually first use the previous PowerShell script to search for the files on the remote computer that meet my criteria for deletion and store the results in the $large_files variable, then use the following PowerShell code to delete the files.
$files_to_delete = $large_files

foreach($file_to_delete in $files_to_delete)
{
    # Review first: list each file on its remote server
    Invoke-Command -ComputerName $file_to_delete.PSComputerName -ScriptBlock {Get-Item $Using:file_to_delete.FullName}

    # Uncomment the next line to actually delete the file
    # Invoke-Command -ComputerName $file_to_delete.PSComputerName -ScriptBlock {Remove-Item $Using:file_to_delete.FullName}
}
For safety reasons, I have commented out the line that actually deletes the files, so you have an opportunity to review the output before uncommenting it and re-executing the foreach block.
Now, we can't always delete the files that we "think" we no longer need, can we? Sometimes you may not be sure whether the files are redundant and can be safely deleted, so you may just want to compress them instead. In situations like this, I prefer to use the NTFS file compression attribute, as it compresses files in place and transparently.
However, as far as I know, this is a bit tricky to do in PowerShell. Using the Set-ItemProperty command, you can change other attribute bits for a file/folder, such as ReadOnly, Archive, etc., but not the Compressed bit. So instead I use the compact.exe command, still from within PowerShell, like this:
compact /C <File Name>
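To compress a file on a remote server, you can wrap compact.exe in Invoke-Command. A minimal sketch with placeholder computer and file names:

# Compress a file in place on a remote server using NTFS compression.
# '<Computer Name>' and the backup path are placeholders.
Invoke-Command -ComputerName '<Computer Name>' -ScriptBlock {
    compact.exe /C 'O:\MSSQL\BACKUPS\FULL\TESTDB1.BAK'
}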
PowerShell does come with a Compress-Archive command that compresses files in Zip format:
Invoke-Command -ComputerName '<ComputerName>' -ScriptBlock {
    # Splat the parameters for Compress-Archive
    $compress = @{
        Path             = "O:\dbbackup.bak"
        CompressionLevel = "Fastest"
        DestinationPath  = "O:\dbbackup.bak.zip"
    }
    Compress-Archive @compress
}
But now you still have the original file taking up disk space. Once you verify that the Zip file was successfully created, you can manually delete the original file.
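That verification can also be scripted. Here is a minimal sketch, reusing the placeholder paths from the example above:

# Delete the original backup only if the Zip archive was actually created.
Invoke-Command -ComputerName '<ComputerName>' -ScriptBlock {
    if (Test-Path 'O:\dbbackup.bak.zip') {
        Remove-Item 'O:\dbbackup.bak'
    }
}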