Understanding the SQL Server Proportional fill algorithm
When creating a database, SQL Server maps this database with minimum two operating system files; the database data file MDF and the database log file LDF. Logically, the database data files are created under a collection set of files that simplifies the database administration, this logical file container is called the Filegroup. Database data files can come in two types, the Primary data files that contains metadata information for the database and pointers to the other database files in addition to the data, where each database should have only one Primary data file and the optional Secondary files to store data only. The basic storage unit in SQL Server is the Page, with each page size equal to 8KB. And each 8 pages with 64KB size called Extent.
The SQL Server component that is responsible for managing the data storage within the database files, called the SQL Server Storage Engine. It uses a fill mechanism that writes data to the database files depending on the amount of free space in each data file rather than writing in each file until it is full then moving to the second one sequentially. This data filling algorithm is called Proportional Fill Algorithm. In other words, the SQL Server Storage Engine will write more frequently to the files with more free space. For example, if the first data file has 10 MB free space and the second one has 20 MB, the storage engine will fill one extent to the first file and two extents to the second one. If auto-growth is enabled to the database files and the database files become full, the SQL Server Database Engine will expand the database files one at a time and write to that file, once the expanded file becomes full, SQL Server Database Engine will expand the second database file and so on.
In the Proportional Fill Algorithm, each database data file will be assigned with a ranking integer number to specify how many times this file will be skipped from the writing process to the next file depending on the free space of that file, this number is called the Skip Target where the minimum value equal to 1 means that a write process will be performed on that file. The Skip Target can be measured by dividing the number of free extents in the file with the largest free space amount by the number of free extents in the current file, as integer value. The larger the free space in the database data file, the smaller Skip Target value. To have one file to write on each loop, there should be minimum one data file with Skip Target value equal to 1. Each time a new database data file is added or removed, or 8192 extents is filled in the database filegroup, the Skip Target value will be calculated again. In this way, all database data files will become full approximately at the same time. The Skip Target calculation can be monitored by enabling the Trace Flag 1165.
While this behavior occurs in any database, with the write-intensive nature of tempdb, an uneven write pattern could cause a bottleneck on the largest file, as more of the writes would happen there.