9 Minuten Lesezeit (1718 Worte)

"Deflate!": Detecting Buffer Pool Fragmentation in SQL Server

In database systems, the size of the main memory has a significant impact on performance. The goal is to keep as much relevant data as possible in the memory. Due to the structure of the memory, it can happen that part of the memory contains "air" in a figurative sense that could be used much more sensibly.

Fortunately, Microsoft SQL Server provides a good source for tracking down this waste of space that provides a good basis for responding to this problem in a constructive way. But so far it is used far too rarely, and I would like to change that.

Some basics: What exactly is fragmentation?

To begin with, we need to clarify a few terms and, above all, dispel some old myths. Many have certainly heard the term fragmentation before. Many database systems regularly run processes to counteract this fragmentation: The so-called defragmentation.

In the following I will mainly talk about indexes, but the same applies almost analogously to tables. Especially with SQL Server, where most tables are eventually stored in the form of a clustered index.

If an index is newly created, all pages are on the one hand close to each other and on the other hand in the "correct" order. In addition, the pages are almost all completely filled. The memory space is used optimally. There is no "air" in the pages. In the course of time, however, new pages are needed for the index due to the addition, modification, and deletion of data records, and these are then generally no longer located in proximity to the previous pages. In addition, the pages are no longer completely filled. The index starts to get fragmented.

At this point, we need to distinguish between two types of fragmentation: external and internal fragmentation. External fragmentation refers to the location of the pages, internal fragmentation refers to the fill level of the pages.

External fragmentation is not relevant.

External fragmentation is almost exclusively caused by page splits. These are necessary when new entries have to be written between the previous entries of an index, but there is no longer enough space on the corresponding page. The new pages are then no longer close to the previous pages. When reading the index sequentially, jumps occur, which are called external fragmentation.

The focus on external fragmentation comes from a time when data was stored on spinning disks and a moving arm took some time to move to the appropriate location. Back then, it was important and advisable to store data as close to each other as possible so that it could be accessed as quickly as possible.

With modern storage systems, external fragmentation only affects performance to a relevant degree in very rare cases. Unfortunately, the value of the "avg_fragmentation_in_percent" column in the "sys.dm_db_index_physical_stats" system function, which is frequently used for analysis, only refers to external fragmentation. Defragmentation based on this value is therefore not necessary in many cases and creates effort without generating relevant benefits. Since most of the tools used for defragmentation only consider this value, their use should definitely be critically questioned in my opinion.

Internal fragmentation is relevant: The "air" must yield!

Internal fragmentation only considers the individual pages and the data stored on them, regardless of the position of the pages in relation to each other. The term "high internal fragmentation" is thus only a synonym for "low fill factor of the pages".

This type of fragmentation also occurs because of the page splits already mentioned. In most cases, the resulting pages are only half filled. However, internal fragmentation also occurs, for example, when data records are deleted. Especially when historical data is regularly removed from tables, the indexes can be affected by internal fragmentation.

The focus should therefore be primarily on internal fragmentation. This value is also provided by the addressed system function. The name of the column is "avg_page_space_used_in_percent". However, the mode "SAMPLED" or "DETAILED" must be used for this. Unfortunately, this leads to more pages having to be loaded into the main memory for analysis than is necessary in the "LIMITED" mode. Above all, pages are only loaded for analysis that are not currently needed by the application. Thus, the really needed pages are displaced from the main memory by this process. Especially for a regular analysis, this is not suitable from my point of view.

As an alternative, I recommend the system view "sys.dm_os_buffer_descriptors", which supplies for each page in the main memory a line with information from the header of this page. Among other things, the column "free_space_in_bytes" specifies the free space needed for the analysis. We therefore only access the main memory and take only these pages into consideration, which are currently relevant for the application or were at least some time ago. So, we do not look at the "air" on the disk, but at the "air" already in the main memory.

Analysis of the entire instance

The mentioned system view can be used in two ways: Globally for the entire instance, or specifically for one database. The view across all databases provides a good initial overview of how the main memory is distributed among the individual databases. However, an analysis based on individual indexes is only possible per database, since several database-specific system views must also be taken into account in order to be able to display all the required information.

The following queries should show you some possibilities for analysis. Filtering and sorting of the queries must be adapted depending on the concrete question and situation.

The first query gives us information about all databases. It provides us an overview of the number of pages used, the amount of space used and the available free space („air").

SELECT database_id
     , DB_NAME(database_id) AS database_name
     , COUNT(*) AS pages
     , COUNT(*)*8/1024 AS mb_total
     , SUM(CAST(free_space_in_bytes AS bigint))/1024/1024 AS mb_free
  FROM sys.dm_os_buffer_descriptors
 WHERE database_id BETWEEN 5 and 32760
 GROUP BY database_id 
 ORDER BY SUM(CAST(free_space_in_bytes AS bigint)) DESC;

The second query goes one step further and subdivides not only by databases, but also by the type of page. The focus is placed on the types relevant to the topic of fragmentation: Data pages and index pages. In the case of index pages, a distinction is also made between the lowest level, which is primarily relevant, and the pages of the tree above it. In addition, only groups with more than 100 MB of free space are displayed, which provides a better overview, especially for instances with many databases.

WITH data AS (
  SELECT database_id
       , DB_NAME(database_id) AS database_name
       , CASE
           WHEN page_type = 'DATA_PAGE'
           THEN 'DATA_PAGE'
           WHEN page_type = 'INDEX_PAGE'
            AND page_level = 0
           THEN 'INDEX_PAGE_leaf'
           ELSE 'INDEX_PAGE_non_leaf'
         END AS page_type_and_level
       , CAST(free_space_in_bytes AS bigint) AS free_space_in_bytes
    FROM sys.dm_os_buffer_descriptors
   WHERE database_id BETWEEN 5 and 32760
     AND page_type IN ('DATA_PAGE', 'INDEX_PAGE')
SELECT database_id
     , database_name
     , page_type_and_level
     , COUNT(*) AS pages
     , COUNT(*)*8/1024 AS mb_total
     , SUM(free_space_in_bytes)/1024/1024 AS mb_free
  FROM data
 GROUP BY database_id
        , database_name
        , page_type_and_level
HAVING SUM(free_space_in_bytes)/1024/1024 > 100
 ORDER BY SUM(CAST(free_space_in_bytes AS bigint)) DESC;

Thus, in the next step, we can focus on those databases where a notable effect can be achieved.

Analysis of a single database

For a single database, the names of the affected tables and indexes can now be added as well. The query must be executed in the context of the database.

Again, the following query restricts the output to the relevant aspects. Only data pages as well as index pages of the lowest level are considered. Equivalent to the corresponding column in the system function "sys.dm_db_index_physical_stats", I calculate the average fill factor of the pages. As a comparison value, the fill factor configured in the index is also specified.

SELECT DB_ID() AS database_id
     , DB_NAME() AS database_name
     , o.name AS table_name
     , i.index_id
     , i.name AS index_name
     , i.type_desc AS index_type
     , CASE
         WHEN bd.page_type = 'DATA_PAGE'
         THEN 'DATA_PAGE'
         ELSE 'INDEX_PAGE_leaf'
       END AS page_type_and_level
     , COUNT(*) AS pages
     , COUNT(*)*8/1024 AS mb_total
     , SUM(CAST(bd.free_space_in_bytes AS bigint))/1024/1024 AS mb_free
     , AVG(CAST(bd.free_space_in_bytes AS bigint)) AS avg_free_space_in_bytes
     , (8096-AVG(CAST(bd.free_space_in_bytes AS bigint)))*100/8096 AS avg_page_space_used_in_percent
     , CASE 
         WHEN i.fill_factor = 0
         THEN 100 
         ELSE i.fill_factor
       END AS index_fill_factor
  FROM sys.dm_os_buffer_descriptors AS bd
  JOIN sys.allocation_units AS au ON bd.allocation_unit_id = au.allocation_unit_id
  JOIN sys.partitions AS p ON au.container_id = p.hobt_id
  JOIN sys.objects AS o ON p.object_id = o.object_id
  JOIN sys.indexes AS i ON p.object_id = i.object_id AND p.index_id = i.index_id
 WHERE bd.database_id = DB_ID()
   AND bd.page_type IN ('DATA_PAGE', 'INDEX_PAGE')
   AND bd.page_level = 0
   AND au.type_desc = 'IN_ROW_DATA'
   AND o.is_ms_shipped = 0
 GROUP BY o.name
        , i.index_id
        , i.name
        , i.type_desc
        , i.fill_factor
        , bd.page_type
 ORDER BY SUM(CAST(bd.free_space_in_bytes AS bigint)) DESC;

The first glance should be at the "mb_free" column. Reorganization or rebuilding of an index should always lead to a significant reduction in the "air". However, the ratio of the current fill factor to the fill factor configured in the index must always be considered, because if the deviation is very small, the corresponding effect would be small as well.

If you need support in interpreting the values, we will be happy to help you. Please feel free to contact us.

More queries like this can be found in my repository on GitHub.


By choosing the right system view, information about the internal fragmentation relevant for performance can be collected in a resource-saving way. Based on this, customized processes for reorganizing tables and indexes can be developed that also work for very large databases. This avoids unnecessary reorganizations.

Principal Consultant bei ORDIX



Derzeit gibt es keine Kommentare. Schreibe den ersten Kommentar!
Dienstag, 25. Juni 2024

Sicherheitscode (Captcha)

Informiert bleiben!

Bei Updates im Blog, informieren wir per E-Mail.