Track Data Changes in SQL Server with Change Tracking
SQL Server provides three mechanisms to track data changes in the table objects. This mechanism is also a useful solution for implementing the audit and logging process for the SQL Server Instance. The mechanisms are Change Tracking also known as CT, Change Data Capture also known as CDC, and Temporal Table (System Versioned Table). We can also implement a custom logging and auditing process with help of Triggers on DML Events, which we illustrated in our previous post here. Change Tracking or CT is a lightweight synchronous solution that only tracks the row changes in the table but doesn't capture the data changes. Due to this limitation of capturing changes in data and its history, the mechanism has less overhead cost compared to others. In simpler terms, CT tracks DML events on the table causing data modifications, and it will return by providing the Primary Key column value of the modified row, the changed column, and the modification type, it doesn't return the data changes done for the columns and it doesn't also provide the previous data before change event. For example, if a row is inserted, then updated multiple times, and finally deleted, CT will record only the last delete operation in the transaction, we will illustrate this with an example later. Thus, to implement CT, the table should have a primary key column. Before moving forward and playing with CT implementation, let's create a database and a table with data in it.
CREATE DATABASE DemoDB;
use DemoDB;
CREATE TABLE products(
id int primary key identity(1,1),
item_name varchar(20) not null,
item_description varchar(40),
unit_price decimal(8,2)
);
SET IDENTITY_INSERT [dbo].[products] ON
GO
INSERT [dbo].[products] ([id], [item_name], [item_description], [unit_price]) VALUES (1, N'Compactor', NULL, CAST(7145.00 AS Decimal(8, 2)))
GO
INSERT [dbo].[products] ([id], [item_name], [item_description], [unit_price]) VALUES (2, N'Crawler', NULL, CAST(9549.00 AS Decimal(8, 2)))
GO
INSERT [dbo].[products] ([id], [item_name], [item_description], [unit_price]) VALUES (3, N'Excavator', NULL, CAST(39549.00 AS Decimal(8, 2)))
GO
INSERT [dbo].[products] ([id], [item_name], [item_description], [unit_price]) VALUES (4, N'Scraper', NULL, CAST(4883.00 AS Decimal(8, 2)))
GO
INSERT [dbo].[products] ([id], [item_name], [item_description], [unit_price]) VALUES (5, N'Grader', NULL, CAST(9549.00 AS Decimal(8, 2)))
GO
INSERT [dbo].[products] ([id], [item_name], [item_description], [unit_price]) VALUES (6, N'Trencher', NULL, CAST(7347.00 AS Decimal(8, 2)))
GO
INSERT [dbo].[products] ([id], [item_name], [item_description], [unit_price]) VALUES (7, N'Backhoe', NULL, CAST(1275.00 AS Decimal(8, 2)))
GO
INSERT [dbo].[products] ([id], [item_name], [item_description], [unit_price]) VALUES (8, N'Dragline', NULL, CAST(9549.00 AS Decimal(8, 2)))
GO
INSERT [dbo].[products] ([id], [item_name], [item_description], [unit_price]) VALUES (9, N'Dump Truck', NULL, CAST(25241.00 AS Decimal(8, 2)))
GO
INSERT [dbo].[products] ([id], [item_name], [item_description], [unit_price]) VALUES (10, N'Item 1', NULL, CAST(500.00 AS Decimal(8, 2)))
GO
SET IDENTITY_INSERT [dbo].[products] OFF
GO
SELECT * FROM products;
Enabling Change Tracking
To enable CT on the table, we have to enable it on the database level first. Since we have created a demo database for the illustration with the table and its data, let's enable Change Tracking for our DemoDB database.
use DemoDB;
ALTER DATABASE DemoDB
SET CHANGE_TRACKING = ON
(CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON);
As the alter statement above enables Change_Tracking on Database Level, we need to mention retention value and CleanUp value. Change_retention field requires value and unit, so the data can be retained for the mentioned period in the internal on-disk tables and the Auto_cleanup field takes ON or OFF as value, for enabling or disabling the auto cleanup process. If auto cleanup is enabled then it will delete the internal on-disk tables data after the mentioned retention period automatically. After enabling CT on the Database level, we also have to enable it for each table we want to track.
USE DemoDB
GO
ALTER TABLE products
ENABLE CHANGE_TRACKING
WITH (TRACK_COLUMNS_UPDATED = ON);
With the alter statement above, we enabled the change tracking for the table we wanted. We could also do the enabling of both, the database level and table level through the management studio, by visiting database properties and table properties respectively. Now let's see how Change Tracking works with each DML event.
Before moving forward, we need to know one thing, it is about the version number counter, and is incremented automatically with any DML operation on the tracked table. To check the current version of the tracked table, use the below query
SELECT CHANGE_TRACKING_CURRENT_VERSION();
DML Events making Changes
Insert Operation
To see the working of Change Tracking in the case of an Insert operation, let's insert 2 new records in our product table which is a tracked table, and see how we can track the new insertions
INSERT INTO products(item_name, unit_price) VALUES
('Item 1', 322.45),
('Item 2', 137.90);
We can utilize the Change Tracking functions and CHANGETABLE system function, to access the data saved in the internal table after the INSERT statement executed above. Each DML event causing data modifications increments the version number counter. The function returns all the changes made on the tracked table with the version counter. Query the select statement given below to view the modifications in the tracked table.
SELECT * FROM CHANGETABLE
(CHANGES products,0) as CT ORDER BY SYS_CHANGE_VERSION;
image: In the image, we can see the version counter, type of DML Event, which is I in our case as we did inserts along with the primary key column as value for the newly inserted rows. We also see that SYS_CHANGE_COLUMNS, SYS_CHANGE_CONTEXT column currently has a NULL value. SYS_CHANGE_COLUMNS returns the list of columns which was modified since the last version counter and are of binary data type. Also, note that Inserts and Deletes will always have an SYS_CHANGE_COLUMNS value "null", a whole row is affected by an insert or a delete. SYS_CHANGE_CONTEXT is optional and is used to specify the context changes using WITH clause while performing DML changes.
Update Operation
Like the insert operation, let's perform the update operation, by updating unit_price to 500.00 for product id 10 and also two updates for product id 11 in the products table. Before proceeding to execute the update statement, check the current version number.
UPDATE products SET unit_price = 500.00 WHERE id = 10;
UPDATE products SET unit_price = 500.00 WHERE id = 11;
UPDATE products SET unit_price = 520.00 WHERE id = 11;
Now let's see the effect of the above update statement in our internal table, with the same select statement executed above.
SELECT * FROM CHANGETABLE
(CHANGES products, 1) as CT ORDER BY SYS_CHANGE_VERSION;
image:
In the snapshot above, we can see that SYS_CHANGE_VERSION which is maintaining the counter incremented for the id 10 along with U Char for the SYS_CHANGE_OPERATION column denoting that the last operation on the row was an update statement, and SYS_CHANGE_COLUMNS contains the binary value for the columns modified. Notice that the above select statement is a little different than the previous select statement we used in the Insert part. Along with the table name in the CHANGES sub-clause we also provided 1 as a literal value which was my current version counter before the update operation. Mentioning it in the select query from the changetable, we get the record of the modification made after the last version mentioned. Even though product id 11 has been updated twice, we only can get the record of the most recent modification made to the row. It doesn't provide the history of data changes made. Now let's see how a delete operation is recorded in the Changetable internal table.
Delete
Now, let's delete the product having id 11 from the products table, by executing the below delete statement. Before that, you can check the current version, which we use to query changetable later. My current counter value is 4.
DELETE FROM products where id = 11;
Now let's query the changetable and see the records reflecting our delete operation
SELECT * FROM CHANGETABLE
(CHANGES products, 4) as CT ORDER BY SYS_CHANGE_VERSION;
image: In the above select query to fetch the records from the changetable for the table products, we passed the literal value 4 after the comma in the CHANGES sub-clause, it was my version counter value before the delete statement was executed. And we can see the version counter has incremented and the SYS_CHANGE_OPERATION value is Character D denoting the Delete operation. SYS_CHANGE_COLUMNS depicting the list of the columns changed will be null, as it was a delete operation. Now, let's run the same select query on the changetable for the products table, but this time we will provide literal value 1 in place of 4 for the CHANGES clause within the brackets. This will return the last modifications made in the products table after our first insertion for the records.
SELECT * FROM CHANGETABLE
(CHANGES products, 1) as CT ORDER BY SYS_CHANGE_VERSION;
image: In the snapshot of the output above, we can infer that the changetable only records the last modification made for the particular record row. Earlier in the Update part, we perform an update operation twice on product id 11 still it only shows the record of the last operation for product id 11.
Disable Change_Tracking
To disable the Change_Tracking (CT) function, we need to first disable tracking for the tracked tables. After disabling tracking on the table, we can disable it for the database with the T-SQL code below
ALTER TABLE products
DISABLE CHANGE_TRACKING
GO
ALTER DATABASE DemoDB
SET CHANGE_TRACKING = OFF
GO
The Change_Tracking function has limited usage and it's mainly the best use case where we want to track the data changes on the table, but we have no requirement for the pre and post-data modifications, or we only want to know if the row is modified or not. One such use case is implementing a custom cache layer for the application where we can synchronize our cache database for the application or reinitialize it as the case may be. In the next post for Change Tracking, we will be looking at a few Change Tracking Functions.