When normalizing a database, foreign keys are commonly created in order to maintain data integrity between the parent and child tables. Oftentimes, however, the foreign key columns are overlooked when it comes to proper indexing strategy.
Using the Adventureworks2008R2 sample database, available at http://msftdbprodsamples.codeplex.com/, let’s look at what happens when a join is performed on a table where an index is omitted on the foreign key columns, when the index is not properly optimized, and then at an optimal index.
USE AdventureWorks2008R2 GO SET STATISTICS IO ON GO --The first query is run against the Production.WorkOrderRouting table that does --not have an index on the LocationID column (which is a Foreign Key column) SELECT L.* FROM Production.Location L INNER JOIN Production.WorkOrderRouting WR ON L.LocationID = WR.LocationID WHERE WR.LocationID = 30 GO --Notice the Logical Reads below for the Production.WorkOrderRouting table that does not have --an index on the LocationID column --Table 'WorkOrderRouting'. Scan count 1, logical reads 699, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. --Table 'Location'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. --Next, create an index on the LocationID column to support join on the foreign key CREATE INDEX IDX_LocationID ON Production.WorkOrderRouting (LocationID) GO --Rerun the query and observe the difference. The Logical Reads for the query decreased from 699 to 6! SELECT L.* FROM Production.Location L INNER JOIN Production.WorkOrderRouting WR ON L.LocationID = WR.LocationID WHERE WR.LocationID = 30 --Table 'WorkOrderRouting'. Scan count 1, logical reads 6, physical reads 0, read-ahead reads 3, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. --Table 'Location'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. --Lastly, does the column position in the index matter? --Drop the existing index and recreate a composite index with LocationID as the second position in the index DROP INDEX IDX_LocationID ON Production.WorkOrderRouting GO CREATE INDEX IDX_LocationID ON Production.WorkOrderRouting (ProductID, LocationID) GO --Rerun the query and observe the difference again. The Logical Reads for the query is now 129. Although this is --an improvement over having no index (699 reads), it is much higher than the 6 logical reads that occurred with --an optimal index. SELECT L.* FROM Production.Location L INNER JOIN Production.WorkOrderRouting WR ON L.LocationID = WR.LocationID WHERE WR.LocationID = 30 --Table 'WorkOrderRouting'. Scan count 1, logical reads 129, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. --Table 'Location'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. --Now that we have seen the effect of no index, a non-optimal index, and an optimal index on a single --column foreign key, what is the effect of non-optimal indexing on a multi-column foreign key? --Let's create some tables to observe the difference. USE AdventureWorks2008R2 GO SET STATISTICS IO OFF GO CREATE TABLE dbo.Employee (LastName varchar(50), FirstName varchar(50), Address varchar(100) CONSTRAINT PK_Employee PRIMARY KEY (LastName, FirstName)) GO --Populate the Employee Table DECLARE @i int SET @i = 1 DECLARE @LastName varchar(25) SET @LastName = 'Proactive' + CAST (@i AS varchar(7)) DECLARE @FirstName varchar(25) SET @FirstName = 'Miss T. ' + CAST (@i AS varchar(7)) DECLARE @Address varchar(25) SET @Address = '123 ' + CAST (@i AS varchar(7)) + ' Street' WHILE @i < 100000 BEGIN INSERT INTO dbo.Employee (LastName, FirstName, Address) VALUES (@LastName, @FirstName, @Address) SET @i = @i + 1 SET @LastName = 'Proactive' + CAST (@i AS varchar(7)) SET @FirstName = 'Miss T. ' + CAST (@i AS varchar(7)) SET @Address = '123 ' + CAST (@i AS varchar(7)) + ' Street' END GO CREATE TABLE dbo.EmployeeDepartment (LastName varchar(50), FirstName varchar(50), DepartmentName varchar(100), EffectiveDate datetime) GO --Populate the EmployeeDepartment Table DECLARE @i int SET @i = 1 DECLARE @LastName varchar(25) SET @LastName = 'Proactive' + CAST (@i AS varchar(7)) DECLARE @FirstName varchar(25) SET @FirstName = 'Miss T. ' + CAST (@i AS varchar(7)) DECLARE @DepartmentName varchar(25) SET @DepartmentName = 'IT - ' + CAST (@i AS varchar(7)) WHILE @i < 100000 BEGIN INSERT INTO dbo.EmployeeDepartment (LastName, FirstName, DepartmentName, EffectiveDate) VALUES (@LastName, @FirstName, @DepartmentName, GETDATE()) SET @i = @i + 1 SET @LastName = 'Proactive' + CAST (@i AS varchar(7)) SET @FirstName = 'Miss T. ' + CAST (@i AS varchar(7)) SET @DepartmentName = 'IT - ' + CAST (@i AS varchar(7)) END GO --Create the Foreign Key constraint ALTER TABLE dbo.EmployeeDepartment ADD CONSTRAINT FK_EmployeeName FOREIGN KEY (LastName, FirstName) REFERENCES dbo.Employee (LastName, FirstName) GO --Create an index on the Foreign Key constraint CREATE INDEX IDX_LastName ON dbo.EmployeeDepartment (LastName, FirstName) GO --Turn on Statistics IO SET STATISTICS IO ON GO --Run the query with the index created as LastName, FirstName which matches the index key ordinal --position of the Foreign Key. Observe the Logical Reads for EmployeeDepartment. SELECT E.lastname, E.firstname FROM dbo.Employee E INNER JOIN dbo.EmployeeDepartment ED ON E.LastName = ED.LastName AND E.FirstName = ED.FirstName --Table 'EmployeeDepartment'. Scan count 1, logical reads 586, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. --Table 'Employee'. Scan count 1, logical reads 1195, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. --Drop the index and recreate it swapping the position to First, LastName DROP INDEX IDX_LastName ON dbo.EmployeeDepartment GO CREATE INDEX IDX_LastName ON dbo.EmployeeDepartment (FirstName, LastName) GO --Run the query with the index created as FirstName, LastName. Observe the difference in Logical Reads for --EmployeeDepartment, as well as the addition of a "Worktable" object. SELECT E.lastname, E.firstname FROM dbo.Employee E INNER JOIN dbo.EmployeeDepartment ED ON E.LastName = ED.LastName AND E.FirstName = ED.FirstName --Table 'Employee'. Scan count 5, logical reads 1337, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. --Table 'EmployeeDepartment'. Scan count 5, logical reads 645, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. --Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Now that we have seen the effects of a missing index and an index that is not optimal for the foreign key, we will look at identifying foreign keys that do not have an index and that have the index order reversed from the order of the foreign key.
--Identify Foreign Keys columns that are not indexed, including multi-column Foreign Keys, --or Foreign Key columns that are indexed, but the position of the key in the index is not --the same position of the column in the index. The query for identifying each missing index --on the Foreign Key as well as key ordinal position issues is identical and will be rerun --below a number of times to demonstrate each scenario. DROP INDEX IDX_LocationID ON Production.WorkOrderRouting GO --Index is missing SELECT FK.name ForeignKeyName, SCHEMA_NAME(FK.schema_id) SchemaName, OBJECT_NAME(FK.parent_object_id) TableName, COL.name ForeignKeyColumnName, IDXCOL.object_id, FKC.constraint_column_id ForeignKeyColumnPosition, IDXCOL.key_ordinal IndexKeyColumnPosition FROM sys.foreign_keys FK INNER JOIN sys.foreign_key_columns FKC ON FK.object_id = FKC.constraint_object_id INNER JOIN sys.columns COL ON FKC.parent_column_id = COL.column_id AND FKC.parent_object_id = COL.object_id LEFT JOIN sys.index_columns IDXCOL ON FKC.parent_column_id = IDXCOL.column_id AND FKC.parent_object_id = IDXCOL.object_id WHERE OBJECT_NAME(FK.parent_object_id) = 'WorkOrderRouting' AND (IDXCOL.object_id IS NULL OR IDXCOL.key_ordinal <> FKC.constraint_column_id) ORDER BY ForeignKeyName, SchemaName, TableName, ForeignKeyColumnPosition GO --Create an index with Location ID in the second position in the index CREATE INDEX IDX_LocationID ON Production.WorkOrderRouting (ProductID, LocationID) GO --Identify where the key ordinal of the Foreign Key does not match the key ordinal of the index SELECT FK.name ForeignKeyName, SCHEMA_NAME(FK.schema_id) SchemaName, OBJECT_NAME(FK.parent_object_id) TableName, COL.name ForeignKeyColumnName, IDXCOL.object_id, FKC.constraint_column_id ForeignKeyColumnPosition, IDXCOL.key_ordinal IndexKeyColumnPosition FROM sys.foreign_keys FK INNER JOIN sys.foreign_key_columns FKC ON FK.object_id = FKC.constraint_object_id INNER JOIN sys.columns COL ON FKC.parent_column_id = COL.column_id AND FKC.parent_object_id = COL.object_id LEFT JOIN sys.index_columns IDXCOL ON FKC.parent_column_id = IDXCOL.column_id AND FKC.parent_object_id = IDXCOL.object_id WHERE OBJECT_NAME(FK.parent_object_id) = 'WorkOrderRouting' AND (IDXCOL.object_id IS NULL OR IDXCOL.key_ordinal <> FKC.constraint_column_id) ORDER BY ForeignKeyName, SchemaName, TableName, ForeignKeyColumnPosition GO --What about composite Foreign Keys? DROP INDEX IDX_LastName ON dbo.EmployeeDepartment GO --Missing index on composite Foreign Key SELECT FK.name ForeignKeyName, SCHEMA_NAME(FK.schema_id) SchemaName, OBJECT_NAME(FK.parent_object_id) TableName, COL.name ForeignKeyColumnName, IDXCOL.object_id, FKC.constraint_column_id ForeignKeyColumnPosition, IDXCOL.key_ordinal IndexKeyColumnPosition FROM sys.foreign_keys FK INNER JOIN sys.foreign_key_columns FKC ON FK.object_id = FKC.constraint_object_id INNER JOIN sys.columns COL ON FKC.parent_column_id = COL.column_id AND FKC.parent_object_id = COL.object_id LEFT JOIN sys.index_columns IDXCOL ON FKC.parent_column_id = IDXCOL.column_id AND FKC.parent_object_id = IDXCOL.object_id WHERE OBJECT_NAME(FK.parent_object_id) = 'EmployeeDepartment' AND (IDXCOL.object_id IS NULL OR IDXCOL.key_ordinal <> FKC.constraint_column_id) ORDER BY ForeignKeyName, SchemaName, TableName, ForeignKeyColumnPosition GO --Create an index with the key positions reversed from the Foreign Key positions CREATE INDEX IDX_LastName ON dbo.EmployeeDepartment (FirstName, LastName) GO --Key ordinal positions in the index are not the same as in the Foreign Key SELECT FK.name ForeignKeyName, SCHEMA_NAME(FK.schema_id) SchemaName, OBJECT_NAME(FK.parent_object_id) TableName, COL.name ForeignKeyColumnName, IDXCOL.object_id, FKC.constraint_column_id ForeignKeyColumnPosition, IDXCOL.key_ordinal IndexKeyColumnPosition FROM sys.foreign_keys FK INNER JOIN sys.foreign_key_columns FKC ON FK.object_id = FKC.constraint_object_id INNER JOIN sys.columns COL ON FKC.parent_column_id = COL.column_id AND FKC.parent_object_id = COL.object_id LEFT JOIN sys.index_columns IDXCOL ON FKC.parent_column_id = IDXCOL.column_id AND FKC.parent_object_id = IDXCOL.object_id WHERE OBJECT_NAME(FK.parent_object_id) = 'EmployeeDepartment' AND (IDXCOL.object_id IS NULL OR IDXCOL.key_ordinal <> FKC.constraint_column_id) ORDER BY ForeignKeyName, SchemaName, TableName, ForeignKeyColumnPosition GO --Cleanup DROP TABLE dbo.EmployeeDepartment DROP TABLE dbo.Employee DROP INDEX IDX_LocationID ON Production.WorkOrderRouting GO
In conclusion, remember that extra logical reads are not really logical at all. Let’s work to prevent them by creating optimal indexes for Foreign Keys that are used in join scenarios.
Katie Holmes Victoria Beckham Missy Peregrym Sarah Gellman Eliza Dushku
No comments:
Post a Comment