Lets explore the topic “Index Join in sql server” starting with “what is index join?”. Index join is a technique which uses an index intersection with two or more indexes to fulfill a query completely. If a table has multiple small index instead of a large covering index, the optimizer can choose index intersection on these small indexes to satisfy a query. In index intersection, it has to perform logical reads on all the indexes being used to serve the query.
If the querying table is large and it has multiple narrow indexes instead of one wide index, index intersection can be used by sql server to fetch the entire data to be returned.
Create a table as below;
CREATE TABLE DemoTable ( EmpID INT, EmpName VARCHAR(100), EmpAddress VARCHAR(500), EmpContact VARCHAR(15), EmpPinCode VARCHAR(15) )
Now we have created a table and as per the next step, we have to insert large number of records into the table as below;
INSERT INTO DemoTable (EmpID, EmpName, EmpAddress, EmpContact, EmpPinCode) SELECT ROW_NUMBER() OVER(ORDER BY(SELECT 0)), 'Employee Name ' + CAST(column_id AS VARCHAR(10)), 'Demo address of employee : ' + a.name, LEFT(CAST(ABS(CHECKSUM(NEWID())) AS VARCHAR(10)) + '0000000000', 10), LEFT(CAST(ABS(CHECKSUM(NEWID())) AS VARCHAR(10)) + '000000', 6) FROM SYS.COLUMNS a cross join sys.tables b
Now to demonstrate the index join, i am going to create two narrow nonclustered indexes on two different columns of this table as below;
CREATE NONCLUSTERED INDEX noncidx_DemoTable_EmpName ON DemoTable(EmpName) CREATE NONCLUSTERED INDEX noncidx_DemoTable_EmpAddress ON DemoTable(EmpAddress)
Keep in mind that the size of table matters much more here. If size of the table is small then instead of index join, optimizer may choose a bookmark lookup (RID lookup or key lookup). So for this we need some more number of records like 5000+ (vary as per number of columns and other size factors).
For your ready reference, above is the details of the space of table which is being used in this demo.
Now to see the index join in action run this below query;
SELECT EmpName, EmpAddress FROM DemoTable WHERE EmpName = 'Employee Name 12' AND EmpAddress = 'Demo address of employee : bitpos'
Another important point is the values in where clause for columns ‘Employee Name 12‘ and ‘Demo address of employee : bitpos‘, which can vary on your machine.
In above figure, we can see the index intersection on “noncidx_DemoTable_EmpName” and”noncidx_DemoTable_EmpAddress “, which we have created above.
Pros and Cons
- Can serve more queries – Instead of having one wide index on multiple columns, we can choose multiple narrow indexes, which can server more queries. Because if the left most column of index is not being used in Join or Where clause of the query, index will not be used. But if we have created multiple narrow indexes, all the indexes can server individual queries and can also be used together with index join or index intersection to produce the complete required result.
- Decreases index maintenance cost – If we have 5 columns in our table and have different combination of these columns with many queries which can be maximum up to !5(Factorial 5) = 5 x 4 x 3 x 2 x 1 = 120, then we have to create 120 indexes on the table and it will increase the cost of Index maintenance during Insert, Update and Delete operations.
- Different index scan on different indexes, requires extra logical reads on the index pages in comparison of a wide covering index which does not need this extra logical read.
- After getting data from different indexes, these data are combined with a join technique, which needs an extra effort to combine the fetched records from various indexes.
Index intersection or index join is not a complete replacement of a covering index, instead both can be used on a single table. For example, if we have a table with many columns and some queries on this table runs very frequently (10 times in a second), replacing one wide covering index with multiple narrow indexes will degrade the query performance drastically. And if this table has various types of queries with different different combination of columns and the frequency of running these queries are not too high, then having multiple index on different columns will decrease the index maintenance cost by removing a lot of covering indexes and by creating some narrow indexes which can server many queries using index join.
Please share your comments below about this article.