1. 程式人生 > >SQL SERVER中什麼情況會導致索引查詢變成索引掃描

SQL SERVER中什麼情況會導致索引查詢變成索引掃描

SQL Server 中什麼情況會導致其執行計劃從索引查詢(Index Seek)變成索引掃描(Index Scan)呢? 下面從幾個方面結合上下文具體場景做了下測試、總結、歸納。

1:隱式轉換會導致執行計劃從索引查詢(Index Seek)變為索引掃描(Index Scan)

Implicit Conversion will cause index scan instead of index seek. While implicit conversions occur in SQL Server to allow data evaluations against different data types, they can introduce performance problems for specific data type conversions that result in an index scan occurring during the execution.  Good design practices and code reviews can easily prevent implicit conversion issues from ever occurring in your design or workload.

如下示例,AdventureWorks2014資料庫的HumanResources.Employee表,由於NationalIDNumber欄位型別為NVARCHAR,下面SQL發生了隱式轉換,導致其走索引掃描(Index Scan)

SELECT NationalIDNumber, LoginID  
FROM HumanResources.Employee  
WHERE NationalIDNumber = 112457891 

clipboard

我們可以通過兩種方式避免SQL做隱式轉換:

    1:確保比較的兩者具有相同的資料型別。

    2:使用強制轉換(explicit conversion)方式。

我們通過確保比較的兩者資料型別相同後,就可以讓SQL走索引查詢(Index Seek),如下所示

SELECT nationalidnumber,
       loginid
FROM   humanresources.employee
WHERE  nationalidnumber = N'112457891' 

clipboard[1]

注意:並不是所有的隱式轉換都會導致索引查詢(Index Seek)變成索引掃描(Index Scan),Implicit Conversions that cause Index Scans 部落格裡面介紹了那些資料型別之間的隱式轉換才會導致索引掃描(Index Scan)。如下圖所示,在此不做過多介紹。

clipboard[2]

clipboard[3]

避免隱式轉換的一些措施與方法

    1:良好的設計和程式碼規範(前期)

    2:對釋出指令碼進行Review(中期)

    3:通過指令碼查詢隱式轉換的SQL(後期)

下面是在資料庫從執行計劃中搜索隱式轉換的SQL語句

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
DECLARE @dbname SYSNAME 
SET @dbname = QUOTENAME(DB_NAME());
WITH XMLNAMESPACES 
   (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan') 
SELECT 
   stmt.value('(@StatementText)[1]', 'varchar(max)'), 
   t.value('(ScalarOperator/Identifier/ColumnReference/@Schema)[1]', 'varchar(128)'), 
   t.value('(ScalarOperator/Identifier/ColumnReference/@Table)[1]', 'varchar(128)'), 
   t.value('(ScalarOperator/Identifier/ColumnReference/@Column)[1]', 'varchar(128)'), 
   ic.DATA_TYPE AS ConvertFrom, 
   ic.CHARACTER_MAXIMUM_LENGTH AS ConvertFromLength, 
   t.value('(@DataType)[1]', 'varchar(128)') AS ConvertTo, 
   t.value('(@Length)[1]', 'int') AS ConvertToLength, 
   query_plan 
FROM sys.dm_exec_cached_plans AS cp 
CROSS APPLY sys.dm_exec_query_plan(plan_handle) AS qp 
CROSS APPLY query_plan.nodes('/ShowPlanXML/BatchSequence/Batch/Statements/StmtSimple') AS batch(stmt) 
CROSS APPLY stmt.nodes('.//Convert[@Implicit="1"]') AS n(t) 
JOIN INFORMATION_SCHEMA.COLUMNS AS ic 
   ON QUOTENAME(ic.TABLE_SCHEMA) = t.value('(ScalarOperator/Identifier/ColumnReference/@Schema)[1]', 'varchar(128)') 
   AND QUOTENAME(ic.TABLE_NAME) = t.value('(ScalarOperator/Identifier/ColumnReference/@Table)[1]', 'varchar(128)') 
   AND ic.COLUMN_NAME = t.value('(ScalarOperator/Identifier/ColumnReference/@Column)[1]', 'varchar(128)') 
WHERE t.exist('ScalarOperator/Identifier/ColumnReference[@Database=sql:variable("@dbname")][@Schema!="[sys]"]') = 1

2:非SARG謂詞會導致執行計劃從索引查詢(Index Seek)變為索引掃描(Index Scan)

    SARG(Searchable Arguments)又叫查詢引數, 它的定義:用於限制搜尋的一個操作,因為它通常是指一個特定的匹配,一個值的範圍內的匹配或者兩個以上條件的AND連線。不滿足SARG形式的語句最典型的情況就是包括非操作符的語句,如:NOT、!=、<>;、!<;、!>;NOT EXISTS、NOT IN、NOT LIKE等,另外還有像在謂詞使用函式、謂詞進行運算等。

2.1:索引欄位使用函式會導致索引掃描(Index Scan)

SELECT nationalidnumber,
       loginid
FROM   humanresources.employee
WHERE  SUBSTRING(nationalidnumber,1,3) = '112'

clipboard[4]

2.2索引欄位進行運算會導致索引掃描(Index Scan)

    對索引欄位欄位進行運算會導致執行計劃從索引查詢(Index Seek)變成索引掃描(Index Scan):

    SELECT  * FROM Person.Person WHERE  BusinessEntityID + 10 < 260

clipboard[5]

一般要儘量避免這種情況出現,如果可以的話,儘量對SQL進行邏輯轉換(如下所示)。雖然這個例子看起來很簡單,但是在實際中,還是見過許多這樣的案例,就像很多人知道抽菸有害健康,但是就是戒不掉!很多人可能瞭解這個,但是在實際操作中還是一直會犯這個錯誤。道理就是如此!

SELECT  * FROM Person.Person WHERE  BusinessEntityID  < 250

clipboard[6]

2.3 LIKE模糊查詢回導致索引掃描(Index Scan)

    Like語句是否屬於SARG取決於所使用的萬用字元的型別, LIKE 'Condition%' 就屬於SARG、LIKE ’%Condition'就屬於非SARG謂詞操作

SELECT  * FROM Person.Person WHERE LastName LIKE 'Ma%'

clipboard[7]

SELECT  * FROM Person.Person WHERE LastName LIKE '%Ma%'

clipboard[8]

3:SQL查詢返回資料頁(Pages)達到了臨界點(Tipping Point)會導致索引掃描(Index Scan)或表掃描(Table Scan)

What is the tipping point?

It's the point where the number of rows returned is "no longer selective enough". SQL Server chooses NOT to use the nonclustered index to look up the corresponding data rows and instead performs a table scan.

    關於臨界點(Tipping Point),我們下面先不糾結概念了,先從一個鮮活的例子開始吧:

SET NOCOUNT ON;
DROP TABLE TEST
CREATE TABLE TEST (OBJECT_ID  INT, NAME VARCHAR(8));
CREATE INDEX PK_TEST ON TEST(OBJECT_ID)
DECLARE @Index INT =1;
WHILE @Index <= 10000
BEGIN
    INSERT INTO TEST
    SELECT @Index, 'kerry';
    SET @Index = @Index +1;
END
UPDATE STATISTICS  TEST WITH FULLSCAN;
SELECT * FROM TEST WHERE OBJECT_ID= 1

如上所示,當我們查詢OBJECT_ID=1的資料時,優化器使用索引查詢(Index Seek)

clipboard[9]

上面OBJECT_ID=1的資料只有一條,如果OBJECT_ID=1的資料達到全表總資料量的20%會怎麼樣? 我們可以手工更新2001條資料。此時SQL的執行計劃變成全表掃描(Table Scan)了。

UPDATE TEST SET OBJECT_ID =1 WHERE OBJECT_ID<=2000;
UPDATE STATISTICS  TEST WITH FULLSCAN;
SELECT * FROM TEST WHERE OBJECT_ID= 1

clipboard[10]

clipboard[11]

臨界點決定了SQL Server是使用書籤查詢還是全表/索引掃描。這也意味著臨界點只與非覆蓋、非聚集索引有關(重點)。

Why is the tipping point interesting?

  • It shows that narrow (non-covering) nonclustered indexes have fewer uses than often expected (just because a query has a column in the WHERE clause doesn't mean that SQL Server's going to use that index)

  • It happens at a point that's typically MUCH earlier than expected… and, in fact, sometimes this is a VERY bad thing!

  • Only nonclustered indexes that do not cover a query have a tipping point. Covering indexes don't have this same issue (which further proves why they're so important for performance tuning)

  • You might find larger tables/queries performing table scans when in fact, it might be better to use a nonclustered index. How do you know, how do you test, how do you hint and/or force… and, is that a good thing?

4:統計資訊缺失或不正確會導致索引掃描(Index Scan)

     統計資訊缺失或不正確,很容易導致索引查詢(Index Seek)變成索引掃描(Index Scan)。 這個倒是很容易理解,但是構造這樣的案例比較難,一時沒有想到,在此略過。

5:謂詞不是聯合索引的第一列會導致索引掃描(Index Scan)

SELECT * INTO Sales.SalesOrderDetail_Tmp FROM Sales.SalesOrderDetail;
CREATE INDEX PK_SalesOrderDetail_Tmp ON Sales.SalesOrderDetail_Tmp(SalesOrderID, SalesOrderDetailID);
UPDATE STATISTICS  Sales.SalesOrderDetail_Tmp WITH FULLSCAN;

下面這個SQL語句得到的結果是一致的,但是第二個SQL語句由於謂詞不是聯合索引第一列,導致索引掃描

SELECT * FROM Sales.SalesOrderDetail_Tmp
WHERE SalesOrderID=43659 AND SalesOrderDetailID<10

clipboard[12]

SELECT * FROM Sales.SalesOrderDetail_Tmp WHERE SalesOrderDetailID<10

clipboard[13]

參考資料: