在群里看到一個基礎題,有關索引的使用。
題目就在這里,有的朋友說選B,有的選C,有的說題目不嚴謹,還有的說沒答案,都是錯誤的。
討論了很久,有兩個共性的問題,值得拿出來說下:
- a=1 and b=1 和 b=1 and a=1 會有效利用 idx(b,a) 嗎?
- b=1 還會利用索引 idx(a,b)嗎?
實踐出真知,我就試著上機操作下。
create database factory ;
use factory
go
create table dbo.workflow ( flowid int, flowamount int, flowcount int )
go
先回答第一個問題,判斷條件的順序會影響索引使用嗎
這兒模擬題目中的 idx(b,a) 索引結構
create index idx_amt_id on dbo.workflow(flowamount,flowid)
模擬 a=1 and b=1 的查詢
select * from dbo.workflow
where flowid = 1 and flowamount = 1
模擬 b=1 and a=1 的查詢
select * from dbo.workflow
where flowamount = 1 and flowid = 1
可以看到,當表新建,還沒有數(shù)據(jù)時,優(yōu)化器根本不會去判斷用不用索引,而是直接全表掃描。反正就一個數(shù)據(jù)頁。
當我們加點數(shù)據(jù)時,再看看反應:
這里不得不再提下 tally table 的用法,實在看不下去利用循環(huán)來生成測試數(shù)據(jù)的方法
DECLARE @BEGIN DATETIME = '2010-01-01'
,@END DATETIME = '2017-10-30'
DECLARE @INC INT ;
SELECT @INC = DATEDIFF(DAY,@BEGIN,@END)
; WITH
L0 AS (
SELECT * FROM (VALUES(1),(2),(3)) AS T(C) )
, L1 AS (
SELECT a.C,b.C AS BC FROM L0 AS a cross join L0 AS b )
, L2 AS (
SELECT a.C,b.C AS BC FROM L1 AS a cross join L1 AS b )
, L3 AS (
SELECT a.C,b.C AS BC FROM L2 AS a cross join L2 AS b )
, L4 AS (
SELECT a.C,b.C AS BC FROM L3 AS a cross join L3 AS b )
, L5 AS (
SELECT a.C,b.C AS BC FROM L4 AS a cross join L4 AS b )
insert into dbo.workflow (flowid,flowamount,flowcount)
SELECT TOP 50000 RNK , RNK * 10, RNK + 20
FROM
(
SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS RNK
FROM L5
) M
此時表里有5萬條數(shù)據(jù),再看上面兩條查詢的執(zhí)行計劃:
這兒模擬題目中的 idx(b,a) 索引結構
create index idx_amt_id on dbo.workflow(flowamount,flowid)
模擬 a=1 and b=1 的查詢
select * from dbo.workflow
where flowid = 1 and flowamount = 1
模擬 b=1 and a=1 的查詢
select * from dbo.workflow
where flowamount = 1 and flowid = 1
很明顯,都會走索引 idx(b,a) 這種模式,與 b 在前和 a 在前無關。優(yōu)化器可以優(yōu)化這部分表達式的重組。
但,是不是所有條件表達式都沒有先后順序要求呢?肯定不是
只有在相等條件判斷時,先后順序不重要,一旦有表達式用于非等判斷,順序就很重要了,如下:
select * from dbo.workflow
where flowamount > 39 and flowid = 1
select * from dbo.workflow
where flowid = 1 and flowamount > 39
這里優(yōu)化器提示(綠色字體部分),建立一個相等判斷條件的索引在前,非等判斷字段在后的索引 (flowid,flowamount)。所以本質上,索引結構中字段先后不受制于查詢中相等判斷條件表達式字段的順序,而受制于非等條件判斷表達式。即非等判斷字段(flowamount>39)需要放在相等判斷字段(flowid=1)的后面。
create index idx_id_amtr on dbo.workflow(flowid,flowamount)
select * from dbo.workflow
where flowamount > 39 and flowid = 1
select * from dbo.workflow
where flowid = 1 and flowamount > 39
再看兩者的執(zhí)行計劃:
這里就走了我們剛才新建的索引 idx_id_amtr
第二個問題,b=1 還會利用索引 idx(a,b)嗎?
在上面的示例中,建立 index(flowamount,flowid) 的索引,那么對應到要解決的問題,便是 where flowid = 1 會走 index(flowamount,flowid)的索引嗎?
select * from dbo.workflow
where flowid = 1
由此可見 b=1 是不會利用索引 idx(a,b) 了。
注意,或許 oracle, MySQL, pg, 等其他數(shù)據(jù)庫會有不同,大家可以嘗試實際操作下,再一起來討論。各自優(yōu)化器的算法不同,優(yōu)化略微有些詫異。不必過于糾結。