Problem description
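I have a table with 1,000 records. Each record represents a file in a subfolder, along with some attributes of that file. The fields/columns of interest are:
__dirpath = the name of each subfolder containing the files of interest
Track = the file's sequence number (these should be contiguous, ranging from 1 to any number)
I am looking for the numbers missing from the sequence formed by the files associated with each __dirpath.
A generic query that lists the start and end of each run of missing numbers in a sequence is as follows (credit: https://www.xaprb.com/blog/2005/12/06/find-missing-numbers-in-a-sequence-with-sql/):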
select start, stop from (
select m.id + 1 as start,
(select min(id) - 1 from sequence as x where x.id > m.id) as stop
from sequence as m
left outer join sequence as r on m.id = r.id - 1
where r.id is null
) as x
where stop is not null order by start, stop;
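To make the mechanics of this generic query concrete, here is a minimal sketch that runs it against an in-memory SQLite database from Python's standard sqlite3 module. The helper name find_gaps and the sample ids are assumptions made for illustration; they are not part of the original question. The inner select keeps only rows whose successor (id + 1) is absent, so each such row marks the start of a gap, and the correlated subquery finds where that gap ends.

```python
import sqlite3

# The generic query from the question, unchanged apart from layout.
GAP_QUERY = """
select start, stop from (
  select m.id + 1 as start,
         (select min(id) - 1 from sequence as x where x.id > m.id) as stop
  from sequence as m
  left outer join sequence as r on m.id = r.id - 1
  where r.id is null
) as x
where stop is not null
order by start, stop
"""


def find_gaps(ids):
    """Load ids into a one-column table and return (start, stop) for each gap.

    Hypothetical helper, written only to exercise the query.
    """
    con = sqlite3.connect(":memory:")
    con.execute("create table sequence (id int not null)")
    con.executemany("insert into sequence (id) values (?)", [(i,) for i in ids])
    rows = con.execute(GAP_QUERY).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    # Made-up data: 4 is missing, and so are 7 through 9.
    print(find_gaps([1, 2, 3, 5, 6, 10]))  # [(4, 4), (7, 9)]
```

The last id never produces a row, because nothing larger exists and its stop is NULL; the outer "where stop is not null" filters it out.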
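However, in my case I need to do the same thing for each sequence formed by the records that share the same __dirpath value. Assuming the sequence table has a __dirpath column in addition to the id field from the generic example, how would one do this?
Here is a table with dummy data; the preceding query works on it without taking __dirpath into account: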
drop table if exists sequence;
create table sequence (__dirpath blob, id int not null);
insert into sequence(__dirpath, id) values
('A', 1), ('A', 2), ('A', 3), ('A', 4), ('A', 6), ('A', 7), ('A', 8), ('A', 9),
('A', 10), ('A', 15), ('A', 16), ('A', 17), ('A', 18), ('A', 19), ('A', 20);
If you then run the following query, you get the correct result set:
select dir, start, stop from (
select m.id + 1 as start,
(select min(id) - 1 from sequence as x where x.id > m.id) as stop, m.__dirpath as dir
from sequence as m
left outer join sequence as r on m.id = r.id - 1
where r.id is null
) as x
where stop is not null order by dir, start, stop;
The result is correct: for A it reports the gaps 5 to 5 and 11 to 14.
If you then add the following records to the table:
insert into sequence(__dirpath, id) values
('B', 1), ('B', 4), ('B', 5), ('B', 6), ('B', 7), ('B', 117), ('B', 14), ('B', 9),
('B', 10), ('B', 15), ('B', 16), ('B', 17), ('B', 18), ('B', 19), ('B', 20);
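and re-run the left outer join query above, the results are meaningless, because the values belonging to both __dirpath = 'A' and __dirpath = 'B' are referenced by the query.
So the question is essentially how to modify the query so that it only references the records belonging to each respective __dirpath entry.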
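The failure described above can be reproduced with a short sketch. It loads both sets of dummy rows into an in-memory database through Python's sqlite3 module and runs the query that ignores __dirpath in its join and subquery. The names ROWS and unpartitioned_gaps are assumptions introduced for this illustration.

```python
import sqlite3

# Dummy rows from the question: directory A, then directory B.
ROWS = (
    [("A", i) for i in (1, 2, 3, 4, 6, 7, 8, 9, 10, 15, 16, 17, 18, 19, 20)]
    + [("B", i) for i in (1, 4, 5, 6, 7, 117, 14, 9, 10, 15, 16, 17, 18, 19, 20)]
)

# The question's query: __dirpath is selected, but neither the join nor the
# correlated subquery restricts rows to the same directory.
UNPARTITIONED_QUERY = """
select dir, start, stop from (
  select m.id + 1 as start,
         (select min(id) - 1 from sequence as x where x.id > m.id) as stop,
         m.__dirpath as dir
  from sequence as m
  left outer join sequence as r on m.id = r.id - 1
  where r.id is null
) as x
where stop is not null
order by dir, start, stop
"""


def unpartitioned_gaps(rows):
    """Hypothetical helper: run the directory-blind query over (dirpath, id) rows."""
    con = sqlite3.connect(":memory:")
    con.execute("create table sequence (__dirpath blob, id int not null)")
    con.executemany("insert into sequence (__dirpath, id) values (?, ?)", rows)
    result = con.execute(UNPARTITIONED_QUERY).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in unpartitioned_gaps(ROWS):
        print(row)
    # ('A', 11, 13)
    # ('A', 21, 116)
    # ('B', 11, 13)
    # ('B', 21, 116)
```

The gap at 5 in A disappears because B contains a 5, and A is reported as missing 21 to 116 only because B contains 117: the query is effectively looking for gaps in the union of both directories.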
Recommended answer
You must add the __dirpath column to both the correlated subquery and the join:
SELECT dir, start, stop
FROM (
SELECT m.id + 1 start,
(SELECT MIN(id) - 1 FROM sequence x WHERE x.__dirpath = m.__dirpath AND x.id > m.id) stop,
m.__dirpath dir
FROM sequence m LEFT JOIN sequence r
ON m.__dirpath = r.__dirpath AND m.id = r.id - 1
WHERE r.id IS NULL
)
WHERE stop IS NOT NULL
ORDER BY dir, start, stop;
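As a quick check of the query above, the following sketch runs it over the question's dummy rows for A and B in an in-memory database, using Python's sqlite3 module. The names ROWS and per_dir_gaps are assumptions made for this example only.

```python
import sqlite3

# Dummy rows from the question: directory A, then directory B.
ROWS = (
    [("A", i) for i in (1, 2, 3, 4, 6, 7, 8, 9, 10, 15, 16, 17, 18, 19, 20)]
    + [("B", i) for i in (1, 4, 5, 6, 7, 117, 14, 9, 10, 15, 16, 17, 18, 19, 20)]
)

# The answer's query: both the join and the correlated subquery now also
# match on __dirpath, so each directory is examined on its own.
PER_DIR_QUERY = """
SELECT dir, start, stop
FROM (
  SELECT m.id + 1 start,
         (SELECT MIN(id) - 1 FROM sequence x
          WHERE x.__dirpath = m.__dirpath AND x.id > m.id) stop,
         m.__dirpath dir
  FROM sequence m LEFT JOIN sequence r
  ON m.__dirpath = r.__dirpath AND m.id = r.id - 1
  WHERE r.id IS NULL
)
WHERE stop IS NOT NULL
ORDER BY dir, start, stop
"""


def per_dir_gaps(rows):
    """Hypothetical helper: return (dir, start, stop) gaps per directory."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE sequence (__dirpath BLOB, id INT NOT NULL)")
    con.executemany("INSERT INTO sequence (__dirpath, id) VALUES (?, ?)", rows)
    result = con.execute(PER_DIR_QUERY).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in per_dir_gaps(ROWS):
        print(row)
    # ('A', 5, 5)
    # ('A', 11, 14)
    # ('B', 2, 3)
    # ('B', 8, 8)
    # ('B', 11, 13)
    # ('B', 21, 116)
```

If the table is large, an index on (__dirpath, id) would probably help both the self-join and the correlated subquery; that is a suggestion, not something the original answer states.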
Another solution, using a CTE and window functions:
WITH cte AS (
SELECT __dirpath, grp, MIN(id) min_id, MAX(id) max_id
FROM (
SELECT *, SUM(flag) OVER (PARTITION BY __dirpath ORDER BY id) grp
FROM (
SELECT *, id - 1 <> LAG(id, 1, id - 1) OVER (PARTITION BY __dirpath ORDER BY id) flag
FROM sequence
)
)
GROUP BY __dirpath, grp
)
SELECT c1.__dirpath,
MAX(c1.max_id) + 1 start,
MIN(c2.min_id) - 1 stop
FROM cte c1 INNER JOIN cte c2
ON c2.__dirpath = c1.__dirpath AND c2.grp = c1.grp + 1
GROUP BY c1.__dirpath, c1.grp
See the demo.
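The window-function query is easier to follow if its intermediate step is looked at separately. The sketch below, run through Python's sqlite3 module, evaluates only the CTE body: flag is 1 whenever an id does not directly follow the previous id in the same __dirpath, and the running sum of flag numbers each unbroken run (an "island"). The gaps are then the spaces between consecutive islands, which the sketch derives in plain Python rather than with the answer's self-join. The names ROWS, islands and gaps_between are assumptions for illustration, and the code assumes the bundled SQLite is version 3.25 or later, since earlier versions lack window functions.

```python
import sqlite3

# Dummy rows from the question: directory A, then directory B.
ROWS = (
    [("A", i) for i in (1, 2, 3, 4, 6, 7, 8, 9, 10, 15, 16, 17, 18, 19, 20)]
    + [("B", i) for i in (1, 4, 5, 6, 7, 117, 14, 9, 10, 15, 16, 17, 18, 19, 20)]
)

# Body of the answer's CTE, with an ORDER BY added so the output is stable.
ISLANDS_QUERY = """
SELECT __dirpath, grp, MIN(id) AS min_id, MAX(id) AS max_id
FROM (
  SELECT *, SUM(flag) OVER (PARTITION BY __dirpath ORDER BY id) AS grp
  FROM (
    SELECT *, id - 1 <> LAG(id, 1, id - 1)
              OVER (PARTITION BY __dirpath ORDER BY id) AS flag
    FROM sequence
  )
)
GROUP BY __dirpath, grp
ORDER BY __dirpath, grp
"""


def islands(rows):
    """Hypothetical helper: return (dirpath, grp, min_id, max_id) per unbroken run."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE sequence (__dirpath BLOB, id INT NOT NULL)")
    con.executemany("INSERT INTO sequence (__dirpath, id) VALUES (?, ?)", rows)
    result = con.execute(ISLANDS_QUERY).fetchall()
    con.close()
    return result


def gaps_between(island_rows):
    """Pair each island with the next one in the same directory."""
    gaps = []
    for prev, nxt in zip(island_rows, island_rows[1:]):
        if prev[0] == nxt[0]:  # same __dirpath
            gaps.append((prev[0], prev[3] + 1, nxt[2] - 1))
    return gaps


if __name__ == "__main__":
    for row in islands(ROWS):
        print(row)  # e.g. ('B', 1, 4, 7) is the run 4..7 in directory B
    print(gaps_between(islands(ROWS)))
```

For the dummy data this yields three islands for A (1 to 4, 6 to 10, 15 to 20) and five for B, and the derived gaps match those returned by the join-based query.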