Duplicate rows in a database table are a common problem: the duplicates have to be cleaned up while one row from each group is kept. How can this be done accurately with a single SQL statement?
1. Create the table and test data
1.1 Create a test table in the database
CREATE TABLE `test` (
`id` INT NOT NULL AUTO_INCREMENT,
`c1` VARCHAR(20) DEFAULT NULL,
`c2` VARCHAR(20) DEFAULT NULL,
`c3` INT DEFAULT NULL,
`c4` DATETIME DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=INNODB DEFAULT CHARSET=utf8;
1.2 Insert test data
INSERT INTO test(c1,c2,c3,c4) VALUES( 'a','b',10, '2022-05-24 18:00:46'),('a','c',20, '2022-05-24 18:00:46');
INSERT INTO test(c1,c2,c3,c4) VALUES( 'a','c',10, '2022-05-24 18:00:46'),('a','b',20, '2022-05-24 18:00:46');
INSERT INTO test(c1,c2,c3,c4) VALUES( 'b','c',10, '2022-05-24 18:00:46'),('d','b',20, '2022-05-24 18:00:46');
INSERT INTO test(c1,c2,c3,c4) VALUES( 'b','c',20, '2022-05-24 18:00:46'),('d','b',30, '2022-05-24 18:00:46');
INSERT INTO test(c1,c2,c3,c4) VALUES( 'b','c',20, '2022-05-24 18:00:46'),('a','b',40, '2022-05-24 18:00:46');
INSERT INTO test(c1,c2,c3,c4) VALUES( 'd','b',40, '2022-05-24 18:00:46'),('r','f',40, '2022-05-24 18:00:46');
1.3 Find the duplicate data
For example, if the combination of the two columns c1 and c2 is the uniqueness condition, the SQL that finds the duplicates is:
SELECT
c1,
c2,
COUNT(*)
FROM
test
GROUP BY c1,
c2
HAVING COUNT(*) > 1;
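With the test data above, the query returns four duplicate groups: (a, b) with 3 rows, (a, c) with 2 rows, (b, c) with 3 rows, and (d, b) with 3 rows.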
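Before deleting anything, it can help to see which ids belong to each duplicate group and which one would survive. The following is a minimal sketch, assuming the row with the largest id in each group is the one to keep (the choice made in section 2). The column aliases cnt, keep_id, and all_ids are illustrative and not part of the original article.

```sql
-- List every duplicate group with its row count, the id that would be kept,
-- and all ids in the group. Aliases are illustrative names.
SELECT
    c1,
    c2,
    COUNT(*) AS cnt,
    MAX(id) AS keep_id,                      -- assumption: keep the largest id
    GROUP_CONCAT(id ORDER BY id) AS all_ids  -- MySQL aggregate, comma-separated
FROM test
GROUP BY c1, c2
HAVING COUNT(*) > 1;

-- Assuming a freshly created table whose ids start at 1, the groups are
-- (row order is not guaranteed):
--   a | b | 3 | 10 | 1,4,10
--   a | c | 2 |  3 | 2,3
--   b | c | 3 |  9 | 5,7,9
--   d | b | 3 | 11 | 6,8,11
```

Every id in all_ids other than keep_id is a row the cleanup would remove, which gives 7 rows for this data set.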
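2. How to delete the duplicate data
2.1 Approach 1
Many developers habitually reason as follows:
- First find the duplicated records (using IN)
- Then, among those duplicated records, find the ones whose id is not the maximum id of their group
- Simply change the SELECT to a DELETE to remove them
The query SQL is: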
SELECT * FROM test
WHERE (c1,c2) IN (
SELECT c1,c2
FROM test
GROUP BY c1,c2
HAVING COUNT(*)>1 )
AND id NOT IN (
SELECT MAX(id)
FROM test
GROUP BY c1,c2
HAVING COUNT(*)>1)
ORDER BY c1,c2
;
The result set looks right, but after changing the statement to a DELETE, running it gives the following:
-- delete SQL
DELETE FROM test
WHERE (c1,c2) IN (
SELECT c1,c2
FROM test
GROUP BY c1,c2
HAVING COUNT(*)>1 )
AND id NOT IN (
SELECT MAX(id)
FROM test
GROUP BY c1,c2
HAVING COUNT(*)>1)
An error is reported:
Error code: 1093
You can't specify target table 'test' for update in FROM clause
In other words, when the table being deleted from also appears in the IN subquery, MySQL does not allow the DELETE to run directly.
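One commonly used workaround for error 1093 is to wrap the subquery in a derived table, so that MySQL evaluates it as a separate temporary result rather than reading the target table directly. The following is a minimal sketch against the test table above, not the form this article goes on to recommend. The names keep_id and t are illustrative. It keeps the largest id for every (c1, c2) combination, so the HAVING filter is not needed.

```sql
-- Sketch: delete every row whose id is not the largest id of its (c1, c2) group.
-- The inner SELECT is wrapped in a derived table (alias t) so the DELETE
-- no longer selects from its own target table directly.
DELETE FROM test
WHERE id NOT IN (
    SELECT keep_id
    FROM (
        SELECT MAX(id) AS keep_id   -- one surviving id per group
        FROM test
        GROUP BY c1, c2
    ) AS t
);
```

On the sample data this should remove the same 7 rows. Because id is declared NOT NULL, the NOT IN comparison is not affected by NULL values.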
3. Recommended approach
Given the above, the duplicates can be deleted with a single SQL statement as follows.
Query SQL:
SELECT a.*
FROM test a ,
(SELECT c1,c2,MAX(id)id FROM test GROUP BY c1,c2 HAVING COUNT(*)>1)b
WHERE a.c1=b.c1 AND a.c2=b.c2
AND a.id <>b.id
Delete SQL:
DELETE a
FROM test a ,
(SELECT c1,c2,MAX(id)id FROM test GROUP BY c1,c2 HAVING COUNT(*)>1)b
WHERE a.c1=b.c1 AND a.c2=b.c2
AND a.id <>b.id
Result:
Query: delete a FROM test a , (select c1,c2,max(id)id from test group by c1,c2 having count(*)>1)b where a.c1=b.c1 and a.c2=b.c2 and a....
7 rows affected in total
After the delete, 5 of the 12 rows remain, one for each (c1, c2) combination.
No duplicate data is left.
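Since the test table uses InnoDB, the delete can also be run inside a transaction and checked before it is made permanent. The following is a sketch under that assumption. It rewrites the recommended statement with an explicit INNER JOIN, and the alias max_id is an illustrative name, not taken from the original.

```sql
START TRANSACTION;

-- Same logic as the recommended delete, written with an explicit JOIN:
-- remove every row in a duplicate group except the one with the largest id.
DELETE a
FROM test AS a
INNER JOIN (
    SELECT c1, c2, MAX(id) AS max_id
    FROM test
    GROUP BY c1, c2
    HAVING COUNT(*) > 1
) AS b
    ON a.c1 = b.c1 AND a.c2 = b.c2
WHERE a.id <> b.max_id;

-- Verification: this should now return an empty set.
SELECT c1, c2, COUNT(*) AS cnt
FROM test
GROUP BY c1, c2
HAVING COUNT(*) > 1;

-- Keep the change if the check is empty; use ROLLBACK instead to undo it.
COMMIT;
```

One caveat: c1 and c2 are nullable, and the `=` comparison in the join never matches NULL, so rows with NULL in either column are grouped together by GROUP BY but are not removed by this delete. If such rows need deduplicating, MySQL's NULL-safe operator `<=>` could be used in the join condition.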