Removing duplicates across a multi-column combination in MySQL (and adding a composite unique index)

Background

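The MySQL table was not designed carefully enough at the start, so duplicate rows later appeared in it. Here "duplicate" means that several columns, taken together, hold the same values in more than one row. The goal is for each combination of those columns to identify exactly one row in the table, and the solution is to add a composite unique index over them.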

This article uses the three columns col1, col2, and col3 as the composite unique index. The table is named table_name.
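
To follow along, a small test table helps. The sketch below is only an illustration: the column types, the auto-increment id key, and the sample values are assumptions, since the article gives only the names table_name, col1, col2, col3, and id.

```sql
-- Hypothetical schema for illustration; types and sample data are assumptions.
CREATE TABLE table_name (
    id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    col1 VARCHAR(32) NOT NULL,
    col2 VARCHAR(32) NOT NULL,
    col3 VARCHAR(32) NOT NULL
);

INSERT INTO table_name (col1, col2, col3) VALUES
    ('a', 'b', 'c'),
    ('a', 'b', 'c'),   -- same (col1, col2, col3) as the row above
    ('a', 'b', 'd'),   -- differs in col3, so not a duplicate
    ('x', 'y', 'z'),
    ('x', 'y', 'z');   -- same (col1, col2, col3) as the row above
```

With this data, two combinations appear twice each, which matches the situation the article describes.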

If you run the following at this point:

alter table table_name add unique `uk_index` (col1, col2, col3);

MySQL reports a duplicate-entry error.
The reason is that the table already contains duplicate rows.

How do we remove the duplicates?
1. First, find the duplicated combinations:
select 
    col1, col2, col3 
from
    table_name
group by
    col1, col2, col3
having
    count(*) > 1
2. Join table_name to the result of the query above on col1, col2, and col3:
select 
    t1.*
from 
    table_name t1 
join 
    (select col1, col2, col3 from table_name group by col1, col2, col3 having count(*) > 1) t2 
on 
    t1.col1 = t2.col1 and 
    t1.col2 = t2.col2 and
    t1.col3 = t2.col3

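This selects every duplicated record. Assuming each duplicated combination occurs exactly twice, one row of each pair has to be deleted. Here group by is used to pick one ID from each group of duplicates.
3. Get the IDs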

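-- min() picks exactly one id per duplicated group; the group by columns are
-- qualified with t1 because col1, col2, col3 exist in both t1 and t2.
select 
    min(t1.id) as id
from 
    table_name t1 
join 
    (select col1, col2, col3 from table_name group by col1, col2, col3 having count(*) > 1) t2 
on 
    t1.col1 = t2.col1 and 
    t1.col2 = t2.col2 and 
    t1.col3 = t2.col3
group by
    t1.col1, t1.col2, t1.col3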

This yields the IDs that need to be deleted. How do we use them to delete the rows?
4. Delete
Use delete from ... where id in (...) to delete them.

delete from 
    table_name
where 
    id 
in ( 
    select 
        min(t1.id)
    from 
        table_name t1 
    join 
        (select col1, col2, col3 from table_name group by col1, col2, col3 having count(*) > 1) t2 
    on 
        t1.col1 = t2.col1 and 
        t1.col2 = t2.col2 and 
        t1.col3 = t2.col3
    group by
        t1.col1, t1.col2, t1.col3
    )

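This statement fails, because MySQL does not allow a statement to modify a table and read from the same table in a subquery (error 1093, "You can't specify target table ... for update in FROM clause"). We could do it in two steps, running the select and the delete separately. Instead, let's be lazy and use a somewhat hacky approach: wrap the subquery in one more select and give that result an alias.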

delete from 
    table_name
where 
    id 
in (
    select 
        p.id 
    from (
        select 
            min(t1.id) as id
        from 
            table_name t1 
        join 
            (select col1, col2, col3 from table_name group by col1, col2, col3 having count(*) > 1) t2 
        on 
            t1.col1 = t2.col1 and 
            t1.col2 = t2.col2 and 
            t1.col3 = t2.col3
        group by
            t1.col1, t1.col2, t1.col3
    ) as p
)
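
The two-step alternative mentioned earlier (a separate select, then a delete) could look roughly like the sketch below. It is not the article's method, just one possible variant: it assumes id is a unique, non-null key, that keeping the row with the smallest id in each group is acceptable, and it uses a hypothetical temporary table name, ids_to_keep. Because it keeps one row per combination and deletes the rest, it does not rely on each duplicated combination occurring exactly twice.

```sql
-- Step 1: record the one id to keep for every (col1, col2, col3) combination.
-- ids_to_keep is a hypothetical name; any unused temporary table name works.
CREATE TEMPORARY TABLE ids_to_keep AS
SELECT MIN(id) AS id
FROM table_name
GROUP BY col1, col2, col3;

-- Step 2: delete every row that was not chosen. The subquery reads a
-- different table, so the same-table restriction does not apply here.
DELETE FROM table_name
WHERE id NOT IN (SELECT id FROM ids_to_keep);

-- Step 3: clean up the temporary table.
DROP TEMPORARY TABLE ids_to_keep;
```

Running the step 1 select on its own first shows which rows would survive before anything is deleted. One caveat: group by puts NULL values into a single group, whereas a MySQL unique index permits several NULLs, so with nullable columns this sketch may remove more rows than the index strictly requires.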

Adding the composite unique index

alter table table_name add unique `uk_index` (col1, col2, col3);
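
As a quick sanity check, something like the following can confirm the result. This is only a sketch: the first query reuses the duplicate-finding query from step 1 with an added count column (the alias cnt is an assumption), and the second lists the table's indexes.

```sql
-- Should return no rows once the duplicates have been removed.
SELECT col1, col2, col3, COUNT(*) AS cnt
FROM table_name
GROUP BY col1, col2, col3
HAVING COUNT(*) > 1;

-- Lists the indexes on the table; uk_index should appear with Non_unique = 0.
SHOW INDEX FROM table_name;
```

Running the first query before the alter table statement is also a cheap way to confirm that the index can be created without a duplicate-entry error.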