1. 程式人生 > >Oracle 刪除重復數據只留一條

Oracle 刪除重復數據只留一條

num bsd 結構 comment 字段 包含 log 一次 acl

查詢及刪除重復記錄的SQL語句 1、查找表中多余的重復記錄,重復記錄是根據單個字段(Id)來判斷 select * from 表 where Id in (select Id from 表 group byId having count(Id) > 1) 2、刪除表中多余的重復記錄,重復記錄是根據單個字段(Id)來判斷,只留有rowid最小的記錄 DELETE from 表 WHERE (id) IN ( SELECT id FROM 表 GROUP BY id HAVING COUNT(id) > 1) AND ROWID NOT IN (SELECT MIN(ROWID) FROM 表 GROUP BY id HAVING COUNT(*) > 1); 3、查找表中多余的重復記錄(多個字段) select * from 表 a where (a.Id,a.seq) in(select Id,seq from 表 group by Id,seq having count(*) > 1) 4、刪除表中多余的重復記錄(多個字段),只留有rowid最小的記錄 delete from 表 a where (a.Id,a.seq) in (select Id,seq from 表 group by Id,seq having count(*) > 1) and rowid not in (select min(rowid) from 表 group by Id,seq having count(*)>1) 5、查找表中多余的重復記錄(多個字段),不包含rowid最小的記錄 select * from 表 a where (a.Id,a.seq) in (select Id,seq from 表 group by Id,seq having count(*) > 1) and rowid not in (select min(rowid) from 表 group by Id,seq having count(*)>1)

一:重復數據根據單個字段進行判斷

1、首先,查詢表中多余的數據,由關鍵字段(name)來查詢。

select * from OA_ADDRESS_BOOK where name in (select name from OA_ADDRESS_BOOK group by name having count(name)>1)

2、刪除表中重復數據,重復數據是根據單個字段(Name)來判斷,只留有rowid最小的記錄

delete from OA_ADDRESS_BOOK where (Name) in

(select Name from OA_ADDRESS_BOOK group by Name having count(Name) >1)

and rowid not in (select min(rowid) from OA_ADDRESS_BOOK group by Name having count(Name)>1)

二:重復數據根據多個字段進行判斷

1、首先,查詢表中重復數據,由關鍵字段(Name,UNIT_ID)來查詢。

select * from OA_ADDRESS_BOOK book1 where (book1.name,book1.unit_id) in
(select book2.name,book2.unit_id from OA_ADDRESS_BOOK book2 group by book2.name,book2.unit_id having count(*)>1)

2、刪除表中重復數據,重復數據是根據多個字段(Name,UNIT_ID)來判斷,只留有rowid最小的記錄

delete from OA_ADDRESS_BOOK a where (a.Name,a.UNIT_ID) in
(select Name,UNIT_ID from OA_ADDRESS_BOOK group by Name,UNIT_ID having count(*) > 1)
and rowid not in (select min(rowid) from OA_ADDRESS_BOOK group by Name,UNIT_ID having count(*)>1)
3、查詢表中重復數據,重復數據是根據多個字段(Name,UNIT_ID)來判斷,不包含rowid最小的記錄 select name,unit_id from OA_ADDRESS_BOOK a where (a.Name,a.UNIT_ID) in
(select Name,UNIT_ID from OA_ADDRESS_BOOK group by Name,UNIT_ID having count(*) > 1)
and rowid not in (select min(rowid) from OA_ADDRESS_BOOK group by Name,UNIT_ID having count(*)>1)

1. 問題描述

BBSCOMMENT表為BBSDETAIL的從表,記錄商戶評價信息。因為數據倒騰來倒騰去的,有很多重復數據。表結構如下:

COMMENT_ID NOT NULL NUMBER  --主鍵
DETAIL_ID NOT NULL NUMBER  --外鍵,引用BBSDETAIL表
COMMENT_BODY NOT NULL VARCHAR2(500)  --評價內容

--其它字段忽略

其中主鍵是沒有重復的,重復的是DETAIL_ID+COMMENT_BODY+......等信息,就是某些商家的評價信息有重復。

2. 解決步驟

2.1 查找表中多余的重復記錄

技術分享
--查詢出所有有重復的數據
select DETAIL_ID,COMMENT_BODY,count(*)
from BBSCOMMENT
group by DETAIL_ID,COMMENT_BODY
having count(*)>1
order by DETAIL_ID, COMMENT_BODY; --1955條
技術分享

2.2 顯示了所有的非冗余的數據

--這一條命令顯示了所有的非冗余的數據
select min(COMMENT_ID) as COMMENT_ID,DETAIL_ID,COMMENT_BODY
from BBSCOMMENT
group by DETAIL_ID,COMMENT_BODY;   --21453條,之所以此值不等於表總記錄數-1955,是因為1955條記錄中,有的重復了不止一次。

2.3 如果記錄數量少(千級別),可以把上面的語句做成子查詢然後直接刪除

技術分享
--如果表數據量不是很大(1千條以內),可以把上面的語句做成子查詢然後直接刪除
delete from BBSCOMMENT where COMMENT_ID not in(
    select min(COMMENT_ID)
    from BBSCOMMENT
    group by DETAIL_ID,COMMENT_BODY
);          --782秒,在我這裏,2萬條記錄,重復記錄2千多(太慢了!!)
技術分享

2.4 另一種刪除方法

技術分享
--這條語句也能夠實現上述功能,但不好測試了,數據已經被我刪除了
--刪除條件一:有重復數據的記錄;條件二:保留最小rowid的記錄。
delete from BBSCOMMENT a
where
    (a.DETAIL_ID,a.COMMENT_BODY) in(select DETAIL_ID,COMMENT_BODY from BBSCOMMENT group by DETAIL_ID,COMMENT_BODY having count(*) > 1)
    and rowid not in (select min(rowid) from BBSCOMMENT group by DETAIL_ID,COMMENT_BODY having count(*)>1);
技術分享

2.5 大數據量還是用PL/SQL方便快捷

技術分享
declare
--定義存儲結構
type bbscomment_type is record
(
    comment_id BBSCOMMENT.COMMENT_ID%type,
    detail_id BBSCOMMENT.DETAIL_ID%type,
    comment_body BBSCOMMENT.COMMENT_BODY%type
);
bbscomment_record bbscomment_type;

--可供比較的變量
v_comment_id BBSCOMMENT.COMMENT_ID%type;
v_detail_id BBSCOMMENT.DETAIL_ID%type;
v_comment_body BBSCOMMENT.COMMENT_BODY%type;

--其它變量
v_batch_size integer := 5000;
v_counter integer := 0;

cursor cur_dupl is
    --取出所有有重復的記錄
    select COMMENT_ID, DETAIL_ID, COMMENT_BODY
    from BBSCOMMENT
    where(DETAIL_ID, COMMENT_BODY) in (
        --這些記錄有重復
        select DETAIL_ID, COMMENT_BODY
        from BBSCOMMENT
        group by DETAIL_ID, COMMENT_BODY
        having count(*) > 1)
    order by DETAIL_ID, COMMENT_BODY;
begin
    for bbscomment_record in cur_dupl loop
        if v_detail_id is null or (bbscomment_record.detail_id != v_detail_id or nvl(bbscomment_record.comment_body, ‘ ‘) != nvl(v_comment_body, ‘ ‘)) then
            --首次進入、換記錄了,都重新賦值
            v_detail_id := bbscomment_record.detail_id;
            v_comment_body := bbscomment_record.comment_body;
        else
            --其它記錄刪除
            delete from BBSCOMMENT where COMMENT_ID = bbscomment_record.comment_id;
            v_counter := v_counter + 1;

            if mod(v_counter, v_batch_size) = 0 then
                --每多少條提交一次
                commit;
            end if;
        end if;
    end loop;

    if v_counter > 0 then
        --最後一次提交
        commit;
    end if;

    dbms_output.put_line(to_char(v_counter)||‘條記錄被刪除!‘);
exception
    when others then
        dbms_output.put_line(‘sqlerrm-->‘ ||sqlerrm);
        rollback;
end;

Oracle 刪除重復數據只留一條