Removing duplicate rows in MySQL and keeping only one

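A crawler I wrote a while back ended up storing some duplicate rows, for various reasons, and they needed to be deleted. It turned out that MySQL has no built-in deduplication command, so you have to write the query yourself.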

After reading through a number of blog posts, I found it can be written like this.

Delete the duplicate rows in the people table, keeping the row with the smallest id within each group of duplicates.

delete from people
where peopleId in (select peopleId from people group by peopleId having count(peopleId) > 1)
and rowid not in (select min(rowid) from people group by peopleId having count(peopleId) > 1);


But running it against my own table, news, showed that MySQL does not support this form. The error message was:

You can't specify target table 'news' for update in FROM clause

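After looking it up, I found that the subquery result has to be passed through an intermediate table and selected from again. MySQL does not let a DELETE read its own target table directly in a subquery; wrapping that subquery in a derived table (aliased a and b here) makes MySQL evaluate it as a separate temporary result first.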

The revised query:

delete from news
where newsurl in (select newsurl from (select newsurl from news group by newsurl having count(newsurl) > 1) a)
and newsid not in (select newsid from (select min(newsid) as newsid from news group by newsurl having count(newsurl) > 1) b);

This ran successfully.
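
To confirm the cleanup worked, a quick check like the one below may help. It is only a sketch, assuming the same news table with newsurl and newsid columns used in this post, and assuming newsid is unique per row. The second statement is a different technique from the one above (a multi-table delete with a self-join, not the derived-table workaround); it is shown as a possible alternative, not as the query that was actually run here.

```sql
-- Check: list any newsurl values that still appear more than once.
-- An empty result means every newsurl now has a single row.
select newsurl, count(*) as copies
from news
group by newsurl
having count(*) > 1;

-- Alternative sketch (not the query used in this post):
-- a multi-table delete with a self-join.
-- For every pair of rows sharing a newsurl, the row with the larger
-- newsid is deleted, so the smallest newsid in each group survives.
-- Assumes newsid is unique per row.
delete n1
from news n1
join news n2
  on n1.newsurl = n2.newsurl
 and n1.newsid > n2.newsid;
```

Because the self-join form reads news through a join and not through a subquery, it should not trigger the "can't specify target table" error. It is worth trying either delete on a copy of the table first, since both remove rows permanently.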