Getting the Top N Rows per Group in MySQL

SELECT basics

When we write a SELECT statement, the general SQL form is as follows:

select *columns* from *tables*
    where *predicate*
    group by *columns*
    having *predicate*
    order by *columns*
    limit *offset*, *row_count*;

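Many people assume that a SELECT statement is evaluated in the order in which it is written. That assumption is wrong, and it is the cause of many SQL statements that do not behave as expected.

The logical order in which the clauses of a SQL statement are evaluated is: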

from *tables*
where *predicate*
group by *columns*
having *predicate*
select *columns*
order by *columns*
limit *offset*, *row_count*;
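
The difference between where (applied before grouping) and having (applied after grouping) can be checked with a small experiment. The sketch below is only an illustration: it assumes Python's built-in sqlite3 module and an in-memory copy of the student table that this article defines further down, and the thresholds in the query are made up for the demonstration.

```python
import sqlite3

# In-memory SQLite database with the article's sample student rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (Id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "Name TEXT, Score INTEGER, ClassId INTEGER)"
)
conn.executemany(
    "INSERT INTO student (Name, Score, ClassId) VALUES (?, ?, ?)",
    [("lqh", 60, 1), ("cs", 99, 1), ("wzy", 60, 1), ("zqc", 88, 2), ("bll", 100, 2)],
)

# WHERE runs before GROUP BY: the two 60-point rows are dropped first.
# HAVING runs after GROUP BY: class 1 now has a single row and is dropped.
rows = conn.execute(
    """
    SELECT ClassId, COUNT(*) AS n
    FROM student
    WHERE Score > 60
    GROUP BY ClassId
    HAVING COUNT(*) >= 2
    ORDER BY ClassId
    """
).fetchall()
print(rows)  # [(2, 2)]
```

Class 1 has three students, but only one of them survives the where filter, so the having condition removes the whole group.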

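Here is an example of a common mistake when group by and order by are used together.

Create a student table (the statement uses SQLite-style syntax; MySQL spells the keyword auto_increment):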

create table student (Id integer primary key autoincrement, Name text, Score integer, ClassId integer);

Insert five rows of sample data:

insert into student(Name, Score, ClassId) values('lqh', 60, 1);
insert into student(Name, Score, ClassId) values('cs', 99, 1);
insert into student(Name, Score, ClassId) values('wzy', 60, 1);
insert into student(Name, Score, ClassId) values('zqc', 88, 2);
insert into student(Name, Score, ClassId) values('bll', 100, 2);

The table now holds the following data:

Id Name Score ClassId
1 lqh 60 1
2 cs 99 1
3 wzy 60 1
4 zqc 88 2
5 bll 100 2

We want to find the student with the highest score in each group (class).

Many SQL beginners would write something like this:

select * from student group by ClassId order by Score;

Result:

Id Name Score ClassId
3 wzy 60 1
5 bll 100 2

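This is clearly not what we want. Walking through the evaluation order above shows why.

Reason: group by is evaluated before order by, so order by only sorts the rows that remain after grouping. What we need is the opposite: the rows should be ordered before they are grouped.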
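
That order by only sees the grouped rows can be observed directly. The following sketch assumes Python's built-in sqlite3 module and the sample student rows from above; the alias TopScore is introduced here purely for illustration.

```python
import sqlite3

# In-memory SQLite database with the article's sample student rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (Id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "Name TEXT, Score INTEGER, ClassId INTEGER)"
)
conn.executemany(
    "INSERT INTO student (Name, Score, ClassId) VALUES (?, ?, ?)",
    [("lqh", 60, 1), ("cs", 99, 1), ("wzy", 60, 1), ("zqc", 88, 2), ("bll", 100, 2)],
)

# GROUP BY first collapses the five rows into one row per class;
# ORDER BY then sorts those two group rows, not the original five.
rows = conn.execute(
    """
    SELECT ClassId, MAX(Score) AS TopScore
    FROM student
    GROUP BY ClassId
    ORDER BY TopScore DESC
    """
).fetchall()
print(rows)  # [(2, 100), (1, 99)]
```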

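A query that sorts the rows first and then groups them (MySQL requires the derived table to have an alias, here t):

select * from (select * from student order by Score) as t group by ClassId;

Note that this relies on the engine returning the last row of each group for the non-aggregated columns, which is not guaranteed. MySQL with ONLY_FULL_GROUP_BY enabled (the default since 5.7.5) rejects the query.

Result: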

Id Name Score ClassId
2 cs 99 1
5 bll 100 2
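
Which row a database returns for the non-aggregated columns of a grouped query is not guaranteed, so a variant that does not depend on it may be preferable. The sketch below is one possible alternative, not the method used above: it computes the maximum score per class in a derived table and joins back to student. It assumes Python's built-in sqlite3 module, and the alias names s, m and MaxScore are made up for the example.

```python
import sqlite3

# In-memory SQLite database with the article's sample student rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (Id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "Name TEXT, Score INTEGER, ClassId INTEGER)"
)
conn.executemany(
    "INSERT INTO student (Name, Score, ClassId) VALUES (?, ?, ?)",
    [("lqh", 60, 1), ("cs", 99, 1), ("wzy", 60, 1), ("zqc", 88, 2), ("bll", 100, 2)],
)

# Step 1 (derived table m): the highest score in each class.
# Step 2 (join): keep the students whose score equals that maximum.
rows = conn.execute(
    """
    SELECT s.Id, s.Name, s.Score, s.ClassId
    FROM student AS s
    JOIN (
        SELECT ClassId, MAX(Score) AS MaxScore
        FROM student
        GROUP BY ClassId
    ) AS m
      ON m.ClassId = s.ClassId AND m.MaxScore = s.Score
    ORDER BY s.ClassId, s.Id
    """
).fetchall()
print(rows)  # [(2, 'cs', 99, 1), (5, 'bll', 100, 2)]
```

If two students in a class tie for the highest score, this variant returns both of them.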

Getting the top N rows in each group

We use a LeetCode database problem rated Hard as the example.

Problem statement

The Employee table holds all employees. Every employee has an Id, and there is also a column for the department Id.

Id Name Salary DepartmentId
1 Joe 70000 1
2 Henry 80000 2
3 Sam 60000 2
4 Max 90000 1
5 Janet 69000 1
6 Randy 85000 1

The Department table holds all departments of the company.

Id Name
1 IT
2 Sales

Write a SQL query to find employees who earn the top three salaries in each of the department. For the above tables, your SQL query should return the following rows.

Department Employee Salary
IT Max 90000
IT Randy 85000
IT Joe 70000
Sales Henry 80000
Sales Sam 60000

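In other words: find the employees who earn the three highest salaries in each department. (Note: within a department, several employees can share the same rank when their salaries are equal.)

Approach

  1. First, get the employees who earn the top three salaries in each department: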
select * from Employee as e
    where (select count(distinct(e1.salary)) from Employee as e1 where  e1.DepartmentId = e.DepartmentId and e1.salary > e.salary) < 3;

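The subquery in the where clause is evaluated for every row of Employee: it counts the distinct salaries in the same department that are higher than the current employee's salary. The employee is among the top three only when fewer than 3 such salaries exist.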
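
The same idea generalizes from 3 to any N by turning the constant into a parameter. The sketch below assumes Python's built-in sqlite3 module and the Employee rows from the problem statement; the function name top_n_per_department and the final order by are additions for this illustration only.

```python
import sqlite3

# In-memory SQLite database with the Employee rows from the problem.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee (Id INTEGER PRIMARY KEY, Name TEXT, "
    "Salary INTEGER, DepartmentId INTEGER)"
)
conn.executemany(
    "INSERT INTO Employee (Id, Name, Salary, DepartmentId) VALUES (?, ?, ?, ?)",
    [
        (1, "Joe", 70000, 1),
        (2, "Henry", 80000, 2),
        (3, "Sam", 60000, 2),
        (4, "Max", 90000, 1),
        (5, "Janet", 69000, 1),
        (6, "Randy", 85000, 1),
    ],
)


def top_n_per_department(conn, n):
    """Return employees whose salary is among the n highest distinct
    salaries of their department (hypothetical helper)."""
    query = """
        SELECT e.Name, e.Salary, e.DepartmentId
        FROM Employee AS e
        WHERE (
            -- distinct salaries in the same department above this one
            SELECT COUNT(DISTINCT e1.Salary)
            FROM Employee AS e1
            WHERE e1.DepartmentId = e.DepartmentId
              AND e1.Salary > e.Salary
        ) < ?
        ORDER BY e.DepartmentId, e.Salary DESC
    """
    return conn.execute(query, (n,)).fetchall()


print(top_n_per_department(conn, 3))
# [('Max', 90000, 1), ('Randy', 85000, 1), ('Joe', 70000, 1),
#  ('Henry', 80000, 2), ('Sam', 60000, 2)]
print(top_n_per_department(conn, 1))
# [('Max', 90000, 1), ('Henry', 80000, 2)]
```

Janet is excluded for N = 3 because three distinct salaries in her department (90000, 85000, 70000) are higher than hers.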

  2. Building on the first step, we only need to alias the tables with as, inner join with the Department table, and take care of the ordering:
select d.Name as Department, e.Name as Employee, e.Salary as Salary
    from Employee as e inner join Department as d
    on e.DepartmentId = d.Id
    where (select count(distinct(e1.Salary)) from Employee as e1 where e1.DepartmentId = e.DepartmentId and e1.Salary > e.Salary) < 3
    order by e.Salary desc;
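
To check the final query against the expected output of the problem, it can be run on the sample data. The sketch below assumes Python's built-in sqlite3 module. One deliberate difference from the query above, made only for this illustration: the rows are ordered by department name first and then by salary, so they come out in the same order as the expected table.

```python
import sqlite3

# In-memory SQLite database with both tables from the problem statement.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Department (Id INTEGER PRIMARY KEY, Name TEXT)")
conn.execute(
    "CREATE TABLE Employee (Id INTEGER PRIMARY KEY, Name TEXT, "
    "Salary INTEGER, DepartmentId INTEGER)"
)
conn.executemany(
    "INSERT INTO Department (Id, Name) VALUES (?, ?)",
    [(1, "IT"), (2, "Sales")],
)
conn.executemany(
    "INSERT INTO Employee (Id, Name, Salary, DepartmentId) VALUES (?, ?, ?, ?)",
    [
        (1, "Joe", 70000, 1),
        (2, "Henry", 80000, 2),
        (3, "Sam", 60000, 2),
        (4, "Max", 90000, 1),
        (5, "Janet", 69000, 1),
        (6, "Randy", 85000, 1),
    ],
)

# Same filter as in the article; only the ORDER BY differs (see lead-in).
rows = conn.execute(
    """
    SELECT d.Name AS Department, e.Name AS Employee, e.Salary AS Salary
    FROM Employee AS e
    INNER JOIN Department AS d ON e.DepartmentId = d.Id
    WHERE (
        SELECT COUNT(DISTINCT e1.Salary)
        FROM Employee AS e1
        WHERE e1.DepartmentId = e.DepartmentId
          AND e1.Salary > e.Salary
    ) < 3
    ORDER BY d.Name, e.Salary DESC
    """
).fetchall()

for row in rows:
    print(row)
# ('IT', 'Max', 90000)
# ('IT', 'Randy', 85000)
# ('IT', 'Joe', 70000)
# ('Sales', 'Henry', 80000)
# ('Sales', 'Sam', 60000)
```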