Getting the Top N Records per Group in MySQL
SELECT basics
When writing a select statement, the general SQL form is:
select *columns* from *tables*
where *predicate*
group by *columns*
having *predicate*
order by *columns*
limit *offset*, *row_count*;
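To make the two limit arguments concrete, here is a minimal sketch that runs the two-argument form through Python's built-in sqlite3 module (SQLite also accepts this MySQL-style syntax). The table `nums` and its contents are made up for illustration and are not part of the original text.

```python
import sqlite3

# Hypothetical table holding the values 1..5, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("create table nums (n integer)")
conn.executemany("insert into nums (n) values (?)", [(i,) for i in range(1, 6)])

# "limit 1, 2" means: skip 1 row, then return at most 2 rows.
page = [row[0] for row in conn.execute("select n from nums order by n limit 1, 2")]
print(page)  # [2, 3]
```

The first number is therefore how many rows to skip, and the second is how many rows to return.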
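Many people assume that a select statement is executed in the order in which it is written. That assumption is wrong, and it is behind many queries that return unexpected results.
The logical order in which the clauses of a SQL statement are evaluated is: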
from *tables*
where *predicate*
group by *columns*
having *predicate*
select *columns*
order by *columns*
limit *offset*, *row_count*;
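This order can be observed on a small example. The sketch below is an illustration only: it uses Python's built-in sqlite3 module and a made-up `sales` table, neither of which comes from the original text. where removes rows before grouping, having removes whole groups afterwards, and the alias defined in select is already available to order by.

```python
import sqlite3

# Hypothetical data: (Region, Amount) pairs invented for this example.
conn = sqlite3.connect(":memory:")
conn.execute("create table sales (Region text, Amount integer)")
conn.executemany(
    "insert into sales (Region, Amount) values (?, ?)",
    [("north", 10), ("north", 50), ("south", 40), ("south", 60), ("west", 5)],
)

# The numbers in the comments give the logical evaluation order.
query = """
select Region, sum(Amount) as Total   -- 5. computed for each surviving group
from sales                            -- 1. source rows
where Amount >= 10                    -- 2. drops the 'west' row before grouping
group by Region                       -- 3. one group per remaining region
having sum(Amount) > 60               -- 4. drops 'north' (10 + 50 = 60)
order by Total desc                   -- 6. may refer to the select alias
limit 0, 1                            -- 7. applied last
"""
result = conn.execute(query).fetchall()
print(result)  # [('south', 100)]
```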
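As an example, here is a common mistake when group by and order by are used together.
Create a student table (the autoincrement keyword below is SQLite syntax; MySQL spells it auto_increment):
create table student (Id integer primary key autoincrement, Name text, Score integer, ClassId integer);
Insert five rows of sample data: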
insert into student(Name, Score, ClassId) values('lqh', 60, 1);
insert into student(Name, Score, ClassId) values('cs', 99, 1);
insert into student(Name, Score, ClassId) values('wzy', 60, 1);
insert into student(Name, Score, ClassId) values('zqc', 88, 2);
insert into student(Name, Score, ClassId) values('bll', 100, 2);
The table now contains:
Id | Name | Score | ClassId |
---|---|---|---|
1 | lqh | 60 | 1 |
2 | cs | 99 | 1 |
3 | wzy | 60 | 1 |
4 | zqc | 88 | 2 |
5 | bll | 100 | 2 |
We want to find the top-scoring student in each class.
Most SQL beginners would probably write something like this:
select * from student group by ClassId order by Score;
Result:
Id | Name | Score | ClassId |
---|---|---|---|
3 | wzy | 60 | 1 |
5 | bll | 100 | 2 |
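This is clearly not what we want, and the execution order above explains why.
Reason: group by runs before order by, so order by only sorts the rows that remain after grouping, whereas we need the grouping to operate on rows that have already been sorted.
A query that sorts first and groups afterwards (note that it relies on which row the engine keeps for the non-aggregated columns of each group, which standard SQL leaves undefined; MySQL with ONLY_FULL_GROUP_BY enabled rejects this query and the previous one):
select * from (select * from student order by Score) as t group by ClassId;
Result: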
Id | Name | Score | ClassId |
---|---|---|---|
2 | cs | 99 | 1 |
5 | bll | 100 | 2 |
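Because the result of the query above depends on which row the engine happens to keep for each group, a more portable alternative is to compare each row against the maximum of its group. The following is a minimal sketch, not the author's method: it loads the same five rows into an in-memory SQLite database through Python's sqlite3 module (an assumption made so the example is self-contained) and uses a correlated subquery with max().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table student (Id integer primary key autoincrement, Name text, Score integer, ClassId integer);
insert into student (Name, Score, ClassId) values ('lqh', 60, 1);
insert into student (Name, Score, ClassId) values ('cs', 99, 1);
insert into student (Name, Score, ClassId) values ('wzy', 60, 1);
insert into student (Name, Score, ClassId) values ('zqc', 88, 2);
insert into student (Name, Score, ClassId) values ('bll', 100, 2);
""")

# Keep a row only if its Score equals the highest Score in its own class.
top_per_class = conn.execute("""
select s.Id, s.Name, s.Score, s.ClassId
from student as s
where s.Score = (select max(s2.Score) from student as s2
                 where s2.ClassId = s.ClassId)
order by s.ClassId
""").fetchall()
print(top_per_class)  # [(2, 'cs', 99, 1), (5, 'bll', 100, 2)]
```

If two students tie for the highest score in a class, this form returns both of them, whereas the group by form keeps only one row per class.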
Getting the top N records in each group
This section uses a database problem rated hard on LeetCode as an example.
Problem statement
The Employee table holds all employees. Every employee has an Id, and there is also a column for the department Id.
Id | Name | Salary | DepartmentId |
---|---|---|---|
1 | Joe | 70000 | 1 |
2 | Henry | 80000 | 2 |
3 | Sam | 60000 | 2 |
4 | Max | 90000 | 1 |
5 | Janet | 69000 | 1 |
6 | Randy | 85000 | 1 |
The Department table holds all departments of the company.
Id | Name |
---|---|
1 | IT |
2 | Sales |
Write a SQL query to find employees who earn the top three salaries in each of the department. For the above tables, your SQL query should return the following rows.
Department | Employee | Salary |
---|---|---|
IT | Max | 90000 |
IT | Randy | 85000 |
IT | Joe | 70000 |
Sales | Henry | 80000 |
Sales | Sam | 60000 |
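In other words: find the employees with the three highest salaries in each department. (Note: several employees may share the same rank within a department when their salaries are equal.)
Approach
- First, get the employees with the top three salaries in each department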
select * from Employee as e
where (select count(distinct(e1.salary)) from Employee as e1 where e1.DepartmentId = e.DepartmentId and e1.salary > e.salary) < 3;
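The subquery in the where clause is evaluated for every row: it counts the distinct salaries in the same department that are strictly higher than the current employee's salary. Only when fewer than three distinct salaries are higher does the employee rank among the top three.
- With the first step in place, we only need to inner join the result with the Department table, use as to name the output columns, and take care of the ordering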
select d.Name as Department, e.Name as Employee, e.Salary as Salary
from Employee as e inner join Department as d
on e.DepartmentId = d.Id
where (select count(distinct(e1.Salary)) from Employee as e1 where e1.DepartmentId = e.DepartmentId and e1.Salary > e.Salary) < 3
order by d.Name, e.Salary desc;
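To check both steps against the sample data, the sketch below loads the Employee and Department tables into an in-memory SQLite database via Python's sqlite3 module (chosen here only to keep the example self-contained, not because the original uses it). It first computes, for every employee, how many distinct higher salaries exist in the same department, and then runs the final query, ordered by department name and then by salary so that the output is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table Department (Id integer primary key, Name text);
create table Employee (Id integer primary key, Name text, Salary integer, DepartmentId integer);
insert into Department values (1, 'IT'), (2, 'Sales');
insert into Employee values
  (1, 'Joe', 70000, 1), (2, 'Henry', 80000, 2), (3, 'Sam', 60000, 2),
  (4, 'Max', 90000, 1), (5, 'Janet', 69000, 1), (6, 'Randy', 85000, 1);
""")

# Step 1: per employee, count the distinct salaries in the same department
# that are strictly higher. A count below 3 means "in the top three".
higher = dict(conn.execute("""
select e.Name,
       (select count(distinct e1.Salary) from Employee as e1
        where e1.DepartmentId = e.DepartmentId and e1.Salary > e.Salary)
from Employee as e
"""))
print(higher)  # Janet has 3 higher salaries in IT, so she is excluded

# Step 2: the final query, joined with Department.
top3 = conn.execute("""
select d.Name as Department, e.Name as Employee, e.Salary as Salary
from Employee as e inner join Department as d on e.DepartmentId = d.Id
where (select count(distinct e1.Salary) from Employee as e1
       where e1.DepartmentId = e.DepartmentId and e1.Salary > e.Salary) < 3
order by d.Name, e.Salary desc
""").fetchall()
for row in top3:
    print(row)
```

Because the subquery counts distinct salaries, two employees with the same salary get the same count and are either both included or both excluded.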