如何在SQL中显示存在和不存在第n个最高组的值

108 阅读5分钟

在为准备面试而略读SQL时,我经常遇到这样的问题。通过连接一个包含员工信息的表和另一个包含部门信息的表,找到工资最高或(第二高)的员工。这又引发了一个问题。那么,在整个部门范围内找到工资排名第N位的员工呢?

现在我想提出一个更复杂的情况。当一个部门没有赚取第n个最高工资的员工时,会发生什么?例如,一个只有两名员工的部门将不会有员工获得第三高的工资。

下面是我解决这个问题的方法。

创建部门和雇员表

我创建了一个表,其中包括dept_iddept_name 等字段。

CREATE TABLE department (
    dept_id INT,
    dept_name VARCHAR(60)
);

现在我在新表中插入各个部门。

INSERT INTO department (dept_id,dept_name)
VALUES (780,'HR');
INSERT INTO department (dept_id,dept_name)
VALUES (781,'Marketing');
INSERT INTO department (dept_id,dept_name)
VALUES (782,'Sales');
INSERT INTO department (dept_id,dept_name)
VALUES (783,'Web Dev');

A table showing the data from the earlier code snippets with the columns "Department ID" and "Department Name"

接下来,我创建了另一个表,将字段first_name,last_name,dept_id, 和salary

CREATE TABLE employee (
    first_name VARCHAR(100),
    last_name VARCHAR(100),
    dept_id INT,
    salary INT
);

然后,我向该表插入数值。

INSERT INTO employee (first_name,last_name,dept_id,salary)
VALUES ('Sam','Burton',781,80000);
INSERT INTO employee (first_name,last_name,dept_id,salary)
VALUES ('Peter','Mellark',780,90000);
INSERT INTO employee (first_name,last_name,dept_id,salary)
VALUES ('Happy','Hogan',782,110000);
INSERT INTO employee (first_name,last_name,dept_id,salary)
VALUES ('Steve','Palmer',782,120000);
INSERT INTO employee (first_name,last_name,dept_id,salary)
VALUES ('Christopher','Walker',783,140000);
INSERT INTO employee (first_name,last_name,dept_id,salary)
VALUES ('Richard','Freeman',781,85000);
INSERT INTO employee (first_name,last_name,dept_id,salary)
VALUES ('Alex','Wilson',782,115000);
INSERT INTO employee (first_name,last_name,dept_id,salary)
VALUES ('Harry','Simmons',781,90000);
INSERT INTO employee (first_name,last_name,dept_id,salary)
VALUES ('Thomas','Henderson',780,95000);
INSERT INTO employee (first_name,last_name,dept_id,salary)
VALUES ('Ronald','Thompson',783,130000);
INSERT INTO employee (first_name,last_name,dept_id,salary)
VALUES ('James','Martin',783,135000);
INSERT INTO employee (first_name,last_name,dept_id,salary)
VALUES ('Laurent','Fisher',780,100000);
INSERT INTO employee (first_name,last_name,dept_id,salary)
VALUES ('Tom','Brooks',780,85000);
INSERT INTO employee (first_name,last_name,dept_id,salary)
VALUES ('Tom','Bennington',783,140000);

A table showing data from the earlier code snippets with first name, last name, dept ID, and salary columns, ordered by department ID number

我可以用这个表推断出每个部门的员工人数(部门ID:员工人数)。

  • 780:4
  • 781:3
  • 782:3
  • 783:4

如果我想查看不同部门中收入第二高的员工,以及他们的部门名称(使用DENSE_RANK ),表格将如下。

A table with department ID, department name, first name, last name, and salary columns, listing the second-highest-earning employee in each of four departments, ordered from lowest to highest salary

如果我应用同样的查询来寻找收入第四高的员工,输出结果将只有780部门(人力资源部)的汤姆-布鲁克斯,工资为85,000美元。

The table listing fourth-highest-earning employees lists only one employee.

虽然部门783(网络开发)有四名员工,但其中两名(詹姆斯-马丁和罗纳德-汤普森)将被归类为该部门收入第三高的员工,因为前两名收入者的工资相同。

寻找第n个最高的

现在,说说主要问题。如果我想显示dept_IDdept_name ,对于没有第n个最高收入的员工的部门,与员工有关的字段为空值,该怎么办?

The list of fourth-highest-earning employee by department, showing "null" in the first name, last name, and salary columns for departments that do not have a fourth-highest earner.

图5中显示的表格 ,是我在特定部门没有第N高收入员工时的目标。市场部、销售部和网络开发部被列出,但名称和工资字段包含一个空值。

帮助获得图5中的表格的最终查询如下。

SELECT * FROM (WITH null1 AS (SELECT A.dept_id, A.dept_name, A.first_name, A.last_name, A.salary
FROM (SELECT * FROM (
SELECT department.dept_id, department.dept_name, employee.first_name, employee.last_name,
employee.salary, DENSE_RANK() OVER (PARTITION BY employee.dept_id ORDER BY employee.salary DESC) AS Rank1
FROM employee INNER JOIN department
ON employee.dept_id=department.dept_id) AS k
WHERE rank1=4)A),
full1 AS (SELECT dept_id, dept_name FROM department WHERE dept_id NOT IN (SELECT dept_id FROM null1 WHERE dept_id IS NOT NULL)),
nulled AS(SELECT
CASE WHEN null1.dept_id IS NULL THEN full1.dept_id ELSE null1.dept_id END,
CASE WHEN null1.dept_name IS NULL THEN full1.dept_name ELSE null1.dept_name END,
first_name,last_name,salary
FROM null1 RIGHT JOIN full1 ON null1.dept_id=full1.dept_id)
SELECT * FROM null1
UNION
SELECT * FROM nulled
ORDER BY dept_id)
B;

对查询的分解

我将对该查询进行分解,以使其不至于让人不知所措。

使用DENSE_RANK() ,显示雇员和部门信息(不涉及没有第n个收入最高的成员的空值)。

SELECT * FROM (
  SELECT department.dept_id, department.dept_name, employee.first_name, employee.last_name,
   employee.salary, DENSE_RANK() OVER (PARTITION BY employee.dept_id ORDER BY employee.salary DESC) AS Rank1
   FROM employee INNER JOIN department
   ON employee.dept_id=department.dept_id) AS k
   WHERE rank1=4

输出

A table of the fourth-highest earners showing only the department with a fourth-highest earner

从图6的表格中排除rank1 列,该列只确定了一个工资第四高的员工,尽管在另一个部门有四个员工。

SELECT A.dept_id, A.dept_name, A.first_name, A.last_name, A.salary
    FROM (SELECT * FROM (
  SELECT department.dept_id, department.dept_name, employee.first_name, employee.last_name,
   employee.salary, DENSE_RANK() OVER (PARTITION BY employee.dept_id ORDER BY employee.salary DESC) AS Rank1
   FROM employee INNER JOIN department
   ON employee.dept_id=department.dept_id) AS k
   WHERE rank1=4)A

输出

The fourth-highest earner table (table six) without the rank 1 column

从部门表中指出没有第n个最高收入员工的部门。

SELECT * FROM (WITH null1 AS (SELECT A.dept_id, A.dept_name, A.first_name, A.last_name, A.salary
    FROM (SELECT * FROM (
  SELECT department.dept_id, department.dept_name, employee.first_name, employee.last_name,
   employee.salary, DENSE_RANK() OVER (PARTITION BY employee.dept_id ORDER BY employee.salary DESC) AS Rank1
   FROM employee INNER JOIN department
   ON employee.dept_id=department.dept_id) AS k
   WHERE rank1=4)A),
full1 AS (SELECT dept_id, dept_name FROM department WHERE dept_id NOT IN (SELECT dept_id FROM null1 WHERE dept_id IS NOT NULL))
SELECT * FROM full1)B

输出

The full1 table listing the departments without a fourth-highest earner by department ID and name: marketing, sales, web dev

null1 替换上述代码最后一行中的full1

SELECT * FROM (WITH null1 AS (SELECT A.dept_id, A.dept_name, A.first_name, A.last_name, A.salary
    FROM (SELECT * FROM (
  SELECT department.dept_id, department.dept_name, employee.first_name, employee.last_name,
   employee.salary, DENSE_RANK() OVER (PARTITION BY employee.dept_id ORDER BY employee.salary DESC) AS Rank1
   FROM employee INNER JOIN department
   ON employee.dept_id=department.dept_id) AS k
   WHERE rank1=4)A),
full1 AS (SELECT dept_id, dept_name FROM department WHERE dept_id NOT IN (SELECT dept_id FROM null1 WHERE dept_id IS NOT NULL))
SELECT * FROM null1)B

The null1 table listing all departments, with null values for those without a fourth-highest earner

列出所有部门的null1表,其中没有第四高收入者的部门为空值 (Mohammed Kamil Khan, CC BY-SA 4.0)

现在,我将图9中的dept_iddept_name 字段的空值填充为图8中的相应数值。

SELECT * FROM (WITH null1 AS (SELECT A.dept_id, A.dept_name, A.first_name, A.last_name, A.salary
    FROM (SELECT * FROM (
  SELECT department.dept_id, department.dept_name, employee.first_name, employee.last_name,
   employee.salary, DENSE_RANK() OVER (PARTITION BY employee.dept_id ORDER BY employee.salary DESC) AS Rank1
   FROM employee INNER JOIN department
   ON employee.dept_id=department.dept_id) AS k
   WHERE rank1=4)A),
full1 AS (SELECT dept_id, dept_name FROM department WHERE dept_id NOT IN (SELECT dept_id FROM null1 WHERE dept_id IS NOT NULL)),
nulled AS(SELECT
CASE WHEN null1.dept_id IS NULL THEN full1.dept_id ELSE null1.dept_id END,
CASE WHEN null1.dept_name IS NULL THEN full1.dept_name ELSE null1.dept_name END,
first_name,last_name,salary
FROM null1 RIGHT JOIN full1 ON null1.dept_id=full1.dept_id)
SELECT * FROM nulled) B;

The table with department id, department name, first name, last name, and salary columns, with null values in the name and salary columns

nulled查询在null1 表的dept_iddept_name 列中遇到的空值上使用CASE WHEN ,并将它们替换为full1 表中的相应值。现在我需要做的是将UNION 应用于图7和图10中得到的表。null1这可以通过使用WITH 来声明前面代码中的最后一个查询,然后用UNION-化它来完成。

SELECT * FROM (WITH null1 AS (SELECT A.dept_id, A.dept_name, A.first_name, A.last_name, A.salary
FROM (SELECT * FROM (
SELECT department.dept_id, department.dept_name, employee.first_name, employee.last_name,
employee.salary, DENSE_RANK() OVER (PARTITION BY employee.dept_id ORDER BY employee.salary DESC) AS Rank1
FROM employee INNER JOIN department
ON employee.dept_id=department.dept_id) AS k
WHERE rank1=4)A),
full1 AS (SELECT dept_id, dept_name FROM department WHERE dept_id NOT IN (SELECT dept_id FROM null1 WHERE dept_id IS NOT NULL)),
nulled AS(SELECT
CASE WHEN null1.dept_id IS NULL THEN full1.dept_id ELSE null1.dept_id END,
CASE WHEN null1.dept_name IS NULL THEN full1.dept_name ELSE null1.dept_name END,
first_name,last_name,salary
FROM null1 RIGHT JOIN full1 ON null1.dept_id=full1.dept_id)
SELECT * FROM null1
UNION
SELECT * FROM nulled
ORDER BY dept_id)
B;

The complete table: department ID, department name, first name, last name, salary columns. The first row contains the information of the one fourth-highest earner, and the next three columns show the remaining departments, with ID, and null value in the other three columns.

现在我可以从图11中推断出,市场部、销售部和网络开发部是没有任何员工获得第四高工资的部门。