LeetCode数据专题:176. 第二高的薪水

21 阅读2分钟

题目描述

Employee 表:

+-------------+------+
| Column Name | Type |
+-------------+------+
| id          | int  |
| salary      | int  |
+-------------+------+SQL 中,id 是这个表的主键。
表的每一行包含员工的工资信息。

查询并返回 Employee 表中第二高的薪水 。如果不存在第二高的薪水,查询应该返回 null(Pandas 则返回 None) 。

查询结果如下例所示。

示例 1:

输入:
Employee 表:
+----+--------+
| id | salary |
+----+--------+
| 1  | 100    |
| 2  | 200    |
| 3  | 300    |
+----+--------+
输出:
+---------------------+
| SecondHighestSalary |
+---------------------+
| 200                 |
+---------------------+

示例 2:

输入:
Employee 表:
+----+--------+
| id | salary |
+----+--------+
| 1  | 100    |
+----+--------+
输出:
+---------------------+
| SecondHighestSalary |
+---------------------+
| null                |
+---------------------+

解题思路

我们可以使用 MySQL 中的 LIMIT 子句和 DISTINCT 关键字来实现这个查询。首先,我们使用 DESC 关键字对薪水进行降序排序,然后使用 LIMIT 语句获取第一条记录之后的薪水,并使用 DISTINCT 来去除重复值。

代码实现

MySQL 实现

  • order by + 临时表
SELECT (
    SELECT DISTINCT
            Salary
    FROM
        Employee
    ORDER BY Salary DESC
    LIMIT 1 OFFSET 1) 
AS SecondHighestSalary
;
  • ifnull
SELECT
    IFNULL(
      (SELECT DISTINCT Salary
       FROM Employee
       ORDER BY Salary DESC
        LIMIT 1 OFFSET 1),
    NULL) AS SecondHighestSalary
  • 窗口函数
WITH CTE AS (SELECT 
    Salary, 
    RANK () OVER (ORDER BY Salary desc) AS RANK_desc
 FROM Employee)
SELECT MAX(salary) AS SecondHighestSalary
  FROM CTE
 WHERE RANK_desc = 2
  • join 小的里面的最大的一个。

SELECT MAX(b.val) FROM table a, table b WHERE a.val > b.val

SELECT
   MAX(a.Salary) as SecondHighestSalary
  FROM Employee a
  JOIN Employee b
    ON a.Salary < b.Salary

Pandas 实现

import pandas as pd

def second_highest_salary(employee: pd.DataFrame) -> pd.DataFrame:
    # 删除重复的薪水值,以避免将重复的薪水计算为不同的薪水等级
    unique_salaries = employee['salary'].drop_duplicates()

    # 对唯一的薪水按降序排序,并获取第二高的薪水
    second_highest = unique_salaries.nlargest(2).iloc[-1] if len(unique_salaries) >= 2 else None

    # 如果第二高的薪水不存在(例如,只有一个或没有唯一的薪水值),则返回 None
    if second_highest is None:
        return pd.DataFrame({'SecondHighestSalary': [None]})

    # 创建一个包含第二高薪水的 DataFrame
    result_df = pd.DataFrame({'SecondHighestSalary': [second_highest]})

    return result_df