1481_Easy_院系无效的学生
院系表: Departments
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| id | int |
| name | varchar |
+---------------+---------+
id 是该表的主键
该表包含一所大学每个院系的 id 信息
学生表: Students
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| id | int |
| name | varchar |
| department_id | int |
+---------------+---------+
id 是该表的主键
该表包含一所大学每个学生的 id 和他/她就读的院系信息
写一条 SQL 语句以查询那些所在院系不存在的学生的 id 和姓名
可以以任何顺序返回结果
下面是返回结果格式的例子
Departments 表:
+------+--------------------------+
| id | name |
+------+--------------------------+
| 1 | Electrical Engineering |
| 7 | Computer Engineering |
| 13 | Bussiness Administration |
+------+--------------------------+
Students 表:
+------+----------+---------------+
| id | name | department_id |
+------+----------+---------------+
| 23 | Alice | 1 |
| 1 | Bob | 7 |
| 5 | Jennifer | 13 |
| 2 | John | 14 |
| 4 | Jasmine | 77 |
| 3 | Steve | 74 |
| 6 | Luis | 1 |
| 8 | Jonathan | 7 |
| 7 | Daiana | 33 |
| 11 | Madelynn | 1 |
+------+----------+---------------+
结果表:
+------+----------+
| id | name |
+------+----------+
| 2 | John |
| 7 | Daiana |
| 4 | Jasmine |
| 3 | Steve |
+------+----------+
John, Daiana, Steve 和 Jasmine 所在的院系分别是 14, 33, 74 和 77, 其中 14, 33, 74 和 77 并不存在于院系表
建表+导数:
create table 1481_Departments(id int, name varchar(20)) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
load data inpath "/leetcode/1481_Departments" overwrite into table 1481_Departments;
create table 1481_Students(id int, name varchar(20), department_id int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
load data inpath "/leetcode/1481_Students" overwrite into table 1481_Students;
知识点:左外连接 <七种join方式>
select
s.id,
s.name
from
1481 _Students s
left join 1481 _Departments d on
s.department_id = d.id
where
d.id is null
1494_Medium_活动参与者
表: Friends
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| id | int |
| name | varchar |
| activity | varchar |
+---------------+---------+
id 是朋友的 id 和该表的主键
name 是朋友的名字
activity 是朋友参加的活动的名字
表: Activities
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| id | int |
| name | varchar |
+---------------+---------+
id 是该表的主键
name 是活动的名字
写一条 SQL 查询那些既没有最多,也没有最少参与者的活动的名字
可以以任何顺序返回结果,Activities 表的每项活动的参与者都来自 Friends 表
注意:名称相同 id 不同的参与者算作两个人
下面是查询结果格式的例子:
Friends 表:
+------+--------------+---------------+
| id | name | activity |
+------+--------------+---------------+
| 1 | Jonathan D. | Eating |
| 2 | Jade W. | Singing |
| 3 | Victor J. | Singing |
| 4 | Elvis Q. | Eating |
| 5 | Daniel A. | Eating |
| 6 | Bob B. | Horse Riding |
+------+--------------+---------------+
Activities 表:
+------+--------------+
| id | name |
+------+--------------+
| 1 | Eating |
| 2 | Singing |
| 3 | Horse Riding |
+------+--------------+
Result 表:
+--------------+
| activity |
+--------------+
| Singing |
+--------------+
Eating 活动有三个人参加, 是最多人参加的活动 (Jonathan D. , Elvis Q. and Daniel A.)
Horse Riding 活动有一个人参加, 是最少人参加的活动 (Bob B.)
Singing 活动有两个人参加 (Victor J. and Jade W.)
建表+导数:
create table 1494_Friends(id int, name varchar(20), activity varchar(20)) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
load data inpath "/leetcode/1494_Friends" overwrite into table 1494_Friends;
create table 1494_Activities(id int, name varchar(20)) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
load data inpath "/leetcode/1494_Activities" overwrite into table 1494_Activities;
知识点:rank排序函数,where 不等于 <>
select
tt.name
from
(
select
t.name,
rank() over(order by t.cnt) rk1,
rank() over(order by t.cnt desc) rk2
from
(
select
a.name,
count(1) cnt
from
1494 _Activities a
left join 1494 _Friends f
on a.name = f.activity
group by
a.name
) t
) tt
where
tt.rk1 <> 1
and tt.rk2 <> 1
1495_Medium_顾客的可信联系人数量
顾客表:Customers
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| customer_id | int |
| customer_name | varchar |
| email | varchar |
+---------------+---------+
customer_id 是这张表的主键。
此表的每一行包含了某在线商店顾客的姓名和电子邮件。
联系方式表:Contacts
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| user_id | id |
| contact_name | varchar |
| contact_email | varchar |
+---------------+---------+
(user_id, contact_email) 是这张表的主键。
此表的每一行表示编号为 user_id 的顾客的某位联系人的姓名和电子邮件。
此表包含每位顾客的联系人信息,但顾客的联系人不一定存在于顾客表中。
发票表:Invoices
+--------------+---------+
| Column Name | Type |
+--------------+---------+
| invoice_id | int |
| price | int |
| user_id | int |
+--------------+---------+
invoice_id 是这张表的主键。
此表的每一行分别表示编号为 user_id 的顾客拥有有一张编号为 invoice_id、价格为 price 的发票。
为每张发票 invoice_id 编写一个SQL查询以查找以下内容:
customer_name:与发票相关的顾客名称。price:发票的价格。contacts_cnt:该顾客的联系人数量。trusted_contacts_cnt:可信联系人的数量:既是该顾客的联系人又是商店顾客的联系人数量(即:可信联系人的电子邮件存在于客户表中)。
将查询的结果按照 invoice_id 排序。
查询结果的格式如下例所示:
Customers table:
+-------------+---------------+--------------------+
| customer_id | customer_name | email |
+-------------+---------------+--------------------+
| 1 | Alice | alice@leetcode.com |
| 2 | Bob | bob@leetcode.com |
| 13 | John | john@leetcode.com |
| 6 | Alex | alex@leetcode.com |
+-------------+---------------+--------------------+
Contacts table:
+-------------+--------------+--------------------+
| user_id | contact_name | contact_email |
+-------------+--------------+--------------------+
| 1 | Bob | bob@leetcode.com |
| 1 | John | john@leetcode.com |
| 1 | Jal | jal@leetcode.com |
| 2 | Omar | omar@leetcode.com |
| 2 | Meir | meir@leetcode.com |
| 6 | Alice | alice@leetcode.com |
+-------------+--------------+--------------------+
Invoices table:
+------------+-------+---------+
| invoice_id | price | user_id |
+------------+-------+---------+
| 77 | 100 | 1 |
| 88 | 200 | 1 |
| 99 | 300 | 2 |
| 66 | 400 | 2 |
| 55 | 500 | 13 |
| 44 | 60 | 6 |
+------------+-------+---------+
Result table:
+------------+---------------+-------+--------------+----------------------+
| invoice_id | customer_name | price | contacts_cnt | trusted_contacts_cnt |
+------------+---------------+-------+--------------+----------------------+
| 44 | Alex | 60 | 1 | 1 |
| 55 | John | 500 | 0 | 0 |
| 66 | Bob | 400 | 2 | 0 |
| 77 | Alice | 100 | 3 | 2 |
| 88 | Alice | 200 | 3 | 2 |
| 99 | Bob | 300 | 2 | 0 |
+------------+---------------+-------+--------------+----------------------+
Alice 有三位联系人,其中两位(Bob 和 John)是可信联系人。
Bob 有两位联系人, 他们中的任何一位都不是可信联系人。
Alex 只有一位联系人(Alice),并是一位可信联系人。
John 没有任何联系人。
建表+导数
create table 1495_Customers(customer_id int, customer_name varchar(20), email varchar(20)) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
load data inpath "/leetcode/1495_Customers" overwrite into table 1495_Customers;
select * from 1495_Customers;
create table 1495_Contacts(user_id int, contact_name varchar(20), contact_email varchar(20)) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
load data inpath "/leetcode/1495_Contacts" overwrite into table 1495_Contacts;
select * from 1495_Contacts;
create table 1495_Invoices(invoice_id int, price int, user_id int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
load data inpath "/leetcode/1495_Invoices" overwrite into table 1495_Invoices;
select * from 1495_Invoices;
知识点:三表join,view视图创建,
创建视图,增强可读性
create
view tmp as select
inv.invoice_id,
inv.price,
cus.customer_name,
con.contact_name,
con.user_id
from
1495_Invoices inv
left join 1495 _Customers cus on
inv.user_id = cus.customer_id
left join 1495_Contacts con on
inv.user_id = con.user_id
tmp视图输出:
最终执行sql
select
tmp.invoice_id,
tmp.customer_name,
tmp.price,
sum(case when cus.customer_name is null then 0 else 1 end) contacts_cnt,
sum(case when cus.customer_id is null then 0 else 1 end) trusted_contacts_cnt
from
tmp
left join 1495_Customers cus on
tmp.user_id = cus.customer_id
group by
tmp.invoice_id,
tmp.customer_name,
tmp.price
最终结果:
1504_Hard_获取最近第二次的活动
表: UserActivity
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| username | varchar |
| activity | varchar |
| startDate | Date |
| endDate | Date |
+---------------+---------+
该表不包含主键
该表包含每个用户在一段时间内进行的活动的信息
名为 username 的用户在 startDate 到 endDate 日内有一次活动
写一条SQL查询展示每一位用户 最近第二次 的活动
如果用户仅有一次活动,返回该活动
一个用户不能同时进行超过一项活动,以 任意 顺序返回结果
下面是查询结果格式的例子:
UserActivity 表:
+------------+--------------+-------------+-------------+
| username | activity | startDate | endDate |
+------------+--------------+-------------+-------------+
| Alice | Travel | 2020-02-12 | 2020-02-20 |
| Alice | Dancing | 2020-02-21 | 2020-02-23 |
| Alice | Travel | 2020-02-24 | 2020-02-28 |
| Bob | Travel | 2020-02-11 | 2020-02-18 |
+------------+--------------+-------------+-------------+
Result 表:
+------------+--------------+-------------+-------------+
| username | activity | startDate | endDate |
+------------+--------------+-------------+-------------+
| Alice | Dancing | 2020-02-21 | 2020-02-23 |
| Bob | Travel | 2020-02-11 | 2020-02-18 |
+------------+--------------+-------------+-------------+
Alice 最近一次的活动是从 2020-02-24 到 2020-02-28 的旅行, 在此之前的 2020-02-21 到 2020-02-23 她进行了舞蹈
Bob 只有一条记录,我们就取这条记录
建表+导数
create table 1504_UserActivity(username varchar(20), activity varchar(20), startDate Date, endDate Date) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
load data inpath "/leetcode/1504_UserActivity" overwrite into table 1504_UserActivity;
select * from 1504_UserActivity;
知识点: LEAD(col,1,col)开窗函数+rank函数一套带走。
LEAD(col,1,col) => <在窗口内对的col的每行取下一行数据,如果取不到默认当前行>
-LEAD(参数1,[参数2],[参数3])
- 参数1:取值的行。
-参数2:向后取几行。
-参数3:缺省参数,默认为null
-LAG(参数1,[参数2],[参数3])
-参数1:取值的行。
-参数2:向前取几行。
-参数3:缺省参数,默认为null
select
u.username_sce username,
u.activity_sce activity,
u.startDate_sce startDate,
u.endDate_sce endDate
from
(
select
LEAD(username,1,username) over(partition by username order bystartDate) username_sce,
LEAD(activity,1,activity) over(partition by username order by startDate) activity_sce,
LEAD(startDate, 1,startDate) over(partition by username order by startDate) startDate_sce,
LEAD(endDate,1, endDate) over(partition by username order by startDate) endDate_sce,
rank() over(partition by username order by startDate) rk
from
1504 _UserActivity
) u
where
rk = 1
执行结果:
1509_Easy_使用唯一标识码替换员工ID
Employees 表:
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| id | int |
| name | varchar |
+---------------+---------+
id 是这张表的主键。
这张表的每一行分别代表了某公司其中一位员工的名字和 ID 。
EmployeeUNI 表:
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| id | int |
| unique_id | int |
+---------------+---------+
(id, unique_id) 是这张表的主键。
这张表的每一行包含了该公司某位员工的 ID 和他的唯一标识码(unique ID)。
写一段SQL查询来展示每位用户的 唯一标识码(unique ID ) ;如果某位员工没有唯一标识码,使用 null 填充即可。
你可以以 任意 顺序返回结果表。
查询结果的格式如下例所示:
Employees table:
+----+----------+
| id | name |
+----+----------+
| 1 | Alice |
| 7 | Bob |
| 11 | Meir |
| 90 | Winston |
| 3 | Jonathan |
+----+----------+
EmployeeUNI table:
+----+-----------+
| id | unique_id |
+----+-----------+
| 3 | 1 |
| 11 | 2 |
| 90 | 3 |
+----+-----------+
EmployeeUNI table:
+-----------+----------+
| unique_id | name |
+-----------+----------+
| null | Alice |
| null | Bob |
| 2 | Meir |
| 3 | Winston |
| 1 | Jonathan |
+-----------+----------+
Alice and Bob 没有唯一标识码, 因此我们使用 null 替代。
Meir 的唯一标识码是 2 。
Winston 的唯一标识码是 3 。
Jonathan 唯一标识码是 1 。
建表+导数
create table 1509_Employees(id int, name varchar(20), startDate Date, endDate Date) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
load data inpath "/leetcode/1509_Employees" overwrite into table 1509_Employees;
select * from 1509_Employees;
create table 1509_EmployeeUNI(id int, unique_id int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
load data inpath "/leetcode/1509_EmployeeUNI" overwrite into table 1509_EmployeeUNI;
select * from 1509_EmployeeUNI;
知识点:coalesce(参数a,参数b...) 返回第一个不为空的值
select
coalesce(euni.unique_id,'null') unique_id,
ee.name
from
1509_Employees ee
left join 1509_EmployeeUNI euni on
ee.id = euni.id
执行结果:
1518_Hard_按年度列出销售总额
Product表:
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| product_id | int |
| product_name | varchar |
+---------------+---------+
product_id 是这张表的主键。
product_name 是产品的名称。
Sales表:
+---------------------+---------+
| Column Name | Type |
+---------------------+---------+
| product_id | int |
| period_start | date |
| period_end | date |
| average_daily_sales | int |
+---------------------+---------+
product_id 是这张表的主键。
period_start和 period_end是该产品销售期的起始日期和结束日期,且这两个日期包含在销售期内。
average_daily_sales 列存储销售期内该产品的日平均销售额。
编写一段 SQL 查询每个产品每年的总销售额,并包含 product_id, product_name 以及 report_year 等信息。
销售年份的日期介于 2018 年到 2020 年之间。你返回的结果需要按product_id 和 report_year 排序。
查询结果格式如下例所示:
Product table:
+------------+--------------+
| product_id | product_name |
+------------+--------------+
| 1 | LC Phone |
| 2 | LC T-Shirt |
| 3 | LC Keychain |
+------------+--------------+
Sales table:
+------------+--------------+-------------+---------------------+
| product_id | period_start | period_end | average_daily_sales |
+------------+--------------+-------------+---------------------+
| 1 | 2019-01-25 | 2019-02-28 | 100 |
| 2 | 2018-12-01 | 2020-01-01 | 10 |
| 3 | 2019-12-01 | 2020-01-31 | 1 |
+------------+--------------+-------------+---------------------+
Result table:
+------------+--------------+-------------+--------------+
| product_id | product_name | report_year | total_amount |
+------------+--------------+-------------+--------------+
| 1 | LC Phone | 2019 | 3500 |
| 2 | LC T-Shirt | 2018 | 310 |
| 2 | LC T-Shirt | 2019 | 3650 |
| 2 | LC T-Shirt | 2020 | 10 |
| 3 | LC Keychain | 2019 | 31 |
| 3 | LC Keychain | 2020 | 31 |
+------------+--------------+-------------+--------------+
LC Phone 在 2019-01-25 至 2019-02-28 期间销售,该产品销售时间总计35天。销售总额 35*100 = 3500。
LC T-shirt 在 2018-12-01至 2020-01-01 期间销售,该产品在2018年、2019年、2020年的销售时间分别是31天、365天、1天,2018年、2019年、2020年的销售总额分别是31*10=310、365*10=3650、1*10=10。
LC Keychain 在 2019-12-01至 2020-01-31 期间销售,该产品在2019年、2020年的销售时间分别是:31天、31天,2019年、2020年的销售总额分别是31*1=31、31*1=31。
建表+导数
create table 1518_Product(product_id int, product_name varchar(20)) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
load data inpath "/leetcode/1518_Product" overwrite into table 1518_Product;
select * from 1518_Product;
create table 1518_Sales(product_id int, period_start date, period_end date, average_daily_sales int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
load data inpath "/leetcode/1518_Sales" overwrite into table 1518_Sales;
select * from 1518_Sales;
知识点:分类讨论+结果union。<注意起始日期>
select
t.product_id,
p.product_name,
t.report_year,
t.total_amount
from
(
select
product_id,'2018' report_year,
datediff(if(period_end < cast('2019-01-01' as date),
period_end,
cast('2019-01-01' as date)),
period_start)* average_daily_sales total_amount
from
1518 _Sales
union all
select
product_id,
'2019' report_year,
datediff(if(period_end < cast('2020-01-01' as date),
cast(date_add(period_end,
1)as date),
cast('2020-01-01' as date)),
if(period_start > cast('2018-12-31' as date),
period_start,
cast('2019-01-01' as date)))* average_daily_sales total_amount
from
1518 _Sales
union all
select
product_id,
'2020' report_year,
datediff(if(period_end < cast('2021-01-01' as date),
cast(date_add(period_end,
1)as date),
cast('2021-01-01' as date)),
if(period_start > cast('2019-12-31' as date),
period_start,
cast('2020-01-01' as date)))* average_daily_sales total_amount
from
1518 _Sales ) t
left join 1518 _Product p on
t.product_id = p.product_id
where
t.total_amount>0
1523_Medium_股票的资本损益
Stocks 表:
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| stock_name | varchar |
| operation | enum |
| operation_day | int |
| price | int |
+---------------+---------+
(stock_name, day) 是这张表的主键
operation 列使用的是一种枚举类型,包括:('Sell','Buy')
此表的每一行代表了名为 stock_name 的某支股票在 operation_day 这一天的操作价格。
保证股票的每次'Sell'操作前,都有相应的'Buy'操作。
编写一个SQL查询来报告每支股票的资本损益。
股票的资本损益是一次或多次买卖股票后的全部收益或损失。
以任意顺序返回结果即可。
SQL查询结果的格式如下例所示:
Stocks 表:
+---------------+-----------+---------------+--------+
| stock_name | operation | operation_day | price |
+---------------+-----------+---------------+--------+
| Leetcode | Buy | 1 | 1000 |
| Corona Masks | Buy | 2 | 10 |
| Leetcode | Sell | 5 | 9000 |
| Handbags | Buy | 17 | 30000 |
| Corona Masks | Sell | 3 | 1010 |
| Corona Masks | Buy | 4 | 1000 |
| Corona Masks | Sell | 5 | 500 |
| Corona Masks | Buy | 6 | 1000 |
| Handbags | Sell | 29 | 7000 |
| Corona Masks | Sell | 10 | 10000 |
+---------------+-----------+---------------+--------+
Result 表:
+---------------+-------------------+
| stock_name | capital_gain_loss |
+---------------+-------------------+
| Corona Masks | 9500 |
| Leetcode | 8000 |
| Handbags | -23000 |
+---------------+-------------------+
Leetcode 股票在第一天以1000美元的价格买入,在第五天以9000美元的价格卖出。资本收益=9000-1000=8000美元。
Handbags 股票在第17天以30000美元的价格买入,在第29天以7000美元的价格卖出。资本损失=7000-30000=-23000美元。
Corona Masks 股票在第1天以10美元的价格买入,在第3天以1010美元的价格卖出。在第4天以1000美元的价格再次购买,在第5天以500美元的价格出售。最后,它在第6天以1000美元的价格被买走,在第10天以10000美元的价格被卖掉。资本损益是每次(’Buy'->'Sell')操作资本收益或损失的和=(1010-10)+(500-1000)+(10000-1000)=1000-500+9000=9500美元。
建表+导数
create table 1523_Stocks(stock_name varchar(20), operation varchar(20), operation_day int, price int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
load data inpath "/leetcode/1523_Stocks" overwrite into table 1523_Stocks;
select * from 1523_Stocks;
知识点:group by配合case when then else end语句实现对字段不同条件的求和。
select
stock_name, sum(case when operation='Buy' then -price else price end) capital_gain_loss
from 1523_Stocks
group by stock_name
执行结果:
1536_Medium_购买了产品A和产品B却没有购买产品C的顾客
Customers 表:
+---------------------+---------+
| Column Name | Type |
+---------------------+---------+
| customer_id | int |
| customer_name | varchar |
+---------------------+---------+
customer_id 是这张表的主键。
customer_name 是顾客的名称。
Orders 表:
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| order_id | int |
| customer_id | int |
| product_name | varchar |
+---------------+---------+
order_id 是这张表的主键。
customer_id 是购买了名为 "product_name" 产品顾客的id。
请你设计 SQL 查询来报告购买了产品 A 和产品 B 却没有购买产品 C 的顾客的 ID 和姓名( customer_id 和 customer_name ),我们将基于此结果为他们推荐产品 C 。
您返回的查询结果需要按照 customer_id 排序。
查询结果如下例所示。
Customers table:
+-------------+---------------+
| customer_id | customer_name |
+-------------+---------------+
| 1 | Daniel |
| 2 | Diana |
| 3 | Elizabeth |
| 4 | Jhon |
+-------------+---------------+
Orders table:
+------------+--------------+---------------+
| order_id | customer_id | product_name |
+------------+--------------+---------------+
| 10 | 1 | A |
| 20 | 1 | B |
| 30 | 1 | D |
| 40 | 1 | C |
| 50 | 2 | A |
| 60 | 3 | A |
| 70 | 3 | B |
| 80 | 3 | D |
| 90 | 4 | C |
+------------+--------------+---------------+
Result table:
+-------------+---------------+
| customer_id | customer_name |
+-------------+---------------+
| 3 | Elizabeth |
+-------------+---------------+
只有 customer_id 为 3 的顾客购买了产品 A 和产品 B ,却没有购买产品 C 。
建表+导数
create table 1536_Customers(customer_id int, customer_name string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
load data inpath "/leetcode/1536_Customers" overwrite into table 1536_Customers;
select * from 1536_Customers;
create table 1536_Orders(order_id int, customer_id int, product_name string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
load data inpath "/leetcode/1536_Orders" overwrite into table 1536_Orders;
select * from 1536_Orders;
知识点:1、collect_set(xxx),对group by 聚合后的字段set去重。2、array_contains(array, value) 如果array包含value字段,那么返回true.
select
c.customer_id,
c.customer_name
from
1536 _Customers c
right join (
select
customer_id,
collect_set(product_name) product_name_set
from
1536 _Orders
group by
customer_id
having
array_contains(product_name_set,
'A')
and array_contains(product_name_set,
'B')
and if(array_contains(product_name_set,
'C'),
false,
true)
) t on
c.customer_id = t.customer_id
执行结果:
1541_Easy_排名靠前的旅行者
表:Users
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| id | int |
| name | varchar |
+---------------+---------+
id 是该表单主键。
name 是用户名字。
表:Rides
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| id | int |
| user_id | int |
| distance | int |
+---------------+---------+
id 是该表单主键。
user_id 是本次行程的用户的 id, 而该用户此次行程距离为 distance 。
写一段 SQL ,报告每个用户的旅行距离。
返回的结果表单,以travelled_distance降序排列 ,如果有两个或者更多的用户旅行了相同的距离,那么再以name升序排列 。
查询结果格式如下例所示。
Users 表:
+------+-----------+
| id | name |
+------+-----------+
| 1 | Alice |
| 2 | Bob |
| 3 | Alex |
| 4 | Donald |
| 7 | Lee |
| 13 | Jonathan |
| 19 | Elvis |
+------+-----------+
Rides 表:
+------+----------+----------+
| id | user_id | distance |
+------+----------+----------+
| 1 | 1 | 120 |
| 2 | 2 | 317 |
| 3 | 3 | 222 |
| 4 | 7 | 100 |
| 5 | 13 | 312 |
| 6 | 19 | 50 |
| 7 | 7 | 120 |
| 8 | 19 | 400 |
| 9 | 7 | 230 |
+------+----------+----------+
Result 表:
+----------+--------------------+
| name | travelled_distance |
+----------+--------------------+
| Elvis | 450 |
| Lee | 450 |
| Bob | 317 |
| Jonathan | 312 |
| Alex | 222 |
| Alice | 120 |
| Donald | 0 |
+----------+--------------------+
Elvis 和 Lee 旅行了 450 英里,Elvis 是排名靠前的旅行者,因为他的名字在字母表上的排序比 Lee 更小。
Bob, Jonathan, Alex 和 Alice 只有一次行程,我们只按此次行程的全部距离对他们排序。
Donald 没有任何行程, 他的旅行距离为 0。
建表+导数
create table 1541_Users(id int, name string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
load data inpath "/leetcode/1541_Users" overwrite into table 1541_Users;
select * from 1541_Users;
create table 1541_Rides(id int, user_id int, distance int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
load data inpath "/leetcode/1541_Rides" overwrite into table 1541_Rides;
select * from 1541_Rides;
知识点:right join,group by-having,coalesce(xxx,yyy)去第一个不为null的值
select
u.name,
coalesce(sum(r.distance),
0) travelled_distance
from
1541 _Rides r
right join 1541 _Users u on
r.user_id = u.id
group by
u.name
order by
travelled_distance desc,
u.name
执行结果:
1546_Hard_查找成绩处于中游的学生
表: Student
+---------------------+---------+
| Column Name | Type |
+---------------------+---------+
| student_id | int |
| student_name | varchar |
+---------------------+---------+
student_id 是该表主键.
student_name 学生名字.
表: Exam
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| exam_id | int |
| student_id | int |
| score | int |
+---------------+---------+
(exam_id, student_id) 是该表主键.
学生 student_id 在测验 exam_id 中得分为 score.
成绩处于中游的学生是指至少参加了一次测验,且得分既不是最高分也不是最低分的学生。
写一个 SQL 语句,找出在 所有 测验中都处于中游的学生 (student_id, student_name)。
不要返回从来没有参加过测验的学生。返回结果表按照student_id排序。
查询结果格式如下。
Student 表:
+-------------+---------------+
| student_id | student_name |
+-------------+---------------+
| 1 | Daniel |
| 2 | Jade |
| 3 | Stella |
| 4 | Jonathan |
| 5 | Will |
+-------------+---------------+
Exam 表:
+------------+--------------+-----------+
| exam_id | student_id | score |
+------------+--------------+-----------+
| 10 | 1 | 70 |
| 10 | 2 | 80 |
| 10 | 3 | 90 |
| 20 | 1 | 80 |
| 30 | 1 | 70 |
| 30 | 3 | 80 |
| 30 | 4 | 90 |
| 40 | 1 | 60 |
| 40 | 2 | 70 |
| 40 | 4 | 80 |
+------------+--------------+-----------+
Result 表:
+-------------+---------------+
| student_id | student_name |
+-------------+---------------+
| 2 | Jade |
+-------------+---------------+
对于测验 1: 学生 1 和 3 分别获得了最低分和最高分。
对于测验 2: 学生 1 既获得了最高分, 也获得了最低分。
对于测验 3 和 4: 学生 1 和 4 分别获得了最低分和最高分。
学生 2 和 5 没有在任一场测验中获得了最高分或者最低分。
因为学生 5 从来没有参加过任何测验, 所以他被排除于结果表。
由此, 我们仅仅返回学生 2 的信息。
建表+导数
create table 1546_Student(student_id int, student_name string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
load data inpath "/leetcode/1546_Student" overwrite into table 1546_Student;
select * from 1546_Student;
create table 1546_Exam(exam_id int, student_id int, score int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
load data inpath "/leetcode/1546_Exam" overwrite into table 1546_Exam;
select * from 1546_Exam;
知识点1:开窗函数、外连接。
开窗函数求score的最大值最小值,过滤出有过最高得分和最低得分的同学,再外连接即可得到中游学生。
方法一、
select
t3.student_id,s.student_name
from
(
select
e.student_id
from 1546_Exam e
left join
(
select
t.student_id, t.exam_count
from
(
select
student_id,
exam_id,
score,
count(1) over(partition by student_id) exam_count,
min(score) over(partition by exam_id) exam_min,
max(score) over(partition by exam_id) exam_max
from 1546_Exam
) t
where t.exam_count > 1 and score = exam_min or score = exam_max
) t2
on e.student_id=t2.student_id where t2.student_id is null
) t3
left join 1546_Student s
on t3.student_id=s.student_id
group by t3.student_id,s.student_name
知识点2:采用count() = sum()并配合标志位达到过滤目的。
方法二:
select
s.student_id, s.student_name
from 1546_Student s
left join
(
select
e.student_id, case when count(1)=sum(case when e.score>t.exam_min and e.score<t.exam_max then 1 else 0 end) then 1 else 0 end tag
from
(
select
exam_id,
min(score) exam_min,
max(score) exam_max
from 1546_Exam
group by exam_id
) t
left join 1546_Exam e
on t.exam_id=e.exam_id
group by e.student_id
) t2
on s.student_id=t2.student_id where t2.tag=1
group by s.student_id, s.student_name
执行结果: