从今天开始入门SQL,本文内容根据油管上Michelle小梦想家数据科学的Channel整理。
1. SQL in General
为什么学习SQL?
- SQL是关系型数据库管理的标准语言
- 通过使用SQL可以加载,查询以及分析大规模的数据集
- 在申请数据相关岗位时会经常被面试官问到相关问题
2. Database 和 Table
数据库基础
-
一个数据库可能包含多个相关的表格
-
每个表格都包含 行 和 列, 假设我们有以下关于员工信息的表格:
- 这个表格包含 4 行和 4 列。
- 每一列包含员工的一个属性。
- 每一行对应一个特定的员工。
| Id | Name | Salary | ManagerId |
|---|---|---|---|
| 1 | Joe | 70000 | 3 |
| 2 | Henry | 80000 | 4 |
| 3 | Sam | 60000 | NULL |
| 4 | Max | 90000 | NULL |
-
每个表格都包含 Primary Key 和 Foreign Key:
- Primary key 是每一条记录的独一无二的标识符。
- Primary key 不能包含重复或Null值。
- Foreign keys 可能引用于其他表格里的primary key。
- Foreign keys 可以重复或有Null值。
基础SQL语法

3. SELECT 和 FROM
接下来均以下表(leetcode185) 为例:
Table1 Employee
| Id | Name | Salary | DepartmentId |
|---|---|---|---|
| 1 | Joe | 85000 | 1 |
| 2 | Henry | 80000 | 2 |
| 3 | Sam | 60000 | 2 |
| 4 | Max | 90000 | 1 |
| 5 | Janet | 69000 | 1 |
| 6 | Randy | 85000 | 1 |
| 7 | Will | 70000 | 1 |
SELECT * FROM
- SELECT 告诉SQL server来检索特定行的所有列
- SELECT * FROM 告诉SQL server来检索指定表格中的所有行和列
- SELECT和FROM是SQL query语句中必须包含的
Example MySQL Query
SELECT *
FROM Employee;
Query result
| Id | Name | Salary | DepartmentId |
|---|---|---|---|
| 1 | Joe | 85000 | 1 |
| 2 | Henry | 80000 | 2 |
| 3 | Sam | 60000 | 2 |
| 4 | Max | 90000 | 1 |
| 5 | Janet | 69000 | 1 |
| 6 | Randy | 85000 | 1 |
| 7 | Will | 70000 | 1 |
SELECT ColumnName
- 通过在SELECT 声明列名可以得到指定的列数据
- 为了SELECT 多个列, 我们可以将它们的名字用逗号隔开
- SELECT DISTINCT 帮助我们返回指定列中去重后的数据
- DISTINCT 功能可以帮助我们检测某个属性所有可能的结果
Example MySQL Query
SELECT Name,Salary
FROM Employee;
Query result
| Name | Salary |
|---|---|
| Joe | 85000 |
| Henry | 80000 |
| Sam | 60000 |
| Max | 90000 |
| Janet | 69000 |
| Randy | 85000 |
| Will | 70000 |
Example MySQL Query
SELECT DISTINCT DepartmentID
FROM Employee;
Query result
| DepartmentId |
|---|
| 1 |
| 2 |
易错点


4. WHERE
-
WHERE让你指定rows所要满足的逻辑条件
-
WHERE可以使用操作符如 '=','<>','<','>','<=','>=','BETWEEN','Like','IN',etc
WHERE & AND
Example MySQL Query
SELECT *
FROM Employee
WHERE DepartmentId = 2 AND Salary>50000;
Query result
| Id | Name | Salary | DepartmentId |
|---|---|---|---|
| 2 | Henry | 80000 | 2 |
| 3 | Sam | 60000 | 2 |
WHERE & OR
Example MySQL Query
SELECT *
FROM Employee
WHERE Salary<70000 OR Salary>80000;
Query result
| Id | Name | Salary | DepartmentId |
|---|---|---|---|
| 1 | Joe | 85000 | 1 |
| 3 | Sam | 60000 | 2 |
| 4 | Max | 90000 | 1 |
| 5 | Janet | 69000 | 1 |
| 6 | Randy | 89000 | 1 |
WHERE & IN
Example MySQL Query
SELECT *
FROM Employee
WHERE Salary IN (70000,85000);
Query result
| Id | Name | Salary | DepartmentId |
|---|---|---|---|
| 1 | Joe | 85000 | 1 |
| 6 | Randy | 85000 | 1 |
| 7 | Will | 70000 | 1 |
WHERE & LIKE
Example MySQL Query
SELECT *
FROM Employee
WHERE Salary Name LIKE '%an%';
Query result
| Id | Name | Salary | DepartmentId |
|---|---|---|---|
| 5 | Janet | 69000 | 1 |
| 6 | Randy | 85000 | 1 |
易错点


5. GROUP BY
GROUP BY
- GROUP BY 指定的是你想如何对结果集合进行分组
- 将GROUP BY与aggregate function(i.e. COUNT SUM)组合来声明根据一个列或多个列对结果集进行分组
Example MySQL Query
SELECT DepartmentId,COUNT(Name)
FROM Employee
GROUP BY DepartmentId;
Query result
| DepartmentId | COUNT(Name) |
|---|---|
| 1 | 5 |
| 2 | 2 |
GROUP BY & Aliases
- Aliasis 允许你通过使用“AS"声明,在结果中的某个域加上一个自定义的名称
Example MySQL Query
SELECT DepartmentId,
COUNT(Name) AS Number_Employee
FROM Employee
GROUP BY DepartmentId;
Query result
| DepartmentId | Number_Employee |
|---|---|
| 1 | 5 |
| 2 | 2 |
GROUP BY & Aggregate Function
- Aggregate Function 包含 COUNT,COUNT DISTINCT,MIN,MAX,AVG,SUM,等等
Example MySQL Query
SELECT DepartmentId,
COUNT(Salary) AS Total,
MIN(Salary) AS MinSalary,
MAX(Salary) AS MaxSalary,
AVG(Salary) AS AvgSalary,
SUM(Salary) AS SumSalary
FROM Employee
GROUP BY DepartmentId;
Query result
| DepartmentId | Total | MinSalary | MaxSalary | AvgSalary | SumSalary |
|---|---|---|---|---|---|
| 1 | 5 | 69000 | 90000 | 79800 | 399000 |
| 2 | 2 | 60000 | 80000 | 70000 | 140000 |
6. HAVING & ORDER BY
HAVING
- HAVING 帮助我们筛选分组后的各组数据
- HAVING是一个必须和GROUP BY联合使用的语法
- HAVING存在的意义是 WHERE无法和Aggregate Function一起使用
Example MySQL Query
SELECT DepartmentId
FROM Employee
GROUP BY DepartmentId
HAVING COUNT(*)>3;
Query result
| DepartmentId |
|---|
| 1 |
## ORDER BY
- ORDER BY 帮助你对结果集中的记录进行排序
- 默认方式是升序排列
Example MySQL Query
SELECT NAME, Salary, DepartmentId
FROM Employee
ORDER BY Salary;
Query result
| Name | Salary | DepartmentId |
|---|---|---|
| Sam | 60000 | 2 |
| Janet | 69000 | 1 |
| Will | 70000 | 1 |
| Henry | 80000 | 2 |
| Joe | 85000 | 1 |
| Randy | 85000 | 1 |
| Max | 90000 | 1 |
ORDER BY(多列)
- ORDER BY可以包含多个准则,server会优先根据第一列的数据进行排序,然后根据附加的列的数据产生最终结果
Example MySQL Query
SELECT NAME, Salary, DepartmentId
FROM Employee
ORDER BY Salary,DepartmentId;
Query result
| Name | Salary | DepartmentId |
|---|---|---|
| Sam | 60000 | 2 |
| Janet | 69000 | 1 |
| Will | 70000 | 1 |
| Henry | 80000 | 2 |
| Joe | 85000 | 1 |
| Randy | 85000 | 1 |
| Max | 90000 | 1 |
ORDER BY(降序)
- 通过在列名后面加DESC可以将结果降序排列
Example MySQL Query
SELECT NAME, Salary, DepartmentId
FROM Employee
ORDER BY Salary DESC;
Query result
| Name | Salary | DepartmentId |
|---|---|---|
| Max | 90000 | 1 |
| Randy | 85000 | 1 |
| Joe | 85000 | 1 |
| Henry | 80000 | 2 |
| Will | 70000 | 1 |
| Janet | 69000 | 1 |
| Sam | 60000 | 2 |
7. INNER JOIN
Recap
- Primary Key 是独一无二的,不能重复,且不能为空
- Foreign Key 不是独一无二的,可以重复,且可以为空
- 我们可以通过连接一个table中的Foreign Key和其他table中的Primary Key来创建一对多的关系

INNER JOIN
- INNER JOIN 返回同时存在在两个table中的值,把不匹配的值去除
- INNER JOIN 是最常使用的JOIN类型之一(另一个为LEFT JOIN)
接下来以下面两个表为例:
Table2 Employee
| Id | Name | Salary | DepartmentId |
|---|---|---|---|
| 1 | Joe | 85000 | 1 |
| 2 | Henry | 80000 | 2 |
| 3 | Sam | 60000 | 2 |
| 4 | Max | 90000 | 1 |
| 5 | Janet | 69000 | 1 |
| 6 | Randy | 85000 | 1 |
| 7 | Will | 70000 | 1 |
Table3 Department
| Id | Name |
|---|---|
| 1 | IT |
Example MySQL Query
SELECT Employee.id AS Id, Employee.name AS Name,
Employee.Salary AS Salary,
Emplyee.DepartmentId as DepartmentId,
Department.Name as DepartmentName
FROM Employee
INNER JOIN Department
ON Employee.DepartmentId = Department.Id;
Query result
| Id | Name | Salary | DepartmentId | DepartmentName |
|---|---|---|---|---|
| 1 | Joe | 85000 | 1 | IT |
| 4 | Max | 90000 | 1 | IT |
| 5 | Janet | 69000 | 1 | IT |
| 6 | Randy | 85000 | 1 | IT |
| 7 | Will | 70000 | 1 | IT |
易错点

8. LEFT JOIN
- LEFT JOIN 可以添加来自第二个table中的额外的信息到第一个table中,同时保留第一个table的所有记录(即便不匹配)
- LEFT JOIN是最常用的两个JOIN类型之一(INNER JOIN是另一个)
Example MySQL Query
SELECT Employee.id AS Id, Employee.name AS Name,
Employee.Salary AS Salary,
Emplyee.DepartmentId as DepartmentId,
Department.Name as DepartmentName
FROM Employee
LEFT JOIN Department
ON Employee.DepartmentId = Department.Id;
Query result
| Id | Name | Salary | DepartmentId | DepartmentName |
|---|---|---|---|---|
| 1 | Joe | 85000 | 1 | IT |
| 2 | Henry | 80000 | 2 | < NA > |
| 3 | Sam | 60000 | 2 | < NA > |
| 4 | Max | 90000 | 1 | IT |
| 5 | Janet | 69000 | 1 | IT |
| 6 | Randy | 85000 | 1 | IT |
| 7 | Will | 70000 | 1 | IT |
9. RIGHT JOIN 和 OUTER JOIN
- RIGHT JOIN 与LEFT JOIN恰恰相反,它保留的是右边表格的所有数据而不是左边的
Example MySQL Query
SELECT Employee.id AS Id, Employee.name AS Name,
Employee.Salary AS Salary,
Emplyee.DepartmentId as DepartmentId,
Department.Name as DepartmentName
FROM Employee
RIGHT JOIN Department
ON Employee.DepartmentId = Department.Id;
Query result
| Id | Name | Salary | DepartmentId | DepartmentName |
|---|---|---|---|---|
| 1 | Joe | 85000 | 1 | IT |
| 4 | Max | 90000 | 1 | IT |
| 5 | Janet | 69000 | 1 | IT |
| 6 | Randy | 85000 | 1 | IT |
| 7 | Will | 70000 | 1 | IT |
OUTER JOIN
- 使用OUTER JOIN得到两个tables中存在的record即便有列的数值不对应
- OUTER JOIN不经常使用, 但在希望把两个tables中的所有数据进行合并时可以派上用场
10. UNION
我们使用以下tables为例:
Table4 professor
| Title | Name | Salary |
|---|---|---|
| Professor | Joe | 85000 |
| Professor | Henry | 80000 |
| Professor | Sam | 60000 |
Table5 RTA
| Title | Name | Salary |
|---|---|---|
| TA | Janet | 2000 |
| TA | Randy | 2000 |
| RA | Will | 1500 |
| Professor | Sam | 60000 |
UNION
- Union 告诉SQL来合并两个或多个SELECT语句的结果
- Union会对记录进行去重,最后结果只包含单独的数据
Example MySQL Query
SELECT Title AS Position, Name, Salary
FROM professor
UNION
SELECT Title AS Position, Name, Salary
FROM RTA
Query result
| Position | Name | Salary |
|---|---|---|
| TA | Janet | 2000 |
| TA | Randy | 2000 |
| RA | Will | 1500 |
| Professor | Joe | 85000 |
| Professor | Henry | 80000 |
| Professor | Sam | 60000 |
UNION ALL
- 如果想要保留重复的数据,可以使用UNION ALL操作符
Example MySQL Query
SELECT Title AS Position, Name, Salary
FROM professor
UNION ALL
SELECT Title AS Position, Name, Salary
FROM RTA
Query result
| Position | Name | Salary |
|---|---|---|
| Professor | Joe | 85000 |
| Professor | Henry | 80000 |
| Professor | Sam | 60000 |
| TA | Janet | 2000 |
| TA | Randy | 2000 |
| RA | Will | 1500 |
| Professor | Sam | 60000 |
易错点

