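Formatting issues
pandas indexing
Two-level indexes
Structure of a MultiIndex and its table
- Index level names: names
- Values attribute: values
- Get the index of a given level: get_level_values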
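import numpy as np
import pandas as pd

# df_demo is assumed to be loaded already; from its use below it has the
# columns School, Grade, Gender, Height and Weight.
multi_index = pd.MultiIndex.from_product(
    [list('ABCD'), df_demo.Gender.unique()], names=('School', 'Gender'))
multi_column = pd.MultiIndex.from_product(
    [['Height', 'Weight'], df_demo.Grade.unique()], names=('Indicator', 'Grade'))
# Column-stack four height columns and four weight columns into an 8x8 block.
df_multi = pd.DataFrame(
    np.c_[(np.random.randn(8, 4)*5 + 163).tolist(),
          (np.random.randn(8, 4)*5 + 65).tolist()],
    index=multi_index,
    columns=multi_column).round(1)
df_multi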
Indicator       Height                           Weight
Grade         Freshman Senior Sophomore Junior Freshman Senior Sophomore Junior
School Gender
A      Female    171.8  165.0     167.9  174.2     60.6   55.1      63.3   65.8
       Male      172.3  158.1     167.8  162.2     71.2   71.0      63.1   63.5
B      Female    162.5  165.1     163.7  170.3     59.8   57.9      56.5   74.8
       Male      166.8  163.6     165.2  164.7     62.5   62.8      58.7   68.9
C      Female    170.5  162.0     164.6  158.7     56.9   63.9      60.5   66.9
       Male      150.2  166.3     167.3  159.3     62.4   59.1      64.9   67.1
D      Female    174.3  155.7     163.2  162.1     65.3   66.5      61.8   63.2
       Male      170.7  170.3     163.8  164.9     61.6   63.2      60.9   56.4
School and Gender are the names of the first and second levels of the row index.
Indicator and Grade are the names of the first and second levels of the column index.
df_multi.index.names
FrozenList(['School', 'Gender'])
df_multi.columns.names
FrozenList(['Indicator', 'Grade'])
df_multi.index.values
array([('A', 'Female'), ('A', 'Male'), ('B', 'Female'), ('B', 'Male'), ('C', 'Female'), ('C', 'Male'), ('D', 'Female'), ('D', 'Male')],
dtype=object)
df_multi.columns.values
array([('Height', 'Freshman'), ('Height', 'Senior'), ('Height', 'Sophomore'), ('Height', 'Junior'), ('Weight', 'Freshman'), ('Weight', 'Senior'), ('Weight', 'Sophomore'), ('Weight', 'Junior')], dtype=object)
df_multi.index.get_level_values(1)
Index(['Female', 'Male', 'Female', 'Male', 'Female', 'Male', 'Female', 'Male'], dtype='object', name='Gender')
The loc indexer on a MultiIndex
df_multi = df_demo.set_index(['School','Grade'])
df_multi
                                         Gender  Height  Weight
School                        Grade
Shanghai Jiao Tong University Freshman   Female   158.9    46.0
Peking University             Freshman     Male   166.5    70.0
Shanghai Jiao Tong University Senior       Male   188.9    89.0
Fudan University              Sophomore  Female     NaN    41.0
                              Sophomore    Male   174.0    74.0
...                                         ...     ...     ...
                              Junior     Female   153.9    46.0
Tsinghua University           Senior     Female   160.9    50.0
Shanghai Jiao Tong University Senior     Female   153.9    45.0
                              Senior       Male   175.3    71.0
Tsinghua University           Sophomore    Male   155.7    51.0
200 rows × 3 columns
df_sorted = df_multi.sort_index()
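The sorted table is what loc is normally applied to, because slicing an unsorted MultiIndex can raise an error or trigger a performance warning. The sketch below illustrates tuple-based lookups on a small stand-in table; `df_small` and its values are made up for illustration and are not the `df_demo` data.

```python
import pandas as pd

# Hypothetical miniature of the demo table; the values are invented.
df_small = pd.DataFrame(
    {
        "School": ["Fudan", "Fudan", "Peking", "Peking", "Tsinghua"],
        "Grade": ["Senior", "Freshman", "Junior", "Freshman", "Senior"],
        "Height": [160.0, 170.5, 165.2, 158.9, 172.3],
    }
).set_index(["School", "Grade"])

# Sort first so that slicing on the MultiIndex is well defined.
df_small_sorted = df_small.sort_index()

# A tuple addresses one (School, Grade) combination; adding a column
# label returns the single cell.
fudan_senior_height = df_small_sorted.loc[("Fudan", "Senior"), "Height"]

# A slice between two tuples selects a contiguous block of rows,
# with both endpoints included.
block = df_small_sorted.loc[("Fudan", "Senior"):("Peking", "Junior")]
print(fudan_senior_height)
print(block)
```

Slicing with whole tuples only cuts the index as a flat sequence of pairs; it cannot slice each level independently, which is the gap IndexSlice fills.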
IndexSlice objects
- Use cases
- The index has no duplicates
- Each level can be sliced separately
- Slices and boolean lists can be mixed
- Usage
loc[idx[*,*]]
loc[idx[*,*],idx[*,*]]
np.random.seed(0)
a, b = ['A','B','C'],['a','b','c']
mul_index1 = pd.MultiIndex.from_product([a,b],names=('Upper','Lower'))
c, d = ['D','E','F'],['d','e','f']
mul_index2 = pd.MultiIndex.from_product([c,d],names=('Big','Small'))
df = pd.DataFrame(np.random.randint(-9,10,(9,9)),
index = mul_index1,
columns=mul_index2)
df
Big          D        E        F
Small        d  e  f  d  e  f  d  e  f
Upper Lower
A     a      3  6 -9 -6 -6 -2  0  9 -5
      b     -3  3 -8 -3 -2  5  8 -4  4
      c     -1  0  7 -4  6  6 -9  9 -6
B     a      8  5 -2 -9 -8  0 -9  1 -6
      b      2  9 -7 -9 -9 -5 -4 -3 -1
      c      8  6 -5  0  1 -8 -8 -2  0
C     a     -6 -3  2  5  9 -9  5 -6  3
      b      1  2 -5 -3 -5  6 -6  3 -5
      c     -1  5  6 -6  6  4  7  8 -4
idx = pd.IndexSlice
loc[idx[*,*]]
- The first * selects rows and the second * selects columns
df.loc[idx['C':,('D','f'):]]
Big          D  E        F
Small        f  d  e  f  d  e  f
Upper Lower
C     a      2  5  9 -9  5 -6  3
      b     -5 -3 -5  6 -6  3 -5
      c      6 -6  6  4  7  8 -4
df.loc[idx[:'A', lambda x:x.sum()>0]]
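The line above mixes a row slice with a boolean condition on the columns. As a minimal sketch of what that selects, the example below uses a small hand-written table (`df_small`, a hypothetical stand-in whose values are chosen so the column sums are easy to check) and shows that the lambda is equivalent to passing the boolean mask directly.

```python
import pandas as pd

rows = pd.MultiIndex.from_product([["A", "B"], ["a", "b"]], names=("Upper", "Lower"))
cols = pd.MultiIndex.from_product([["D", "E"], ["d", "e"]], names=("Big", "Small"))
# Hand-picked values: the column sums are 9, -5, 7 and -14.
df_small = pd.DataFrame(
    [[1, -2, 3, -4],
     [5, -6, 7, -8],
     [1, 1, -1, -1],
     [2, 2, -2, -2]],
    index=rows, columns=cols,
)

idx = pd.IndexSlice

# Boolean mask over the columns: True where the sum over ALL rows is positive.
positive_cols = df_small.sum() > 0

# Rows: everything up to and including 'A' on the first level.
# Columns: only those flagged True in the mask.
picked = df_small.loc[idx[:"A", positive_cols]]

# A callable is evaluated on the whole frame, so it yields the same mask.
picked_by_lambda = df_small.loc[idx[:"A", lambda x: x.sum() > 0]]
print(picked)
```

Note that the condition is computed on the full table, not only on the rows kept by the slice.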
loc[idx[*,*],idx[*,*]]
Level-by-level slicing: the first idx addresses the row index and the second idx addresses the column index
df.loc[idx[:'A','b':],idx['E':,'e':]]
Big          E     F
Small        e  f  e  f
Upper Lower
A     b     -2  5 -4  4
      c      6  6  9 -6
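Constructing a MultiIndex
- Use the constructor functions on pd.MultiIndex
from_tuples: builds the index from a list of tuples
from_arrays: builds the index from a list of lists, one list per level
from_product: builds the index from the Cartesian product of the given lists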
a = [('a','cat'),('a','dog'),('b','cat'),('b','dog')]
pd.MultiIndex.from_tuples(a, names=['First','Second'])
MultiIndex([('a', 'cat'), ('a', 'dog'), ('b', 'cat'), ('b', 'dog')],
names=['First', 'Second'])
b = [list('aabb'),['cat','dog']*2]
pd.MultiIndex.from_arrays(b,names=['First','Second'])
MultiIndex([('a', 'cat'), ('a', 'dog'), ('b', 'cat'), ('b', 'dog')],
names=['First', 'Second'])
a = ['a','b']
b = ['cat','dog']
pd.MultiIndex.from_product([a,b],names=['First','Second'])
MultiIndex([('a', 'cat'), ('a', 'dog'), ('b', 'cat'), ('b', 'dog')],
names=['First', 'Second'])
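The three constructors differ only in how the input is laid out. As a quick check, the sketch below rebuilds the same index in all three ways and compares the results; the variable names are chosen here for illustration.

```python
import pandas as pd

names = ["First", "Second"]

# One tuple per index entry.
from_tuples = pd.MultiIndex.from_tuples(
    [("a", "cat"), ("a", "dog"), ("b", "cat"), ("b", "dog")], names=names)

# One list per level, all of the same length.
from_arrays = pd.MultiIndex.from_arrays(
    [list("aabb"), ["cat", "dog"] * 2], names=names)

# The unique values of each level; entries are their Cartesian product.
from_product = pd.MultiIndex.from_product(
    [["a", "b"], ["cat", "dog"]], names=names)

all_equal = from_tuples.equals(from_arrays) and from_arrays.equals(from_product)
print(all_equal)
```

from_product is the shortest form when every combination is present; from_tuples or from_arrays is needed when some combinations are missing.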
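Common index methods
Swapping and dropping index levels
- Swapping: the axis (row or column index) can be specified
swaplevel: swaps exactly two levels
reorder_levels: reorders any number of levels
- Dropping: droplevel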
np.random.seed(0)
L1,L2,L3 = ['A','B'],['a','b'],['alpha','beta']
mul_index1 = pd.MultiIndex.from_product([L1,L2,L3],
names=('Upper', 'Lower','Extra'))
L4,L5,L6 = ['C','D'],['c','d'],['cat','dog']
mul_index2 = pd.MultiIndex.from_product([L4,L5,L6],
names=('Big', 'Small', 'Other'))
df = pd.DataFrame(np.random.randint(-9,10,(8,8)),
index=mul_index1,
columns=mul_index2)
df
Big                 C               D
Small               c       d       c       d
Other             cat dog cat dog cat dog cat dog
Upper Lower Extra
A     a     alpha   3   6  -9  -6  -6  -2   0   9
            beta   -5  -3   3  -8  -3  -2   5   8
      b     alpha  -4   4  -1   0   7  -4   6   6
            beta   -9   9  -6   8   5  -2  -9  -8
B     a     alpha   0  -9   1  -6   2   9  -7  -9
            beta   -9  -5  -4  -3  -1   8   6  -5
      b     alpha   0   1  -8  -8  -2   0  -6  -3
            beta    2   5   9  -9   5  -6   3   1
df.swaplevel(0,2,axis=1).head()
Other             cat dog cat dog cat dog cat dog
Small               c   c   d   d   c   c   d   d
Big                 C   C   C   C   D   D   D   D
Upper Lower Extra
A     a     alpha   3   6  -9  -6  -6  -2   0   9
            beta   -5  -3   3  -8  -3  -2   5   8
      b     alpha  -4   4  -1   0   7  -4   6   6
            beta   -9   9  -6   8   5  -2  -9  -8
B     a     alpha   0  -9   1  -6   2   9  -7  -9
df.reorder_levels([2,0,1],axis=0).head()
Big                 C               D
Small               c       d       c       d
Other             cat dog cat dog cat dog cat dog
Extra Upper Lower
alpha A     a       3   6  -9  -6  -6  -2   0   9
beta  A     a      -5  -3   3  -8  -3  -2   5   8
alpha A     b      -4   4  -1   0   7  -4   6   6
beta  A     b      -9   9  -6   8   5  -2  -9  -8
alpha B     a       0  -9   1  -6   2   9  -7  -9
df.droplevel(1,axis=1)
Big                 C               D
Other             cat dog cat dog cat dog cat dog
Upper Lower Extra
A     a     alpha   3   6  -9  -6  -6  -2   0   9
            beta   -5  -3   3  -8  -3  -2   5   8
      b     alpha  -4   4  -1   0   7  -4   6   6
            beta   -9   9  -6   8   5  -2  -9  -8
B     a     alpha   0  -9   1  -6   2   9  -7  -9
            beta   -9  -5  -4  -3  -1   8   6  -5
      b     alpha   0   1  -8  -8  -2   0  -6  -3
            beta    2   5   9  -9   5  -6   3   1
df.droplevel([0,1],axis=0)
Big     C               D
Small   c       d       c       d
Other cat dog cat dog cat dog cat dog
Extra
alpha   3   6  -9  -6  -6  -2   0   9
beta   -5  -3   3  -8  -3  -2   5   8
alpha  -4   4  -1   0   7  -4   6   6
beta   -9   9  -6   8   5  -2  -9  -8
alpha   0  -9   1  -6   2   9  -7  -9
beta   -9  -5  -4  -3  -1   8   6  -5
alpha   0   1  -8  -8  -2   0  -6  -3
beta    2   5   9  -9   5  -6   3   1
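Modifying index attributes
rename_axis: renames index levels
rename: changes index values
- The argument can be a function
- To replace every index element, use an iterator
- To change the element at one position, use map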
df.rename_axis(index={'Upper':'Changed_row'},
columns={'Other':'Changed_Col'}).head()
Big                       C               D
Small                     c       d       c       d
Changed_Col             cat dog cat dog cat dog cat dog
Changed_row Lower Extra
A           a     alpha   3   6  -9  -6  -6  -2   0   9
                  beta   -5  -3   3  -8  -3  -2   5   8
            b     alpha  -4   4  -1   0   7  -4   6   6
                  beta   -9   9  -6   8   5  -2  -9  -8
B           a     alpha   0  -9   1  -6   2   9  -7  -9
df.rename(columns={'cat':'not_cat'},level=2).head()
Big                     C                       D
Small                   c           d           c           d
Other             not_cat dog not_cat dog not_cat dog not_cat dog
Upper Lower Extra
A     a     alpha       3   6      -9  -6      -6  -2       0   9
            beta       -5  -3       3  -8      -3  -2       5   8
      b     alpha      -4   4      -1   0       7  -4       6   6
            beta       -9   9      -6   8       5  -2      -9  -8
B     a     alpha       0  -9       1  -6       2   9      -7  -9
df.rename(index=lambda x:str.upper(x),level=2).head()
Big                 C               D
Small               c       d       c       d
Other             cat dog cat dog cat dog cat dog
Upper Lower Extra
A     a     ALPHA   3   6  -9  -6  -6  -2   0   9
            BETA   -5  -3   3  -8  -3  -2   5   8
      b     ALPHA  -4   4  -1   0   7  -4   6   6
            BETA   -9   9  -6   8   5  -2  -9  -8
B     a     ALPHA   0  -9   1  -6   2   9  -7  -9
a = iter(list('abcdefgh'))
df.rename(index= lambda x:next(a),level=2)
Big                 C               D
Small               c       d       c       d
Other             cat dog cat dog cat dog cat dog
Upper Lower Extra
A     a     a       3   6  -9  -6  -6  -2   0   9
            b      -5  -3   3  -8  -3  -2   5   8
      b     c      -4   4  -1   0   7  -4   6   6
            d      -9   9  -6   8   5  -2  -9  -8
B     a     e       0  -9   1  -6   2   9  -7  -9
            f      -9  -5  -4  -3  -1   8   6  -5
      b     g       0   1  -8  -8  -2   0  -6  -3
            h       2   5   9  -9   5  -6   3   1
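The notes mention map for changing the element at one position, but give no example. One possible sketch, on a small hypothetical table (`df_small`), relies on the fact that each entry of a MultiIndex is a tuple: the function receives the whole tuple and returns a new one with only the chosen position changed.

```python
import pandas as pd

index = pd.MultiIndex.from_product([["A", "B"], ["a", "b"]], names=("Upper", "Lower"))
df_small = pd.DataFrame({"value": [1, 2, 3, 4]}, index=index)

# t is one index entry such as ('A', 'a'); keep position 0 and
# upper-case position 1. Returning tuples yields a MultiIndex again.
new_index = df_small.index.map(lambda t: (t[0], t[1].upper()))

# Assign the rebuilt index back to the table.
df_small.index = new_index
print(df_small)
```

Unlike rename with level=..., this approach sees all levels of an entry at once, so the new value at one position may depend on the values at the others.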
Setting and resetting the index
set_index: sets the index
reset_index is the inverse of set_index
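A minimal sketch of the round trip, using a small table invented for illustration (`df_flat` and its values are not from the notes): set_index moves columns into the index, and reset_index moves them back.

```python
import pandas as pd

# Hypothetical flat table.
df_flat = pd.DataFrame({"A": list("aacd"), "B": list("PQRT"), "C": [1, 2, 3, 4]})

# Columns A and B become the two levels of the row index.
indexed = df_flat.set_index(["A", "B"])

# reset_index turns the index levels back into ordinary columns.
restored = indexed.reset_index()

# append=True keeps the existing index and adds the new level after it.
appended = df_flat.set_index("A").set_index("B", append=True)

# drop=True discards the index instead of restoring it as columns.
dropped = indexed.reset_index(drop=True)
print(restored.equals(df_flat))
```

reset_index also accepts a list of level names, so only some levels are moved back while the rest stay in the index.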
Index reshaping
Index operations
Rules of set operations
General index operations
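As a sketch of what index operations look like in practice, the example below applies the four usual set operations to two small Index objects. The names `id1` and `id2` are chosen here for illustration; since an index is treated as a set, it is assumed that duplicates have been removed first (for example with unique()).

```python
import pandas as pd

id1 = pd.Index(["a", "b"], name="id1")
id2 = pd.Index(["b", "c"], name="id2")

# Elements present in both indexes.
common = id1.intersection(id2)

# Elements present in either index.
either = id1.union(id2)

# Elements of id1 that are not in id2.
only_first = id1.difference(id2)

# Elements present in exactly one of the two indexes.
exactly_one = id1.symmetric_difference(id2)

print(list(common), list(either), list(only_first), list(exactly_one))
```

A typical use is restricting two tables to their shared labels, for example by passing the intersection of their indexes to loc.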
References