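In this Python tutorial, we will explore different ways to drop multiple columns from a pandas DataFrame. Let's get started.
Ways to drop multiple columns from a DataFrame
Before we begin, we need a sample DataFrame. The short snippet below creates the DataFrame used throughout this tutorial. Each method is shown on a fresh copy of this DataFrame, so re-run the snippet before trying the next one. Feel free to copy and paste the code and follow along: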
# Import pandas Python module
import pandas as pd
# Create a pandas DataFrame object
df = pd.DataFrame({'Dept': ['ECE', 'ICE', 'IT', 'CSE', 'CHE', 'EE', 'TE', 'ME', 'CSE', 'IPE', 'ECE'],
                   'GPA': [8.15, 9.03, 7.85, 8.55, 9.45, 7.45, 8.85, 9.35, 6.53, 8.85, 7.83],
                   'Name': ['Mohan', 'Gautam', 'Tanya', 'Rashmi', 'Kirti', 'Ravi', 'Sanjay', 'Naveen', 'Gaurav', 'Ram', 'Tom'],
                   'RegNo': [111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121]})
# Print the created sample pandas DataFrame
print('Sample pandas DataFrame:\n')
print(df)
Output
Sample pandas DataFrame:
Dept GPA Name RegNo
0 ECE 8.15 Mohan 111
1 ICE 9.03 Gautam 112
2 IT 7.85 Tanya 113
3 CSE 8.55 Rashmi 114
4 CHE 9.45 Kirti 115
5 EE 7.45 Ravi 116
6 TE 8.85 Sanjay 117
7 ME 9.35 Naveen 118
8 CSE 6.53 Gaurav 119
9 IPE 8.85 Ram 120
10 ECE 7.83 Tom 121
Method 1: Using the del keyword
# Drop 'GPA' column using del keyword
del df['GPA']
# Print the modified pandas DataFrame
print('Modified pandas DataFrame:\n')
print(df)
Output
Modified pandas DataFrame:
Dept Name RegNo
0 ECE Mohan 111
1 ICE Gautam 112
2 IT Tanya 113
3 CSE Rashmi 114
4 CHE Kirti 115
5 EE Ravi 116
6 TE Sanjay 117
7 ME Naveen 118
8 CSE Gaurav 119
9 IPE Ram 120
10 ECE Tom 121
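Since the del keyword removes one column per statement, dropping several columns this way takes a loop. The following is a minimal sketch on a shortened version of the sample data; the list name cols_to_drop is an illustrative choice, not something pandas defines.

```python
import pandas as pd

# Shortened version of the sample DataFrame (first three rows only)
df = pd.DataFrame({'Dept': ['ECE', 'ICE', 'IT'],
                   'GPA': [8.15, 9.03, 7.85],
                   'Name': ['Mohan', 'Gautam', 'Tanya'],
                   'RegNo': [111, 112, 113]})

# Hypothetical list of the columns we want to remove
cols_to_drop = ['GPA', 'RegNo']

# del handles a single column at a time, so loop over the names
for col in cols_to_drop:
    del df[col]

# Show the columns that remain
print(df.columns.tolist())
```

Like the single-column version, this modifies df in place, so keep a copy if the original columns are still needed.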
Method 2: Using the DataFrame.pop() function
# Drop 'RegNo' column using DataFrame.pop() function
df.pop('RegNo')
# Print the modified pandas DataFrame
print('Modified pandas DataFrame:\n')
print(df)
Output
Modified pandas DataFrame:
Dept GPA Name
0 ECE 8.15 Mohan
1 ICE 9.03 Gautam
2 IT 7.85 Tanya
3 CSE 8.55 Rashmi
4 CHE 9.45 Kirti
5 EE 7.45 Ravi
6 TE 8.85 Sanjay
7 ME 9.35 Naveen
8 CSE 6.53 Gaurav
9 IPE 8.85 Ram
10 ECE 7.83 Tom
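DataFrame.pop() also removes one column per call, and it returns the removed column as a Series. The sketch below, again on a shortened version of the sample data, pops two columns and keeps the returned values; the variable names reg_no and gpa are only illustrative.

```python
import pandas as pd

# Shortened version of the sample DataFrame (first three rows only)
df = pd.DataFrame({'Dept': ['ECE', 'ICE', 'IT'],
                   'GPA': [8.15, 9.03, 7.85],
                   'Name': ['Mohan', 'Gautam', 'Tanya'],
                   'RegNo': [111, 112, 113]})

# pop() removes the column in place and returns it as a Series
reg_no = df.pop('RegNo')
gpa = df.pop('GPA')

# The DataFrame no longer has the popped columns
print(df.columns.tolist())
# The removed data is still available in the returned Series
print(reg_no.tolist())
```

This can be handy when a column should leave the DataFrame but its values are still needed, for example as a separate target variable.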
Method 3: Using the DataFrame.drop() function with the columns parameter
# Drop 'GPA' and 'Name' columns using DataFrame.drop() function with columns parameter
df.drop(columns=['GPA','Name'], inplace=True)
# Print the modified pandas DataFrame
print('Modified pandas DataFrame:\n')
print(df)
Output
Modified pandas DataFrame:
Dept RegNo
0 ECE 111
1 ICE 112
2 IT 113
3 CSE 114
4 CHE 115
5 EE 116
6 TE 117
7 ME 118
8 CSE 119
9 IPE 120
10 ECE 121
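The example above uses inplace=True, which changes df itself. As a hedged sketch of two common variations: leaving out inplace returns a new DataFrame and keeps the original intact, and errors='ignore' skips labels that do not exist instead of raising a KeyError. The column name 'Age' is a made-up label that is deliberately absent from the data.

```python
import pandas as pd

# Shortened version of the sample DataFrame (first three rows only)
df = pd.DataFrame({'Dept': ['ECE', 'ICE', 'IT'],
                   'GPA': [8.15, 9.03, 7.85],
                   'Name': ['Mohan', 'Gautam', 'Tanya'],
                   'RegNo': [111, 112, 113]})

# Without inplace=True, drop() returns a new DataFrame and leaves df unchanged
trimmed = df.drop(columns=['GPA', 'Name'])

# 'Age' is a hypothetical column that does not exist in df;
# errors='ignore' drops the labels that are present and skips the rest
safe = df.drop(columns=['GPA', 'Age'], errors='ignore')

print(trimmed.columns.tolist())
print(safe.columns.tolist())
print(df.columns.tolist())
```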
Method 4: Using the DataFrame.drop() function with the axis parameter
# Drop 'Dept' and 'GPA' columns using DataFrame.drop() function with axis parameter
df.drop(['Dept','GPA'], axis=1, inplace=True)
# Print the modified pandas DataFrame
print('Modified pandas DataFrame:\n')
print(df)
Output
Modified pandas DataFrame:
Name RegNo
0 Mohan 111
1 Gautam 112
2 Tanya 113
3 Rashmi 114
4 Kirti 115
5 Ravi 116
6 Sanjay 117
7 Naveen 118
8 Gaurav 119
9 Ram 120
10 Tom 121
Method 5: Using the DataFrame.drop() function with DataFrame.iloc[]
# Drop 'GPA' and 'Name' columns (positions 1 and 2) using DataFrame.drop() function and DataFrame.iloc[]
# .columns passes the selected column labels to drop() explicitly
df.drop(df.iloc[:, 1:3].columns, axis=1, inplace=True)
# Print the modified pandas DataFrame
print('Modified pandas DataFrame:\n')
print(df)
Output
Modified pandas DataFrame:
Dept RegNo
0 ECE 111
1 ICE 112
2 IT 113
3 CSE 114
4 CHE 115
5 EE 116
6 TE 117
7 ME 118
8 CSE 119
9 IPE 120
10 ECE 121
Method 6: Using the DataFrame.drop() function with DataFrame.columns[]
# Drop 'Dept' and 'Name' columns (positions 0 and 2) using DataFrame.drop() function and DataFrame.columns[]
df.drop(df.columns[[0,2]], axis=1, inplace=True)
# Print the modified pandas DataFrame
print('Modified pandas DataFrame:\n')
print(df)
Output
Modified pandas DataFrame:
GPA RegNo
0 8.15 111
1 9.03 112
2 7.85 113
3 8.55 114
4 9.45 115
5 7.45 116
6 8.85 117
7 9.35 118
8 6.53 119
9 8.85 120
10 7.83 121
Method 7: Selecting only the required columns
# Drop 'RegNo' and 'Dept' columns by selecting only the required columns
df2 = df[['Name','GPA']]
# Print the modified pandas DataFrame
print('Modified pandas DataFrame:\n')
print(df2)
Output
Modified pandas DataFrame:
Name GPA
0 Mohan 8.15
1 Gautam 9.03
2 Tanya 7.85
3 Rashmi 8.55
4 Kirti 9.45
5 Ravi 7.45
6 Sanjay 8.85
7 Naveen 9.35
8 Gaurav 6.53
9 Ram 8.85
10 Tom 7.83
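Selecting by name means listing every column to keep. When it is easier to list the columns to exclude, one possible alternative is a boolean mask built with Index.isin(). The sketch below shows both forms on a shortened version of the sample data; the names kept, unwanted and remaining are illustrative only.

```python
import pandas as pd

# Shortened version of the sample DataFrame (first three rows only)
df = pd.DataFrame({'Dept': ['ECE', 'ICE', 'IT'],
                   'GPA': [8.15, 9.03, 7.85],
                   'Name': ['Mohan', 'Gautam', 'Tanya'],
                   'RegNo': [111, 112, 113]})

# Keep only the listed columns; .copy() makes the result independent of df
kept = df[['Name', 'GPA']].copy()

# Inverse approach: name the columns to exclude and keep everything else
unwanted = ['RegNo', 'Dept']
remaining = df.loc[:, ~df.columns.isin(unwanted)]

print(kept.columns.tolist())
print(remaining.columns.tolist())
```

Note that the first form returns the columns in the order they were listed, while the mask keeps them in their original order.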
Method 8: Using the DataFrame.dropna() function
First, create a pandas DataFrame that contains NaN values. The snippet below does that.
# Import pandas Python module
import pandas as pd
# Import NumPy module
import numpy as np
# Create a pandas DataFrame object with NaN values
df = pd.DataFrame({'Dept': ['ECE', 'ICE', 'IT', 'CSE', 'CHE', 'EE', 'TE', 'ME', 'CSE', 'IPE', 'ECE'],
                   'GPA': [8.15, 9.03, 7.85, np.nan, 9.45, 7.45, np.nan, 9.35, 6.53, 8.85, 7.83],
                   'Name': ['Mohan', 'Gautam', 'Tanya', 'Rashmi', 'Kirti', 'Ravi', 'Sanjay', 'Naveen', 'Gaurav', 'Ram', 'Tom'],
                   'RegNo': [111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121],
                   'City': ['Biharsharif', 'Ranchi', np.nan, 'Patiala', 'Rajgir', 'Patna', np.nan, 'Mysore', np.nan, 'Mumbai', np.nan]})
# Print the created pandas DataFrame
print('Sample pandas DataFrame with NaN values:\n')
print(df)
Output
Sample pandas DataFrame with NaN values:
Dept GPA Name RegNo City
0 ECE 8.15 Mohan 111 Biharsharif
1 ICE 9.03 Gautam 112 Ranchi
2 IT 7.85 Tanya 113 NaN
3 CSE NaN Rashmi 114 Patiala
4 CHE 9.45 Kirti 115 Rajgir
5 EE 7.45 Ravi 116 Patna
6 TE NaN Sanjay 117 NaN
7 ME 9.35 Naveen 118 Mysore
8 CSE 6.53 Gaurav 119 NaN
9 IPE 8.85 Ram 120 Mumbai
10 ECE 7.83 Tom 121 NaN
Now we will drop the columns that contain NaN values.
# Drop columns with NaN values using the DataFrame.dropna() function
df2 = df.dropna(axis='columns')
# Print the modified pandas DataFrame
print('Modified pandas DataFrame:\n')
print(df2)
Output
Modified pandas DataFrame:
Dept Name RegNo
0 ECE Mohan 111
1 ICE Gautam 112
2 IT Tanya 113
3 CSE Rashmi 114
4 CHE Kirti 115
5 EE Ravi 116
6 TE Sanjay 117
7 ME Naveen 118
8 CSE Gaurav 119
9 IPE Ram 120
10 ECE Tom 121
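By default, dropna() drops a column as soon as it contains a single NaN. If that is too strict, the thresh and how parameters offer finer control. The following is a small sketch on made-up data modeled on the sample above, not the full DataFrame; the threshold of 3 is an arbitrary choice for illustration.

```python
import numpy as np
import pandas as pd

# Small illustrative DataFrame: 'GPA' has one NaN, 'City' has three
df = pd.DataFrame({'Dept': ['ECE', 'ICE', 'IT', 'CSE'],
                   'GPA': [8.15, 9.03, 7.85, np.nan],
                   'City': ['Biharsharif', np.nan, np.nan, np.nan]})

# thresh=3 keeps only the columns with at least 3 non-NaN values
mostly_complete = df.dropna(axis='columns', thresh=3)

# how='all' drops a column only when every value in it is NaN
not_all_nan = df.dropna(axis='columns', how='all')

print(mostly_complete.columns.tolist())
print(not_all_nan.columns.tolist())
```

Here 'GPA' survives the threshold because it has three valid values, while 'City' has only one and is dropped. No column is entirely NaN, so how='all' keeps all three.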
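Summary
In this tutorial, we learned several ways to drop multiple columns from a pandas DataFrame. We hope you now understand the methods discussed above and are excited to use them in your own data analysis projects. Thanks for reading! Stay tuned for more Python programming tutorials.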