Pandas provides two ways to sort: by label and by value. This article shows how to apply both to a DataFrame.
1. Pandas Sorting Example DataFrame Data
- The code below creates the example DataFrame object used for sorting:

      import numpy as np
      import pandas as pd


      def pandas_dataframe_sorting_example():
          # Create a 2-dimensional array with 5 rows and 3 columns; each element is a random float.
          data_array = np.random.randn(5, 3)
          # The index array contains unsorted row labels.
          index_array = [0, 2, 1, 6, 3]
          # The column name array, also unsorted.
          columns_array = ['column-3', 'column-1', 'column-2']
          # Create the unsorted DataFrame object from the arrays above.
          dataframe_unsorted = pd.DataFrame(data=data_array, index=index_array, columns=columns_array)
          print('dataframe_unsorted')
          print(dataframe_unsorted)


      if __name__ == '__main__':
          pandas_dataframe_sorting_example()

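- Below is the output of the code above. Neither the row labels nor the numeric values are sorted. Because the data comes from np.random.randn, the values differ from run to run. The following sections sort this DataFrame by label and by value.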

      dataframe_unsorted
         column-3  column-1  column-2
      0  0.395101 -0.051456  0.327673
      2 -1.417987  0.636136 -0.068395
      1  0.088765  0.672521 -0.195716
      6  0.600821  0.814108 -0.086112
      3  1.243266  0.558752 -0.703006

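The example data is random, so the numbers printed above cannot be reproduced exactly. A minimal sketch of one way to get repeatable data is shown here. It swaps np.random.randn for a seeded np.random.default_rng generator; the function name make_unsorted_frame and the seed value 0 are assumptions made for illustration, not part of the original example.

```python
import numpy as np
import pandas as pd


def make_unsorted_frame(seed=0):
    """Build the 5x3 example frame from a seeded generator (seed is an assumption)."""
    rng = np.random.default_rng(seed)
    data_array = rng.standard_normal((5, 3))
    return pd.DataFrame(
        data=data_array,
        index=[0, 2, 1, 6, 3],
        columns=['column-3', 'column-1', 'column-2'],
    )


frame_a = make_unsorted_frame()
frame_b = make_unsorted_frame()

# The same seed produces the same values, so the two frames are identical.
print(frame_a.equals(frame_b))
# Neither the row labels nor the column labels are in sorted order yet.
print(frame_a.index.is_monotonic_increasing)
print(frame_a.columns.is_monotonic_increasing)
```

With a fixed seed, every later sorting result can be compared across runs.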
2. Pandas DataFrame Sort by Label Example
- The Pandas DataFrame method **sort_index(axis, ascending)** sorts a DataFrame object by its labels.
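2.1 Sort the DataFrame Object by Row Labels
- When you pass no arguments to the method, it sorts the DataFrame object by row label in ascending order. By default it returns a new DataFrame and leaves the original unchanged.

      dataframe_unsorted.sort_index()
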
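- This is because the axis parameter defaults to 0 and the ascending parameter defaults to True, so the call above is equivalent to:

      dataframe_unsorted.sort_index(axis = 0, ascending = True)
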
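The row-label sorting described in this section can be sketched with a small hand-written frame. The values and variable names below are illustrative assumptions, not the article's random data; only the unsorted index [0, 2, 1, 6, 3] is taken from the example.

```python
import pandas as pd

# Hand-written values (illustrative), with the same unsorted row labels as the example.
frame = pd.DataFrame(
    {
        'column-3': [0.4, -1.4, 0.1, 0.6, 1.2],
        'column-1': [-0.1, 0.6, 0.7, 0.8, 0.5],
        'column-2': [0.3, -0.1, -0.2, -0.1, -0.7],
    },
    index=[0, 2, 1, 6, 3],
)

by_row_default = frame.sort_index()                          # defaults only
by_row_explicit = frame.sort_index(axis=0, ascending=True)   # defaults spelled out
by_row_descending = frame.sort_index(ascending=False)

print(by_row_default.index.tolist())           # [0, 1, 2, 3, 6]
print(by_row_default.equals(by_row_explicit))  # True
print(by_row_descending.index.tolist())        # [6, 3, 2, 1, 0]
print(frame.index.tolist())                    # [0, 2, 1, 6, 3], original is unchanged
```

Each row keeps its own values; only the order of the rows changes.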
2.2 Sort the DataFrame Object by Column Labels
- To sort the DataFrame object by column label, pass axis = 1 to the sort_index() method.

      dataframe_sort_by_column_label = dataframe_unsorted.sort_index(axis = 1, ascending=False)

- Passing ascending = False, as in the call above, sorts the column labels in descending order.
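As a rough illustration of column-label sorting, the sketch below reorders the columns of a small hand-written frame in both directions. The values and variable names are assumptions chosen for readability.

```python
import pandas as pd

# Hand-written values (illustrative), with the same unsorted column names as the example.
frame = pd.DataFrame(
    {
        'column-3': [0.4, -1.4, 0.1],
        'column-1': [-0.1, 0.6, 0.7],
        'column-2': [0.3, -0.1, -0.2],
    },
    index=[0, 2, 1],
)

by_column_ascending = frame.sort_index(axis=1)
by_column_descending = frame.sort_index(axis=1, ascending=False)

print(by_column_ascending.columns.tolist())   # ['column-1', 'column-2', 'column-3']
print(by_column_descending.columns.tolist())  # ['column-3', 'column-2', 'column-1']
# Sorting columns does not touch the row order.
print(by_column_descending.index.tolist())    # [0, 2, 1]
```

The data in each column moves together with its label, so only the left-to-right order of the columns changes.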
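3. Pandas DataFrame Sort by Value Example
- The DataFrame method **sort_values(by, kind)** sorts a DataFrame object by its values.
- The by parameter specifies one column or a list of columns to sort by.
- The kind parameter selects the sorting algorithm. It accepts heapsort, mergesort, and quicksort; recent pandas versions also accept stable.
- For a DataFrame, the kind parameter only takes effect when sorting by a single column. The default is quicksort. mergesort is a stable algorithm, meaning rows with equal values keep their original relative order, so choose it when that order matters.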
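The parameters listed above can be sketched as follows. The frame deliberately contains repeated values in column-1 so that tie handling is visible; the data, the string row labels, and the variable names are assumptions made for this illustration.

```python
import pandas as pd

# column-1 contains ties (two 1s and two 2s) so the effect of each call is visible.
frame = pd.DataFrame(
    {
        'column-1': [2, 1, 2, 1],
        'column-2': [0.5, 0.9, 0.1, 0.3],
    },
    index=['a', 'b', 'c', 'd'],
)

# One column with a stable algorithm: tied rows keep their original relative order.
by_one_column = frame.sort_values(by='column-1', kind='mergesort')
print(by_one_column.index.tolist())  # ['b', 'd', 'a', 'c']

# Several columns: ties in column-1 are broken by column-2.
by_two_columns = frame.sort_values(by=['column-1', 'column-2'])
print(by_two_columns.index.tolist())  # ['d', 'b', 'c', 'a']

# Several columns in descending order.
by_two_columns_descending = frame.sort_values(by=['column-1', 'column-2'], ascending=False)
print(by_two_columns_descending.index.tolist())  # ['a', 'c', 'b', 'd']
```

In the first call, rows b and d share the value 1 and stay in their original order because mergesort is stable; with quicksort or heapsort that order is not guaranteed.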
4. Pandas DataFrame Sorting Example Source Code
- Below is the complete source code of this example.

      import pandas as pd
      import numpy as np


      def pandas_dataframe_sorting_example():
          # Create a 2-dimensional array with 5 rows and 3 columns; each element is a random float.
          data_array = np.random.randn(5, 3)
          # The index array contains unsorted row labels.
          index_array = [0, 2, 1, 6, 3]
          # The column name array, also unsorted.
          columns_array = ['column-3', 'column-1', 'column-2']
          # Create the unsorted DataFrame object from the arrays above.
          dataframe_unsorted = pd.DataFrame(data=data_array, index=index_array, columns=columns_array)
          print('dataframe_unsorted')
          print(dataframe_unsorted)

          # Sort the DataFrame by row index label in ascending order.
          dataframe_sort_by_row_index_label_ascending = dataframe_unsorted.sort_index(ascending=True)
          print("\ndataframe_sort_by_row_index_label_ascending = dataframe_unsorted.sort_index(ascending=True)")
          print(dataframe_sort_by_row_index_label_ascending)

          # Sort the DataFrame by row index label in descending order.
          dataframe_sort_by_row_index_label_descending = dataframe_unsorted.sort_index(ascending=False)
          print("\ndataframe_sort_by_row_index_label_descending = dataframe_unsorted.sort_index(ascending=False)")
          print(dataframe_sort_by_row_index_label_descending)

          # Sort the DataFrame by column label in descending order.
          dataframe_sort_by_column_label = dataframe_unsorted.sort_index(axis=1, ascending=False)
          print("\ndataframe_sort_by_column_label = dataframe_unsorted.sort_index(axis = 1, ascending=False)")
          print(dataframe_sort_by_column_label)

          # Sort the DataFrame by the values of one column.
          dataframe_sort_by_column_value = dataframe_unsorted.sort_values(by='column-1')
          print("\ndataframe_sort_by_column_value = dataframe_unsorted.sort_values(by='column-1')")
          print(dataframe_sort_by_column_value)

          # When two rows have the same column-1 value, order them by the column-2 value.
          dataframe_sort_by_multiple_columns_value = dataframe_unsorted.sort_values(by=['column-1', 'column-2'], ascending=False)
          print("\ndataframe_sort_by_multiple_columns_value = dataframe_unsorted.sort_values(by=['column-1','column-2'], ascending=False)")
          print(dataframe_sort_by_multiple_columns_value)

          # Sort by the values of one column using the heapsort algorithm.
          dataframe_sorting_algorithm = dataframe_unsorted.sort_values(by='column-1', kind='heapsort')
          print("\ndataframe_sorting_algorithm = dataframe_unsorted.sort_values(by='column-1' ,kind='heapsort')")
          print(dataframe_sorting_algorithm)


      if __name__ == '__main__':
          pandas_dataframe_sorting_example()

- Below is the output produced by running the example source code above:

      dataframe_unsorted
         column-3  column-1  column-2
      0  0.395101 -0.051456  0.327673
      2 -1.417987  0.636136 -0.068395
      1  0.088765  0.672521 -0.195716
      6  0.600821  0.814108 -0.086112
      3  1.243266  0.558752 -0.703006

      dataframe_sort_by_row_index_label_ascending = dataframe_unsorted.sort_index(ascending=True)
         column-3  column-1  column-2
      0  0.395101 -0.051456  0.327673
      1  0.088765  0.672521 -0.195716
      2 -1.417987  0.636136 -0.068395
      3  1.243266  0.558752 -0.703006
      6  0.600821  0.814108 -0.086112

      dataframe_sort_by_row_index_label_descending = dataframe_unsorted.sort_index(ascending=False)
         column-3  column-1  column-2
      6  0.600821  0.814108 -0.086112
      3  1.243266  0.558752 -0.703006
      2 -1.417987  0.636136 -0.068395
      1  0.088765  0.672521 -0.195716
      0  0.395101 -0.051456  0.327673

      dataframe_sort_by_column_label = dataframe_unsorted.sort_index(axis = 1, ascending=False)
         column-3  column-2  column-1
      0  0.395101  0.327673 -0.051456
      2 -1.417987 -0.068395  0.636136
      1  0.088765 -0.195716  0.672521
      6  0.600821 -0.086112  0.814108
      3  1.243266 -0.703006  0.558752

      dataframe_sort_by_column_value = dataframe_unsorted.sort_values(by='column-1')
         column-3  column-1  column-2
      0  0.395101 -0.051456  0.327673
      3  1.243266  0.558752 -0.703006
      2 -1.417987  0.636136 -0.068395
      1  0.088765  0.672521 -0.195716
      6  0.600821  0.814108 -0.086112

      dataframe_sort_by_multiple_columns_value = dataframe_unsorted.sort_values(by=['column-1','column-2'], ascending=False)
         column-3  column-1  column-2
      6  0.600821  0.814108 -0.086112
      1  0.088765  0.672521 -0.195716
      2 -1.417987  0.636136 -0.068395
      3  1.243266  0.558752 -0.703006
      0  0.395101 -0.051456  0.327673

      dataframe_sorting_algorithm = dataframe_unsorted.sort_values(by='column-1' ,kind='heapsort')
         column-3  column-1  column-2
      0  0.395101 -0.051456  0.327673
      3  1.243266  0.558752 -0.703006
      2 -1.417987  0.636136 -0.068395
      1  0.088765  0.672521 -0.195716
      6  0.600821  0.814108 -0.086112