Analyzing Activity in Windows Folders with Pandas

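In PowerShell, change to the folder you want to analyze, then run the two commands below. Each one groups the folder's entries by creation date; the first saves the dates and the second saves how many entries were created on each date.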

dir | group {$_.CreationTime.ToShortDateString()} | select Name > Name.csv

dir | group {$_.CreationTime.ToShortDateString()} | select Count > Count.csv
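Despite the .csv extension, these files hold PowerShell's formatted table output rather than comma-separated data: a blank line, the column title, a dashed underline, the values, and some trailing blank lines. The sketch below shows how pandas reads that layout and why the first two rows have to be discarded. The SAMPLE string is a made-up stand-in for a real Name.csv, and the day/month/year dates in it are an assumption about the system locale.

```python
import io

import pandas as pd

# Hypothetical stand-in for Name.csv: a blank line, the column title,
# a dashed underline, two values, then trailing blank lines.
SAMPLE = "\nName\n----\n01/02/2021\n15/03/2021\n\n\n"

# names=[...] tells pandas there is no header row, so the title and
# the underline arrive as ordinary rows 0 and 1.
raw = pd.read_csv(io.StringIO(SAMPLE), names=["Name"], skip_blank_lines=True)

# Discard the title and underline rows, then parse what is left as dates.
cleaned = raw.drop([0, 1])
cleaned["Name"] = pd.to_datetime(cleaned["Name"], dayfirst=True)
print(cleaned)
```

A real file written by Windows PowerShell's > operator is UTF-16 encoded, which is why the script in the next section passes encoding='utf-16'. Newer PowerShell versions default to UTF-8, so that argument may need to change there.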

Processing with Pandas

import pandas as pd
import matplotlib.pyplot as plt

# Windows PowerShell's ">" writes UTF-16 text, hence encoding='utf-16'.
# Each file is a formatted one-column table, not a true CSV.
data = pd.read_csv("Name.csv", skip_blank_lines=True, encoding='utf-16', names=['Name'])
data2 = pd.read_csv("Count.csv", skip_blank_lines=True, encoding='utf-16', names=['Count'])

# Drop empty rows, then line the two columns up on their row index.
df = data.dropna(how="all")
df2 = data2.dropna(how="all")
df = df.join(df2)

# Rows 0 and 1 hold the column title and the dashed underline, not data.
df = df.drop([0, 1])

# dayfirst=True assumes a day/month/year short date format; adjust for your locale.
df['Name'] = pd.to_datetime(df['Name'], dayfirst=True)
df['Count'] = pd.to_numeric(df['Count'])

df = df.sort_values(by=['Name'])
print(df)
df.plot(kind='scatter', x='Name', y='Count')

# Save the cleaned table, then display the scatter plot.
df.to_csv("plt.csv")
plt.show()
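As an alternative, the PowerShell step can be skipped by collecting the creation dates in Python. This is only a sketch and not the method used for the plots below: creation_counts is a name made up for this example, and it assumes the file system reports a creation time. On Windows that is st_birthtime (Python 3.12+) or st_ctime; on Linux st_ctime is the last metadata change instead, so the results would mean something different there.

```python
from collections import Counter
from datetime import datetime
from pathlib import Path

import pandas as pd


def _created(entry):
    """Return the creation timestamp of a path, as far as the OS exposes it."""
    info = entry.stat()
    try:
        return info.st_birthtime  # Windows (Python 3.12+), macOS, BSD
    except AttributeError:
        return info.st_ctime  # creation time on older Windows Pythons only


def creation_counts(folder):
    """Count the entries directly inside `folder` per creation date."""
    dates = Counter(
        datetime.fromtimestamp(_created(entry)).date()
        for entry in Path(folder).iterdir()
    )
    # Same column names as the script above, so the plotting call is reusable.
    df = pd.DataFrame(sorted(dates.items()), columns=["Name", "Count"])
    df["Name"] = pd.to_datetime(df["Name"])
    return df
```

Calling creation_counts on a folder path returns a frame with the same Name and Count columns, so the same df.plot(kind='scatter', x='Name', y='Count') call should work on it. Like dir without -Recurse, it only looks at the top level of the folder.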

Downloads folder

fixed.png

C:\Program Files

X64.png

C:\Program Files (x86)

X86.png

C:\Windows

win.png