Machine Learning Stage 1: Python Data Analysis and Modeling Libraries (1)


These notes are for my own study only. Original video for the notes: www.bilibili.com/video/BV1Ns…

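1. Variable types

1.1 Creating variables

variable_name = value assigned to the variable
The variable's type is determined by the type of the value assigned to it.
For example, after days = 365, days is an integer (int).

Words in a name are usually joined with an underscore "_", e.g. number_of_days = 365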

1.2 Output

Print an integer value: print(365)
Print a string value: print('hello world')
Print a variable: print(number_of_days)

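1.3 Checking a variable's type

First assign three variables:
str_test = "china"     ## string
int_test = 123         ## integer
float_test = 122.5     ## floating-point number

The type() function reports the type:
print(type(str_test))  ## <class 'str'>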
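
As a small sketch of the same idea, the snippet below checks all three variables at once. `type(value).__name__` and `isinstance()` are standard Python; the list name `type_names` is only an illustrative choice.

```python
str_test = "china"
int_test = 123
float_test = 122.5

# type(value).__name__ gives the bare type name without the <class '...'> wrapper.
type_names = [type(value).__name__ for value in (str_test, int_test, float_test)]
print(type_names)                     # ['str', 'int', 'float']

# isinstance() answers a yes/no question about a value's type.
print(isinstance(float_test, float))  # True
print(isinstance(int_test, str))      # False
```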

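1.4 Type conversion

1.4.1

A value such as 123 read from a text file is actually a string, so it cannot be used in arithmetic until it is converted.
Example: converting between int and string

str_eight = str(8)  ## int to string
print(type(str_eight))

Result: <class 'str'>

str_eight = "8"
int_eight = int(str_eight)  ## string to int
print(type(int_eight))

Result: <class 'int'>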

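1.4.2

str_test = 'test'
This string cannot be converted to an integer: int(str_test) raises a ValueError. Only strings that represent a number can be converted.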
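
One possible way to guard against a failed conversion is to catch the ValueError that int() raises. This is a minimal sketch; the helper name `to_int_or_none` is hypothetical and not part of the original notes.

```python
def to_int_or_none(text):
    """Return int(text), or None when text does not represent an integer."""
    try:
        return int(text)
    except ValueError:
        # int() raises ValueError for strings such as 'test'
        return None

print(to_int_or_none("123"))   # 123
print(to_int_or_none("test"))  # None
```

Returning None lets the caller decide what to do with bad input instead of stopping the program.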

1.5 Arithmetic

The " ** " operator

china = 10
united_states = 100
print(china ** 2)

Result: 100, i.e. 10 to the power of 2
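
For comparison, here is a short sketch of ** next to a few other arithmetic operators, reusing the two variables from this section. The result variable names are illustrative.

```python
china = 10
united_states = 100

power = china ** 2                    # exponentiation
quotient = united_states / china      # true division always gives a float
floor_quotient = united_states // 3   # floor division drops the fractional part
remainder = united_states % 3         # remainder of the division

print(power, quotient, floor_quotient, remainder)  # 100 10.0 33 1
```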

🔺 2. The list structure (a collection that can hold many things)

2.1 Declaring a list ( variable_name = [] )

For example:

months = []
print(type(months))

Result: <class 'list'>

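2.2 Storing values: append() adds elements in order

Continuing from 2.1:
print(months)
The output is [], so the list is empty.

Example 1:

months.append("jan")
months.append("feb")
print(months)

The output is ['jan', 'feb'], a list of two elements.

Example 2:

months = []  ## start again from an empty list
months.append(1)
months.append("jan")
months.append(2)
months.append("feb")
print(months)

The output is [1, 'jan', 2, 'feb'].
A list can hold values of any type.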

2.3 Retrieving elements by index

countries = []
temperatures = []
countries.append("china")
countries.append("india")
countries.append("US")

temperatures.append(30.5)
temperatures.append(25.0)
temperatures.append(15.1)

print(countries)
print(temperatures)

Output:
['china', 'india', 'US']
[30.5, 25.0, 15.1]
"china" has index 0.

To retrieve an element in Python, give its index in square brackets.

china = countries[0]  ## 0 is the index
china_temperature = temperatures[0]  ## same position in the parallel list
print(china)
print(china_temperature)

Output:
china
30.5

2.4 len

Counts the number of elements in a list.

int_months = [1,2,3,4,5]
length = len(int_months)
print(length)

The output is 5.

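2.5 Getting the last element

int_months = [1,2,3,4,5]
index = len(int_months) - 1  ## the index of the last element is the length minus 1
last_value = int_months[index]
print(last_value)

Result: 5

Python also supports negative indexes:
print(int_months[-1])  ## the last element of the list
print(int_months[-2])  ## the second-to-last element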
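
The two approaches give the same element, which the following sketch checks. It also shows that an empty list has no last element; the variable names are illustrative.

```python
int_months = [1, 2, 3, 4, 5]

last_by_length = int_months[len(int_months) - 1]  # index computed from the length
last_by_negative = int_months[-1]                 # negative index counts from the end
second_to_last = int_months[-2]
print(last_by_length, last_by_negative, second_to_last)  # 5 5 4

# Indexing an empty list raises IndexError, even with -1.
empty = []
try:
    empty[-1]
    empty_failed = False
except IndexError:
    empty_failed = True
print(empty_failed)  # True
```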

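2.6 Taking a contiguous run of values from a list (a slice)

months = ["1","9","8","5","2"]
two_four = months[2:4]
print(two_four)

The ":" denotes a contiguous range: it starts at index 2 and includes that element, and it ends at index 4 without including that element. In other words, the start is included and the end is excluded.
Result: ['8', '5']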

2.7 Taking every element from a given position onward

months = ["1","9","8","5","2","6","4"]
three_six = months[3:]
## to take everything from index 3 onward, omit the end position
print(three_six)

Result: ['5', '2', '6', '4']
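
A few more slice forms follow the same start-included, end-excluded rule. This sketch reuses the list from 2.7; the variable names are illustrative.

```python
months = ["1", "9", "8", "5", "2", "6", "4"]

first_three = months[:3]   # omit the start to begin at index 0
last_two = months[-2:]     # negative start counts from the end
every_other = months[::2]  # a third number is the step

print(first_three)  # ['1', '9', '8']
print(last_two)     # ['6', '4']
print(every_other)  # ['1', '8', '2', '4']
```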

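3. Loops

3.1 for loop

3.1.1 A list with several elements

cities=["beijing","shanghai","guangzhou"]
for city in cities: ## note the trailing ":"
    print(city) ## indented by 4 spaces
    ## indentation marks the body of the for loop
    ## indented lines are still inside the loop; the first unindented line ends it

Result:
beijing (the 1st city)
shanghai (the 2nd city)
guangzhou (the 3rd city)
city is a name bound in turn to each element of cities, so you can call methods on city.
"in" is followed by the thing to iterate over.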
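
When the position of each element is needed as well, the built-in enumerate() can supply it. A minimal sketch, where the list `numbered` is an illustrative name:

```python
cities = ["beijing", "shanghai", "guangzhou"]

numbered = []
# enumerate() yields (position, element) pairs; start=1 makes counting begin at 1.
for position, city in enumerate(cities, start=1):
    line = f"{position}: {city}"
    numbered.append(line)
    print(line)
```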

3.1.2 range

for i in range(10): ## range(10) yields the 10 numbers 0 through 9; 10 itself is excluded
    print(i)

Result:
0
1
2
3
4
5
6
7
8
9
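
range() also accepts a start and a step. The sketch below wraps each range in list() only so the values can be printed at once; the variable names are illustrative.

```python
from_five = list(range(5, 8))      # start at 5, stop before 8
evens = list(range(0, 10, 2))      # step of 2
countdown = list(range(3, 0, -1))  # negative step counts down

print(from_five)  # [5, 6, 7]
print(evens)      # [0, 2, 4, 6, 8]
print(countdown)  # [3, 2, 1]
```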

3.1.3 A list whose elements are lists

cities = [["austin","dallas","houston"],["beijing","shanghai","guangzhou"]]
print(cities)
for city in cities:
    print(city)

for i in cities:
    for j in i:
        print(j)
## two nested for loops reach every element of the inner lists

Result:
print(cities):
[['austin', 'dallas', 'houston'], ['beijing', 'shanghai', 'guangzhou']]

print(city):
['austin', 'dallas', 'houston']
['beijing', 'shanghai', 'guangzhou']

print(j):
austin
dallas
houston
beijing
shanghai
guangzhou
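
The same two-level loop can collect the inner elements into one flat list instead of printing them. This is a sketch: `flat_loop` and `flat_comprehension` are illustrative names, and the list comprehension is simply a more compact spelling of the nested loops.

```python
cities = [["austin", "dallas", "houston"], ["beijing", "shanghai", "guangzhou"]]

# Nested loops: the outer loop walks the groups, the inner loop walks each group.
flat_loop = []
for group in cities:
    for city in group:
        flat_loop.append(city)

# Equivalent list comprehension; the for clauses appear in the same order.
flat_comprehension = [city for group in cities for city in group]

print(flat_loop)
print(flat_loop == flat_comprehension)  # True
```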

3.2 while loop

i=0
while i < 3:
    i+=1
    print(i)

Result:
1
2
3
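
A while loop can also be ended from inside its body with break. The sketch below produces the same 1, 2, 3 sequence that way; the list `seen` is an illustrative name.

```python
i = 0
seen = []
while True:        # the condition is always true, so break must end the loop
    i += 1
    if i > 3:
        break      # leave the loop as soon as i passes 3
    seen.append(i)

print(seen)  # [1, 2, 3]
```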

4. Conditionals

4.1 Boolean type

4.1.1

cat = True  ## Python's Boolean literals are capitalized: True and False
print(type(cat))

Result: <class 'bool'>

4.1.2

print( 8==8 ) ## True; = assigns, == compares
print( 8!=8 ) ## False
print( 8==10 ) ## False
print( 8!=10 ) ## True

4.1.3

print("8" == "8") ## True
print(["jan","feb"] == ["jan","feb"]) ## True
print(5.0 == 5.0) ## True

4.1.4

rates = [10,15,20]
print(rates[0] > rates[1])  ## False
print(rates[0] >= rates[0]) ## True

4.2 Branching (if / else)

4.2.1

sample_rate = 700
greater = (sample_rate > 5)
if greater: ## if is followed by the condition to test (when it is true, the print() below runs)
    print(sample_rate)
else:
    print('less than')

Result: 700
Note: in Python, "if 0:" is treated as false.
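
Zero is not the only value that an if treats as false. The sketch below uses the built-in bool() to show how a few common values are judged; the sample values are arbitrary.

```python
values = [0, 1, "", "text", [], [0], None]

# bool(value) gives the same verdict that "if value:" would use.
truthiness = [bool(value) for value in values]
print(truthiness)  # [False, True, False, True, False, True, False]
```

Zero, empty strings, empty lists, and None all count as false; most other values count as true.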

4.2.2

t = True
f = False

if t:
    print("now you see me")
if f:
    print("now you don't")

Result: now you see me

4.3 Checking whether an element exists

Method 1

animals = ["cat","dog","rabbit"]
for animal in animals:
    if animal == "cat":
        print("cat found")

Result: cat found

Method 2

animals = ["cat","dog","rabbit"]
if "cat" in animals:
    print("cat found")

Result: cat found

Method 3

animals = ["cat","dog","rabbit"]
cat_found = "cat" in animals
print(cat_found)

Result: True

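5. The dictionary structure

5.1 Motivating example:

students = ["tom","jim","sue","ann"]
scores = [70,80,85,75] ## matched one-to-one with the names

indexes = [0,1,2,3]
name = "sue"
score = 0  ## initial value
for i in indexes:
    if students[i] == name: ## scan for the index of "sue"
        score = scores[i]
print(score)

Result: 85
This takes two lookups (first the index, then the score), which is a bit cumbersome.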
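
Staying with two parallel lists, the lookup can be shortened a little with the built-in zip() or the list method index(). This is only a sketch of those alternatives, with `score_by_index` as an illustrative name.

```python
students = ["tom", "jim", "sue", "ann"]
scores = [70, 80, 85, 75]
name = "sue"

# zip() pairs each student with the score at the same position.
score = None
for student, student_score in zip(students, scores):
    if student == name:
        score = student_score
        break

# index() finds the position of the name, which then selects the score.
score_by_index = scores[students.index(name)]

print(score, score_by_index)  # 85 85
```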

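5.2 Improving the example with a dictionary:

scores = {} ## key/value pairs; a dictionary is defined with {}

##print(type(scores)) = <class 'dict'>
##each key maps to one value

scores["jim"] = 80 ## creates a key named "jim" with the value 80
scores["sue"] = 85
scores["ann"] = 75

## print(scores.keys()) = dict_keys(['jim', 'sue', 'ann'])

print(scores)
print(scores["sue"])

Result:
{'jim': 80, 'sue': 85, 'ann': 75}
85
A dictionary holds keys and values, and each key maps to exactly one value. Since Python 3.7, a dictionary keeps its keys in insertion order.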
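
To walk through every key and value, a dictionary offers items(). The sketch below also finds the top scorer with max(); the names `pairs` and `highest` are illustrative.

```python
scores = {"jim": 80, "sue": 85, "ann": 75}

pairs = []
# items() yields (key, value) tuples in insertion order.
for name, score in scores.items():
    pairs.append((name, score))
    print(name, score)

# max() over the keys, ranked by the value each key maps to.
highest = max(scores, key=scores.get)
print(highest)  # sue
```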

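5.3 Different ways to write key/value pairs

students = {}         ## define an empty dictionary
students["tom"] = 60  ## add values to the empty dictionary
students["jim"] = 70
print(students)

## write the key/value pairs directly inside {}
students = {
    "tom":60,
    "jim":70
}
print(students)

Result:
{'tom': 60, 'jim': 70}
{'tom': 60, 'jim': 70}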

5.4 Changing the value stored under a key

students = {
    "tom":60,
    "jim":70
}
students["tom"] = 65
print(students)
students["tom"] = students["tom"] + 5
print(students)

Result:
{'tom': 65, 'jim': 70}
{'tom': 70, 'jim': 70}

5.5 Checking whether a key exists in a dictionary

students = {
    "tom":60,
    "jim":70
}
print('tom' in students)  ## "in" tests the keys, not the values

Result:
True
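
Because in looks at keys, checking for a value needs values(), and reading a key that may be absent is safer with get(). A minimal sketch; the key "sue" and the default 0 are arbitrary choices for illustration.

```python
students = {"tom": 60, "jim": 70}

key_present = "tom" in students             # True: "tom" is a key
value_as_key = 60 in students               # False: 60 is a value, not a key
value_present = 60 in students.values()     # True: search the values explicitly

missing = students.get("sue")               # None instead of a KeyError
missing_with_default = students.get("sue", 0)

print(key_present, value_as_key, value_present)  # True False True
print(missing, missing_with_default)             # None 0
```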

5.6 Using a dictionary for counting

pantry = ["apple","orange","grape","apple","orange","apple","tomato","potato","grape"]
pantry_counts = {} ## empty dictionary

for item in pantry: ## iterate over the list
    if item in pantry_counts: ## the item is already a key, so add 1
        pantry_counts[item] = pantry_counts[item] + 1
    else:  ## the item is not a key yet, so create a pair: key => item, value => 1
        pantry_counts[item] = 1
print(pantry_counts)

Result:
{'apple': 3, 'orange': 2, 'grape': 2, 'tomato': 1, 'potato': 1}
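
The standard library's collections.Counter does the same tally in one call. It is shown here as an alternative to the hand-written loop, not as the method used in the course.

```python
from collections import Counter

pantry = ["apple", "orange", "grape", "apple", "orange",
          "apple", "tomato", "potato", "grape"]

# Counter builds the item -> count mapping directly from the list.
pantry_counts = Counter(pantry)

print(pantry_counts["apple"])        # 3
print(pantry_counts.most_common(1))  # [('apple', 3)]
```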

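6. File handling

First create a test.txt in the local folder with the following content:

[Screenshot 2022-03-30 203933.jpg: contents of test.txt]

6.1 Reading test.txt

f = open("test.txt","r") ## use the open() function
## first argument: the path of the file to open; a bare file name is enough when the file is in the directory Python is running from, otherwise give the full path
## second argument: "r" means read
## in "r" mode the file must already exist

g = f.read() ## read() returns the whole contents of the file
print(g)

f.close() ## close the file when finished

Result: the contents shown in the screenshot above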
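
The open, read, close sequence is often written with a with block, which closes the file automatically when the block ends. In this sketch the file name example.txt and its two lines are invented so the example can run on its own; they are not the contents of test.txt.

```python
import os
import tempfile

# Create a throwaway file so the example does not depend on test.txt existing.
path = os.path.join(tempfile.mkdtemp(), "example.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("first line\nsecond line\n")

# The with block closes the file on exit, even if an error occurs inside it.
with open(path, "r", encoding="utf-8") as f:
    content = f.read()

print(content)
print(f.closed)  # True
```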

6.2 Writing to a file

f = open("test_write.txt","w") ## in write mode the file does not need to exist beforehand
## "w" is write mode

f.write('123456')
f.write('\n') ## newline
f.write('234567')

f.close()
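
One point worth checking is that "w" replaces whatever the file already held, while "a" appends to it. The sketch below writes to a temporary directory so that no real test_write.txt is touched; that location is an assumption made for the example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test_write.txt")

with open(path, "w") as f:   # "w" creates the file (or empties an existing one)
    f.write("123456\n")
with open(path, "a") as f:   # "a" keeps the old content and adds to the end
    f.write("234567\n")
with open(path) as f:
    lines = f.read().splitlines()
print(lines)  # ['123456', '234567']

with open(path, "w") as f:   # opening with "w" again discards both lines
    f.write("overwritten\n")
with open(path) as f:
    after_overwrite = f.read()
print(after_overwrite)
```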

6.3 Reading a file into a list

The contents of la_weather.csv are shown in the screenshot:

[Screenshot 2022-03-30 205849.jpg: contents of la_weather.csv]
Treat each line as one element; elements are separated by newlines.

weather_data = []
f = open("la_weather.csv",'r')
data = f.read()
rows = data.split('\n') ## split() cuts the data into lines at each \n
##print(rows)==>['1,Sunny','2,Sunny','3,Sunny','4,Sunny','5,Sunny','6,Sunny','7,Sunny','8,Sunny','9,Fog','10,Rain']

for row in rows:
    split_row = row.split(",") ## split again at "," to separate the number from the weather
    weather_data.append(split_row)
print(weather_data)
f.close()

Result:
[['1', 'Sunny'], ['2', 'Sunny'], ['3', 'Sunny'], ['4', 'Sunny'], ['5', 'Sunny'], ['6', 'Sunny'], ['7', 'Sunny'], ['8', 'Sunny'], ['9', 'Fog'], ['10', 'Rain']]
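
The standard library's csv module performs the same line and comma splitting. The sketch below feeds it a four-row sample through io.StringIO instead of the real la_weather.csv; the sample text is an assumption modeled on the rows shown above.

```python
import csv
import io

# A short stand-in for the file contents, ending with a newline as many files do.
sample = "1,Sunny\n2,Sunny\n9,Fog\n10,Rain\n"

# Splitting by hand leaves an empty string after the final newline.
manual_rows = sample.split("\n")
print(manual_rows[-1] == "")  # True

# csv.reader yields one list of fields per line and produces no such extra row.
weather_data = [row for row in csv.reader(io.StringIO(sample))]
print(weather_data)  # [['1', 'Sunny'], ['2', 'Sunny'], ['9', 'Fog'], ['10', 'Rain']]
```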

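If you only need each day's weather, without the numbers:

weather_data = []
f = open("la_weather.csv",'r')
data = f.read()
rows = data.split('\n')
for row in rows:
    weather_data.append(row.split(","))  ## rebuild the [number, weather] pairs
weather = []
for row in weather_data:
    weather.append(row[1])
## row[1] holds only the weather
## with row[0] instead, every printed element would be a number
print(weather)
f.close()

Result: ['Sunny', 'Sunny', 'Sunny', 'Sunny', 'Sunny', 'Sunny', 'Sunny', 'Sunny', 'Fog', 'Rain']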
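
Combining this list with the counting pattern from section 5.6 gives a tally of each kind of weather. In this sketch the list is typed in directly to match the result above, and get() with a default of 0 stands in for the if/else.

```python
weather = ["Sunny"] * 8 + ["Fog", "Rain"]

weather_counts = {}
for kind in weather:
    # get() returns 0 the first time a kind is seen, so no if/else is needed.
    weather_counts[kind] = weather_counts.get(kind, 0) + 1

print(weather_counts)  # {'Sunny': 8, 'Fog': 1, 'Rain': 1}
```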

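7. Function basics

7.1 Defining a function

Use the keyword def:
def function_name(parameters are optional):

def printHello():
    print('hello world')

def printNum():
    for i in range(0,10):
        print(i)
    return  ## a bare return hands back None

def add(a,b):
    return a+b  ## return the sum so the caller can print it

printHello()
printNum()
print(add(1,3))

Result: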
hello world
0
1
2
3
4
5
6
7
8
9
4
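
The difference between printing inside a function and returning a value is easy to miss, so here is a small sketch of it. The function `show_sum` and the default value `b=0` are illustrative additions, not part of the original notes.

```python
def add(a, b=0):
    """Return the sum of a and b; b defaults to 0 when omitted."""
    return a + b

def show_sum(a, b):
    """Print the sum; with no return statement the function returns None."""
    print(a + b)

total = add(1, 3)          # the returned value can be stored and reused
no_value = show_sum(1, 3)  # prints 4, but nothing useful comes back

print(total)     # 4
print(add(5))    # 5, because b falls back to its default
print(no_value)  # None
```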




print("tired like a dog")
print("day1 over")