Midterm_review_1


These are midterm review notes for an introductory course on data analysis with Python. The basics collected here are offered as a reference.

Tutorial 1: Basic syntax

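1.1 Variables

Quotes in strings

Single and double quotes are equivalent. The only difference shows up when the string itself contains a single quote: inside a single-quoted string it must be escaped with a backslash.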

str1 = "I'm a fan of Python"
str2 = 'I\'m a fan of Python'
print(str1)
print(str2)
>>>I'm a fan of Python
>>>I'm a fan of Python
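Line continuation

A backslash \ at the end of a line continues the statement on the next line. It is not needed inside the brackets of a list, tuple or dict, or inside parentheses, where line breaks are allowed implicitly. A semicolon ; ends a statement, so several statements can share one line: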

print("Hi! " + \
      "I'm " + \
      "Tracy")
a = 1; b = 2; c = 3; Total = a + b + c; print(Total)
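Inside parentheses or square brackets a statement can span several lines without any backslash. A minimal sketch (the variable names are illustrative, not from the tutorial):

```python
# Implicit line continuation: no backslash is needed inside (), [] or {}.
greeting = ("Hi! " +
            "I'm " +
            "Tracy")
numbers = [
    1, 2, 3,
    4, 5, 6,
]
print(greeting)      # Hi! I'm Tracy
print(sum(numbers))  # 21
```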
delete the reference

See: tut1 "variable types" and Question 1. Deleting an element from either num_list or ref is reflected in the other.

num_list = [2604, 2604.2604]
ref = num_list
del ref[0]
print("ref: ", ref)
print("num_list: ", num_list)
>>>ref:  [2604.2604]
>>>num_list:  [2604.2604]
a = 1
b = a
b = 0
print(b)
print(a)
>>>0
>>>1
Variable assignment

Using a variable before a value has been assigned to it raises an error (NameError).
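A short sketch of that error, caught with try/except so the script keeps running. The name undefined_total is hypothetical and deliberately never assigned:

```python
# Referencing a name before assignment raises NameError.
try:
    print(undefined_total)  # hypothetical name, never assigned anywhere
except NameError as err:
    message = str(err)
    print("NameError:", message)
```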

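Filtering by variable type

Example: add up only the numeric arguments

def sum_numbers(*args):  # the function name is illustrative
    result = 0
    for arg in args:
        # keep only values whose type is exactly int or float
        if type(arg) in (int, float):
            result = result + arg

    return result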
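To see which values the type check lets through, here is a self-contained sketch. The function name sum_numbers and the sample arguments are assumptions made for illustration. Note that type(True) is bool, so booleans are skipped by this check even though isinstance(True, int) is true.

```python
def sum_numbers(*args):
    """Add up the arguments whose type is exactly int or float."""
    result = 0
    for arg in args:
        if type(arg) in (int, float):
            result = result + arg
    return result

total = sum_numbers(1, 2.5, "3", [4], True)
print(total)                  # 3.5: the string, the list and the bool are ignored
print(isinstance(True, int))  # True: isinstance would have let the bool through
```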

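1.2 The print function

Arguments separated by commas are printed with a space between them, and they may have different types. The plus sign only joins operands of compatible types: to concatenate with +, both sides must be strings.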
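A small sketch of the difference between the comma and the plus sign. The variable code is illustrative:

```python
code = 2604
print("STAT", code)        # STAT 2604 (comma inserts a space, types may differ)
print("STAT" + str(code))  # STAT2604 (convert to str before using +)

try:
    print("STAT" + code)   # str + int is not allowed
except TypeError as err:
    error_text = str(err)
    print("TypeError:", error_text)
```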

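1.3 Lists, dictionaries and tuples

Declaring lists and tuples

A list is written with square brackets and a tuple with parentheses. A tuple is immutable, so the item at a given position cannot be reassigned:

b = (2604, 2604.2604, "STAT2604")  # example tuple; its values are assumed for illustration
b[2] = "STAT2603"
>>> TypeError: 'tuple' object does not support item assignment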

Both lists and tuples support repetition with *

a = [2604, 2604.2604, "STAT2604"]
print(a*2)
>>>[2604, 2604.2604, 'STAT2604', 2604, 2604.2604, 'STAT2604']
Slicing a list

The rules can be summarized as:

  • a[start:stop] --> items start through stop-1
  • a[start:] --> items start through the rest of the array
  • a[:stop] --> items from the beginning through stop-1
  • a[:] --> a copy of the whole array
  • a[start:stop:step] --> start through not past stop, by step
lst = [1,2,3,4,5,6,7,8,9,10]
print(lst[-1:0:-2])
print(lst[-1:0:2])
print(lst[1:9:2])
>>>[10, 8, 6, 4, 2]
>>>[]
>>>[2, 4, 6, 8]
# Pay special attention to the defaults: the step defaults to 1
courses = [1600, 1602, 2601, 2602, 2604, 3600]
print(courses[-1: -5])
>>>[]
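The slice above is empty because the default step of 1 moves forward, while -1 lies to the right of -5. A brief sketch of two ways to get a non-empty result from the same list:

```python
courses = [1600, 1602, 2601, 2602, 2604, 3600]

backward = courses[-1:-5:-1]  # step -1 walks backwards from the last item
forward = courses[-5:-1]      # with the default step, start must be left of stop

print(backward)  # [3600, 2604, 2602, 2601]
print(forward)   # [1602, 2601, 2602, 2604]
```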
Dictionary basics
my_dict = {1: 'Tue', 2: 'Wed', 3: 'Sat'}
print(my_dict[1])
print(my_dict.keys())
print(my_dict.values())
>>>Tue
>>>dict_keys([1, 2, 3])
>>>dict_values(['Tue', 'Wed', 'Sat'])
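Two further dictionary operations that often come up alongside keys() and values(), shown as a small sketch on the same dictionary. The fallback string 'n/a' is an arbitrary choice:

```python
my_dict = {1: 'Tue', 2: 'Wed', 3: 'Sat'}

# items() yields (key, value) pairs, which is convenient in a for loop.
pairs = []
for key, value in my_dict.items():
    pairs.append((key, value))
print(pairs)  # [(1, 'Tue'), (2, 'Wed'), (3, 'Sat')]

# get() returns a fallback instead of raising KeyError for a missing key.
missing = my_dict.get(4, 'n/a')
print(missing)  # n/a
```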

1.4 Operators

Floor division

If either operand of a floor division is a floating point number, the result is a floating point number. If both operands are integers, the result is an integer. This is different from true division (/), which always returns a float. Floor division rounds toward negative infinity, so a negative quotient is rounded down rather than truncated toward zero.

a = 2604
b = 26042604
c = b // a
print(c)
>>>10001

a = 2604
b = 26042604.0
c = b // a
print(c)
>>>10001.0

a = -300
b = 400
c = b // a
print(c)
>>>-2

a = 300
b = -400
c = b // a
print(c)
>>>-2

a = 300
b = 400
c = b // a
print(c)
>>>1

a = 300.0
b = 400
c = b // a
print(c)
>>>1.0

a = -300
b = 400.0
c = b // a
print(c)
>>>-2.0
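The results above match taking the floor of the true quotient. A short sketch that checks this for the same operands, using math.floor from the standard library:

```python
import math

pairs = [(400, 300), (400, -300), (-400, 300), (400.0, -300)]
for b, a in pairs:
    floor_result = b // a
    # For these values, // agrees with the floor of true division.
    assert floor_result == math.floor(b / a)
    print(b, "//", a, "=", floor_result, "| / gives", b / a)

true_div = 400 / 200    # 2.0, true division always returns a float
floor_div = 400 // 200  # 2, int // int stays an int
print(true_div, floor_div)
```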
Python 3 has no '<>' operator; use '!=' instead.
'is' versus '=='

Please be careful with this operator. The "is" keyword tests whether two variables refer to the same object (in memory). Please note that the definition here says "object," not content.

Integers and lists give you different outputs. This is because lists have a unique property: they are mutable, while integers and strings are immutable. Lists are mutable in the sense that we can change an item in a list by accessing it directly as part of an assignment statement. Don't worry about this, as we will repeat it in future lectures.

Because a list is mutable, a list variable stands for a particular object rather than just a value, so two lists with equal contents are still two different objects.
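A minimal sketch of the difference for lists (the variable names are illustrative). Small integers are left out on purpose: in CPython they often compare identical because of caching, which is an implementation detail rather than a language guarantee.

```python
list_a = [1, 2, 3]
list_b = [1, 2, 3]
list_c = list_a

print(list_a == list_b)  # True: same contents
print(list_a is list_b)  # False: two separate list objects
print(list_a is list_c)  # True: both names refer to one object
```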
Operator summary
  1. Exponentiation **
  2. Unary plus +x, Unary minus -x, Bitwise NOT ~x
  3. Multiplication *, Division /, Floor division //, Modulus %
  4. Addition +, Subtraction -
  5. Bitwise shift operators <<, >>
  6. Bitwise AND &
  7. Bitwise XOR ^, Bitwise OR |
  8. Comparison operators: >, >=, <, <=
  9. Equality operators: ==, !=
  10. Assignment operators: =, %=, /=, //=, -=, +=, *=, **=
  11. Identity operators: is, is not
  12. Membership operators: in, not in
  13. Logical operators: not, or, and

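1.5 Loops

break exits the whole loop.

continue skips the rest of the current iteration and moves on to the next one.

pass does nothing; it is generally used as a placeholder statement.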
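The three statements can be seen side by side in one small loop. This is only an illustrative sketch; the numbers are arbitrary:

```python
visited = []
for n in range(1, 8):
    if n == 2:
        pass          # placeholder: does nothing, execution simply goes on
    if n % 3 == 0:
        continue      # skip the rest of this iteration (n = 3 and n = 6)
    if n == 7:
        break         # leave the loop entirely
    visited.append(n)

print(visited)  # [1, 2, 4, 5]
```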

1.6 Functions

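Parameters and arguments

Passing an object versus passing a value. A list can be thought of as a container: passing a list passes the container itself, and the function works on the values inside it. An int is only a value with no container around it, so changing it inside the function does not affect the variable outside (the example below shows this). Roughly speaking, immutable arguments behave as if passed by value and mutable arguments as if passed by object. Strictly, Python always passes a reference to the object: a += 1 on an int rebinds the local name to a new object, while b.append(a) modifies the shared list in place.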

def modify_params(a, b):
    a += 1
    b.append(a)
    print("a is ", a)
    print("b is ", b)
    return
x = 1
y = []
modify_params(x, y)
print(x)
print(y)
>>>a is  2
>>>b is  [2]
>>>1
>>>[2]
def inside_modify_params(a, b):
	print("Inside function")

	print("initial id of a:",  {id(a)})
	print("initial id of b:",  {id(b)})

	a += 1
	b.append(a)

	print("id of a after modification:", {id(a)})
	print("id of b after modification:", {id(b)})
 
inside_modify_params(1, [])
>>>Inside function
>>>initial id of a: {4342540592}
>>>initial id of b: {4404839040}
>>>id of a after modification: {4342540624}
>>>id of b after modification: {4404839040}

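For how immutability relates to passing an object versus a value, compare the ids printed by the following code. The actual numbers differ from run to run, and whether equal immutable values share one object is a CPython implementation detail: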

a = (1, 2, 3)
b = (1, 2, 3)
c = 1
d = 1
print(id(a))
print(id(b))
print(id(c))
print(id(d))
a = (4,5,6)
c = 2
print(id(a))
print(id(c))

e = [1,2,3]
f = [1,2,3]
print(id(e))
print(id(f))
e = [4,5,6]
print(id(e))

>>>4368313728
>>>4368313728
>>>4366123248
>>>4366123248
>>>4368312640
>>>4366123280
>>>4367496768
>>>4367695744
>>>4367782080
delete the reference

Recall the two examples under "delete the reference" in section 1.1. With a = 1 and b = a, assigning a new value to b does not affect a, so it behaves like copying a value. With ref = num_list, both names refer to the same list object, that is, the same memory, so a change made through one name shows up in the other.

Tutorial 2: Functions, pandas & File I/O

2.1 Functions

Defining and calling a function
def function_name (parameters):
  function_suite  # what you want the function to do
  return # Don't forget to return something for your function
  
function_name (pass_parameters_reference)
return

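If a function only rebinds its parameter to a new object, the variable outside is not changed by what happens inside the function. The caller sees the new value only through the returned result: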

def change_list (list1):
  list1 = [1, 3, 5]
  print("Values after call a function inside: ", list1)
  return list1

list1 = [2, 4, 6]
print("Values before call a function: ", list1)

change_list(list1)
print("Values after call a function outside: ", list1)
print("Values after call a function outside: ", change_list(list1))

>>>Values before call a function:  [2, 4, 6]
>>>Values after call a function inside:  [1, 3, 5]
>>>Values after call a function outside:  [2, 4, 6]
>>>Values after call a function inside:  [1, 3, 5]
>>>Values after call a function outside:  [1, 3, 5]
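For contrast, a function that modifies the list in place does change the caller's variable even without a return. A sketch with two hypothetical helpers, rebind and mutate:

```python
def rebind(items):
    items = [1, 3, 5]      # rebinds the local name only
    return items

def mutate(items):
    items[:] = [1, 3, 5]   # slice assignment changes the caller's list in place

outer = [2, 4, 6]
rebind(outer)                # return value ignored
after_rebind = list(outer)   # still [2, 4, 6]
mutate(outer)
after_mutate = list(outer)   # now [1, 3, 5]
print(after_rebind, after_mutate)
```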
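Default arguments

A default argument can be overridden by the caller. Parameters with default values must come last, after all parameters without defaults.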

def printcourse(code, department = "STAT"):
  print(department + code)
  return;

printcourse("2604", "STAT")
printcourse("3600")
>>> STAT2604
>>> STAT3600
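A sketch of overriding the default by keyword, plus a check that Python rejects a default parameter placed first. This variant of printcourse returns the string instead of printing it, and the "MATH" department and the broken function are made up for illustration:

```python
def printcourse(code, department="STAT"):
    return department + code

print(printcourse("2604"))                     # STAT2604
print(printcourse("1001", department="MATH"))  # MATH1001

# A parameter without a default may not follow one with a default.
# compile() is used so the SyntaxError can be caught at run time.
rejected = False
try:
    compile('def broken(department="STAT", code): pass', "<example>", "exec")
except SyntaxError:
    rejected = True
print(rejected)  # True
```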
Variable-length arguments
def function(*args):
  # body of the function; args is a tuple holding all positional arguments
  pass
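A concrete sketch of a function that accepts any number of positional arguments. The name total and the sample values are illustrative:

```python
def total(*args):
    # args arrives as a tuple holding every positional argument
    return sum(args)

print(total())          # 0
print(total(1, 2, 3))   # 6

values = [10, 20]
print(total(*values))   # 30: * unpacks the list into separate arguments
```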
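Lambda functions

Basic syntax:

lambda arg1, arg2, ..., argn: expression

Note that when a lambda function is called directly, there must be no comma or semicolon between the function and its arguments. For example: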

print((lambda a, b: a * b), (10, 20))
print((lambda a, b: a * b); (10, 20))

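The first line does not raise an error, but it prints the function object followed by the tuple instead of calling the function:

<function <lambda> at 0x107ee95e0> (10, 20)

The second line is rejected with a syntax error:

    print((lambda a, b: a * b); (10, 20))
                              ^
SyntaxError: invalid syntax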

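A lambda function does not need a name, but it must be defined before it is called. It also needs no return statement: the value of its expression is returned automatically.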
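For comparison, a sketch of two correct ways to call a lambda: directly, with the argument list placed right after the parenthesized function, or after binding it to a name (multiply is an illustrative name):

```python
product = (lambda a, b: a * b)(10, 20)  # call directly, nothing in between
print(product)                          # 200

multiply = lambda a, b: a * b           # or bind it to a name first
print(multiply(3, 4))                   # 12
```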

2.2 Module

A module is a Python code file that defines functions, classes and variables. Another Python program can import the module to execute its functions or access its variables.
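As a small sketch, the standard library module math can stand in for any module. Both import forms below are standard Python:

```python
import math             # import the whole module
from math import sqrt   # or import a single name from it

print(math.pi)          # access a variable defined in the module
root = sqrt(16)         # call a function defined in the module
print(root)             # 4.0
print(math.__name__)    # math
```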

2.3 File I/O

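Basic syntax for reading and writing files
file_object = open(file_name [, access_mode][, buffering])
file_object.close()

file.read(size): size is optional and gives the number of bytes to return (characters, for a file opened in text mode). The default is -1, which means the whole file.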


seek()

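Moves the file read pointer to the specified position.

  • offset -- the starting offset, that is, the number of bytes to move the pointer by
  • whence -- optional, default 0. It defines where the offset is measured from: 0 means from the beginning of the file, 1 from the current position, and 2 from the end of the file.
f.seek(offset, whence)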
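The three whence values can be tried on a throwaway file. This sketch writes its own sample data to a temporary directory, so the file name and contents are assumptions. It opens the file in binary mode because text mode does not allow nonzero offsets relative to the current position or the end.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "numbers.txt")
    with open(path, "wb") as f:
        f.write(b"0123456789")

    with open(path, "rb") as f:
        f.seek(3, 0)              # 3 bytes from the beginning
        from_start = f.read(2)    # b"34", pointer is now at 5
        f.seek(2, 1)              # 2 bytes forward from the current position
        from_current = f.read(1)  # b"7"
        f.seek(-2, 2)             # 2 bytes before the end
        from_end = f.read()       # b"89"

print(from_start, from_current, from_end)
```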
tell()

The tell() method returns the current position of the file object. It takes no parameters and returns an integer. Initially the file pointer points to the beginning of the file (unless the file was opened in append mode), so the initial value of tell() is zero.

fp = open("numbers.txt", "r")
fp.read(8)

print(fp.tell())
fp.close()
>>> 8
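The same check can be written with a with statement, which closes the file automatically. This sketch creates its own numbers.txt in a temporary directory, so the contents are assumed; plain ASCII without newlines keeps the reported position equal to the number of characters read.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "numbers.txt")
    with open(path, "w", encoding="utf-8") as fp:
        fp.write("1234567890")

    with open(path, "r", encoding="utf-8") as fp:
        start = fp.tell()       # 0: a freshly opened file starts at the beginning
        chunk = fp.read(8)      # "12345678"
        after_read = fp.tell()  # 8
    closed = fp.closed          # True: the with block closed the file

print(start, chunk, after_read, closed)
```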
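Statements for deleting and creating files and directories
import os

# rename a file
os.rename(current_file_name, new_file_name)
# delete a file
os.remove("delete.txt")
# create a new directory
os.mkdir("newdir")
# change the current working directory
os.chdir("newdir")
# get the current working directory
os.getcwd()
# remove a directory
os.rmdir('dirname')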
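These calls can be tried safely inside a temporary directory. The sketch below uses made-up file names (old.txt, new.txt) and returns to the original working directory at the end:

```python
import os
import tempfile

start_dir = os.getcwd()
with tempfile.TemporaryDirectory() as sandbox:
    os.chdir(sandbox)
    try:
        os.mkdir("newdir")
        with open("old.txt", "w") as f:
            f.write("demo")
        os.rename("old.txt", "new.txt")
        renamed = os.path.exists("new.txt") and not os.path.exists("old.txt")
        os.remove("new.txt")
        os.rmdir("newdir")       # rmdir only removes empty directories
        leftovers = os.listdir(".")
    finally:
        os.chdir(start_dir)      # leave the sandbox before it is deleted

print(renamed, leftovers)  # True []
```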