pandas: the one-dimensional Series

Creating a Series

Common ways to create a Series:

From a list or another iterable
From an ndarray
From a dict
From a scalar

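# Create a Series from a list
import numpy as np
import pandas as pd
from IPython.display import display  # already available as a builtin inside Jupyter

s = pd.Series([1, 2, 3, 4], index=list("abcd"), dtype=np.int32)
display(s)

a    1
b    2
c    3
d    4
dtype: int32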

If no index is given at creation, the Series still generates one automatically. A Series built this way is structured much like a list, so it can be thought of as a list stood on its end.
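To make the automatic index concrete, here is a minimal sketch that builds a Series without the index argument. The values are made up for illustration.

```python
import pandas as pd

# No index argument: pandas builds a default integer index starting at 0.
s = pd.Series([10, 20, 30])

print(s.index)         # the automatically generated index
print(list(s.index))   # its labels as a plain list
print(s.tolist())      # the values, in the same order as the source list
```

The default index is a RangeIndex, so position and label coincide until the Series is filtered or reordered.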

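# Create a Series from an ndarray
s = pd.Series(np.array([1, 2, 3, 4]), index=list("abcd"))
display(s)

# The dtype is inherited from the ndarray; it prints int64 where NumPy's default integer is 64-bit.
a    1
b    2
c    3
d    4
dtype: int32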

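A Series of this kind can be thought of as a NumPy array wearing an index as a jacket, so the usual array operations work on a Series as well. A dict has a very similar structure and can also be converted to a Series.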
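As a rough illustration of "array operations also work on a Series", the sketch below applies a NumPy ufunc, scalar arithmetic and a reduction. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

s = pd.Series(np.array([1, 4, 9, 16]), index=list("abcd"))

# A NumPy ufunc applied to a Series returns a Series with the same index.
roots = np.sqrt(s)

# Array-style arithmetic and reductions work as they do on an ndarray.
doubled = s * 2
total = s.sum()

print(roots)
print(doubled)
print(total)
```

The index travels with the result, which is the main practical difference from calling the same functions on a bare ndarray.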

# Create a Series from a dict.
# The dict keys become the labels and the dict values become the Series values.
s = pd.Series({"a": "xy", "b": "11", "c": "22"})
display(s)

a    xy
b    11
c    22
dtype: object
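When a dict is combined with an explicit index, pandas looks each index label up among the dict keys. The sketch below uses hypothetical data to show the effect, including a label that is not in the dict.

```python
import pandas as pd

prices = {"a": 1.5, "b": 2.5, "c": 3.5}   # hypothetical data

# index selects and orders the keys; "z" is not in the dict, so it becomes NaN.
s = pd.Series(prices, index=["c", "a", "z"])
print(s)
```

Keys that are not listed in index ("b" here) are simply left out of the result.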

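# Create a Series from a scalar.
# The index parameter sets the labels explicitly. Without it, labels default to integers starting from 0.
# With a scalar, the value is repeated once for every label in index.
s = pd.Series(33, index=["k", "x", "y"])
display(s)

s = pd.Series([11, 22, 33], index=["k", "x", "y"])
display(s)

k    33
x    33
y    33
dtype: int64

k    11
x    22
y    33
dtype: int64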

Series attributes

index
values
shape
size
dtype
ndim
name
T

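The index and the values of a Series are available through index and values.

Notes:

If no index is specified, an integer index starting from 0 is generated automatically; the index can also be set explicitly with index.
Both the Series object and its index have a name attribute. The name of the Series can be set at creation through the name parameter.
When there are many values, head and tail return the first / last N rows. [And the rows in the middle? Slice by position with iloc.]
A Series can only hold one-dimensional data.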
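The notes above can be sketched as follows. The names "score" and "row" are illustrative, not taken from the original.

```python
import pandas as pd

s = pd.Series(range(100), name="score")   # "score" and "row" are illustrative names
s.index.name = "row"

print(s.name, s.index.name)
print(s.head(3))        # first 3 rows
print(s.tail(3))        # last 3 rows
print(s.iloc[48:52])    # rows from the middle, selected by position
```

head and tail default to 5 rows when called without an argument.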

s = pd.Series([1, 2, 3, 4], index=list("abcd"))

# The index object of the Series.
display(s.index)

# The array holding the Series data, as an ndarray.
display(s.values, type(s.values))

# The shape of the Series.
display(s.shape)

# The number of elements.
display(s.size)

# The element type.
display(s.dtype)

Index(['a', 'b', 'c', 'd'], dtype='object')
array([1, 2, 3, 4], dtype=int64)
numpy.ndarray
(4,)
4
dtype('int64')

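Series operations

A Series behaves like a NumPy array in the following ways:

Broadcasting and vectorized arithmetic are supported.
Indexing and slicing are supported.
Elements can be extracted with integer arrays and boolean arrays.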
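The three points above can be sketched in a few lines. The data and variable names are made up for illustration.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40], index=list("abcd"))

# Integer-array extraction: pick positions 0 and 2.
by_position = s.iloc[[0, 2]]

# Boolean-array extraction: keep the elements where the condition is True.
mask = s > 20
by_mask = s[mask]

# Broadcasting: the scalar is applied to every element.
shifted = s + 1

print(by_position)
print(by_mask)
print(shifted)
```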

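Arithmetic

A Series supports vectorized arithmetic and broadcasting under the same rules as NumPy arrays. Some NumPy functions, such as np.mean and np.sum, also work on a Series.
When several Series are combined, their values are aligned by index label. Where a label has no match, the result is NaN (a missing value).

Notes:

Missing values can be detected with isnull and notnull, available both as pandas functions and as Series methods.
Besides the operators, the arithmetic methods of Series (such as add) can be used; they accept a fill_value for missing entries.
Although some NumPy functions also work on a Series, the two types treat NaN differently: a reduction over an ndarray that contains NaN returns NaN, whereas Series methods such as sum and mean skip NaN by default.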
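The difference in NaN handling is easiest to see side by side. This is a minimal sketch with made-up values; skipna and np.nansum are the standard switches on each side.

```python
import numpy as np
import pandas as pd

arr = np.array([1.0, 2.0, np.nan])
s = pd.Series(arr)

print(np.sum(arr))           # nan: the ndarray reduction propagates NaN
print(s.sum())               # 3.0: Series.sum skips NaN by default
print(s.mean())              # 1.5: the mean of the two non-missing values
print(s.sum(skipna=False))   # nan: opt back in to the NumPy-like behaviour
print(np.nansum(arr))        # 3.0: NumPy's NaN-ignoring variant
```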

s1 = pd.Series([1, 2, 3])
s2 = pd.Series([4, 5, 6])
display(s1 * s2)
display(s1 * 5)

# NumPy functions such as mean and sum also work on a Series.
display(np.mean(s1), np.sum(s2))

0     4
1    10
2    18
dtype: int64
0     5
1    10
2    15
dtype: int64
2.0
15

# Prepare the data
s1 = pd.Series([1, 2, 3], index=[1, 2, 3])
s2 = pd.Series([4, 5, 6], index=[2, 3, 4])

# Unlike ndarray arithmetic, Series arithmetic aligns on labels. A label that has no match on the other side produces NaN.
display(s1 + s2)

# To avoid NaN, use the Series arithmetic method instead of the operator and pass a fill_value.
display(s1.add(s2, fill_value=100))

1    NaN
2    6.0
3    8.0
4    NaN
dtype: float64
1    101.0
2      6.0
3      8.0
4    106.0
dtype: float64

# Detecting missing values.
s = pd.Series([1, 2, 3, float("NaN"), np.nan])
display(s)

# True where the value is missing.
display(s.isnull())

# True where the value is not missing.
display(pd.notnull(s))

0    1.0
1    2.0
2    3.0
3    NaN
4    NaN
dtype: float64
0    False
1    False
2    False
3     True
4     True
dtype: bool
0     True
1     True
2     True
3    False
4    False
dtype: bool

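Series indexing

Label indexing and positional indexing

If the index of a Series is not numeric, an element accessed with [key] can be addressed either by label or by position. This can be confusing.
Instead we can use:

loc accesses by label only.
iloc accesses by position only.

This makes each access explicit about what it means.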

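# A Series can be indexed by label or by position.
s = pd.Series([1, 2, 3], index=list("abc"))
display(s)

# With a non-integer index, [] accepts a label or a position.
display(s["a"])
# Positional access through []; recent pandas versions warn that this fallback is deprecated, so prefer s.iloc[0].
display(s[0])

# With an integer index, [] treats integers as labels, so positional access no longer works.
s = pd.Series([1, 2, 3], index=[2, 3, 4])
display(s[3])
# Raises KeyError, because 0 is not a label in the index.
# display(s[0])

a    1
b    2
c    3
dtype: int64
1
1
2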

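loc and iloc

# To avoid the ambiguity described above, use loc and iloc for explicit access.
# loc accesses by label only.
# iloc accesses by position only.
s = pd.Series([1, 2, 3], index=list("abc"))
display(s.loc["a"])
display(s.iloc[0])
# Error: 0 is not a label in the index.
# display(s.loc[0])
# Error: iloc accepts integer positions only.
# display(s.iloc["a"])

1
1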

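Series slicing

A Series also supports slicing to access a range of elements. As with NumPy arrays, the slice is a view of the original data. This is the behaviour of the pandas version used here; with Copy-on-Write enabled, which newer releases adopt, writing to the slice no longer changes the original.

# A Series supports slicing. As with an ndarray, the slice is a view of the original data.
s1 = pd.Series([1, 2, 3, 4])
s2 = s1[0:3]

# Changing s2 also changes s1.
s2[0] = 1000
display(s1, s2)

0    1000
1       2
2       3
3       4
dtype: int64
0    1000
1       2
2       3
dtype: int64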
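If the slice should not share data with the original, one option is to take an explicit copy. The sketch below assumes that independence is what is wanted; it behaves the same regardless of whether the installed pandas returns views or uses Copy-on-Write.

```python
import pandas as pd

s1 = pd.Series([1, 2, 3, 4])

# copy() gives s2 its own data, independent of s1.
s2 = s1.iloc[0:3].copy()
s2.iloc[0] = 1000

print(s1.tolist())   # s1 is unchanged
print(s2.tolist())
```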

# Label slicing and positional slicing behave differently.
# A positional slice excludes the end position; a label slice includes the end label.

s = pd.Series([1, 2, 3, 4], index=list("abcd"))
# Slice by position
display(s.iloc[0:3])
# Slice by label
display(s.loc["a":"d"])

a    1
b    2
c    3
dtype: int64
a    1
b    2
c    3
d    4
dtype: int64

CRUD on a Series

CRUD operations on the index-value pairs of a Series:

Read a value
Update a value
Add an index-value pair
Delete an index-value pair

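s = pd.Series([1, 2, 3, 4, 5, 6, 7], index=list("abcdefg"))
# Read a value by label or by position (an array of either also works).
display(s.loc["a"])
display(s.iloc[0])

# Update a value
s.loc["a"] = 3000
display(s)

# Add a value by assigning to a new label, as with a dict.
# The string value changes the dtype of the whole Series to object.
s["new_key"] = "new_value"
display(s)

1
1
a    3000
b       2
c       3
d       4
e       5
f       6
g       7
dtype: int64
a               3000
b                  2
c                  3
d                  4
e                  5
f                  6
g                  7
new_key    new_value
dtype: object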

s = pd.Series([1, 2, 3, 4, 5, 6, 7], index=list("abcdefg"))
# Delete a value with del, as with a dict.
del s["a"]
display(s)

# Delete a value with the drop method.
# By default drop returns a new Series and leaves s unchanged.
s.drop("b")
display(s)
# inplace=True modifies s itself; drop then returns None instead of the modified result.
s.drop("b", inplace=True)
display(s)

b    2
c    3
d    4
e    5
f    6
g    7
dtype: int64
b    2
c    3
d    4
e    5
f    6
g    7
dtype: int64
c    3
d    4
e    5
f    6
g    7
dtype: int64
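Because drop returns a new Series unless inplace=True is passed, a common pattern is to assign its result. The sketch below also shows dropping several labels at once and the errors parameter; the data and variable names are illustrative.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5], index=list("abcde"))

# drop returns a new Series; assign the result to keep it.
trimmed = s.drop(["a", "b"])

# errors="ignore" skips labels that are not present instead of raising KeyError.
safe = s.drop(["e", "zzz"], errors="ignore")

print(trimmed)
print(safe)
print(len(s))   # the original Series still has all 5 elements
```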