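Outline

When I have time, I will sort this material into five categories: stacks, dictionaries, data, classes, and sets, with one link under each.

collections - container data types

This module implements specialized container data types that provide alternatives to Python's general-purpose built-in containers dict, list, set, and tuple.

ChainMap

The ChainMap class links several mappings together into a single view. This is often much faster than creating a new dictionary and calling update() repeatedly.

Parameters

class collections.ChainMap(*maps)

Usage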
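
No usage example has been written for ChainMap yet, so here is a minimal sketch of how lookups and writes behave. The dictionaries defaults and overrides are made up for illustration.

```python
from collections import ChainMap

# Hypothetical configuration layers, invented for this example.
defaults = {'color': 'red', 'user': 'guest'}
overrides = {'user': 'admin'}

# Lookups search the mappings from left to right.
settings = ChainMap(overrides, defaults)
print(settings['user'])   # admin  (found in overrides first)
print(settings['color'])  # red    (falls through to defaults)

# ChainMap is a view, not a copy: changes to an underlying dict are visible.
defaults['color'] = 'blue'
print(settings['color'])  # blue

# Writes and deletions only affect the first mapping.
settings['theme'] = 'dark'
print(overrides)          # {'user': 'admin', 'theme': 'dark'}

# new_child() puts a new mapping in front; parents skips the first mapping.
child = settings.new_child({'user': 'root'})
print(child['user'])          # root
print(child.parents['user'])  # admin
```

Because nothing is copied, building a ChainMap costs almost nothing, which is where the speed advantage over repeated update() calls comes from.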

deque - double-ended queue

A list-like container with fast appends and pops at both ends.

Parameters

class collections.deque([iterable[, maxlen]])

Usage

# Double-ended queue: deque
from collections import deque

# deque supports the following methods
dq = deque()

# append(x): add x to the right end of the deque
dq.append('a')
dq.append('b')
print(dq) # deque(['a', 'b'])

# appendleft(x): add x to the left end of the deque
dq.appendleft('c')
print(dq) # deque(['c', 'a', 'b'])

# pop(): remove and return the rightmost element
dq.pop()
print(dq) # deque(['c', 'a'])

# popleft(): remove and return the leftmost element
dq.popleft()
print(dq) # deque(['a'])

# clear(): remove all elements from the deque
dq.clear()
print(dq) # deque([])

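Example: keeping the last N items

from collections import deque
def search(lines, pattern, history=5):
    # A deque with maxlen discards its oldest item once it is full
    previous_lines = deque(maxlen=history)
    for li in lines:
        if pattern in li:
            # Yield the matching line with the lines seen just before it
            yield li, previous_lines
        previous_lines.append(li)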

# Example use on a file
if __name__ == '__main__':
    with open(r'../../cookbook/somefile.txt') as f:
        for line, prevlines in search(f, 'python', 5):
            for pline in prevlines:
                print(pline, end='')
            print(line, end='')
            print('-' * 20)
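
To try the generator without a file on disk, the sketch below feeds it an in-memory list. The sample lines are invented, and search is repeated only so the block runs on its own.

```python
from collections import deque

def search(lines, pattern, history=5):
    # Same generator as above, repeated so the example is self-contained.
    previous_lines = deque(maxlen=history)
    for li in lines:
        if pattern in li:
            yield li, previous_lines
        previous_lines.append(li)

# Made-up sample text standing in for the file.
sample = ['one', 'two', 'three', 'python rocks', 'four', 'more python']

results = []
for line, prevlines in search(sample, 'python', history=2):
    # prevlines is the live deque, so copy it to keep a snapshot.
    results.append((line, list(prevlines)))

print(results)
# [('python rocks', ['two', 'three']), ('more python', ['python rocks', 'four'])]
```

With history=2 only the two most recent lines survive, since each append to a full deque pushes the oldest line out.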

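The yield keyword

How yield differs from return:

A return statement hands back a single value and ends the function. A function whose body contains yield is a generator function: calling it does not run the body but returns a generator object. The body then runs step by step, pausing at each yield, as values are requested with a for loop or next().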

def generate_even(n):
    for i in range(n):
        if i % 2 == 0:
            yield i

even_lists = generate_even(10)
for i in even_lists:
    print(i)
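
The pausing behavior is easier to see when the generator is driven by hand with next(). The following is a small sketch that reuses generate_even from above.

```python
def generate_even(n):
    for i in range(n):
        if i % 2 == 0:
            yield i

gen = generate_even(5)   # nothing in the body has run yet
print(next(gen))         # 0
print(next(gen))         # 2
print(next(gen))         # 4

# The generator is now exhausted. A bare next(gen) would raise
# StopIteration; passing a default value avoids the exception.
print(next(gen, 'done'))  # done

# A for loop or list() handles StopIteration automatically.
print(list(generate_even(10)))  # [0, 2, 4, 6, 8]
```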

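defaultdict - dictionary with multiple values per key

defaultdict takes a callable (a type or a function) as default_factory. When a key that does not exist is looked up with d[key], default_factory is called with no arguments, and its return value is stored under that key and returned.

Parameters

class collections.defaultdict(default_factory=None, /[, ...])

Usage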

from collections import defaultdict
d1 = defaultdict(list)
d1['a'].append(1)
d1['a'].append(2)
d1['b'].append(3)
print(d1)
# defaultdict(<class 'list'>, {'a': [1, 2], 'b': [3]})

d2 = defaultdict(list)
pairs = [('a', 1), ('b', 2), ('a', 3), ('b', 4), ('c', 1)]
for key, value in pairs:
    d2[key].append(value)
print(d2)
# defaultdict(<class 'list'>, {'a': [1, 3], 'b': [2, 4], 'c': [1]})

d3 = defaultdict(set)
d3['a'].add(1)
d3['a'].add(2)
d3['b'].add(3)
print(d3)
# defaultdict(<class 'set'>, {'a': {1, 2}, 'b': {3}})
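
The examples above use list and set, but any zero-argument callable can be the factory. This sketch shows int and a lambda, and also which lookups do not trigger the factory. The sample words are made up.

```python
from collections import defaultdict

# int() returns 0, so defaultdict(int) works as a simple counter.
counts = defaultdict(int)
for word in ['a', 'b', 'a', 'c', 'a']:
    counts[word] += 1
print(dict(counts))  # {'a': 3, 'b': 1, 'c': 1}

# Any zero-argument callable can serve as default_factory.
d = defaultdict(lambda: 'N/A')
print(d['missing'])      # N/A  (the key is created as a side effect)
print('missing' in d)    # True

# get() and the in operator do not call default_factory.
print(d.get('other'))    # None
print('other' in d)      # False
```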

OrderedDict - ordered dictionary

Returns an instance of a dict subclass that has methods specialized for rearranging dictionary order.

Parameters

class collections.OrderedDict([items])

Usage
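
No usage example has been written for OrderedDict yet. As a minimal sketch, the reordering methods and the order-sensitive equality check look like this:

```python
from collections import OrderedDict

od = OrderedDict([('a', 1), ('b', 2), ('c', 3)])

# move_to_end(key, last=True): move an existing key to the right end
od.move_to_end('a')
print(list(od))  # ['b', 'c', 'a']

# move_to_end(key, last=False): move it to the left end instead
od.move_to_end('c', last=False)
print(list(od))  # ['c', 'b', 'a']

# popitem(last=True) pops from the right end; last=False pops from the left
print(od.popitem())            # ('a', 1)
print(od.popitem(last=False))  # ('c', 3)
print(list(od.items()))        # [('b', 2)]

# Equality between two OrderedDicts is order-sensitive, unlike plain dicts.
first = OrderedDict([('x', 1), ('y', 2)])
second = OrderedDict([('y', 2), ('x', 1)])
print(first == second)               # False
print(dict(first) == dict(second))   # True
```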

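namedtuple - named tuple

The built-in tuple has one limitation: its elements cannot be given names. namedtuple creates a tuple subclass whose fields have names.
A namedtuple class inherits from tuple, so its fields are immutable.

Parameters

collections.namedtuple(typename, field_names, *, rename=False, defaults=None, module=None)

typename: namedtuple returns a new tuple subclass named typename.
field_names: the names of the fields, given as a sequence of strings such as ['x', 'y'], or as a single string with the names separated by whitespace and/or commas, such as 'x y' or 'x, y'.
rename: if True, invalid field names are automatically replaced with positional names. For example, ['abc', 'def', 'abc', 'ghi'] becomes ['abc', '_1', '_2', 'ghi'], which replaces the keyword def and the duplicate abc.
defaults: None or an iterable of default values. The defaults are applied to the rightmost fields.
module: if given, the __module__ attribute of the named tuple is set to that value.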
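
The three optional parameters can be tried out as follows. This is a sketch; the class names Row, Point3, and Tagged and the module name 'my_module' are made up for illustration.

```python
from collections import namedtuple

# rename=True replaces invalid names (keyword 'def', duplicate 'abc')
Row = namedtuple('Row', ['abc', 'def', 'abc', 'ghi'], rename=True)
print(Row._fields)  # ('abc', '_1', '_2', 'ghi')

# defaults apply to the rightmost fields: x is required, y and z are optional
Point3 = namedtuple('Point3', 'x y z', defaults=[0, 0])
print(Point3(1))        # Point3(x=1, y=0, z=0)
print(Point3(1, 2))     # Point3(x=1, y=2, z=0)

# module sets __module__ on the generated class
Tagged = namedtuple('Tagged', 'value', module='my_module')
print(Tagged.__module__)  # my_module
```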

Usage

1. Creating instances and accessing fields

from collections import namedtuple
Point = namedtuple('Point', ['x','y'])
p = Point(11, 22)
print(p) # Point(x=11, y=22)
print(p[0], p[1]) # 11 22
print(p.x, p.y) # 11 22

2. _make(iterable)

# _make(iterable): create a new instance from an existing sequence or iterable
Point = namedtuple('Point', ['x','y'])
t = [27, 28]
p = Point._make(t)
print(p) # Point(x=27, y=28)

3. _asdict()

# _asdict(): return a dict that maps field names to their values
Point = namedtuple('Point', ['x','y'])
p = Point(11, 22)
my_dict = p._asdict()
print(my_dict) # {'x': 11, 'y': 22}

4. _replace(**kwargs)

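# _replace(**kwargs): return a new instance with the specified fields replaced by new values
Point = namedtuple('Point', ['x','y'])
p = Point(11, 22)
new_p = p._replace(x=999)
print(p) # Point(x=11, y=22); p is unchanged because namedtuple fields, like tuple elements, cannot be modified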
print(new_p) # Point(x=999, y=22)

5. _fields

# _fields: a tuple of strings listing the field names
Point = namedtuple('Point', ['x','y'])
p = Point(11, 22)
p_fields = p._fields
print(p_fields) # ('x', 'y')

6. _field_defaults

# _field_defaults: a dict that maps field names to their default values
# Using the defaults parameter
Account = namedtuple('Account', 'type, balance', defaults=['man', 170])
a = Account()
print(a) # Account(type='man', balance=170)
print(Account._field_defaults) # dict of defaults: {'type': 'man', 'balance': 170}

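Counter

A tool for fast tallying. Counter is a dict subclass for counting hashable objects. It is a multiset that stores elements as dictionary keys and their counts as dictionary values.

Parameters

class collections.Counter([iterable-or-mapping])

Usage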
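
No usage example has been written for Counter yet. A minimal sketch of the most common operations, using an arbitrary sample string:

```python
from collections import Counter

# Count the characters of a string (any iterable of hashable items works).
c = Counter('abracadabra')
print(c['a'])             # 5
print(c['z'])             # 0  (missing elements count as zero, no KeyError)

# most_common(n): the n most frequent elements with their counts
print(c.most_common(2))   # [('a', 5), ('b', 2)]

# update() adds counts instead of replacing them
c.update('aaa')
print(c['a'])             # 8

# Counters support arithmetic; results keep only positive counts
x = Counter(a=3, b=1)
y = Counter(a=1, b=2)
print(x + y)              # Counter({'a': 4, 'b': 3})
print(x - y)              # Counter({'a': 2})
```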

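heapq

This module implements the heap queue algorithm, also known as the priority queue algorithm. The implementation uses an array for which heap[k] <= heap[2*k+1] and heap[k] <= heap[2*k+2] for all k, counting elements from 0. For the sake of comparison, non-existing elements are considered to be infinite. The interesting property of a heap is that its smallest element is always the root, heap[0].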
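
The invariant can be checked directly. The helper is_min_heap below is a hypothetical function written for this note, not part of heapq, and the sample data is arbitrary.

```python
import heapq

def is_min_heap(heap):
    """Return True if heap[k] <= heap[2*k+1] and heap[k] <= heap[2*k+2] hold."""
    n = len(heap)
    for k in range(n):
        for child in (2 * k + 1, 2 * k + 2):
            # Children past the end of the list count as infinitely large.
            if child < n and heap[k] > heap[child]:
                return False
    return True

data = [5, 3, 8, 1, 9, 2]
print(is_min_heap(data))     # False
heapq.heapify(data)
print(is_min_heap(data))     # True
print(data[0] == min(data))  # True
```

Note that a heap is only partially ordered: the invariant relates each parent to its children, so the list as a whole is usually not sorted.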

Usage

import heapq

nums = [1, 8, 2, 23, 7, -4, 18, 23, 42, 37, 2]
# heapq.heapify(x): transform list x into a heap, in place
heapq.heapify(nums)
print(nums) # [-4, 2, 1, 23, 7, 2, 18, 23, 42, 37, 8]

# heapq.heappop(heap): pop and return the smallest item from the heap
x = heapq.heappop(nums)
print(x) # -4
x = heapq.heappop(nums)
print(x) # 1
print(nums) # [2, 2, 8, 23, 7, 37, 18, 23, 42]

# heapq.heappush(heap, item): push item onto the heap, maintaining the heap invariant
heapq.heappush(nums, -8)
print(nums) # [-8, 2, 8, 23, 2, 37, 18, 23, 42, 7]

# heapq.nlargest(n, iterable, key=None): return the n largest elements as a list, largest first
print("3 largest elements:", heapq.nlargest(3, nums))
# 3 largest elements: [42, 37, 23]

# heapq.nsmallest(n, iterable, key=None): return the n smallest elements as a list, smallest first
print("3 smallest elements:", heapq.nsmallest(3, nums))
# 3 smallest elements: [-8, 2, 2]

To find the single smallest or largest item, use min() or max().

To find the N smallest or largest items, use nlargest() and nsmallest() from heapq.

If N is close to the size of the collection, use sorted(items)[:N] or sorted(items)[-N:].
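
The three rules of thumb above can be sketched side by side. The portfolio records are made-up sample data; the key argument works the same way in all three approaches.

```python
import heapq
from operator import itemgetter

# Made-up records for illustration.
portfolio = [
    {'name': 'IBM', 'price': 91.1},
    {'name': 'AAPL', 'price': 543.22},
    {'name': 'FB', 'price': 21.09},
    {'name': 'HPQ', 'price': 31.75},
]
by_price = itemgetter('price')

# One item: min() / max()
cheapest = min(portfolio, key=by_price)
print(cheapest['name'])  # FB

# A few items: nsmallest() / nlargest(), which also accept key
top2 = heapq.nlargest(2, portfolio, key=by_price)
print([s['name'] for s in top2])  # ['AAPL', 'IBM']

# N close to the collection size: sort once and slice
n = 3
cheapest3 = sorted(portfolio, key=by_price)[:n]
print([s['name'] for s in cheapest3])  # ['FB', 'HPQ', 'IBM']
```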

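1. Example: building an ascending list

h = []
for num in nums:
    heapq.heappush(h, num)
# Popping repeatedly returns the items from smallest to largest
lis = [heapq.heappop(h) for _ in range(len(h))]
print("Ascending list:", lis) # Ascending list: [-8, 2, 2, 7, 8, 18, 23, 23, 37, 42]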

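2. Example: a priority queue

import heapq

class PriorityQueue:
    def __init__(self):
        self._queue = []
        self._index = 0

    def push(self, item, priority):
        # Negate priority so the highest priority pops first (heapq is a min-heap);
        # _index keeps items with equal priority in insertion order.
        heapq.heappush(self._queue, (-priority, self._index, item))
        self._index += 1

    def pop(self):
        # Return only the item, dropping the priority and index
        return heapq.heappop(self._queue)[-1]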

>>> class Item:
...     def __init__(self, name):
...         self.name = name
...     def __repr__(self):
...         return 'Item({!r})'.format(self.name)
...
>>> q = PriorityQueue()
>>> q.push(Item('foo'), 1)
>>> q.push(Item('bar'), 5)
>>> q.push(Item('spam'), 4)
>>> q.push(Item('grok'), 1)
>>> q.pop()
Item('bar')
>>> q.pop()
Item('spam')
>>> q.pop()
Item('foo')
>>> q.pop()
Item('grok')
>>>
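
In the session above, foo and grok share priority 1 and come out in insertion order. The sketch below shows why the index is part of the tuple: without it, a tie in priority makes Python compare the Item objects themselves, which define no ordering. The Item class is repeated so the block runs on its own.

```python
import heapq

class Item:
    def __init__(self, name):
        self.name = name
    def __repr__(self):
        return 'Item({!r})'.format(self.name)

# Without an index, equal priorities force a comparison of the Items,
# which support no ordering, so the second push fails.
heap = []
heapq.heappush(heap, (-1, Item('foo')))
try:
    heapq.heappush(heap, (-1, Item('grok')))
    failed = False
except TypeError:
    failed = True
print(failed)  # True

# With an increasing index as the second element, ties are settled
# before the Items are ever compared, in insertion order.
heap = []
for index, (name, priority) in enumerate([('foo', 1), ('bar', 5), ('grok', 1)]):
    heapq.heappush(heap, (-priority, index, Item(name)))
order = [heapq.heappop(heap)[-1].name for _ in range(3)]
print(order)  # ['bar', 'foo', 'grok']
```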

19. Python Data Structures 2 (work in progress...)
