Use the encode method to convert a str to bytes (the responses a crawler receives are transmitted as binary data).
In [9]: a = "你好"
In [10]: type(a)
Out[10]: str
In [11]: b = a.encode()
In [12]: b
Out[12]: b'\xe4\xbd\xa0\xe5\xa5\xbd'
In [13]: type(b)
Out[13]: bytes
Use the decode method to convert bytes back to str.
In [12]: b
Out[12]: b'\xe4\xbd\xa0\xe5\xa5\xbd'
In [13]: type(b)
Out[13]: bytes
In [14]: c = b.decode()
In [15]: c
Out[15]: '你好'
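In [16]: type(c)
Out[16]: str
Both encode and decode default to UTF-8. The same encoding must be used for encoding and decoding; otherwise the result is garbled text or a UnicodeDecodeError.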
In [17]: a = "你好"
In [18]: b = a.encode("gbk")
In [19]: b.decode()
---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-19-4169e64150f6> in <module>
----> 1 b.decode()

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc4 in position 0: invalid continuation byte
In [20]: b.decode("gbk")
Out[20]: '你好'
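The sessions above can be condensed into a short script. This is only a minimal sketch and not part of the original notes: the helper name decode_body and its list of candidate encodings are hypothetical, chosen to illustrate one way a crawler might handle a response body whose encoding it does not know in advance.

```python
text = "你好"

# encode() and decode() both default to UTF-8, so the round trip is lossless.
utf8_bytes = text.encode()
assert utf8_bytes == b"\xe4\xbd\xa0\xe5\xa5\xbd"
assert utf8_bytes.decode() == text

# Encoding with GBK and then decoding with the default UTF-8 is a mismatch.
gbk_bytes = text.encode("gbk")
try:
    gbk_bytes.decode()
except UnicodeDecodeError as exc:
    print("decode failed:", exc.reason)


def decode_body(body, encodings=("utf-8", "gbk")):
    """Hypothetical helper: try each candidate encoding in order."""
    for name in encodings:
        try:
            return body.decode(name)
        except UnicodeDecodeError:
            continue
    # Last resort (an assumption of this sketch): decode with the first
    # candidate and mark undecodable bytes with U+FFFD instead of raising.
    return body.decode(encodings[0], errors="replace")


print(decode_body(gbk_bytes))
print(decode_body(utf8_bytes))
```

The order of the candidates matters in this sketch. UTF-8 is tried first because it rejects most byte sequences that were not written as UTF-8, whereas GBK often accepts UTF-8 bytes without raising and silently yields the wrong characters.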