Installing impyla on Windows to Connect to Hive

Environment

Windows 10

Python 3.6

Installation steps

  1. pip install six

  2. pip install bitarray

  3. pip install pure-sasl

  4. pip install thrift

  5. pip install thriftpy

  6. pip install --no-deps thrift-sasl==0.2.1

  7. pip install impyla

    Note: pure-sasl, thrift_sasl, and thriftpy are all dependencies of impyla.

    Installing impyla raises the following error:

    error: Microsoft Visual C++ 14.0 is required. Get it with "Microsoft Visual C++ Build Tools": Downloads | IDE, Code, & Team Foundation Server | Visual Studio

    Solution:

    Install visualcppbuildtools_full. Download link for the toolkit:

    pan.baidu.com/s/1q2Nj41Xk… password: qbba

    Once the toolkit is installed, run pip install impyla again to complete the installation.
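After the pip steps finish, it can help to confirm that every package is actually importable before trying to connect. The following is a minimal sketch, not part of the original instructions: the helper name `find_missing` is hypothetical, and the pip-name to module-name mapping is an assumption based on the usual import names of these packages.

```python
import importlib.util

# Assumed mapping of pip package name -> top-level module name.
# Adjust it if your installed versions use different import names.
REQUIRED_MODULES = {
    "six": "six",
    "bitarray": "bitarray",
    "pure-sasl": "puresasl",
    "thrift": "thrift",
    "thriftpy": "thriftpy",
    "thrift-sasl": "thrift_sasl",
    "impyla": "impala",
}


def find_missing(modules):
    """Return the pip names whose top-level module cannot be located."""
    return [pip_name for pip_name, module_name in modules.items()
            if importlib.util.find_spec(module_name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All impyla dependencies can be imported.")
```

Any name the script reports as missing points at the pip step that needs to be repeated.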

Test

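# coding:utf-8
from impala.dbapi import connect


if __name__ == '__main__':
    # Replace the placeholder host, credentials, and database with real values.
    conn = connect(host='x.x.x.x', port=21050, user='xxx', password='xxxxxx',
                   database='xxxx', auth_mechanism="PLAIN")
    cur = conn.cursor()
    try:
        # List the tables in the selected database to confirm the connection works.
        cur.execute('show tables')
        print(cur.fetchall())
    finally:
        # Release the cursor and the connection even if the query fails.
        cur.close()
        conn.close()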
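The test above prints bare tuples from fetchall(). Because impyla cursors follow the Python DB-API 2.0, column names are available through cursor.description, so a small helper can pair them with each row. This is only a sketch: `rows_as_dicts` and `demo` are hypothetical names, and sqlite3 is used purely as a stand-in DB-API driver so the example runs without a Hive cluster.

```python
import sqlite3  # stand-in DB-API driver; not part of the impyla setup


def rows_as_dicts(cursor):
    """Pair each fetched row with the column names from cursor.description."""
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


def demo():
    """Run the helper against an in-memory table and return the result."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("CREATE TABLE t (id INTEGER, name TEXT)")
    cur.execute("INSERT INTO t VALUES (1, 'a'), (2, 'b')")
    cur.execute("SELECT id, name FROM t ORDER BY id")
    result = rows_as_dicts(cur)
    conn.close()
    return result


if __name__ == "__main__":
    print(demo())
```

With a real connection, the same helper would be called on the impyla cursor right after cur.execute(...). impyla also provides impala.util.as_pandas for loading a cursor into a pandas DataFrame if pandas is installed.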

Problems

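  1. Reading from HBase raises the exception "ThriftPy does not support generating module with path in protocol 'f'"

    The 'f' is the Windows drive letter of the .thrift file path, which is mistaken for a URL scheme. Edit the file Lib\site-packages\thriftpy\parser\parser.py, replacing the commented-out condition with the line below it: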

    # if url_scheme == '':
    if len(url_scheme) <= 1:
    
  2. impyla (0.14.0) ERROR - 'TSocket' object has no attribute 'isOpen'

    Cause: the installed thrift-sasl version is too new. Switch to version 0.2.1: pip install thrift-sasl==0.2.1

  3. thriftpy2.protocol.exc.TProtocolException: TProtocolException(type=4)

    Cause: the auth_mechanism setting is wrong. Change it to auth_mechanism="PLAIN"

  4. TypeError: can't concat str to bytes

    Edit __init__.py in the thrift_sasl package and add the following statements before line 94:

    if isinstance(body, str):
        body = body.encode()
    
  5. thrift.transport.TTransport.TTransportException: Could not start SASL: b'Error in sasl_client_start (-4) SASL(-4): no mechanism available: Unable to find a callback: 2'

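    This error is raised when connecting through pyhive on Windows. It may require changes to the corresponding configuration files, or the sasl library may simply not support Windows. Connecting through impyla instead is recommended.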

  6. thriftpy.parser.exc.ThriftParserError: ThriftPy does not support generating module with path in protocol 'f'

    Edit line 488 of \parser\parser.py in the thriftpy package, changing "if url_scheme == '':" to "if len(url_scheme) <= 1:". This is the same fix as in problem 1.
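The reason the parser.py change in problems 1 and 6 works can be reproduced with the standard library alone. The sketch below assumes that thriftpy derives url_scheme from urlparse, as the variable name suggests; the helper names and the file paths are hypothetical.

```python
from urllib.parse import urlparse


def path_scheme(path):
    """Return the URL scheme that urlparse extracts from a path string."""
    return urlparse(path).scheme


def is_local_path(path):
    """Relaxed check from the fix: a scheme of 0 or 1 characters is a local file."""
    return len(path_scheme(path)) <= 1


# Hypothetical Windows path on drive F:
windows_path = r"F:\project\service.thrift"

print(path_scheme(windows_path))                      # f
print(is_local_path(windows_path))                    # True
print(is_local_path("http://example.com/a.thrift"))   # False
```

A drive letter is always a single character, while real URL schemes such as http are longer, so the length test accepts Windows paths without treating remote URLs as local files.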