
hive-1. Installation

Hive installation

  1. Download and unpack the installation package
  2. Set the HIVE_HOME environment variable
  3. Configure the Hive conf files
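Step 2 could look roughly like the sketch below. The install path `/home/hadoop/apps/hive` is a hypothetical location, not one given in this article; substitute the directory where the archive was actually unpacked.

```shell
# Hypothetical install location; keep an already exported HIVE_HOME if there is one.
export HIVE_HOME="${HIVE_HOME:-/home/hadoop/apps/hive}"

# Put the hive and schematool launchers on the PATH.
export PATH="$HIVE_HOME/bin:$PATH"
```

Adding these two lines to the shell profile of the user that runs Hive makes them persist across logins.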

cp hive-env.sh.template hive-env.sh
cp hive-default.xml.template hive-site.xml

Set HADOOP_HOME in conf/hive-env.sh, then edit conf/hive-site.xml:

<!-- Metastore database connection settings -->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://192.168.1.178:3306/hive?createDatabaseIfNotExist=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
  <description>username to use against metastore database</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hive</value>
  <description>password to use against metastore database</description>
</property>
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://localhost:9083</value>
</property>
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/user/hive/warehouse</value>
</property>
<!-- Local scratch (cache) directories -->
<property>
  <name>hive.exec.local.scratchdir</name>
  <value>/home/hadoop/iotmp</value>
  <description>Local scratch space for Hive jobs</description>
</property>
<property>
  <name>hive.downloaded.resources.dir</name>
  <value>/home/hadoop/iotmp</value>
  <description>Temporary local directory for added resources in the remote file system.</description>
</property>
  4. Download mysql-connector-java-5.1.27-bin.jar and place it in the $HIVE_HOME/lib directory
  5. Run schematool -initSchema -dbType mysql to initialize the database
  6. Start the metastore

vim hive-site.xml

<property>
    <name>hive.metastore.uris</name>
    <value>thrift://localhost:9001</value>
</property>

hive --service metastore

  7. Run a test in hive

CREATE TABLE t_hive (a int, b int, c int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
cat <<EOF >/tmp/t_hive.txt
1    2    3
4    1    2
1    2    8
EOF
LOAD DATA LOCAL INPATH '/tmp/t_hive.txt' OVERWRITE INTO TABLE t_hive ;
[root@hd222 hadoop-2.7.5]# ./bin/hadoop dfs -cat /user/hive/warehouse/t_hive/test
1    2    3
4    1    2
1    2    8
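The table is declared with `FIELDS TERMINATED BY '\t'`, so the data file needs real tab characters between columns. The spaces shown in the listing may be tabs lost in formatting; that is an assumption. A minimal sketch that writes the same three rows with explicit tabs and checks the field count before loading:

```shell
# Write three rows with real tab characters between the columns.
printf '1\t2\t3\n4\t1\t2\n1\t2\t8\n' > /tmp/t_hive.txt

# Count the rows that do not have exactly three tab-separated fields.
bad_rows=$(awk -F '\t' 'NF != 3 { n++ } END { print n + 0 }' /tmp/t_hive.txt)
echo "rows with wrong field count: $bad_rows"
```

If the count is not zero, Hive would most likely load those rows with NULL columns rather than reject them.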
  8. hiveserver2
nohup hive --service hiveserver2 &
kill -9 pid
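Looking up the PID by hand and using `kill -9` works, but recording the PID at start-up allows a cleaner stop. The sketch below is one possible approach, not part of Hive: `start_bg` and `stop_bg` are hypothetical helper names, and it assumes the launcher script eventually execs the server process so that the recorded PID is the one to signal.

```shell
# start_bg PIDFILE LOGFILE CMD...: run CMD in the background and record its PID.
start_bg() {
  pidfile=$1
  logfile=$2
  shift 2
  nohup "$@" > "$logfile" 2>&1 &
  echo $! > "$pidfile"
}

# stop_bg PIDFILE: send SIGTERM to the recorded PID and remove the PID file.
stop_bg() {
  pidfile=$1
  kill "$(cat "$pidfile")" && rm -f "$pidfile"
}

# Usage (requires hive on the PATH):
#   start_bg /tmp/hiveserver2.pid /tmp/hiveserver2.log hive --service hiveserver2
#   stop_bg /tmp/hiveserver2.pid
```

A plain `kill` sends SIGTERM, which gives the server a chance to shut down; `kill -9` remains available as a fallback if the process does not exit.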

Tez installation

  1. Download

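wget mirrors.tuna.tsinghua.edu.cn/apache/tez/…

  2. Edit pom.xml, then compile and package

In the tez directory, edit pom.xml and first change the hadoop version to the version of Hadoop you are running. (Optional) tez-ui depends on node.js and bower, and bower install takes a long time and reports errors, so you can search pom.xml for the tez-ui module and comment it out.

mvn clean package -DskipTests=true -Dmaven.javadoc.skip=true -Phadoop28 -Dhadoop.version=3.1.2

The build output is written to the tez-dist/target directory.

  3. Deploy and configure

mkdir -p tez-0.9.1
tar -zxvf tez-0.9.1.tar.gz -C tez-0.9.1
hdfs dfs -put ./tez-0.9.1/* /apps/tez/
mkdir tez-0.9.1/conf
vim tez-0.9.1/conf/tez-site.xml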

<configuration>
  <property>
    <name>tez.lib.uris</name>
    <value>hdfs://ns1/apps/tez/,hdfs://ns1/apps/tez/lib/</value>
  </property>
</configuration>

vim hive/conf/hive-env.sh

export TEZ_HOME=/home/hadoop/apps/tez-0.9.1
export TEZ_CONF_DIR=/home/hadoop/apps/tez-0.9.1/conf
export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:${TEZ_CONF_DIR}:${TEZ_HOME}/*:${TEZ_HOME}/lib/*
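Before restarting Hive, it can help to confirm that the paths referenced in hive-env.sh actually exist. The following is a minimal sketch; `check_tez_layout` is a hypothetical helper name, and the default paths are simply the ones from the lines above.

```shell
# Paths taken from the hive-env.sh lines above; adjust them to your layout.
TEZ_HOME="${TEZ_HOME:-/home/hadoop/apps/tez-0.9.1}"
TEZ_CONF_DIR="${TEZ_CONF_DIR:-$TEZ_HOME/conf}"

# Report whether each path that HADOOP_CLASSPATH relies on is present.
check_tez_layout() {
  status=0
  for path in "$TEZ_HOME" "$TEZ_HOME/lib" "$TEZ_CONF_DIR/tez-site.xml"; do
    if [ -e "$path" ]; then
      echo "ok       $path"
    else
      echo "missing  $path"
      status=1
    fi
  done
  return $status
}

check_tez_layout || echo "fix the paths above before starting Hive"
```

This only checks the local files; it does not verify that the jars uploaded to /apps/tez/ on HDFS match tez.lib.uris.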

Benchmarking

  1. Download and install
wget https://github.com/hortonworks/hive-testbench/archive/hive14.zip
unzip hive14.zip
cd hive-testbench-hive14/
./tpcds-build.sh

  2. Generate test data and query scripts

export FORMAT=parquet
./tpcds-setup.sh 1000

The argument to tpcds-setup.sh is the data size in GB. Set FORMAT to choose the storage format, for example orc or parquet.
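The database created by the setup script appears to be named after the format and the scale. This pattern is inferred from the name `tpcds_bin_partitioned_parquet_10` used in the next step and is an assumption about the script, so check the script's output for the actual name. A small sketch that derives the expected name:

```shell
# Values passed to tpcds-setup.sh; 10 matches the database name used in the next step.
FORMAT=parquet
SCALE=10

# Assumed naming pattern: tpcds_bin_partitioned_<format>_<scale>.
DATABASE="tpcds_bin_partitioned_${FORMAT}_${SCALE}"
echo "$DATABASE"
```

Under this assumption, generating data at scale 1000 would produce a database ending in `_1000`, so the `use` statement has to match the scale that was actually generated.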

  3. Run a test

Test SQL script directory: sample-queries-tpcds

cd sample-queries-tpcds
hive> use tpcds_bin_partitioned_parquet_10;
hive> source query12.sql;
  4. Batch testing

Adjust the Hive settings as needed: sample-queries-tpcds/testbench.settings
Adjust the test script (perl) as needed: runSuite.pl

perl runSuiteCommon.pl
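As an alternative to the perl suite, a plain shell loop can run every query file and print a rough wall-clock time for each. This is a sketch, not part of hive-testbench: `run_queries` is a hypothetical helper, and the database name in the usage comment is the one from step 3.

```shell
# run_queries RUNNER DIR: run RUNNER on each query*.sql in DIR and print
# the file name, ok/failed, and elapsed seconds, separated by tabs.
run_queries() {
  runner=$1
  dir=$2
  for q in "$dir"/query*.sql; do
    [ -e "$q" ] || continue
    start=$(date +%s)
    # $runner is left unquoted on purpose so its options split into words.
    if $runner "$q" > /dev/null 2>&1; then
      status=ok
    else
      status=failed
    fi
    end=$(date +%s)
    printf '%s\t%s\t%ss\n' "$(basename "$q")" "$status" "$((end - start))"
  done
}

# Usage from inside sample-queries-tpcds (requires hive on the PATH):
#   run_queries "hive --database tpcds_bin_partitioned_parquet_10 -i testbench.settings -f" .
```

The timings include Hive start-up for every query, so they are only useful for coarse comparisons between formats or engines.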

Problems encountered

  1. note: tpcds_kit.zip may be a plain executable, not an archive

Solution: run sh -x ./tpcds-build.sh to trace the build:

Building TPC-DS Data Generator
+ cd tpcds-gen
+ make
test -d target/tools/ || (cd target; unzip tpcds_kit.zip)
Archive:  tpcds_kit.zip
  End-of-central-directory signature not found.  Either this file is not
  a zipfile, or it constitutes one disk of a multi-part archive.  In the
  latter case the central directory and zipfile comment will be found on
  the last disk(s) of this archive.
note:  tpcds_kit.zip may be a plain executable, not an archive
unzip:  cannot find zipfile directory in one of tpcds_kit.zip or
        tpcds_kit.zip.zip, and cannot find tpcds_kit.zip.ZIP, period.

vim tpcds-gen/Makefile

target/tpcds_kit.zip: tpcds_kit.zip
        mkdir -p target/
        cp tpcds_kit.zip target/tpcds_kit.zip

tpcds_kit.zip:
        curl http://dev.hortonworks.com.s3.amazonaws.com/hive-testbench/tpcds/README
        curl --output tpcds_kit.zip http://dev.hortonworks.com.s3.amazonaws.com/hive-testbench/tpcds/TPCDS_Tools.zip

Download TPCDS_Tools.zip manually, save it as tpcds-gen/tpcds_kit.zip, and delete the curl lines for tpcds_kit.zip from the Makefile.
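The unzip error above usually means the downloaded file is not a zip archive at all, for example an error page saved under the zip name. A quick check of the file signature can confirm this before rebuilding; `is_zip` is a hypothetical helper, and it only looks at the first two bytes rather than validating the whole archive.

```shell
# is_zip FILE: succeed if FILE starts with the "PK" zip signature.
is_zip() {
  [ "$(head -c 2 "$1" 2>/dev/null)" = "PK" ]
}

# Check the manually downloaded archive, if it is present in this directory.
if [ -f tpcds_kit.zip ] && ! is_zip tpcds_kit.zip; then
  echo "tpcds_kit.zip does not look like a zip archive; download it again"
fi
```

Running `unzip -t tpcds_kit.zip` afterwards gives a fuller integrity check.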
