本文已参与「新人创作礼」活动,一起开启掘金创作之路。
序列在hive里不支持
table_src是我们经过业务需求处理的到的中间表数据,现在我们需要为table_src新增一列自增序列字段auto_increment_id,并将最终数据保存到table_dest中。
insert into table table_dest\
select (row_number() over(order by 1) + dest.max_id) auto_increment_id, src.* from table_src src cross join (select max(auto_increment_id) max_id from table_dest) dest;
dataos查询超过60s,后没有返回值。dataos程序测试时,前台显示一直在执行中,实际后台已经执行完了
dataos nginx 65s 改为300 重启nginx
重启 hbase regionserver 启动不了,掉线。
使用命令行 yum remove 重装
413 request entity too large
原因:
nginx 默认的接受参数大小是 1M,但是我们上传文件大小是已经超过1M,
解决方案
找到自己主机的nginx.conf配置文件,打开
在http{}中加入 client_max_body_size 10m;
重启nginx
/etc/init.d/nginx restart
hive-staging文件产生的原因和解决方案
原因
select或者insert overwrite等sql到hive时,会产生该目录,用于临时存放执行结果,比如insert overwrite会将结果暂存到该目录下,待任务结束,将结果复制到hive表中。
解决
默认配置:
<property>
<name>hive.exec.stagingdir</name>
<value>.hive-staging</value>
</property>
*修改后:**\
<property>
<name>hive.exec.stagingdir</name>
<value>/tmp/hive/.hive-staging</value>
</property>
设置项为 tez.staging-dir ,默认值为 /tmp/tez/staging
问题
ERROR : FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex re-running, vertexName=Map 5, vertexId=vertex_1637642623048_49568_1_00Vertex re-running, vertexName=Map 5, vertexId=vertex_1637642623048_49568_1_00Vertex failed, vertexName=Reducer 3, vertexId=vertex_1637642623048_49568_1_02, diagnostics=[Task failed, taskId=task_1637642623048_49568_1_02_000046, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : attempt_1637642623048_49568_1_02_000046_0:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:304)
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:318)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
... 16 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:378)
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:294)
... 18 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: Could not get block locations. Source file "/tmp/staging/.hive-staging_hive_2021-11-27_19-52-13_665_7246302871668790688-4691/_task_tmp.-ext-10000/acct_month=201410/eday_id=20211123/_tmp.000046_0" - Aborting...block==null
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:1041)
at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:927)
at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:927)
at org.apache.hadoop.hive.ql.exec.FilterOperator.process(FilterOperator.java:126)
at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:927)
at org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.handleOutputRows(PTFOperator.java:337)
at org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.processRow(PTFOperator.java:325)
at org.apache.hadoop.hive.ql.exec.PTFOperator.process(PTFOperator.java:139)
at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:927)
at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:363)
... 19 more
解决
解决:修改DataNode max data transfer threads = 8192