hive-4. Metastore source code

Technologies

Technologies used: DataNucleus / Thrift

hive --service metastore

/usr/java/jdk1.8.0_131/bin/java -Dproc_jar -Dproc_metastore -XX:+UseParallelGC -agentlib:jdwp=transport=dt_socket,server=y,address=8000,suspend=y -Dlog4j.configurationFile=hive-log4j2.properties -Djava.util.logging.config.file=/soft/hive-3.1.2/conf/parquet-logging.properties -Dyarn.log.dir=/soft/hadoop-3.1.3/logs -Dyarn.log.file=hadoop.log -Dyarn.home.dir=/soft/hadoop-3.1.3 -Dyarn.root.logger=INFO,console -Djava.library.path=/soft/hadoop-3.1.3/lib/native -Xmx256m -Dhadoop.log.dir=/soft/hadoop-3.1.3/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/soft/hadoop-3.1.3 -Dhadoop.id.str=root -Dhadoop.root.logger=INFO,console -Dhadoop.policy.file=hadoop-policy.xml -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.util.RunJar /soft/hive-3.1.2/lib/hive-metastore-3.1.2.jar org.apache.hadoop.hive.metastore.HiveMetaStore

Program entry point

org.apache.hadoop.hive.metastore.HiveMetaStore:

1. Load the configuration and parse the command-line arguments; the apache-commons-cli package handles the command-line parsing

final Configuration conf = MetastoreConf.newMetastoreConf();
HiveMetastoreCli cli = new HiveMetastoreCli(conf);
cli.parse(args);

2. Call startMetaStore to start the Thrift server

Lock startLock = new ReentrantLock();
Condition startCondition = startLock.newCondition();
AtomicBoolean startedServing = new AtomicBoolean();
// Once the metastore has started successfully, start the transaction-related worker threads (CompactorThread)
startMetaStoreThreads(conf, startLock, startCondition, startedServing);
startMetaStore(cli.getPort(), HadoopThriftAuthBridge.getBridge(), conf, startLock,
  startCondition, startedServing);
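The `startLock` / `startCondition` / `startedServing` trio above is a standard "wait until the server is up" handshake. Below is a minimal sketch of that pattern using only the JDK. The class name `StartupSignalSketch`, the method `runDemo`, and the thread name are hypothetical and not taken from Hive; the "server" is just the main thread flipping the flag.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch; class and method names are not taken from Hive.
class StartupSignalSketch {

  // Starts a background thread that blocks until the "server" reports it is serving,
  // then runs its follow-up work. Returns true if that work ran within the timeout.
  static boolean runDemo() {
    Lock startLock = new ReentrantLock();
    Condition startCondition = startLock.newCondition();
    AtomicBoolean startedServing = new AtomicBoolean(false);
    AtomicBoolean servingSeenByWorker = new AtomicBoolean(false);
    CountDownLatch workerDone = new CountDownLatch(1);

    Thread worker = new Thread(() -> {
      startLock.lock();
      try {
        // The loop guards against spurious wakeups and against a signal sent early.
        while (!startedServing.get()) {
          startCondition.await();
        }
        // Stand-in for starting the background worker threads.
        servingSeenByWorker.set(startedServing.get());
        workerDone.countDown();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      } finally {
        startLock.unlock();
      }
    }, "background-tasks");
    worker.setDaemon(true);
    worker.start();

    // Stand-in for "the Thrift server is now serving".
    startLock.lock();
    try {
      startedServing.set(true);
      startCondition.signalAll();
    } finally {
      startLock.unlock();
    }

    try {
      return workerDone.await(5, TimeUnit.SECONDS) && servingSeenByWorker.get();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("background work ran after startup: " + runDemo());
  }
}
```

Checking the flag inside a `while` loop matters: if the signal fires before the worker takes the lock, the worker sees the flag already set and never waits.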

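How the Thrift server is started depends on the security settings, for example whether Kerberos is enabled, and a matching server type is created. The following covers the startup path with SSL disabled: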

// Server-side socket: serverSocket = new TServerSocket(new InetSocketAddress(9083));
TThreadPoolServer.Args args = new TThreadPoolServer.Args(serverSocket)
//HMSHandler baseHandler = new HiveMetaStore.HMSHandler("new db based metaserver", conf, false); 
//IHMSHandler handler = newRetryingHMSHandler(baseHandler, conf);
//processor = new TSetIpAddressProcessor<>(handler);
  .processor(processor)
  .transportFactory(transFactory)
  .protocolFactory(protocolFactory)
  .inputProtocolFactory(inputProtoFactory)
  .minWorkerThreads(minWorkerThreads)
  .maxWorkerThreads(maxWorkerThreads);

TServer tServer = new TThreadPoolServer(args);

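Anyone familiar with Thrift knows that the core business logic lives in the handler. Here the handler is wrapped in a retry layer, and the core HMSHandler serves client requests once its initialization method has run: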

synchronized (HMSHandler.class) {
    if (currentUrl == null || !currentUrl.equals(MetaStoreInit.getConnectionURL(conf))) {
      // Create the default database (default) and the default catalog
      createDefaultDB();
      // Create the default role (admin)
      createDefaultRoles();
      addAdminUsers();
      currentUrl = MetaStoreInit.getConnectionURL(conf);
    }
}
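The retry layer wrapped around the handler can be illustrated with a JDK dynamic proxy. This is only a sketch under stated assumptions: the `MetaHandler` interface, `FlakyHandler`, and `withRetries` are hypothetical names, the proxy retries on every exception, and it does not wait between attempts. Hive's own RetryingHMSHandler may well differ on each of those points.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names come from Hive.
class RetryingHandlerSketch {

  // Stand-in for the service interface that the Thrift processor would call.
  interface MetaHandler {
    List<String> getDatabases();
  }

  // Fails a configurable number of times before succeeding, to simulate a flaky backend.
  static class FlakyHandler implements MetaHandler {
    private int failuresLeft;
    int calls = 0;

    FlakyHandler(int failures) {
      this.failuresLeft = failures;
    }

    @Override
    public List<String> getDatabases() {
      calls++;
      if (failuresLeft-- > 0) {
        throw new IllegalStateException("transient failure");
      }
      return Arrays.asList("default");
    }
  }

  // Wraps a handler in a dynamic proxy that retries a failed call up to maxRetries times.
  static MetaHandler withRetries(MetaHandler base, int maxRetries) {
    InvocationHandler h = (proxy, method, args) -> {
      int attempt = 0;
      while (true) {
        try {
          return method.invoke(base, args);
        } catch (InvocationTargetException e) {
          if (attempt++ >= maxRetries) {
            throw e.getCause(); // retries exhausted: surface the original exception
          }
        }
      }
    };
    return (MetaHandler) Proxy.newProxyInstance(
        MetaHandler.class.getClassLoader(), new Class<?>[] {MetaHandler.class}, h);
  }

  public static void main(String[] args) {
    FlakyHandler flaky = new FlakyHandler(2);
    MetaHandler handler = withRetries(flaky, 3);
    System.out.println(handler.getDatabases() + " after " + flaky.calls + " calls");
  }
}
```

Because the proxy implements the same interface as the real handler, callers such as the Thrift processor need no changes when retries are added.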

During createDefaultDB, a RawStoreProxy is created as a proxy around ObjectStore. ObjectStore is the data-access layer through which the metastore operates on the backing MySQL database.

LOG.info("ObjectStore, initialize called");
prop = dsProps;
// Get the DataNucleus PersistenceManager
pm = getPersistenceManager();
try {
  String productName = MetaStoreDirectSql.getProductName(pm);
  sqlGenerator = new SQLGenerator(DatabaseProduct.determineDatabaseProduct(productName), conf);
} catch (SQLException e) {
  LOG.error("error trying to figure out the database product", e);
  throw new RuntimeException(e);
}
isInitialized = pm != null;
if (isInitialized) {
  dbType = determineDatabaseProduct();
  expressionProxy = createExpressionProxy(conf);
  if (MetastoreConf.getBoolVar(getConf(), ConfVars.TRY_DIRECT_SQL)) {
    String schema = prop.getProperty("javax.jdo.mapping.Schema");
    schema = org.apache.commons.lang.StringUtils.defaultIfBlank(schema, null);
    //As of now, only the partition retrieval is done this way to improve job startup time;
    directSql = new MetaStoreDirectSql(pm, conf, schema);
  }
}
LOG.debug("RawStore: {}, with PersistenceManager: {}" +
        " created in the thread with id: {}", this, pm, Thread.currentThread().getId());

3. Communicating with the client

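Once the service is up, a Hive client that connects and runs SQL goes through RetryingHMSHandler, which then calls the corresponding interface method on HiveMetaStore.HMSHandler. For example:

hiveclient - show databases -> HiveMetaStore.HMSHandler.get_databases -> ObjectStore.getAllDatabases

hiveclient - create table test(id int,name string); -> check that the database exists, validate the columns, check whether the table already exists, then create table, table_ds, table_pri, and so on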
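The order of checks in the create-table example can be sketched with an in-memory stand-in for the store. Everything here is hypothetical (`CreateTableFlowSketch`, `InMemoryStore`, `createTable`, and the exception types). It mirrors only the sequence described above and not HMSHandler's actual code, and it records a table name rather than writing any metadata rows.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these names come from Hive.
class CreateTableFlowSketch {

  // In-memory stand-in for the metadata store: database name -> table names.
  static final class InMemoryStore {
    final Map<String, Set<String>> tablesByDb = new HashMap<>();

    InMemoryStore() {
      tablesByDb.put("default", new HashSet<>());
    }
  }

  // Runs the checks in the order described above, then records the table.
  static void createTable(InMemoryStore store, String db, String table,
                          Map<String, String> columns) {
    // 1. The database must exist.
    Set<String> tables = store.tablesByDb.get(db);
    if (tables == null) {
      throw new IllegalArgumentException("database does not exist: " + db);
    }
    // 2. The columns must be valid (here: at least one, with non-blank name and type).
    if (columns.isEmpty()) {
      throw new IllegalArgumentException("table needs at least one column");
    }
    for (Map.Entry<String, String> col : columns.entrySet()) {
      if (col.getKey().trim().isEmpty() || col.getValue().trim().isEmpty()) {
        throw new IllegalArgumentException("invalid column: " + col);
      }
    }
    // 3. The table must not exist yet.
    if (tables.contains(table)) {
      throw new IllegalStateException("table already exists: " + table);
    }
    // 4. Record the table.
    tables.add(table);
  }

  public static void main(String[] args) {
    InMemoryStore store = new InMemoryStore();
    Map<String, String> cols = new LinkedHashMap<>();
    cols.put("id", "int");
    cols.put("name", "string");
    createTable(store, "default", "test", cols);
    System.out.println(store.tablesByDb.get("default").contains("test"));
  }
}
```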
