Topology: From Submission to Running
The Topology Submission Flow
When we submit a topology to the cluster via StormSubmitter.submitTopologyWithProgressBar(), what actually happens between submission and the topology starting to run? On the caller side, we first build the topology and then hand it to StormSubmitter.
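As context, here is a minimal sketch of what the caller side typically looks like. WordSpout and CountBolt are hypothetical placeholder classes, and the topology name and parallelism values are illustrative only; none of this is part of the Storm source analyzed below.

import java.util.HashMap;
import java.util.Map;

import org.apache.storm.StormSubmitter;
import org.apache.storm.generated.StormTopology;
import org.apache.storm.topology.TopologyBuilder;

public class WordCountSubmitter {
    public static void main(String[] args) throws Exception {
        // Build the topology: one spout feeding one bolt (both hypothetical classes).
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("word-spout", new WordSpout(), 2);
        builder.setBolt("count-bolt", new CountBolt(), 4).shuffleGrouping("word-spout");
        StormTopology topology = builder.createTopology();

        // Topology-specific configuration; it is merged later with storm.yaml and defaults.yaml.
        Map<String, Object> topoConf = new HashMap<>();
        topoConf.put("topology.workers", 2);

        // Submit with a progress bar; passing null SubmitOptions leaves the topology ACTIVE on start.
        StormSubmitter.submitTopologyWithProgressBar("word-count", topoConf, topology, null);
    }
}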
Now let's step into submitTopologyWithProgressBar() itself. Its source code is as follows:
public static void submitTopologyWithProgressBar(String name, Map<String, Object> topoConf, StormTopology topology,
                                                 SubmitOptions opts) throws AlreadyAliveException, InvalidTopologyException, AuthorizationException {
    // Show a progress bar so we can tell the upload is not stuck, especially on slow connections.
    submitTopology(name, topoConf, topology, opts, new StormSubmitter.ProgressListener() {
        @Override
        public void onStart(String srcFile, String targetFile, long totalBytes) {
            System.out.printf("Start uploading file '%s' to '%s' (%d bytes)\n", srcFile, targetFile, totalBytes);
        }

        @Override
        public void onProgress(String srcFile, String targetFile, long bytesUploaded, long totalBytes) {
            int length = 50;
            int p = (int) ((length * bytesUploaded) / totalBytes);
            String progress = StringUtils.repeat("=", p);
            String todo = StringUtils.repeat(" ", length - p);
            System.out.printf("\r[%s%s] %d / %d", progress, todo, bytesUploaded, totalBytes);
        }

        @Override
        public void onCompleted(String srcFile, String targetFile, long totalBytes) {
            System.out.printf("\nFile '%s' uploaded to '%s' (%d bytes)\n", srcFile, targetFile, totalBytes);
        }
    });
}
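To see what onProgress() actually prints, here is a small standalone sketch of the same bar arithmetic. The 100 MB jar size and 25 MB uploaded are made-up values, and it assumes commons-lang3's StringUtils on the classpath purely to mirror the code above.

import org.apache.commons.lang3.StringUtils;

public class ProgressBarDemo {
    public static void main(String[] args) {
        long totalBytes = 100_000_000L;   // pretend the jar is 100 MB
        long bytesUploaded = 25_000_000L; // 25 MB uploaded so far
        int length = 50;
        // Same arithmetic as onProgress(): 25% of a 50-character bar -> 12 '=' characters.
        int p = (int) ((length * bytesUploaded) / totalBytes);
        String progress = StringUtils.repeat("=", p);
        String todo = StringUtils.repeat(" ", length - p);
        System.out.printf("\r[%s%s] %d / %d", progress, todo, bytesUploaded, totalBytes);
    }
}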
This method mainly displays a progress bar so that we can follow the submission progress. The real work happens in submitTopologyAs(), which submitTopology() calls:
public static void submitTopologyAs(String name, Map<String, Object> topoConf, StormTopology topology, SubmitOptions opts,
                                    ProgressListener progressListener, String asUser)
    throws AlreadyAliveException, InvalidTopologyException, AuthorizationException, IllegalArgumentException {
    // Validate the topology name first; if it is invalid, nothing else runs.
    Utils.validateTopologyName(name);
    // Validate the topology config format; it must be JSON-serializable.
    if (!Utils.isValidConf(topoConf)) {
        throw new IllegalArgumentException("Storm conf is not valid. Must be json-serializable");
    }
    // Validate the number of spouts; a topology without any spout is rejected immediately.
    if (topology.get_spouts_size() == 0) {
        throw new WrappedInvalidTopologyException("Topology " + name + " does not have any spout");
    }
    // Note the config precedence: defaults.yaml < storm.yaml < topology-specific config < external (command-line) config.
    // Topology-level config
    topoConf = new HashMap<>(topoConf);
    // External (command-line) config
    topoConf.putAll(Utils.readCommandLineOpts());
    // storm.yaml (on top of defaults.yaml)
    Map<String, Object> conf = Utils.readStormConfig();
    conf.putAll(topoConf);
    // Load ZooKeeper authentication info
    topoConf.putAll(prepareZookeeperAuthentication(conf));
    // Validate the merged config
    validateConfs(conf);
    // Verify the topology is acyclic
    try {
        Utils.validateCycleFree(topology, name);
    } catch (InvalidTopologyException ex) {
        LOG.warn("", ex);
    }
    Map<String, String> passedCreds = new HashMap<>();
    if (opts != null) {
        Credentials tmpCreds = opts.get_creds();
        if (tmpCreds != null) {
            passedCreds = tmpCreds.get_creds();
        }
    }
    // Populate the credentials; if no opts were passed in, create them so the credentials can be attached.
    Map<String, String> fullCreds = populateCredentials(conf, passedCreds);
    if (!fullCreds.isEmpty()) {
        if (opts == null) {
            opts = new SubmitOptions(TopologyInitialStatus.ACTIVE);
        }
        opts.set_creds(new Credentials(fullCreds));
    }
    try {
        String serConf = JSONValue.toJSONString(topoConf);
        // Create a Nimbus client from the config
        try (NimbusClient client = NimbusClient.getConfiguredClientAs(conf, asUser)) {
            // Check whether the topology name is allowed and not already in use
            if (!isTopologyNameAllowed(name, client)) {
                throw new RuntimeException("Topology name " + name + " is either not allowed or it already exists on the cluster");
            }
            // Upload dependencies (only meaningful in distributed mode)
            List<String> jarsBlobKeys = Collections.emptyList();
            List<String> artifactsBlobKeys;
            DependencyUploader uploader = new DependencyUploader();
            try {
                uploader.init();
                jarsBlobKeys = uploadDependencyJarsToBlobStore(uploader);
                artifactsBlobKeys = uploadDependencyArtifactsToBlobStore(uploader);
            } catch (Throwable e) {
                // remove uploaded jars blobs, not artifacts since they're shared across the cluster
                uploader.deleteBlobs(jarsBlobKeys);
                uploader.shutdown();
                throw e;
            }
            try {
                setDependencyBlobsToTopology(topology, jarsBlobKeys, artifactsBlobKeys);
                submitTopologyInDistributeMode(name, topology, opts, progressListener, asUser, conf, serConf, client);
            } catch (AlreadyAliveException | InvalidTopologyException | AuthorizationException e) {
                // remove uploaded jars blobs, not artifacts since they're shared across the cluster
                // Note that we don't handle TException to delete jars blobs
                // because it's safer to leave some blobs instead of topology not running
                uploader.deleteBlobs(jarsBlobKeys);
                throw e;
            } finally {
                uploader.shutdown();
            }
        }
    } catch (TException e) {
        throw new RuntimeException(e);
    }
    invokeSubmitterHook(name, asUser, conf, topology);
}
As shown above, submitTopologyAs() mainly validates the topology name, the topology itself, and the incoming config, merges configuration from its different sources, and then creates a Nimbus client from the validated config. The config merge order described in the comments can be illustrated with the small sketch below.
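A minimal sketch of that merge order. The maps, keys, and values are illustrative stand-ins for what Utils.readStormConfig() and Utils.readCommandLineOpts() would return; only the putAll() ordering matters here.

import java.util.HashMap;
import java.util.Map;

public class ConfMergeDemo {
    public static void main(String[] args) {
        // Stand-in for defaults.yaml + storm.yaml as read by Utils.readStormConfig().
        Map<String, Object> conf = new HashMap<>();
        conf.put("topology.workers", 1);          // from defaults.yaml
        conf.put("nimbus.seeds", "nimbus-host");  // from storm.yaml

        // Stand-in for the topology-specific config passed to submitTopologyAs().
        Map<String, Object> topoConf = new HashMap<>();
        topoConf.put("topology.workers", 4);

        // Stand-in for command-line overrides read by Utils.readCommandLineOpts().
        Map<String, Object> cmdLine = new HashMap<>();
        cmdLine.put("topology.debug", true);
        topoConf.putAll(cmdLine);

        // Later putAll() calls win, so the effective precedence is:
        // defaults.yaml < storm.yaml < topology config < command-line config.
        conf.putAll(topoConf);
        System.out.println(conf.get("topology.workers") + ", " + conf.get("topology.debug")); // 4, true
    }
}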
The core method is submitTopologyInDistributeMode(); let's take a closer look at it.
private static void submitTopologyInDistributeMode(String name, StormTopology topology, SubmitOptions opts,
                                                   ProgressListener progressListener, String asUser, Map<String, Object> conf,
                                                   String serConf, NimbusClient client) throws TException {
    try {
        // Upload the jar to Nimbus via a file stream
        String jar = submitJarAs(conf, System.getProperty("storm.jar"), progressListener, client);
        LOG.info("Submitting topology {} in distributed mode with conf {}", name, serConf);
        Utils.addVersions(topology);
        // Call submitTopologyWithOpts to formally submit the topology to Nimbus
        if (opts != null) {
            client.getClient().submitTopologyWithOpts(name, jar, serConf, topology, opts);
        } else {
            // this is for backwards compatibility
            client.getClient().submitTopology(name, jar, serConf, topology);
        }
        LOG.info("Finished submitting topology: {}", name);
    } catch (InvalidTopologyException e) {
        LOG.error("Topology submission exception: {}", e.get_msg());
        throw e;
    } catch (AlreadyAliveException e) {
        LOG.error("Topology already alive exception", e);
        throw e;
    }
}
This method mainly streams the topology jar up to Nimbus via submitJarAs(), then submits the topology through the Nimbus Thrift client (submitTopologyWithOpts() or submitTopology()).
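As a rough illustration of what "uploading as a stream" means, here is a minimal sketch built on Nimbus's Thrift upload calls (beginFileUpload(), uploadChunk(), finishFileUpload()). It is a simplified stand-in for the real submitJarAs(), which also handles progress callbacks and configuration-driven buffer sizes; the 100 KB chunk size here is just an assumption.

import java.io.FileInputStream;
import java.nio.ByteBuffer;
import java.util.Map;

import org.apache.storm.generated.Nimbus;
import org.apache.storm.utils.NimbusClient;

public class JarUploadSketch {
    // Simplified jar upload: read the local jar in chunks and push each chunk to Nimbus.
    public static String uploadJar(Map<String, Object> conf, String localJar) throws Exception {
        try (NimbusClient nimbus = NimbusClient.getConfiguredClient(conf);
             FileInputStream in = new FileInputStream(localJar)) {
            Nimbus.Iface client = nimbus.getClient();
            // Ask Nimbus where to store the upload; it returns a server-side location token.
            String uploadLocation = client.beginFileUpload();
            byte[] buffer = new byte[100 * 1024]; // assumed 100 KB chunk size
            int read;
            while ((read = in.read(buffer)) > 0) {
                client.uploadChunk(uploadLocation, ByteBuffer.wrap(buffer, 0, read));
            }
            client.finishFileUpload(uploadLocation);
            // This location is what submitTopologyWithOpts() receives as the uploaded jar path.
            return uploadLocation;
        }
    }
}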
That is the entire flow of submitting a topology to Nimbus.
This post is just a beginner's own study notes and only scratches the surface; if anything is wrong, corrections are very welcome.