Topics Covered
Basic Concepts
Index Structure
- At the storage level, a document is uniquely identified by _index, _type, and _id
- _index is the logical namespace covering one or more physical shards
- _type distinguishes document categories within the same index
- _id identifies a single document; it can be supplied by the caller or auto-generated by the system
Sharding
- In a distributed system, a single machine cannot store a huge volume of data; a large cluster is needed to process and store it. Capacity is usually grown by adding machines (horizontal scaling), which requires splitting the data into smaller pieces distributed across the machines.
- Data is sharded to enable horizontal scaling; distributed storage also keeps multiple replica copies of the data.
- ES shards are divided into primary shards and replica shards. Writes can only go to the primary shard and are then synchronized to the replicas. Write operations (document creation, indexing, and deletion) must complete on the primary shard before they are replicated to the relevant replica shards.
- A shard is the most basic read/write unit at the bottom layer; sharding splits a huge index so that reads and writes can run in parallel.
- Each shard can serve reads and writes independently.
- Before 5.x, the number of primary shards could not be changed, while the number of replicas could be changed at any time. Since 5.x–6.x, ES supports splitting and shrinking an index (changing its primary shard count) under certain conditions, but it is still best to plan the number of shards in advance.
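One reason the primary-shard count must be planned in advance is the routing formula: a document lands on shard `hash(routing) % number_of_primary_shards` (ES hashes the `_id` with Murmur3 by default). A minimal sketch of the idea, with `String.hashCode()` standing in for Murmur3 and all names hypothetical:

```java
public class ShardRouter {
    // Sketch of ES-style document routing: shard = hash(routing) % numPrimaryShards.
    // The real implementation hashes the _id with Murmur3; String.hashCode() stands in here.
    public static int shardFor(String routing, int numPrimaryShards) {
        // floorMod keeps the result non-negative even for negative hash codes
        return Math.floorMod(routing.hashCode(), numPrimaryShards);
    }
}
```

Because changing `numPrimaryShards` remaps nearly every document to a different shard, an index's primary count cannot simply be edited in place.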
Segment Merging
- By default, the in-memory write buffer is flushed to a file once per second; this process is called refresh, and each refresh creates a new Lucene segment. As small segments accumulate, they are periodically merged in the background into larger ones.
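Conceptually, refresh drains the in-memory buffer into a new immutable segment. A toy sketch of that flow (not Lucene's actual API; all names hypothetical):

```java
import java.util.ArrayList;
import java.util.List;

public class ToyIndexWriter {
    // In-memory buffer of pending documents (what a refresh drains).
    private final List<String> buffer = new ArrayList<>();
    // Each refresh produces a new immutable "segment".
    private final List<List<String>> segments = new ArrayList<>();

    public void index(String doc) {
        buffer.add(doc);
    }

    // In real ES this is driven by a scheduler every index.refresh_interval (1s by default).
    public void refresh() {
        if (buffer.isEmpty()) {
            return;                          // nothing to flush, no new segment
        }
        segments.add(new ArrayList<>(buffer)); // snapshot the buffer as a new segment
        buffer.clear();
    }

    public int segmentCount() {
        return segments.size();
    }
}
```

A background merge policy later combines small segments, which is what the "segment merging" heading refers to.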
Cluster Node Roles
- Master node
- A master node should preferably not also serve as a data node
- Data node
- Ingest node
- Before data is written, it is transformed by a pre-defined series of processors and pipelines
- Coordinating node
- A client request can be sent to any node in the cluster; every node knows where any document lives, so it forwards the request, gathers the results, and returns them to the client. The node that handles the client request is called the coordinating node.
- The coordinating node forwards requests to the data nodes that hold the data
Main Internal Modules
- cluster: the master node's implementation of cluster management; manages cluster state and maintains cluster-level configuration
- allocation: encapsulates shard-allocation functionality and policies, covering both primary-shard and replica-shard allocation
- discovery: discovers the nodes in the cluster and elects the master (e.g. ZenDiscovery)
- gateway: persists the cluster state broadcast by the master and recovers it when the cluster restarts
- indices: manages node-level index settings and the index lifecycle
- http: serves client requests over the HTTP REST API
- transport: internal node-to-node communication over TCP
- engine: wraps the calls to Lucene and the translog; each shard's read/write engine
First, an overview of the architecture diagram
Note: this diagram was found online and is borrowed here, with thanks
生命周期管理
LifecycleComponent
public interface LifecycleComponent extends Releasable {
    Lifecycle.State lifecycleState();
    void addLifecycleListener(LifecycleListener listener);
    void removeLifecycleListener(LifecycleListener listener);
    void start();
    void stop();
}
- State management
- Adding and removing listeners
- Start and stop behavior
Lifecycle
INITIALIZED -> STARTED, STOPPED, CLOSED
STARTED -> STOPPED
STOPPED -> STARTED, CLOSED
CLOSED -> (terminal state; no further transitions)
public enum State {
    INITIALIZED,
    STOPPED,
    STARTED,
    CLOSED
}
- Definition:
- A lifecycle entity class
- A state machine implementing the transitions above
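The transition table above can be sketched as a small state machine; this is a simplified, hypothetical stand-in for the real `Lifecycle` class:

```java
public class ToyLifecycle {
    public enum State { INITIALIZED, STARTED, STOPPED, CLOSED }

    private volatile State state = State.INITIALIZED;

    public State state() {
        return state;
    }

    // INITIALIZED and STOPPED may move to STARTED; STARTED and CLOSED may not.
    public boolean canMoveToStarted() {
        return state == State.INITIALIZED || state == State.STOPPED;
    }

    public boolean moveToStarted() {
        if (!canMoveToStarted()) return false;
        state = State.STARTED;
        return true;
    }

    // Only STARTED may move to STOPPED.
    public boolean canMoveToStopped() {
        return state == State.STARTED;
    }

    public boolean moveToStopped() {
        if (!canMoveToStopped()) return false;
        state = State.STOPPED;
        return true;
    }

    // STARTED must stop first; CLOSED is terminal.
    public boolean moveToClosed() {
        if (state == State.STARTED || state == State.CLOSED) return false;
        state = State.CLOSED;
        return true;
    }
}
```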
LifecycleListener
public void beforeStart() {}
public void afterStart() {}
public void beforeStop() {}
public void afterStop() {}
public void beforeClose() {}
public void afterClose() {}
AbstractLifecycleComponent
- Member variables

protected final Lifecycle lifecycle = new Lifecycle();
private final List<LifecycleListener> listeners = new CopyOnWriteArrayList<>();
public void start() {
    // Safe start: a no-op if the component cannot move to STARTED, so double starts are harmless
    if (!lifecycle.canMoveToStarted()) {
        return;
    }
    for (LifecycleListener listener : listeners) {
        listener.beforeStart();
    }
    doStart(); // the actual startup work
    lifecycle.moveToStarted();
    for (LifecycleListener listener : listeners) {
        listener.afterStart();
    }
}

protected abstract void doStart();
@Override
public void stop() {
    if (!lifecycle.canMoveToStopped()) {
        return;
    }
    for (LifecycleListener listener : listeners) {
        listener.beforeStop();
    }
    lifecycle.moveToStopped();
    doStop();
    for (LifecycleListener listener : listeners) {
        listener.afterStop();
    }
}

protected abstract void doStop();
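Putting the pieces together, a toy component illustrates the hook ordering (`beforeStart` -> `doStart` -> `afterStart`) and the guarded double-start; this is a hypothetical simplification of the `AbstractLifecycleComponent` pattern, not the real class:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Minimal sketch: listener hooks fire around the concrete doStart() implementation.
abstract class ToyComponent {
    private final List<String> log = new CopyOnWriteArrayList<>();
    private boolean started = false;

    public List<String> log() {
        return log;
    }

    protected void record(String event) {
        log.add(event);
    }

    public final void start() {
        if (started) return;     // safe against double starts
        record("beforeStart");
        doStart();               // the real work, supplied by the subclass
        started = true;
        record("afterStart");
    }

    protected abstract void doStart();
}

class ToyTransport extends ToyComponent {
    @Override
    protected void doStart() {
        record("doStart");
    }
}
```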
The following classes all use this lifecycle management:
- AzureComputeServiceImpl
- BlobStoreRepository
- CircuitBreakerService
- ClusterService
- DelayedAllocationService
- GatewayService
- GceMetadataService
- HdfsRepository
- IndicesClusterStateService
- IndicesService
- IndicesTTLService
- InternalAwsS3Service
- JvmGcMonitorService
- LocalDiscovery
- LocalTransport
- MockTcpTransport
- MonitorService
- Netty3HttpServerTransport
- Netty3Transport
- Netty4HttpServerTransport
- Netty4Transport
- NodeConnectionsService
- NoneDiscovery
- ResourceWatcherService
- RoutingService
- SearchService
- SingleNodeDiscovery
- SnapshotShardsService
- SnapshotsService
- TcpTransport
- TransportService
- TribeService
- ZenDiscovery
Inject (a lightweight injector)
Example
public class FooApplication {
    public static void main(String[] args) {
        Injector injector = Guice.createInjector(
            new ModuleA(),
            new ModuleB(),
            . . .
            new FooApplicationFlagsModule(args)
        );
        // Now just bootstrap the application and you're done
        FooStarter starter = injector.getInstance(FooStarter.class);
        starter.runApplication();
    }
}
- We can also use this small framework when building lightweight middleware of our own
- For details, see www.imooc.com/article/206…
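The core idea Guice automates is constructor injection. A dependency-free sketch of the same pattern (all names hypothetical; no Guice required):

```java
// What Guice wires automatically: dependencies are passed in through the
// constructor instead of being constructed inside the class.
interface Transport {
    String send(String msg);
}

class LocalTransport implements Transport {
    public String send(String msg) {
        return "local:" + msg;
    }
}

class FooStarter {
    private final Transport transport;

    FooStarter(Transport transport) {   // the injected dependency
        this.transport = transport;
    }

    String runApplication() {
        return transport.send("hello");
    }
}
```

A Guice module then just binds the interface to an implementation so the injector can construct `FooStarter` (and its whole dependency graph) on demand.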
Settings
- Member variables
private final Map<String, String> settings;
/** The secure settings storage associated with these settings. */
private final SecureSettings secureSettings;
/** The first level of setting names. This is constructed lazily in {@link #names()}. */
private final SetOnce<Set<String>> firstLevelNames = new SetOnce<>();
/**
* Setting names found in this Settings for both string and secure settings.
* This is constructed lazily in {@link #keySet()}.
*/
private final SetOnce<Set<String>> keys = new SetOnce<>();
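The `SetOnce` fields let derived views of the settings be computed lazily, exactly once. A sketch of that pattern using `AtomicReference` in place of Lucene's `SetOnce` (names hypothetical):

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.atomic.AtomicReference;

// Sketch of the lazy, compute-once pattern behind firstLevelNames/keys.
class ToySettings {
    private final Map<String, String> settings;
    private final AtomicReference<Set<String>> firstLevelNames = new AtomicReference<>();

    ToySettings(Map<String, String> settings) {
        this.settings = settings;
    }

    // "cluster.name" -> first level "cluster"; computed on first call, cached afterwards.
    Set<String> names() {
        Set<String> names = firstLevelNames.get();
        if (names == null) {
            names = new TreeSet<>();
            for (String key : settings.keySet()) {
                int dot = key.indexOf('.');
                names.add(dot < 0 ? key : key.substring(0, dot));
            }
            // only the first writer wins; later calls reuse the stored set
            firstLevelNames.compareAndSet(null, names);
            names = firstLevelNames.get();
        }
        return names;
    }
}
```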
ThreadPool
- Constructor
public ThreadPool(final Settings settings, final ExecutorBuilder<?>... customBuilders) {
super(settings);
assert Node.NODE_NAME_SETTING.exists(settings);
final Map<String, ExecutorBuilder> builders = new HashMap<>();
final int availableProcessors = EsExecutors.boundedNumberOfProcessors(settings);
final int halfProcMaxAt5 = halfNumberOfProcessorsMaxFive(availableProcessors);
final int halfProcMaxAt10 = halfNumberOfProcessorsMaxTen(availableProcessors);
final int genericThreadPoolMax = boundedBy(4 * availableProcessors, 128, 512);
// Initialize the built-in thread pools
builders.put(Names.GENERIC, new ScalingExecutorBuilder(Names.GENERIC, 4, genericThreadPoolMax, TimeValue.timeValueSeconds(30)));
builders.put(Names.INDEX, new FixedExecutorBuilder(settings, Names.INDEX, availableProcessors, 200));
builders.put(Names.BULK, new FixedExecutorBuilder(settings, Names.BULK, availableProcessors, 200)); // now that we reuse bulk for index/delete ops
builders.put(Names.GET, new FixedExecutorBuilder(settings, Names.GET, availableProcessors, 1000));
builders.put(Names.SEARCH, new FixedExecutorBuilder(settings, Names.SEARCH, searchThreadPoolSize(availableProcessors), 1000));
builders.put(Names.MANAGEMENT, new ScalingExecutorBuilder(Names.MANAGEMENT, 1, 5, TimeValue.timeValueMinutes(5)));
// no queue as this means clients will need to handle rejections on listener queue even if the operation succeeded
// the assumption here is that the listeners should be very lightweight on the listeners side
builders.put(Names.LISTENER, new FixedExecutorBuilder(settings, Names.LISTENER, halfProcMaxAt10, -1));
builders.put(Names.FLUSH, new ScalingExecutorBuilder(Names.FLUSH, 1, halfProcMaxAt5, TimeValue.timeValueMinutes(5)));
builders.put(Names.REFRESH, new ScalingExecutorBuilder(Names.REFRESH, 1, halfProcMaxAt10, TimeValue.timeValueMinutes(5)));
builders.put(Names.WARMER, new ScalingExecutorBuilder(Names.WARMER, 1, halfProcMaxAt5, TimeValue.timeValueMinutes(5)));
builders.put(Names.SNAPSHOT, new ScalingExecutorBuilder(Names.SNAPSHOT, 1, halfProcMaxAt5, TimeValue.timeValueMinutes(5)));
builders.put(Names.FETCH_SHARD_STARTED, new ScalingExecutorBuilder(Names.FETCH_SHARD_STARTED, 1, 2 * availableProcessors, TimeValue.timeValueMinutes(5)));
builders.put(Names.FORCE_MERGE, new FixedExecutorBuilder(settings, Names.FORCE_MERGE, 1, -1));
builders.put(Names.FETCH_SHARD_STORE, new ScalingExecutorBuilder(Names.FETCH_SHARD_STORE, 1, 2 * availableProcessors, TimeValue.timeValueMinutes(5)));
for (final ExecutorBuilder<?> builder : customBuilders) {
if (builders.containsKey(builder.name())) {
throw new IllegalArgumentException("builder with name [" + builder.name() + "] already exists");
}
builders.put(builder.name(), builder);
}
this.builders = Collections.unmodifiableMap(builders);
threadContext = new ThreadContext(settings);
final Map<String, ExecutorHolder> executors = new HashMap<>();
for (@SuppressWarnings("unchecked") final Map.Entry<String, ExecutorBuilder> entry : builders.entrySet()) {
final ExecutorBuilder.ExecutorSettings executorSettings = entry.getValue().getSettings(settings);
final ExecutorHolder executorHolder = entry.getValue().build(executorSettings, threadContext);
if (executors.containsKey(executorHolder.info.getName())) {
throw new IllegalStateException("duplicate executors with name [" + executorHolder.info.getName() + "] registered");
}
logger.debug("created thread pool: {}", entry.getValue().formatInfo(executorHolder.info));
executors.put(entry.getKey(), executorHolder);
}
executors.put(Names.SAME, new ExecutorHolder(DIRECT_EXECUTOR, new Info(Names.SAME, ThreadPoolType.DIRECT)));
this.executors = unmodifiableMap(executors);
// Initialize the scheduler for delayed and periodic tasks
this.scheduler = new ScheduledThreadPoolExecutor(1, EsExecutors.daemonThreadFactory(settings, "scheduler"), new EsAbortPolicy());
this.scheduler.setExecuteExistingDelayedTasksAfterShutdownPolicy(false);
this.scheduler.setContinueExistingPeriodicTasksAfterShutdownPolicy(false);
this.scheduler.setRemoveOnCancelPolicy(true);
TimeValue estimatedTimeInterval = ESTIMATED_TIME_INTERVAL_SETTING.get(settings);
this.cachedTimeThread = new CachedTimeThread(EsExecutors.threadName(settings, "[timer]"), estimatedTimeInterval.millis());
this.cachedTimeThread.start();
}
- EsExecutors
- Thread isolation
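The fixed/scaling distinction above maps directly onto `java.util.concurrent`. A sketch of a fixed-size pool with a bounded queue, roughly what `FixedExecutorBuilder` produces (a simplification; the real builder also wires in the `ThreadContext` and ES's own rejection handling):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class FixedPoolSketch {
    // Fixed size, bounded queue, rejections instead of unbounded memory growth
    // (the role EsAbortPolicy plays in ES).
    public static ThreadPoolExecutor fixed(String name, int size, int queueSize) {
        return new ThreadPoolExecutor(
            size, size,                          // core == max: a truly fixed pool
            0L, TimeUnit.MILLISECONDS,
            new ArrayBlockingQueue<>(queueSize), // bounded backlog
            r -> {
                Thread t = new Thread(r, name);
                t.setDaemon(true);
                return t;
            },
            new ThreadPoolExecutor.AbortPolicy() // reject when the queue is full
        );
    }
}
```

Separate pools per workload (index, search, get, ...) mean a flood of search requests fills only the search queue and cannot starve indexing threads, which is what thread isolation buys.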
monitor
- jvm
- os
ToXContent
- XContentType: Elasticsearch supports four data formats: JSON, SMILE, YAML, and CBOR. XContentType is the enum representing these four formats.
- XContent: the abstraction over the data. Because four formats are supported, XContent has four implementations: JsonXContent, SmileXContent, YamlXContent, and CborXContent.
- XContentParser: the parser for the data; the four implementations are JsonXContentParser, SmileXContentParser, YamlXContentParser, and CborXContentParser.
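The relationship can be sketched as an enum keyed by media type; this is a simplified, hypothetical stand-in for the real `XContentType` (which also hands out the matching `XContent` and parser implementations):

```java
// One enum constant per wire format, each knowing its media type.
enum ToyXContentType {
    JSON("application/json"),
    SMILE("application/smile"),
    YAML("application/yaml"),
    CBOR("application/cbor");

    private final String mediaType;

    ToyXContentType(String mediaType) {
        this.mediaType = mediaType;
    }

    String mediaType() {
        return mediaType;
    }

    // Resolve the format from a request's content type header.
    static ToyXContentType fromMediaType(String mediaType) {
        for (ToyXContentType t : values()) {
            if (t.mediaType.equals(mediaType)) return t;
        }
        throw new IllegalArgumentException("unknown media type: " + mediaType);
    }
}
```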
Setting
- Located in the common.settings package
- Encapsulates the typical concerns of a setting, such as its default value, parsing, and allowed range.
- Member variables
private final Key key; // the setting's key
protected final Function<Settings, String> defaultValue;
@Nullable
private final Setting<T> fallbackSetting;
private final Function<String, T> parser;
private final EnumSet<Property> properties;
private static final EnumSet<Property> EMPTY_PROPERTIES = EnumSet.noneOf(Property.class);
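A minimal generic sketch of how the key, default value, and parser combine (hypothetical names; the real `Setting` also handles fallback settings, validation, and properties):

```java
import java.util.Map;
import java.util.function.Function;

// Minimal sketch of the Setting<T> idea: a key, a default, and a parser
// from the raw string value to the typed value.
class ToySetting<T> {
    private final String key;
    private final String defaultValue;
    private final Function<String, T> parser;

    ToySetting(String key, String defaultValue, Function<String, T> parser) {
        this.key = key;
        this.defaultValue = defaultValue;
        this.parser = parser;
    }

    // Look up the raw string (falling back to the default) and parse it.
    T get(Map<String, String> settings) {
        return parser.apply(settings.getOrDefault(key, defaultValue));
    }
}
```

For example, an integer setting is just `new ToySetting<>("http.port", "9200", Integer::parseInt)`: the parser turns the stored string into the typed value on every read.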
Settings
Environment
- Main responsibilities