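"The beginning of wisdom is the admission of one's own ignorance." (Socrates)
Follow me so you don't get lost
Hi everyone, I'm 侠风 (Xiafeng). I'll keep sharing hardcore technical knowledge from my hands-on work at big tech companies, and I hope you lovely readers will give this the "like, favorite, share" triple.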
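Background
Elasticsearch (ES below) was one of the first big data components I worked with, and nearly every company I've been at, big or small, has used it in its projects.
When we learn a technical component, we need to know exactly which scenarios it fits. With so many open-source components available, why choose ES over MySQL, Doris, ClickHouse, and the like? Their features overlap, but each one has different strengths, so picking the right technology for the business scenario is a very important skill for a programmer.
What ES is really good at is search. Some of you may object that the MySQL we use every day can search too, for example to look up the names of pretty girls whose face description contains 非常漂亮 ("very pretty"):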
select beautiful_girl_name from human_table where face like '%非常漂亮%'
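The first drawback: if the description of her face is instead "她的眼睛非常好看,整体五官巨漂亮,赛过刘亦菲" ("her eyes are very good-looking, her features are gorgeous overall, she outshines Liu Yifei"), this query finds nothing, because LIKE only matches the contiguous substring 非常漂亮. You might miss a photo of a gorgeous girl for life.
The second drawback: with a very large data volume, a single relational database such as MySQL struggles to keep up with both the writes and this kind of query. A LIKE pattern with a leading wildcard cannot use an ordinary index and falls back to scanning rows, so it cannot meet the business requirements.
So why use ES?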
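- Search efficiency: MySQL's ordinary indexes are B+ trees built over whole column values, which cannot serve a pattern like '%keyword%'. ES builds an inverted index that maps each term to the documents containing it, so a full-text lookup goes straight to the matching documents.
- Search performance: both MySQL and ES persist their data on disk. ES gets its speed from compact, immutable index segments that are served largely from the operating system's file system cache, and from running a query on multiple shards in parallel.
- Scalability: a MySQL table lives on a single instance unless you shard it yourself, whereas an ES index is split into shards that are distributed across nodes, so scaling out horizontally is easy.
- Flexibility: MySQL is queried with SQL, while ES is queried with a JSON-based query DSL over JSON documents, which makes it easy to combine full-text, filter, and aggregation conditions in one request.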
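To make the inverted index idea concrete, here is a toy sketch in Python. It is only an illustration under loud assumptions: the tokenizer simply splits text into single characters, documents 1 and 3 are made up for the demo, and none of this is how Lucene actually stores its index. It does show why a term lookup finds the description that the LIKE query above misses.

```python
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Split text into single-character tokens, dropping punctuation and spaces.

    Assumption: a stand-in for a real analyzer, not the ES implementation.
    """
    return [ch for ch in text if ch.isalnum()]


def build_inverted_index(docs: dict[int, str]) -> dict[str, set[int]]:
    """Map every token to the set of document ids that contain it."""
    index: dict[str, set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search_all_terms(index: dict[str, set[int]], query: str) -> set[int]:
    """Return ids of documents that contain every token of the query."""
    postings = [index.get(token, set()) for token in tokenize(query)]
    return set.intersection(*postings) if postings else set()


# Document 2 is the description from the text; 1 and 3 are hypothetical.
docs = {
    1: "非常漂亮",
    2: "她的眼睛非常好看,整体五官巨漂亮,赛过刘亦菲",
    3: "普通",
}
index = build_inverted_index(docs)

# What LIKE '%非常漂亮%' does: a contiguous substring test on every row.
like_hits = {doc_id for doc_id, text in docs.items() if "非常漂亮" in text}
# What a term lookup does: intersect the posting lists of the query tokens.
term_hits = search_all_terms(index, "非常漂亮")

print(like_hits)
print(term_hits)
```

The substring scan only finds document 1, while the term lookup also finds document 2, because 非, 常, 漂 and 亮 all appear somewhere in its description. The lookup also touches only the posting lists of four tokens instead of every row.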
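The ES ecosystem:
Today the ES ecosystem makes up the Elastic Stack, which covers many aspects of big data processing: data collection, ingestion, search, monitoring, processing, analysis, security, and more.
Elasticsearch, Kibana, Beats, and Logstash are the four pillars, each responsible for a different area. I'll cover them one by one when I get the chance (follows appreciated).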
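Deploying a single-node ES environment locally
1. Download: www.elastic.co/cn/download…
2. Extract the downloaded archive:
# Extract the archive
tar -zxvf elasticsearch-7.10.0-darwin-x86_64.tar.gz
# Edit the configuration file
vim elasticsearch-7.10.0/config/elasticsearch.yml
3. Edit the configuration file as follows: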
# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
# Before you set out to tweak and tune the configuration, make sure you
# understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
cluster.name: my-application
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: localhost
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
node.master: true
node.data: true
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
path.data: /Users/zengxuefeng/elasticsearch-7.10.0/data
#
# Path to log files:
#
path.logs: /Users/zengxuefeng/elasticsearch-7.10.0/logs
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: 127.0.0.1 # IP of the current node
#
# Set a custom port for HTTP:
#
http.port: 8080 # the default is 9200
transport.tcp.port: 8081 # the default is 9300
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when this node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
discovery.seed_hosts: ["localhost"] # [{IP of the current node}]
#
# Bootstrap the cluster using an initial set of master-eligible nodes:
#
cluster.initial_master_nodes: ["localhost"] # initial list of master-eligible nodes; entries must match node.name
#
# For more information, consult the discovery and cluster formation module documentation.
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
#gateway.recover_after_nodes: 3
#
# For more information, consult the gateway module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
#action.destructive_requires_name: true
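Starting the cluster
From the elasticsearch-7.10.0 directory, run:
./bin/elasticsearch
Once startup completes, open http://localhost:8080 in a browser to see the information ES returns.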
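If you would rather check the node from a script than a browser, something like the following Python sketch could work. The function names are my own invention, and the URL assumes the http.port of 8080 configured above. It relies on the name, cluster_name, and version.number fields that the ES root endpoint returns.

```python
import json
from urllib.request import urlopen


def summarize_root_info(info: dict) -> str:
    """Pick the most useful fields out of the ES root endpoint response."""
    return (
        f"node={info['name']} "
        f"cluster={info['cluster_name']} "
        f"version={info['version']['number']}"
    )


def fetch_root_info(base_url: str = "http://localhost:8080") -> dict:
    """GET the root endpoint of a running node (hypothetical helper).

    Assumption: the node listens on the http.port configured in this article.
    """
    with urlopen(base_url, timeout=5) as resp:
        return json.load(resp)
```

With the node running, print(summarize_root_info(fetch_root_info())) should report the node.name and cluster.name set in elasticsearch.yml.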
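Using the cluster
We now have an ES cluster (a single node) to test against. First we create an index (similar to a MySQL table, but only loosely), then write data and query it.
1. Create an index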
curl -H "Content-Type: application/json" -X PUT 'http://127.0.0.1:8080/beautifule_people?pretty' -d '
{
  "mappings": {
    "_source": {
      "enabled": true
    },
    "properties": {
      "id": {
        "type": "integer"
      },
      "address": {
        "type": "keyword"
      },
      "sex": {
        "type": "keyword"
      },
      "name": {
        "type": "text"
      },
      "age": {
        "type": "integer"
      }
    }
  },
  "settings": {
    "index": {
      "number_of_shards": "2",
      "number_of_replicas": "1"
    }
  }
}'
2. List the created index
curl 'http://127.0.0.1:8080/_cat/indices?v'
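The output lists the index, so it was created successfully. On a single node its health shows as yellow, because the replica requested by number_of_replicas has no second node to be allocated to.
3. Add data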
curl -H "Content-Type: application/json" -X POST '127.0.0.1:8080/beautifule_people/_doc/1000' -d '{"id":1000,"address":"胜辛路426号2层222-3号商铺","sex":"male","name":"张三","age":20}'
curl -H "Content-Type: application/json" -X POST '127.0.0.1:8080/beautifule_people/_doc/1001' -d '{"id":1001,"address":"浦江镇召楼路1976号","sex":"male","name":"李四","age":21}'
curl -H "Content-Type: application/json" -X POST '127.0.0.1:8080/beautifule_people/_doc/1002' -d '{"id":1002,"address":"长宁区北新泾四村5号","sex":"male","name":"王五","age":22}'
curl -H "Content-Type: application/json" -X POST '127.0.0.1:8080/beautifule_people/_doc/1003' -d '{"id":1003,"address":"铁路上海虹桥站出发层2F-26","sex":"male","name":"赵六","age":23}'
curl -H "Content-Type: application/json" -X POST '127.0.0.1:8080/beautifule_people/_doc/1004' -d '{"id":1004,"address":"金沙江路956号","sex":"female","name":"孙七真漂亮","age":24}'
curl -H "Content-Type: application/json" -X POST '127.0.0.1:8080/beautifule_people/_doc/1005' -d '{"id":1005,"address":"川沙路4839号","sex":"male","name":"周八非常好看真美丽","age":25}'
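Sending one request per document is fine for six rows. For larger batches ES also offers the _bulk API, which takes newline-delimited JSON: one action line followed by one source line per document, with a trailing newline at the end. Below is a minimal Python sketch of building such a body. The helper names are hypothetical, and the base URL assumes the port used in this article.

```python
import json
from urllib.request import Request, urlopen


def build_bulk_body(index_name: str, docs: list[dict]) -> str:
    """Build an NDJSON body for the _bulk API: action line, then source line."""
    lines = []
    for doc in docs:
        action = {"index": {"_index": index_name, "_id": str(doc["id"])}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc, ensure_ascii=False))
    # The bulk body has to end with a newline.
    return "\n".join(lines) + "\n"


def send_bulk(body: str, base_url: str = "http://127.0.0.1:8080") -> dict:
    """POST the body to _bulk on a running node (hypothetical helper)."""
    req = Request(
        f"{base_url}/_bulk",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )
    with urlopen(req, timeout=10) as resp:
        return json.load(resp)


# The first two documents from the curl commands above.
docs = [
    {"id": 1000, "address": "胜辛路426号2层222-3号商铺", "sex": "male", "name": "张三", "age": 20},
    {"id": 1001, "address": "浦江镇召楼路1976号", "sex": "male", "name": "李四", "age": 21},
]
body = build_bulk_body("beautifule_people", docs)
```

With the node running, send_bulk(body) returns a response whose errors field is false when every item succeeded.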
4. Query data
4.1 Query all data
curl -H "Content-Type: application/json" -X GET '127.0.0.1:8080/beautifule_people/_search' -d '
{
  "query": {
    "bool": {
      "must": [
        {
          "match_all": {}
        }
      ],
      "must_not": [],
      "should": [],
      "filter": []
    }
  },
  "from": 0,
  "size": 10,
  "sort": [],
  "profile": false
}'
4.2 Query with a condition, roughly name like "非常美丽" ("very beautiful"), expressed as a full-text match query
curl -H "Content-Type: application/json" -X GET '127.0.0.1:8080/beautifule_people/_search' -d '
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "name": "非常美丽"
          }
        }
      ],
      "must_not": [],
      "should": [],
      "filter": []
    }
  },
  "from": 0,
  "size": 10,
  "sort": [],
  "profile": false
}'
As the returned results show, we didn't miss this beautiful girl, haha. I hope you won't miss the beautiful girl you like in real life either.
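Why does 非常美丽 hit a name in which those four characters are not even adjacent? The name field is a text field with no analyzer configured, so it uses the default standard analyzer, which emits each Chinese character as its own term, and a match query by default needs only one query term to be present (operator "or"). The Python sketch below simulates just that behavior. It is a simplification that ignores relevance scoring, and the analyze and matches helpers are made up for illustration.

```python
def analyze(text: str) -> list[str]:
    """Rough stand-in for the standard analyzer on Chinese text: one term per character."""
    return [ch.lower() for ch in text if ch.isalnum()]


def matches(query: str, field_value: str, operator: str = "or") -> bool:
    """Simulate whether a match query on one field would hit (scoring ignored)."""
    query_terms = set(analyze(query))
    doc_terms = set(analyze(field_value))
    if operator == "and":
        return query_terms <= doc_terms   # every query term must be present
    return bool(query_terms & doc_terms)  # any single query term is enough


# The name values of the six documents indexed above.
names = {
    1000: "张三",
    1001: "李四",
    1002: "王五",
    1003: "赵六",
    1004: "孙七真漂亮",
    1005: "周八非常好看真美丽",
}

hits = [doc_id for doc_id, name in names.items() if matches("非常美丽", name)]
or_hits = [doc_id for doc_id, name in names.items() if matches("真漂亮", name)]
and_hits = [doc_id for doc_id, name in names.items() if matches("真漂亮", name, "and")]

print(hits)
print(or_hits)
print(and_hits)
```

Only document 1005 contains any of 非, 常, 美, 丽, so it is the single hit. The second query shows the flip side: 真漂亮 with the default "or" also hits 1005 through the shared character 真, while "and" narrows the result to 1004. For real Chinese search you would normally configure a Chinese-aware analyzer instead of relying on per-character terms.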
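Summary
This article briefly covered deploying ES locally and its basic usage: creating an index, listing indices, writing data into an index, and querying it, to give you a hands-on feel for ES.
Lucene, the ES write path and query path, shards, replica replication, the causes of slow queries and how to fix them, performance tuning, common query syntax, and more will follow in later articles.
If you've read this far, you must be someone who really loves learning (a fellow workhorse). Likes, shares, and follows are all welcome and encourage me to write better. You're also welcome to follow my WeChat official account, where we can chat privately.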