ELK Enterprise Log Analysis System: Components and Functions of ELK. ELK is an acronym for three open-source projects: Elasticsearch, Logstash, and Kibana.

Preface · ELKstack Chinese Guide (elasticsearch.cn)

Elasticsearch Chinese Documentation (kilvn.com)

ES

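ES usage architectures

When ES is used in a project, there are two common architectures: ES as the only backend, or ES working alongside a database system, with both together forming the backend.

ES as the only backend

As a modern search engine, ES provides storage as well as retrieval. In a project that is not complex, ES can therefore serve as the only backend.

ES combined with a database system

In more complex projects, ES cannot provide everything a traditional database does (transactions, for example), so ES and a traditional database are used together.

Installing Elasticsearch

Starting with ES 7.0, the ES packages bundle a Java runtime, so installing 7.0 or later does not require a separate Java installation.

ES download link: Download Elasticsearch | Elastic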

[root@devops01 ~]#rpm -ivh elasticsearch-7.9.1-x86_64.rpm

cat > /etc/elasticsearch/elasticsearch.yml << 'EOF'    
node.name: node-1
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 127.0.0.1,10.0.0.51
http.port: 9200
discovery.seed_hosts: ["10.0.0.51"]
cluster.initial_master_nodes: ["10.0.0.51"]
EOF
systemctl daemon-reload
systemctl start elasticsearch.service

version: '2'
services:
  elasticsearch1:
    image: docker.elastic.co/elasticsearch/elasticsearch:5.0.1
    container_name: elasticsearch1
    environment:
      - cluster.name=docker-cluster
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    mem_limit: 1g
    cap_add:
      - IPC_LOCK
    volumes:
      - esdata1:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
    networks:
      - esnet
      
volumes:
  esdata1:
    driver: local

networks:
  esnet:
    driver: bridge

Viewing the ES configuration files

[root@devops01 ~]#rpm -qc elasticsearch
/etc/elasticsearch/elasticsearch.yml # main configuration
/etc/elasticsearch/jvm.options # JVM options (how much memory ES uses)
/etc/elasticsearch/log4j2.properties
/etc/elasticsearch/role_mapping.yml
/etc/elasticsearch/roles.yml
/etc/elasticsearch/users
/etc/elasticsearch/users_roles
/etc/init.d/elasticsearch # init script
/etc/sysconfig/elasticsearch # environment variables
/usr/lib/sysctl.d/elasticsearch.conf # kernel parameters
/usr/lib/systemd/system/elasticsearch.service

Index management

Listing indices
In a browser:
http://192.168.91.112:9200/_cat/indices?v&pretty
From the command line:
curl -XGET 'localhost:9200/_cat/indices?v&pretty'
curl -XGET 'localhost:9200/_cat/indices?v&health=red'     # indices whose health is red
curl -XGET 'localhost:9200/_cat/indices?v&health=yellow'  # indices whose health is yellow
curl -XGET 'localhost:9200/_cat/indices?v&health=green'   # indices whose health is green
Viewing index settings
curl -XGET 'localhost:9200/<index-name>/_settings'
Viewing shard allocation
curl -XGET 'localhost:9200/_cat/shards?v&h=n,index,shard,prirep,state,sto,sc,unassigned.reason,unassigned.details'


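Possible values of unassigned.reason:

ALLOCATION_FAILED: unassigned because allocation of the shard failed.
CLUSTER_RECOVERED: unassigned as a result of a full cluster recovery.
DANGLING_INDEX_IMPORTED: unassigned as a result of importing a dangling index.
EXISTING_INDEX_RESTORED: unassigned as a result of restoring into a closed index.
INDEX_CREATED: unassigned as a result of creating an index through the API.
INDEX_REOPENED: unassigned as a result of opening a closed index.
NEW_INDEX_RESTORED: unassigned as a result of restoring into a new index.
NODE_LEFT: unassigned because the node hosting the shard left the cluster.
REALLOCATED_REPLICA: a better replica location was identified, so the existing replica allocation was cancelled.
REINITIALIZED: the shard moved from started back to initializing.
REPLICA_ADDED: unassigned as a result of explicitly adding a replica.
REROUTE_CANCELLED: unassigned as a result of an explicit cancel reroute command.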
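To see which reasons actually occur in a cluster, the text output of `_cat/shards` can be grouped by reason. The following is a minimal sketch, not part of the original notes: the sample output and the reduced column list (`h=index,shard,prirep,state,unassigned.reason`) are assumptions chosen for illustration, and `unassigned_by_reason` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical sample of `_cat/shards?v&h=index,shard,prirep,state,unassigned.reason`.
SAMPLE = """\
index    shard prirep state      unassigned.reason
products 0     p      STARTED
products 0     r      UNASSIGNED INDEX_CREATED
logs     0     p      STARTED
logs     0     r      UNASSIGNED NODE_LEFT
logs     1     r      UNASSIGNED NODE_LEFT
"""


def unassigned_by_reason(cat_output):
    """Group unassigned shards as {reason: [(index, shard, prirep), ...]}."""
    groups = defaultdict(list)
    for line in cat_output.splitlines()[1:]:  # skip the header row printed by ?v
        fields = line.split()
        # Assigned shards have an empty reason column, so they yield only 4 fields.
        if len(fields) >= 5 and fields[3] == "UNASSIGNED":
            index, shard, prirep, _state, reason = fields[:5]
            groups[reason].append((index, int(shard), prirep))
    return dict(groups)


for reason, shards in unassigned_by_reason(SAMPLE).items():
    print(reason, shards)
```

Replica shards reported as INDEX_CREATED or NODE_LEFT on a single-node cluster are expected, because a replica is never placed on the same node as its primary.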


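Finding out why allocation failed
curl -XGET 'localhost:9200/_cluster/allocation/explain'

When a disk fills up (usage crosses the flood-stage watermark), ES switches the affected indices to read-only mode (index.blocks.read_only_allow_delete). After disk space has been freed, the indices have to be made writable again by hand; from 7.4 onward ES also releases the block automatically once usage falls back below the high watermark.

curl -XPUT -H "Content-Type: application/json" http://127.0.0.1:9200/_all/_settings -d '{"index.blocks.read_only_allow_delete": false}'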
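The same request can be issued from a script. This is a minimal sketch using only the Python standard library; the URL and body mirror the curl command above, `build_unblock_request` is a hypothetical helper name, and the request is only built here, not sent.

```python
import json
import urllib.request


def build_unblock_request(base_url="http://127.0.0.1:9200", index="_all"):
    """Build (but do not send) the PUT that clears the read-only block."""
    body = json.dumps({"index.blocks.read_only_allow_delete": False}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


req = build_unblock_request()
print(req.get_method(), req.full_url, req.data.decode("utf-8"))
# To actually send it against a running node: urllib.request.urlopen(req)
```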

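Performance tuning parameters

Disabling swap

If parts of the ES JVM heap are swapped out to disk, garbage collection slows down badly and pauses can last for minutes. ES memory use is bounded by the JVM heap size, so it does not need swap. There are three ways to deal with this:

Edit /etc/fstab and comment out every line that contains the keyword "swap".
Set vm.swappiness = 1 in /etc/sysctl.conf. This lowers the kernel's tendency to swap but does not disable swapping; it can still be triggered in emergency conditions.
Set bootstrap.memory_lock: true in elasticsearch.yml. This locks the ES process address space in RAM so that it cannot be swapped out. Note that the ES process may exit if it tries to allocate more memory than is available. The user that starts ES is usually not allowed to lock memory, so as root edit /etc/security/limits.conf and set memlock to unlimited:

elasticsearch soft memlock unlimited
elasticsearch hard memlock unlimited

curl -X GET "localhost:9200/_nodes?filter_path=**.mlockall"

If the output contains "mlockall": true, memory locking is in effect.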
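On a multi-node cluster it is easy to miss one node that failed to lock memory. A minimal sketch that scans the JSON returned by the `_nodes?filter_path=**.mlockall` call; the node IDs in the sample response are made up, and `nodes_without_mlockall` is a hypothetical helper name.

```python
import json

# Hypothetical response body; real node IDs are long random strings.
SAMPLE_RESPONSE = """
{"nodes": {
  "abc123": {"process": {"mlockall": true}},
  "def456": {"process": {"mlockall": false}}
}}
"""


def nodes_without_mlockall(response_text):
    """Return the sorted IDs of nodes whose process.mlockall is not true."""
    nodes = json.loads(response_text).get("nodes", {})
    return sorted(
        node_id
        for node_id, info in nodes.items()
        if not info.get("process", {}).get("mlockall", False)
    )


print(nodes_without_mlockall(SAMPLE_RESPONSE))
```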

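File descriptors

Elasticsearch uses a large number of file descriptors (file handles). Running out of them at runtime can be disastrous and will most likely lead to data loss. Make sure the limit on open file descriptors for the user running Elasticsearch is raised to 65536 or higher.

For the .zip and .tar.gz packages, run ulimit -n 65536 as root before starting Elasticsearch, or set nofile to 65536 in /etc/security/limits.conf:

elasticsearch hard nofile 65536

elasticsearch soft nofile 65536

The RPM and Debian packages already default the maximum number of file descriptors to 65536 and need no further configuration.

You can check max_file_descriptors on each node with the Nodes Stats API:

GET _nodes/stats/process?filter_path=**.max_file_descriptors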

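Number of threads

Elasticsearch uses separate thread pools for different kinds of operations, and it must be able to create new threads whenever it needs them. Make sure the elasticsearch user can create at least 4096 threads. Changes to /etc/security/limits.conf take effect only after the user logs in again.

elasticsearch soft nproc 4096

elasticsearch hard nproc 4096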
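Whether the nofile and nproc limits are really in effect can be checked from inside a process started as the elasticsearch user. A minimal sketch using Python's standard `resource` module (Linux only); the thresholds 65536 and 4096 come from the limits above, while `meets_minimum` and `check_limits` are hypothetical helper names.

```python
import resource


def meets_minimum(soft_limit, required):
    """True if a soft limit is unlimited or at least the required value."""
    return soft_limit == resource.RLIM_INFINITY or soft_limit >= required


def check_limits(min_nofile=65536, min_nproc=4096):
    """Compare the current process's soft limits with the recommended minimums."""
    nofile_soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    nproc_soft, _hard = resource.getrlimit(resource.RLIMIT_NPROC)
    return {
        "nofile": meets_minimum(nofile_soft, min_nofile),
        "nproc": meets_minimum(nproc_soft, min_nproc),
    }


print(check_limits())
```

The result only describes the process that runs the script, so it should be run under the same account and login path that starts Elasticsearch.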

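Virtual memory

By default Elasticsearch stores its indices in a hybrid mmapfs / niofs directory. The operating system's default limit on mmap counts is likely to be too low, which can cause out-of-memory exceptions.

On Linux, raise the limit as root:

# vm.max_map_count is the maximum number of memory map areas a single process may have
echo "vm.max_map_count=262144" >> /etc/sysctl.conf

# Reference values from these notes (memory / max_map_count): 2g/262144, 4g/4194304, 8g/8388608
# Apply the change immediately
sysctl -p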
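After `sysctl -p`, the effective value can be read back from `/proc`. A minimal sketch, assuming a Linux host; 262144 is the value written to sysctl.conf above, and `parse_max_map_count` and `is_sufficient` are hypothetical helper names.

```python
from pathlib import Path

REQUIRED_MAX_MAP_COUNT = 262144  # value written to /etc/sysctl.conf above


def parse_max_map_count(text):
    """Parse the contents of /proc/sys/vm/max_map_count."""
    return int(text.strip())


def is_sufficient(value, required=REQUIRED_MAX_MAP_COUNT):
    return value >= required


proc_file = Path("/proc/sys/vm/max_map_count")  # exists on Linux only
if proc_file.exists():
    current = parse_max_map_count(proc_file.read_text())
    print(current, "ok" if is_sufficient(current) else "too low")
```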

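How ES writes data

A document is not written to disk right away. It first goes into the in-memory indexing buffer inside the ES JVM heap (the memory cache), and the operation is also recorded in the translog.
By default the buffer is refreshed once per second: its contents become a new segment file in the operating system's filesystem cache (the OS cache). From this point the indexed data is searchable.
The translog is fsynced to disk separately. With index.translog.durability: async it is fsynced every 5 seconds, so up to 5 seconds of data can be lost; with the default setting, request, it is fsynced before each write request is acknowledged.
By default every 30 minutes, or once the translog grows past a size threshold, a commit (flush) is triggered: the segment files in the OS cache are forced to disk and the translog is cleared and reopened.

Note: by default, indexed data becomes searchable after up to 1 second; with asynchronous translog durability, at most 5 seconds of data can be lost.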
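The stages above can be illustrated with a toy model. This is a deliberately simplified sketch, not how Lucene or Elasticsearch is implemented: `ToyShard` and its attributes are hypothetical names, segments are plain lists, and timers are replaced by explicit `refresh()` and `flush()` calls.

```python
class ToyShard:
    """Toy model of the write path: buffer -> refresh -> flush."""

    def __init__(self):
        self.buffer = []           # in-memory indexing buffer, not yet searchable
        self.translog = []         # operations recorded since the last flush
        self.cached_segments = []  # segments in the OS cache: searchable, not yet on disk
        self.disk_segments = []    # segments that a flush has committed to disk

    def index(self, doc):
        self.buffer.append(doc)
        self.translog.append(doc)

    def refresh(self):
        """Turn the buffer into a new searchable segment (every 1s by default)."""
        if self.buffer:
            self.cached_segments.append(list(self.buffer))
            self.buffer.clear()

    def flush(self):
        """Commit: persist cached segments and start a fresh translog."""
        self.refresh()
        self.disk_segments.extend(self.cached_segments)
        self.cached_segments.clear()
        self.translog.clear()

    def search(self):
        return [doc for seg in self.disk_segments + self.cached_segments for doc in seg]


shard = ToyShard()
shard.index("doc-1")
print(shard.search())   # nothing is searchable before the first refresh
shard.refresh()
print(shard.search())   # the document is searchable but not yet committed
shard.flush()
print(shard.translog)   # the translog is empty after the commit
```

The model shows why a refresh affects search visibility while only the translog and the flush affect durability.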

Installing the Elasticsearch-head plugin

Recommended installation methods

Install with npm
Install with Docker: docker run -p 9100:9100 mobz/elasticsearch-head:7
Install the Google Chrome extension: GitHub - mobz/elasticsearch-head: A web front end for an elastic search cluster

ES CRUD

Creating an index

# Create an index named "products"
PUT /products
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  }
}

Inserting documents

# Insert a document with ID 1 into the products index
PUT /products/_doc/1
{
  "name": "iPhone 13",
  "price": 799,
  "description": "Latest Apple smartphone",
  "stock": 100
}

# Omit the ID and let ES generate one
POST /products/_doc
{
  "name": "Samsung Galaxy S22",
  "price": 699,
  "description": "Latest Samsung smartphone",
  "stock": 150
}

Querying documents

# Get a document by ID
GET /products/_doc/1

# Simple search
GET /products/_search
{
  "query": {
    "match": {
      "name": "iPhone"
    }
  }
}

# Match all documents
GET /products/_search
{
  "query": {
    "match_all": {}
  }
}

Updating documents

# Replace the whole document
PUT /products/_doc/1
{
  "name": "iPhone 13 Pro",
  "price": 999,
  "description": "Latest Pro Apple smartphone",
  "stock": 80
}

# Partial update
POST /products/_update/1
{
  "doc": {
    "price": 899,
    "stock": 90
  }
}

Deleting documents and indices

# Delete the document with ID 1
DELETE /products/_doc/1

# Delete the entire index
DELETE /products

Bulk operations

POST /products/_bulk
{ "index": { "_id": "101" } }
{ "name": "MacBook Pro", "price": 1999, "category": "laptop" }
{ "index": { "_id": "102" } }
{ "name": "iPad Air", "price": 599, "category": "tablet" }
{ "delete": { "_id": "103" } }
{ "update": { "_id": "101" } }
{ "doc": { "price": 1899 } }
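The bulk body is newline-delimited JSON: one action line, optionally followed by one source line, and the body must end with a newline. A minimal sketch of building such a body programmatically; `build_bulk_body` is a hypothetical helper, and the actions reuse the example documents above.

```python
import json


def build_bulk_body(actions):
    """Build an NDJSON bulk body.

    actions: list of (action_name, metadata, source) tuples; source is None
    for actions such as delete that carry no document line.
    """
    lines = []
    for action, meta, source in actions:
        lines.append(json.dumps({action: meta}))
        if source is not None:
            lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"  # the _bulk API requires a trailing newline


body = build_bulk_body([
    ("index", {"_id": "101"}, {"name": "MacBook Pro", "price": 1999, "category": "laptop"}),
    ("delete", {"_id": "103"}, None),
    ("update", {"_id": "101"}, {"doc": {"price": 1899}}),
])
print(body, end="")
```

Serializing every line with `json.dumps` avoids hand-typed brace mismatches, which make Elasticsearch reject the request as malformed.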

ES cluster

 #node01
 hostnamectl set-hostname node01
 su
 vim /etc/hosts
 192.168.121.10 node01
 192.168.121.20 node02
 
 #node02
 hostnamectl set-hostname node02
 su
 vim /etc/hosts
 192.168.121.10 node01
 192.168.121.20 node02




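# Cluster name; must be identical on all 3 nodes
cluster.name: docker-cluster
# Node name; must be unique for each node
node.name: node-1
# Whether the node is master-eligible (default: true)
node.master: true
# Whether the node is a data node (default: true)
node.data: true
# Bind address; enables remote access. 0.0.0.0 binds all interfaces
network.host: 0.0.0.0
# HTTP (web) port
#http.port: 9200
# TCP transport port
#transport.tcp.port: 9300
# Used for node discovery: the addresses of the three nodes
discovery.seed_hosts: ["xxx.xxx.xxx.166", "xxx.xxx.xxx.167", "xxx.xxx.xxx.168"]
# Candidate master nodes used to bootstrap the cluster; any node in this list may be elected master
cluster.initial_master_nodes: ["node-1","node-2","node-3"]

In an ES cluster, configure 2 to 3 master nodes that only manage the cluster and store no data, and let the remaining data nodes handle only storage.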
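Master election works by majority, which is why the number of master-eligible nodes matters. The arithmetic below is a sketch of the general majority rule, not of Elasticsearch's internal voting-configuration logic; `quorum` and `tolerated_failures` are hypothetical helper names.

```python
def quorum(master_eligible):
    """Smallest majority of the master-eligible nodes."""
    return master_eligible // 2 + 1


def tolerated_failures(master_eligible):
    """How many master-eligible nodes can be lost while a majority remains."""
    return master_eligible - quorum(master_eligible)


for n in (1, 2, 3):
    print(f"{n} master-eligible: quorum={quorum(n)}, tolerated failures={tolerated_failures(n)}")
```

Under this rule two master-eligible nodes tolerate no failure, just like one, whereas three tolerate the loss of one node.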

 grep -v "^#" /etc/elasticsearch/elasticsearch.yml

ES passwords

xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true

Set the passwords interactively
/usr/share/elasticsearch/bin/elasticsearch-setup-passwords interactive

## Generate the passwords automatically
/usr/share/elasticsearch/bin/elasticsearch-setup-passwords auto
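Once security is enabled, every HTTP request needs credentials. A minimal sketch of building a request with HTTP Basic authentication using only the standard library; the password "changeme" is a placeholder, not a value from these notes, and `build_authenticated_request` is a hypothetical helper name. The request is built but not sent.

```python
import base64
import urllib.request


def build_authenticated_request(url, username, password):
    """Build a GET request carrying an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


# "elastic" is the built-in superuser; the password here is a placeholder.
req = build_authenticated_request(
    "http://127.0.0.1:9200/_cluster/health", "elastic", "changeme"
)
print(req.get_header("Authorization"))
# To send it against a running node: urllib.request.urlopen(req)
```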

ELK

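How ELK works

(1) Deploy Logstash on every server whose logs need to be collected; or first centralize the logs on a log server and deploy Logstash there.

(2) Logstash collects the logs, formats them, and outputs them to the Elasticsearch cluster.

(3) Elasticsearch indexes and stores the formatted data.

(4) Kibana queries the ES cluster, generates charts, and presents the data in the front end.

Elasticsearch listens on port 9200 by default; Kibana listens on port 5601.

ELK, ELFK, EFLKK

ELK: ES + Logstash + Kibana

ELFK: ES + Logstash + Filebeat + Kibana

EFLKK: ES + Filebeat + Logstash + Kafka + Kibana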

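Filebeat:

A lightweight open-source shipper for log files. Filebeat is usually installed on the clients whose data needs to be collected; once given the directories and log formats, it collects data quickly and sends it to Logstash for parsing or directly to Elasticsearch for storage. It has a clear performance advantage over Logstash, which runs on the JVM, and can replace Logstash for collection. It is commonly used in the EFLK architecture. (Filebeat cannot fully replace Logstash when filtering is needed: its own processing capabilities are limited, so collected data is sent on to Logstash for processing.)

Benefits of combining Filebeat with Logstash:

Logstash has an adaptive disk-based buffering system that absorbs incoming throughput, relieving Elasticsearch of sustained write pressure.
It can ingest from other data sources, such as databases, S3 object storage, or message queues.
It can send data to multiple destinations, such as S3, HDFS (Hadoop Distributed File System), or files.
It can compose more complex processing pipelines using conditional dataflow logic.

Fluentd:

A popular open-source data collector. It emerged in response to Logstash being heavyweight, with comparatively low performance and high resource consumption. Compared with Logstash, Fluentd is easier to use, consumes fewer resources, performs better, and processes data more efficiently and reliably. It is popular with enterprises as an alternative to Logstash and is commonly used in the EFK architecture. EFK is also widely used for log collection in Kubernetes clusters.
In a Kubernetes cluster, Fluentd usually runs as a DaemonSet so that one Pod runs on every worker node. It reads the container log files, filters and transforms the log data, and passes it to the Elasticsearch cluster, where it is indexed and stored.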

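ELK cluster deployment

Lab environment:

Server	OS and IP address	Components to install	Hardware
node03	CentOS 7.4 (64-bit) 192.168.91.105	Elasticsearch, Kibana	2 cores, 4 GB
node04	CentOS 7.4 (64-bit) 192.168.91.106	Elasticsearch	2 cores, 4 GB
Apache node	CentOS 7.4 (64-bit) 192.168.121.30	Logstash, Apache	2 cores, 4 GB

Tuning the memory available to the elasticsearch user:

ES is built on Lucene, and one of Lucene's strengths is how well it uses operating system memory to cache index data for fast queries. Lucene index segments are stored in individual files and are immutable, which makes them very cache-friendly: the OS can keep them in its cache for fast access. It is therefore important to leave half of the physical memory to Lucene and give the other half to ES (the JVM heap). For ES memory settings, the following rules apply:

When the machine has less than 64 GB of RAM, follow the general rule: 50% for ES and 50% left to the operating system for Lucene.
When the machine has more than 64 GB of RAM, allocate 4 to 32 GB to ES and leave the rest to the operating system for Lucene.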
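The two rules can be written as a small calculation. This is a sketch of the rule of thumb stated above, not an official sizing formula: `recommended_heap_gb` and `jvm_heap_options` are hypothetical helpers, and the 32 GB cap is the upper bound quoted in these notes.

```python
def recommended_heap_gb(ram_gb, cap_gb=32):
    """Half of physical RAM for the JVM heap, capped at cap_gb."""
    return min(ram_gb // 2, cap_gb)


def jvm_heap_options(ram_gb):
    """Render matching -Xms/-Xmx lines for jvm.options."""
    heap = recommended_heap_gb(ram_gb)
    return f"-Xms{heap}g\n-Xmx{heap}g"


for ram in (4, 16, 64, 128):
    print(ram, "GB RAM ->", recommended_heap_gb(ram), "GB heap")
print(jvm_heap_options(16))
```

Setting -Xms and -Xmx to the same value keeps the heap from being resized at runtime.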

Create the data directory and set its ownership

mkdir -p /data/elasticsearch
chown elasticsearch:elasticsearch /data/elasticsearch/

6. Start Elasticsearch and check that it started successfully

 systemctl start elasticsearch.service
 systemctl enable elasticsearch.service
 netstat -antp | grep 9200

7. View node information

 In a browser, open
 http://192.168.91.112:9200
 http://192.168.91.113:9200
 to view information about nodes Node1 and Node2.
 
 In a browser, open
 http://192.168.91.112:9200/_cluster/health?pretty
 http://192.168.91.113:9200/_cluster/health?pretty
 to check the cluster health. A status value of green means the nodes are running healthily.
 
 In a browser, open http://192.168.91.112:9200/_cluster/state?pretty to check the cluster state.

Logstash deployment (on the Apache node)

Logstash is usually deployed on the servers whose logs are to be monitored. In this example it is deployed on the Apache server to collect the Apache logs and send them to Elasticsearch.

1. Change the hostname

 hostnamectl set-hostname apache

2. Install the Apache service (httpd)

 yum -y install httpd
 systemctl start httpd

3. Install Java

 yum -y install java
 java -version

4. Install Logstash

 # Upload the package logstash-5.5.1.rpm to /opt
 cd /opt
 rpm -ivh logstash-5.5.1.rpm
 systemctl start logstash.service
 systemctl enable logstash.service
 
 # Link logstash into a directory on PATH so the shell can find it
 ln -s /usr/share/logstash/bin/logstash  /usr/local/bin/

5. Test Logstash

(1) Common Logstash command-line options:

Option	Purpose
-f	Specifies the Logstash configuration file; the input and output streams are configured from it.
-e	Takes the configuration from the command line as a string (if the string is empty, stdin is used as input and stdout as output by default).
-t	Tests whether the configuration file is valid, then exits.

(2) Define the input and output streams:

 #1. Standard input as input, standard output as output (similar to a pipe)
 logstash -e 'input { stdin{} } output { stdout{} }'
 ......
 www.baidu.com                                       # typed in (standard input)
 2020-12-22T03:58:47.799Z node1 www.baidu.com        # result (standard output)
 www.sina.com.cn                                     # typed in (standard input)
 2017-12-22T03:59:02.908Z node1 www.sina.com.cn      # result (standard output)
 
 # Press ctrl+c to exit
 
 
 #2. Use the rubydebug codec for detailed output; a codec is an encoder/decoder
 logstash -e 'input { stdin{} } output { stdout{ codec=>rubydebug } }'
 ......
 www.baidu.com                                       # typed in (standard input)
 {
     "@timestamp" => 2020-12-22T02:15:39.136Z,       # result (after processing)
       "@version" => "1",
           "host" => "apache",
        "message" => "www.baidu.com"
 }
 
 
 #3. Use Logstash to write data into Elasticsearch
 logstash -e 'input { stdin{} } output { elasticsearch { hosts=>["192.168.80.10:9200"] } }'
 #              input               output        target
 ......
 www.baidu.com                                       # typed in (standard input)
 www.sina.com.cn                                     # typed in (standard input)
 www.google.com                                      # typed in (standard input)
 
 # The result is not shown on standard output but sent to Elasticsearch; open http://192.168.121.10:9100/ in a browser to view the index information and browse the data.
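6. Define the Logstash configuration file

A Logstash configuration file consists of three sections: input, output, and filter (optional, used as needed).

input: collects data from a source; common sources include Kafka and log files.
filter: the processing layer; it formats data, converts data types, filters data, and so on. Regular expressions are supported.
output: sends the data collected by Logstash, after filter processing, to Elasticsearch.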
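The three stages can be pictured as a chain of generators. This is a toy sketch of the input, filter, output idea only, not of how Logstash itself is implemented: the log line format, the regular expression, and the function names are all assumptions made for illustration.

```python
import re

# Hypothetical line format: "<client> <METHOD> <path> <status>"
LINE_RE = re.compile(r"^(?P<client>\S+) (?P<method>[A-Z]+) (?P<path>\S+) (?P<status>\d{3})$")


def read_input(lines):
    """input: wrap each raw line in an event."""
    for line in lines:
        yield {"message": line.rstrip("\n")}


def apply_filter(events):
    """filter: parse fields with a regex, convert types, drop unparsable events."""
    for event in events:
        match = LINE_RE.match(event["message"])
        if match is None:
            continue  # drop the event
        event.update(match.groupdict())
        event["status"] = int(event["status"])
        yield event


def write_output(events):
    """output: collect events; a real pipeline would ship them to Elasticsearch."""
    return list(events)


result = write_output(apply_filter(read_input(["10.0.0.1 GET /index.html 200", "garbage"])))
print(result)
```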

 # The format is:
 input {...}
 filter {...}
 output {...}
 
 # Each section can contain several plugins. For example, to specify two log source files:
 input {
     file { path =>"/var/log/messages" type =>"syslog"}
     file { path =>"/var/log/httpd/access.log" type =>"apache"}
 }

Modify the Logstash configuration file:

 # Configure Logstash to collect the system log /var/log/messages and output it to Elasticsearch.
 chmod +r /var/log/messages                  # allow Logstash to read the log
 
 vim /etc/logstash/conf.d/system.conf
 input {
     file{
         path =>"/var/log/messages"                      # location of the log to collect
         type =>"system"                                 # custom log type label
         start_position =>"beginning"                    # collect from the beginning of the file
     }
 }
 output {
     elasticsearch {                                     # output to Elasticsearch
         hosts => ["192.168.121.10:9200","192.168.121.20:9200"]  # Elasticsearch server addresses and ports
         index =>"system-%{+YYYY.MM.dd}"                 # format of the index name written to Elasticsearch
     }
 }
 
 systemctl restart logstash 
 
 Open http://192.168.91.112:9100/ in a browser to view the index information
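The `%{+YYYY.MM.dd}` suffix in the index setting is filled in from the event's @timestamp, which Logstash keeps in UTC, so one index is created per UTC day. A minimal sketch of the resulting names; `daily_index` is a hypothetical helper, not a Logstash API.

```python
from datetime import datetime, timedelta, timezone


def daily_index(prefix, timestamp):
    """Render '<prefix>-YYYY.MM.dd' from a timezone-aware timestamp, in UTC."""
    return f"{prefix}-{timestamp.astimezone(timezone.utc):%Y.%m.%d}"


utc_event = datetime(2020, 12, 22, 3, 58, 47, tzinfo=timezone.utc)
print(daily_index("system", utc_event))

# 01:00 at UTC+8 is still the previous day in UTC, so the event lands in the earlier index.
local_event = datetime(2020, 12, 22, 1, 0, tzinfo=timezone(timedelta(hours=8)))
print(daily_index("system", local_event))
```

This is why the Kibana index pattern is entered as system-*: it matches every daily index.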

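Kibana deployment

1. Install Kibana

www.elastic.co/cn/what-is/…

 # Upload the package kibana-5.5.1-x86_64.rpm to /opt
 cd /opt
 rpm -ivh kibana-5.5.1-x86_64.rpm

2. Edit the main Kibana configuration file

 vim /etc/kibana/kibana.yml
 #--line 2-- uncomment; Kibana listens on port 5601 by default
 server.port: 5601
 
 #--line 7-- uncomment; sets Kibana's listen address, 0.0.0.0 means all addresses
 server.host: "0.0.0.0"
 
 #--line 28-- uncomment; URL of the ES server (for a cluster, use a master node). elasticsearch.url takes a single URL; Kibana 7.x replaces it with elasticsearch.hosts, which accepts a list
 elasticsearch.url: "http://192.168.121.10:9200"
 
 #--line 37-- uncomment; stores Kibana's own data in the .kibana index in Elasticsearch
 kibana.index: ".kibana"
 
 #--line 96-- uncomment; path of the Kibana log file (must be created manually); otherwise the logs end up in messages
 logging.dest: /var/log/kibana.log

 # A Chinese UI is supported from 6.7 onward
 i18n.locale: "zh-CN"

3. Create the log file and start the Kibana service

 touch /var/log/kibana.log
 chown kibana:kibana /var/log/kibana.log
 
 systemctl start kibana.service
 systemctl enable kibana.service
 
 netstat -natp | grep 5601

4. Verify Kibana

 Open http://192.168.91.112:5601 in a browser.
 On first login, add an Elasticsearch index:
 Index name or pattern
 //Enter: system-*           # the "system" prefix from the output configured earlier
 
 Click "create" to create it, then click "Discover" to view the charts and log entries.
 The data can be broken down by field: under "Available Fields", find "host" and click "add" to see the results filtered by host.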

5. Add the Apache server logs (access and error) to Elasticsearch and display them in Kibana

 vim /etc/logstash/conf.d/apache_log.conf
 input {
     file{
         path => "/etc/httpd/logs/access_log"   # location of the access log
         type => "access"
         start_position => "beginning"    # beginning reads from the start of the file; to collect only new entries, use end
     }
     file{
         path => "/etc/httpd/logs/error_log"   # location of the error log
         type => "error"
         start_position => "beginning"
     }
 }
 output {
     if [type] == "access" {
         elasticsearch {
             hosts => ["192.168.121.10:9200","192.168.121.20:9200"]
             index => "apache_access-%{+YYYY.MM.dd}"      # index suffix is the current date
         }
     }
     if [type] == "error" {
         elasticsearch {
             hosts => ["192.168.121.10:9200","192.168.121.20:9200"]
             index => "apache_error-%{+YYYY.MM.dd}"     # index suffix is the current date
         }  
     }
 }
 
 cd /etc/logstash/conf.d/
 /usr/share/logstash/bin/logstash -f apache_log.conf

6. Browser access

 Open http://192.168.121.10:9100 in a browser to check that the indices were created
 
 Open http://192.168.121.10:5601 in a browser to log in to Kibana.
 Click "Index Pattern -> Create Index Pattern" to add an index pattern, enter the previously configured output prefix apache_access-* as the index name, and click "Create". Add the apache_error-* index pattern the same way.
 On the "Discover" tab, select the newly added apache_access-* or apache_error-* pattern from the drop-down list to view the corresponding charts and log entries.

apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
    k8s-app: elasticsearch-logging
    version: v7.4.2
  name: elasticsearch-logging
  namespace: logging
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: elasticsearch-logging
      version: v7.4.2
  serviceName: elasticsearch-logging
  template:
    metadata:
      labels:
        k8s-app: elasticsearch-logging
        version: v7.4.2
    spec:
      nodeSelector:
        log: "true" ## Node to deploy on; adjust for your environment
      containers:
      - env:
        - name: NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        - name: cluster.initial_master_nodes
          value: elasticsearch-logging-0
        - name: ES_JAVA_OPTS
          value: "-Xms512m -Xmx512m"
        image: 172.21.32.13:5000/elasticsearch/elasticsearch:7.4.2
        name: elasticsearch-logging
        ports:
        - containerPort: 9200
          name: db
          protocol: TCP
        - containerPort: 9300
          name: transport
          protocol: TCP
        volumeMounts:
        - mountPath: /usr/share/elasticsearch/data
          name: elasticsearch-logging
      dnsConfig:
        options:
        - name: single-request-reopen
      initContainers:
      - command:
        - /sbin/sysctl
        - -w
        - vm.max_map_count=262144
        image: alpine:3.6
        imagePullPolicy: IfNotPresent
        name: elasticsearch-logging-init
        resources: {}
        securityContext:
          privileged: true
      - name: fix-permissions
        image: alpine:3.6
        command: ["sh", "-c", "chown -R 1000:1000 /usr/share/elasticsearch/data"]
        securityContext:
          privileged: true
        volumeMounts:
        - name: elasticsearch-logging
          mountPath: /usr/share/elasticsearch/data
  volumeClaimTemplates:
  - metadata:
      name: elasticsearch-logging
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 10Gi # Adjust the size as needed
      storageClassName: standard # Change to your desired storage class
---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: elasticsearch-logging
  name: elasticsearch
  namespace: logging
spec:
  ports:
  - port: 9200
    protocol: TCP
    targetPort: db
  selector:
    k8s-app: elasticsearch-logging
  type: ClusterIP

Installing EFK on Kubernetes

es

apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
    app: elasticsearch-logging
  name: elasticsearch-logging
  namespace: logging
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: elasticsearch-logging
  serviceName: elasticsearch-logging
  template:
    metadata:
      labels:
        app: elasticsearch-logging
    spec:
      securityContext:
        fsGroup: 2000
      nodeSelector:
        log: "true" ## Node to deploy on; adjust for your environment
      containers:
      - env:
        - name: cluster.name
          value: k8s-logs
        - name: node.name
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: discovery.seed_hosts
          value: "elasticsearch-logging-0"
        - name: network.host
          value: "0.0.0.0"
        - name: cluster.initial_master_nodes
          value: "elasticsearch-logging-0"
        - name: ES_JAVA_OPTS
          value: "-Xms512m -Xmx512m"
        image:  elasticsearch:7.4.2
        name: elasticsearch-logging
        ports:
        - containerPort: 9200
          name: db
          protocol: TCP
        - containerPort: 9300
          name: transport
          protocol: TCP
        resources:
          limits:
            cpu: 1000m
            memory: 1Gi
          requests:
            cpu: 1000m
            memory: 1Gi
        volumeMounts:
        - mountPath: /usr/share/elasticsearch/data
          name: elasticsearch-logging
#      dnsConfig:
#        options:
#        - name: single-request-reopen
      initContainers:
#      - name: fix-permissions
#        image: busybox
#        imagePullPolicy: IfNotPresent
#        command: ["sh", "-c", "chown -R 1000:1000 /usr/share/elasticsearch/data"]
#        securityContext:
#          privileged: true
#        volumeMounts:
#        - name: elasticsearch-logging
#          mountPath: /usr/share/elasticsearch/data
      - name: increase-vm-max-map
        image: busybox
        imagePullPolicy: IfNotPresent
        command: ["sysctl", "-w", "vm.max_map_count=262144"]
        securityContext:
          privileged: true
      - name: increase-fd-ulimit
        image: busybox
        imagePullPolicy: IfNotPresent
        command: ["sh", "-c", "ulimit -n 65536"]
        securityContext:
          privileged: true
        volumeMounts:
        - name: elasticsearch-logging
          mountPath: /usr/share/elasticsearch/data
  volumeClaimTemplates:
  - metadata:
      name: elasticsearch-logging
      labels:
        app: elasticsearch-logging
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 30Gi # Adjust the size as needed
      storageClassName: nfs-client-storageclass
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: elasticsearch-logging
  name: elasticsearch
  namespace: logging
spec:
  ports:
  - port: 9200
    protocol: TCP
    targetPort: db
  selector:
    app: elasticsearch-logging
  type: ClusterIP

kubectl label node k8s-node03 log=true

kibana

apiVersion: apps/v1
kind: Deployment
metadata:
  name: kibana
  namespace: logging
  labels:
    app: kibana
spec:
  selector:
    matchLabels:
      app: kibana
  template:
    metadata:
      labels:
        app: kibana
    spec:
      containers:
      - name: kibana
        image: kibana:7.4.2
        resources:
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        env:
          - name: ELASTICSEARCH_HOSTS
            value: http://elasticsearch:9200
        ports:
        - containerPort: 5601
---
apiVersion: v1
kind: Service
metadata:
  name: kibana
  namespace: logging
  labels:
    app: kibana
spec:
  ports:
  - name: http
    port: 5601
    targetPort: 5601
    nodePort: 30056
  type: NodePort
  selector:
    app: kibana

fluentd

apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd-es
  namespace: logging
  labels:
    app: fluentd-es
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile

---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: fluentd-es
  labels:
    app: fluentd-es
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
rules:
  - apiGroups:
      - ""
    resources:
      - "namespaces"
      - "pods"
    verbs:
      - "get"
      - "watch"
      - "list"

---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: fluentd-es
  labels:
    app: fluentd-es
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
subjects:
  - kind: ServiceAccount
    name: fluentd-es
    namespace: logging
    apiGroup: ""
roleRef:
  kind: ClusterRole
  name: fluentd-es
  apiGroup: ""

---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
    app: fluentd-es
  name: fluentd-es
  namespace: logging
spec:
  selector:
    matchLabels:
      app: fluentd-es
  template:
    metadata:
      labels:
        app: fluentd-es
    spec:
      containers:
        - env:
            - name: FLUENTD_ARGS
              value: --no-supervisor -q
          image: fluent/fluentd-kubernetes-daemonset:v1-debian-elasticsearch
          imagePullPolicy: IfNotPresent
          name: fluentd-es
          resources:
            limits:
              memory: 500Mi
            requests:
              cpu: 100m
              memory: 200Mi
          volumeMounts:
            - mountPath: /var/log
              name: varlog
            - mountPath: /var/lib/docker/containers
              name: varlibdockercontainers
              readOnly: true
            - mountPath: /fluentd/etc/config.d
              name: config-volume
            - mountPath: /fluentd/etc/fluent.conf
              name: config-volume-main
              subPath: fluent.conf
      nodeSelector:
        fluentd: "true"
      securityContext: {}
      serviceAccount: fluentd-es
      serviceAccountName: fluentd-es
      volumes:
        - hostPath:
            path: /var/log
            type: ""
          name: varlog
        - hostPath:
            path: /var/lib/docker/containers
            type: ""
          name: varlibdockercontainers
        - configMap:
            defaultMode: 420
            name: fluentd-config
          name: config-volume
        - configMap:
            defaultMode: 420
            items:
              - key: fluent.conf
                path: fluent.conf
            name: fluentd-es-config-main
          name: config-volume-main

fluentd-es-config-main.yaml

apiVersion: v1
data:
  fluent.conf: |-
    # This is the root config file, which only includes components of the actual configuration
    #
    #  Do not collect fluentd's own logs to avoid infinite loops.
    <match fluent.**>
    @type null
    </match>

    @include /fluentd/etc/config.d/*.conf
kind: ConfigMap
metadata:
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
  name: fluentd-es-config-main
  namespace: logging

fluentd-configmap.yaml

kind: ConfigMap
apiVersion: v1
metadata:
  name: fluentd-config
  namespace: logging
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
data:
  containers.input.conf: |-
    <source>
      @id fluentd-containers.log
      @type tail
      path /var/log/containers/*.log
      pos_file /var/log/es-containers.log.pos
      time_format %Y-%m-%dT%H:%M:%S.%NZ
      localtime
      tag raw.kubernetes.*
      format json
      read_from_head true
    </source>
    # Detect exceptions in the log output and forward them as one log entry.
    # https://github.com/GoogleCloudPlatform/fluent-plugin-detect-exceptions 
    <match raw.kubernetes.**>
      @id raw.kubernetes
      @type detect_exceptions
      remove_tag_prefix raw
      message log
      stream stream
      multiline_flush_interval 5
      max_bytes 500000
      max_lines 1000
    </match>
  output.conf: |-
    # Enriches records with Kubernetes metadata
    <filter kubernetes.**>
      @type kubernetes_metadata
    </filter>
    <match **>
      @id elasticsearch
      @type elasticsearch
      @log_level info
      include_tag_key true
      host elasticsearch
      port 9200
      logstash_format true
      request_timeout    30s
      <buffer>
        @type file
        path /var/log/fluentd-buffers/kubernetes.system.buffer
        flush_mode interval
        retry_type exponential_backoff
        flush_thread_count 2
        flush_interval 5s
        retry_forever
        retry_max_interval 30
        chunk_limit_size 2M
        queue_limit_length 8
        overflow_action block
      </buffer>
    </match>
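The buffer section above retries failed flushes with exponential backoff capped by retry_max_interval. The schedule can be sketched as follows, assuming Fluentd's documented defaults of a 1 second initial retry_wait and a backoff base of 2, and ignoring the random jitter Fluentd adds by default; `retry_waits` is a hypothetical helper.

```python
def retry_waits(attempts, retry_wait=1.0, base=2.0, max_interval=30.0):
    """Wait before each retry: retry_wait * base**n, capped at max_interval."""
    return [min(retry_wait * base ** n, max_interval) for n in range(attempts)]


print(retry_waits(7))
```

With retry_forever set, the waits stay at the 30 second cap until Elasticsearch becomes reachable again, while overflow_action block makes the input side wait once the buffer queue is full.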
