Building a Load Balancer with DiDi Cloud DC2 + Nginx

Original article: blog.didiyun.com

This article is part of DiDi Cloud's open-source tools tutorial series.

Nginx is a lightweight, high-performance web server designed for high-traffic scenarios.

This article focuses on Nginx's health-check and load-balancing mechanisms. The two are complementary: health checks promptly mark backend real servers (RS) that have failed, so that traffic is distributed only to available RS, improving the system's reliability and availability.

Nginx supports a rich set of third-party modules. The examples here use ngx_http_upstream_round_robin (round-robin, RR for short) as the load-balancing module and ngx_http_proxy_module as the backend proxy module.

There are two ways to do health checking:
1) The ngx_http_proxy_module and ngx_http_upstream_module (built into Nginx):
http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_next_upstream

2) The nginx_upstream_check_module (developed by the Taobao engineering team):
https://github.com/yaoweibin/nginx_upstream_check_module
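The first approach is passive: a backend that fails `max_fails` times within `fail_timeout` is skipped for `fail_timeout` seconds, then retried. A minimal Python sketch of that logic (the `Peer` class and its names are illustrative, not Nginx internals):

```python
import time

class Peer:
    """Illustrative model of Nginx's passive check (max_fails/fail_timeout)."""
    def __init__(self, addr, max_fails=1, fail_timeout=10):
        self.addr = addr
        self.max_fails = max_fails
        self.fail_timeout = fail_timeout
        self.fails = 0        # failures in the current window
        self.checked = 0.0    # time of the first failure in the window

    def available(self, now=None):
        now = time.time() if now is None else now
        if self.fails >= self.max_fails:
            # Skip the peer until fail_timeout has elapsed, then retry it.
            if now - self.checked < self.fail_timeout:
                return False
            self.fails = 0    # window expired: give the peer another chance
        return True

    def report(self, success, now=None):
        now = time.time() if now is None else now
        if success:
            self.fails = 0
        else:
            if self.fails == 0:
                self.checked = now   # start of a new failure window
            self.fails += 1

peer = Peer("10.255.15.111:80", max_fails=3, fail_timeout=30)
for _ in range(3):
    peer.report(False, now=0)
print(peer.available(now=10))   # False: 3 failures within the window
print(peer.available(now=40))   # True: fail_timeout elapsed, peer retried
```

The second approach (covered in step 4 below) probes the backends actively instead of waiting for real client requests to fail.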

Nginx's upstream block currently supports five load-balancing algorithms:

  • Round-robin (default)
    Requests are distributed to the backend servers one by one in order; if a backend goes down, it is removed automatically.

  • weight
    Sets the round-robin probability; a backend's weight is proportional to the share of traffic it receives. Useful when backend performance is uneven.

  • ip_hash
    Each request is assigned by the hash of the client IP, so each visitor sticks to one backend server, which solves session-affinity problems.

  • fair (third-party)
    Requests are assigned by backend response time; backends with shorter response times are preferred.

  • url_hash (third-party)
    Requests are assigned by the hash of the requested URL, so each URL goes to the same backend. Effective when the backends are caches.

Deployment steps:

1) On the DiDi Cloud website, request as many DC2 instances as you need (this keeps costs down).

2) Scenario:

This example uses four DiDi Cloud DC2 instances: 10.255.10.12 (Client), 10.255.44.122 (Nginx Proxy), 10.255.15.111 (RS1), and 10.255.24.133 (RS2). The DC2 instances only need to be in the same VPC; they do not have to be in the same subnet.

3) Configure Nginx load balancing and health checks:
Nginx can be installed on a DC2 instance with yum install nginx. Nginx starts with conf/nginx.conf by default, so we configure load balancing and health checks in nginx.conf (default path /etc/nginx/nginx.conf) as needed.

nginx.conf is as follows (note: the configuration below is for testing only and does not represent a real production setup, which would need more configuration and tuning):

    user nginx;      
    worker_processes auto;      
    error_log /var/log/nginx/error.log;      
    pid /run/nginx.pid;    
 
    # Load dynamic modules. See /usr/share/nginx/README.dynamic.      
    include /usr/share/nginx/modules/*.conf;        
 
    events {      
        worker_connections 1024;      
    }      
 
    http {      
        # Set the log format
        log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '    
                       '$status $body_bytes_sent "$http_referer" '    
                       '"$http_user_agent" "$http_x_forwarded_for"';    
 
        access_log  /var/log/nginx/access.log  main;  
 
        sendfile            on;  
        tcp_nopush          on;  
        tcp_nodelay         on;  
        keepalive_timeout   65;  
        types_hash_max_size 2048;
 
        include             /etc/nginx/mime.types;
        default_type        application/octet-stream;
 
        # Load modular configuration files from the /etc/nginx/conf.d directory.
        # See http://nginx.org/en/docs/ngx_core_module.html#include
        # for more information.
        include /etc/nginx/conf.d/*.conf;
 
        upstream cluster {
            # ip_hash keeps each client on one backend (session stickiness); include it only if needed
            ip_hash;  
            # max_fails=3 is the number of failures allowed (default 1); this is the passive health check on backend nodes
            server 10.255.15.111:80 max_fails=3 fail_timeout=30s;  
            # fail_timeout=30s is how long to stop sending requests to a backend after max_fails failures
            server 10.255.24.133:80 max_fails=3 fail_timeout=30s;  
        }
 
        server {
            listen       80 default_server;
            listen       [::]:80 default_server;
            server_name  hello.test.com;
            root         /usr/share/nginx/html;
 
            # Load configuration files for the default server block.
            include /etc/nginx/default.d/*.conf;
 
            location / {
                proxy_pass http://cluster;  
                proxy_redirect off;  
                proxy_set_header Host $host;  
                proxy_set_header X-Real-IP $remote_addr;  
                proxy_set_header REMOTE-HOST $remote_addr;  
                proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;  
                # Timeout for establishing a connection with the backend (handshake wait)
                proxy_connect_timeout 300;  
                # Timeout for transmitting the request to the backend server
                proxy_send_timeout 300;    
                # Timeout for reading the backend's response after the connection succeeds
                proxy_read_timeout 600;  
                # Buffer for the first part of the backend response (the response headers)
                proxy_buffer_size 256k;    
                # Number and size of buffers for a single connection
                proxy_buffers 4 256k;    
                # Buffer space that may be busy sending to the client while the response is still being read
                proxy_busy_buffers_size 256k;    
                # Amount written to a temporary file at a time when the response does not fit in the buffers
                proxy_temp_file_write_size 256k;        
                proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504 http_403 http_404;  
                proxy_max_temp_file_size 128m;  
                # If requests for this domain do not need caching, comment out the following four lines
                # (the "mycache" zone must be defined with proxy_cache_path in the http block)
                proxy_cache mycache;                    
                proxy_cache_valid 200 302 1h;  
                proxy_cache_valid 301 1d;  
                proxy_cache_valid any 1m;  
            }
 
            error_page 404 /404.html;
                location = /40x.html {
            }
 
            error_page 500 502 503 504 /50x.html;
                location = /50x.html {
            }
        }
    }
 

After modifying the configuration file, run nginx -s reload to reload Nginx gracefully; the changes take effect immediately.

4) ngx_http_upstream_check_module
This module provides Tengine with active health checks of backend servers. It was not enabled by default before Tengine 1.4.0; it can be enabled at build time with: ./configure --with-http_upstream_check_module

    http {    
        upstream cluster1 {    
            #simple RR  
            server 10.255.15.111:8020;    
            server 10.255.24.133:8021;    
            check interval=3000 rise=2 fall=5 timeout=1000 type=http;    
            check_http_send "HEAD / HTTP/1.0\r\n\r\n";  
            check_http_expect_alive http_2xx http_3xx;  
        }  
 
        upstream cluster2 {  
            #simple RR  
            server 10.255.15.111:8020;    
            server 10.255.24.133:8021;  
            check interval=3000 rise=2 fall=5 timeout=1000 type=http;  
            check_keepalive_requests 100;    
            check_http_send "HEAD / HTTP/1.1\r\nConnection: keep-alive\r\n\r\n";  
            check_http_expect_alive http_2xx http_3xx;  
        }
 
        server {  
            listen       80 default_server;  
            listen       [::]:80 default_server;  
            server_name  hello.test.com;  
            root         /usr/share/nginx/html;  
 
            # Load configuration files for the default server block.  
            include /etc/nginx/default.d/*.conf;  
 
            location /1 {  
                proxy_pass http://cluster1;    
                ...
            }  
 
            location /2 {  
                proxy_pass http://cluster2;    
                ...
            }  
 
            location /status {  
                check_status;  
                ...
                access_log   off;  
                #allow SOME.IP.ADD.RESS;  
                #deny all;  
            }
 
            error_page 404 /404.html;  
                location = /40x.html {  
            }  
 
            error_page 500 502 503 504 /50x.html;
                location = /50x.html {
            }
        }
    }  
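The check interval=3000 rise=2 fall=5 timeout=1000 directive above drives an active probe loop: a peer is marked down after `fall` consecutive probe failures and back up after `rise` consecutive successes. A minimal sketch of that hysteresis logic (the `ActiveCheck` class is illustrative, not the module's C implementation):

```python
class ActiveCheck:
    """Rise/fall hysteresis as used by nginx_upstream_check_module (sketch)."""
    def __init__(self, rise=2, fall=5):
        self.rise, self.fall = rise, fall
        self.rise_count = 0
        self.fall_count = 0
        self.up = True          # peers start out up

    def probe(self, success):
        if success:
            self.rise_count += 1
            self.fall_count = 0
            if self.rise_count >= self.rise:
                self.up = True  # enough consecutive successes: back up
        else:
            self.fall_count += 1
            self.rise_count = 0
            if self.fall_count >= self.fall:
                self.up = False  # enough consecutive failures: mark down
        return self.up

peer = ActiveCheck(rise=2, fall=5)
for _ in range(5):              # 5 consecutive failed probes...
    peer.probe(False)
print(peer.up)                  # False: peer marked down
peer.probe(True)
print(peer.up)                  # False: one success is not enough (rise=2)
peer.probe(True)
print(peer.up)                  # True: two consecutive successes bring it back
```

Requiring several consecutive results in each direction prevents a single flaky probe from flapping the peer's state.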
 

5) Query the health-check status with curl http://localhost/status?format=json:

    {"servers": {  
      "total": 2,  
      "generation": 1,  
      "server": [  
        {"index": 0, "upstream": "cluster", "name": "10.255.15.111:8020", "status": "up", "rise": 35, "fall": 0, "type": "http", "port": 0},  
        {"index": 1, "upstream": "cluster", "name": "10.255.24.133:8021", "status": "up", "rise": 26, "fall": 0, "type": "http", "port": 0}  
     ]  
    }}  
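This JSON output is easy to consume from a monitoring script. A small Python check that reports any backend whose status is not "up" (the field names are taken from the sample output; `down_backends` is a hypothetical helper, and in practice the payload would come from the curl call above):

```python
import json

# Sample payload, as returned by /status?format=json
status_json = '''
{"servers": {"total": 2, "generation": 1, "server": [
  {"index": 0, "upstream": "cluster", "name": "10.255.15.111:8020",
   "status": "up", "rise": 35, "fall": 0, "type": "http", "port": 0},
  {"index": 1, "upstream": "cluster", "name": "10.255.24.133:8021",
   "status": "up", "rise": 26, "fall": 0, "type": "http", "port": 0}]}}
'''

def down_backends(payload):
    """Return the names of backends whose status is not 'up'."""
    servers = json.loads(payload)["servers"]["server"]
    return [s["name"] for s in servers if s["status"] != "up"]

print(down_backends(status_json))  # [] when every RS is healthy
```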
 

Summary:

DiDi Cloud offers SLB, a highly available and reliable managed load-balancing product. Users can also build their own load-balancing setup with Nginx: it provides rich third-party modules and flexible deployment, so a DC2 + Nginx load-balancing network is a good choice as well.