起因是我们的一个算法接口服务的镜像,在服务器上部署的时候,一下会占用60g的内存,但这个服务本身不到1g。业务需要对这个服务进行优化。 我逛遍了各种论坛也没有找到原因。 后来直接进入镜像查看配置文件。 gunicorn_conf.py
import json
import multiprocessing
import os
workers_per_core_str = os.getenv("WORKERS_PER_CORE", "1")
max_workers_str = os.getenv("MAX_WORKERS")
use_max_workers = None
if max_workers_str:
use_max_workers = int(max_workers_str)
web_concurrency_str = os.getenv("WEB_CONCURRENCY", None)
host = os.getenv("HOST", "0.0.0.0")
port = os.getenv("PORT", "80")
bind_env = os.getenv("BIND", None)
use_loglevel = os.getenv("LOG_LEVEL", "info")
if bind_env:
use_bind = bind_env
else:
use_bind = f"{host}:{port}"
cores = multiprocessing.cpu_count()
workers_per_core = float(workers_per_core_str)
default_web_concurrency = workers_per_core * cores
if web_concurrency_str:
web_concurrency = int(web_concurrency_str)
assert web_concurrency > 0
else:
web_concurrency = max(int(default_web_concurrency), 2)
if use_max_workers:
web_concurrency = min(web_concurrency, use_max_workers)
accesslog_var = os.getenv("ACCESS_LOG", "-")
use_accesslog = accesslog_var or None
errorlog_var = os.getenv("ERROR_LOG", "-")
use_errorlog = errorlog_var or None
graceful_timeout_str = os.getenv("GRACEFUL_TIMEOUT", "120")
timeout_str = os.getenv("TIMEOUT", "120")
keepalive_str = os.getenv("KEEP_ALIVE", "5")
# Gunicorn config variables
loglevel = use_loglevel
workers = web_concurrency
bind = use_bind
errorlog = use_errorlog
worker_tmp_dir = "/dev/shm"
accesslog = use_accesslog
graceful_timeout = int(graceful_timeout_str)
timeout = int(timeout_str)
keepalive = int(keepalive_str)
# For debugging and testing
log_data = {
"loglevel": loglevel,
"workers": workers,
"bind": bind,
"graceful_timeout": graceful_timeout,
"timeout": timeout,
"keepalive": keepalive,
"errorlog": errorlog,
"accesslog": accesslog,
# Additional, non-gunicorn variables
"workers_per_core": workers_per_core,
"use_max_workers": use_max_workers,
"host": host,
"port": port,
}
print(json.dumps(log_data))
web_concurrency:如果在不设置的情况下,default_web_concurrency =min( workers_per_core * cores,use_max_workers),也就是如果没有use_max_workers,那么服务器上有多少进程就启动多少个进程,一个进程等于一个完整的服务。 服务器上一共32核64进程,部署服务就会启64个服务。
通过修改default_web_concurrency或者use_max_workers就可以解决这个问题了。 解决方法是挺简单的,就是找不到,这里记录一下,万一有人和我一样遇到相同问题呢!