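1: The problem
When accessing the company intranet, I noticed two consecutive 302 redirects:
First: 302 Moved Temporarily, on an http URL;
Second: 302 FOUND, on an https URL;
I suspected that one was the 302 nginx issues to force https, and the other was a 302 issued inside Flask.
Only a single 302 was expected.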
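2: Why are there two 302s?
Investigation showed that the redirect logic in the Flask project uses request.host_url to build the 302 target. Logging showed that request.host_url carried the http scheme.
My guess was that Flask redirected to an http URL, and when the follow-up request reached nginx, nginx forced a redirect to https, which produced the two 302s.
So why does request.host_url still return an http URL? Let's look at the host_url source: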
@cached_property
def host_url(self):
    """Just the host with scheme as IRI.
    See also: :attr:`trusted_hosts`.
    """
    return get_current_url(self.environ, host_only=True,
                           trusted_hosts=self.trusted_hosts)
This leads to the source of get_current_url():
def get_current_url(environ, root_only=False, strip_querystring=False,
                    host_only=False, trusted_hosts=None):
    """A handy helper function that recreates the full URL as IRI for the
    current request or parts of it.  Here's an example:

    >>> from werkzeug.test import create_environ
    >>> env = create_environ("/?param=foo", "http://localhost/script")
    >>> get_current_url(env)
    'http://localhost/script/?param=foo'
    >>> get_current_url(env, root_only=True)
    'http://localhost/script/'
    >>> get_current_url(env, host_only=True)
    'http://localhost/'
    >>> get_current_url(env, strip_querystring=True)
    'http://localhost/script/'

    This optionally it verifies that the host is in a list of trusted hosts.
    If the host is not in there it will raise a
    :exc:`~werkzeug.exceptions.SecurityError`.

    Note that the string returned might contain unicode characters as the
    representation is an IRI not an URI.  If you need an ASCII only
    representation you can use the :func:`~werkzeug.urls.iri_to_uri`
    function:

    >>> from werkzeug.urls import iri_to_uri
    >>> iri_to_uri(get_current_url(env))
    'http://localhost/script/?param=foo'

    :param environ: the WSGI environment to get the current URL from.
    :param root_only: set `True` if you only want the root URL.
    :param strip_querystring: set to `True` if you don't want the querystring.
    :param host_only: set to `True` if the host URL should be returned.
    :param trusted_hosts: a list of trusted hosts, see :func:`host_is_trusted`
                          for more information.
    """
    tmp = [environ['wsgi.url_scheme'], '://', get_host(environ, trusted_hosts)]
    cat = tmp.append
    if host_only:
        return uri_to_iri(''.join(tmp) + '/')
    cat(url_quote(wsgi_get_bytes(environ.get('SCRIPT_NAME', ''))).rstrip('/'))
    cat('/')
    if not root_only:
        cat(url_quote(wsgi_get_bytes(environ.get('PATH_INFO', '')).lstrip(b'/')))
        if not strip_querystring:
            qs = get_query_string(environ)
            if qs:
                cat('?' + qs)
    return uri_to_iri(''.join(tmp))
As the source shows, the scheme is taken from environ['wsgi.url_scheme']. Logging confirmed that wsgi.url_scheme was indeed http.
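To make the dependency on wsgi.url_scheme concrete, here is a minimal sketch of the host_only branch using only a plain dict as the WSGI environ. The function name simple_host_url and the host sso.example.com are made up for illustration; the real get_host() also handles ports, X-Forwarded-Host and trusted hosts, which this sketch ignores.

```python
def simple_host_url(environ):
    """Rebuild 'scheme://host/' from a WSGI environ (simplified sketch)."""
    scheme = environ["wsgi.url_scheme"]
    # Prefer the Host header; fall back to SERVER_NAME when it is missing.
    host = environ.get("HTTP_HOST") or environ["SERVER_NAME"]
    return "{}://{}/".format(scheme, host)


# What the Flask process sees behind nginx: the proxy hop is plain http,
# even though the browser talked https to nginx.
behind_proxy = {"wsgi.url_scheme": "http", "HTTP_HOST": "sso.example.com"}
print(simple_host_url(behind_proxy))  # http://sso.example.com/
```

Whatever scheme the WSGI server records for the last hop is the scheme that ends up in the generated URL.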
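In summary: the 302 issued inside Flask points to an http URL, and once the browser follows it, nginx forces https with another 302. That is how the two 302s described above come about.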
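The chain can be reproduced with a toy model. This is only a simulation under assumptions: nginx_hop and flask_hop are hypothetical stand-ins for the real servers, and the /login path and the sso.example.com host are invented for the example.

```python
from urllib.parse import urlsplit


def flask_hop(host, path, upstream_scheme):
    """Toy app: '/' redirects to '/login', built from the scheme the app sees."""
    if path == "/":
        host_url = "{}://{}/".format(upstream_scheme, host)
        return 302, host_url + "login"
    return 200, None


def nginx_hop(url):
    """Toy edge proxy: force https, otherwise forward to the app over http."""
    parts = urlsplit(url)
    if parts.scheme == "http":
        return 302, "https://{}{}".format(parts.netloc, parts.path)
    # The upstream request is plain http regardless of the client scheme.
    return flask_hop(parts.netloc, parts.path, upstream_scheme="http")


def follow(url, limit=5):
    """Follow redirects and return every (status, location) pair seen."""
    seen = []
    for _ in range(limit):
        status, location = nginx_hop(url)
        seen.append((status, location))
        if status != 302:
            break
        url = location
    return seen


chain = follow("https://sso.example.com/")
for status, location in chain:
    print(status, location)
```

In this model the app's redirect lands on an http URL, the proxy answers that with its own 302 to https, and only the third request succeeds.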
3: Analyzing the URL redirect
Here is the nginx configuration:
location / {
    set $upstream_name "com.maoyan.op.devOps.ssosv";
    proxy_pass http://$upstream_name;
}
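The configuration shows that nginx forwards requests to Flask over http.
The browser, on the other hand, accesses the system address directly over https.
From the analysis above, a request made after logging in to the intranet system flows as follows: browser client -> nginx -> wsgi (flask)
The first hop, browser client -> nginx, is https; the second hop, nginx -> wsgi, is http. So once the Flask project sits behind the nginx proxy, wsgi.url_scheme inside Flask is still http even when the site is accessed over https, and request.host_url is therefore http as well.
So if Flask can learn which protocol was used between the client and nginx, it can decide which scheme to use. How do we get that protocol?
nginx can pass exactly this information in a request header, X-Forwarded-Proto. The official description reads: your server access logs contain the protocol used between the server and the load balancer, but not the protocol used between the client and the load balancer. To determine the protocol used between the client and the load balancer, use this request header.
With that configured in nginx (our company's nginx has it by default), print request.headers inside the Flask project: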
X-Real-Ip: 172.9.160.21
User-Agent: Mozilla/5.00.0.0 Safari/537.36
Connection: close
Sec-Fetch-Dest: document
Sec-Ch-Ua-Platform: "macOS"
Sec-Ch-Ua-Mobile: ?0
X-Forwarded-Proto: https
Sec-Fetch-Mode: navigate
Sec-Ch-Ua: " Not A;Brand";v="99", "Chromium";v="102", "Google Chrome";v="102"
Host: sso.xxx.com
Sec-Fetch-Site: none
Upgrade-Insecure-Requests: 1
Cache-Control: max-age=0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Accept-Language: zh-CN,zh;q=0.9
Sec-Fetch-User: ?1
X-Forwarded-For: 172.9.160.21
Accept-Encoding: gzip, deflate, br
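X-Forwarded-Proto: https is right there, so we have what we need.
X-Forwarded-Proto plus Host can therefore be used directly to build a correct host_url.
4: Solutions
Option 1:
As described above, build the real host_url from X-Forwarded-Proto and Host in request.headers.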
host_url = "{}://{}".format(
    # Fall back to the scheme Flask sees when the proxy header is absent.
    request.headers.get("X-Forwarded-Proto", request.scheme),
    request.headers.get("Host")
)
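A slightly more defensive variant of the same idea can be written as a plain function, which also makes it easy to test without a running Flask app. This is a sketch under assumptions: forwarded_host_url is a hypothetical helper, it takes any mapping of header names to values (a plain dict here, which is case-sensitive, unlike Flask's request.headers), and it only makes sense when a proxy you control sets or overwrites X-Forwarded-Proto.

```python
def forwarded_host_url(headers, default_scheme="http"):
    """Build 'scheme://host' from proxy headers (hypothetical helper)."""
    raw = headers.get("X-Forwarded-Proto") or default_scheme
    # With several proxies the header may hold a comma-separated list.
    # Keep the last entry, the one added by the proxy nearest to the app.
    proto = raw.split(",")[-1].strip().lower()
    if proto not in ("http", "https"):
        # Ignore anything unexpected instead of echoing it into a URL.
        proto = default_scheme
    return "{}://{}".format(proto, headers.get("Host", ""))


print(forwarded_host_url({"X-Forwarded-Proto": "https", "Host": "sso.xxx.com"}))
# https://sso.xxx.com
```

Unlike request.host_url, the result has no trailing slash, matching the format string used in option 1.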
Is there a more elegant way? Of course: a class called ProxyFix.
Option 2:
Reference: werkzeug.palletsprojects.com/en/2.1.x/mi… Note: usage may differ between older and newer versions of werkzeug.
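# Flask app
app = create_app(config_name=config_name)

# Use ProxyFix to adjust the WSGI environ behind the proxy and recover the real http or https scheme.
# Use only ONE of the two variants below; applying both would wrap the app twice.

# Older werkzeug (werkzeug.contrib, removed in werkzeug 1.0):
# from werkzeug.contrib.fixers import ProxyFix
# app.wsgi_app = ProxyFix(app=app.wsgi_app)

# Newer werkzeug (werkzeug.middleware.proxy_fix):
from werkzeug.middleware.proxy_fix import ProxyFix
app.wsgi_app = ProxyFix(app=app.wsgi_app, x_for=1, x_proto=1, x_host=0, x_port=0, x_prefix=0)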
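After configuring this, restart the service, visit the https address, and check the log: 'wsgi.url_scheme': 'https'.
A note on header trust: using this middleware when the app is not behind a proxy is a security problem, because it blindly trusts headers sent by a malicious client.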
class werkzeug.middleware.proxy_fix.ProxyFix(app, x_for=1, x_proto=1, x_host=0, x_port=0, x_prefix=0)
Adjust the WSGI environ based on X-Forwarded- headers that proxies in front of the application may set.
X-Forwarded-For sets REMOTE_ADDR.
X-Forwarded-Proto sets wsgi.url_scheme.
X-Forwarded-Host sets HTTP_HOST, SERVER_NAME, and SERVER_PORT.
X-Forwarded-Port sets HTTP_HOST and SERVER_PORT.
X-Forwarded-Prefix sets SCRIPT_NAME.
You must tell the middleware how many proxies set each header so it knows what values to trust. It is a security issue to trust values that came from the client rather than a proxy.
The original values of the headers are stored in the WSGI environ as werkzeug.proxy_fix.orig, a dict.
Parameters:
app (WSGIApplication) – The WSGI application to wrap.
x_for (int) – Number of values to trust for X-Forwarded-For.
x_proto (int) – Number of values to trust for X-Forwarded-Proto.
x_host (int) – Number of values to trust for X-Forwarded-Host.
x_port (int) – Number of values to trust for X-Forwarded-Port.
x_prefix (int) – Number of values to trust for X-Forwarded-Prefix.
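The "number of values to trust" idea can be illustrated with a small WSGI middleware written from scratch. This is a simplified sketch, not Werkzeug's implementation: the class SchemeFix, the app show_scheme and the helper call are all hypothetical names, and only X-Forwarded-Proto is handled.

```python
class SchemeFix:
    """Simplified stand-in for ProxyFix that only handles X-Forwarded-Proto."""

    def __init__(self, app, x_proto=1):
        self.app = app
        self.x_proto = x_proto  # how many proxies are trusted to set the header

    @staticmethod
    def _trusted_value(trusted, raw):
        """Return the value set by the outermost trusted proxy, or None."""
        if not (trusted and raw):
            return None
        values = [v.strip() for v in raw.split(",")]
        # Each proxy appends on the right, so count back from the end.
        if len(values) >= trusted:
            return values[-trusted]
        return None

    def __call__(self, environ, start_response):
        proto = self._trusted_value(
            self.x_proto, environ.get("HTTP_X_FORWARDED_PROTO")
        )
        if proto:
            environ["wsgi.url_scheme"] = proto
        return self.app(environ, start_response)


def show_scheme(environ, start_response):
    """Tiny WSGI app that echoes the scheme it believes it is served under."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [environ["wsgi.url_scheme"].encode()]


def call(app, forwarded_proto):
    """Invoke a WSGI app the way the http hop from the proxy would."""
    environ = {"wsgi.url_scheme": "http"}
    if forwarded_proto is not None:
        environ["HTTP_X_FORWARDED_PROTO"] = forwarded_proto
    return b"".join(app(environ, lambda status, headers: None)).decode()


print(call(show_scheme, "https"))             # http  (header ignored)
print(call(SchemeFix(show_scheme), "https"))  # https (header trusted)
```

With x_proto=0 the header is ignored entirely, which is the safe setting when no proxy sits in front of the application.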
Further reading:
dormousehole.readthedocs.io/en/latest/d…
werkzeug.palletsprojects.com/en/2.1.x/mi…