Analyzing a double 302 redirect when Flask (WSGI) runs behind an Nginx proxy


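1: The problem

When accessing a system on the company intranet, I noticed two consecutive 302 redirects. The first was a 302 Moved Temporarily over http; the second was a 302 FOUND over https. My guess was that one of them was Nginx's forced-https 302 and the other was a redirect issued inside Flask. Only a single 302 would be expected here.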

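2: Why two 302s appear

Investigation showed that the redirect logic in the Flask project uses request.host_url to build the 302 target. The logs showed that request.host_url was printed with the http scheme.

So the suspicion was that the 302 pointed at an http URL, and Nginx then forced a redirect to https, which produced the two 302s.

Why does request.host_url still return an http URL? Let's look at the source of host_url: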

@cached_property
def host_url(self):
    """Just the host with scheme as IRI.
    See also: :attr:`trusted_hosts`.
    """
    return get_current_url(self.environ, host_only=True,
                           trusted_hosts=self.trusted_hosts)

This leads to the source of the get_current_url() function:

def get_current_url(environ, root_only=False, strip_querystring=False,
                    host_only=False, trusted_hosts=None):
    """A handy helper function that recreates the full URL as IRI for the
    current request or parts of it.  Here's an example:

    >>> from werkzeug.test import create_environ
    >>> env = create_environ("/?param=foo", "http://localhost/script")
    >>> get_current_url(env)
    'http://localhost/script/?param=foo'
    >>> get_current_url(env, root_only=True)
    'http://localhost/script/'
    >>> get_current_url(env, host_only=True)
    'http://localhost/'
    >>> get_current_url(env, strip_querystring=True)
    'http://localhost/script/'

    This optionally it verifies that the host is in a list of trusted hosts.
    If the host is not in there it will raise a
    :exc:`~werkzeug.exceptions.SecurityError`.

    Note that the string returned might contain unicode characters as the
    representation is an IRI not an URI.  If you need an ASCII only
    representation you can use the :func:`~werkzeug.urls.iri_to_uri`
    function:

    >>> from werkzeug.urls import iri_to_uri
    >>> iri_to_uri(get_current_url(env))
    'http://localhost/script/?param=foo'

    :param environ: the WSGI environment to get the current URL from.
    :param root_only: set `True` if you only want the root URL.
    :param strip_querystring: set to `True` if you don't want the querystring.
    :param host_only: set to `True` if the host URL should be returned.
    :param trusted_hosts: a list of trusted hosts, see :func:`host_is_trusted`
                          for more information.
    """
    tmp = [environ['wsgi.url_scheme'], '://', get_host(environ, trusted_hosts)]
    cat = tmp.append
    if host_only:
        return uri_to_iri(''.join(tmp) + '/')
    cat(url_quote(wsgi_get_bytes(environ.get('SCRIPT_NAME', ''))).rstrip('/'))
    cat('/')
    if not root_only:
        cat(url_quote(wsgi_get_bytes(environ.get('PATH_INFO', '')).lstrip(b'/')))
        if not strip_querystring:
            qs = get_query_string(environ)
            if qs:
                cat('?' + qs)
    return uri_to_iri(''.join(tmp))

As you can see, the scheme comes from environ['wsgi.url_scheme']. Logging it confirmed that wsgi.url_scheme was indeed http.
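To make the dependency on wsgi.url_scheme concrete, here is a minimal standard-library sketch of the host_only=True branch. It is a simplified stand-in rather than Werkzeug's code: the real get_host() also handles trusted hosts and falls back to SERVER_NAME, and the host name sso.example.com is a placeholder.

```python
def host_url_from_environ(environ):
    """Simplified stand-in for the host_only=True branch shown above."""
    return "{}://{}/".format(environ["wsgi.url_scheme"], environ["HTTP_HOST"])


# What Flask sees behind the proxy: Nginx forwarded the request over http.
behind_proxy = {"wsgi.url_scheme": "http", "HTTP_HOST": "sso.example.com"}
print(host_url_from_environ(behind_proxy))

# The same request once the scheme in the environ says https.
corrected = {**behind_proxy, "wsgi.url_scheme": "https"}
print(host_url_from_environ(corrected))
```

Nothing but the environ value changes between the two calls, which is why fixing wsgi.url_scheme is enough to fix host_url.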

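In summary: the 302 issued inside Flask targets an http URL, and once the browser follows it, Nginx forces https with another 302. That is how the two 302s described above come about.

3: Analyzing the URL redirect

Take a look at the Nginx configuration: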
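The chain described above can be simulated in a few lines. This is only a toy model under stated assumptions: the /login path, the host name, and both function names are made up for illustration, and the Nginx side is reduced to "redirect anything that arrives over http".

```python
def flask_redirect(environ, path="/login"):
    """Flask side: build the Location header from the scheme Flask sees."""
    return "{}://{}{}".format(environ["wsgi.url_scheme"], environ["HTTP_HOST"], path)


def nginx_force_https(url):
    """Nginx side: answer plain-http requests with a 302 to the https URL."""
    if url.startswith("http://"):
        return "https://" + url[len("http://"):]
    return None  # already https, no extra redirect


def redirect_chain(scheme_seen_by_flask, host="sso.example.com"):
    """Return the Location of every 302 the browser has to follow."""
    environ = {"wsgi.url_scheme": scheme_seen_by_flask, "HTTP_HOST": host}
    chain = [flask_redirect(environ)]
    forced = nginx_force_https(chain[-1])
    if forced is not None:
        chain.append(forced)
    return chain


print(redirect_chain("http"))   # Flask's http target, then the forced https one
print(redirect_chain("https"))  # a single redirect
```

When Flask believes the scheme is http the browser follows two redirects; when it sees https there is only one.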

location / {
        set $upstream_name "com.maoyan.op.devOps.ssosv";
        proxy_pass http://$upstream_name;
    }

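This shows that the nginx -> flask hop is forwarded over http, while the browser reaches the system address directly over https. From the analysis above, the request flow after logging in to the intranet system is: browser client -> nginx -> wsgi (flask).

The first leg, browser client -> nginx, is https; the second leg, nginx -> wsgi, is http. So once a Flask project sits behind an Nginx proxy, wsgi.url_scheme inside Flask is still http even when the site is accessed over https, and request.host_url is therefore http as well.

So if we can get hold of the protocol used between the client and Nginx, Flask can work out which scheme is really in use. How do we get it?

Nginx can pass along a request header for exactly this purpose: X-Forwarded-Proto. The documented explanation is: your server access logs contain the protocol used between the server and the load balancer, but not the protocol used between the client and the load balancer. To determine the protocol used between the client and the load balancer, use this request header.

Once Nginx is configured to send it, typically with proxy_set_header X-Forwarded-Proto $scheme; (our company's Nginx has this by default), print request.headers inside the Flask project: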

X-Real-Ip: 172.9.160.21
User-Agent: Mozilla/5.00.0.0 Safari/537.36
Connection: close
Sec-Fetch-Dest: document
Sec-Ch-Ua-Platform: "macOS"
Sec-Ch-Ua-Mobile: ?0
X-Forwarded-Proto: https
Sec-Fetch-Mode: navigate
Sec-Ch-Ua: " Not A;Brand";v="99", "Chromium";v="102", "Google Chrome";v="102"
Host: sso.xxx.com
Sec-Fetch-Site: none
Upgrade-Insecure-Requests: 1
Cache-Control: max-age=0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Accept-Language: zh-CN,zh;q=0.9
Sec-Fetch-User: ?1
X-Forwarded-For: 172.9.160.21
Accept-Encoding: gzip, deflate, br

There it is: X-Forwarded-Proto: https.

So X-Forwarded-Proto plus Host can be used directly to build a correct host_url.

4: Solutions

Option 1:

As described above, build the real host_url from X-Forwarded-Proto and Host in request.headers.

from flask import request

# Must run inside a request context; X-Forwarded-Proto is set by Nginx.
host_url = "{}://{}".format(
    request.headers.get("X-Forwarded-Proto"),
    request.headers.get("Host")
)
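If the header is missing, for example when running locally without Nginx, the snippet above yields a URL starting with "None://". A slightly more defensive variant might look like the sketch below. The helper name build_host_url and the fallback_scheme parameter are my own additions, and the trailing slash is added to match the format of request.host_url. It takes any mapping, so it can be tried without Flask; note that a plain dict is case-sensitive, unlike Flask's request.headers.

```python
def build_host_url(headers, fallback_scheme="http"):
    """Build a host URL from proxy headers (hypothetical helper)."""
    proto = headers.get("X-Forwarded-Proto", fallback_scheme)
    # With several proxies the header may hold a comma-separated list;
    # keep the last entry, which comes from the nearest proxy.
    proto = proto.split(",")[-1].strip()
    host = headers.get("Host", "")
    return "{}://{}/".format(proto, host)


print(build_host_url({"X-Forwarded-Proto": "https", "Host": "sso.xxx.com"}))
print(build_host_url({"Host": "sso.xxx.com"}))  # header missing, falls back to http
```

Inside a view it would be called as build_host_url(request.headers).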

Is there a more elegant way? Of course! There is a class called ProxyFix.

Option 2:

Reference: werkzeug.palletsprojects.com/en/2.1.x/mi… Note: usage may differ between older and newer versions of werkzeug.

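# Older Werkzeug (before 0.15): ProxyFix lives in werkzeug.contrib.fixers
# from werkzeug.contrib.fixers import ProxyFix
# Newer Werkzeug (0.15 and later): ProxyFix lives in werkzeug.middleware.proxy_fix
from werkzeug.middleware.proxy_fix import ProxyFix

# Flask app
app = create_app(config_name=config_name)
# Use ProxyFix to adjust the WSGI environ behind the proxy so that the real
# http or https scheme is visible to Flask.
# Apply only ONE of the two calls below, matching the installed version.
# Older versions:
# app.wsgi_app = ProxyFix(app=app.wsgi_app)
# Newer versions:
app.wsgi_app = ProxyFix(app=app.wsgi_app, x_for=1, x_proto=1, x_host=0, x_port=0, x_prefix=0)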

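After this is configured, restart the service, visit the https address, and check the log: 'wsgi.url_scheme': 'https'

A note on header trust: using this middleware when the application is not behind a proxy is a security risk, because it will blindly trust headers sent by a malicious client.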
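To see why the logged scheme changes, here is a stripped-down sketch of the idea using only plain WSGI. The class SchemeFromProxy and the app echo_scheme are hypothetical names for illustration; this is not ProxyFix's implementation, which handles more headers and a configurable trust count.

```python
class SchemeFromProxy:
    """Hypothetical, stripped-down stand-in for ProxyFix's x_proto handling."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # WSGI exposes the X-Forwarded-Proto header as HTTP_X_FORWARDED_PROTO.
        forwarded = environ.get("HTTP_X_FORWARDED_PROTO")
        if forwarded:
            # Use only the last value, i.e. the one from the nearest proxy.
            environ["wsgi.url_scheme"] = forwarded.split(",")[-1].strip()
        return self.app(environ, start_response)


def echo_scheme(environ, start_response):
    """Tiny WSGI app that reports the scheme it was given."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [environ["wsgi.url_scheme"].encode("ascii")]


wrapped = SchemeFromProxy(echo_scheme)
environ = {"wsgi.url_scheme": "http", "HTTP_X_FORWARDED_PROTO": "https"}
body = wrapped(environ, lambda status, headers: None)
print(body)
```

The wrapped app sees https even though the request reached the WSGI server over http, which is the effect observed in the log.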

class werkzeug.middleware.proxy_fix.ProxyFix(app, x_for=1, x_proto=1, x_host=0, x_port=0, x_prefix=0)

Adjust the WSGI environ based on X-Forwarded- that proxies in front of the application may set.
X-Forwarded-For sets REMOTE_ADDR.
X-Forwarded-Proto sets wsgi.url_scheme.
X-Forwarded-Host sets HTTP_HOST, SERVER_NAME, and SERVER_PORT.
X-Forwarded-Port sets HTTP_HOST and SERVER_PORT.
X-Forwarded-Prefix sets SCRIPT_NAME.

You must tell the middleware how many proxies set each header so it knows what values to trust. It is a security issue to trust values that came from the client rather than a proxy.

The original values of the headers are stored in the WSGI environ as werkzeug.proxy_fix.orig, a dict.

Parameters:
app (WSGIApplication) – The WSGI application to wrap.
x_for (int) – Number of values to trust for X-Forwarded-For.
x_proto (int) – Number of values to trust for X-Forwarded-Proto.
x_host (int) – Number of values to trust for X-Forwarded-Host.
x_port (int) – Number of values to trust for X-Forwarded-Port.
x_prefix (int) – Number of values to trust for X-Forwarded-Prefix.
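The "number of values to trust" parameters are easier to follow with an example. The sketch below is my own approximation of the selection rule, assuming each proxy appends its value to the right of a comma-separated header; pick_trusted_value is a hypothetical name and the addresses are placeholders.

```python
def pick_trusted_value(trusted, header_value):
    """Return the value `trusted` positions from the right, or None."""
    if not (trusted and header_value):
        return None
    values = [part.strip() for part in header_value.split(",")]
    if len(values) >= trusted:
        return values[-trusted]
    return None  # fewer values than trusted proxies: do not trust any


# A client sent a spoofed first entry; the single real proxy appended its own.
header = "203.0.113.9, 172.9.160.21"
print(pick_trusted_value(1, header))  # value added by the one trusted proxy
print(pick_trusted_value(2, header))  # would pick the client-supplied entry
print(pick_trusted_value(3, header))  # not enough values
```

With one proxy in front of the app, a count of 1 picks the proxy's own entry. Setting the count higher than the real number of proxies would start trusting whatever the client sent, which is the security concern mentioned above.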

For more information, see:

dormousehole.readthedocs.io/en/latest/d…
werkzeug.palletsprojects.com/en/2.1.x/mi…