Notes on resolving monitoring alerts in an OAuth module

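Background

The project uses Spring Security OAuth2 for authentication. Several kinds of monitoring alerts fired while it was running in production. This post explains how they were addressed, and also records some problems caused by calling getInputStream() and getParameter() in a custom filter, along with the fixes.

Problem description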

1. ERROR: URL:/oauth/token error status:401
2. ERROR(HttpServletRequestImpl.java:841): UT005023: Exception handling request to /oauth/token
java.lang.RuntimeException: java.io.IOException: UT000128: Remote peer closed connection before all data could be read

Solutions

ERROR: URL:/oauth/token error status:401

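Cause

The request is missing the Authorization header, which causes the OAuth framework to throw an exception.

Analysis

The framework's built-in OAuth2AuthenticationProcessingFilter validates the request header. Source (key parts only):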

public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain) throws IOException, ServletException {
        ...
        try {
            Authentication authentication = this.tokenExtractor.extract(request);
            ...
        } catch (OAuth2Exception var9) {
            SecurityContextHolder.clearContext();
            ...
            return;
        }

        chain.doFilter(request, response);
    }

The extract method of BearerTokenExtractor reads the value of the header named Authorization from the HttpServletRequest:

public Authentication extract(HttpServletRequest request) {
        String tokenValue = this.extractToken(request);
        if (tokenValue != null) {
            PreAuthenticatedAuthenticationToken authentication = new PreAuthenticatedAuthenticationToken(tokenValue, "");
            return authentication;
        } else {
            return null;
        }
    }
protected String extractHeaderToken(HttpServletRequest request) {
        Enumeration headers = request.getHeaders("Authorization");

        String value;
        do {
            if (!headers.hasMoreElements()) {
                return null;
            }

            value = (String)headers.nextElement();
        } while(!value.toLowerCase().startsWith("Bearer".toLowerCase()));

        String authHeaderValue = value.substring("Bearer".length()).trim();
        request.setAttribute(OAuth2AuthenticationDetails.ACCESS_TOKEN_TYPE, value.substring(0, "Bearer".length()).trim());
        int commaIndex = authHeaderValue.indexOf(44);
        if (commaIndex > 0) {
            authHeaderValue = authHeaderValue.substring(0, commaIndex);
        }

        return authHeaderValue;
    }
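
To make the header handling easier to follow, here is a minimal stdlib-only sketch of the same idea. It is not the framework's code: the class name BearerHeaderSketch is hypothetical, and a plain list of strings stands in for the values returned by request.getHeaders("Authorization").

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class BearerHeaderSketch {

    private static final String BEARER = "Bearer";

    // Returns the token from the first value that starts with "Bearer"
    // (case-insensitive), or null when no value matches.
    static String extractHeaderToken(List<String> authorizationValues) {
        for (String value : authorizationValues) {
            if (value.toLowerCase(Locale.ROOT).startsWith(BEARER.toLowerCase(Locale.ROOT))) {
                String token = value.substring(BEARER.length()).trim();
                // Anything after a comma is dropped, as in the source shown above.
                int commaIndex = token.indexOf(',');
                if (commaIndex > 0) {
                    token = token.substring(0, commaIndex);
                }
                return token;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(extractHeaderToken(Arrays.asList("Bearer abc123")));      // abc123
        System.out.println(extractHeaderToken(Arrays.asList("Basic dXNlcjpwdw=="))); // null
        System.out.println(extractHeaderToken(Collections.<String>emptyList()));     // null
    }
}
```

A missing header and a header with a different scheme both yield null here, which is the case the custom filter in the next section is meant to intercept early.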

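Solution

Since the check is built into the OAuth framework, we can define a custom filter that runs before OAuth2AuthenticationProcessingFilter (pay close attention to the filter execution order), inspects the request header, and returns a business error code.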

@Slf4j
@Component
@AllArgsConstructor
public class AuthHeaderFilter extends OncePerRequestFilter {

    private static final String METHOD_PARAM = "_method";

    private ObjectMapper objectMapper;

    @Override
    protected void doFilterInternal(HttpServletRequest httpServletRequest, HttpServletResponse httpServletResponse, FilterChain filterChain) throws ServletException, IOException {
        // Read the Authorization request header
        String header = httpServletRequest.getHeader("Authorization");
        if (header == null || !header.startsWith(TokenUtil.HEADER_PREFIX)) {
            buildTipsResponse(httpServletResponse, "No client information in the request header");
            return;
        }
        ...
        filterChain.doFilter(httpServletRequest, httpServletResponse);
    }

}

UT000128: Remote peer closed connection before all data could be read

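The cause of this problem in production has not been reproduced so far. Stepping through the code in a debugger, however, showed that the read method of the FixedLengthStreamSourceConduit class calls exitRead: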

private void exitRead(long consumed) throws IOException {
        long oldVal = this.state;
        if (consumed == -1L) {
            if (Bits.anyAreSet(oldVal, MASK_COUNT)) {
                this.invokeFinishListener();
                this.state &= ~MASK_COUNT;
                throw UndertowMessages.MESSAGES.couldNotReadContentLengthData();
            }
        } else {
            long newVal = oldVal - consumed;
            this.state = newVal;
        }
    }
@Message(
        id = 128,
        value = "Remote peer closed connection before all data could be read"
    )
    IOException couldNotReadContentLengthData();
    
public final IOException couldNotReadContentLengthData() {
        IOException result = new IOException(String.format(this.getLoggingLocale(), this.couldNotReadContentLengthData$str()));
        _copyStackTraceMinusOne(result);
        return result;
    }
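
The condition checked in exitRead can be illustrated with a small stdlib-only model. This is a sketch, not Undertow code: FixedLengthReadSketch and FixedLengthStream are hypothetical names, and a ByteArrayInputStream stands in for the network connection. It assumes the exception corresponds to the source ending while bytes of the declared length are still outstanding.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class FixedLengthReadSketch {

    // Toy stand-in for a fixed-length body reader: it expects exactly
    // `remaining` bytes, similar to the remaining-count check in exitRead.
    static final class FixedLengthStream extends InputStream {
        private final InputStream source;
        private long remaining;

        FixedLengthStream(InputStream source, long contentLength) {
            this.source = source;
            this.remaining = contentLength;
        }

        @Override
        public int read() throws IOException {
            if (remaining == 0) {
                return -1; // declared length fully consumed: normal end of body
            }
            int b = source.read();
            if (b == -1) {
                // The source ended while bytes were still expected.
                remaining = 0;
                throw new IOException("Remote peer closed connection before all data could be read");
            }
            remaining--;
            return b;
        }
    }

    // Drains the body and reports whether it ended normally.
    static boolean readFully(byte[] sent, long declaredLength) {
        try (InputStream in = new FixedLengthStream(new ByteArrayInputStream(sent), declaredLength)) {
            while (in.read() != -1) {
                // discard the byte; only completion matters here
            }
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] body = "grant_type=refresh_token".getBytes(StandardCharsets.UTF_8);
        System.out.println(readFully(body, body.length));      // true: body matches declared length
        System.out.println(readFully(body, body.length + 10)); // false: fewer bytes than declared
    }
}
```

In this model the failure appears only when the declared length is larger than what actually arrives, for example when a client disconnects partway through sending the body.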

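In other words, the exception is thrown when consumed is -1 while the remaining-byte count held in state is still non-zero, that is, the stream ended before the declared Content-Length was fully read. Further searching online still did not help me reproduce it, so I took a somewhat opportunistic shortcut: call read() in the custom filter and catch the exception with try/catch: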

@Override
    protected void doFilterInternal(HttpServletRequest httpServletRequest, HttpServletResponse httpServletResponse, FilterChain filterChain) throws ServletException, IOException {
        ...
        // Probe the request body (form-data length check) to handle some of the production alerts
        ServletInputStream dst = httpServletRequest.getInputStream();
        if (dst != null) {
            int res = 0;
            // On a failed read, log the request headers and reject the request
            try {
                res = dst.read();
            } catch (Exception e) {
                printHeaderLog(httpServletRequest);
                log.warn("param error! read()={}, message={}", res, e.getMessage());
                buildTipsResponse(httpServletResponse, "Invalid request parameters");
                return;
            }
        }
        filterChain.doFilter(httpServletRequest, httpServletResponse);
    }

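Result

With the changes above, both production problems were resolved.

However, this introduced another production problem. When the App called the /oauth/token endpoint to refresh a token with the parameters in the request body, the call failed with a Missing grant type error, because the parameters seen by the auth endpoint were null. Why?

Analysis

It turned out that my custom filter called httpServletRequest.getInputStream() (in the HttpServletRequestImpl class):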

public ServletInputStream getInputStream() throws IOException {
        if (this.reader != null) {
            throw UndertowServletMessages.MESSAGES.getReaderAlreadyCalled();
        } else {
            if (this.servletInputStream == null) {
                this.servletInputStream = new ServletInputStreamImpl(this);
            }

            this.readStarted = true;
            return this.servletInputStream;
        }
    }

Calling getInputStream sets readStarted=true. HiddenHttpMethodFilter calls getParameter, which leads to parseFormData, and that method checks whether reading has already started:

private FormData parseFormData() {
        if (this.formParsingException != null) {
            throw this.formParsingException;
        } else if (this.parsedFormData == null) {
            if (this.readStarted) {
                return null;
            } else {
                ManagedServlet originalServlet = ((ServletRequestContext)this.exchange.getAttachment(ServletRequestContext.ATTACHMENT_KEY)).getCurrentServlet().getManagedServlet();
                FormDataParser parser = originalServlet.getFormParserFactory().createParser(this.exchange);
                if (parser == null) {
                    return null;
                } else {
                    this.readStarted = true;

                    try {
                        return this.parsedFormData = parser.parseBlocking();
                    } catch (FileTooLargeException | RequestTooBigException var4) {
                        throw this.formParsingException = new IllegalStateException(var4);
                    } catch (RuntimeException var5) {
                        throw this.formParsingException = var5;
                    } catch (IOException var6) {
                        throw this.formParsingException = new RuntimeException(var6);
                    }
                }
            }
        } else {
            return this.parsedFormData;
        }
    }

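Cause

At first I was not sure whether this was a bug in HttpServletRequest, but it follows from how the request body works: the input stream of a request can be read only once, and once it has been read the data is gone. Whether it is a filter, a third-party component, or a controller, anything that needs the parameters carried in the request body can only get them from that stream. So if something has already read the stream before the controller runs, the controller can no longer read the parameters.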
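
The one-shot nature of a body stream can be seen with plain java.io, without any servlet container. The sketch below is only an illustration: OneShotBodySketch and drain are hypothetical names, and a ByteArrayInputStream stands in for the request body.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class OneShotBodySketch {

    // Reads everything that is left in the stream and returns it as UTF-8 text.
    static String drain(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        InputStream body = new ByteArrayInputStream(
                "grant_type=refresh_token".getBytes(StandardCharsets.UTF_8));
        String first = drain(body);  // e.g. an early filter reads the body
        String second = drain(body); // a later consumer finds nothing left
        System.out.println("first=" + first);                     // first=grant_type=refresh_token
        System.out.println("second.isEmpty=" + second.isEmpty()); // second.isEmpty=true
    }
}
```

The second reader gets an empty result, which mirrors what the controller sees once an earlier component has consumed the body.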

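Solution

1. Control the filter order. This is hard to do, because the order of Spring's built-in filters is not obvious.
2. Call getParameter() in the custom filter so that the parameters are proactively parsed into FormData.

I went with option 2: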

@Override
    protected void doFilterInternal(HttpServletRequest httpServletRequest, HttpServletResponse httpServletResponse, FilterChain filterChain) throws ServletException, IOException {
        ...
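        // Populate the form data first, so that the later getInputStream() call (which sets readStarted=true) does not break parameter parsing
        String paramValue = httpServletRequest.getParameter("_method");
        // Probe the request body (form-data length check) to handle some of the production alerts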
        ServletInputStream dst = httpServletRequest.getInputStream();
        ...
        filterChain.doFilter(httpServletRequest, httpServletResponse);
    }
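
Why the call order matters can be shown with a toy model of the readStarted flag. This is a simplified sketch under stated assumptions, not Undertow's implementation: ReadStartedSketch and ToyRequest are hypothetical names, the body is a plain string, and only the interplay between readStarted and the cached form data is modeled.

```java
import java.util.HashMap;
import java.util.Map;

public class ReadStartedSketch {

    // Toy request that mimics only the readStarted / parsedFormData interplay.
    static final class ToyRequest {
        private final String rawBody;
        private boolean readStarted;
        private Map<String, String> parsedFormData;

        ToyRequest(String rawBody) {
            this.rawBody = rawBody;
        }

        // Like getInputStream(): hands out the body and marks reading as started.
        String getInputStream() {
            readStarted = true;
            return rawBody;
        }

        // Like getParameter(): parses the form once, unless reading already started.
        String getParameter(String name) {
            if (parsedFormData == null) {
                if (readStarted) {
                    return null; // body already claimed, so the form is never parsed
                }
                readStarted = true;
                parsedFormData = parse(rawBody);
            }
            return parsedFormData.get(name);
        }

        private static Map<String, String> parse(String body) {
            Map<String, String> params = new HashMap<>();
            for (String pair : body.split("&")) {
                String[] kv = pair.split("=", 2);
                if (kv.length == 2) {
                    params.put(kv[0], kv[1]);
                }
            }
            return params;
        }
    }

    public static void main(String[] args) {
        String body = "grant_type=refresh_token&refresh_token=xyz";

        ToyRequest streamFirst = new ToyRequest(body);
        streamFirst.getInputStream();
        System.out.println(streamFirst.getParameter("grant_type")); // null

        ToyRequest paramFirst = new ToyRequest(body);
        paramFirst.getParameter("_method"); // forces parsing before the stream is touched
        paramFirst.getInputStream();
        System.out.println(paramFirst.getParameter("grant_type")); // refresh_token
    }
}
```

Asking for any parameter first, even one that is absent such as _method, caches the parsed form, so later getParameter calls keep working after getInputStream has been called.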

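Final thoughts

This problem was only discovered after release, when users reported it. The incident lasted from 14:30 to 16:20; I rolled the code back first and then released again. Although the problem was eventually solved, I had not thought things through in enough detail. Had the custom filter not been given the highest priority, this would not have happened. It also came down to not understanding Spring filters deeply enough, which I need to work on. For any technical optimization, you should understand the underlying mechanism to avoid unforeseen problems.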