Tomcat访问(access)日志配置

3,423 阅读7分钟

原文地址 Tomcat访问(access)日志配置

在 tomcat 的 access 中打印出请求的情况可以帮助我们分析问题,通常比较关注的有访问 IP、线程号、访问 url、返回状态码、访问时间、持续时间。

最近在跟一个图片请求超时的问题,需要在项目的 access log 中增加请求的响应时间参数。正好学习一下 tomcat access log 中参数的配置。

最好的还是官方文档,以下摘自 tomcat7 文档。

Introduction

The Access Log Valve creates log files in the same format as those created by standard web servers. 
These logs can later be analyzed by standard log analysis tools to track page hit counts, user session activity, and so on. 
This Valve uses self-contained logic to write its log files, which can be automatically rolled over at midnight each day. 
(The essential requirement for access logging is to handle a large continuous stream of data with low overhead. 
This Valve does not use Apache Commons Logging, thus avoiding additional overhead and potentially complex configuration).

This Valve may be associated with any Catalina container (Context, Host, or Engine), and will record ALL requests processed by that container.

Some requests may be handled by Tomcat before they are passed to a container. These include redirects from /foo to /foo/ and the rejection of invalid requests. 
Where Tomcat can identify the Context that would have handled the request, the request/response will be logged in the AccessLog(s) associated Context, Host and Engine.
 Where Tomcat cannot identify the Context that would have handled the request, e.g. in cases where the URL is invalid, Tomcat will look first in the Engine, 
then the default Host for the Engine and finally the ROOT (or default) Context for the default Host for an AccessLog implementation. 
Tomcat will use the first AccessLog implementation found to log those requests that are rejected before they are passed to a container.

The output file will be placed in the directory given by the directory attribute. The name of the file is composed by concatenation of the configured prefix, timestamp and suffix. 
The format of the timestamp in the file name can be set using the fileDateFormat attribute. This timestamp will be omitted if the file rotation is switched off by setting rotatable to false.

Warning: If multiple AccessLogValve instances are used, they should be configured to use different output files.

If sendfile is used, the response bytes will be written asynchronously in a separate thread and the access log valve will not know how many bytes were actually written. 
In this case, the number of bytes that was passed to the sendfile thread for writing will be recorded in the access log valve.

Attributes

The Access Log Valve supports the following configuration attributes: 

AttributeDescription
buffered

Flag to determine if logging will be buffered. If set to false, then access logging will be written after each request. Default value: true

className

Java class name of the implementation to use. This MUST be set to org.apache.catalina.valves.AccessLogValve to use the default access log valve.

condition

The same as conditionUnless. This attribute is provided for backwards compatibility.

conditionIf

Turns on conditional logging. If set, requests will be logged only if ServletRequest.getAttribute() is not null. For example, if this value is set toimportant, then a particular request will only be logged if ServletRequest.getAttribute("important") != null. The use of Filters is an easy way to set/unset the attribute in the ServletRequest on many different requests.

conditionUnless

Turns on conditional logging. If set, requests will be logged only if ServletRequest.getAttribute() is null. For example, if this value is set to junk, then a particular request will only be logged if ServletRequest.getAttribute("junk") == null. The use of Filters is an easy way to set/unset the attribute in the ServletRequest on many different requests.

directory

Absolute or relative pathname of a directory in which log files created by this valve will be placed. If a relative path is specified, it is interpreted as relative to CATALINA_BASE. If no directory attribute is specified, the default value is "logs" (relative to CATALINA_BASE).

encoding

Character set used to write the log file. An empty string means to use the system default character set. Default value: use the system default character set.

fileDateFormat

Allows a customized timestamp in the access log file name. The file is rotated whenever the formatted timestamp changes. The default value is yyyy-MM-dd. If you wish to rotate every hour, then set this value to yyyy-MM-dd.HH. The date format will always be localized using the locale en_US.

locale

The locale used to format timestamps in the access log lines. Any timestamps configured using an explicit SimpleDateFormat pattern (%{xxx}t) are formatted in this locale. By default the default locale of the Java process is used. Switching the locale after the AccessLogValve is initialized is not supported. Any timestamps using the common log format (CLF) are always formatted in the locale en_US.

maxDays

The maximum number of days rotated access logs will be retained for before being deleted. If not specified, the default value of -1 will be used which means never delete old files.

pattern

A formatting layout identifying the various information fields from the request and response to be logged, or the word common or combinedto select a standard format. See below for more information on configuring this attribute.

prefix

The prefix added to the start of each log file's name. If not specified, the default value is"access_log".

renameOnRotate

By default for a rotatable log the active access log file name will contain the current timestamp in fileDateFormat. During rotation the file is closed and a new file with the next timestamp in the name is created and used. When setting renameOnRotate to true, the timestamp is no longer part of the active log file name. Only during rotation the file is closed and then renamed to include the timestamp. This is similar to the behavior of most log frameworks when doing time based rotation. Default value: false

requestAttributesEnabled

Set to true to check for the existence of request attributes (typically set by the RemoteIpValve and similar) that should be used to override the values returned by the request for remote address, remote host, server port and protocol. If the attributes are not set, or this attribute is set to false then the values from the request will be used. If not set, the default value of false will be used.

resolveHosts

This attribute is no longer supported. Use the connector attribute enableLookups instead.

If you have enableLookups on the connector set to true and want to ignore it, use %a instead of %h in the value of pattern.

rotatable

Flag to determine if log rotation should occur. If set to false, then this file is never rotated and fileDateFormat is ignored. Default value: true

suffix

The suffix added to the end of each log file's name. If not specified, the default value is"" (a zero-length string), meaning that no suffix will be added.

Values for the pattern attribute are made up of literal text strings, combined with pattern identifiers prefixed by the "%" character to cause replacement by the corresponding variable value from the current request and response. The following pattern codes are supported:

%a - Remote IP address
%A - Local IP address
%b - Bytes sent, excluding HTTP headers, or '-' if zero
%B - Bytes sent, excluding HTTP headers
%h - Remote host name (or IP address if enableLookups for the connector is false)
%H - Request protocol
%l - Remote logical username from identd (always returns '-')
%m - Request method (GET, POST, etc.)
%p - Local port on which this request was received. See also %{xxx}p below.
%q - Query string (prepended with a '?' if it exists)
%r - First line of the request (method and request URI)
%s - HTTP status code of the response
%S - User session ID
%t - Date and time, in Common Log Format
%u - Remote user that was authenticated (if any), else '-'
%U - Requested URL path
%v - Local server name
%D - Time taken to process the request, in millis
%T - Time taken to process the request, in seconds
%F - Time taken to commit the response, in millis
%I - Current request thread name (can compare later with stacktraces)

There is also support to write information incoming or outgoing headers, cookies, session or request attributes and special timestamp formats. It is modeled after the Apache HTTP Server log configuration syntax. Each of them can be used multiple times with different xxx keys:

%{xxx}i write value of incoming header with name xxx
%{xxx}o write value of outgoing header with name xxx
%{xxx}c write value of cookie with name xxx
%{xxx}r write value of ServletRequest attribute with name xxx
%{xxx}s write value of HttpSession attribute with name xxx
%{xxx}p write local (server) port (xxx==local) or remote (client) port (xxx=remote)
%{xxx}t write timestamp at the end of the request formatted using the enhanced SimpleDateFormat pattern xxx

一、配置与说明

1. 配置的位置

tomcat 访问日志格式配置,在 conf/server.xml 里 Host 标签下加上

<Host 
        unpackWARs="true" autoDeploy="true">
    <Valve class 
        prefix="access." suffix=".log" 
        pattern="%a %h %l %u %t "%r" %s %b "%{Referer}i" "%{User-Agent}i" "%{QunarGlobal}c" %{X-Forwarded-For}i"
        resolveHosts="false"/>
</Host>

我们在日志文件中将看到如下文本:

211.157.146.222 211.157.146.222 - - \[14/Jul/2018:12:34:42 +0800\] "GET /web-base/item/10 HTTP/1.1" 200 251 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) 
Chrome/67.0.3396.99 Safari/537.36" "-" -

二、配置打印 POST 参数

另外 %r 参数能打印出请求的 url 和 get 参数。如果 url 指定访问方式是 post,post 的参数是打印不出来的。当需要打印 post 参数,该怎么办?

大家注意到我开篇举例 Valve 配置里的 %{postdata}r。没错,这个 combined 格式的 patterrn 可以实现。但是只在 valve 里配置这个东东还不够。因为 postdata 使我们自定义的参数名。需要在 request 中设置这个值。下面附上设置 postdata 到 request 中的代码。

import java.io.IOException;
import java.util.Enumeration;
 
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
 
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
 
public final class PostDataDumperFilter implements Filter {
 
    Logger logger = LoggerFactory.getLogger(getClass());
 
    private FilterConfig filterConfig = null;
 
    public void destroy() {
        this.filterConfig = null;
    }
 
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        if (filterConfig == null)
            return;
 
        Enumeration<String> names = request.getParameterNames();
        StringBuilder output = new StringBuilder();
        while (names.hasMoreElements()) {
            String name = (String) names.nextElement();
            output.append(name).append("=");
            String values\[\] = request.getParameterValues(name);
            for (int i = 0; i < values.length; i++) {
                if (i > 0) {
                    output.append("' ");
                }
 
                output.append(values\[i\]);
            }
            if (names.hasMoreElements())
                output.append("&");
        }
        request.setAttribute("postdata", output);
        logger.debug("postdata: " + output);
        chain.doFilter(request, response);
    }
 
    public void init(FilterConfig filterConfig) throws ServletException {
        this.filterConfig = filterConfig;
    }
}

在 web.xml 中添加配置该 filter:

<filter>
    <filter-name>post-data-dumper-filter</filter-name>
    <filter-class>com.xiaoxiliu.PostDataDumperFilter</filter-class>
</filter>
<filter-mapping>
    <filter-name>post-data-dumper-filter</filter-name>
    <url-pattern>/\*</url-pattern>
</filter-mapping>

三、查询访问最耗时的接口

这就要用到万能的 awk 了 我们的日志倒数第二列显示的访问时间。

<Valve class
               prefix="localhost\_access\_log." suffix=".txt"
               pattern="%h %l %u %t "%r" \[%{postdata}r\] %s %{Referer}i %{User-Agent}i %T %b" />

cat logs/localhost_access_log.2016-10-25.txt | awk '{print (NF1)""(NF−1)""0}'| sort -n -r| awk'{1"ˉ";print1\="";print 0}'  按照倒数第二列由大到小显示接口以及访问时间。这样我们就能找出那些借口耗时较大,然后对其进行优化,提高用户体验。