我写了一个程序:实现用户对数据库表的访问记录跟踪 | 表的访问次数统计

158 阅读4分钟

业务场景:大数据集群中对数据库表的访问记录追踪,本程序以hive数据库为例进行讲解

目的:说明本程序实现用户对数据库表的访问记录跟踪 、表的访问次数统计

开发思路1: 原本计划是通过大数据权限校验组件Ranger的对hive表的鉴权审计中获取表的访问记录,但是该方式需要修改RangerHiveAuth中的一个方法,其实是在修改源码,且需要重启ranger服务,若是生产环境该方案操作风险较高,若重启ranger服务会对生产一些实时业务产生影响,所以暂时放弃此方式。

开发思路2: 通过配置Ranger的一些配置参数,需要重启Ranger服务以及Ranger依赖的服务都要重启生效(因此也被pass掉): 在Ranger产品文档中「配置组件审计日志保存至Elasticsearch」章节有以下参数配置:

ranger.audit.source.type
ranger.audit.elasticsearch.urls
ranger.audit.elasticsearch.protocol
ranger.audit.elasticsearch.index
ranger.audit.elasticsearch.user
ranger.audit.elasticsearch.password

修改Ranger参数 重启服务生效后的情况如下图(Ranger对表的授权记录): 在这里插入图片描述

思路3:本程序

通过大数据集群自带的一个接口实现,以下就是HiveMetaStore的访问记录接口,这个接口因该在每个大数据集群中都会有的,使用此接口即可查询hive表的访问记录,且可实时追踪,你们还可将表的访问记录存在ElasticSearch实时检索查询某用户对表的访问情况,可达到按月、周、日这样统计用户对表的访问情况,查询出表的访问Top记录。 在这里插入图片描述

先看程序执行效果(视频链接请点击)

一、先执行对表的查询
二、查询数据库表访问记录
三、查询一段时间内数据库某表的访问次数

程序执行结果如下: 图片.png 业务主程序如下「查询一段时间内数据库某表的访问次数」:

package com.wubai.ranger.api;


import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.http.HttpEntity;
import org.apache.http.client.CookieStore;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.StringEntity;
import org.apache.http.impl.client.BasicCookieStore;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

import javax.xml.bind.DatatypeConverter;
import java.io.IOException;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;



public class AuditApi {
    private static final ObjectMapper OBJECT_MAPPER = new ObjectMapper();
    private final static Log LOG = LogFactory.getLog(AuditApi.class.getName());

    private static String printResponse(CloseableHttpResponse httpResponse) throws Exception {
        HttpEntity entity = httpResponse.getEntity();


        String entiryString = "";
        if (entity != null) {
            entiryString = EntityUtils.toString(entity);

        }

        EntityUtils.consume(entity);
        return entiryString;
    }

    public static void main(String[] args) throws Exception {
        String ipAddr = "xx.xx.xx.253"; // ip为集群webui地址

        String serviceName = "Ranger";

        String url;
        //port为集群webui的访问端口
        url = "https://" + ipAddr + ":port/web/monitor/service/customize/1/" + serviceName + ".do";
        HttpGet httpGet = new HttpGet(url);

        //首次请求使用HTTP Basic认证,csdn(示例用户)用户名和密码需要根据实际环境修改
        String authStr = "csdn" + ":" + "12345678";
        //根据协议要求,用户名密码需要使用base64编码
        String encoding = DatatypeConverter.printBase64Binary(authStr.getBytes("UTF-8"));
        //HTTP头中增加 HTTP Basic认证指示
        httpGet.setHeader("Authorization", "Basic " + encoding);

        //保存cookie。可以在一次HTTP Basic认证后,使用session维持会话,无需重复认证
        //20分钟内没有任何请求发送,则session会过期,需要重新登录
        CookieStore cookieStore = new BasicCookieStore();
        CloseableHttpClient httpClient = HttpClients.custom().setDefaultCookieStore(cookieStore).build();
        CloseableHttpResponse httpResponse = httpClient.execute(httpGet);


        url = "https://xx.xx.xx.:port/web/api/v2/log/search";
        // 动态生成时间(示例:结束时间=当前时间,开始时间=24小时前)
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss");
        LocalDateTime endTime = LocalDateTime.now();
        LocalDateTime startTime = endTime.minusHours(24); // 可调整时间差

        String startTimeStr = startTime.format(formatter);
        String endTimeStr = endTime.format(formatter);

        HttpPost httpPost = new HttpPost(url);

        httpPost.setHeader("Content-Type", "application/json;charset=UTF-8");
        String content = "{\"keyWord\":\": tbl\",\"sources\":[{\"sourceType\":\"CLUSTER\",\"clusterId\":1,\"clusterName\":\"\",\"services\":\"Hive:Hive:MetaStore\"}]" +
                ",\"logFileName\":\"metastore-audit\",\"minLogLevel\":\"INFO\",\"hosts\":\"\",\"startTime\":\"" + startTimeStr + "\"" +
                ",\"endTime\":\"" + endTimeStr + "\",\"timeOut\":\"120000\",\"maxLogsNum\":\"1000\"}";
        StringEntity stringEntity = new StringEntity(content);
        stringEntity.setContentEncoding("UTF-8");
        stringEntity.setContentType("application/json");
        httpPost.setEntity(stringEntity);


        httpResponse = httpClient.execute(httpPost);


        url = "https://" + ipAddr + ":port/web/api/v2/log/search/progress/" + printResponse(httpResponse).split(":")[1]
                .replace("\"", "").replace("}", "");

        HttpGet httpGet2 = new HttpGet(url);

        CloseableHttpResponse httpResponse2 = null;
        int maxAttempts = 100; // max attempts
        int attempt = 0;
        while (attempt < maxAttempts) {
            try {
                httpResponse2 = httpClient.execute(httpGet2);
                if (printResponse(httpResponse2).contains("COMPLETE")) {
                    break;
                }
                Thread.sleep(10);
            } catch (Exception e) {
                System.err.println("ERROR: " + e.getMessage());
            }
            attempt++;
        }
        httpResponse2 = httpClient.execute(httpGet2);
        List<Map> details = jsonToMaps(printResponse(httpResponse2));
        ConcurrentHashMap<String, Integer> tableCounts = new ConcurrentHashMap<>();
        String regex = "tbl=hive.([\\w.]+)(?=\\s|\\|)";
        Pattern pattern = Pattern.compile(regex);
        StringBuffer stringBuffer = new StringBuffer();
        for (Map detail : details) {
            Matcher matcher = pattern.matcher(String.valueOf(detail.get("details")));
            if (matcher.find()) {
                StringBuffer tableName = new StringBuffer(matcher.group(1));
                if (!tableName.toString().equals(stringBuffer.toString())) {
                    tableCounts.compute(tableName.toString(), (k, v) -> (v == null) ? 1 : v + 1);
                }
                stringBuffer.setLength(0);
                stringBuffer.append(tableName.toString());
            }

        }
        sortAndPrint(tableCounts);
    }

    public static List<Map> jsonToMaps(String json) throws IOException {

        Map map = OBJECT_MAPPER.readValue(json, Map.class);
        return (List) (map.get("details"));
    }

    private static void sortAndPrint(Map<String, Integer> tableCounts) {
        // 将Map=>List
        java.util.List<Map.Entry<String, Integer>> list = new java.util.ArrayList<>(tableCounts.entrySet());

        // compareTo,DESC
        list.sort((o1, o2) -> o2.getValue().compareTo(o1.getValue()));

        for (Map.Entry<String, Integer> entry : list) {
            System.out.println(entry.getKey() + " " + entry.getValue() + "次");
        }

    }
}

其他类程序:详解主页「资源」 最终输出结果如下: 在这里插入图片描述

#本程序实现用户对数据库表的访问记录跟踪 、表的访问次数统计 #真实业务开发 #大数据 #Java