Business scenario: tracking access records for database tables in a big data cluster. This article uses the Hive database as its example.
Goal: show how this program tracks user access to database tables and counts how often each table is accessed.
Approach 1: The original plan was to obtain table access records from the Hive authorization audit of Ranger, the cluster's permission-checking component. However, that requires modifying a method in RangerHiveAuth, which amounts to patching the source code, and the Ranger service must then be restarted. In a production environment this is a high-risk operation, because restarting Ranger can disrupt real-time business workloads, so this approach was abandoned.
Approach 2: Configure Ranger to ship its audit logs to Elasticsearch. This requires restarting Ranger, and every service that depends on Ranger, before the change takes effect, so it was also ruled out. The Ranger product documentation, in the section "Configuring component audit logs to be saved to Elasticsearch", lists the following parameters:
ranger.audit.source.type
ranger.audit.elasticsearch.urls
ranger.audit.elasticsearch.protocol
ranger.audit.elasticsearch.index
ranger.audit.elasticsearch.user
ranger.audit.elasticsearch.password
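For illustration only, here is a hypothetical sketch of what these properties might be set to; the actual values (hosts, index name, credentials) depend entirely on your environment and Ranger version, so consult your product documentation:

ranger.audit.source.type = elasticsearch
ranger.audit.elasticsearch.urls = es-node1.example.com:9200
ranger.audit.elasticsearch.protocol = https
ranger.audit.elasticsearch.index = ranger-audit
ranger.audit.elasticsearch.user = ranger
ranger.audit.elasticsearch.password = ******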
After modifying the Ranger parameters and restarting the services, the result is shown in the figure below (Ranger's authorization records for tables):
Approach 3: this program.
It uses an interface that ships with the big data cluster itself: the HiveMetaStore access record interface shown below, which should exist in every such cluster. Through this interface you can query Hive table access records and track them in real time. You can also store the access records in Elasticsearch for real-time retrieval of a given user's accesses, aggregate them by month, week, or day, and query the top-accessed tables; a minimal indexing sketch follows.
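As a minimal sketch of that Elasticsearch idea, the snippet below stores one access record using the official Elasticsearch low-level REST client. The host, port, index name ("hive-table-access"), and document field names are all assumptions for illustration, not part of the original program; adapt them to your environment.

import org.apache.http.HttpHost;
import org.elasticsearch.client.Request;
import org.elasticsearch.client.RestClient;

public class AccessRecordIndexer {
    public static void main(String[] args) throws Exception {
        // Assumed ES endpoint; replace with your cluster's address.
        try (RestClient client = RestClient.builder(new HttpHost("localhost", 9200, "http")).build()) {
            // Index one access record; the index name and field names are hypothetical.
            Request request = new Request("POST", "/hive-table-access/_doc");
            request.setJsonEntity("{\"user\":\"csdn\",\"table\":\"default.tbl_demo\","
                    + "\"accessTime\":\"2024-01-01T12:00:00\"}");
            client.performRequest(request);
        }
    }
}

Once records are indexed this way, Elasticsearch date-histogram and terms aggregations yield the per-day/week/month counts and the top-accessed tables directly.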
First, see the program in action (click the video link).
1. Run a query against a table (for example via beeline, as sketched below)
2. Query the table access records for the database
3. Query the access count of a given table over a period of time
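To generate an access record for step 1, any statement that reaches HiveMetaStore will do. A hypothetical beeline example (the JDBC URL and table name are placeholders for your environment):

beeline -u "jdbc:hive2://xx.xx.xx.xx:10000/default" -e "select count(*) from default.tbl_demo;"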
The program's execution results are shown below:
The main business program ("query the access count of a table over a period of time") is as follows:
package com.wubai.ranger.api;

import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.http.HttpEntity;
import org.apache.http.client.CookieStore;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.StringEntity;
import org.apache.http.impl.client.BasicCookieStore;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

import javax.xml.bind.DatatypeConverter;
import java.io.IOException;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AuditApi {

    private static final ObjectMapper OBJECT_MAPPER = new ObjectMapper();
    private static final Log LOG = LogFactory.getLog(AuditApi.class.getName());

    // Reads the response body as a string and releases the underlying resources.
    private static String printResponse(CloseableHttpResponse httpResponse) throws Exception {
        HttpEntity entity = httpResponse.getEntity();
        String entityString = "";
        if (entity != null) {
            entityString = EntityUtils.toString(entity);
        }
        EntityUtils.consume(entity);
        return entityString;
    }

    public static void main(String[] args) throws Exception {
        String ipAddr = "xx.xx.xx.253"; // IP address of the cluster web UI
        String serviceName = "Ranger";
        String url;
        // "port" is the access port of the cluster web UI
        url = "https://" + ipAddr + ":port/web/monitor/service/customize/1/" + serviceName + ".do";
        HttpGet httpGet = new HttpGet(url);
        // The first request uses HTTP Basic authentication; replace the "csdn"
        // sample user name and password with values for your environment.
        String authStr = "csdn" + ":" + "12345678";
        // The protocol requires the user name and password to be Base64-encoded.
        String encoding = DatatypeConverter.printBase64Binary(authStr.getBytes("UTF-8"));
        // Add the HTTP Basic authentication header.
        httpGet.setHeader("Authorization", "Basic " + encoding);
        // Keep cookies so that, after one HTTP Basic authentication, the session is
        // reused and no re-authentication is needed. The session expires after
        // 20 minutes without any request, after which you must log in again.
        CookieStore cookieStore = new BasicCookieStore();
        CloseableHttpClient httpClient = HttpClients.custom().setDefaultCookieStore(cookieStore).build();
        CloseableHttpResponse httpResponse = httpClient.execute(httpGet);

        url = "https://" + ipAddr + ":port/web/api/v2/log/search";
        // Build the time window dynamically (example: end = now, start = 24 hours earlier).
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss");
        LocalDateTime endTime = LocalDateTime.now();
        LocalDateTime startTime = endTime.minusHours(24); // adjust the window as needed
        String startTimeStr = startTime.format(formatter);
        String endTimeStr = endTime.format(formatter);

        HttpPost httpPost = new HttpPost(url);
        httpPost.setHeader("Content-Type", "application/json;charset=UTF-8");
        // Search the HiveMetaStore audit log ("metastore-audit") for lines containing ": tbl".
        String content = "{\"keyWord\":\": tbl\",\"sources\":[{\"sourceType\":\"CLUSTER\",\"clusterId\":1,\"clusterName\":\"\",\"services\":\"Hive:Hive:MetaStore\"}]" +
                ",\"logFileName\":\"metastore-audit\",\"minLogLevel\":\"INFO\",\"hosts\":\"\",\"startTime\":\"" + startTimeStr + "\"" +
                ",\"endTime\":\"" + endTimeStr + "\",\"timeOut\":\"120000\",\"maxLogsNum\":\"1000\"}";
        StringEntity stringEntity = new StringEntity(content);
        stringEntity.setContentEncoding("UTF-8");
        stringEntity.setContentType("application/json");
        httpPost.setEntity(stringEntity);
        httpResponse = httpClient.execute(httpPost);

        // The search endpoint returns the id of an asynchronous search task;
        // extract it and poll the progress endpoint until the task completes.
        url = "https://" + ipAddr + ":port/web/api/v2/log/search/progress/" + printResponse(httpResponse).split(":")[1]
                .replace("\"", "").replace("}", "");
        HttpGet httpGet2 = new HttpGet(url);
        CloseableHttpResponse httpResponse2 = null;
        int maxAttempts = 100; // maximum number of polling attempts
        int attempt = 0;
        while (attempt < maxAttempts) {
            try {
                httpResponse2 = httpClient.execute(httpGet2);
                if (printResponse(httpResponse2).contains("COMPLETE")) {
                    break;
                }
                Thread.sleep(10);
            } catch (Exception e) {
                System.err.println("ERROR: " + e.getMessage());
            }
            attempt++;
        }
        // Fetch the completed result once more and parse the log entries.
        httpResponse2 = httpClient.execute(httpGet2);
        List<Map> details = jsonToMaps(printResponse(httpResponse2));

        // Count accesses per table. Consecutive duplicate entries for the same
        // table are treated as a single access, so they are only counted once.
        ConcurrentHashMap<String, Integer> tableCounts = new ConcurrentHashMap<>();
        String regex = "tbl=hive.([\\w.]+)(?=\\s|\\|)";
        Pattern pattern = Pattern.compile(regex);
        String lastTable = "";
        for (Map detail : details) {
            Matcher matcher = pattern.matcher(String.valueOf(detail.get("details")));
            if (matcher.find()) {
                String tableName = matcher.group(1);
                if (!tableName.equals(lastTable)) {
                    tableCounts.compute(tableName, (k, v) -> (v == null) ? 1 : v + 1);
                }
                lastTable = tableName;
            }
        }
        sortAndPrint(tableCounts);
    }

    // Extracts the "details" array from the JSON returned by the progress endpoint.
    public static List<Map> jsonToMaps(String json) throws IOException {
        Map map = OBJECT_MAPPER.readValue(json, Map.class);
        return (List) (map.get("details"));
    }

    // Sorts the tables by access count in descending order and prints them.
    private static void sortAndPrint(Map<String, Integer> tableCounts) {
        List<Map.Entry<String, Integer>> list = new ArrayList<>(tableCounts.entrySet());
        list.sort((o1, o2) -> o2.getValue().compareTo(o1.getValue()));
        for (Map.Entry<String, Integer> entry : list) {
            System.out.println(entry.getKey() + " " + entry.getValue() + " time(s)");
        }
    }
}
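For reference, the parsing above makes assumptions about the response shapes; these are inferred from the code itself, not from official API documentation. The search endpoint is assumed to return a small JSON object whose field value is the asynchronous task id, for example {"id":123}. The progress endpoint is assumed to eventually return a body containing the string "COMPLETE" together with a top-level "details" array, each element of which carries a "details" field holding one raw audit-log line matching tbl=hive.<db.table>, for example "... tbl=hive.default.tbl_demo ..." (tbl_demo being a hypothetical table name).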
The other supporting classes are covered in detail under "Resources" on my homepage.
The final output is as follows:
#tracking user access records for database tables and counting table accesses #real-world business development #big data #Java