一、问题现象
2021-04-07 18:42 发现
总体的qps没有任何变化的情况下,包括mysql的qps没有任何增加,接口总体的p99相较于redis版本升级前增加了30~50ms(与预期不符,预期是redis版本切换之后P99无影响)
二、排查过程
1.排查GC
young gc也有相应变化,p99在升高时期,young gc的次数变多,由点图换成了线图。
但同时对比了早上开盘期间9:30~10:30之间的,也是出现的有yong gc频繁,但是P99没有像redis版本升级之后延迟增加那么严重(故yong gc不是本次P99升高的根本原因)
2.排查redis的P99,对比2.8来说P99表现更好,而且REDIS4.log下也没有client的slowlog或者其他异常log(故redis4server与本次P99升高的原因无关)
3. 使用arthas分析实际的性能如下(注:这一步是在RC环境操作,线上非必要禁止使用)
先看异常接口:batchquery,只调用评论的情况下,如图。
详细分析每一个query 的结果如下:发现一次redis get操作耗时 > 3ms。结果预计远低于预期,一个redis get操作理论上是1ms左右,不可能3ms。
继续追踪,代码如下。
redis2.8旧版解析速度:0.014ms。
redis4新版解析速度:3.7ms ????
对应代码:
redis4 和redis2.``8``相同:``private final Gson gson = ``new GsonBuilder()`` ``.registerTypeAdapterFactory(XQDateTypeAdapter.FACTORY)`` ``.create(); private <T> T fromJson(String value, Class<T> beanClass) {`` ``try {`` ``return gson.fromJson(value, beanClass);`` ``} ``catch (JsonParseException ex) {`` ``throw new DataAccessException(ex);`` ``}``} |
|---|
由arthas定位置后,最初觉得是2.8和4反序列化的内容不一致,抽样一个key,进行查询验证,redis存储内容value一致(排除反序列化对象不一致,也就是Redis2.8与4内容不一致的原因)
通过单元测试代码验证,发现反序列化耗时一致(排除反序列化的原因)
public static final String REDIS_2_RESULT =`` ``"{\"rqid\":0,\"censored\":false,\"spamType\":-1,\"modifiedSpamType\":-1,\"blocked\":false,\"blocking\":false,\"symbol_count\":0,\"donate_count\":0,\"donate_snowcoin\":0,\"donate_user_count\":0,\"rewardCount\":0,\"rewardAmount\":0,\"rewardUserCount\":0,\"talkCount\":0,\"likeCount\":0,\"liked\":false,\"rawPic\":\"\",\"rawCoverPic\":\"\",\"view_count\":0,\"weixin_retweet_count\":0,\"mark\":0,\"card\":{\"statusId\":176548090,\"typ\":71,\"param\":\"CSI1014\",\"data\":\"{\\\"symbol\\\":\\\"CSI1014\\\",\\\"name\\\":\\\"我要稳稳的幸福\\\",\\\"type\\\":27}\",\"expiredAt\":0},\"promotionId\":0,\"favorited\":false,\"picSizes\":[],\"tags\":[],\"excellentComments\":[],\"isFilterLongText\":false,\"sourceFeed\":false,\"editable\":false,\"showText\":false,\"answerCount\":0,\"replyUserCount\":0,\"id\":176548090,\"user_id\":-1,\"source\":\"公告\",\"title\":\"我要稳稳的幸福(CSI1014)\",\"text\":\"@交银施罗德基金:$我要稳稳的幸福(CSI1014)$ 2021-04-07日涨幅-0.05%,今年来涨幅1.17%,当前净值为1.3469。查看近期定投记录/购买请点击:https://danjuanfunds.com/j/3jw92n 。\",\"created_at\":\"Apr 8, 2021 9:00:07 AM\",\"retweet_count\":0,\"reply_count\":0,\"fav_count\":0,\"truncated\":false,\"commentId\":0,\"retweet_status_id\":0,\"forward\":0,\"in_retweet_status_id\":0,\"symbol_id\":\"CSI1014\",\"truncated_by\":0,\"flags\":0,\"flagsObj\":{\"flags\":0},\"metaKeywords\":\"{\\\"stockCorrelation\\\":\\\"CSI1014\\\",\\\"stockList\\\":\\\"CSI1014_27\\\"}\",\"pic\":\"\",\"coverPic\":\"\"}"``;``public static final String REDIS_4_RESULT =`` ``"{\"blocked\":false,\"blocking\":false,\"donate_count\":0,\"donate_snowcoin\":0,\"donate_user_count\":0,\"liked\":false,\"rewardCount\":0,\"rewardAmount\":0,\"rewardUserCount\":0,\"offerAmount\":0,\"picSizes\":[],\"rootReplyToCommentId\":-1,\"commentReplyCount\":0,\"childComments\":[],\"editable\":false,\"id\":129624722,\"user_id\":2979297684,\"text\":\"你們的成本那麼高 [为什么]\",\"created_at\":\"Nov 1, 2019 3:38:58 PM\",\"in_reply_to_comment_id\":0,\"root_in_reply_to_status_id\":135039982,\"retweet_status_id\":0,\"statusId\":0,\"truncated\":false,\"pic\":\"\",\"client\":\"WiCimxpj5H\",\"truncated_by\":0,\"likeCount\":0,\"source\":\"股票微博\",\"flags\":0,\"flagsObj\":{\"flags\":0},\"rqid\":0,\"payment\":0,\"amount\":0,\"created_ip\":\"45.64.240.13\",\"reply_count\":0}"``; public static void main(String[] args) {`` ``CommentView commentView;`` ``int batch = ``10000``;`` ``long start = System.currentTimeMillis();`` ``Gson gson = ``new GsonBuilder().registerTypeAdapterFactory(XQDateTypeAdapter.FACTORY).create();`` ``System.out.println(``"gson build cost " + (System.currentTimeMillis() - start) + ``" ms."``); ``start = System.currentTimeMillis();`` ``for (``int i = ``0``; i < ``1``; i++) {`` ``commentView = gson.fromJson(REDIS_2_RESULT, CommentView.``class``);`` ``}`` ``System.out.println(`` ``String.format(``"redis-2 do once cost %d ms."``, System.currentTimeMillis() - start)); ``start = System.currentTimeMillis();`` ``for (``int i = ``0``; i < ``1``; i++) {`` ``commentView = gson.fromJson(REDIS_4_RESULT, CommentView.``class``);`` ``}`` ``System.out.println(`` ``String.format(``"redis-4 do once cost %d ms."``, System.currentTimeMillis() - start)); |
|---|
4. 进行codereview发现,其中在2.8中则是采用的map<K,V>形式的单例,而redis4的实现则是每次new一个实例
jvisualvm打开dump堆内存分析也可以看出
(代码地址同学们可以继续review这段代码,看看这个单例是否还有问题)
``ICache<CommentView> cache = getCommentViewCache(); ``//2.8`` ``ICache<CommentView> newCache = getNewCommentViewCache(); ``//4 //单例`` ``protected ICache<CommentView> getCommentViewCache() ``throws DataAccessException {`` ``return statusCacheSession.getSingletonCache(CommentView.``class``);`` ``} //每次new`` ``protected ICache<CommentView> getNewCommentViewCache() ``throws DataAccessException {`` ``return new RedisCluster4Cache<>(commentRedis);`` ``} |
|---|
再结合上面的arthas进行分析,进行了redis版本切换之后,每次从redis获取到的返回值,在redis2.8中则是使用了同一个的Gson进行反序列化,而在4中则是每一次new一个Gson反序列化(这里抛出一个问题,单例和多例的性能差别在哪? )
根据测试用例,就可以验证是由于没有使用单例造成P99升高的结论(这个见附件)
public static final String REDIS_2_RESULT =`` ``"{\"rqid\":0,\"censored\":false,\"spamType\":-1,\"modifiedSpamType\":-1,\"blocked\":false,\"blocking\":false,\"symbol_count\":0,\"donate_count\":0,\"donate_snowcoin\":0,\"donate_user_count\":0,\"rewardCount\":0,\"rewardAmount\":0,\"rewardUserCount\":0,\"talkCount\":0,\"likeCount\":0,\"liked\":false,\"rawPic\":\"\",\"rawCoverPic\":\"\",\"view_count\":0,\"weixin_retweet_count\":0,\"mark\":0,\"card\":{\"statusId\":176548090,\"typ\":71,\"param\":\"CSI1014\",\"data\":\"{\\\"symbol\\\":\\\"CSI1014\\\",\\\"name\\\":\\\"我要稳稳的幸福\\\",\\\"type\\\":27}\",\"expiredAt\":0},\"promotionId\":0,\"favorited\":false,\"picSizes\":[],\"tags\":[],\"excellentComments\":[],\"isFilterLongText\":false,\"sourceFeed\":false,\"editable\":false,\"showText\":false,\"answerCount\":0,\"replyUserCount\":0,\"id\":176548090,\"user_id\":-1,\"source\":\"公告\",\"title\":\"我要稳稳的幸福(CSI1014)\",\"text\":\"@交银施罗德基金:$我要稳稳的幸福(CSI1014)$ 2021-04-07日涨幅-0.05%,今年来涨幅1.17%,当前净值为1.3469。查看近期定投记录/购买请点击:https://danjuanfunds.com/j/3jw92n 。\",\"created_at\":\"Apr 8, 2021 9:00:07 AM\",\"retweet_count\":0,\"reply_count\":0,\"fav_count\":0,\"truncated\":false,\"commentId\":0,\"retweet_status_id\":0,\"forward\":0,\"in_retweet_status_id\":0,\"symbol_id\":\"CSI1014\",\"truncated_by\":0,\"flags\":0,\"flagsObj\":{\"flags\":0},\"metaKeywords\":\"{\\\"stockCorrelation\\\":\\\"CSI1014\\\",\\\"stockList\\\":\\\"CSI1014_27\\\"}\",\"pic\":\"\",\"coverPic\":\"\"}"``;``public static final String REDIS_4_RESULT =`` ``"{\"blocked\":false,\"blocking\":false,\"donate_count\":0,\"donate_snowcoin\":0,\"donate_user_count\":0,\"liked\":false,\"rewardCount\":0,\"rewardAmount\":0,\"rewardUserCount\":0,\"offerAmount\":0,\"picSizes\":[],\"rootReplyToCommentId\":-1,\"commentReplyCount\":0,\"childComments\":[],\"editable\":false,\"id\":129624722,\"user_id\":2979297684,\"text\":\"你們的成本那麼高 [为什么]\",\"created_at\":\"Nov 1, 2019 3:38:58 PM\",\"in_reply_to_comment_id\":0,\"root_in_reply_to_status_id\":135039982,\"retweet_status_id\":0,\"statusId\":0,\"truncated\":false,\"pic\":\"\",\"client\":\"WiCimxpj5H\",\"truncated_by\":0,\"likeCount\":0,\"source\":\"股票微博\",\"flags\":0,\"flagsObj\":{\"flags\":0},\"rqid\":0,\"payment\":0,\"amount\":0,\"created_ip\":\"45.64.240.13\",\"reply_count\":0}"``; public static void main(String[] args) {`` ``CommentView commentView;`` ``int batch = ``10000``;`` ``long start = System.currentTimeMillis();`` ``Gson gson = ``new GsonBuilder().registerTypeAdapterFactory(XQDateTypeAdapter.FACTORY).create();`` ``System.out.println(``"gson build cost " + (System.currentTimeMillis() - start) + ``" ms."``); ``start = System.currentTimeMillis();`` ``for (``int i = ``0``; i < batch; i++) {`` ``commentView = gson.fromJson(REDIS_2_RESULT, CommentView.``class``);`` ``}`` ``System.out.println(`` ``String.format(``"redis-2 finish cost %d ms."``, System.currentTimeMillis() - start)); ``start = System.currentTimeMillis();`` ``for (``int i = ``0``; i < batch; i++) {`` ``commentView = gson.fromJson(REDIS_2_RESULT, CommentView.``class``);`` ``}`` ``System.out.println(`` ``String.format(``"redis-4 finish cost %d ms."``, System.currentTimeMillis() - start)); ``start = System.currentTimeMillis();`` ``for (``int i = ``0``; i < batch; i++) {`` ``Gson gson1 = ``new GsonBuilder().registerTypeAdapterFactory(XQDateTypeAdapter.FACTORY).create();`` ``commentView = gson1.fromJson(REDIS_4_RESULT, CommentView.``class``);`` ``}`` ``System.out.println(`` ``String.format(``"redis-4 finish cost %d ms."``, System.currentTimeMillis() - start)); |
|---|
三、解决办法
将4的部分的实例获取修改为单例
优化后:
protected ICache<CommentView> getNewCommentViewCache() {`` ``return getCommentSingletonCache(CommentView.``class``);``} /**`` ``* 获取redis cache`` ``*`` ``* @param clazz`` ``* @param <T>`` ``* @return`` ``*/``@SuppressWarnings``({``"rawtypes"``, ``"unchecked"``})``private <T ``extends Serializable> ICache<T> getCommentSingletonCache(Class clazz) {`` ``try {`` ``return cacheMap.computeIfAbsent(clazz, key -> ``new RedisCluster4Cache<>(commentRedis));`` ``} ``catch (Exception e) {`` ``throw new DataAccessException(e);`` ``}``} |
|---|
四、根本原因
1.帖子评论的这个接口是循环查找的,那就意味着在for循环中一个不起眼的地方增加1ms,那么整个接口就会扩大N倍(本次就是因为没有使用单例导致无法利用Gson内部的adapter Cache)
2.单例是本次事件的根本原因吗?
是:因为对于单例对象内部的实现逻辑较为复杂的情况下,使用单例,就简化了每次的单例对象初始化流程和耗时
也不是:要是这个单例对象的内部实现逻辑简单,那么每次new和使用单例是差不多的(这个已经通过简单对象验证,见附件),只是内存的占用上会比较浪费(在高QPS下会表现的较为明显)
本次事件就属于这个单例对象内部的Gson反序列化实现较为复杂,Gson的反序列化有自己的内部逻辑对adapter进行cache,那是每次new一个那么在复杂对象反序列化时性能就会大打折扣
对于adapter在复杂对象反序列化下cache的耗时影响,可以自行修改源码其中一个类在方法的前后打印耗时,然后使用endorsed技术java start时替换成你的实现,然后验证耗时日志
-Djava.endorsed.dirs=/Users/zhangyf/Dev/Projects/redis-cluster-test-app/target/classes/com/google/gson
这样比修改源代码 重新打包方便多了
github.com/google/gson(Gson的原理这里不细展开)
/**`` ``* Returns the type adapter for {@code} type.`` ``*`` ``* @throws IllegalArgumentException if this GSON cannot serialize and`` ``* deserialize {@code type}.`` ``*/``@SuppressWarnings``(``"unchecked"``)``public <T> TypeAdapter<T> getAdapter(TypeToken<T> type) {`` ``TypeAdapter<?> cached = typeTokenCache.get(type == ``null ? NULL_KEY_SURROGATE : type);`` ``if (cached != ``null``) {`` ``return (TypeAdapter<T>) cached;`` ``} ``Map<TypeToken<?>, FutureTypeAdapter<?>> threadCalls = calls.get();`` ``boolean requiresThreadLocalCleanup = ``false``;`` ``if (threadCalls == ``null``) {`` ``threadCalls = ``new HashMap<TypeToken<?>, FutureTypeAdapter<?>>();`` ``calls.set(threadCalls);`` ``requiresThreadLocalCleanup = ``true``;`` ``} ``// the key and value type parameters always agree`` ``FutureTypeAdapter<T> ongoingCall = (FutureTypeAdapter<T>) threadCalls.get(type);`` ``if (ongoingCall != ``null``) {`` ``return ongoingCall;`` ``} ``try {`` ``FutureTypeAdapter<T> call = ``new FutureTypeAdapter<T>();`` ``threadCalls.put(type, call); ``for (TypeAdapterFactory factory : factories) {`` ``TypeAdapter<T> candidate = factory.create(``this``, type);`` ``if (candidate != ``null``) {`` ``call.setDelegate(candidate);`` ``typeTokenCache.put(type, candidate);`` ``return candidate;`` ``}`` ``}`` ``throw new IllegalArgumentException(``"GSON (" + GsonBuildConfig.VERSION + ``") cannot handle " + type);`` ``} ``finally {`` ``threadCalls.remove(type); ``if (requiresThreadLocalCleanup) {`` ``calls.remove();`` ``}`` ``}``} |
|---|