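Background
Starting in April, several of our company's Android apps suddenly began reporting a large number of TooManyRequestsException crashes in production on Huawei HarmonyOS devices. Scattered reports of the same exception on Huawei HarmonyOS devices have also appeared in the community, but none of them came with a clear solution.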
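Investigation
1. The crash stack shows that the exception is thrown inside the WorkManager library, in the code that registers a network listener.
2. We immediately reviewed every place where the project uses WorkManager and found that the splash-screen ad feature uses it to download images. That code had not been changed for a long time. After checking with the business team, we learned that splash ads for several cities had gone live shortly before the crashes started. We asked the team to temporarily take down the splash ads for some cities, and the crash rate dropped somewhat.
3. Looking further at the WorkManager usage, we confirmed that this business logic was not a recent addition and had never caused a similar problem before. We also audited every place in the app that registers a network listener, because the stack trace indicates that too many listeners were registered and a system threshold was hit. In the third-party libraries we did find code that registers network listeners, but those call sites were already wrapped in try-catch blocks.
One example is the registration logic in the Glide image loading library.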
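4. To address the problem, we first reworked the business logic that uses WorkManager, and added try-catch protection everywhere the project calls registerDefaultNetworkCallback. This mitigated the crashes for the time being, but it may affect features that depend on network callbacks, because a registration that fails is silently skipped. We are continuing to analyze and optimize to keep those features working.
We therefore analyzed the source code of registerDefaultNetworkCallback: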
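Source code analysis
To solve the problem more thoroughly, we analyzed the source of registerDefaultNetworkCallback in different Android versions. There are some differences between versions below Android 12 and Android 12 and above, but the core logic is essentially the same. The details follow.
Android 11
Source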
registerDefaultNetworkCallback is a cross-process operation. The call starts in ConnectivityManager.registerDefaultNetworkCallback, which calls sendRequestForNetwork, which in turn makes a cross-process call to the requestNetwork method of ConnectivityService.
ConnectivityManager
ConnectivityManager cm = (ConnectivityManager) this.getSystemService(Context.CONNECTIVITY_SERVICE);
cm.registerDefaultNetworkCallback(new ConnectivityManager.NetworkCallback(){
});
@RequiresPermission(android.Manifest.permission.ACCESS_NETWORK_STATE)
public void registerDefaultNetworkCallback(@NonNull NetworkCallback networkCallback,
@NonNull Handler handler) {
// This works because if the NetworkCapabilities are null,
// ConnectivityService takes them from the default request.
//
// Since the capabilities are exactly the same as the default request's
// capabilities, this request is guaranteed, at all times, to be
// satisfied by the same network, if any, that satisfies the default
// request, i.e., the system default network.
CallbackHandler cbHandler = new CallbackHandler(handler);
sendRequestForNetwork(null /* NetworkCapabilities need */, networkCallback, 0,
REQUEST, TYPE_NONE, cbHandler);
}
private NetworkRequest sendRequestForNetwork(NetworkCapabilities need, NetworkCallback callback,
int timeoutMs, int action, int legacyType, CallbackHandler handler) {
printStackTrace();
checkCallbackNotNull(callback);
Preconditions.checkArgument(action == REQUEST || need != null, "null NetworkCapabilities");
final NetworkRequest request;
final String callingPackageName = mContext.getOpPackageName();
try {
synchronized(sCallbacks) {
...
if (action == LISTEN) {
request = mService.listenForNetwork(
need, messenger, binder, callingPackageName);
} else {
request = mService.requestNetwork(
need, messenger, timeoutMs, binder, legacyType, callingPackageName);
}
if (request != null) {
sCallbacks.put(request, callback);
}
callback.networkRequest = request;
}
} catch (RemoteException e) {
throw e.rethrowFromSystemServer();
} catch (ServiceSpecificException e) {
throw convertServiceException(e);
}
return request;
}
ConnectivityService
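In requestNetwork, ConnectivityService creates a NetworkRequestInfo object. The NetworkRequestInfo constructor calls enforceRequestCountLimit to check the per-UID limit: if adding the new request would bring the number of requests registered by that UID to 100 (MAX_NETWORK_REQUESTS_PER_UID) or more, it throws a ServiceSpecificException carrying TOO_MANY_REQUESTS. ConnectivityManager.sendRequestForNetwork catches that exception, converts it with convertServiceException, and rethrows it, which is what surfaces in the app as TooManyRequestsException.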
@Override
public NetworkRequest requestNetwork(NetworkCapabilities networkCapabilities,
Messenger messenger, int timeoutMs, IBinder binder, int legacyType,
@NonNull String callingPackageName) {
...
NetworkRequest networkRequest = new NetworkRequest(networkCapabilities, legacyType,
nextNetworkRequestId(), type);
NetworkRequestInfo nri = new NetworkRequestInfo(messenger, networkRequest, binder);
if (DBG) log("requestNetwork for " + nri);
mHandler.sendMessage(mHandler.obtainMessage(EVENT_REGISTER_NETWORK_REQUEST, nri));
if (timeoutMs > 0) {
mHandler.sendMessageDelayed(mHandler.obtainMessage(EVENT_TIMEOUT_NETWORK_REQUEST,
nri), timeoutMs);
}
return networkRequest;
}
NetworkRequestInfo
NetworkRequestInfo(Messenger m, NetworkRequest r, IBinder binder) {
super();
...
enforceRequestCountLimit();
try {
mBinder.linkToDeath(this, 0);
} catch (RemoteException e) {
binderDied();
}
}
private void enforceRequestCountLimit() {
synchronized (mUidToNetworkRequestCount) {
int networkRequests = mUidToNetworkRequestCount.get(mUid, 0) + 1;
if (networkRequests >= MAX_NETWORK_REQUESTS_PER_UID) {
throw new ServiceSpecificException(
ConnectivityManager.Errors.TOO_MANY_REQUESTS);
}
mUidToNetworkRequestCount.put(mUid, networkRequests);
}
}
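The boundary of this check is easy to misread, so here is a minimal plain-Java model of it. It is a sketch, not the AOSP class: Android11RequestLimitModel and acceptedBeforeFailure are hypothetical names, and IllegalStateException stands in for ServiceSpecificException so the code runs on a desktop JVM. Because the comparison is >= on the count after adding one, a UID can hold at most 99 requests and the 100th registration is the one that fails.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified model of the Android 11 per-UID check quoted above (not the AOSP class). */
final class Android11RequestLimitModel {
    // Value stated in this article for MAX_NETWORK_REQUESTS_PER_UID.
    static final int MAX_NETWORK_REQUESTS_PER_UID = 100;

    private final Map<Integer, Integer> uidToCount = new HashMap<>();

    /** Mirrors enforceRequestCountLimit(): fails once count + 1 reaches the limit. */
    synchronized void enforceRequestCountLimit(int uid) {
        int networkRequests = uidToCount.getOrDefault(uid, 0) + 1;
        if (networkRequests >= MAX_NETWORK_REQUESTS_PER_UID) {
            // Stand-in for ServiceSpecificException(TOO_MANY_REQUESTS).
            throw new IllegalStateException("TOO_MANY_REQUESTS for uid " + uid);
        }
        uidToCount.put(uid, networkRequests);
    }

    /** Registers until the check fails and returns how many requests were accepted. */
    static int acceptedBeforeFailure(int uid) {
        Android11RequestLimitModel model = new Android11RequestLimitModel();
        int accepted = 0;
        try {
            while (true) {
                model.enforceRequestCountLimit(uid);
                accepted++;
            }
        } catch (IllegalStateException limitReached) {
            return accepted;
        }
    }
}
```

Note that the count is per UID, not per callback owner, so registrations made by the app's own code and by every bundled SDK all draw from the same budget.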
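Sequence diagram
Android 14
Source
In Android 14, the call chain inside ConnectivityManager is much the same as in Android 11. The only difference is the additional registerDefaultNetworkCallbackForUid method, and calls coming from the app layer always pass the fixed value Process.INVALID_UID as the uid.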
ConnectivityManager
ConnectivityManager cm = (ConnectivityManager) this.getSystemService(Context.CONNECTIVITY_SERVICE);
cm.registerDefaultNetworkCallback(new ConnectivityManager.NetworkCallback(){
});
@RequiresPermission(android.Manifest.permission.ACCESS_NETWORK_STATE)
public void registerDefaultNetworkCallback(@NonNull NetworkCallback networkCallback,
@NonNull Handler handler) {
registerDefaultNetworkCallbackForUid(Process.INVALID_UID, networkCallback, handler);
}
public void registerDefaultNetworkCallbackForUid(int uid,
@NonNull NetworkCallback networkCallback, @NonNull Handler handler) {
CallbackHandler cbHandler = new CallbackHandler(handler);
sendRequestForNetwork(uid, null /* need */, networkCallback, 0 /* timeoutMs */,
TRACK_DEFAULT, TYPE_NONE, cbHandler);
}
private NetworkRequest sendRequestForNetwork(int asUid, NetworkCapabilities need,
NetworkCallback callback, int timeoutMs, NetworkRequest.Type reqType, int legacyType,
CallbackHandler handler) {
....
try {
synchronized(sCallbacks) {
......
if (reqType == LISTEN) {
request = mService.listenForNetwork(
need, messenger, binder, callbackFlags, callingPackageName,
getAttributionTag());
} else {
// Crosses into ConnectivityService
request = mService.requestNetwork(
asUid, need, reqType.ordinal(), messenger, timeoutMs, binder,
legacyType, callbackFlags, callingPackageName, getAttributionTag());
}
if (request != null) {
sCallbacks.put(request, callback);
}
callback.networkRequest = request;
}
} catch (RemoteException e) {
throw e.rethrowFromSystemServer();
} catch (ServiceSpecificException e) {
// This is where the exception is thrown
throw convertServiceException(e);
}
return request;
}
ConnectivityService
In Android 14 the source location of ConnectivityService has changed; see www.aospxref.com/android-14.…
@Override
public NetworkRequest requestNetwork(int asUid, NetworkCapabilities networkCapabilities,
int reqTypeInt, Messenger messenger, int timeoutMs, final IBinder binder,
int legacyType, int callbackFlags, @NonNull String callingPackageName,
@Nullable String callingAttributionTag) {
...
final NetworkRequest networkRequest = new NetworkRequest(networkCapabilities, legacyType,
nextNetworkRequestId(), reqType);
final NetworkRequestInfo nri = getNriToRegister(
asUid, networkRequest, messenger, binder, callbackFlags,
callingAttributionTag);
if (DBG) log("requestNetwork for " + nri);
trackUidAndRegisterNetworkRequest(EVENT_REGISTER_NETWORK_REQUEST, nri);
if (timeoutMs > 0) {
mHandler.sendMessageDelayed(mHandler.obtainMessage(EVENT_TIMEOUT_NETWORK_REQUEST,
nri), timeoutMs);
}
return networkRequest;
}
private NetworkRequestInfo getNriToRegister(final int asUid, @NonNull final NetworkRequest nr,
@Nullable final Messenger msgr, @Nullable final IBinder binder,
@NetworkCallback.Flag int callbackFlags,
@Nullable String callingAttributionTag) {
....
return new NetworkRequestInfo(
asUid, requests, nr, msgr, binder, callbackFlags, callingAttributionTag);
}
NetworkRequestInfo(int asUid, @NonNull final List<NetworkRequest> r,
@NonNull final NetworkRequest requestForCallback, @Nullable final Messenger m,
@Nullable final IBinder binder,
@NetworkCallback.Flag int callbackFlags,
@Nullable String callingAttributionTag) {
super();
....
mPerUidCounter = getRequestCounter(this);
mPerUidCounter.incrementCountOrThrow(mUid);
....
}
private RequestInfoPerUidCounter getRequestCounter(NetworkRequestInfo nri) {
return hasAnyPermissionOf(mContext,
nri.mPid, nri.mUid, NetworkStack.PERMISSION_MAINLINE_NETWORK_STACK)
? mSystemNetworkRequestCounter : mNetworkRequestCounter;
}
public static class RequestInfoPerUidCounter extends PerUidCounter {
RequestInfoPerUidCounter(int maxCountPerUid) {
super(maxCountPerUid);
}
@Override
public synchronized void incrementCountOrThrow(int uid) {
try {
super.incrementCountOrThrow(uid);
} catch (IllegalStateException e) {
throw new ServiceSpecificException(
ConnectivityManager.Errors.TOO_MANY_REQUESTS,
"Uid " + uid + " exceeded its allotted requests limit");
}
}
@Override
public synchronized void decrementCountOrThrow(int uid) {
throw new UnsupportedOperationException("Use decrementCount instead.");
}
public synchronized void decrementCount(int uid) {
try {
super.decrementCountOrThrow(uid);
} catch (IllegalStateException e) {
logwtf("Exception when decrement per uid request count: ", e);
}
}
}
public synchronized void incrementCountOrThrow(final int uid) {
final long newCount = ((long) mUidToCount.get(uid, 0)) + 1;
if (newCount > mMaxCountPerUid) {
throw new IllegalStateException("Uid " + uid + " exceeded its allowed limit");
}
// Since the count cannot be greater than Integer.MAX_VALUE here since mMaxCountPerUid
// is an integer, it is safe to cast to int.
mUidToCount.put(uid, (int) newCount);
}
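To see why unpaired registrations eventually hit this counter, the following plain-Java sketch models the increment and decrement logic quoted above. It is not the AOSP class: PerUidCounterModel is a hypothetical name, the limit is passed in as a parameter rather than taken from the platform, and the sketch assumes the service calls decrementCount when a callback is unregistered, which is what the presence of that method suggests.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified model of the per-UID counter logic quoted above (not the AOSP class). */
final class PerUidCounterModel {
    private final int maxCountPerUid;
    private final Map<Integer, Integer> uidToCount = new HashMap<>();

    PerUidCounterModel(int maxCountPerUid) {
        this.maxCountPerUid = maxCountPerUid;
    }

    /** Same comparison as the quoted code: fails only when the new count exceeds the limit. */
    synchronized void incrementCountOrThrow(int uid) {
        long newCount = (long) uidToCount.getOrDefault(uid, 0) + 1;
        if (newCount > maxCountPerUid) {
            throw new IllegalStateException("Uid " + uid + " exceeded its allowed limit");
        }
        uidToCount.put(uid, (int) newCount);
    }

    /** Assumed to run when a callback is unregistered; frees one slot for the UID. */
    synchronized void decrementCount(int uid) {
        int current = uidToCount.getOrDefault(uid, 0);
        if (current > 0) {
            uidToCount.put(uid, current - 1);
        }
    }

    synchronized int count(int uid) {
        return uidToCount.getOrDefault(uid, 0);
    }
}
```

With a limit of 3, three increments succeed and the fourth throws; after one decrement a new increment succeeds again. Unlike the Android 11 check, the comparison here is a strict greater-than, so exactly the configured number of requests can be held.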
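Sequence diagram
Based on this analysis of the source, avoiding the problem requires the following measures:
- Keep registerDefaultNetworkCallback and unregisterNetworkCallback paired: every call to registerDefaultNetworkCallback must have a matching unregisterNetworkCallback call, so that the request is released and network requests do not accumulate.
- Limit the number of calls or protect them with try-catch: wherever registerDefaultNetworkCallback is called, either wrap the call in try-catch to handle the possible exception, or control the number of registrations in code so that the system limit is never exceeded.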
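Both measures can be combined in one small wrapper. The sketch below is plain Java so that it runs off-device: NetworkApi is a hypothetical stand-in for the two ConnectivityManager methods, and GuardedRegistrar is an illustrative name, not something from our project. On Android the two interface methods would simply forward to ConnectivityManager; catching RuntimeException covers TooManyRequestsException, which is a RuntimeException subclass.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for the two ConnectivityManager calls discussed in this article. */
interface NetworkApi {
    void registerDefaultNetworkCallback(Object callback); // may throw a RuntimeException
    void unregisterNetworkCallback(Object callback);
}

/** Registers at most once per callback, never lets a failure escape, and pairs every unregister. */
final class GuardedRegistrar {
    private final NetworkApi api;
    private final Set<Object> active = new HashSet<>();

    GuardedRegistrar(NetworkApi api) {
        this.api = api;
    }

    /** Returns true if the callback is registered after the call, false if the system refused. */
    synchronized boolean register(Object callback) {
        if (active.contains(callback)) {
            return true; // already registered, do not consume another slot
        }
        try {
            api.registerDefaultNetworkCallback(callback);
        } catch (RuntimeException e) {
            return false; // caller should fall back, for example to polling the network state
        }
        active.add(callback);
        return true;
    }

    /** Releases the registration only if this wrapper actually made it. */
    synchronized void unregister(Object callback) {
        if (active.remove(callback)) {
            api.unregisterNetworkCallback(callback);
        }
    }
}
```

Returning a boolean instead of swallowing the failure silently lets the caller decide how to degrade, which addresses the concern that try-catch alone can break features that rely on the callback.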
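For calls in our own code, we can strictly control and manage how they are made. For third-party SDKs, however, we have no direct control over the internal implementation. To solve the problem completely, we therefore used bytecode instrumentation to funnel every call to registerDefaultNetworkCallback and unregisterNetworkCallback through one place. This lets us monitor and manage the calls, lowers the risk of errors, and keeps the app stable.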
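Instrumentation approach
With instrumentation, we can inject code to monitor and manage calls to registerDefaultNetworkCallback and unregisterNetworkCallback. Tools such as ASM or AspectJ can be used to modify the code at the bytecode level.
Design
To reduce how often registerDefaultNetworkCallback is actually invoked on the system, we use instrumentation to route every call to it through a single manager class. This also simplifies the code structure and makes later maintenance and troubleshooting easier.
Implementation
The manager class that centralizes registerDefaultNetworkCallback and unregisterNetworkCallback: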
import android.content.Context;
import android.net.ConnectivityManager;
import android.net.LinkProperties;
import android.net.Network;
import android.net.NetworkCapabilities;
import android.os.Build;
import android.util.Log;
import androidx.annotation.NonNull;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class MyConnectivityManager {
    private static final String TAG = "MyConnectivityManager";
    // Callbacks are added and removed on arbitrary threads while the system
    // dispatches on its own thread, so the list must be thread-safe.
    public static final List<ConnectivityManager.NetworkCallback> callbacks = new CopyOnWriteArrayList<>();
    // Downgrade switch: when true, calls go straight to the system API.
    public static volatile boolean downgrading = false;

    public static void initConnectivityManager(Context context) {
        Log.d(TAG, "initConnectivityManager");
        if (Build.VERSION.SDK_INT < Build.VERSION_CODES.N) {
            return;
        }
        ConnectivityManager cm = (ConnectivityManager) context.getSystemService(Context.CONNECTIVITY_SERVICE);
        try {
            // The single system registration; every event is fanned out to the list.
            cm.registerDefaultNetworkCallback(new ConnectivityManager.NetworkCallback() {
                @Override
                public void onAvailable(@NonNull Network network) {
                    super.onAvailable(network);
                    Log.d(TAG, "dispatch onAvailable " + callbacks.size());
                    for (ConnectivityManager.NetworkCallback cb : callbacks) {
                        cb.onAvailable(network);
                    }
                }
                @Override
                public void onLosing(@NonNull Network network, int maxMsToLive) {
                    super.onLosing(network, maxMsToLive);
                    Log.d(TAG, "dispatch onLosing " + callbacks.size());
                    for (ConnectivityManager.NetworkCallback cb : callbacks) {
                        cb.onLosing(network, maxMsToLive);
                    }
                }
                @Override
                public void onLost(@NonNull Network network) {
                    super.onLost(network);
                    Log.d(TAG, "dispatch onLost " + callbacks.size());
                    for (ConnectivityManager.NetworkCallback cb : callbacks) {
                        cb.onLost(network);
                    }
                }
                @Override
                public void onUnavailable() {
                    super.onUnavailable();
                    Log.d(TAG, "dispatch onUnavailable " + callbacks.size());
                    if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.O) {
                        for (ConnectivityManager.NetworkCallback cb : callbacks) {
                            cb.onUnavailable();
                        }
                    }
                }
                @Override
                public void onCapabilitiesChanged(@NonNull Network network, @NonNull NetworkCapabilities networkCapabilities) {
                    super.onCapabilitiesChanged(network, networkCapabilities);
                    Log.d(TAG, "dispatch onCapabilitiesChanged " + callbacks.size());
                    for (ConnectivityManager.NetworkCallback cb : callbacks) {
                        cb.onCapabilitiesChanged(network, networkCapabilities);
                    }
                }
                @Override
                public void onLinkPropertiesChanged(@NonNull Network network, @NonNull LinkProperties linkProperties) {
                    super.onLinkPropertiesChanged(network, linkProperties);
                    Log.d(TAG, "dispatch onLinkPropertiesChanged " + callbacks.size());
                    for (ConnectivityManager.NetworkCallback cb : callbacks) {
                        cb.onLinkPropertiesChanged(network, linkProperties);
                    }
                }
                @Override
                public void onBlockedStatusChanged(@NonNull Network network, boolean blocked) {
                    super.onBlockedStatusChanged(network, blocked);
                    Log.d(TAG, "dispatch onBlockedStatusChanged " + callbacks.size());
                    if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.Q) {
                        for (ConnectivityManager.NetworkCallback cb : callbacks) {
                            cb.onBlockedStatusChanged(network, blocked);
                        }
                    }
                }
            });
        } catch (RuntimeException e) {
            // TooManyRequestsException is a RuntimeException; never crash on registration.
            Log.w(TAG, "registerDefaultNetworkCallback failed", e);
        }
    }

    public static void registerDefaultNetworkCallback(ConnectivityManager cm, ConnectivityManager.NetworkCallback callback) {
        Log.d(TAG, "registerDefaultNetworkCallback:" + callback);
        if (!downgrading) {
            callbacks.add(callback);
            return;
        }
        if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.N) {
            try {
                cm.registerDefaultNetworkCallback(callback);
            } catch (RuntimeException e) {
                Log.w(TAG, "registerDefaultNetworkCallback failed", e);
            }
        }
    }

    public static void unregisterNetworkCallback(ConnectivityManager cm, ConnectivityManager.NetworkCallback callback) {
        Log.d(TAG, "unregisterNetworkCallback");
        // The instrumentation redirects every unregisterNetworkCallback call, including
        // callbacks registered through other APIs (registerNetworkCallback, requestNetwork).
        // Those are not in the list and must still be released by the system.
        if (!callbacks.remove(callback)) {
            try {
                cm.unregisterNetworkCallback(callback);
            } catch (IllegalArgumentException e) {
                Log.w(TAG, "callback was not registered with the system", e);
            }
        }
    }
}
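One detail of the manager class deserves care: the callback list is read on the thread where the system delivers events and modified from whatever thread registers or unregisters. The following plain-Java sketch shows one way to make such a fan-out safe, using CopyOnWriteArrayList so that a listener may even remove itself while an event is being delivered. FanOut is a hypothetical name and the listener type is generic, so the example runs without the Android SDK.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/** Thread-safe fan-out of one upstream event to many listeners. */
final class FanOut<L> {
    // Iteration works on a snapshot, so add/remove during dispatch is safe.
    private final List<L> listeners = new CopyOnWriteArrayList<>();

    void add(L listener) {
        listeners.add(listener);
    }

    boolean remove(L listener) {
        return listeners.remove(listener);
    }

    int size() {
        return listeners.size();
    }

    /** Delivers the event to every listener present when dispatch started. */
    void dispatch(Consumer<L> event) {
        for (L listener : listeners) {
            try {
                event.accept(listener);
            } catch (RuntimeException e) {
                // One misbehaving listener must not starve the others.
                // A real implementation would log or report this.
            }
        }
    }
}
```

A caveat that this sketch does not solve: a callback registered directly with the system is normally told about the current default network right away, whereas a listener added to a fan-out list only sees events that arrive later. If any SDK depends on that initial notification, the manager would need to remember the last known state and replay it on add.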
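Core instrumentation code. The transform must skip the manager class com/example/gradledemo/MyConnectivityManager itself; otherwise its own calls to the system API would be redirected back to itself and recurse forever.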
if (methodInsnNode.owner.equals("android/net/ConnectivityManager")&&methodInsnNode.name.equals("registerDefaultNetworkCallback") && "(Landroid/net/ConnectivityManager$NetworkCallback;)V".equals(methodInsnNode.desc)){
methodInsnNode.owner = "com/example/gradledemo/MyConnectivityManager";
methodInsnNode.desc = "(Landroid/net/ConnectivityManager;Landroid/net/ConnectivityManager$NetworkCallback;)V";
methodInsnNode.name = "registerDefaultNetworkCallback";
methodInsnNode.setOpcode(INVOKESTATIC);
}
if (methodInsnNode.owner.equals("android/net/ConnectivityManager")&&methodInsnNode.name.equals("unregisterNetworkCallback") && "(Landroid/net/ConnectivityManager$NetworkCallback;)V".equals(methodInsnNode.desc)){
methodInsnNode.owner = "com/example/gradledemo/MyConnectivityManager";
methodInsnNode.desc = "(Landroid/net/ConnectivityManager;Landroid/net/ConnectivityManager$NetworkCallback;)V";
methodInsnNode.name = "unregisterNetworkCallback";
methodInsnNode.setOpcode(INVOKESTATIC);
}
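The matching rule, including the exclusion of the manager class, can be isolated into a pure function that is easy to unit test without ASM on the classpath. This is only a sketch: CallSiteFilter and shouldRedirect are hypothetical names, and visitedClass is assumed to be the internal name of the class currently being transformed. Classes nested inside the manager, such as its anonymous callback, are skipped as well.

```java
/** Decides whether a method call instruction should be redirected to the manager class. */
final class CallSiteFilter {
    static final String MANAGER = "com/example/gradledemo/MyConnectivityManager";
    static final String CONNECTIVITY_MANAGER = "android/net/ConnectivityManager";
    static final String CALLBACK_DESC = "(Landroid/net/ConnectivityManager$NetworkCallback;)V";

    private CallSiteFilter() {
    }

    static boolean shouldRedirect(String visitedClass, String owner, String name, String desc) {
        // Never rewrite the manager itself or its nested classes, or it would call itself.
        if (visitedClass.equals(MANAGER) || visitedClass.startsWith(MANAGER + "$")) {
            return false;
        }
        if (!CONNECTIVITY_MANAGER.equals(owner) || !CALLBACK_DESC.equals(desc)) {
            return false;
        }
        return "registerDefaultNetworkCallback".equals(name)
                || "unregisterNetworkCallback".equals(name);
    }
}
```

Because the descriptor is matched exactly, calls to the two-argument overload of registerDefaultNetworkCallback that also takes a Handler have a different descriptor and are left untouched by this rule.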
The end result is that cm.registerDefaultNetworkCallback is rewritten to MyConnectivityManager.registerDefaultNetworkCallback.
Before instrumentation
ConnectivityManager cm = (ConnectivityManager) this.getSystemService(Context.CONNECTIVITY_SERVICE);
if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.N) {
cm.registerDefaultNetworkCallback(new ConnectivityManager.NetworkCallback(){
@Override
public void onAvailable(@NonNull Network network) {
super.onAvailable(network);
Log.d("MyConnectivityManager","onAvailable");
}
@Override
public void onCapabilitiesChanged(@NonNull Network network, @NonNull NetworkCapabilities networkCapabilities) {
super.onCapabilitiesChanged(network, networkCapabilities);
Log.d("MyConnectivityManager","onCapabilitiesChanged");
}
@Override
public void onUnavailable() {
super.onUnavailable();
Log.d("MyConnectivityManager","onUnavailable");
}
});
}
After instrumentation
if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.N) {
MyConnectivityManager.registerDefaultNetworkCallback(cm,new ConnectivityManager.NetworkCallback(){
@Override
public void onAvailable(@NonNull Network network) {
super.onAvailable(network);
Log.d("MyConnectivityManager","onAvailable");
}
@Override
public void onCapabilitiesChanged(@NonNull Network network, @NonNull NetworkCapabilities networkCapabilities) {
super.onCapabilitiesChanged(network, networkCapabilities);
Log.d("MyConnectivityManager","onCapabilitiesChanged");
}
@Override
public void onUnavailable() {
super.onUnavailable();
Log.d("MyConnectivityManager","onUnavailable");
}
});
}
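Monitoring
Because we cannot tell whether Huawei has modified the source, for now we can only speculate that the problem is specific to Huawei devices. Possible causes include:
- Huawei modified the source: Huawei may have changed the Android source in a way that affects the normal use of registerDefaultNetworkCallback.
- A Huawei vendor push: Huawei may have changed the behavior of system services through a system update or another vendor push, causing the method to misbehave on Huawei devices.
We therefore need to monitor the use of registerDefaultNetworkCallback in production and record how many times it is called. With all call sites funneled through one place, we can find the callers that register repeatedly and optimize their logic.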
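A minimal sketch of such a counter follows, in plain Java so it can run off-device. RegistrationMonitor is a hypothetical name and not part of our project. It keys the count by the callback's class name, since that is available at both register and unregister time and usually identifies the owning SDK, and it reports the classes whose outstanding registrations reach a threshold.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Tracks outstanding registrations per callback class so that leaks can be reported. */
final class RegistrationMonitor {
    private final Map<String, AtomicInteger> counts = new ConcurrentHashMap<>();

    void onRegister(Object callback) {
        counts.computeIfAbsent(key(callback), k -> new AtomicInteger()).incrementAndGet();
    }

    void onUnregister(Object callback) {
        AtomicInteger n = counts.get(key(callback));
        if (n != null) {
            n.updateAndGet(v -> Math.max(0, v - 1)); // never go below zero
        }
    }

    int outstanding(String className) {
        AtomicInteger n = counts.get(className);
        return n == null ? 0 : n.get();
    }

    /** Class names with at least {@code threshold} registrations not yet released, sorted. */
    List<String> suspects(int threshold) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, AtomicInteger> e : counts.entrySet()) {
            if (e.getValue().get() >= threshold) {
                result.add(e.getKey());
            }
        }
        Collections.sort(result);
        return result;
    }

    private static String key(Object callback) {
        return callback.getClass().getName();
    }
}
```

In the manager class, onRegister and onUnregister would be called from the two redirected methods, and the suspects list could be uploaded periodically together with the device model to check whether the excess registrations are in fact concentrated on Huawei devices.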