Global reference limit policy

Posted by Gityuan on January 19, 2019

This post walks through the global reference problem, based on the Android 9.0 source code.

art/runtime/jni_internal.cc
art/runtime/indirect_reference_table.cc
art/runtime/java_vm_ext.cc
art/runtime/jni_env_ext.cc
art/runtime/java_vm_ext.h
art/runtime/jni_env_ext.h
art/runtime/runtime.h
libnativehelper/include/nativehelper/jni.h

1. Overview

Misusing global references leads to the global reference overflow crash. To address this, Android 9.0 introduced a new limit policy.

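First, some virtual machine basics. Every process has exactly one JavaVM, which represents the Java virtual machine at the JNI layer and is global to JNI. Every thread has its own JNIEnv, a thread-specific structure that represents the Java execution environment of that thread. Each Runtime instance is created by calling Runtime::Create, which constructs JavaVMExt, Heap, Thread, ClassLinker and so on; Runtime::Start then completes the remaining initialization. The class diagram below shows the core members and methods of JavaVM, JNIEnv and Runtime.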

[Figure: jnienv_javavm, class diagram of JavaVM, JNIEnv and Runtime]

1.1 JavaVM

[-> java_vm_ext.cc]

JavaVMExt::JavaVMExt(Runtime* runtime,
                     const RuntimeArgumentMap& runtime_options,
                     std::string* error_msg)
    : runtime_(runtime),
      force_copy_(runtime_options.Exists(RuntimeArgumentMap::JniOptsForceCopy)),
      // see section 1.3
      globals_(kGlobalsMax, kGlobal, IndirectReferenceTable::ResizableCapacity::kNo, error_msg),
      libraries_(new Libraries),
      unchecked_functions_(&gJniInvokeInterface),
      weak_globals_(kWeakGlobalsMax, kWeakGlobal,
                    IndirectReferenceTable::ResizableCapacity::kNo, error_msg),
      allow_accessing_weak_globals_(true),
      weak_globals_add_condition_("weak globals add condition",
                                  (CHECK(Locks::jni_weak_globals_lock_ != nullptr),
                                   *Locks::jni_weak_globals_lock_)),
      env_hooks_() {
      
  functions = unchecked_functions_;
  SetCheckJniEnabled(runtime_options.Exists(RuntimeArgumentMap::CheckJni));
}

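In the JavaVMExt constructor, ResizableCapacity::kNo shows that these tables cannot grow beyond their capacity. Since kGlobalsMax = 51200 and kWeakGlobalsMax = 51200, each process can hold at most 51200 global references and 51200 weak global references. They are stored in the IndirectReferenceTable members of JavaVMExt (globals_ and weak_globals_).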

kGlobal is a reference kind, defined as follows:

enum IndirectRefKind {
  kHandleScopeOrInvalid = 0,           // handle scope (stack indirect reference table) or invalid reference
  kLocal                = 1,           // local reference
  kGlobal               = 2,           // global reference
  kWeakGlobal           = 3,           // weak global reference
  kLastKind             = kWeakGlobal
};

Notes:

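  • Local reference: valid only for the duration of a single native method call and released automatically when the method returns. It can be managed explicitly with NewLocalRef/DeleteLocalRef. For example, the instance returned by the JNI function NewObject is a local reference.
  • Global reference: valid until explicitly released, and keeps the object from being garbage collected. It can be used across threads and across native method calls, and is managed explicitly with NewGlobalRef/DeleteGlobalRef.
  • Weak global reference: can also be used across threads and across native method calls, but does not prevent garbage collection. It is managed with NewWeakGlobalRef/DeleteWeakGlobalRef.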

For the construction of globals_, see section 1.3 below.

1.2 JNIEnv

[-> jni_env_ext.cc]

JNIEnvExt::JNIEnvExt(Thread* self_in, JavaVMExt* vm_in, std::string* error_msg)
    : self_(self_in),
      vm_(vm_in),
      local_ref_cookie_(kIRTFirstSegment),
      locals_(kLocalsInitial, kLocal, IndirectReferenceTable::ResizableCapacity::kYes, error_msg),
      monitors_("monitors", kMonitorsInitial, kMonitorsMax),
      critical_(0),
      check_jni_(false),
      runtime_deleted_(false) {
      
  MutexLock mu(Thread::Current(), *Locks::jni_function_table_lock_);
  check_jni_ = vm_in->IsCheckJniEnabled();
  functions = GetFunctionTable(check_jni_);
  unchecked_functions_ = GetJniNativeInterface();
}

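The JNIEnvExt constructor stores the current Thread and the JavaVMExt object in the members self_ and vm_. The local reference IndirectReferenceTable created here starts with a capacity of 512 entries (kLocalsInitial = 512) and, because of ResizableCapacity::kYes, is allowed to grow.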

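Also note that jni.h defines JNIEnv with typedef differently for the two languages. In C++ it is the struct _JNIEnv, which holds a pointer to JNINativeInterface internally. In C it is directly a pointer to JNINativeInterface. The difference depends only on which language the file is compiled as; the functionality is the same.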

#if defined(__cplusplus)
typedef _JNIEnv JNIEnv;
typedef _JavaVM JavaVM;
#else
typedef const struct JNINativeInterface* JNIEnv;
typedef const struct JNIInvokeInterface* JavaVM;
#endif

As a result, the calling style differs as well:

// C version
jsize len = (*env)->GetArrayLength(env, array);
// C++ version
jsize len = env->GetArrayLength(array);
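The relationship between the two calling styles can be illustrated with a small self-contained sketch. ToyEnv, ToyNativeInterface and ToyArray are hypothetical stand-ins, not the real jni.h declarations, and only one function slot is modelled. The point is that the C++ member function simply forwards to the function table and passes the env pointer itself.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, trimmed-down stand-ins for the jni.h types (assumption:
// names and layout are for illustration only).
struct ToyEnv;
struct ToyArray {
  std::size_t length;
};

struct ToyNativeInterface {
  // In the C layout every function receives the env pointer explicitly.
  std::size_t (*GetArrayLength)(ToyEnv* env, const ToyArray* array);
};

struct ToyEnv {
  const ToyNativeInterface* functions;

  // C++-style convenience member: forwards to the table and passes `this`.
  std::size_t GetArrayLength(const ToyArray* array) {
    return functions->GetArrayLength(this, array);
  }
};

static std::size_t ToyGetArrayLength(ToyEnv* /*env*/, const ToyArray* array) {
  return array->length;
}

static const ToyNativeInterface kToyFunctions = {ToyGetArrayLength};

// C-style call: go through the function table and pass env by hand.
inline std::size_t LengthViaCStyle(ToyEnv* env, const ToyArray* array) {
  return env->functions->GetArrayLength(env, array);
}

// C++-style call: the member function hides the table lookup.
inline std::size_t LengthViaCppStyle(ToyEnv* env, const ToyArray* array) {
  return env->GetArrayLength(array);
}
```

Both helpers end up in the same table entry, which is why the two styles behave identically.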

1.3 IndirectReferenceTable

[-> indirect_reference_table.cc]

IndirectReferenceTable::IndirectReferenceTable(size_t max_count,
                                               IndirectRefKind desired_kind,
                                               ResizableCapacity resizable,
                                               std::string* error_msg)
    : segment_state_(kIRTFirstSegment),
      kind_(desired_kind),
      max_entries_(max_count),
      current_num_holes_(0),
      resizable_(resizable) {

  const size_t table_bytes = max_count * sizeof(IrtEntry);
  table_mem_map_.reset(MemMap::MapAnonymous("indirect ref table", nullptr, table_bytes,
                                            PROT_READ | PROT_WRITE, false, false, error_msg));
  if (table_mem_map_.get() == nullptr && error_msg->empty()) {
    *error_msg = "Unable to map memory for indirect ref table";
  }

  if (table_mem_map_.get() != nullptr) {
    table_ = reinterpret_cast<IrtEntry*>(table_mem_map_->Begin());
  } else {
    table_ = nullptr;
  }
  segment_state_ = kIRTFirstSegment;
  last_known_previous_state_ = kIRTFirstSegment;
}

Now look at the core members of IndirectReferenceTable:

class IndirectReferenceTable {
  private:
    IRTSegmentState segment_state_;
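    std::unique_ptr<MemMap> table_mem_map_;  // memory map that backs the reference table
    IrtEntry* table_;    // storage for the indirect reference entries
    const IndirectRefKind kind_;   // reference kind
    size_t max_entries_;    // maximum number of entries
    size_t current_num_holes_;   // number of holes (freed slots below the top) currently in the table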
    IRTSegmentState last_known_previous_state_;
    ResizableCapacity resizable_;
    ...
}

max_entries_ in IndirectReferenceTable records the maximum number of entries the table can hold:

  • The IndirectReferenceTable tables of JavaVM (globals and weak globals) are capped at 51200 entries each and cannot grow
  • The IndirectReferenceTable of JNIEnv (locals) starts at 512 entries and can grow
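The two capacity policies can be made concrete with a toy model. ToyTable and Fill are hypothetical names, and the sketch keeps only the behavior described here: a fixed table rejects additions once it is full, while a resizable table doubles its capacity (as the Add code in section 2.1.3 does via Resize). It is a simplification, not the ART implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model only: names are hypothetical and entries are plain ints.
enum class Resizable { kNo, kYes };

class ToyTable {
 public:
  ToyTable(std::size_t max_count, Resizable resizable)
      : max_entries_(max_count), resizable_(resizable) {}

  // Returns true on success. On failure, fills error_msg and leaves the
  // table unchanged.
  bool Add(int value, std::string* error_msg) {
    if (entries_.size() == max_entries_) {
      if (resizable_ == Resizable::kNo) {
        *error_msg = "table overflow (max=" + std::to_string(max_entries_) + ")";
        return false;
      }
      max_entries_ *= 2;  // growable tables double their capacity
    }
    entries_.push_back(value);
    return true;
  }

  std::size_t size() const { return entries_.size(); }
  std::size_t max_entries() const { return max_entries_; }

 private:
  std::vector<int> entries_;
  std::size_t max_entries_;
  Resizable resizable_;
};

// Tries to add n entries and returns how many additions succeeded.
inline std::size_t Fill(ToyTable& table, std::size_t n) {
  std::string error;
  std::size_t added = 0;
  for (std::size_t i = 0; i < n; ++i) {
    if (table.Add(static_cast<int>(i), &error)) ++added;
  }
  return added;
}
```

With a capacity of 4, a kNo table accepts only four of six additions, while a kYes table accepts all six and ends up with a capacity of 8.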

2. Global reference management

Let's look at how a global reference is added and removed.

2.1 Adding a reference

2.1.1 NewGlobalRef

[-> jni_internal.cc]

static jobject NewGlobalRef(JNIEnv* env, jobject obj) {
  ScopedObjectAccess soa(env);
  ObjPtr<mirror::Object> decoded_obj = soa.Decode<mirror::Object>(obj);
  return soa.Vm()->AddGlobalRef(soa.Self(), decoded_obj); // see section 2.1.2
}

JNIEnv's NewGlobalRef is implemented mainly by calling AddGlobalRef on the owning JavaVM to add the global reference.

2.1.2 AddGlobalRef

[-> java_vm_ext.cc]

jobject JavaVMExt::AddGlobalRef(Thread* self, ObjPtr<mirror::Object> obj) {
  if (obj == nullptr) {
    return nullptr;
  }
  IndirectRef ref;
  std::string error_msg;
  {
    WriterMutexLock mu(self, *Locks::jni_globals_lock_);
    // add obj to the global reference table (see section 2.1.3)
    ref = globals_.Add(kIRTFirstSegment, obj, &error_msg);
  }
  return reinterpret_cast<jobject>(ref);
}

Here globals_ is of type IndirectReferenceTable and is a member of JavaVMExt.

2.1.3 IndirectReferenceTable.Add

[-> indirect_reference_table.cc]

IndirectRef IndirectReferenceTable::Add(IRTSegmentState previous_state,
                                        ObjPtr<mirror::Object> obj,
                                        std::string* error_msg) {
  size_t top_index = segment_state_.top_index;
  if (top_index == max_entries_) { 
    // the table is full and may not grow: return immediately
    if (resizable_ == ResizableCapacity::kNo) {
      std::ostringstream oss;
      oss << "JNI ERROR (app bug): " << kind_ << " table overflow "
          << "(max=" << max_entries_ << ")"
          << MutatorLockedDumpable<IndirectReferenceTable>(*this);
      *error_msg = oss.str();
      return nullptr;
    }

    ...
    // the table may grow: try to double its capacity
    std::string inner_error_msg;
    if (!Resize(max_entries_ * 2, &inner_error_msg)) {
      std::ostringstream oss;
      oss << "JNI ERROR (app bug): " << kind_ << " table overflow "
          << "(max=" << max_entries_ << ")" << std::endl
          << MutatorLockedDumpable<IndirectReferenceTable>(*this)
          << " Resizing failed: " << inner_error_msg;
      *error_msg = oss.str();
      return nullptr;
    }
  }
  ...
  
  IndirectRef result;
  size_t index;
  // if there are holes, scan down from the top of table_ until a hole is found
  if (current_num_holes_ > 0) {
    IrtEntry* p_scan = &table_[top_index - 1];
    --p_scan;
    while (!p_scan->GetReference()->IsNull()) {
      DCHECK_GE(p_scan, table_ + previous_state.top_index);
      --p_scan;
    }
    index = p_scan - table_; // the hole that was found
    current_num_holes_--;  // one fewer hole
  } else {
    index = top_index++;  // no holes: append at the top of the table
    segment_state_.top_index = top_index; 
  }
  table_[index].Add(obj);
  result = ToIndirectRef(index);
  return result;
}

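Because the global reference table is not resizable, Add must keep the number of references within the limit: once the table is full, it records the JNI ERROR message in error_msg and returns nullptr. Otherwise it goes on to pick the slot for the new reference: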

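  • If there are holes (current_num_holes_ > 0), scan down from the top of table_ until a hole is found, then decrement the hole count;
  • If there are no holes, append the reference at the top of the table.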

2.2 Removing a reference

2.2.1 DeleteGlobalRef

[-> jni_internal.cc]

static void DeleteGlobalRef(JNIEnv* env, jobject obj) {
  JavaVMExt* vm = down_cast<JNIEnvExt*>(env)->GetVm();
  Thread* self = down_cast<JNIEnvExt*>(env)->self_;
  vm->DeleteGlobalRef(self, obj); // see section 2.2.2
}

JNIEnv's DeleteGlobalRef is implemented mainly by calling DeleteGlobalRef on the owning JavaVM to remove the global reference.

2.2.2 DeleteGlobalRef

[-> java_vm_ext.cc]

void JavaVMExt::DeleteGlobalRef(Thread* self, jobject obj) {
  if (obj == nullptr) {
    return;
  }
  {
    WriterMutexLock mu(self, *Locks::jni_globals_lock_);
    // see section 2.2.3
    if (!globals_.Remove(kIRTFirstSegment, obj)) {
      LOG(WARNING) << "JNI WARNING: DeleteGlobalRef(" << obj << ") "
                   << "failed to find entry";
    }
  }
  CheckGlobalRefAllocationTracking();
}

2.2.3 IndirectReferenceTable.Remove

[-> indirect_reference_table.cc]

bool IndirectReferenceTable::Remove(IRTSegmentState previous_state, IndirectRef iref) {
  const uint32_t top_index = segment_state_.top_index;
  const uint32_t bottom_index = previous_state.top_index;

  ...
  // make sure the index is within the valid range
  const uint32_t idx = ExtractIndex(iref);
  if (idx < bottom_index) {
    return false;
  }
  if (idx >= top_index) {
    return false;
  }
  RecoverHoles(previous_state);

  if (idx == top_index - 1) {
    ...
    // clear the corresponding slot in table_
    *table_[idx].GetReference() = GcRoot<mirror::Object>(nullptr);
    if (current_num_holes_ != 0) {
      uint32_t collapse_top_index = top_index;
      while (--collapse_top_index > bottom_index && current_num_holes_ != 0) {
        if (!table_[collapse_top_index - 1].GetReference()->IsNull()) {
          break;
        }
        current_num_holes_--;  // a hole directly below the top is consumed, so decrement the hole count
      }
      segment_state_.top_index = collapse_top_index; // update the top index of the table
    } else {
      segment_state_.top_index = top_index - 1;
    }
  } else {
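    // Not the top-most entry, so removing it creates a hole. Check whether the slot is already empty to guard against a double delete corrupting the hole count.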
    if (table_[idx].GetReference()->IsNull()) {
      return false;
    }

    *table_[idx].GetReference() = GcRoot<mirror::Object>(nullptr); // clear the slot
    current_num_holes_++;  // one more hole
  }
  return true;
}
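The hole bookkeeping in Add and Remove is easier to follow on a toy model. HoleTable is a hypothetical name; the value 0 stands for an empty slot, and segments (previous_state), RecoverHoles and capacity limits are deliberately left out, so the bottom index is always 0. This is a sketch of the idea under those assumptions, not the ART code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: 0 marks an empty slot, the bottom index is always 0.
class HoleTable {
 public:
  // Stores a non-zero value and returns the slot index it landed in.
  std::size_t Add(int value) {
    assert(value != 0);
    if (num_holes_ > 0) {
      // The top slot is never empty, so the scan starts one below it and
      // walks down until it reaches an empty slot.
      std::size_t i = top_index_ - 1;
      while (slots_[--i] != 0) {
      }
      slots_[i] = value;
      --num_holes_;
      return i;
    }
    if (top_index_ == slots_.size()) slots_.push_back(0);
    slots_[top_index_] = value;
    return top_index_++;
  }

  // Clears the slot at idx. Returns false for an invalid or repeated delete.
  bool Remove(std::size_t idx) {
    if (idx >= top_index_) return false;
    if (idx == top_index_ - 1) {
      // Top-most entry: clear it, then let the top sink past any holes.
      slots_[idx] = 0;
      if (num_holes_ != 0) {
        std::size_t collapse_top = top_index_;
        while (--collapse_top > 0 && num_holes_ != 0) {
          if (slots_[collapse_top - 1] != 0) break;
          --num_holes_;
        }
        top_index_ = collapse_top;
      } else {
        top_index_ = idx;
      }
      return true;
    }
    if (slots_[idx] == 0) return false;  // already a hole: reject double delete
    slots_[idx] = 0;
    ++num_holes_;
    return true;
  }

  std::size_t top_index() const { return top_index_; }
  std::size_t num_holes() const { return num_holes_; }

 private:
  std::vector<int> slots_;
  std::size_t top_index_ = 0;
  std::size_t num_holes_ = 0;
};
```

For example, after adding four values and removing the one at index 1, the next Add reuses index 1 instead of growing the table. Removing the top entry while holes sit directly below it lowers the top index past those holes in one step.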

2.3 Summary

From the above, each process can hold at most 51200 global references. Once that limit is reached, the next attempt to add a reference (section 2.1.3) aborts the process with: Abort message: 'art/runtime/indirect_reference_table.cc:258] JNI ERROR (app bug): global reference table overflow (max=51200)'.

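Adding and removing references must come in pairs. The common scenario is JNI code that uses JNIEnv's NewGlobalRef() and DeleteGlobalRef(). Always make sure the calls are paired, otherwise a global reference overflow may eventually occur.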
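One way to make the pairing hard to forget is to tie the release to a destructor. The sketch below is only an illustration: FakeEnv is a hypothetical stand-in that counts live global references so that leaks become observable, and ScopedGlobalRef is a hypothetical wrapper name. Real code would call env->NewGlobalRef(obj) and env->DeleteGlobalRef(ref) from jni.h instead.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for the JNI environment (assumption: it only counts
// live global references; it is not the real JNIEnv).
struct FakeEnv {
  int live_globals = 0;

  void* NewGlobalRef(void* obj) {
    if (obj == nullptr) return nullptr;  // null in, null out
    ++live_globals;
    return obj;
  }

  void DeleteGlobalRef(void* ref) {
    if (ref != nullptr) --live_globals;
  }
};

// Owns one global reference and releases it exactly once.
class ScopedGlobalRef {
 public:
  ScopedGlobalRef(FakeEnv* env, void* obj)
      : env_(env), ref_(env->NewGlobalRef(obj)) {}
  ~ScopedGlobalRef() { reset(); }

  ScopedGlobalRef(const ScopedGlobalRef&) = delete;
  ScopedGlobalRef& operator=(const ScopedGlobalRef&) = delete;

  // Moving transfers ownership, so only one wrapper ever deletes the ref.
  ScopedGlobalRef(ScopedGlobalRef&& other) noexcept
      : env_(other.env_), ref_(std::exchange(other.ref_, nullptr)) {}

  void reset() {
    if (ref_ != nullptr) {
      env_->DeleteGlobalRef(ref_);
      ref_ = nullptr;
    }
  }

  void* get() const { return ref_; }

 private:
  FakeEnv* env_;
  void* ref_;
};

// Creates and drops `count` scoped references; returns how many are still
// alive afterwards.
inline int LiveAfterScopedUse(FakeEnv& env, int count) {
  int dummy = 0;
  for (int i = 0; i < count; ++i) {
    ScopedGlobalRef ref(&env, &dummy);
  }
  return env.live_globals;
}
```

With this pattern the number of live references returns to its starting value no matter how many times the scope is entered, whereas an unpaired NewGlobalRef in the same loop would grow the table on every iteration.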

3. Case studies

3.1 Overflow caused by linkToDeath

Abort message: 'art/runtime/indirect_reference_table.cc:258] JNI ERROR (app bug): global reference table overflow (max=51200)'

backtrace:
#00 pc 0000000000079208 /system/lib64/libc.so (tgkill+8)
#01 pc 0000000000076480 /system/lib64/libc.so (pthread_kill+64)
#02 pc 00000000000249a0 /system/lib64/libc.so (raise+24)
#03 pc 000000000001ce8c /system/lib64/libc.so (abort+52)
#04 pc 000000000047eeec /system/lib64/libart.so (_ZN3art7Runtime5AbortEPKc+472)
#05 pc 00000000000e7564 /system/lib64/libart.so (_ZN3art10LogMessageD2Ev+1320)
#06 pc 00000000002745cc /system/lib64/libart.so (_ZN3art22IndirectReferenceTable3AddEjPNS_6mirror6ObjectE+324)
#07 pc 0000000000325c6c /system/lib64/libart.so (_ZN3art9JavaVMExt12AddGlobalRefEPNS_6ThreadEPNS_6mirror6ObjectE+68)
#08 pc 0000000000364b04 /system/lib64/libart.so (_ZN3art3JNI12NewGlobalRefEP7_JNIEnvP8_jobject+604)
#09 pc 00000000000ffd54 /system/lib64/libandroid_runtime.so
#10 pc 0000000002204a34 /system/framework/arm64/boot-framework.oat (offset 0x1986000) (android.os.BinderProxy.linkToDeath+160)
#11 pc 0000000001512ef4 /system/framework/oat/arm64/services.odex (offset 0xf68000)

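linkToDeath(DeathRecipient recipient, int flags) is a native method; see the article on the Binder death notification mechanism (linkToDeath) for details. It creates a JavaDeathRecipient object, whose constructor calls NewGlobalRef.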

3.1.1 JavaDeathRecipient

[-> android_util_Binder.cpp]

class JavaDeathRecipient : public IBinder::DeathRecipient
{
public:
    JavaDeathRecipient(JNIEnv* env, jobject object, const sp<DeathRecipientList>& list)
        : mVM(jnienv_to_javavm(env)), mObject(env->NewGlobalRef(object)),
          mObjectWeak(NULL), mList(list)
    {
        list->add(this);   // add this recipient (as an sp) to the list
        android_atomic_inc(&gNumDeathRefs);
        incRefsCreated(env);   // bump the count of references created
    }
    
    void binderDied(const wp<IBinder>& who)
    {
        if (mObject != NULL) {
            JNIEnv* env = javavm_to_jnienv(mVM);

            env->CallStaticVoidMethod(gBinderProxyOffsets.mClass,
                    gBinderProxyOffsets.mSendDeathNotice, mObject);
        
            sp<DeathRecipientList> list = mList.promote();
            if (list != NULL) {
                AutoMutex _l(list->lock());
                mObjectWeak = env->NewWeakGlobalRef(mObject);
                env->DeleteGlobalRef(mObject);  // remove the global reference
                mObject = NULL;
            }
        }
    }
    
    void clearReference()
    {
        sp<DeathRecipientList> list = mList.promote();
        if (list != NULL) {
            list->remove(this);
        } 
    }
protected:
    virtual ~JavaDeathRecipient()
    {
        android_atomic_dec(&gNumDeathRefs);
        JNIEnv* env = javavm_to_jnienv(mVM);
        if (mObject != NULL) {
            env->DeleteGlobalRef(mObject);  // remove the global reference
        } else {
            env->DeleteWeakGlobalRef(mObjectWeak);
        }
    }

private:
    JavaVM* const mVM;
    jobject mObject;
    jweak mObjectWeak; 
    wp<DeathRecipientList> mList;
};

Notes:

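  • Constructing a JavaDeathRecipient calls env->NewGlobalRef() to create a global reference for the recipient
  • Both the JavaDeathRecipient destructor and the binderDied death callback call env->DeleteGlobalRef to remove the global reference
    • clearReference() removes the JavaDeathRecipient from its DeathRecipientList, which allows the object to be destroyed and the reference to be released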

The key point here is that linkToDeath and unlinkToDeath must be used in pairs.

3.2 Overflow caused by javaObjectForIBinder

Abort message: 'art/runtime/indirect_reference_table.cc:258] JNI ERROR (app bug): global reference table overflow (max=51200)'

backtrace:
#00 pc 000000000006ac34 /system/lib64/libc.so (tgkill+8)
#01 pc 00000000000683c4 /system/lib64/libc.so (pthread_kill+68)
#02 pc 0000000000023ae4 /system/lib64/libc.so (raise+28)
#03 pc 000000000001e284 /system/lib64/libc.so (abort+60)
#04 pc 00000000004322b8 /system/lib64/libart.so (_ZN3art7Runtime5AbortEv+324)
#05 pc 0000000000136204 /system/lib64/libart.so (_ZN3art10LogMessageD2Ev+3136)
#06 pc 0000000000273604 /system/lib64/libart.so (_ZN3art22IndirectReferenceTable3AddEjPNS_6mirror6ObjectE+1964)
#07 pc 0000000000309400 /system/lib64/libart.so (_ZN3art9JavaVMExt12AddGlobalRefEPNS_6ThreadEPNS_6mirror6ObjectE+56)
#08 pc 000000000033f624 /system/lib64/libart.so (_ZN3art3JNI12NewGlobalRefEP7_JNIEnvP8_jobject+320)
#09 pc 00000000000e6ca8 /system/lib64/libandroid_runtime.so (_ZN7android20javaObjectForIBinderEP7_JNIEnvRKNS_2spINS_7IBinderEEE+412)
#10 pc 00000000000daa1c /system/lib64/libandroid_runtime.so
#11 pc 0000000002c5ec50 /system/framework/arm64/boot.oat (offset 0x28a0000)

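In versions before Android 9.0, javaObjectForIBinder() calls NewGlobalRef. Starting with Android 9.0, Google reworked this with a new implementation. The changes are extensive and are not covered here; interested readers can consult the source.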

3.3 Solution

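All app processes, as well as some native processes, communicate with system_server, and many API implementations use linkToDeath internally. Apps that abuse these public interfaces can create so many global references that the system restarts. Starting with Android 9.0, the native layer keeps track of the Binder proxies held for each uid, so the abusing app can be identified and killed to keep the system robust and reliable. Such abuse produces a log line like this: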

Killing 15015:com.gityuan.appdemo/u0a176 (adj 0): Too many Binders sent to SYSTEM

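For example, on stock Android 6.0 an app that keeps calling AppOpsManager's startWatchingMode() can make the phone reboot. Xiaomi phones fixed this long ago, while stock Android only addresses this class of problem starting with Android 9.0.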

3.3.1 setBinderProxyCountCallback

[-> ActivityManagerService.java]

public void systemReady(final Runnable goingCallback, TimingsTraceLog traceLog) {
    ...
    synchronized (this) {
        // see section 3.3.2
        BinderInternal.nSetBinderProxyCountWatermarks(6000,5500);
        // see section 3.3.3
        BinderInternal.nSetBinderProxyCountEnabled(true);
        // see section 3.3.4
        BinderInternal.setBinderProxyCountCallback(
            new BinderInternal.BinderProxyLimitListener() {
                @Override
                public void onLimitReached(int uid) {
                    Slog.wtf(TAG, "Uid " + uid + " sent too many Binders to uid "
                            + Process.myUid());
                    if (uid == Process.SYSTEM_UID) {
                        Slog.i(TAG, "Skipping kill (uid is SYSTEM)");
                    } else {
                        // once the watermark is hit, kill the process that sent too many Binders
                        killUid(UserHandle.getAppId(uid), UserHandle.getUserId(uid),
                                "Too many Binders sent to SYSTEM");
                    }
                }
            }, mHandler);
    }
    ...
}

3.3.2 nSetBinderProxyCountWatermarks

[-> android_util_Binder.cpp]

static void android_os_BinderInternal_setBinderProxyCountWatermarks(JNIEnv* env, jobject clazz,
                                                                    jint high, jint low)
{
    BpBinder::setBinderProxyCountWatermarks(high, low); // see below
}

[-> BpBinder.cpp]

uint32_t BpBinder::sBinderProxyCountHighWatermark = 2500;
uint32_t BpBinder::sBinderProxyCountLowWatermark = 2000;

void BpBinder::setBinderProxyCountWatermarks(int high, int low) {
    AutoMutex _l(sTrackingLock);
    sBinderProxyCountHighWatermark = high;
    sBinderProxyCountLowWatermark = low;
}

By default each process uses the Binder proxy watermark range [2000, 2500]; for the system_server process the range is [5500, 6000].

3.3.3 nSetBinderProxyCountEnabled

[-> android_util_Binder.cpp]

static void android_os_BinderInternal_setBinderProxyCountEnabled(JNIEnv* env, jobject clazz,
                                                                 jboolean enable)
{
    BpBinder::setCountByUidEnabled((bool) enable); // see below
}

[-> BpBinder.cpp]

void BpBinder::setCountByUidEnabled(bool enable) 
{ 
    sCountByUidEnabled.store(enable); 
}

sCountByUidEnabled is of type std::atomic_bool, an atomic boolean that makes concurrent access from multiple threads safe.

3.3.4 setBinderProxyCountCallback

[-> BinderInternal.java]

public static void setBinderProxyCountCallback(BinderProxyLimitListener listener,
        @NonNull Handler handler) {
    Preconditions.checkNotNull(handler,
            "Must provide NonNull Handler to setBinderProxyCountCallback when setting "
                    + "BinderProxyLimitListener");
    // see below
    sBinderProxyLimitListenerDelegate.setListener(listener, handler);
}

Note that handler must not be null here.

[-> BinderInternal.java]

public class BinderInternal {
    static final BinderProxyLimitListenerDelegate sBinderProxyLimitListenerDelegate =
        new BinderProxyLimitListenerDelegate();
    ...
    
    public static void binderProxyLimitCallbackFromNative(int uid) {
        // invoke the notifyClient callback
        sBinderProxyLimitListenerDelegate.notifyClient(uid);
    }
    
    static private class BinderProxyLimitListenerDelegate {
        private BinderProxyLimitListener mBinderProxyLimitListener;
        private Handler mHandler;

        void setListener(BinderProxyLimitListener listener, Handler handler) {
            synchronized (this) {
                mBinderProxyLimitListener = listener;
                mHandler = handler;
            }
        }

        void notifyClient(final int uid) {
            synchronized (this) {
                if (mBinderProxyLimitListener != null) {
                    mHandler.post(new Runnable() {
                        @Override
                        public void run() {
                            mBinderProxyLimitListener.onLimitReached(uid);
                        }
                    });
                }
            }
        }
    }
}

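setListener stores the Binder proxy limit listener together with the Handler that runs the callback. When the native layer reports that some uid holds more Binder proxies in system_server than the watermark allows, onLimitReached() runs on the thread of mHandler.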

3.3.5 notifyClient

Next, let's see when the native layer triggers the callback.

In int_register_android_os_BinderInternal(), BpBinder's setLimitCallback method is called to store android_os_BinderInternal_proxyLimitcallback in BpBinder's sLimitCallback.

binder_proxy_limit_callback BpBinder::sLimitCallback;
BpBinder* BpBinder::create(int32_t handle) {
    int32_t trackedUid = -1;
    if (sCountByUidEnabled) {
        // get the uid of the calling (remote) side
        trackedUid = IPCThreadState::self()->getCallingUid();
        AutoMutex _l(sTrackingLock);
        uint32_t trackedValue = sTrackingMap[trackedUid];
        if (CC_UNLIKELY(trackedValue & LIMIT_REACHED_MASK)) {
            if (sBinderProxyThrottleCreate) {
                return nullptr;
            }
        } else {
            // the count has reached the high watermark
            if ((trackedValue & COUNTING_VALUE_MASK) >= sBinderProxyCountHighWatermark) {
                ALOGE("Too many binder proxy objects sent to uid %d from uid %d (%d proxies held)",
                      getuid(), trackedUid, trackedValue);
                sTrackingMap[trackedUid] |= LIMIT_REACHED_MASK;
                // the binder proxy count has reached the high watermark: invoke the callback
                if (sLimitCallback) sLimitCallback(trackedUid);
                ...
            }
        }
        sTrackingMap[trackedUid]++;
    }
    return new BpBinder(handle, trackedUid);
}

The sLimitCallback call chain eventually reaches onLimitReached in AMS (section 3.3.1), which kills the target process and logs the event.

android_os_BinderInternal_proxyLimitcallback
    binderProxyLimitCallbackFromNative 
        notifyClient
            onLimitReached
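The per-uid counting in BpBinder::create can be sketched as a small standalone class. ProxyTracker, OnCreate and OnDestroy are hypothetical names and the mask values are assumptions. The create path follows the excerpt above: the callback fires once when the count reaches the high watermark. The destroy path is an assumption on my part, since the excerpt does not show how the low watermark is used: here the "limit reached" flag is cleared once the count has dropped to the low watermark, so that the callback can fire again. Throttling (sBinderProxyThrottleCreate) is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>

// Toy model only: one word per uid packs a "limit reached" flag (top bit)
// and a proxy count (remaining bits). Mask values are assumptions.
class ProxyTracker {
 public:
  ProxyTracker(std::uint32_t high, std::uint32_t low,
               std::function<void(int)> on_limit)
      : high_(high), low_(low), on_limit_(std::move(on_limit)) {}

  // Called when a proxy is created on behalf of `uid`.
  void OnCreate(int uid) {
    std::uint32_t& tracked = map_[uid];
    if ((tracked & kLimitReached) == 0 && (tracked & kCountMask) >= high_) {
      tracked |= kLimitReached;  // fire the callback once per excursion
      if (on_limit_) on_limit_(uid);
    }
    ++tracked;
  }

  // Called when a proxy for `uid` goes away (assumed counterpart).
  void OnDestroy(int uid) {
    auto it = map_.find(uid);
    if (it == map_.end() || (it->second & kCountMask) == 0) return;
    if ((it->second & kLimitReached) != 0 &&
        (it->second & kCountMask) <= low_) {
      it->second &= ~kLimitReached;  // re-arm at the low watermark
    }
    if (--it->second == 0) map_.erase(it);
  }

  std::uint32_t Count(int uid) const {
    auto it = map_.find(uid);
    return it == map_.end() ? 0 : (it->second & kCountMask);
  }

  bool LimitReached(int uid) const {
    auto it = map_.find(uid);
    return it != map_.end() && (it->second & kLimitReached) != 0;
  }

 private:
  static constexpr std::uint32_t kLimitReached = 0x80000000u;
  static constexpr std::uint32_t kCountMask = 0x7FFFFFFFu;

  std::uint32_t high_;
  std::uint32_t low_;
  std::function<void(int)> on_limit_;
  std::unordered_map<int, std::uint32_t> map_;
};

// Creates `creates` proxies for one uid and returns how often the callback
// fired.
inline int CountLimitCallbacks(std::uint32_t high, std::uint32_t low,
                               int creates) {
  int fired = 0;
  ProxyTracker tracker(high, low, [&fired](int) { ++fired; });
  for (int i = 0; i < creates; ++i) tracker.OnCreate(1000);
  return fired;
}
```

The gap between the two watermarks acts as hysteresis: a uid hovering around the high watermark triggers one callback rather than one per proxy.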

3.4 Summary

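All app processes, as well as some native processes, communicate with system_server, and many API implementations use linkToDeath internally. Apps that abuse these public interfaces can create so many global references that the system restarts. Starting with Android 9.0, the native layer keeps track of the Binder proxies held for each uid. When the number of binder proxies that an app has sent to system_server exceeds 6000, the app is considered to be abusing the API and is killed, to keep the system robust and reliable. Apps therefore need to use these interfaces properly: for example, after calling startWatchingMode, call the matching stopWatchingMode once the watch is no longer needed. Otherwise the number of binder proxies only ever grows, and the app is killed by the system once the threshold is reached.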

Similarly, there is another pair of methods:

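  • linkToDeath(): calls new JavaDeathRecipient(), and creating the recipient object calls NewGlobalRef to add a global reference so that the recipient is not garbage collected.
  • unlinkToDeath(): calls clearReference() to remove the current JavaDeathRecipient from the list, which runs the JavaDeathRecipient destructor and calls DeleteGlobalRef to remove the global reference.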

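One more note: after linkToDeath(), receiving binderDied() also removes the global reference by itself. Even so, when a registered death notification is no longer needed, it is still recommended to call unlinkToDeath() explicitly. To avoid global reference overflow, the two methods above must be paired. When an overflow does occur, you need to find out which references caused it: search the log for the keyword "global reference table dump", which prints the 10 most recent entries. The specific cause still has to be analyzed in context. This issue has been fixed in the latest Android 9.0.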

