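Background

Recently I saw people in a group chat discussing app keep-alive. It reminded me that I had faced a similar requirement while developing a fitness app, so my interest was rekindled and I jumped into what turned into a heated discussion.

App keep-alive from the framework's perspective

Many people in the group held the same view I started with: keep-alive depends entirely on the system's mood. Whichever process the system wants alive stays alive, and app developers are powerless. But is that really the case?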

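Keep-alive approaches

First, I collected the keep-alive techniques that app developers have used over the years, including those still in use today. They follow two main ideas: staying alive and coming back to life.

Techniques for staying alive:
  • 1-pixel Activity
  • Silent background audio
  • Foreground service
  • Heartbeat mechanism
  • Long-lived socket connection
  • Accessibility service
  • …

Techniques for coming back to life:
  • Dual-process guarding (Java layer and native layer)
  • JobScheduler scheduled jobs
  • Push / mutual wake-up between apps
  • …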

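Clearly, app developers have gone to great lengths to keep their apps alive a little longer. Even so, as Android has evolved, and especially since 8.0, the system has placed ever tighter restrictions on apps, and the traditional techniques no longer work, leaving Android developers at a loss. As a result, a more moderate approach has emerged:

  • Guide the user to add the app to the phone's whitelist

This is what the vast majority of apps do today. Compared with the traditional hacks it is less intrusive and easier for users to accept.

Compared with a national-scale app like WeChat, though, the result still falls far short. So how does WeChat stay alive? Or, to return to the original question: is an app's fate really decided by system scheduling alone, or can developers intervene?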

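Process scheduling principles

Before answering that, we need to understand how Android schedules processes, and in particular how the framework dynamically adjusts the state of a process hosting the four major components according to the state of those components. A process has two important state values:

  • oom_adj, defined in frameworks/base/services/core/java/com/android/server/am/ProcessList.java

  • procState, defined in frameworks/base/core/java/android/app/ActivityManager.java

OOM_ADJ

Taking the Android 10 source as an example, oom_adj has 20 levels, with values in the range [-10000, 1001]. In Android 6.0 and earlier the range was [-17, 16].

  • The larger the oom_adj value, the lower the priority.

  • Processes with oom_adj < 0 are system processes.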

public final class ProcessList {
    static final String TAG = TAG_WITH_CLASS_NAME ? "ProcessList" : TAG_AM;
    // The minimum time we allow between crashes, for us to consider this
    // application to be bad and stop and its services and reject broadcasts.
    static final int MIN_CRASH_INTERVAL = 60 * 1000;
    // OOM adjustments for processes in various states:
    // Uninitialized value for any major or minor adj fields
    static final int INVALID_ADJ = -10000;
    // Adjustment used in certain places where we don't know it yet.
    // (Generally this is something that is going to be cached, but we
    // don't know the exact value in the cached range to assign yet.)
    static final int UNKNOWN_ADJ = 1001;
    // This is a process only hosting activities that are not visible,
    // so it can be killed without any disruption.
    static final int CACHED_APP_MAX_ADJ = 999;
    static final int CACHED_APP_MIN_ADJ = 900;
    // This is the oom_adj level that we allow to die first. This cannot be equal to
    // CACHED_APP_MAX_ADJ unless processes are actively being assigned an oom_score_adj of
    // CACHED_APP_MAX_ADJ.
    static final int CACHED_APP_LMK_FIRST_ADJ = 950;
    // Number of levels we have available for different service connection group importance
    // levels.
    static final int CACHED_APP_IMPORTANCE_LEVELS = 5;
    // The B list of SERVICE_ADJ -- these are the old and decrepit
    // services that aren't as shiny and interesting as the ones in the A list.
    static final int SERVICE_B_ADJ = 800;
    // This is the process of the previous application that the user was in.
    // This process is kept above other things, because it is very common to
    // switch back to the previous app.  This is important both for recent
    // task switch (toggling between the two top recent apps) as well as normal
    // UI flow such as clicking on a URI in the e-mail app to view in the browser,
    // and then pressing back to return to e-mail.
    static final int PREVIOUS_APP_ADJ = 700;
    // This is a process holding the home application -- we want to try
    // avoiding killing it, even if it would normally be in the background,
    // because the user interacts with it so much.
    static final int HOME_APP_ADJ = 600;
    // This is a process holding an application service -- killing it will not
    // have much of an impact as far as the user is concerned.
    static final int SERVICE_ADJ = 500;
    // This is a process with a heavy-weight application.  It is in the
    // background, but we want to try to avoid killing it.  Value set in
    // system/rootdir/init.rc on startup.
    static final int HEAVY_WEIGHT_APP_ADJ = 400;
    // This is a process currently hosting a backup operation.  Killing it
    // is not entirely fatal but is generally a bad idea.
    static final int BACKUP_APP_ADJ = 300;
    // This is a process bound by the system (or other app) that's more important than services but
    // not so perceptible that it affects the user immediately if killed.
    static final int PERCEPTIBLE_LOW_APP_ADJ = 250;
    // This is a process only hosting components that are perceptible to the
    // user, and we really want to avoid killing them, but they are not
    // immediately visible. An example is background music playback.
    static final int PERCEPTIBLE_APP_ADJ = 200;
    // This is a process only hosting activities that are visible to the
    // user, so we'd prefer they don't disappear.
    static final int VISIBLE_APP_ADJ = 100;
    static final int VISIBLE_APP_LAYER_MAX = PERCEPTIBLE_APP_ADJ - VISIBLE_APP_ADJ - 1;
    // This is a process that was recently TOP and moved to FGS. Continue to treat it almost
    // like a foreground app for a while.
    // @see TOP_TO_FGS_GRACE_PERIOD
    static final int PERCEPTIBLE_RECENT_FOREGROUND_APP_ADJ = 50;
    // This is the process running the current foreground app.  We'd really
    // rather not kill it!
    static final int FOREGROUND_APP_ADJ = 0;
    // This is a process that the system or a persistent process has bound to,
    // and indicated it is important.
    static final int PERSISTENT_SERVICE_ADJ = -700;
    // This is a system persistent process, such as telephony.  Definitely
    // don't want to kill it, but doing so is not completely fatal.
    static final int PERSISTENT_PROC_ADJ = -800;
    // The system process runs at the default adjustment.
    static final int SYSTEM_ADJ = -900;
    // Special code for native processes that are not being managed by the system (so
    // don't have an oom adj assigned by the system).
    static final int NATIVE_ADJ = -1000;
    // Memory pages are 4K.
    static final int PAGE_SIZE = 4 * 1024;
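    // some code omitted
}

| ADJ level | Value | Description (see the source comments) |
|---|---|---|
| INVALID_ADJ | -10000 | Default for an adj field that has not been initialized |
| UNKNOWN_ADJ | 1001 | Exact value not known yet; generally a process that is about to be cached |
| CACHED_APP_MAX_ADJ | 999 | Upper bound for processes hosting only invisible activities |
| CACHED_APP_MIN_ADJ | 900 | Lower bound for processes hosting only invisible activities |
| CACHED_APP_LMK_FIRST_ADJ | 950 | Level that lowmemorykiller is allowed to kill first |
| SERVICE_B_ADJ | 800 | Older services on the B list |
| PREVIOUS_APP_ADJ | 700 | The previous app the user was in, common when switching apps |
| HOME_APP_ADJ | 600 | Home (launcher) process |
| SERVICE_ADJ | 500 | Process hosting an application service |
| HEAVY_WEIGHT_APP_ADJ | 400 | Heavy-weight process in the background; set in system/rootdir/init.rc |
| BACKUP_APP_ADJ | 300 | Process hosting a backup operation |
| PERCEPTIBLE_LOW_APP_ADJ | 250 | Process bound by the system or another app; more important than a service |
| PERCEPTIBLE_APP_ADJ | 200 | Process hosting perceptible components, e.g. background music playback |
| VISIBLE_APP_ADJ | 100 | Process hosting visible activities |
| PERCEPTIBLE_RECENT_FOREGROUND_APP_ADJ | 50 | Process that was recently TOP and has moved to a foreground service |
| FOREGROUND_APP_ADJ | 0 | Foreground process, interacting with the user |
| PERSISTENT_SERVICE_ADJ | -700 | Process bound by the system or by a persistent process |
| PERSISTENT_PROC_ADJ | -800 | System persistent process, such as telephony |
| SYSTEM_ADJ | -900 | System process |
| NATIVE_ADJ | -1000 | Native process not managed by the system |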

You can inspect a process's oom_adj value with cat /proc/<pid>/oom_score_adj. For example, let's check the adj of the Phone app.


The value is 935, which is within the range for invisible (cached) processes. After launching the Phone app, I checked again.


This time the adj is 0, i.e. the process is interacting with the user.
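The same check can be scripted. The following is a minimal sketch in plain Java, assuming a Linux or Android environment where /proc exists. The class name OomScoreAdjReader and its methods are hypothetical helpers, not framework APIs. On a real device this is usually run from an adb shell, since an ordinary app may not be permitted to read the /proc entries of other processes.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the Android framework.
final class OomScoreAdjReader {

    /** Parses the single integer stored in /proc/<pid>/oom_score_adj. */
    static int parse(String fileContent) {
        return Integer.parseInt(fileContent.trim());
    }

    /** Reads the value for a pid; requires a /proc filesystem and read permission. */
    static int read(int pid) throws IOException {
        Path path = Paths.get("/proc", String.valueOf(pid), "oom_score_adj");
        byte[] bytes = Files.readAllBytes(path);
        return parse(new String(bytes, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: OomScoreAdjReader <pid>");
            return;
        }
        int pid = Integer.parseInt(args[0]);
        System.out.println("oom_score_adj of " + pid + " = " + read(pid));
    }
}
```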

ProcessState

process_state has 23 values, in the range [-1, 21].

@SystemService(Context.ACTIVITY_SERVICE)
public class ActivityManager {
    // some code omitted
    /** @hide Not a real process state. */
    public static final int PROCESS_STATE_UNKNOWN = -1;
    /** @hide Process is a persistent system process. */
    public static final int PROCESS_STATE_PERSISTENT = 0;
    /** @hide Process is a persistent system process and is doing UI. */
    public static final int PROCESS_STATE_PERSISTENT_UI = 1;
    /** @hide Process is hosting the current top activities.  Note that this covers
     * all activities that are visible to the user. */
    @UnsupportedAppUsage
    public static final int PROCESS_STATE_TOP = 2;
    /** @hide Process is hosting a foreground service with location type. */
    public static final int PROCESS_STATE_FOREGROUND_SERVICE_LOCATION = 3;
    /** @hide Process is bound to a TOP app. This is ranked below SERVICE_LOCATION so that
     * it doesn't get the capability of location access while-in-use. */
    public static final int PROCESS_STATE_BOUND_TOP = 4;
    /** @hide Process is hosting a foreground service. */
    @UnsupportedAppUsage
    public static final int PROCESS_STATE_FOREGROUND_SERVICE = 5;
    /** @hide Process is hosting a foreground service due to a system binding. */
    @UnsupportedAppUsage
    public static final int PROCESS_STATE_BOUND_FOREGROUND_SERVICE = 6;
    /** @hide Process is important to the user, and something they are aware of. */
    public static final int PROCESS_STATE_IMPORTANT_FOREGROUND = 7;
    /** @hide Process is important to the user, but not something they are aware of. */
    @UnsupportedAppUsage
    public static final int PROCESS_STATE_IMPORTANT_BACKGROUND = 8;
    /** @hide Process is in the background transient so we will try to keep running. */
    public static final int PROCESS_STATE_TRANSIENT_BACKGROUND = 9;
    /** @hide Process is in the background running a backup/restore operation. */
    public static final int PROCESS_STATE_BACKUP = 10;
    /** @hide Process is in the background running a service.  Unlike oom_adj, this level
     * is used for both the normal running in background state and the executing
     * operations state. */
    @UnsupportedAppUsage
    public static final int PROCESS_STATE_SERVICE = 11;
    /** @hide Process is in the background running a receiver.   Note that from the
     * perspective of oom_adj, receivers run at a higher foreground level, but for our
     * prioritization here that is not necessary and putting them below services means
     * many fewer changes in some process states as they receive broadcasts. */
    @UnsupportedAppUsage
    public static final int PROCESS_STATE_RECEIVER = 12;
    /** @hide Same as {@link #PROCESS_STATE_TOP} but while device is sleeping. */
    public static final int PROCESS_STATE_TOP_SLEEPING = 13;
    /** @hide Process is in the background, but it can't restore its state so we want
     * to try to avoid killing it. */
    public static final int PROCESS_STATE_HEAVY_WEIGHT = 14;
    /** @hide Process is in the background but hosts the home activity. */
    @UnsupportedAppUsage
    public static final int PROCESS_STATE_HOME = 15;
    /** @hide Process is in the background but hosts the last shown activity. */
    public static final int PROCESS_STATE_LAST_ACTIVITY = 16;
    /** @hide Process is being cached for later use and contains activities. */
    @UnsupportedAppUsage
    public static final int PROCESS_STATE_CACHED_ACTIVITY = 17;
    /** @hide Process is being cached for later use and is a client of another cached
     * process that contains activities. */
    public static final int PROCESS_STATE_CACHED_ACTIVITY_CLIENT = 18;
    /** @hide Process is being cached for later use and has an activity that corresponds
     * to an existing recent task. */
    public static final int PROCESS_STATE_CACHED_RECENT = 19;
    /** @hide Process is being cached for later use and is empty. */
    public static final int PROCESS_STATE_CACHED_EMPTY = 20;
    /** @hide Process does not exist. */
    public static final int PROCESS_STATE_NONEXISTENT = 21;
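    // some code omitted
}

| State | Value | Description (see the source comments) |
|---|---|---|
| PROCESS_STATE_UNKNOWN | -1 | Not a real process state |
| PROCESS_STATE_PERSISTENT | 0 | Persistent system process |
| PROCESS_STATE_PERSISTENT_UI | 1 | Persistent system process that is doing UI |
| PROCESS_STATE_TOP | 2 | Process hosting the current top activities |
| PROCESS_STATE_FOREGROUND_SERVICE_LOCATION | 3 | Process hosting a foreground service with location type |
| PROCESS_STATE_BOUND_TOP | 4 | Process bound to a TOP app |
| PROCESS_STATE_FOREGROUND_SERVICE | 5 | Process hosting a foreground service |
| PROCESS_STATE_BOUND_FOREGROUND_SERVICE | 6 | Process hosting a foreground service due to a system binding |
| PROCESS_STATE_IMPORTANT_FOREGROUND | 7 | Process important to the user, and something they are aware of |
| PROCESS_STATE_IMPORTANT_BACKGROUND | 8 | Process important to the user, but not something they are aware of |
| PROCESS_STATE_TRANSIENT_BACKGROUND | 9 | Process temporarily in the background |
| PROCESS_STATE_BACKUP | 10 | Background process running a backup/restore operation |
| PROCESS_STATE_SERVICE | 11 | Background process running a service |
| PROCESS_STATE_RECEIVER | 12 | Background process running a broadcast receiver |
| PROCESS_STATE_TOP_SLEEPING | 13 | Same as PROCESS_STATE_TOP, but while the device is sleeping |
| PROCESS_STATE_HEAVY_WEIGHT | 14 | Background process that cannot restore its own state |
| PROCESS_STATE_HOME | 15 | Background process hosting the home activity |
| PROCESS_STATE_LAST_ACTIVITY | 16 | Background process hosting the last shown activity |
| PROCESS_STATE_CACHED_ACTIVITY | 17 | Cached process that contains activities |
| PROCESS_STATE_CACHED_ACTIVITY_CLIENT | 18 | Cached process that is a client of another cached process containing activities |
| PROCESS_STATE_CACHED_RECENT | 19 | Cached process with an activity that belongs to a recent task |
| PROCESS_STATE_CACHED_EMPTY | 20 | Empty cached process, kept for later use |
| PROCESS_STATE_NONEXISTENT | 21 | Process does not exist |

Process scheduling algorithm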

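frameworks/base/services/core/java/com/android/server/am/OomAdjuster.java has three core methods for computing and updating a process's oom_adj value:

  • updateOomAdjLocked(): updates adj. Returns false if the target process is null or has been killed, true otherwise.
  • computeOomAdjLocked(): computes adj. Returns true if the computation succeeds, false otherwise.
  • applyOomAdjLocked(): applies adj. Returns false if the target process needs to be killed, true otherwise.

When adj is updated

In other words, when updateOomAdjLocked() is called. Put simply, an adj update is triggered whenever one of the four major components is created or changes state, or when the current process binds to another process. The call sites are too numerous to list here; see the source for details.

How adj is computed

computeOomAdjLocked() is quite complex, close to 1,000 lines, so I won't paste it here; read it yourself if you're interested. The overall idea is to set the adj value that matches the current state of the process. Because there are many states, a long series of if checks tests each one in turn and finally determines which state the process is in.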
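To make the "series of if checks" idea concrete, here is a deliberately tiny toy model. It is not the real computeOomAdjLocked(): the ProcessSnapshot class and its boolean fields are hypothetical, and the real method also accounts for bindings, content providers, timing and much more. Only the constant values are taken from the ProcessList listing in this article.

```java
// Toy model only; the real OomAdjuster logic is far more involved.
final class AdjSketch {
    // Values copied from the ProcessList listing in this article.
    static final int FOREGROUND_APP_ADJ = 0;
    static final int VISIBLE_APP_ADJ = 100;
    static final int PERCEPTIBLE_APP_ADJ = 200;
    static final int SERVICE_ADJ = 500;
    static final int HOME_APP_ADJ = 600;
    static final int PREVIOUS_APP_ADJ = 700;
    static final int CACHED_APP_MIN_ADJ = 900;

    /** Hypothetical snapshot of a process; the real ProcessRecord holds far more state. */
    static final class ProcessSnapshot {
        boolean isTopApp;
        boolean hasVisibleActivity;
        boolean hasForegroundService;
        boolean hasStartedService;
        boolean isHomeApp;
        boolean isPreviousApp;
    }

    /** Checks states from most to least important, so the best matching state wins. */
    static int computeAdj(ProcessSnapshot p) {
        if (p.isTopApp) return FOREGROUND_APP_ADJ;
        if (p.hasVisibleActivity) return VISIBLE_APP_ADJ;
        if (p.hasForegroundService) return PERCEPTIBLE_APP_ADJ;
        if (p.hasStartedService) return SERVICE_ADJ;
        if (p.isHomeApp) return HOME_APP_ADJ;
        if (p.isPreviousApp) return PREVIOUS_APP_ADJ;
        return CACHED_APP_MIN_ADJ; // nothing interesting is running: treat as cached
    }
}
```

Under this model, an app that only starts a foreground service lands at PERCEPTIBLE_APP_ADJ (200), which is in line with the 100 to 200 floor for app-layer techniques mentioned in this article.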

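Applying adj

The computed adj value is passed to lowmemorykiller (lmk for short), which decides whether the process lives or dies. The lmk algorithm differs slightly between vendors. Below is the description of lmk from the source: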

/* drivers/misc/lowmemorykiller.c
 *
 * The lowmemorykiller driver lets user-space specify a set of memory thresholds
 * where processes with a range of oom_score_adj values will get killed. Specify
 * the minimum oom_score_adj values in
 * /sys/module/lowmemorykiller/parameters/adj and the number of free pages in
 * /sys/module/lowmemorykiller/parameters/minfree. Both files take a comma
 * separated list of numbers in ascending order.
 *
 * For example, write "0,8" to /sys/module/lowmemorykiller/parameters/adj and
 * "1024,4096" to /sys/module/lowmemorykiller/parameters/minfree to kill
 * processes with a oom_score_adj value of 8 or higher when the free memory
 * drops below 4096 pages and kill processes with a oom_score_adj value of 0 or
 * higher when the free memory drops below 1024 pages.
 *
 * The driver considers memory used for caches to be free, but if a large
 * percentage of the cached memory is locked this can be very inaccurate
 * and processes may not get killed until the normal oom killer is triggered.
 *
 * Copyright (C) 2007-2008 Google, Inc.
 *
 * This software is licensed under the terms of the GNU General Public
 * License version 2, as published by the Free Software Foundation, and
 * may be copied, distributed, and modified under those terms.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 */
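The threshold rule described in that comment can be sketched as follows. This is a simplified model in Java, not the driver code: the class and method names are hypothetical, and the real driver also takes cache pages into account, which is ignored here.

```java
// Simplified model of the adj/minfree threshold rule; names are hypothetical.
final class LmkThresholdSketch {

    /**
     * Returns the smallest oom_score_adj that becomes eligible for killing at the
     * given number of free pages, or Integer.MAX_VALUE if free memory is above
     * every threshold. Both arrays are ascending and paired by index.
     */
    static int minAdjToKill(int[] adj, int[] minfree, long freePages) {
        int n = Math.min(adj.length, minfree.length);
        for (int i = 0; i < n; i++) {
            // The first (lowest) threshold that free memory has dropped below decides.
            if (freePages < minfree[i]) {
                return adj[i];
            }
        }
        return Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        // The example from the driver comment: adj = "0,8", minfree = "1024,4096".
        int[] adj = {0, 8};
        int[] minfree = {1024, 4096};
        for (long free : new long[] {5000, 3000, 500}) {
            int min = minAdjToKill(adj, minfree, free);
            System.out.println(free + " free pages -> "
                    + (min == Integer.MAX_VALUE ? "nothing killed" : "kill adj >= " + min));
        }
    }
}
```

With the values from the comment, dropping below 4096 free pages makes processes with adj 8 or higher eligible, and dropping below 1024 extends that to adj 0 or higher. This is why a lower adj means a better chance of surviving.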

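The core idea of keep-alive

From the scheduling principles above, we need to lower the adj value of the app process as far as possible to reduce the chance of being killed by lmk, and that is also the ultimate goal of the traditional keep-alive techniques. From the adj level table, app-layer techniques can bring adj down to somewhere between 100 and 200 at best. I tested WeChat, Alipay and Kugou Music separately: launch the app, return to the home screen, then turn the screen off. The results are as follows.

WeChat results: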


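WeChat created two processes, both with an adj of 100, which corresponds to VISIBLE_APP_ADJ in the adj level table. This was measured on a test device with WeChat not logged in. When I switched to my own Xiaomi Mi 8, where WeChat was logged in, three processes were running.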


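I later learned that the process named com.tencent.soter.soterserver handles WeChat fingerprint payment, and its adj value is, astonishingly, -800. As noted above, processes with adj below 0 are system processes, so how did WeChat manage to create a system process? My friends and I were stunned. For comparison, I looked at Alipay's results.

Alipay results: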


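Alipay created six processes. Apart from one at 915, all of them had an adj of 0. What is going on? A value of 0 means a foreground process that is interacting with the user. My world was falling apart. The only explanation I could think of is that Alipay lowers its adj through some unknown hack.

Kugou results: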


Kugou created two processes with adj values of 700 and 200, which correspond to PREVIOUS_APP_ADJ and PERCEPTIBLE_APP_ADJ in the adj level table. Fortunately, that is what I expected.
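The readings reported in these tests (100, -800, 0, 915, 700, 200) can be mapped back to the adj level table with a small lookup. The sketch below is a hypothetical helper, not framework code. It assumes that a reading between two named levels belongs to the nearest level at or below it, which is only an approximation of how the framework assigns intermediate values.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical lookup helper; level values come from the ProcessList listing in this article.
final class AdjLevels {
    private static final TreeMap<Integer, String> LEVELS = new TreeMap<>();
    static {
        LEVELS.put(-1000, "NATIVE_ADJ");
        LEVELS.put(-900, "SYSTEM_ADJ");
        LEVELS.put(-800, "PERSISTENT_PROC_ADJ");
        LEVELS.put(-700, "PERSISTENT_SERVICE_ADJ");
        LEVELS.put(0, "FOREGROUND_APP_ADJ");
        LEVELS.put(50, "PERCEPTIBLE_RECENT_FOREGROUND_APP_ADJ");
        LEVELS.put(100, "VISIBLE_APP_ADJ");
        LEVELS.put(200, "PERCEPTIBLE_APP_ADJ");
        LEVELS.put(250, "PERCEPTIBLE_LOW_APP_ADJ");
        LEVELS.put(300, "BACKUP_APP_ADJ");
        LEVELS.put(400, "HEAVY_WEIGHT_APP_ADJ");
        LEVELS.put(500, "SERVICE_ADJ");
        LEVELS.put(600, "HOME_APP_ADJ");
        LEVELS.put(700, "PREVIOUS_APP_ADJ");
        LEVELS.put(800, "SERVICE_B_ADJ");
        LEVELS.put(900, "CACHED (CACHED_APP_MIN_ADJ..CACHED_APP_MAX_ADJ)");
        LEVELS.put(1001, "UNKNOWN_ADJ");
    }

    /** Returns the nearest named level at or below the given value (assumed rule). */
    static String levelFor(int adj) {
        Map.Entry<Integer, String> entry = LEVELS.floorEntry(adj);
        return entry == null ? "INVALID_OR_OUT_OF_RANGE" : entry.getValue();
    }

    public static void main(String[] args) {
        for (int adj : new int[] {100, -800, 0, 915, 700, 200}) {
            System.out.println(adj + " -> " + levelFor(adj));
        }
    }
}
```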

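Reflections on the tests

The results for these three apps suggest that WeChat and Alipay use some keep-alive technique to push their adj as low as possible. WeChat in particular appears able to create a system process, which is astonishing. That is certainly not possible from the app layer, so I assume it is done in the native layer, but what the trick is I cannot say, since reverse engineering is not my strong suit.

Just as I was feeling down about it, I remembered an article I had read a couple of days earlier, "Once an app has system privileges, can it really do whatever it wants?". It describes how a third-party app exploited CVE vulnerabilities to obtain system privileges and then quietly did some unbelievable things. That was a revelation: perhaps these big vendors' apps rely on system vulnerabilities to stay alive; otherwise the results are hard to explain. If an app can already obtain system privileges, creating a system process takes no effort at all, and who needs a vendor whitelist?

Summary

Process keep-alive is a double-edged sword. Extending an app's lifetime costs the user's battery, memory, CPU and other resources, and even the user's patience. Developers must weigh the trade-off and not keep alive for the sake of keep-alive. If keep-alive is truly necessary, prefer legitimate methods; don't turn the user's phone into a brick and then complain about the backlash.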

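References:

Exploring resident-in-memory (keep-alive) implementations for apps on Android 6.0 and above: the competing-for-favor part

Exploring resident-in-memory (keep-alive) implementations for apps on Android 6.0 and above: the resurrection part

Exploring a new kind of dual-process guarding for app keep-alive

The strongest Android keep-alive idea ever: an in-depth analysis of Tencent TIM's process immortality technique

Once an app has system privileges, can it really do whatever it wants?

「深蓝洞察」: The most "unforgivable" vulnerability of 2022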