愿历尽千帆 归来仍少年

一个底层内存分配异常导致无法进入launcher问题分析

字数统计: 3,186阅读时长: 16 min
2018/05/31

问题平台
Android go

复现概率
always

问题现象
解锁后,Launcher图标未能加载出,按虚拟back,home键,下拉状态栏响应异常缓慢

前言
Android Go是一个简化版本的Android O(及以上),能够在超低端的Android手机上流畅运行,具体量化就是RAM仅为512MB1GB的机型

初步分析
首次合入MTK提供的Android Go代码后开机,解锁后Launcher图标没有加载出,鉴于该平台代码是芯片商已经调试稳定后释放,Launcher本身应用出问题的概率比较小,操作发现下拉状态栏异常卡顿

第一反应是会不会是内存很紧张呢? 有可能的哦,ok, 下面开始分析日志

Step1:日志分析
命令调mtk日志应用, 没调出来?ok….这一点比较异常, 后面会解释原因,既然完整的mtk日志抓取不了,下面通过adb logcat命令抓取

logcat日志
日志中开头部分就看到如下异常,根据日志可以知道对应的操作是用命令调mtk日志应用,正是我上面的操作,报的异常说我们尝试去调用的对象已经挂了, ok…大写的懵逼

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
05-21 05:43:31.691 I/ActivityManager(  619): Start proc 5691:com.mediatek.mtklogger/u0a46 for activity com.mediatek.mtklogger/.MainActivity

05-21 05:43:32.297 W/ActivityManager( 619): Exception when starting activity com.mediatek.mtklogger/.MainActivity

05-21 05:43:32.297 W/ActivityManager( 619): android.os.DeadObjectException

05-21 05:43:32.297 W/ActivityManager( 619): at android.os.BinderProxy.transactNative(Native Method)

05-21 05:43:32.297 W/ActivityManager( 619): at android.os.BinderProxy.transact(Binder.java:764)

05-21 05:43:32.297 W/ActivityManager( 619): at android.app.IApplicationThread$Stub$Proxy.scheduleLaunchActivity(IApplicationThread.java:1232)

05-21 05:43:32.297 W/ActivityManager( 619): at com.android.server.am.ActivityStackSupervisor.realStartActivityLocked(ActivityStackSupervisor.java:1457)

05-21 05:43:32.297 W/ActivityManager( 619): at com.android.server.am.ActivityStackSupervisor.startSpecificActivityLocked(ActivityStackSupervisor.java:1579)

05-21 05:43:32.297 W/ActivityManager( 619): at com.android.server.am.ActivityStack.makeVisibleAndRestartIfNeeded(ActivityStack.java:2069)

05-21 05:43:32.297 W/ActivityManager( 619): at com.android.server.am.ActivityStack.ensureActivitiesVisibleLocked(ActivityStack.java:1873)

05-21 05:43:32.297 W/ActivityManager( 619): at com.android.server.am.ActivityStackSupervisor.ensureActivitiesVisibleLocked(ActivityStackSupervisor.java:3366)

05-21 05:43:32.297 W/ActivityManager( 619): at com.android.server.am.ActivityStackSupervisor.attachApplicationLocked(ActivityStackSupervisor.java:988)

05-21 05:43:32.297 W/ActivityManager( 619): at com.android.server.am.ActivityManagerService.attachApplicationLocked(ActivityManagerService.java:7276)

05-21 05:43:32.297 W/ActivityManager( 619): at com.android.server.am.ActivityManagerService.attachApplication(ActivityManagerService.java:7343)

05-21 05:43:32.297 W/ActivityManager( 619): at android.app.IActivityManager$Stub.onTransact(IActivityManager.java:291)

05-21 05:43:32.297 W/ActivityManager( 619): at com.android.server.am.ActivityManagerService.onTransact(ActivityManagerService.java:3013)

05-21 05:43:32.297 W/ActivityManager( 619): at android.os.Binder.execTransact(Binder.java:697)

接着往下看

1
05-21 05:43:33.035 I/ActivityManager(  619): Process com.mediatek.mtklogger (pid 5705) has died: fore TOP

mtk日志这个进程挂了,这种情况头一次见,那会不会Launcher中没有加载出图标也是因为进程挂了, ok, 搜索下"launcher"关键字

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
05-21 05:45:31.865 W/ActivityManager(  619): Exception in new application when starting activity com.android.launcher3/.Launcher

05-21 05:45:31.865 W/ActivityManager( 619): android.os.DeadObjectException

05-21 05:45:31.865 W/ActivityManager( 619): at android.os.BinderProxy.transactNative(Native Method)

05-21 05:45:31.865 W/ActivityManager( 619): at android.os.BinderProxy.transact(Binder.java:764)

05-21 05:45:31.865 W/ActivityManager( 619): at android.app.IApplicationThread$Stub$Proxy.scheduleLaunchActivity(IApplicationThread.java:1232)

05-21 05:45:31.865 W/ActivityManager( 619): at com.android.server.am.ActivityStackSupervisor.realStartActivityLocked(ActivityStackSupervisor.java:1457)

05-21 05:45:31.865 W/ActivityManager( 619): at com.android.server.am.ActivityStackSupervisor.attachApplicationLocked(ActivityStackSupervisor.java:974)

05-21 05:45:31.865 W/ActivityManager( 619): at com.android.server.am.ActivityManagerService.attachApplicationLocked(ActivityManagerService.java:7276)

05-21 05:45:31.865 W/ActivityManager( 619): at com.android.server.am.ActivityManagerService.attachApplication(ActivityManagerService.java:7343)

05-21 05:45:31.865 W/ActivityManager( 619): at android.app.IActivityManager$Stub.onTransact(IActivityManager.java:291)

05-21 05:45:31.865 W/ActivityManager( 619): at com.android.server.am.ActivityManagerService.onTransact(ActivityManagerService.java:3013)

05-21 05:45:31.865 W/ActivityManager( 619): at android.os.Binder.execTransact(Binder.java:697)

05-21 05:45:31.874 E/ActivityManager( 619): Exception thrown launching activities in ProcessRecord{922c6e6 5794:com.android.launcher3/u0a12}

05-21 05:45:31.874 E/ActivityManager( 619): android.os.DeadObjectException

05-21 05:45:31.874 E/ActivityManager( 619): at android.os.BinderProxy.transactNative(Native Method)

05-21 05:45:31.874 E/ActivityManager( 619): at android.os.BinderProxy.transact(Binder.java:764)

05-21 05:45:31.874 E/ActivityManager( 619): at android.app.IApplicationThread$Stub$Proxy.scheduleLaunchActivity(IApplicationThread.java:1232)

05-21 05:45:31.874 E/ActivityManager( 619): at com.android.server.am.ActivityStackSupervisor.realStartActivityLocked(ActivityStackSupervisor.java:1457)

05-21 05:45:31.874 E/ActivityManager( 619): at com.android.server.am.ActivityStackSupervisor.attachApplicationLocked(ActivityStackSupervisor.java:974)

05-21 05:45:31.874 E/ActivityManager( 619): at com.android.server.am.ActivityManagerService.attachApplicationLocked(ActivityManagerService.java:7276)

05-21 05:45:31.874 E/ActivityManager( 619): at com.android.server.am.ActivityManagerService.attachApplication(ActivityManagerService.java:7343)

05-21 05:45:31.874 E/ActivityManager( 619): at android.app.IActivityManager$Stub.onTransact(IActivityManager.java:291)

05-21 05:45:31.874 E/ActivityManager( 619): at com.android.server.am.ActivityManagerService.onTransact(ActivityManagerService.java:3013)

05-21 05:45:31.874 E/ActivityManager( 619): at android.os.Binder.execTransact(Binder.java:697)

05-21 05:45:31.886 I/ActivityManager( 619): Killing 5794:com.android.launcher3/u0a12 (adj 0): error during init

果然Launcher进程也挂了,ok…有意思

通过adb shell cat /proc/meminfo命令看下机器内存情况

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
MemTotal:        162404 kB

MemFree: 5804 kB

MemAvailable: 40740 kB

Buffers: 672 kB

Cached: 34920 kB

SwapCached: 796 kB

Active: 29176 kB

Inactive: 31516 kB

这里的内存MemTotal为什么这么小? 同驱动沟通了下,这个项目是512内存,可是现在打印出来MemTotal只有一百多
这里顺便说下上面几个重点关键字

MemTotal
系统从加电开始到引导完成,firmware/BIOS要保留一些内存,kernel本身要占用一些内存,最后剩下可供kernel支配的内存就是MemTotal。这个值在系统运行期间一般是固定不变的。

MemFree
表示系统尚未使用的内存。[MemTotal-MemFree]就是已被用掉的内存。

MemAvailable
有些应用程序会根据系统的可用内存大小自动调整内存申请的多少,所以需要一个记录当前可用内存数量的统计值,MemFree并不适用,因为MemFree不能代表全部可用的内存,系统中有些内存虽然已被使用但是可以回收的,比如cache/bufferslab都有一部分可以回收,所以这部分可回收的内存加上MemFree才是系统可用的内存,即MemAvailable/proc/meminfo中的MemAvailable是内核使用特定的算法估算出来的,要注意这是一个估计值,并不精确

这里说下cachebuffers的作用,Linux利用空闲的物理内存,划分出一部份空间,作为cachebuffers,目的是为了提高数据访问性能
这里可以看到我们当前的系统可用的内存大约在40M左右,这显然是不够的,现在问题方向指向了内存

下面抓取一份串口日志,看下其中的内存分配

串口日志

如下片段是开机动画完成后

1
2
3
4
5
6
7
8
9
10
11
12
13
[    0.000000] <0>-(0)[0:swapper]Memory: 159860K/454336K available (11264K kernel code, 931K rwdata, 3320K rodata, 1024K init, 2701K bss, 294476K reserved, 0K cma-reserved, 0K highmem)

[ 0.000000] <0>-(0)[0:swapper]Virtual kernel memory layout:

[ 0.000000] <0> vector : 0xffff0000 - 0xffff1000 ( 4 kB)

[ 0.000000] <0> fixmap : 0xffc00000 - 0xfff00000 (3072 kB)

[ 0.000000] <0> vmalloc : 0xe0800000 - 0xff800000 ( 496 MB)

[ 0.000000] <0> lowmem : 0xc0000000 - 0xe0000000 ( 512 MB)

[ 0.000000] <0> pkmap : 0xbfe00000 - 0xc0000000 ( 2 MB)

这里可以看到lowmem的确是512M(lowmem代表linux最终可支配的物理内存大小)

这里有个奇怪的地方是reserved占据了294476K ,将近300M,开机结束也没有变化。

这里对Reserved memory的概念进行一个描述

Linux中保留内存(Reserved memory)是指把系统中的一部分内存保留起来,内核不会为它建立页表,一般的应用程序无法访问到这段内存。在板卡调试、内存测试和设备DAM调试的过程中,可以运用这种办法,先验证系统在只有低端内存的情况下能否顺利启动;

那么reserved占据的这块内存是提供给谁使用的呢? 继续看串口日志,看到modem加载的片段如下

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
//加载modem
[2030] Enter load_modem_image v2.0
[2030] ccci_request_mem: ret:4000000, smem_limit:78000000, align:2000000
[2031] mblock[0].start: 0x40000000, sz: 0x8000, limit: 0x78000000, max_addr: 0x0, target: -1, reserved_addr: 0x3c000000,reserved_size: 0x4000000
[2033] mblock_reserve dbg[0]: 1, 0, 1, 1
[2033] mblock[1].start: 0x43208000, sz: 0xdf8000, limit: 0x78000000, max_addr: 0x0, target: -1, reserved_addr: 0x40000000,reserved_size: 0x4000000
[2035] mblock_reserve dbg[1]: 1, 0, 1, 1
[2035] mblock[2].start: 0x44080000, sz: 0x580000, limit: 0x78000000, max_addr: 0x0, target: -1, reserved_addr: 0x40000000,reserved_size: 0x4000000
[2037] mblock_reserve dbg[2]: 1, 0, 1, 1
[2037] mblock[3].start: 0x44640000, sz: 0x9c0000, limit: 0x78000000, max_addr: 0x0, target: -1, reserved_addr: 0x40000000,reserved_size: 0x4000000
[2039] mblock_reserve dbg[3]: 1, 0, 1, 1
[2039] mblock[4].start: 0x450f8000, sz: 0x1ab58000, limit: 0x78000000, max_addr: 0x0, target: -1, reserved_addr: 0x5a000000,reserved_size: 0x4000000
[2041] mblock_reserve dbg[4]: 1, 1, 1, 1
[2041] mblock[5].start: 0x5fe40000, sz: 0x120000, limit: 0x78000000, max_addr: 0x5fc50000, target: 4, reserved_addr: 0x5a000000,reserved_size: 0x4000000
[2043] mblock_reserve dbg[5]: 1, 0, 1, 1
[2043] mblock[4]: 450f8000, 14f08000 from mblock
mblock[5]: 5e000000, 1c50000 from mblock
[2045] mblock_reserve: 5a000000 - 5e000000 from mblock 4
[2045] mblock_reserve [0].start: 0x40000000, sz: 0x8000
[2046] mblock_reserve [1].start: 0x43208000, sz: 0xdf8000
[2046] mblock_reserve [2].start: 0x44080000, sz: 0x580000
[2047] mblock_reserve [3].start: 0x44640000, sz: 0x9c0000
[2048] mblock_reserve [4].start: 0x450f8000, sz: 0x14f08000
[2048] mblock_reserve [5].start: 0x5e000000, sz: 0x1c50000
[2049] mblock_reserve [6].start: 0x5fe40000, sz: 0x120000
[2050] mblock_reserve-R[0].start: 0x5ff80000, sz: 0x80000 map:1 name:log_store
[2051] mblock_reserve-R[1].start: 0x44600000, sz: 0x40000 map:0 name:atf-reserved
[2051] mblock_reserve-R[2].start: 0x5fe00000, sz: 0x40000 map:0 name:atf-log-reserved
[2052] mblock_reserve-R[3].start: 0x44000000, sz: 0x80000 map:0 name:dtb_kernel_addr_mb
[2053] mblock_reserve-R[4].start: 0x40008000, sz: 0x3200000 map:0 name:kernel_addr_mb
[2054] mblock_reserve-R[5].start: 0x45000000, sz: 0xf8000 map:0 name:ramdisk_addr_mb
[2055] mblock_reserve-R[6].start: 0x5fc50000, sz: 0x1b0000 map:0 name:framebuffer
[2056] mblock_reserve-R[7].start: 0x5ff70000, sz: 0x10000 map:0 name:SPM-reserved
[2057] mblock_reserve-R[8].start: 0x5ff60000, sz: 0x10000 map:0 name:MCUPM-reserved
[2058] mblock_reserve-R[9].start: 0x5a000000, sz: 0x4000000 map:0 name:ccci
[2059] request size: 0x04000000, get start address: 0x5a000000
[2060] Using defalut RAT all in one
[2060] cust_tel_mode_idx val:-1
[2060] [LK_ENV]get_env opt_ps1_rat from area 1
[2061] [LK_ENV]get_env opt_md1_support from area 1
[2061] [LK_ENV]get_env opt_md3_support from area 1
[2062] [LK_ENV]get_env opt_c2k_lte_mode from area 1
[2063] [LK_ENV]get_env opt_lte_support from area 1
[2063] [LK_ENV]get_env opt_c2k_support from area 1
[2064] default rat:LF_LT_W_T_G[0x3b][0x0]
[2064] -- MD1 --
[2064] mblock[0].start: 0x40000000, sz: 0x8000, limit: 0x78000000, max_addr: 0x0, target: -1, reserved_addr: 0x30000000,reserved_size: 0x10000000
[2066] mblock_reserve dbg[0]: 1, 0, 1, 1
[2066] mblock[1].start: 0x43208000, sz: 0xdf8000, limit: 0x78000000, max_addr: 0x0, target: -1, reserved_addr: 0x34000000,reserved_size: 0x10000000
[2068] mblock_reserve dbg[1]: 1, 0, 1, 1
[2069] mblock[2].start: 0x44080000, sz: 0x580000, limit: 0x78000000, max_addr: 0x0, target: -1, reserved_addr: 0x34000000,reserved_size: 0x10000000
[2070] mblock_reserve dbg[2]: 1, 0, 1, 1
[2071] mblock[3].start: 0x44640000, sz: 0x9c0000, limit: 0x78000000, max_addr: 0x0, target: -1, reserved_addr: 0x34000000,reserved_size: 0x10000000
[2072] mblock_reserve dbg[3]: 1, 0, 1, 1
[2073] mblock[4].start: 0x450f8000, sz: 0x14f08000, limit: 0x78000000, max_addr: 0x0, target: -1, reserved_addr: 0x4a000000,reserved_size: 0x10000000
[2074] mblock_reserve dbg[4]: 1, 1, 1, 1
[2075] mblock[5].start: 0x5e000000, sz: 0x1c50000, limit: 0x78000000, max_addr: 0x5a000000, target: 4, reserved_addr: 0x4e000000,reserved_size: 0x10000000
[2076] mblock_reserve dbg[5]: 1, 0, 1, 1
[2077] mblock[6].start: 0x5fe40000, sz: 0x120000, limit: 0x78000000, max_addr: 0x5a000000, target: 4, reserved_addr: 0x4e000000,reserved_size: 0x10000000
[2079] mblock_reserve dbg[6]: 1, 0, 1, 1
[2079] mblock_reserve: 4a000000 - 5a000000 from mblock 4
[2080] mblock_reserve [0].start: 0x40000000, sz: 0x8000
[2080] mblock_reserve [1].start: 0x43208000, sz: 0xdf8000
[2081] mblock_reserve [2].start: 0x44080000, sz: 0x580000
[2082] mblock_reserve [3].start: 0x44640000, sz: 0x9c0000
[2082] mblock_reserve [4].start: 0x450f8000, sz: 0x4f08000
[2083] mblock_reserve [5].start: 0x5e000000, sz: 0x1c50000
[2084] mblock_reserve [6].start: 0x5fe40000, sz: 0x120000
[2084] mblock_reserve-R[0].start: 0x5ff80000, sz: 0x80000 map:1 name:log_store
[2085] mblock_reserve-R[1].start: 0x44600000, sz: 0x40000 map:0 name:atf-reserved
[2086] mblock_reserve-R[2].start: 0x5fe00000, sz: 0x40000 map:0 name:atf-log-reserved
[2087] mblock_reserve-R[3].start: 0x44000000, sz: 0x80000 map:0 name:dtb_kernel_addr_mb
[2088] mblock_reserve-R[4].start: 0x40008000, sz: 0x3200000 map:0 name:kernel_addr_mb
[2089] mblock_reserve-R[5].start: 0x45000000, sz: 0xf8000 map:0 name:ramdisk_addr_mb
[2090] mblock_reserve-R[6].start: 0x5fc50000, sz: 0x1b0000 map:0 name:framebuffer
[2091] mblock_reserve-R[7].start: 0x5ff70000, sz: 0x10000 map:0 name:SPM-reserved
[2092] mblock_reserve-R[8].start: 0x5ff60000, sz: 0x10000 map:0 name:MCUPM-reserved
[2093] mblock_reserve-R[9].start: 0x5a000000, sz: 0x4000000 map:0 name:ccci
[2093] mblock_reserve-R[10].start: 0x4a000000, sz: 0x10000000 map:1 name:ccci
//此处modem申请了268左右的内存
[2094] request size: 0x10000000, get start address: 0x4a000000

该问题同驱动同事沟通了下,驱动同事发现modem占据的内存没有释放掉,因为modem加载过程中会request一块比较大的内存,如上述日志中描述大概268的内存,正常加载完modem后会释放掉这块内存

这个问题中的软件版本modem因为存在异常没有合入但是内存依旧申请了,导致加载没有完成直接返回,后面modem占据的内存释放的流程没有走,这块内存就一直占据着,这也就是我们上面开始看的串口日志中reserved占用比较大的原因,reserved占据的内存在开机过程中是变化的,后面驱动同事修复modem异常并合入modem后,机器内存分配正常,进入launcher加载图标正常

该问题很早就定位到内存问题,排查到modem问题耗时较久,ok,问题分析到此

Have a good day!


本文写于住所,上传于公司,午休时间,电脑前,雨停了,等外卖
CATALOG