2014-07-29

How to get patch files between two branches under the same repo?

It's easy: run repo forall -c '<command>' to execute a given shell command in each git project.

For example, I want to know how many patches there are between master-product-qc-4.3 (Ara mainline) and r18521.1 (pure QCOM release without any Ara changes). It can be done with the steps below.

1. Check your repo branches to avoid typos;

$ git --git-dir=.repo/manifests/.git branch -a
* default
remotes/m/master-product-qc-4.3 -> origin/master-product-qc-4.3
remotes/origin/int-qc/r1031.1
remotes/origin/int-qc/r18521.1
remotes/origin/int-qc/r18522.2
remotes/origin/master
remotes/origin/master-product-qc-4.3

2. Put the shell command below in a script and run it (you can use the attached script file);

repo forall -c '\
repo_url=$(echo $REPO_PROJECT | sed -e "s:\/:_:g") && \
reportsdir=~/repo_branch_diff && \
range_f=remotes/origin/int-qc/r18521.1 && \
range_t=remotes/origin/master-product-qc-4.3 && \
mkdir -p $reportsdir/$repo_url && \
git diff --name-only $range_f..$range_t > $reportsdir/$repo_url/$repo_url.filediff && \
git diff -b $range_f..$range_t > $reportsdir/$repo_url/$repo_url.thediff && \
[ -s $reportsdir/$repo_url/$repo_url.thediff ] && \
git format-patch $range_f..$range_t -o $reportsdir/$repo_url && \
echo "from: " >> $reportsdir/$repo_url/git_branch_revisions.txt && \
git show-ref $range_f >> $reportsdir/$repo_url/git_branch_revisions.txt && \
echo "to: " >> $reportsdir/$repo_url/git_branch_revisions.txt && \
git show-ref $range_t >> $reportsdir/$repo_url/git_branch_revisions.txt'

Before running the above command, check 2 things:
- Specify the two branches you want to diff. One is specified by 'range_f' (from); the other by 'range_t' (to). In my case, 'from' is remotes/origin/int-qc/r18521.1 and 'to' is remotes/origin/master-product-qc-4.3;
- Specify a path with write permission for saving the diff results (in my case, the path is ~/repo_branch_diff).

Generally, the above command runs 'diff' for each repo project (each repo project is a single git repository). Each repo project's results go into a folder named after that project, so patch files from different projects land in different folders. This helps us see how many patches there are for a single repo project. Here's the result in my case:

$ ls -lh
drwxr-sr-x 2 m7yang psw_easha 24K Feb 15 10:59 aol1_device_nokia
drwxr-sr-x 2 m7yang psw_easha 4.0K Feb 15 11:29 aol1_device_qcom_common
drwxr-sr-x 2 m7yang psw_easha 4.0K Feb 15 10:59 aol1_kernel_lk
drwxr-sr-x 2 m7yang psw_easha 4.0K Feb 15 11:00 aol1_kernel_msm
drwxr-sr-x 2 m7yang psw_easha 4.0K Feb 15 10:59 aol1_platform_bionic
drwxr-sr-x 2 m7yang psw_easha 4.0K Feb 15 10:59 aol1_platform_bootable_recovery
drwxr-sr-x 2 m7yang psw_easha 4.0K Feb 20 15:52 aol1_platform_build
drwxr-sr-x 2 m7yang psw_easha 4.0K Feb 15 10:59 aol1_platform_dalvik
drwxr-sr-x 2 m7yang psw_easha 4.0K Feb 15 11:00 aol1_platform_external_chromium
drwxr-sr-x 2 m7yang psw_easha 4.0K Feb 15 11:00 aol1_platform_external_dhcpcd
drwxr-sr-x 2 m7yang psw_easha 4.0K Feb 15 11:00 aol1_platform_external_ebtables
drwxr-sr-x 2 m7yang psw_easha 4.0K Feb 15 11:00 aol1_platform_external_icu4c
drwxr-sr-x 2 m7yang psw_easha 4.0K Feb 15 11:00 aol1_platform_external_iptables
drwxr-sr-x 2 m7yang psw_easha 4.0K Feb 15 11:00 aol1_platform_external_jhead
drwxr-sr-x 2 m7yang psw_easha 4.0K Feb 15 11:00 aol1_platform_external_libsepol
drwxr-sr-x 2 m7yang psw_easha 4.0K Feb 15 11:00 aol1_platform_external_llvm
drwxr-sr-x 2 m7yang psw_easha 4.0K Feb 15 11:00 aol1_platform_external_mp4parser
drwxr-sr-x 2 m7yang psw_easha 4.0K Feb 15 11:00 aol1_platform_external_oprofile
drwxr-sr-x 2 m7yang psw_easha 4.0K Feb 15 11:00 aol1_platform_external_skia
drwxr-sr-x 2 m7yang psw_easha 4.0K Feb 15 11:00 aol1_platform_external_srec
drwxr-sr-x 2 m7yang psw_easha 4.0K Feb 15 11:00 aol1_platform_external_svox
drwxr-sr-x 2 m7yang psw_easha 4.0K Feb 15 11:00 aol1_platform_external_tinyxml

Each folder contains the git patch files (.patch) and the 3 files below:
- A .filediff file. Open it to see which files differ between the two branches;
- A .thediff file, one big diff that holds all differences in a single file. It's useful when you want to use 'patch' to apply all the changes somewhere;
- git_branch_revisions.txt, which keeps the two SHA-1s for the 'from' branch and the 'to' branch.
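
To answer the "how many patches per project" question from the result folders, a minimal Python sketch could count the .patch files in each subfolder. The function names are made up for illustration, and the reports directory is assumed to be laid out as described above (one subfolder per repo project).

```python
from pathlib import Path


def count_patches(reports_dir):
    """Return {project_folder_name: number of .patch files} for each subfolder."""
    counts = {}
    for project_dir in sorted(Path(reports_dir).expanduser().iterdir()):
        if project_dir.is_dir():
            counts[project_dir.name] = len(list(project_dir.glob("*.patch")))
    return counts


def print_summary(reports_dir):
    """Print one line per project folder, then the grand total."""
    counts = count_patches(reports_dir)
    for name, n in counts.items():
        print(f"{n:5d}  {name}")
    print(f"{sum(counts.values()):5d}  total")
```

For the setup in this note, print_summary("~/repo_branch_diff") would list every project folder with its patch count.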

-- MingYang - 29 Jul 2014

 

 

About VSS, RSS, PSS and USS

  • VSS - Virtual Set Size
  • RSS - Resident Set Size
  • PSS - Proportional Set Size
  • USS - Unique Set Size

Android has a tool called procrank (/system/xbin/procrank), which lists out the memory usage of Linux processes in order from highest to lowest usage. The sizes reported per process are VSS, RSS, PSS, and USS.

For the sake of simplicity in this description, memory will be expressed in terms of pages, rather than bytes. Linux systems like ours manage memory in 4096 byte pages at the lowest level.

VSS (reported as VSZ from ps) is the total accessible address space of a process. This size also includes memory that may not be resident in RAM, like mallocs that have been allocated but not written to. VSS is of very little use for determining the real memory usage of a process.

RSS is the total memory actually held in RAM for a process. RSS can be misleading, because it reports the total of all the shared libraries that the process uses, even though a shared library is only loaded into memory once regardless of how many processes use it. RSS is not an accurate representation of the memory usage for a single process.

PSS differs from RSS in that it reports the proportional size of its shared libraries, i.e. if three processes all use a shared library that has 30 pages, that library will only contribute 10 pages to the PSS that is reported for each of the three processes. PSS is a very useful number because when the PSS for all processes in the system are summed together, that is a good representation for the total memory usage in the system. When a process is killed, the shared libraries that contributed to its PSS will be proportionally distributed to the PSS totals for the remaining processes still using that library. In this way PSS can be slightly misleading, because when a process is killed, PSS does not accurately represent the memory returned to the overall system.

USS is the total private memory for a process, i.e. that memory that is completely unique to that process. USS is an extremely useful number because it indicates the true incremental cost of running a particular process. When a process is killed, the USS is the total memory that is actually returned to the system. USS is the best number to watch when initially suspicious of memory leaks in a process.
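
The difference between RSS, PSS and USS can be checked with a small Python model. This is only an illustrative sketch: the region names, process ids and private page counts are made up, and only the "30-page library shared by three processes" case comes from the description above.

```python
from collections import defaultdict


def memory_stats(mappings):
    """mappings: list of (region_name, pages, [pids that map the region]).

    Returns {pid: {"rss": ..., "pss": ..., "uss": ...}}, all in pages.
    """
    stats = defaultdict(lambda: {"rss": 0, "pss": 0.0, "uss": 0})
    for _name, pages, pids in mappings:
        for pid in pids:
            stats[pid]["rss"] += pages              # counted in full for every user
            stats[pid]["pss"] += pages / len(pids)  # split evenly among the users
            if len(pids) == 1:
                stats[pid]["uss"] += pages          # private to this process
    return dict(stats)


# Three processes share a 30-page library; private sizes are made-up numbers.
mappings = [
    ("libshared.so", 30, [1, 2, 3]),
    ("heap-1", 20, [1]),
    ("heap-2", 5, [2]),
    ("heap-3", 8, [3]),
]
stats = memory_stats(mappings)
for pid in sorted(stats):
    print(pid, stats[pid])
```

In this model the PSS values add up to the 63 pages really in use, while the RSS values add up to more because the library is counted three times. Dropping process 1 from the mappings frees exactly its 20 USS pages, and the library's share in the PSS of the other two rises from 10 to 15 pages each.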

For systems that have Python available, there is also a nice tool called smem that will report memory statistics including all of these categories.

# procrank
PID      Vss      Rss      Pss      Uss cmdline
481   31536K   30936K   14337K    9956K system_server

 

 

Hardware acceleration

For Android, forcing hardware acceleration to render apps means enabling 'Force GPU rendering' in the developer menu. See force_hw_ui in packages/apps/Settings/res/values/strings.xml.

You can control hardware acceleration at the following levels: Application, Activity, Window and View. See http://developer.android.com/guide/topics/graphics/hardware-accel.html#controlling

Enabling 'Force GPU rendering' option in developer menu will set persist.sys.ui.hw to 'true'. File: packages/apps/Settings/src/com/android/settings/DevelopmentSettings.java

       private static final String HARDWARE_UI_PROPERTY = "persist.sys.ui.hw";
       ...
       SystemProperties.set(HARDWARE_UI_PROPERTY, mForceHardwareUi.isChecked() ? "true" : "false");

To enable hardware acceleration, we can either enable the 'Force GPU rendering' option in the developer menu, or the developer can set it explicitly via the setWindowManager API, as in the example below. File: ./frameworks/base/core/java/android/service/dreams/Dream.java

     mWindow.setWindowManager(null, windowToken, "dream", true);
  

If setWindowManager is called without the last input parameter (hardwareAccelerated), the default is false. File: ./frameworks/base/core/java/android/view/Window.java

     public void setWindowManager(WindowManager wm, IBinder appToken, String appName) {
         setWindowManager(wm, appToken, appName, false);
     } 

File: ./frameworks/base/core/java/android/view/Window.java

    public void setWindowManager(WindowManager wm, IBinder appToken, String appName,
            boolean hardwareAccelerated) {
        ...
        mWindowManager = new LocalWindowManager(wm, hardwareAccelerated);
    }

    private class LocalWindowManager extends WindowManagerImpl.CompatModeWrapper {
        private static final String PROPERTY_HARDWARE_UI = "persist.sys.ui.hw";

        private final boolean mHardwareAccelerated;

        LocalWindowManager(WindowManager wm, boolean hardwareAccelerated) {
            ...
            // Accelerated if the caller asked for it OR 'Force GPU rendering' is on.
            mHardwareAccelerated = hardwareAccelerated ||
                    SystemProperties.getBoolean(PROPERTY_HARDWARE_UI, false);
        }

        public boolean isHardwareAccelerated() {
            return mHardwareAccelerated;
        }
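
The decision made in the LocalWindowManager constructor boils down to a logical OR. The Python sketch below models it with a plain dict standing in for the system properties; it is a simplification that only treats the string "true" as enabled (the value the Settings app writes), not a port of SystemProperties.getBoolean.

```python
def is_hardware_accelerated(requested, system_properties):
    """Model of: hardwareAccelerated || getBoolean("persist.sys.ui.hw", false).

    requested: what the caller passed to setWindowManager (False if omitted).
    system_properties: dict standing in for Android system properties.
    """
    forced = system_properties.get("persist.sys.ui.hw", "false") == "true"
    return requested or forced


# Truth table: the window is accelerated if either side says so.
for requested in (False, True):
    for prop in ("false", "true"):
        result = is_hardware_accelerated(requested, {"persist.sys.ui.hw": prop})
        print(f"requested={requested!s:5} persist.sys.ui.hw={prop:5} -> {result}")
```

The only combination that leaves a window unaccelerated is a caller that did not request acceleration while 'Force GPU rendering' is off.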

An app can force hardware acceleration to true (no matter whether the developer menu's 'Force GPU rendering' option is on or off), or make it depend on that option. Here are two examples from the Normandy source:

     
File: packages/apps/Launcher2/AndroidManifest.xml
    <application
        ...
        android:hardwareAccelerated="@bool/config_hardwareAccelerated"

File: packages/apps/Gallery2/AndroidManifest.xml
    <application android:icon="@mipmap/ic_launcher_gallery" android:label="@string/app_name"
            ...
            android:hardwareAccelerated="true"

 

 

ANR

 

1. What's ANR

An ANR happens when some long operation takes place in the "main" thread. This is the event loop thread, and if it is busy, Android cannot process any further GUI events in the application, and thus throws up an ANR dialog. See http://developer.android.com/training/articles/perf-anr.html

2. ANR output format


ANR output displays thread information in the following order:
- DVM mutexes (only for the ‘main’ thread)
- Thread’s info
- Thread stack

2.1. DVM mutexes format

 "(mutexes: tll=%x tsl=%x tscl=%x ghl=%x hwl=%x hwll=%x)", where
        tll  - thread list lock,
        tsl  - thread suspend lock,
        tscl - thread suspend count lock,
        ghl  - GC heap lock,
        hwl  - heap worker lock,
        hwll - heap worker list lock

2.2. Thread’s info format

Thread’s info format - first line

"%name %priority %tid %status", where
        name     - thread name,
        priority - thread priority,
        tid      - thread id,
        status   - thread status

Thread’s info format - second line

" group=%s sCount=%d dsCount=%d obj=%p self=%p", where
        group   - group name,
        sCount  - suspend count,
        dsCount - debug suspend count,
        obj     - java.lang.Thread object that we are associated with,
        self    - self reference

Thread’s info format - third line

" sysTid=%d nice=%d sched=%d/%d cgrp=%s handle=%d", where
        sysTid - Linux thread id,
        nice   - Linux "nice" priority (lower numbers indicate higher priority),
        sched  - scheduling policy/priority,
        cgrp   - scheduling group (cgroup) name,
        handle - thread handle

Stack trace

The stack trace contains the function call stack at the moment the ‘dumpstate’ command was called. If the thread is in the TIMED_WAIT state, it also contains the address of the object the current thread is waiting for.

3. Thread status


Thread status can be one of the following values:

ZOMBIE       - terminated thread
RUNNABLE     - runnable or running now
TIMED_WAIT   - timed waiting in Object.wait()
MONITOR      - blocked on a monitor
WAIT         - waiting in Object.wait()
INITIALIZING - allocated, not yet running
STARTING     - started, not yet on thread list
NATIVE       - off in a JNI native method
VMWAIT       - waiting on a VM resource
SUSPENDED    - suspended, usually by GC or debugger
UNKNOWN      - thread is in the undefined state

4. Example. Here's a Normandy ANR error and how I identified it by checking the trace.

1. Check android.server.ServerThread -> search for 'waiting to lock' -> get 'held by ...'. Searching for 'waiting' is more general.

"android.server.ServerThread" prio=5 tid=12 MONITOR
  | group="main" sCount=1 dsCount=0 obj=0x424f1fd8 self=0x4f810008
  | sysTid=583 nice=-2 sched=0/0 cgrp=apps handle=1337892560
  | schedstat=( 0 0 0 ) utm=1196 stm=106 core=0
  at com.android.server.AlarmManagerService$ResultReceiver.onSendFinished(AlarmManagerService.java:~1034)
  - waiting to lock <0x42876cb8> (a java.lang.Object) held by tid=23 (AlarmManager)

2. In this case, search for AlarmManager, based on what we got from 'held by ...' in step 1.

"AlarmManager" prio=5 tid=23 MONITOR
  | group="main" sCount=1 dsCount=0 obj=0x42985c58 self=0x4fd91a78
  | sysTid=639 nice=0 sched=0/0 cgrp=apps handle=1337835624
  | schedstat=( 0 0 0 ) utm=15 stm=5 core=0
  at com.android.server.PowerManagerService.acquireWakeLock(PowerManagerService.java:~911)
  - waiting to lock <0x424feaf8> (a com.android.server.PowerManagerService$LockList) held by tid=9 (Binder_1)

3. Check 'Binder_1' from step 2 to figure out where the lock is taken in the source.

"Binder_1" prio=5 tid=9 NATIVE
  | group="main" sCount=1 dsCount=0 obj=0x424ed8e8 self=0x50bae4b0
  | sysTid=580 nice=0 sched=0/0 cgrp=apps handle=1345522072
  | schedstat=( 0 0 0 ) utm=1745 stm=912 core=0
  at com.android.server.PowerManagerService.nativeSetScreenState(Native Method)
  at com.android.server.PowerManagerService.setScreenStateLocked(PowerManagerService.java:1928)
  at com.android.server.PowerManagerService.setPowerState(PowerManagerService.java:2060)
  at com.android.server.PowerManagerService.setPowerState(PowerManagerService.java:1963)
  at com.android.server.PowerManagerService.acquireWakeLockLocked(PowerManagerService.java:1063)
  at com.android.server.PowerManagerService.acquireWakeLock(PowerManagerService.java:912)
  at android.os.IPowerManager$Stub.onTransact(IPowerManager.java:62)
  at android.os.Binder.execTransact(Binder.java:367)
  at dalvik.system.NativeStart.run(Native Method)

Here's PowerManagerService.java:~911

 900     public void acquireWakeLock(int flags, IBinder lock, String tag, WorkSource ws) {
 901         int uid = Binder.getCallingUid();
 902         int pid = Binder.getCallingPid();
 903         if (uid != Process.myUid()) {
 904             mContext.enforceCallingOrSelfPermission(android.Manifest.permission.WAKE_LOCK, null);
 905         }
 906         if (ws != null) {
 907             enforceWakeSourcePermission(uid, pid);
 908         }
 909         long ident = Binder.clearCallingIdentity();
 910         try {
>> 911             synchronized (mLocks)  {             
 912                 acquireWakeLockLocked(flags, lock, uid, pid, tag, ws);
 913             }
 914         } finally {
 915             Binder.restoreCallingIdentity(ident);
 916         }
 917     }
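
Following the 'held by' chain by hand gets tedious in a long traces file, so the search can be sketched in Python. The regular expressions are assumptions based only on the trace lines quoted in this note; other Dalvik versions may format these lines differently. The sample text is a shortened copy of the three threads from the example.

```python
import re

# Thread header, e.g.: "AlarmManager" prio=5 tid=23 MONITOR
THREAD_RE = re.compile(
    r'"(?P<name>[^"]+)" (?:daemon )?prio=\d+ tid=(?P<tid>\d+) (?P<state>\w+)')
# Lock line, e.g.: - waiting to lock <0x...> (a java.lang.Object) held by tid=23 (...)
WAIT_RE = re.compile(
    r'waiting to lock <0x[0-9a-f]+> \(a [^)]+\) held by tid=(?P<holder>\d+)')


def parse_threads(trace_text):
    """Return {tid: {"name", "state", "waits_for"}} for every thread in the text."""
    threads, current = {}, None
    for line in trace_text.splitlines():
        m = THREAD_RE.search(line)
        if m:
            current = {"name": m["name"], "state": m["state"], "waits_for": None}
            threads[int(m["tid"])] = current
            continue
        m = WAIT_RE.search(line)
        if m and current is not None:
            current["waits_for"] = int(m["holder"])
    return threads


def lock_chain(threads, start_tid):
    """Follow 'held by tid=N' links from start_tid; stops at the end or on a cycle."""
    chain, seen, tid = [], set(), start_tid
    while tid in threads and tid not in seen:
        seen.add(tid)
        chain.append(threads[tid]["name"])
        tid = threads[tid]["waits_for"]
    return chain


TRACE = '''
"android.server.ServerThread" prio=5 tid=12 MONITOR
  - waiting to lock <0x42876cb8> (a java.lang.Object) held by tid=23 (AlarmManager)
"AlarmManager" prio=5 tid=23 MONITOR
  - waiting to lock <0x424feaf8> (a com.android.server.PowerManagerService$LockList) held by tid=9 (Binder_1)
"Binder_1" prio=5 tid=9 NATIVE
  at com.android.server.PowerManagerService.nativeSetScreenState(Native Method)
'''
print(" -> ".join(lock_chain(parse_threads(TRACE), 12)))
```

The last thread in the chain is the one to inspect: here it is Binder_1, which sits in a native call while holding the lock taken at line 911.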

 

KSM(Kernel Samepage Merging)

 

KSM is a memory-saving de-duplication feature, enabled by CONFIG_KSM=y, added to the Linux kernel in 2.6.32. When KSM is enabled, a kernel thread, ksmd, runs periodically and scans memory that applications have marked with madvise MADV_MERGEABLE, looking for pages of identical content which can be replaced by a single write-protected page (which is automatically copied if a process later wants to update its content). After scanning pages_to_scan pages (100 by default; it can be changed by writing to /sys/kernel/mm/ksm/pages_to_scan), ksmd sleeps for sleep_millisecs and lets other processes run. Note that ksmd is not kswapd: kswapd is the separate kernel thread that reclaims pages when free memory drops below a threshold.

As you can see, with KSM enabled, ksmd periodically merges duplicate pages, which frees memory and helps reduce how often the system runs out of memory.

How to measure KSM

The effectiveness of KSM and MADV_MERGEABLE is shown in /sys/kernel/mm/ksm/:

pages_shared - how many shared pages are being used

pages_sharing - how many more sites are sharing them i.e. how much saved

pages_unshared - how many pages unique but repeatedly checked for merging

pages_volatile - how many pages changing too fast to be placed in a tree

full_scans - how many times all mergeable areas have been scanned

A high ratio of pages_sharing to pages_shared indicates good sharing, but a high ratio of pages_unshared to pages_sharing indicates wasted effort. pages_volatile embraces several different kinds of activity, but a high proportion there would also indicate poor use of madvise MADV_MERGEABLE.
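
As a rough sketch of how these counters can be turned into the two ratios above, the Python below reads the five files and derives the saved memory. It is not the ksm.txt utility mentioned in this note; the function names are made up, and a 4096-byte page size is assumed.

```python
from pathlib import Path

PAGE_SIZE = 4096  # bytes per page (assumption; matches the page size used in this note)
FIELDS = ("pages_shared", "pages_sharing", "pages_unshared",
          "pages_volatile", "full_scans")


def read_ksm(base="/sys/kernel/mm/ksm"):
    """Read the KSM counters listed above from sysfs into a dict of ints."""
    return {name: int((Path(base) / name).read_text()) for name in FIELDS}


def summarize(ksm):
    """Derive saved memory and the two ratios; a ratio is None if undefined."""
    shared, sharing = ksm["pages_shared"], ksm["pages_sharing"]
    return {
        "saved_bytes": sharing * PAGE_SIZE,
        "sharing_per_shared": sharing / shared if shared else None,
        "unshared_per_sharing": ksm["pages_unshared"] / sharing if sharing else None,
    }
```

On a device, summarize(read_ksm()) gives one sample; calling it in a loop and writing each sample as a row would produce a CSV similar in spirit to the one described here.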

To measure the effectiveness of KSM, we made a utility (see attached ksm.txt) that collects system KSM information (like pages_shared, pages_sharing, etc.) and generates a csv file. Here's an example.

How to enable KSM on Android 4.1-4.3
------------------------------------------------------
1. Add CONFIG_KSM=y to the kernel config, for example kernel/arch/arm/configs/xxx_defconfig.
Enabling KSM in the kernel introduces a new kernel thread, ksmd, which runs periodically to merge identical pages. After scanning pages_to_scan pages, ksmd sleeps for sleep_millisecs to let other processes run. Its parameters can be set in the next step.

2. Set KSM parameters in init.rc.

File: system/core/rootdir/init.rc
on post-fs
...
# Configure and enable KSM
write /sys/kernel/mm/ksm/pages_to_scan 100
write /sys/kernel/mm/ksm/sleep_millisecs 500
write /sys/kernel/mm/ksm/run 1
...

pages_to_scan - Number of pages ksmd should scan in one batch
sleep_millisecs - Milliseconds ksmd should sleep between batches
run - Whether KSM is running (1) or stopped (0)
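
To get a feel for what pages_to_scan and sleep_millisecs mean together, the scan rate can be estimated with a short Python calculation. This is an upper bound that ignores the time ksmd spends on the scan itself, and it assumes 4096-byte pages.

```python
PAGE_SIZE = 4096  # bytes per page (assumption)


def max_scan_rate(pages_to_scan, sleep_millisecs, page_size=PAGE_SIZE):
    """Return (pages per second, bytes per second), ignoring the scan time itself."""
    pages_per_sec = pages_to_scan * 1000 / sleep_millisecs
    return pages_per_sec, pages_per_sec * page_size


# Values from the init.rc fragment above.
pages, nbytes = max_scan_rate(100, 500)
print(f"{pages:.0f} pages/s, about {nbytes / 1024:.0f} KB/s")
```

With the values from init.rc this is at most 200 pages, or 800 KB, scanned per second, which is why long-running devices are the interesting test case.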

3. To measure/debug KSM, I brought the ksminfo utility to Android 4.1 (no time to port procrank and librank yet). You can get it by applying ksminfo.patch.

4. To test KSM, Google recommends looking at long-running devices (several hours) and seeing whether KSM makes any noticeable improvement in launch times and rendering times. So we made a utility that shows KSM info on screen and can be enabled/disabled from the developer menu. Please apply showksm.diff to get it.

5. When done, /sys/kernel/mm/ksm/run shows whether KSM is running:
root@aosp:/ # cat /sys/kernel/mm/ksm/run
1

How to use ADB to call N/A* application?

Here's an example: create an alarm clock set to 3:30, with the alarm name m7yang.

$ adb shell am start -a android.intent.action.SET_ALARM --ei "android.intent.extra.alarm.HOUR" 3 --ei "android.intent.extra.alarm.MINUTES" 30 --es "android.intent.extra.alarm.MESSAGE" m7yang

If you don't want the UI to pop up, add the SKIP_UI parameter at the end of the command, as below.

$ adb shell am start -a android.intent.action.SET_ALARM --ei "android.intent.extra.alarm.HOUR" 3 --ei "android.intent.extra.alarm.MINUTES" 30 --es "android.intent.extra.alarm.MESSAGE" m7yang --ez "android.intent.extra.alarm.SKIP_UI" true
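
When alarms need to be created from a test script, the same command line can be assembled in Python. The helper names below are hypothetical; the action and extras are exactly the ones used in the commands above, and adb is assumed to be on the PATH.

```python
import subprocess


def set_alarm_command(hour, minutes, message, skip_ui=False):
    """Build the adb argument list for the SET_ALARM intent shown above."""
    cmd = [
        "adb", "shell", "am", "start",
        "-a", "android.intent.action.SET_ALARM",
        "--ei", "android.intent.extra.alarm.HOUR", str(hour),
        "--ei", "android.intent.extra.alarm.MINUTES", str(minutes),
        # A message containing spaces would need extra quoting for the device shell.
        "--es", "android.intent.extra.alarm.MESSAGE", message,
    ]
    if skip_ui:
        cmd += ["--ez", "android.intent.extra.alarm.SKIP_UI", "true"]
    return cmd


def set_alarm(hour, minutes, message, skip_ui=False):
    """Run the command on the connected device; raises if adb exits with an error."""
    subprocess.run(set_alarm_command(hour, minutes, message, skip_ui), check=True)
```

For example, set_alarm(3, 30, "m7yang", skip_ui=True) corresponds to the second command above.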

 

Disabling JIT for saving RAM on low RAM Android device

System-wide JIT memory usage is dependent on the number of applications running and the code footprint of those applications. The JIT establishes a maximum translated code cache size and touches the pages within it as needed. JIT costs somewhere between 3M and 6M across a typical running system.

The large apps tend to max out the code cache fairly quickly (which by default has been 1M). On average, JIT cache usage runs somewhere between 100K and 200K bytes per app. Reducing the max size of the cache can help somewhat with memory usage, but if set too low will send the JIT into a thrashing mode. For the really low-memory devices, we recommend the JIT be disabled entirely.

To see how much memory the JIT code cache consumes, I checked the Normandy dalvik source and have the comments below:

1. The default code cache size is 1.5M.

File: dalvik/vm/Globals.h
#define DEFAULT_CODE_CACHE_SIZE 0xffffffff
 
File: dalvik/vm/compiler/codegen/arm/armv7-a-neon/ArchVariant.cpp
/* Architecture-specific initializations and checks go here */
bool dvmCompilerArchVariantInit(void)
{
   ...
   gDvmJit.threshold = 40;
   if (gDvmJit.codeCacheSize == DEFAULT_CODE_CACHE_SIZE) {
        /* No size configured: fall back to 1500K (about 1.5M) */
        gDvmJit.codeCacheSize = 1500 * 1024;
   } else if ((gDvmJit.codeCacheSize == 0) && (gDvm.executionMode == kExecutionModeJit)) {
        /* A size of 0 disables the JIT: run the fast interpreter instead */
        gDvm.executionMode = kExecutionModeInterpFast;
   }


2. Dalvik creates anonymous shared memory with that size

File: dalvik/vm/compiler/Compiler.cpp
 
bool dvmCompilerSetupCodeCache(void)
{
    int fd;
    /* Allocate the code cache */
    fd = ashmem_create_region("dalvik-jit-code-cache", gDvmJit.codeCacheSize);
    ...
    gDvmJit.codeCache = mmap(NULL, gDvmJit.codeCacheSize, PROT_READ | PROT_WRITE | PROT_EXEC, MAP_PRIVATE , fd, 0);


That means each VM can consume up to 1.5M of memory for its code cache. Moreover, the Dalvik system consumes 2M for system-wide usage. So 38 Java processes -> 38 VMs -> 38*1.5+2=59M.

Since disabling the JIT drops Java performance a lot, at least according to benchmark results, I prefer reducing the code cache size instead of disabling it entirely. I'm still testing 512K per VM, and I will test 256K next week or maybe this weekend. In the 512K case, I didn't see any significant performance drop when launching or running Java apps such as games, but I need more testing.
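
The worst-case arithmetic is easy to replay for the smaller cache sizes under test. The Python sketch below assumes, as in the estimate above, that every VM fills its whole code cache and that the 2M system-wide part stays the same whatever the per-VM size is; neither assumption has been measured.

```python
def jit_worst_case_mb(num_vms, cache_mb_per_vm, system_wide_mb=2.0):
    """Worst case in MB: every VM fills its code cache, plus the system-wide part."""
    return num_vms * cache_mb_per_vm + system_wide_mb


# 38 VMs with the default 1.5M cache, then the 512K and 256K candidates.
for cache_mb in (1.5, 0.5, 0.25):
    total = jit_worst_case_mb(38, cache_mb)
    print(f"{cache_mb:>5} MB per VM -> {total:.1f} MB total")
```

Under these assumptions, going from 1.5M to 512K per VM lowers the worst case from 59M to 21M, and 256K lowers it to 11.5M.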

JIT can be disabled by adding the following line to the product makefile (e.g. build/target/product/core.mk):
PRODUCT_PROPERTY_OVERRIDES += dalvik.vm.jit.codecachesize=0

But setting 'dalvik.vm.jit.codecachesize=0' alone does not actually disable the JIT, because 4.1's AndroidRuntime::startVm method does not even read that property.

I checked 4.4 Dalvik, basically below 3 commits introduced JIT disabling feature.
- "Process new system property for max JIT cache size" (SHA-1 b63de6de026b8ebe0b7d7b7f188afc30fff42411)
To allow low-memory devices to reduce (or eliminate entirely) the RAM used by the JIT, dalvikvm has a new command-line option to set the max size of the JIT's translation cache. In this CL, we pass that new option based on a system property.

- "JIT tuning; set cache size on command line" (SHA-1 bbbe552a31f7229708bfc748480ce538218ae076)
The tuning knobs for triggering trace compilation for the JIT had not been revisited for several years. In that time, the working set of some applications have significantly increased, leading to frequent cache overlows & flushes. This CL adds the ability to set the maximum size of the JIT's cache on the command line, and we expect to use different settings depending on device configuration (rule of thumb: 1K for each 1M for system RAM, with 2M limit). Additionally, the trace compilation trigger has been tightened to limit the compilation of cold traces.

- "Suppress warning if JIT disabled" (SHA-1 b6ffb72838cc4a8f60028c21ed740c5f48c89c80)
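
The rule of thumb quoted in the second commit message (1K of cache for each 1M of system RAM, with a 2M limit) can be written down as a one-line Python function. This only restates the rule; it is not how Android itself picks the value, and the function name is made up.

```python
def suggested_code_cache_bytes(system_ram_mb):
    """1K of JIT code cache per 1M of system RAM, capped at 2M."""
    return min(system_ram_mb * 1024, 2 * 1024 * 1024)


for ram_mb in (256, 512, 1024, 4096):
    size_kb = suggested_code_cache_bytes(ram_mb) // 1024
    print(f"{ram_mb:5d} MB RAM -> {size_kb} KB code cache")
```

By this rule a 256M device would get a 256K cache and a 512M device a 512K cache.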

Setting "dalvik.vm.execution-mode=int:fast" seems to affect only the execution mode of the VM. I suspect that the execution mode does not affect the memory used by the JIT.

File: frameworks/base/core/jni/AndroidRuntime.cpp
property_get("dalvik.vm.execution-mode", propBuf, "");
if (strcmp(propBuf, "int:portable") == 0) {
    executionMode = kEMIntPortable;
} else if (strcmp(propBuf, "int:fast") == 0) {
    executionMode = kEMIntFast;
#if defined(WITH_JIT)
} else if (strcmp(propBuf, "int:jit") == 0) {
    executionMode = kEMJitCompiler;