Thursday, January 28, 2016

v4.4.0-vrq2 released

v4.4.0-vrq2 has been released. The all-in-one patch file can be downloaded here. Both the bitbucket and github repositories have been updated with the linux-4.4.y-vrq branch and the v4.4.0-vrq2 tag.

What's new:
This release mainly focuses on the startup/shutdown and suspend/resume issues in previous releases.
Some feature commits, such as the removal of the SMT_NICE code, are being held back.

BR Alfred


Story about the startup/shutdown and suspend/resume issues

Several issues were combined in the previous release.
1. dmesg shows a delay of about 1 second in the kernel log while the system is booting up.
2. The machine fails to reboot/shut down.
3. The machine fails to resume from suspend.

The causes are complicated. The main one is that I had removed some code paths that reschedule a cpu/rq after a task is put into the global run queue. The second may be a circular deadlock in mainline: I caught it in dmesg twice during my 200+ suspend/resume tests, and reducing the task caching time-out seems to help the resume success rate.
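To show why the missing reschedule matters, here is a minimal user-space sketch. It is a toy model, not code from the VRQ patch, and every name in it (toy_sched, toy_enqueue_and_kick and so on) is made up for illustration. The idea is that queuing a task is not enough on its own: some CPU also has to be told to run the scheduler again, otherwise the task just sits in the queue.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_CPUS 4
#define TOY_GRQ_MAX 16

/* Toy model only: none of these names come from the VRQ patch. */
struct toy_sched {
    int    grq[TOY_GRQ_MAX];            /* task ids waiting in the global run queue */
    size_t grq_len;
    bool   cpu_idle[TOY_NR_CPUS];       /* CPUs with nothing to run */
    bool   need_resched[TOY_NR_CPUS];   /* CPUs already asked to reschedule */
};

/* Put a task into the global run queue; returns false if the queue is full. */
bool toy_grq_enqueue(struct toy_sched *s, int task_id)
{
    if (s->grq_len >= TOY_GRQ_MAX)
        return false;
    s->grq[s->grq_len++] = task_id;
    return true;
}

/* Flag the first idle CPU that is not already flagged; returns its id or -1. */
int toy_resched_idle_cpu(struct toy_sched *s)
{
    for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
        if (s->cpu_idle[cpu] && !s->need_resched[cpu]) {
            s->need_resched[cpu] = true;
            return cpu;
        }
    }
    return -1;
}

/*
 * Enqueue, then kick a CPU. Dropping the second step is the kind of
 * mistake described above: the task is queued but nobody picks it up.
 * Returns the kicked CPU, or -1 if nothing was kicked (queue full or
 * no idle CPU available).
 */
int toy_enqueue_and_kick(struct toy_sched *s, int task_id)
{
    if (!toy_grq_enqueue(s, task_id))
        return -1;
    return toy_resched_idle_cpu(s);
}
```

With only CPU 2 idle, toy_enqueue_and_kick() flags CPU 2; a second call still queues the task but finds no further CPU to kick and returns -1.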

In this release, besides adding back the code to pump the scheduler, the caching time-out for NORMAL policy tasks has been changed to 3ms and for all rt policy tasks to 0ms (in fact, rt policy tasks are never affected by the caching time-out unless they are changed to NORMAL policy after caching). Issues 1 and 2 are fixed. Issue 3 was tested with 10 suspend/resume cycles in console and 10 in X, so the suspend/resume failure rate should be <5%.
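The time-out values above can be written down as a small lookup. This is only a sketch: the policy tags and function names are hypothetical, and it assumes that "cached" means a task is still tied to its last CPU until the time-out has elapsed since it went off the CPU. The real patch may track this differently.

```c
#include <stdint.h>

/* Hypothetical policy tags for this sketch, not the kernel's own constants. */
enum toy_policy { TOY_NORMAL, TOY_FIFO, TOY_RR };

#define TOY_NSEC_PER_MSEC 1000000ULL

/* Caching time-out per policy as described for this release. */
uint64_t toy_cache_timeout_ns(enum toy_policy policy)
{
    switch (policy) {
    case TOY_NORMAL:
        return 3 * TOY_NSEC_PER_MSEC;   /* 3ms for NORMAL tasks */
    case TOY_FIFO:
    case TOY_RR:
        return 0;                       /* 0ms for rt tasks */
    }
    return 0;   /* other policies are not covered by the post; placeholder */
}

/*
 * Assumption: a task counts as cached while less than the time-out has
 * passed since it was switched out. With a 0ms time-out this is never
 * true, so rt tasks are never treated as cached.
 */
int toy_task_is_cached(enum toy_policy policy, uint64_t now_ns,
                       uint64_t off_cpu_ns)
{
    return now_ns - off_cpu_ns < toy_cache_timeout_ns(policy);
}
```

For example, a NORMAL task switched out 2ms ago is still cached in this model, while one switched out 3ms ago is not.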

Wednesday, January 13, 2016

First BFS/VRQ patch for kernel v4.4

Here is the all-in-one vrq patch for the latest linux kernel v4.4.

What's new:
1) Sync up with upstream scheduler code changes.
2) Remove the original SMT_NICE code in BFS; something new is incoming.
3) Quick path for best_mask_cpu(), which improves performance when workload<100%.
4) Minor refinements.
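The quick path in item 3 can be pictured roughly like this. It is a sketch under my own assumptions, not the function from the patch: the signature is invented, the CPU mask is a plain unsigned long, and the slow path uses index distance as a crude stand-in for a topology-aware search. The point is only that when the preferred CPU is already in the mask (which is likely while some CPUs are idle), the search can return at once.

```c
#include <limits.h>

#define TOY_MASK_BITS ((int)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Hypothetical helper: pick a CPU from `mask`, preferring `prefer`.
 * Returns -1 for an empty mask or an out-of-range `prefer`.
 */
int toy_best_mask_cpu(int prefer, unsigned long mask)
{
    if (prefer < 0 || prefer >= TOY_MASK_BITS || mask == 0)
        return -1;

    /* Quick path: the preferred CPU is in the mask, no search needed. */
    if (mask & (1UL << prefer))
        return prefer;

    /* Slow path: nearest set bit by index distance (lower index wins ties). */
    for (int dist = 1; dist < TOY_MASK_BITS; dist++) {
        int lo = prefer - dist;
        int hi = prefer + dist;

        if (lo >= 0 && (mask & (1UL << lo)))
            return lo;
        if (hi < TOY_MASK_BITS && (mask & (1UL << hi)))
            return hi;
    }
    return -1;  /* not reached when mask is non-zero */
}
```

In this model the quick path costs a single bit test, while the slow path may walk the whole mask.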

I'd like to wait for other patches (BFQ etc.) and do some commit merges before pushing the code to git. Meanwhile, and most importantly, I'd like to hear your feedback about this patch on v4.4 and see whether any adjustment is needed.

Have fun with VRQ in this new kernel release and in 2016.

BR Alfred

Thanks to pf for testing and reporting back. I have updated the code and changed the link to

Please note that the current vrq may fail to reschedule in some rare cases, especially during system boot-up/reboot/shutdown and suspend/resume. I am looking into which code changes introduced the issue.

It looks like there are 2~3 issues in the area I'm hunting in. One is the roughly 1-second boot-up delay shown in dmesg, and the fix is done. Another is the suspend/resume issue: I have bisected and found the commit, and the issue is not related to bfq v7r10. The fix is ready but needs more time to verify, and then I need to check whether any other commits up to the latest one bring the suspend/resume issue back. The third issue is being unable to shut down; hopefully the fix for the second issue also helps with this.

Another heads-up:
Remember the "unplugged io" issue in bfs? Mainline code changes also affect the fix for this issue, so I have removed one condition check from the fix because it can never be true in the current version. In any case, please re-check the "unplugged io" issue on your machines to help verify it.