Tuesday, December 6, 2016

VRQ 0.89f release

Normally I don't do two releases in one day, but there are always exceptions. Here is the VRQ 0.89f release, with just a single commit:

Rewrite best_mask_cpu(). It now uses sched_cpu_affinity_chk_masks, which performs better and avoids additional checks on CPUs that are not SMT-capable but run a kernel with the SMT config enabled.
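
The idea of walking an ordered list of affinity masks can be sketched as below. This is only an illustrative Python model, not the kernel code: the name best_mask_cpu mirrors the text, but the mask layout (one bitmask per topology level, nearest first) and the helper build_chk_masks are assumptions made for the example. On a non-SMT CPU the sibling level adds nothing and is left out when the list is built, so the lookup never spends time on it.

```python
def build_chk_masks(cpu, smt_siblings, core_group, all_cpus):
    """Hypothetical helper: build the ordered check masks for one CPU.

    Each mask is an int bitmask (bit n set means CPU n). Levels that add no
    new CPUs (for example the SMT level on a non-SMT part) are skipped.
    """
    seen = 1 << cpu                      # the CPU itself is checked first
    masks = [seen]
    for level in (smt_siblings, core_group, all_cpus):
        new = level & ~seen              # CPUs first reachable at this level
        if new:
            masks.append(new)
            seen |= new
    return masks


def best_mask_cpu(candidates, chk_masks):
    """Return the nearest CPU in `candidates`, or None if there is none."""
    for mask in chk_masks:
        hit = candidates & mask
        if hit:
            # Lowest set bit = lowest numbered CPU at this level.
            return (hit & -hit).bit_length() - 1
    return None


# CPU 0 on a 4-CPU machine without SMT: the sibling mask is only CPU 0.
masks = build_chk_masks(0, smt_siblings=0b0001, core_group=0b0011, all_cpus=0b1111)
print([bin(m) for m in masks])
print(best_mask_cpu(0b1100, masks))      # only CPUs 2 and 3 are candidates
```

Because empty levels are dropped once at build time, the hot path is just a short loop of AND operations.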

In short, my sanity tests show an improvement across all kinds of workload. Here are the sanity results for VRQ 0.89f, ahead of the next kernel release.

vrq0.89f

>>>>>50% workload
>>>>>round 1
real    5m27.812s
user    10m13.565s
sys     0m39.533s
>>>>>round 2
real    5m27.771s
user    10m13.407s
sys     0m39.521s
>>>>>round 3
real    5m27.834s
user    10m13.448s
sys     0m39.579s
>>>>>100% workload
>>>>>round 1
real    2m54.660s
user    10m30.269s
sys     0m41.142s
>>>>>round 2
real    2m54.602s
user    10m30.652s
sys     0m41.021s
>>>>>300% workload
>>>>>round 1
real    2m57.899s
user    10m40.231s
sys     0m41.864s
>>>>>round 2
real    2m57.682s
user    10m40.238s
sys     0m41.928s
>>>>>round 3
real    2m57.480s
user    10m40.219s
sys     0m41.282s

Enjoy this final VRQ release before the next kernel. :)

Code is available at
https://bitbucket.org/alfredchen/linux-gc/commits/branch/linux-4.8.y-test
and also
https://github.com/cchalpha/linux-gc/commits/linux-4.8.y-test


All-in-one patch is available too.

BR Alfred

Monday, December 5, 2016

VRQ 0.89e release

The 4.9 kernel will be released soon, and VRQ 0.89 will replace the previous version as the official release of the VRQ branch. To allow enough of a hand-off period before 4.9 officially comes out, here comes the VRQ 0.89e release, with just two commits:

1. Fix a hang when switching to the schedutil governor.
2. Use mainline loadavg.c for load avg calculation.
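
For readers unfamiliar with the mainline load average, here is a rough Python sketch of its fixed-point exponential decay step. The constants and the round-up step are written as assumptions about what mainline loadavg.c does and should be checked against the kernel source; the name calc_load is reused for readability, and the simulate loop is purely illustrative.

```python
FSHIFT = 11                 # bits of fractional precision
FIXED_1 = 1 << FSHIFT       # 1.0 in fixed point
EXP_1 = 1884                # decay factor for the 1 minute average
EXP_5 = 2014                # decay factor for the 5 minute average
EXP_15 = 2037               # decay factor for the 15 minute average


def calc_load(load, exp, active):
    """One sampling step: blend the old load with the active task count."""
    newload = load * exp + active * (FIXED_1 - exp)
    if active >= load:
        newload += FIXED_1 - 1          # round up while the load is rising
    return newload // FIXED_1


def simulate(active_tasks, samples, exp=EXP_1):
    """Feed a constant number of active tasks for `samples` steps."""
    load = 0
    active = active_tasks * FIXED_1
    for _ in range(samples):
        load = calc_load(load, exp, active)
    return load / FIXED_1


print(simulate(active_tasks=1, samples=12))    # still ramping up
print(simulate(active_tasks=1, samples=1000))  # settled
```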


Code is available at
https://bitbucket.org/alfredchen/linux-gc/commits/branch/linux-4.8.y-test
and also
https://github.com/cchalpha/linux-gc/commits/linux-4.8.y-test

All-in-one patch is available too.

Enjoy this release, :)


There are just one or maybe two features that existed in previous VRQ releases but are not yet in VRQ 0.89. Once they are done, 0.9 can be officially released; hopefully that will happen this year.

BR Alfred

Thursday, December 1, 2016

VRQ 0.89d release

VRQ 0.89d is now released, with

1. Fix the CPU C-state issue. It is a long-standing bug that was masked by other issues.
2. Don't punish the run queue time slice for RT/ISO and NORMAL policy tasks. The hackbench test shows that sharing time slices between parent and child tasks (enabled in 0.89c) limited the fork boost to one time slice, hence this policy-specific modification.
3. Rewrite task_preemptible_rq(). It is more efficient than the previous version and helps with policy fairness.
4. Remove unneeded code and debug code.

The cpufreq_trigger investigation is still ongoing, and policy fairness is being watched to see if further improvement is needed.

Code is available at
https://bitbucket.org/alfredchen/linux-gc/commits/branch/linux-4.8.y-test
and also
https://github.com/cchalpha/linux-gc/commits/linux-4.8.y-test

All-in-one patch is available too.

Enjoy this release, :)

BR Alfred

PS: if you want to see some sanity test results compared to vrq089a:

089d

>>>>>50% workload
>>>>>round 1
real    5m27.954s
user    10m12.988s
sys     0m40.254s
>>>>>round 2
real    5m27.918s
user    10m13.064s
sys     0m40.219s
>>>>>round 3
real    5m28.132s
user    10m13.435s
sys     0m40.086s
>>>>>100% workload
>>>>>round 1
real    2m54.629s
user    10m30.754s
sys     0m41.447s
>>>>>round 2
real    2m54.776s
user    10m30.643s
sys     0m41.513s
>>>>>round 3
real    2m54.765s
user    10m30.421s
sys     0m41.619s
>>>>>300% workload
>>>>>round 1
real    2m58.007s
user    10m40.934s
sys     0m42.030s
>>>>>round 2
real    2m57.813s
user    10m40.255s
sys     0m42.349s
>>>>>round 3
real    2m58.158s
user    10m40.527s
sys     0m42.589s

089a

>>>>>50% workload
>>>>>round 1
real    5m29.051s
user    10m15.233s
sys     0m40.015s
>>>>>round 2
real    5m28.288s
user    10m13.595s
sys     0m40.065s
>>>>>round 3
real    5m28.229s
user    10m13.232s
sys     0m40.328s
>>>>>100% workload
>>>>>round 1
real    2m55.358s
user    10m32.229s
sys     0m41.553s
>>>>>round 2
real    2m55.629s
user    10m32.527s
sys     0m41.358s
>>>>>round 3
real    2m55.252s
user    10m31.858s
sys     0m41.873s
>>>>>300% workload
>>>>>round 1
real    2m59.998s
user    10m47.413s
sys     0m42.727s
>>>>>round 2
real    3m0.404s
user    10m47.422s
sys     0m43.425s
>>>>>round 3
real    2m59.934s
user    10m47.287s
sys     0m43.103s
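
To put a number on the difference, the real times above can be averaged per workload. The snippet below is a small Python helper for this output format; the function and variable names are illustrative and not part of any VRQ tooling. The figures are copied from the 300% workload rounds of 089d and 089a listed above.

```python
import re


def to_seconds(stamp):
    """Convert a `time` style stamp such as '2m57.899s' to seconds."""
    m = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if m is None:
        raise ValueError(f"unexpected time format: {stamp!r}")
    return int(m.group(1)) * 60 + float(m.group(2))


def mean_real(stamps):
    """Average a list of wall-clock stamps, in seconds."""
    secs = [to_seconds(s) for s in stamps]
    return sum(secs) / len(secs)


vrq_089d_300 = ["2m58.007s", "2m57.813s", "2m58.158s"]
vrq_089a_300 = ["2m59.998s", "3m0.404s", "2m59.934s"]

d = mean_real(vrq_089d_300)
a = mean_real(vrq_089a_300)
print(f"089d: {d:.3f}s  089a: {a:.3f}s  gain: {(a - d) / a * 100:.2f}%")
```

For the 300% workload this works out to a gain of a little over 1% in wall-clock time; the same helper can be pointed at the 50% and 100% rounds.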

Saturday, November 26, 2016

VRQ 0.89c released


VRQ 0.89c is released with

1. Introduce sched_rq_queued_masks, which makes a run queue look for higher-policy queued tasks on other run queues first.
2. Fix behavior that deviated from the design intention when creating new tasks, and fix a hang triggered by enabling this new code.
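
One way to picture sched_rq_queued_masks is as one CPU bitmask per policy level, where a set bit means that CPU's run queue has a task of that policy queued. The Python model below is a simplified illustration under that assumption; the policy ordering, the class and the method names are invented for the example and do not reflect the actual VRQ data layout.

```python
POLICIES = ("RT", "ISO", "NORMAL", "BATCH", "IDLE")   # assumed order, highest first


class QueuedMasks:
    """Hypothetical model: one CPU bitmask per policy level."""

    def __init__(self):
        self.masks = {p: 0 for p in POLICIES}

    def set_queued(self, cpu, policy):
        self.masks[policy] |= 1 << cpu

    def clear_queued(self, cpu, policy):
        self.masks[policy] &= ~(1 << cpu)

    def find_higher(self, this_cpu, local_policy):
        """Look on other run queues for a task above `local_policy`.

        Returns (policy, cpu) for the highest such policy, or None.
        Pass local_policy=None when the local run queue is empty.
        """
        for policy in POLICIES:
            if policy == local_policy:
                break                    # nothing above our own best task
            others = self.masks[policy] & ~(1 << this_cpu)
            if others:
                return policy, (others & -others).bit_length() - 1
        return None


qm = QueuedMasks()
qm.set_queued(0, "NORMAL")
qm.set_queued(2, "ISO")
print(qm.find_higher(0, "NORMAL"))       # CPU 0 checks for higher-policy work elsewhere
```

The point of keeping masks rather than scanning run queues is that the check is a handful of bit operations and touches no remote run queue until a candidate is found.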

I decided on a last-minute pull-back of the planned task policy fairness commit, as it introduced a deadlock scenario, and fixing that deadlock led to other side effects. Task policy fairness may need a different solution.

The cause of the CPU C-state issue has been found: it is the cpufreq_trigger callback. It also seems to be related to the schedutil and Intel cpufreq governor issues. It's time to solve it once and for all.

Both of the above are on the to-do list for 0.89d. Although there are fewer commits in this release, the sanity tests still show a visible improvement across all kinds of workload.

Code is available at
https://bitbucket.org/alfredchen/linux-gc/commits/branch/linux-4.8.y-test
and also
https://github.com/cchalpha/linux-gc/commits/linux-4.8.y-test

All-in-one patch is also available.

Enjoy it! :)

Wednesday, November 16, 2016

VRQ 0.89b released

VRQ 0.89b is released with

1. Fix the low-workload regression compared to the previous VRQ release by not trying to pick tasks from other run queues while the CPU is scaling.
2. Follow CPU affinity order when picking tasks from other run queues.
3. Fix a long-standing wrong run queue scaling value when running on the performance governor or exiting from a dynamic governor.
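
The first two items can be pictured together with the small Python sketch below. It is only a model under stated assumptions: "scaling" is taken to mean that the CPU's frequency is still ramping, and the function name, its arguments and the nearest-first ordering list are hypothetical, not taken from the VRQ code.

```python
def pick_source_rq(this_cpu, queued, affinity_order, scaling):
    """Choose which other run queue to pull a task from.

    queued:         dict mapping cpu -> number of queued tasks
    affinity_order: list of cpus, nearest to this_cpu first (assumed given)
    scaling:        True while this cpu is still scaling (assumed flag)
    """
    if scaling:
        return None                      # item 1: don't pull while scaling
    for cpu in affinity_order:           # item 2: nearest candidates first
        if cpu != this_cpu and queued.get(cpu, 0) > 0:
            return cpu
    return None


queued = {0: 0, 1: 0, 2: 3, 3: 1}
order_for_cpu0 = [1, 2, 3]               # e.g. sibling first, then the rest
print(pick_source_rq(0, queued, order_for_cpu0, scaling=False))
print(pick_source_rq(0, queued, order_for_cpu0, scaling=True))
```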

With all the above changes, this release shows better sanity performance than any previous VRQ release. The next release will focus on task policy fairness.

Code is available at
https://bitbucket.org/alfredchen/linux-gc/commits/branch/linux-4.8.y-test
and also
https://github.com/cchalpha/linux-gc/commits/linux-4.8.y-test

All-in-one patch is also available.

Enjoy it, :)

Thursday, November 10, 2016

VRQ 0.89a released

VRQ 0.89a is released. It mainly fixes the imbalanced CPU usage issue. Now all CPUs can run at 100%, and no issues have been found in daily usage.

There are also other code changes and refinements in the scheduler core, but I'm afraid they are not so visible.

For this release, sanity kernel compilation tests also show a noticeable improvement compared to the previous VRQ release at 100%~300% workload. At 50% workload, a small regression was found on the 4-core system. A debug patch already brings it back to the same level as the previous VRQ, but I'd like it to be well tested on other systems before officially committing it.

I am happy with the sanity test results. They show that this new release of VRQ is already better than the previous release, and there are several performance improvement ideas not yet implemented.

Have fun with this release and expect the next. :)

BR Alfred

PS: Code is available at
https://bitbucket.org/alfredchen/linux-gc/commits/branch/linux-4.8.y-test
and also
https://github.com/cchalpha/linux-gc/commits/linux-4.8.y-test

Wednesday, November 2, 2016

VRQ 0.89 test branch released


VRQ began as an effort to reduce grq (global run queue) lock access as much as possible. Previous releases of VRQ tried to get rid of grq lock access hot spots and create grq-lock-free code paths. With the recently introduced skip list queue data structure, the grq can be wiped out completely, and that will happen in this and the upcoming releases of VRQ.

The immediate question will be whether it is based on MuQSS by CK. The answer is NO. It actually diverged from an early commit of the 4.8 -vrq branch. The reasons it is not based on MuQSS are
1. Different skip list implementation.
2. There are still many sync-up and feature commits that need to be picked up from the previous -vrq branch.
3. Different rules to be followed and different routine implementations.
4. Code is more controllable when working on a familiar code base.

So here comes this new release of VRQ, with the changes below

* Totally remove the global rq structure. Each per-CPU run queue has its own skip list to hold the running tasks on that run queue, accessed under rq->lock (which already existed in previous versions of VRQ).

* Update task_access_lock/unlock(...) strategy for grq/rq data structure changes.

* Update set_task_cpu() logic and usage, as it has to follow these principal rules: 1. Don't use set_task_cpu() for blocked tasks; let ttwu sort it out. 2. Setting a task's cpu means changing the cpu/rq on which the task resides.

* Update set_cpus_allowed_ptr() logic and usage for when a task is queued or running on the wrong cpu.

* Update the cpu hot-plug API implementation, as tasks now reside on per-CPU run queues instead of a global run queue. This also makes cpu on/off-line and suspend/resume work more reliably.

* Remove unused code such as sticky task, because putting the prev task on the per-rq skip list already gives natural stickiness/cache affinity.

* More to be listed.
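
The shape of the change (no global queue, one ordered queue and one lock per CPU) can be modelled in a few lines of Python. This is a toy sketch under loud assumptions: a sorted list stands in for the skip list, threading.Lock plays the role of rq->lock, and the class name, the migrate helper and its lock ordering by CPU number are all invented for illustration.

```python
import bisect
import threading


class RunQueue:
    """Toy per-CPU run queue: its own ordered queue, its own lock."""

    def __init__(self, cpu):
        self.cpu = cpu
        self.lock = threading.Lock()     # stand-in for rq->lock
        self.queue = []                  # sorted list standing in for the skip list

    def enqueue(self, key, task):
        with self.lock:
            bisect.insort(self.queue, (key, task))

    def pick_next(self):
        with self.lock:
            return self.queue.pop(0)[1] if self.queue else None


def migrate(src, dst, entry):
    """Move a queued (key, task) entry to another run queue.

    Both locks are taken in a fixed order (by CPU number here) so that two
    CPUs migrating in opposite directions cannot deadlock.
    """
    if src is dst:
        return
    first, second = sorted((src, dst), key=lambda rq: rq.cpu)
    with first.lock, second.lock:
        src.queue.remove(entry)
        bisect.insort(dst.queue, entry)


rqs = [RunQueue(0), RunQueue(1)]
rqs[0].enqueue(20, "task-b")
rqs[0].enqueue(10, "task-a")
migrate(rqs[0], rqs[1], (20, "task-b"))
print(rqs[0].pick_next(), rqs[1].pick_next())
```

With no shared queue, enqueue and pick_next on different CPUs never contend on the same lock; only migration needs two locks at once.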

It's not done yet, so this is called release 0.89. The major known issue is imbalanced cpu loading: on a two-cpu system, most of the system workload will be on cpu0. On a quad-core system it gets better, but some cores are still not running at 100% cpu usage. This is because very simple versions of __schedule() and TTWU() are used in this release; no fancy features are deployed yet.

Given the above known issue/limitation, this release is not suitable for use in a production environment. It aims to demonstrate the foundational routine changes that adapt to per-CPU run queues and to verify that nothing major in the system breaks. Then in the next release (0.9), the fix for imbalanced cpu loading and scheduler performance improvements can be added.

You are encouraged to test this VRQ release. Don't look at performance; it sucks due to the known limitation. But please look for system misbehavior or suspicious kernel log entries compared to the previous VRQ release.

Code is available at
https://bitbucket.org/alfredchen/linux-gc/commits/branch/linux-4.8.y-test
and also
https://github.com/cchalpha/linux-gc/commits/linux-4.8.y-test

Enjoy it.

BR Alfred