VRQ 0.89d is now released with
1. Fix the CPU C-state issue. It is a long-standing bug that was masked by other issues.
2. Don't punish the run queue time slice for RT/ISO and NORMAL policy tasks. The hackbench test shows that sharing time slices between parent and child tasks (enabled in 089c) limited the fork boost to one time slice, hence this policy-specific modification.
3. Rewrite task_preemptible_rq(). It is more efficient than the previous version and helps with policy fairness.
4. Remove unneeded code and debug code.
The cpufreq_trigger investigation is still ongoing, and policy fairness is being watched to see if further improvement is needed.
Code is available at
https://bitbucket.org/alfredchen/linux-gc/commits/branch/linux-4.8.y-test
and also
https://github.com/cchalpha/linux-gc/commits/linux-4.8.y-test
All-in-one patch is available too.
Enjoy this release, :)
BR Alfred
PS: if you want to see some sanity test results compared to vrq089a:
089d
>>>>>50% workload
>>>>>round 1
real 5m27.954s
user 10m12.988s
sys 0m40.254s
>>>>>round 2
real 5m27.918s
user 10m13.064s
sys 0m40.219s
>>>>>round 3
real 5m28.132s
user 10m13.435s
sys 0m40.086s
>>>>>100% workload
>>>>>round 1
real 2m54.629s
user 10m30.754s
sys 0m41.447s
>>>>>round 2
real 2m54.776s
user 10m30.643s
sys 0m41.513s
>>>>>round 3
real 2m54.765s
user 10m30.421s
sys 0m41.619s
>>>>>300% workload
>>>>>round 1
real 2m58.007s
user 10m40.934s
sys 0m42.030s
>>>>>round 2
real 2m57.813s
user 10m40.255s
sys 0m42.349s
>>>>>round 3
real 2m58.158s
user 10m40.527s
sys 0m42.589s
089a
>>>>>50% workload
>>>>>round 1
real 5m29.051s
user 10m15.233s
sys 0m40.015s
>>>>>round 2
real 5m28.288s
user 10m13.595s
sys 0m40.065s
>>>>>round 3
real 5m28.229s
user 10m13.232s
sys 0m40.328s
>>>>>100% workload
>>>>>round 1
real 2m55.358s
user 10m32.229s
sys 0m41.553s
>>>>>round 2
real 2m55.629s
user 10m32.527s
sys 0m41.358s
>>>>>round 3
real 2m55.252s
user 10m31.858s
sys 0m41.873s
>>>>>300% workload
>>>>>round 1
real 2m59.998s
user 10m47.413s
sys 0m42.727s
>>>>>round 2
real 3m0.404s
user 10m47.422s
sys 0m43.425s
>>>>>round 3
real 2m59.934s
user 10m47.287s
sys 0m43.103s
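To read the numbers above more easily, here is a minimal Python sketch that averages the "real" times of the three rounds per workload and compares 089d against 089a. The function names and the REAL_TIMES layout are only illustrative; the values are copied from the listing above.

```python
import re
from statistics import mean


def parse_time(text):
    """Convert a `time`-style value such as '5m27.954s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognised time value: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# "real" values of rounds 1-3, copied from the results above.
REAL_TIMES = {
    "50%": {
        "089d": ["5m27.954s", "5m27.918s", "5m28.132s"],
        "089a": ["5m29.051s", "5m28.288s", "5m28.229s"],
    },
    "100%": {
        "089d": ["2m54.629s", "2m54.776s", "2m54.765s"],
        "089a": ["2m55.358s", "2m55.629s", "2m55.252s"],
    },
    "300%": {
        "089d": ["2m58.007s", "2m57.813s", "2m58.158s"],
        "089a": ["2m59.998s", "3m0.404s", "2m59.934s"],
    },
}


def compare(real_times):
    """Return {workload: (mean_089d, mean_089a, percent_change)}."""
    summary = {}
    for workload, runs in real_times.items():
        new = mean(parse_time(t) for t in runs["089d"])
        old = mean(parse_time(t) for t in runs["089a"])
        summary[workload] = (new, old, (new - old) / old * 100.0)
    return summary


if __name__ == "__main__":
    for workload, (new, old, change) in compare(REAL_TIMES).items():
        print(f"{workload:>4} workload: 089d {new:.3f}s, "
              f"089a {old:.3f}s ({change:+.2f}%)")
```

With these values, 089d comes out slightly faster at every workload level, and the gap is largest (a bit over 1%) at 300% workload.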
@Alfred:
Your CPU core work balancing is still out of range. The first CPU gets too much work.
BR, Manuel Krause
@Manuel
It's kind of a design intention in the current VRQ to make tasks stick to the run queue they reside on and avoid the overhead of switching CPUs. For example, if 2 tasks require CPU, #0 using 100% and #1 using 50%, then in VRQ it's normal to see #0 occupy one CPU at 100% usage, while #1 just shows up on another CPU at about 50% usage.
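The 100%/50% example can be put into a toy Python model. This is only an illustration of what a per-CPU monitor would show, not VRQ code: both function names are hypothetical, and the "evenly balanced" case simply assumes the work is spread perfectly across CPUs.

```python
def sticky_placement(demands, n_cpus):
    """Per-CPU usage when every task stays on its own run queue.

    Toy assumption: at most one task per CPU and no migration.
    """
    if len(demands) > n_cpus:
        raise ValueError("toy model handles at most one task per CPU")
    usage = [0.0] * n_cpus
    for cpu, demand in enumerate(demands):
        usage[cpu] = demand
    return usage


def evenly_balanced(demands, n_cpus):
    """Per-CPU usage if the same work were spread equally over all CPUs."""
    share = sum(demands) / n_cpus
    return [share] * n_cpus


if __name__ == "__main__":
    demands = [100.0, 50.0]  # task #0 and task #1 from the example
    print("sticky  :", sticky_placement(demands, 2))
    print("balanced:", evenly_balanced(demands, 2))
```

The total is the same in both cases; only the per-CPU picture differs, which is why a monitor can show uneven cores without any work being lost.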
For your case, please capture a top output when you observe the imbalance, so I can check whether it is normal or not.
BR Alfred
@Alfred:
I'm not familiar with the special top options that would give you the most relevant information, so please give me some short advice on this, and I'll send it by email soon.
Some more info on the scenario: Flash player has a video loaded. When stopped, both cores do ~10% of us+sy load equally, and the rest is nice load (wcg). When playing the video, the 1st core makes ~60% us+sy and the 2nd core ~30%, with the rest up to 100% on each being wcg. This is what gkrellm shows. The sum divided by 2 gives the same value as observed on the most recently tested MuQSS kernel (where gkrellm shows equalised load on the 2 cores). So, the VRQ scheduler is working really well; only the different load distribution is still confusing me (as in earlier times^^).
And I didn't face any problems with this VRQ during ~23h of uptime (I have not tested previous ones for kernel 4.8), so great work coming from your hands. Thank you!
BR, Manuel Krause
@Manuel
Getting the top output should be as easy as opening a terminal, executing the "top" command, pressing "1" to show each CPU's usage instead of the sum, then using the mouse to select the text, pasting it and saving it in another file, :). I just want to check what tasks are running and how many CPUs are occupied.
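As a rough complement to the top capture (not a replacement, since it does not show which tasks are running), the per-CPU split between us+sy and nice load can also be sampled from /proc/stat on Linux. The following Python sketch assumes the usual /proc/stat field order (user, nice, system, idle, iowait, irq, softirq, steal); the function names are made up for illustration.

```python
import time

FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_proc_stat(text):
    """Return {cpu_name: {field: ticks}} for the per-CPU lines (cpu0, cpu1, ...)."""
    cpus = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip non-CPU lines and the aggregate "cpu" line.
        if not parts or not parts[0].startswith("cpu") or parts[0] == "cpu":
            continue
        values = [int(v) for v in parts[1:1 + len(FIELDS)]]
        cpus[parts[0]] = dict(zip(FIELDS, values))
    return cpus


def usage_percent(before, after):
    """Per-CPU percentages between two parsed snapshots."""
    result = {}
    for cpu, new in after.items():
        old = before.get(cpu, {})
        delta = {f: new.get(f, 0) - old.get(f, 0) for f in FIELDS}
        total = sum(delta.values())
        if total <= 0:
            continue  # no ticks elapsed for this CPU
        result[cpu] = {
            "us+sy": 100.0 * (delta["user"] + delta["system"]) / total,
            "nice": 100.0 * delta["nice"] / total,
            "idle": 100.0 * (delta["idle"] + delta["iowait"]) / total,
        }
    return result


def read_proc_stat(path="/proc/stat"):
    with open(path) as handle:
        return handle.read()


if __name__ == "__main__":
    first = parse_proc_stat(read_proc_stat())
    time.sleep(1.0)  # sampling interval, chosen arbitrarily
    second = parse_proc_stat(read_proc_stat())
    for cpu, pct in sorted(usage_percent(first, second).items()):
        print(f"{cpu}: us+sy {pct['us+sy']:.1f}%  "
              f"nice {pct['nice']:.1f}%  idle {pct['idle']:.1f}%")
```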
Hi Alfred,
I'm sorry to be so late. I emailed you.
BR, Manuel Krause
I have checked your "top" output. It looks normal, as it should be: firefox uses ~50% CPU and sticks "mostly" to cpu0, while another task (a firefox plugin) takes ~20% CPU and sticks "mostly" to cpu1.
BR Alfred
Hi. Here are the benchmark results for VRQ0.89d:
http://openbenchmarking.org/result/1612032-LO-CFSVSVRQ833
Acpi-cpufreq + ondemand is still used for VRQ, whereas intel-pstate is used on CFS.
The standard deviations are quite high with VRQ on sqlite and john the ripper.
That may be of interest to you.
As for the incomplete results with VRQ0.89b and VRQ0.89c, I had to interrupt the tests. It was not a problem with your scheduler.
Pedro
Thanks for the benchmark. I'll take a look at the regression on sqlite and john the ripper. john the ripper should be a good starting point, as there are 3 samples from 089b to 089d.
BR Alfred
@Pedro
I have tested johntheripper on three machines with kernels from vrq089d back to vrq089b, and the results of all three kernels on all three machines are almost the same. So I wonder whether it may be kernel config related. Would you please provide the kernel config the benchmark is using, so I can check with it?