BMQ 0.98 is released with the following changes:
1. Minor comment and code readability changes.
2. BMQ now sets a default RLIMIT_NICE, which lets normal-privilege users set nice levels down to -10.
3. Introduce __bmq_find_first_bit()/__bmq_find_next_bit() for rq->queue.bitmap, to avoid zero checking.
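The post does not show how the new helpers avoid the zero check. One common way to make such a check unnecessary is to guarantee the bitmap is never empty, for example by keeping one bit permanently set. The Python sketch below only illustrates that general idea; NUM_LEVELS, SENTINEL and both function names are hypothetical and are not taken from the BMQ source.

```python
NUM_LEVELS = 8                    # hypothetical number of priority levels
SENTINEL = 1 << (NUM_LEVELS - 1)  # hypothetical bit that is always set


def find_first_bit_checked(bitmap: int) -> int:
    """Generic version: must handle an empty bitmap explicitly."""
    if bitmap == 0:
        return NUM_LEVELS  # "not found" marker
    return (bitmap & -bitmap).bit_length() - 1


def find_first_bit_unchecked(bitmap: int) -> int:
    """Valid only when the caller guarantees bitmap != 0.

    bitmap & -bitmap isolates the lowest set bit; bit_length() - 1
    turns that single bit into its index.
    """
    return (bitmap & -bitmap).bit_length() - 1


# With the sentinel always present, the unchecked variant is safe.
queue_bitmap = SENTINEL | (1 << 3)
print(find_first_bit_unchecked(queue_bitmap))

queue_bitmap &= ~(1 << 3)  # level 3 becomes empty, sentinel remains
print(find_first_bit_unchecked(queue_bitmap))
```

The saving is one branch per lookup, which only matters because the lookup sits on a hot scheduler path.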
In this release, there is just one minor improvement (#3) to BMQ itself, but #2 improves the user experience. Normal-privilege users can now promote interactivity-focused tasks to nice levels as low as -10. For example, I like to run firefox with "nice --5 firefox" so that heavy nice level 0 tasks in the terminal won't impact it. Usually, normal-privilege users can't set a nice level lower than 0, but it is a nice-to-have privilege when using BMQ.
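For reference, Linux expresses RLIMIT_NICE as a value from 1 to 40, and the lowest nice level a process may request is 20 minus that value (see getrlimit(2)). A floor of -10 therefore corresponds to an RLIMIT_NICE of 30. The sketch below shows that conversion and reads the limit of the current process; the helper names are made up for illustration, and the value 30 is inferred from the -10 stated above rather than read from the BMQ patch.

```python
import resource


def lowest_nice_allowed(rlimit_nice: int) -> int:
    """Map an RLIMIT_NICE value to the lowest nice level it permits.

    getrlimit(2) defines the ceiling as 20 - rlim_cur; the result is
    clamped to the valid nice range of -20..19.
    """
    return max(-20, min(19, 20 - rlimit_nice))


def current_nice_floor():
    """Return the nice floor of this process, or None if unavailable."""
    limit_id = getattr(resource, "RLIMIT_NICE", None)  # Linux-only constant
    if limit_id is None:
        return None
    soft, _hard = resource.getrlimit(limit_id)
    if soft == resource.RLIM_INFINITY:
        return -20
    return lowest_nice_allowed(soft)


print("RLIMIT_NICE 30 allows nice down to", lowest_nice_allowed(30))
print("nice floor for this process:", current_nice_floor())
```

On a stock kernel the second line usually reports that negative nice levels are not available to a normal user, which is the restriction change #2 relaxes.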
To explain the use case above further: since BMQ applies a +/-4 auto nice level adjustment, firefox and the normal nice level 0 tasks will still overlap in scheduler level some of the time, but the result turns out very good. If no overlap at all is required, nice level -10 can be used for firefox.
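The overlap argument can be checked with a little interval arithmetic. This is a simplified model, assuming a task's effective level is its nice value plus an adjustment anywhere in the range -4 to +4; the function names are illustrative and the real scheduler logic is more involved.

```python
ADJUST = 4  # the +/-4 auto adjustment mentioned in the post


def level_range(nice: int, adjust: int = ADJUST) -> tuple:
    """Range of effective levels a task at this nice value can reach."""
    return (nice - adjust, nice + adjust)


def overlap(a: tuple, b: tuple):
    """Return the shared (low, high) range of two ranges, or None."""
    low, high = max(a[0], b[0]), min(a[1], b[1])
    return (low, high) if low <= high else None


normal = level_range(0)
print("nice -5 vs nice 0:", overlap(level_range(-5), normal))
print("nice -10 vs nice 0:", overlap(level_range(-10), normal))
```

Under this model nice -5 still shares part of its range with nice 0, while nice -10 shares none, which matches the two cases described above.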
Enjoy BMQ 0.98 for the v5.2 kernel. :)
The full kernel tree repository can be found at https://gitlab.com/alfredchen/linux-bmq
An all-in-one patch can also be found on GitLab.
Please report bugs at https://gitlab.com/alfredchen/bmq/issues.
Up and running on multiple machines, no problems so far.
Runs well, but I think 0.97 was faster.
There is just #2 minor change to BMQ itself; it shouldn't make any human-noticeable difference. :)
This comment has been removed by the author.
Here is the debug patch to enable CGROUP_CPUACCOUNTING. Please give it a test; it's not guaranteed to work.
https://gitlab.com/alfredchen/bmq/blob/master/5.2/enabled_CGROUP_CPUACCOUNTING.patch
Thanks for the response. Yes, it works :)
I noticed that cpuacct is also disabled in other CPU schedulers, like MuQSS, so I figured there was a reason for it and decided to delete my post above xD.
I'll continue to use BMQ with this patch and see how it works over the next few days.
My test with a Ryzen 5 2600, kernel compilation:
# bmq0.98 500hz NO_HZ_FULL
real 5m14,579s
user 33m45,957s
sys 3m48,909s
# muqss0.193-smt 100hz NO_HZ_FULL
real 4m13,836s
user 41m25,789s
sys 2m29,432s
# muqss0.193-mc 100hz NO_HZ_FULL
real 4m13,132s
user 42m56,970s
sys 2m38,041s
Interesting. On my test platforms, there is no sign of regression in the sanity tests from release to release, nor in comparison with the mainline CFS scheduler.
So, do other Ryzen users have similar issues?
And would you please post the output of "dmesg | grep -i bmq" so I can check the CPU topology setup for BMQ?
[ 0.167088] bmq: BMQ CPU Scheduler 0.98 by Alfred Chen.
[ 0.461691] bmq: cpu #0 affinity check mask - smt 0x00000040
[ 0.461691] bmq: cpu #0 affinity check mask - coregroup 0x000001c6
[ 0.461691] bmq: cpu #0 affinity check mask - core 0x00000e38
[ 0.461691] bmq: cpu #1 affinity check mask - smt 0x00000080
[ 0.461691] bmq: cpu #1 affinity check mask - coregroup 0x000001c5
[ 0.461691] bmq: cpu #1 affinity check mask - core 0x00000e38
[ 0.461691] bmq: cpu #2 affinity check mask - smt 0x00000100
[ 0.461691] bmq: cpu #2 affinity check mask - coregroup 0x000001c3
[ 0.461691] bmq: cpu #2 affinity check mask - core 0x00000e38
[ 0.461691] bmq: cpu #3 affinity check mask - smt 0x00000200
[ 0.461691] bmq: cpu #3 affinity check mask - coregroup 0x00000e30
[ 0.461691] bmq: cpu #3 affinity check mask - core 0x000001c7
[ 0.461691] bmq: cpu #4 affinity check mask - smt 0x00000400
[ 0.461691] bmq: cpu #4 affinity check mask - coregroup 0x00000e28
[ 0.461691] bmq: cpu #4 affinity check mask - core 0x000001c7
[ 0.461691] bmq: cpu #5 affinity check mask - smt 0x00000800
[ 0.461691] bmq: cpu #5 affinity check mask - coregroup 0x00000e18
[ 0.461691] bmq: cpu #5 affinity check mask - core 0x000001c7
[ 0.461691] bmq: cpu #6 affinity check mask - smt 0x00000001
[ 0.461691] bmq: cpu #6 affinity check mask - coregroup 0x00000187
[ 0.461691] bmq: cpu #6 affinity check mask - core 0x00000e38
[ 0.461691] bmq: cpu #7 affinity check mask - smt 0x00000002
[ 0.461691] bmq: cpu #7 affinity check mask - coregroup 0x00000147
[ 0.461691] bmq: cpu #7 affinity check mask - core 0x00000e38
[ 0.461691] bmq: cpu #8 affinity check mask - smt 0x00000004
[ 0.461691] bmq: cpu #8 affinity check mask - coregroup 0x000000c7
[ 0.461691] bmq: cpu #8 affinity check mask - core 0x00000e38
[ 0.461691] bmq: cpu #9 affinity check mask - smt 0x00000008
[ 0.461691] bmq: cpu #9 affinity check mask - coregroup 0x00000c38
[ 0.461691] bmq: cpu #9 affinity check mask - core 0x000001c7
[ 0.461691] bmq: cpu #10 affinity check mask - smt 0x00000010
[ 0.461691] bmq: cpu #10 affinity check mask - coregroup 0x00000a38
[ 0.461691] bmq: cpu #10 affinity check mask - core 0x000001c7
[ 0.461691] bmq: cpu #11 affinity check mask - smt 0x00000020
[ 0.461691] bmq: cpu #11 affinity check mask - coregroup 0x00000638
[ 0.461691] bmq: cpu #11 affinity check mask - core 0x000001c7
And with MuQSS I get higher CPU usage and higher, more stable fps.
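The masks in the log above are easier to compare once they are expanded into CPU numbers. The sketch below assumes each hex mask is a plain bitmap in which bit N stands for CPU N (the usual cpumask convention, but an assumption about this particular log format); the function names and the three-line sample are for illustration only.

```python
import re

# Matches lines such as:
# [ 0.461691] bmq: cpu #0 affinity check mask - smt 0x00000040
LINE_RE = re.compile(
    r"bmq: cpu #(\d+) affinity check mask - (\w+) (0x[0-9a-fA-F]+)"
)


def mask_to_cpus(mask: int) -> list:
    """Expand a bitmap into the list of CPU numbers whose bit is set."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def parse_topology(log: str) -> dict:
    """Return {cpu: {level: [cpus...]}} for every matching log line."""
    topology = {}
    for match in LINE_RE.finditer(log):
        cpu, level = int(match.group(1)), match.group(2)
        mask = int(match.group(3), 16)
        topology.setdefault(cpu, {})[level] = mask_to_cpus(mask)
    return topology


sample = """\
[ 0.461691] bmq: cpu #0 affinity check mask - smt 0x00000040
[ 0.461691] bmq: cpu #0 affinity check mask - coregroup 0x000001c6
[ 0.461691] bmq: cpu #0 affinity check mask - core 0x00000e38
"""

for level, cpus in parse_topology(sample)[0].items():
    print(f"cpu 0 {level}: {cpus}")
```

Read this way, the masks for cpu #0 would name CPU 6 at the smt level, CPUs 1, 2, 6, 7, 8 at the coregroup level, and CPUs 3, 4, 5, 9, 10, 11 at the core level, which looks like two groups of six logical CPUs.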
Please check with mainline CFS. I'll double-check coregroup scheduling in some cases.
# cfs default (not tuned) 500hz NO_HZ_FULL
real 4m1,772s
user 36m55,804s
sys 4m2,401s
Hm, I didn't think it would be faster than MuQSS.
All CPU schedulers were tested under the same conditions.
I guess you pass "-j12" for the kernel compilation, right?
All my machines have just a two-level topology: "smt" and "coregroup". That differs from the Ryzen 5 2600, which has an additional "core" level. My best guess is that BMQ fails to sustain CPU load on the "core" level CPUs, but it's hard to tell from the code.
Please help by reporting a bug at https://gitlab.com/alfredchen/bmq/issues, I'd like to see whether other Ryzen 5 (or similar topology) users report similar issues.
Meanwhile, I will try to reproduce it using qemu with a proper CPU topology setup.
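One rough way to test the "failed to sustain cpu load" guess against the timings posted earlier in this thread is to divide total CPU time (user + sys) by wall-clock time, which gives the average number of busy CPUs during the build. The sketch below does that for the four posted runs; the parsing helper and variable names are illustrative, and the comma decimal separator is taken from the posted output.

```python
import re

# Accepts the shell "time" format from the thread, e.g. "5m14,579s",
# with either a comma or a dot as decimal separator.
TIME_RE = re.compile(r"^(\d+)m(\d+)[,.](\d+)s$")


def parse_time(text: str) -> float:
    """Convert 'XmY,ZZZs' into seconds."""
    match = TIME_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised time value: {text!r}")
    minutes, seconds, fraction = match.groups()
    return int(minutes) * 60 + float(f"{seconds}.{fraction}")


def average_busy_cpus(real: str, user: str, sys: str) -> float:
    """(user + sys) / real: average CPUs kept busy over the run."""
    return (parse_time(user) + parse_time(sys)) / parse_time(real)


# Timings copied from the comments above.
runs = {
    "bmq0.98 500hz": ("5m14,579s", "33m45,957s", "3m48,909s"),
    "muqss0.193-smt 100hz": ("4m13,836s", "41m25,789s", "2m29,432s"),
    "muqss0.193-mc 100hz": ("4m13,132s", "42m56,970s", "2m38,041s"),
    "cfs default 500hz": ("4m1,772s", "36m55,804s", "4m2,401s"),
}

for name, (real, user, sys_time) in runs.items():
    busy = average_busy_cpus(real, user, sys_time)
    print(f"{name:22s} {busy:5.2f} CPUs busy on average")
```

On these numbers the BMQ run averages about 7 busy CPUs against about 10 for the other three, which fits the guess above, though a single run per scheduler is thin evidence.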
>> I guess you pass "-j12" for the kernel compilation, right?
yes
>> Pls help to report a bug at https://gitlab.com/alfredchen/bmq/issues
done: https://gitlab.com/alfredchen/bmq/issues/7
Con Kolivas recommends the whole -ck patchset, not only the MuQSS patch, and it is intended to run with CONFIG_NO_HZ_IDLE as per this comment: http://ck-hack.blogspot.com/2018/12/linux-420-ck1-muqss-version-0185-for.html?showComment=1549056997845#c6415843931930386538 . I was wondering whether you had done the same comparison with those settings?
I have a Ryzen 1700 + OC. I'll try to reproduce the issue on Manjaro, which is full tickless by default, so I can't check tickless idle without recompiling both the stock kernel and my own.
Let's see how much time I have to compile everything for the tests :)
Btw, Ryzen differs from Intel in its so-called core complexes (CCX): 4 cores in each CCX, with multiple CCXs in one Ryzen chip.
The benefit could come from keeping tasks scheduled within one CCX, where there is no penalty; when multiple CCXs communicate, higher latency is involved.
My observation is that CK does not address this issue either, so the queue count is off when running MuQSS on Ryzen, like 17 on 8 cores or 25 on 4 cores, etc. Users report different counts and performance numbers. This somehow doesn't seem right, but I'm just an observer here :)
BR, Eduardo
I have done testing on Manjaro, and there seems to be a regression. I'll post details to GitLab soon.
The thing is, and I don't know the reason, on Ubuntu the 5.2+BMQ+nohz_full kernel builds even faster, in 11 minutes. The same kernel tree with the same config and the same gcc version was tested on Manjaro and Ubuntu, but I'll discard that result for now until I'm sure why.
The other thing is that the 5.2 kernel or BMQ hangs on Ryzen and on i7 as well, more frequently on Ryzen. I haven't figured out whether that's 5.2 or BMQ.
BR, Eduardo