VRQ 0.95b is released with the following changes:
1. Fix UP compilation issue with hrtick_enabled(), thanks jwh7 for reporting.
2. Sync up 4.11 mainline scheduler code changes.
This is mainly a sync-up release for the 4.11 kernel. Please be aware that the issue of freezes some time after suspend/resume still remains. It is confirmed that this also happens with the vanilla kernel, so it's not caused by the VRQ scheduler code. If suspend/resume is important to you, you may need to wait for the fix from mainline.
Enjoy VRQ 0.95b for the v4.11 kernel, :)
Code is available at
https://bitbucket.org/alfredchen/linux-gc/commits/branch/linux-4.11.y-vrq
and also
https://github.com/cchalpha/linux-gc/commits/linux-4.1.y-vrq
All-in-one patch is available too.
BR Alfred
Good news (at least, it seems to be ATM) from me is that VRQ for 4.10 lasted for 19 days for me without hanging.
Thank you very much, Oleksandr, for this patient testing!
A question aside: Have you made use of suspend-to-disk during that period?
BR, Manuel Krause
Nope, I use s2ram only.
>Please be [aware] that the [freeze] some time after suspend/resume issue still remains and it is confirmed that this also happen with vanilla kernel [...]
Perhaps related?:
https://www.spinics.net/lists/intel-gfx/msg127016.html
https://www.spinics.net/lists/intel-gfx/msg127017.html
I'll try this tonight, for my i915 UP netbook.
@jwh:
Thank you for posting those links. Do you happen to have any insight whether only kernel 4.11 is affected or also earlier ones?
I ask because the 4.10 kernel also sometimes fails to properly bring up my system after suspend-to-disk. I'm not sure about the failure rate, but the longer the overall uptime, the higher the failure probability seems to be. It may also be random, or some systemd/KDE component failing when RAM recovery takes too long. I can't tell, as I don't get logs in failure cases.
Thank you for any answer and best regards,
Manuel Krause
@jwh:
In the meantime I've just applied the two patches you mentioned above, to see how far I'd come with kernel 4.10.14 with them (before trying 4.11 later this week).
i915 as my gfx, patched with only one little fuzz in 2nd patch, compiled without failures and now it's up and running well and has passed one s2d & resume cycle under my normal workload so far. Of course, I'd need more uptime to prove it and maybe post more enthusiastic statements later.
Thank you for sharing your knowledge about these patches!
BR, Manuel Krause
@jwh7: Sorry, for always having forgotten to add the "7".
@jwh7 and other i915 driver users:
The aforementioned two patches are of real benefit here on my Intel GM45 integrated-gfx notebook. Still with kernel 4.10.14 + VRQ 0.95 + BFQ v8r11.
ATM only 4 suspend/resume cycles done (without reboot) so far, but no issues.
What's remarkable, too: I no longer get random processes segfaulting some while after resume from disk, which was usual without these patches -- before, I had suspected KDE plasma5 of being faulty.
At the current state of my testing, unless/ until these patches are included into official 4.11, I'd recommend applying them.
Thank you @jwh7 and best regards,
Manuel Krause
Built UP last night, incl those patches; works fine, including a couple suspend/resume cycles. Suspend-to-RAM, as my netbook is limited on its 4GB embedded SSD, but has 2GB RAM. Manuel, aren't you also using i915? Do you get a bunch of kernel warnings in dmesg? I've been getting them since 4.9 or 4.8, on any -vrq, -ck, and vanilla.
Also now running the x64 build as I type...
@jwh7:
Yes, I use the i915 driver as module for GM45 + Core2duo. Recently with 4.10.12 or .13 I was able to see a WARNing shortly after resume from disk but was unable to capture it because the system completely starved at that moment. It was in the i915 driver and IIRC had to do with some memory confusion. I strongly suspect this issue to be the reason for failing resumes from disk in the past 4.10 kernels (where I was completely unable to get or see log messages). In these cases simply the screen/ gfx didn't come up any more.
Let's hope that the two patches cure this. Current uptime and resume behavior is quite promising.
BR, Manuel Krause
Shi*, I need to reboot now and begin testing from 0. The random segfaults remain and killed KDE's most important processes. So the patches don't heal software errors, per se. BR, MK
@Manuel
Do those two patches help with suspend/resume? I haven't had time to try those patches yet. :)
@Alfred:
IMO, I've collected too few consecutive suspend/resume cycles (5) under realistic use so far. So I can only tell that everything has worked well until now with those patches. I'm going to continue testing. Please try them for yourself to check whether they make a difference for your system(s).
BR, Manuel Krause
It's a pity. My sixth suspend-to-disk attempt locked up my system just after initiating it via KDE. Never had that before, earlier it only failed upon resume, as I wrote before.
There was no extreme memory or CPU or I/O or swap load at that moment. Seems like there are more things to fix in suspend-to-disk. :-( This is still with 4.10.14. If one of you has a hint -- thank you for it! I've already read some of the current bugzilla threads in Power Management/Hibernation/Suspend but am not through all of them yet.
At least the two patches don't make things worse, they deliver it to a new stage ;-)
BR, Manuel Krause
ATM I'm so upset, that I'm working on refurbishing my humble old TuxOnIce port from 4.33/for 4.9 according to 4.10 now.
In-kernel suspend-to-disk sucks so much. Regarding speed and reliability at least.
BR, Manuel Krause
[off-topic]
This time my porting job of TOI went utterly wrong. The first step of only fixing wrongly-patching hunks didn't make it resume. The second step of adding Nigel's most recent own commits (upon his newer TOI codebase) made it resume, but resulted in an ugly corruption of my root fs. After resuming well, the system bloated the ext4 root partition up to the max. This was two days ago and I still have no clue where the "bloating" landed. A forced e2fsck and a forth-and-back copy found this fs to be o.k. :-(
Strange side-note: Apparently this root fs is reported as having no journal, although I haven't changed that; I've only upgraded the software through the openSUSE releases.
Yes, yes, with some more knowledge this TOI-mess wouldn't have happened.
BR, Manuel Krause
regarding my [off-topic] from just above:
Just want to inform you about my new findings. It was no FS or disk corruption at all. Today my system recovered 1.1GB from my root FS (correct ext4 for some days now) by some magic coincidence, so that I was able to check the differences to a safety backup taken one week ago. And -- shame on me :-( -- it was only a _succeeded_ crash dump of kwin_X11 in folder /var/lib/systemd/coredump. Earlier, no crash dump succeeded due to low disk space (and I wasn't aware of that location). I had taken countermeasures against coredumps before, but obviously don't have all possible ones in use on openSUSE 42.2/ systemd 228.
My testing of in-kernel suspend-to-disk, still with 4.10 kernel, now at 4.10.16, still didn't get above 5 followups. No. 6 failed again (second row of realistic use test). Remarkable with this kernel: I don't get random segfaults any more.
If any of you has results of in-kernel suspend-to-disk with VRQ (with or without the two i915 related patches above), please let us know. BTW, the MuQSS community hasn't reported issues with it and the 4.11 kernel up to now.
BR, Manuel Krause
@All,
I have done quite a bit of testing on VRQ, and there seems to be a problem.
See, try this: run stress -c 12 on a quad core (non-HT) and play a movie using software decoding (CPU). All other kernels (MuQSS or standard): no frame drops; VRQ: a lot of frame drops. That's 300% load on a Phenom.
I'll post a youtube video soon or I'll share detailed test results here.
I'm just wondering: if simple playback under load is jerky, does that mean that interactivity in general is not that great, but it's so fast that I don't notice it normally? I find VRQ OK, but those numbers make me wonder a little. A lot of the tests I have done are OK, but movie playback at 300% load is not good.
Additionally, I tried compiling a kernel on Ryzen with -j16, all cores 100% used, so it's fine at least in that way and I cannot confirm jwh7's problem.
Anyhow, wait some time and I'll have some detailed results.
Regards, Eduardo
@Eduardo:
Regarding the ugly frame drop thing -- is it happening on 4.11+VRQ for the first time, or already with 4.10 ?
Thank you for clarifying and best regards,
Manuel Krause
@Eduardo
For your tests, try putting the "stress" workload into the IDLE policy using "schedtool -D -e xxxx", and see if it helps with foreground movie playback?
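A minimal sketch of the same idea without schedtool, for Linux only. It assumes that Python's `os.SCHED_IDLE` is the policy that `schedtool -D` selects on a VRQ kernel, which has not been checked against the VRQ source; `run_idle` is a hypothetical helper name.

```python
import os
import subprocess
import sys


def to_idle_policy():
    """Runs in the child just before exec: switch it to SCHED_IDLE."""
    # pid 0 means "the calling process"; SCHED_IDLE takes priority 0.
    os.sched_setscheduler(0, os.SCHED_IDLE, os.sched_param(0))


def run_idle(cmd):
    """Run cmd (a list of arguments) under SCHED_IDLE, return its exit status."""
    return subprocess.call(cmd, preexec_fn=to_idle_policy)


# Harmless demonstration: the child prints the policy number it runs under.
run_idle([sys.executable, "-c",
          "import os; print('policy:', os.sched_getscheduler(0))"])

# For the test discussed here one would call, for example:
# run_idle(["stress", "-c", "12"])
```

The child inherits the policy across exec, so every worker that stress forks should stay in the idle class as well.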
@Alfred,
Tested using the IDLE policy; that worked fine. Raising the load to 10K worked fine as well.
BR, Eduardo
@Manuel,
Maybe I was not commenting in the right topic, sorry for that, but at the time I was using 4.10.13 + VRQ. Will check whether it's still true with Ryzen and 4.11.
But if I understood correctly, this VRQ version has just compilation and sync up fixes.
Br, Eduardo
@Eduardo:
It seems like users here like to switch to the newest topic's thread, even if it doesn't match 100%. As the VRQ community is still manageable, IMO it's the best way to stay up-to-date.
I've still not tested 4.11 and would like to postpone it to late next week to get a longer term test of jwh7's advertised patches for my i915 gfx driver in relation to s-2-d/ resume issues.
BR, Manuel Krause
Btw, I am using 4.11.0 :-)
@Manuel,
I just tested 4.11 + VRQ (100Hz), 4.10.13 + MUQSS (100Hz) and 4.10.13 mainline (250Hz) on Ryzen with the same test: stress + mpv playing an FHD movie (deliberately decoded on the CPU). VRQ starts dropping frames at 40 processes (stress -c 40), MUQSS somewhere around 100, and mainline does not even hiccup at 3000 :) The 3000 was a surprise for me, as I had not tested that high before.
OK, 3000 does not reflect anything remotely close to a real-life scenario, but it probably shows how a scheduler manages processes and assigns resources so that everyone gets a bit of CPU.
Maybe Alfred can comment on whether this test shows anything useful at all.
Looking forward to comments and also suggestions on what to test on Ryzen.
BR, Eduardo
@Eduardo
Let me explain a little bit about your findings.
1st, MuQSS and VRQ use a virtual deadline to decide which task gets the CPU, while the mainline scheduler uses more complicated logic. One part of that logic is that tasks doing I/O work get a higher priority for the CPU. This may explain why the mainline scheduler boosts movie playback over the stress tasks. But it is a heuristic, and it is not always right: just imagine a case where the stress tasks, not the movie playback, are what you really need.
2nd, MuQSS vs VRQ. When a lot of long-running tasks run at the same policy (in your tests, the movie decode tasks and the stress tasks), VRQ is currently not able to migrate the lowest-deadline tasks among CPUs. Two or more movie decode tasks may sit in the same CPU run queue and be unable to run on other CPUs, even if they all have the lowest deadlines among the tasks in that policy. So frame drops will happen.
MuQSS has logic to select the lowest-deadline task among all CPU run queues. But IMO that is an overhead, especially when there are lots of CPUs in the system. To be honest, balancing and migration are a big problem for every design with dedicated per-CPU run queues (mainline, MuQSS and VRQ).
So the workaround in VRQ for this scenario is to put background tasks in the IDLE policy. VRQ can currently balance higher-policy tasks among CPUs, which lets foreground tasks get the CPU time they need.
For a long-term solution, maybe we need another "Brain Fuck", :)
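The difference described above can be shown with a toy model. This is only a sketch under simplifying assumptions, not code from VRQ or MuQSS: the task names and deadline numbers are made up, and `pick_per_cpu` and `pick_global` are hypothetical helpers that each make one scheduling decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    deadline: int  # virtual deadline, smaller means more urgent


def pick_per_cpu(runqueues):
    """Each CPU looks only at its own queue and runs its earliest deadline."""
    return [min(q, key=lambda t: t.deadline).name if q else None
            for q in runqueues]


def pick_global(runqueues):
    """The N earliest deadlines system-wide run on the N CPUs."""
    every_task = sorted((t for q in runqueues for t in q),
                        key=lambda t: t.deadline)
    return [t.name for t in every_task[:len(runqueues)]]


# Two decode tasks ended up on CPU 0; CPU 1 holds only stress tasks.
runqueues = [
    [Task("decode-a", 10), Task("decode-b", 11), Task("stress-0", 50)],
    [Task("stress-1", 40), Task("stress-2", 45)],
]

print(pick_per_cpu(runqueues))  # ['decode-a', 'stress-1']
print(pick_global(runqueues))   # ['decode-a', 'decode-b']
```

In the per-CPU case decode-b waits behind decode-a although its deadline beats every stress task, which is the frame-drop situation; the global pick avoids that at the cost of looking at every queue on each decision.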
@Alfred,
Thanks for the explanation, now it's a tad clearer to me what's going on there. Yesterday I finished the remaining tests on Ryzen (for the moment), so hopefully I'll soon be able to process the results all together and present them.
BR, Eduardo
@Eduardo and whoever owns a Ryzen CPU
Would you please post the "dmesg | grep -i vrq" output of the system, so I can see the topology setup on a Ryzen CPU? Many thanks.
@Alfred,
Here it is.
br, Eduardo
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[0] smt 0x00000002
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[0] coregroup 0x000000fc
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[0] core 0x0000ff00
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[1] smt 0x00000001
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[1] coregroup 0x000000fc
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[1] core 0x0000ff00
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[2] smt 0x00000008
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[2] coregroup 0x000000f3
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[2] core 0x0000ff00
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[3] smt 0x00000004
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[3] coregroup 0x000000f3
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[3] core 0x0000ff00
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[4] smt 0x00000020
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[4] coregroup 0x000000cf
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[4] core 0x0000ff00
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[5] smt 0x00000010
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[5] coregroup 0x000000cf
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[5] core 0x0000ff00
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[6] smt 0x00000080
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[6] coregroup 0x0000003f
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[6] core 0x0000ff00
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[7] smt 0x00000040
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[7] coregroup 0x0000003f
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[7] core 0x0000ff00
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[8] smt 0x00000200
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[8] coregroup 0x0000fc00
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[8] core 0x000000ff
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[9] smt 0x00000100
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[9] coregroup 0x0000fc00
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[9] core 0x000000ff
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[10] smt 0x00000800
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[10] coregroup 0x0000f300
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[10] core 0x000000ff
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[11] smt 0x00000400
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[11] coregroup 0x0000f300
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[11] core 0x000000ff
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[12] smt 0x00002000
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[12] coregroup 0x0000cf00
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[12] core 0x000000ff
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[13] smt 0x00001000
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[13] coregroup 0x0000cf00
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[13] core 0x000000ff
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[14] smt 0x00008000
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[14] coregroup 0x00003f00
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[14] core 0x000000ff
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[15] smt 0x00004000
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[15] coregroup 0x00003f00
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[15] core 0x000000ff
[ 7.323182] BFS enhancement patchset VRQ 0.95b by Alfred Chen.
@Eduardo
Thanks for the dmesg info. I haven't studied the Ryzen CPU topology closely. But from your dmesg, I can see a lovely CPU with 16 SMT threads, organized in 2 core groups; each group has 4 cores and each core has two SMT threads. :)
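To read the masks more easily, here is a small sketch that turns each hex mask from the dmesg output into a list of CPU numbers. `mask_to_cpus` and `parse_topology` are hypothetical helper names, and the interpretation of the three levels in the note below is only a reading of the numbers, not taken from the VRQ source.

```python
import re

LINE_RE = re.compile(
    r"sched_cpu_affinity_chk_masks\[(\d+)\]\s+(\w+)\s+(0x[0-9a-fA-F]+)")


def mask_to_cpus(mask):
    """Return the bit positions (CPU numbers) that are set in mask."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def parse_topology(dmesg_text):
    """Map cpu -> {level name: [cpu numbers]} from 'vrq:' dmesg lines."""
    topology = {}
    for cpu, level, mask in LINE_RE.findall(dmesg_text):
        topology.setdefault(int(cpu), {})[level] = mask_to_cpus(int(mask, 16))
    return topology


# The three lines for CPU 0, copied from the output above.
sample = """\
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[0] smt 0x00000002
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[0] coregroup 0x000000fc
[ 2.481574] vrq: sched_cpu_affinity_chk_masks[0] core 0x0000ff00
"""

for cpu, levels in parse_topology(sample).items():
    for level, cpus in levels.items():
        print(f"cpu{cpu} {level}: {cpus}")
```

For CPU 0 this gives smt = [1], coregroup = [2..7] and core = [8..15]. Read that way, each level seems to list only the CPUs that are newly reachable at that distance: the SMT sibling, then the rest of the same 4-core group, then the other group.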
@Alfred:
I've now switched to your VRQ 0.95b with kernel 4.11.2. It's ugly somehow:
With an idle load like my two WCG clients running, starting a tab-heavy Firefox can take ages compared to 4.10. It seems to be heavily slowed down by your algorithms.
BR, Manuel Krause
@Manuel
As you can tell, there is no new feature in VRQ 0.95b compared to VRQ 0.95 on 4.10, just the sync-up code.
So it only happened while Firefox was loading tabs?
@Alfred:
I have to apologise. Today I've found time to do more careful tests under the most comparable conditions: the same 90 tabs loading until finished, both kernels with VRQ, all tests within the last 30 minutes, run sequentially with reboots between them.
4.10.16 w/o. idle load: 4m23s
4.10.16 with idle load: 4m24s
4.11.2 w/o. idle load: 4m22s
4.11.2 with idle load: 4m13s
So, most likely my last impression was wrong, or I had added (and have now removed) one tab with an extremely slow server response that led to my writing. Anyway, this test is obviously unpredictable -- but luckily it now shows that the newest kernel + VRQ performs as well as the previous one in this test case.
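For a quick comparison, the four timings can be converted to seconds. This is just a sketch of the arithmetic on the numbers posted above; `to_seconds` is a hypothetical helper.

```python
import re


def to_seconds(stamp):
    """Convert a timing written like '4m23s' into seconds."""
    minutes, seconds = re.fullmatch(r"(\d+)m(\d+)s", stamp).groups()
    return int(minutes) * 60 + int(seconds)


runs = {
    ("4.10.16", "without idle load"): "4m23s",
    ("4.10.16", "with idle load"): "4m24s",
    ("4.11.2", "without idle load"): "4m22s",
    ("4.11.2", "with idle load"): "4m13s",
}

seconds = {key: to_seconds(value) for key, value in runs.items()}
spread = max(seconds.values()) - min(seconds.values())

for (kernel, load), secs in seconds.items():
    print(f"{kernel} {load}: {secs} s")
print(f"spread: {spread} s "
      f"({spread / max(seconds.values()):.1%} of the slowest run)")
```

With single runs per configuration, a spread of this size is probably within the noise of network and server response times.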
BR, Manuel Krause
@Manuel or anyone else,
since You talk about rather nasty sudden slowdowns, I'll add mine as well. In the previous blog post (0.95 for 4.10, http://cchalpha.blogspot.com/2017/04/vrq-095-release.html?showComment=1494834080289#c1158054218901428452 ) and even before, I added a comment about my strange sudden situation with CPU frequency. Has anyone ever experienced that? It happens when resuming from S3, I guess, as I put the laptop to sleep rather than hibernate.
That, as far as I have observed, happens only on my Dell Skylake laptop, more often than I wish :) All of a sudden it does not scale until reboot, which leads to a sudden slowdown. Manuel's finding about slowness in FF sorta kinda sounded like it.
Manuel, did You check frequency scaling, actual freq and stuff like that when it happened?
P.S. 4.11 itself (vanilla, MUQSS, VRQ) "flocks up" my integrated intel card monitor resolution switch functionality (really annoying), so I can't use it on laptop, my experience w/ 4.11 will be only on Ryzen I suppose.
BR, Eduardo
@Eduardo:
Please let me know what exact info you'd like to see from my side (propose console commands). It seems that my system is too old or too "well configured" to be affected. As you can read above, the slowness of my FF was a one-shot bad experience only.
BTW, what do you mean with "flocking up"? (words not familiar to me)
BR, Manuel Krause
@Manuel,
since You have a C2D (if I recall correctly), You can only use the acpi-cpufreq driver and cannot use i7z to determine the actual CPU speed, so I guess You can use this: cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq
You can use stress -c 1 as well, to check whether it scales up at that moment.
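The same check as a small script, in case someone wants to log it over time. A sketch only: it assumes the usual cpufreq sysfs layout, where scaling_cur_freq is reported in kHz, and `khz_to_mhz` and `read_cur_freqs` are hypothetical helper names.

```python
import glob
import re

SYSFS_PATTERN = "/sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_cur_freq"


def khz_to_mhz(raw):
    """Convert the text read from scaling_cur_freq (kHz) to MHz."""
    return int(raw.strip()) / 1000.0


def read_cur_freqs(pattern=SYSFS_PATTERN):
    """Return {cpu number: current frequency in MHz}, sorted by CPU."""
    freqs = {}
    for path in glob.glob(pattern):
        cpu = int(re.search(r"cpu(\d+)/cpufreq", path).group(1))
        try:
            with open(path) as fh:
                freqs[cpu] = khz_to_mhz(fh.read())
        except (OSError, ValueError):
            continue  # CPU offline or file unreadable: skip it
    return dict(sorted(freqs.items()))


for cpu, mhz in read_cur_freqs().items():
    print(f"cpu{cpu}: {mhz:.0f} MHz")
```

Running it once at idle and once while stress -c 1 is active should show at least one CPU stepping up if scaling still works.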
My system was affected about 10 times in a couple of years by this CPU frequency thing. Not too often, but when it happens, it's frustratingly slow and I have to reboot to get the speed back.
"Flocking up" is saying the f-word w/o getting censored :)
BR, Eduardo
@Eduardo:
First of all, thank you for the explanation(s). So, hehe, VRQ is based on the BrainFlockedScheduler (TM) :-)))
And yes, your assumptions are all right (C2D, acpi_cpufreq and accessing actual speed). I hope to remember this stuff in future. But I really don't remember having suffered from symptoms like yours on this machine in the ~5 years I've been using it.
BR, Manuel Krause
@Manuel,
Yeah, it's something like that, BrainFlockScheduler sounds about right :)
Since 4.11 flocks up my intel, I'm back to 4.10.17 + VRQ + nohz_full. Let's see how that fares in long term.
@Alfred,
Do You still use some sort of "sticky task" thing? I ask because, at least on Ryzen, I have 8 cores and the max turbo is 3.7GHz with 1 or 2 cores active, while with all 8 active the max is 3.2GHz. If, for instance, a game or any single-threaded app gets shuffled amongst all cores together with all those many little processes, the CPU doesn't really reach max turbo. Or I don't understand that shuffle and max turbo thing right :) Can You please comment on this matter, maybe sched some light? :)
Thanks.
BR, Eduardo
@Manuel
Good to know there is no regression. Less work to put on my todo list, :)
@Eduardo
For Intel CPUs, take a 2-core, 4-thread CPU for example: when only one thread is active, it can boost to the max CPU speed, say 3.4. When only two threads in different cores are active, the CPU can boost to 2.9 (less than 3.4); when all 4 threads are active, it can only boost to 2.6. It's limited by the heat generated by the CPU. I think this also applies to AMD's CPUs.
By the way, for the case of 2 active threads above: if the scheduler places the workload on two threads in different cores, both threads can boost to 2.9, so in total we have 2*2.9=5.8. But if the 2 threads are in the same core, each thread has only about 70%(?) effectiveness, as they share resources with each other. Currently, VRQ isn't aware of SMT siblings and treats them as physical cores; that's why VRQ doesn't do well in benchmarks compared to the mainline scheduler when the workload is below 100% of the number of CPUs. I am currently working on a feature to make VRQ aware of SMT siblings and place workloads smartly. Hopefully it can be done before the next kernel cycle.
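The arithmetic above as a rough sketch. The 70% figure is the guess from the text, and the clock for two threads packed onto one core is an assumption here (the single-active-core boost of 3.4 is used, which the text does not state); `aggregate_throughput` is a hypothetical helper.

```python
def aggregate_throughput(clock_per_thread, threads, smt_efficiency=1.0):
    """Very rough total: threads * clock * per-thread effectiveness."""
    return threads * clock_per_thread * smt_efficiency


# Two threads spread over two different cores, each boosting to 2.9.
spread = aggregate_throughput(2.9, threads=2)

# Two threads packed onto one core. ASSUMPTION: the single busy core
# still reaches the 3.4 boost, and each thread is ~70% effective.
packed = aggregate_throughput(3.4, threads=2, smt_efficiency=0.70)

print(f"spread over two cores: {spread:.2f}")
print(f"packed on one core:    {packed:.2f}")
print(f"spread is ahead by:    {spread / packed - 1:.0%}")
```

Even with the more generous clock for the packed case, spreading the two threads comes out ahead in this model, which fits the point that an SMT-unaware scheduler loses ground when fewer tasks than CPUs are running.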
@Alfred
I'm looking forward to that feature. Thanks for the explanation.
Regards,
Dzon
Mmmmh, so many commits in the queue for 4.11.3 but still not released. And 4.10.17 is tagged as [EOL] so early. :-(
I still patch my 4.11.2 with the ones from above (https://cchalpha.blogspot.com/2017/05/vrq-095b-release.html?showComment=1493904291320#c4422413702620511499), but already had one failing S2RAM. Regarding S2DISK I'd also need to collect more cycles.
While reading through the Hibernation/Suspend Bugzilla (https://bugzilla.kernel.org/buglist.cgi?bug_status=__open__&component=Hibernation%2FSuspend&order=changeddate%20DESC%2Cbug_status%2Cpriority%2Cassigned_to%2Cbug_id&product=Power%20Management&query_format=advanced) I've found one patch snippet at the end of https://bugzilla.kernel.org/show_bug.cgi?id=188281 (only the latest comments at the bottom are worth reading) that may possibly be of benefit for some of you: https://bugzilla.kernel.org/attachment.cgi?id=256709.
It's now in my early testing cycles and at least didn't harm S2RAM and S2DISK on my system so far.
BR, Manuel Krause
Oh, just after writing I realised that 4.11.3 is out. Let's see what it can fix, hopefully.
BR, Manuel Krause
Another "Oh!": Some minutes after KDE was up after a third resume from S2DISK with the proposed additional patch, my 4.11.2 system locked up all of a sudden. So, it is not safe for now. Forget it and forgive me.
BR, Manuel Krause
Do any of you still suffer from the suspend/resume issue that Alfred refers to in his top message?
By coincidence I've got a new finding: While trying to help with kernel bugzilla 188281, I had to switch to CONFIG_PM_DEBUG=y (but with nothing in the submenu enabled). Despite the DEBUG in its name, this setting really sped up resume from hibernation and also seems to add reliability. I haven't experienced any slowdown or side-effect with it.
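To check how a given kernel was built, the option can be looked up in the kernel config text. A minimal sketch: `kconfig_value` is a hypothetical helper and the sample text is made up for illustration; on a real system the config is typically found in /boot/config-<release>, or in /proc/config.gz if the kernel exports it.

```python
import re


def kconfig_value(config_text, symbol):
    """Return the value of symbol ('y', 'm', ...) or None if it is not set."""
    match = re.search(rf"^{re.escape(symbol)}=(.*)$", config_text,
                      re.MULTILINE)
    return match.group(1) if match else None


# Illustrative config fragment, not taken from a real machine.
sample = """\
CONFIG_PM_SLEEP=y
CONFIG_PM_DEBUG=y
# CONFIG_PM_ADVANCED_DEBUG is not set
"""

print("PM_DEBUG:", kconfig_value(sample, "CONFIG_PM_DEBUG"))
print("PM_ADVANCED_DEBUG:", kconfig_value(sample, "CONFIG_PM_ADVANCED_DEBUG"))
```

Options that are switched off appear only as a "# ... is not set" comment, so the lookup returns None for them.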
BR, Manuel Krause
Have you tested any of the debug modes?
https://www.kernel.org/doc/Documentation/power/basic-pm-debugging.txt
@jwh7: Thanks for kidding.
BR, Manuel Krause
@jwh7:
I'm sorry, this wasn't meant to sound as impolite as it could possibly be understood. I just didn't get the link between the information I provided and the question/suggestion you gave. So far, I haven't tested any of the debug modes but will have a look at them over the weekend. Primarily I wanted to inform about a possible "workaround" and I'm still wondering why no one else experiences this (?).
BR, Manuel Krause