Wednesday, February 13, 2019

PDS 0.99m release

PDS 0.99m is released with the following changes:

1. [Sync] f29a8be0e5d2 cpu/hotplug: Fix "SMT disabled by BIOS" detection for KVM

This is a sync-up release for 4.20.8+, and should be the last PDS for 4.20.

Enjoy PDS 0.99m for v4.20 kernel, :)

Code is available at https://gitlab.com/alfredchen/linux-pds
All-in-one patch is available too. 

51 comments:

  1. Strange problems with PDS and Akonadi/kmail

    Hi Alfred,
    have finally found the cause of my akonadi/kmail problem. It's PDS. I never thought this could be possible; I blamed akonadi/kde/kmail all the time.

    Error: akonadi/kmail hangs while synchronizing some of my imap accounts. It gets stuck at 0% for some imap accounts, while others run fine. If I restart akonadi, then other (or sometimes the same) accounts hang at 0% and don't synchronize at all. I was convinced it was akonadi's fault and that some kde update was the cause (to be fair, akonadi and kmail really were buggy some years ago).

    No such error with the standard arch kernel (or the zen kernel). No such error if I only change the scheduler in my kernel .config and go with CFS. Still the same error with PDS, whether I go (full/"half") tickless or not.

    So I can't date exactly when the problem with PDS first appeared, because I never thought of PDS in connection with it. But if I remember correctly, the error first occurred in mid December last year.

    So any suggestion how to find a solution? (And no, I don't want to switch to thunderbird ;) )

    Regards sysitos

    Replies
    1. Tested the 4.20.10 kernel now with MuQSS: no problem with akonadi and kmail. I tried an older PDS version, but there are compile errors with the newest 4.19.23. Maybe I'll find a solution for this compile problem.

      Regards sysitos

    2. Interesting problem Sysitos.
      Could you perhaps do a diff -u between the two configs?
      diff -u 4.20.10_PDS_Not_Working/.config 4.20.10_MuQSS_Working_Fine/.config

      And post the difference, just in case something other than the scheduler is at work here, such as some auto-selected setting that fubars this. (I have had problems with -ck/MuQSS and CONFIG_FORCE_IRQ_THREADING, where I get timeouts/errors when doing a "make -j12" compile.)
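
      As a sketch of that comparison (the file names below are made up for the demo), the noisy "# CONFIG_... is not set" lines can be filtered out like this:

```shell
# Demo configs standing in for the real PDS/MuQSS .config files.
cat > config_pds <<'EOF'
CONFIG_SCHED_PDS=y
CONFIG_UKSM=y
# CONFIG_SCHED_MUQSS is not set
EOF
cat > config_muqss <<'EOF'
CONFIG_SCHED_MUQSS=y
CONFIG_RQ_MC=y
# CONFIG_SCHED_PDS is not set
EOF
# Keep only real option changes, dropping the "# ... is not set" noise.
diff -u config_pds config_muqss | grep -E '^[+-]CONFIG_'
```

      The same grep applies unchanged to the real .config files.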

    3. Hi Sveinar,

      to be exact, I have used neither plain PDS nor plain MuQSS. For PDS I use the pf-kernel; for MuQSS I used your ck-patchset (without the 000_patch-4.20.10.patch, which leads to a compile error after copying over my .config). Maybe I should test PDS in plain form to rule out the other patches as the cause.

      So here is the diff, in shortened form only (mismatches like "# ... is not set" excluded):
      -CONFIG_SCHED_PDS=y
      +CONFIG_SCHED_MUQSS=y
      +CONFIG_RQ_MC=y
      +CONFIG_SHARERQ=2
      +CONFIG_MQ_IOSCHED_BFQ=y
      -CONFIG_UKSM=y

      Regards sysitos

    4. Ok,

      here is the confirmation. Plain linux 4.20.10 with only Alfred's all-in-one patch and PDS activated leads to the hang in akonadi/kmail. At the moment 4 of my 7 imap accounts are stuck at 0% in the synchronizing queue.

      Regards sysitos

    5. Another test,

      I checked out pf-kernel branch pf-18, which gives me kernel 4.18.0-pf11 with PDS-mq CPU Scheduler 0.99a by Alfred Chen, and the akonadi/kmail errors are already present there. At the moment 3 of 7 imap synchronizing tasks hang at 0%.

      Some additional info: Some time ago I had only 2-3 imap accounts in kmail/akonadi, and I didn't identify the synchronizing hang as a permanent problem connected to PDS, but rather took it for a temporary problem with something else.

      Regards sysitos

    6. Ok,

      here are the final results after some more tests:

      runs fine: Linux 4.17.0-pf8 with PDS-mq CPU Scheduler 0.98t by Alfred Chen
      runs fine: Linux 4.18.0-pf6 with PDS-mq CPU Scheduler 0.98y by Alfred Chen

      error starts with: Linux 4.18.0-pf7 with PDS-mq CPU Scheduler 0.99a by Alfred Chen

      So Alfred, time for a new bug hunt ;)

      Regards sysitos

    7. @sysitos:
      Many thanks for elaborating and thoroughly testing your issue!
      I assume that other processes/programs may suffer from the same underlying error, so I hope that your investigations and Alfred's code make PDS better. Hopefully before the next major kernel release... :-)

      Best regards,
      Manuel

      Nothing pops out from that diff, but the "# ... is not set" lines could still be important enough.

      The -ck patchset (or mine) is intended for use with the plain kernel. The 0000-4.20.10.patch IS the 4.20 -> 4.20.10 patch, so without it you are running a plain 4.20.0 kernel.

      I'm not really sure what the biggest difference between 0.98 -> 0.99 is, other than some "SMT_NICE" changes, and possibly something making "SCHED_ISO" not work too well. Can't remember off the top of my head, but I think it was around that 4.18'ish time that I could no longer use SCHED_ISO with PDS when playing under wine. I can with MuQSS, so if you run some scheduler-priority addon, ala the "Gamemode" daemon or similar, that could perhaps explain something?

      I would say the scheduler has very little to do with this particular hang.
      From my experience I would say it's a race condition in that code, which is exposed by a more interactive scheduler. CFS and MuQSS behave differently than PDS, and PDS just happens to reliably trigger the condition.
      I might be wrong :)

      BR, Eduardo

      I don't know if this question is worth asking... @sysitos and all others:
      Do you use "threadirqs"? Either on the kernel command line (like me) or, with MuQSS/ck, as a compiled-in setting? Just wondering if that setting makes a difference.

      TIA,
      Manuel

    11. @Sveinar,

      I downloaded the 4.20.10 kernel source archive and then imported your patches with quilt, so I think I am running 4.20.10 ;)
      The 0.98 to 0.99 step was a big internal code change.

      Regards sysitos

  2. @sysitos
    Thanks for the testing and reporting. I will go back and look at the code changes between 0.98y and 0.99a. But before I dig into the detailed code changes, would you please try the different yield types and see if any of them helps? That's the only thing I can think of: something may hang (be unable to continue running) if it relies on sched_yield().

    And sorry for the late reply here; I was busy with a new scheduler project (it runs well so far on my NUC machine) and the 5.0 sync-up work.
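
    For reference, switching the yield type needs no reboot; a minimal sketch (the /proc path mentioned later in the thread only exists on a PDS kernel, so this is guarded):

```shell
# Show and change the PDS yield type at runtime (values 0, 1 or 2).
# No-op message on kernels without PDS.
if [ -w /proc/sys/kernel/yield_type ]; then
    cat /proc/sys/kernel/yield_type       # current setting
    echo 1 > /proc/sys/kernel/yield_type  # try another one
else
    echo "yield_type not present (not a PDS kernel)"
fi
```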

    Replies
    1. @Alfred, good news on new scheduler, any chance we'll see it soon?
      BR, Eduardo

    2. @Eduardo
      I just got it running on the 5.0-rc kernel code today. The only benchmark data I have so far is the vm kernel image boot-up time, and it is promising (compared to PDS). Some low-priority features still need to be done, and massive benchmarking will kick off with the 5.0 kernel release.

      If all goes well, it will be released sometime during the 5.0 kernel cycle.

      @Alfred, great news. I can offer to help with compilation, gaming and interactivity testing if you feel like sending me the patch via e-mail; otherwise I'll be patiently waiting for the new thing :)
      BR, Eduardo

    4. Hi Alfred

      I tested all the yield values on Linux 4.18.0-pf7 with PDS-mq CPU Scheduler 0.99a, always with the described error. Then I changed rr_interval to 1 and it seemed to work; I tested that with all yields (0,1,2) too. Afterwards I checked rr_interval from 5 down to 2 to verify the results; 2 seemed to work as well. Hoping I had found the solution, I compiled the newest pf-kernel (and overwrote my old 0.99a kernel ;/ ) and set rr_interval=1 on the kernel command line, but no success at all. With the new kernel there is no difference whether I change the rr and/or yield values: always the error :(
      Btw. is there a kernel parameter for yield too, or is it only available via /proc/sys/kernel/yield_type?

      So I think I must also go the other, long way with git bisect ;)

      Regards sysitos
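
      (A persistent setting could presumably also go through sysctl, since /proc/sys/kernel/yield_type would map to the sysctl name kernel.yield_type; the file name below is made up:)

```
# /etc/sysctl.d/99-pds.conf (hypothetical file name)
# Mirrors /proc/sys/kernel/yield_type on a PDS kernel.
kernel.yield_type = 2
```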

    5. @sysitos:
      Don't give up; according to @Alfred's posting from February 19, 2019 at 6:43 PM (as seen below), it's only about 6 actual commits. If you let each one run overnight, then we'll know in about one week. ;-)

      Thanks for your time,
      BR, Manuel

      Not sure what the difference is between the kernel option CONFIG_IRQ_FORCED_THREADING (PDS/CFS) and the MuQSS-added option CONFIG_FORCE_IRQ_THREADING, or whether they are even supposed to be used together.

      I do know that I run with CONFIG_IRQ_FORCED_THREADING=y on both schedulers, but if I enable CONFIG_FORCE_IRQ_THREADING=y with MuQSS, it soon goes tits up under anything that puts load on the kernel at all (compiling or whatnot).

    7. Great news.
      Will it still be called PDS?

      There will be no deadline concept in the new scheduler, so PDS is no longer the proper name for it, :)

    9. OK.
      Do you have any name yet?
      Thanks.

  3. @sysitos
    Good news: there are only 6 actual commits between 0.98y and 0.99a. I don't want to guess which one is the cause at this point; the best way is to use "git bisect" on the pds 4.18 branch (https://gitlab.com/alfredchen/linux-pds/tree/linux-4.18.y-pds) on your side to find out.

    e9140c7fe85f (origin/linux-4.18.y-pds, linux-4.18.y-pds) Tag PDS 0.99a
    bac03ec0a243 pds: Fix task burst fairness issue.
    4f431cdc66a8 pds: Fix sugov_kthread_create fail to set policy.
    7bc1a2c56bee Tag PDS 0.98z
    0f294a28dec1 pds: Re-mapping SCHED_DEADLINE to SCHED_FIFO
    4f4af01d7ac7 pds: Improve idle task SMT_NICE handling in ttwu.
    5a8fbadc4e03 pds: Don't balance on an idle task.
    76919a998aa1 pds: Replace task_queued() by task_on_rq_queued().
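
    The bisect workflow itself looks like this; the scratch repo below just stands in for the linux-pds tree (in the real run you would mark e9140c7fe85f bad and 0fb4e82d39ad good):

```shell
# Build a 6-commit scratch repo to walk through the git bisect commands.
git init -q bisect-demo
git -C bisect-demo config user.email demo@example.com
git -C bisect-demo config user.name demo
for i in 1 2 3 4 5 6; do
    echo "change $i" > bisect-demo/file
    git -C bisect-demo add file
    git -C bisect-demo commit -qm "commit $i"
done
git -C bisect-demo bisect start
git -C bisect-demo bisect bad HEAD     # newest commit shows the hang
git -C bisect-demo bisect good HEAD~5  # oldest commit works fine
# git now checks out a midpoint commit; build/boot/test it, then mark it
# with "git bisect good" or "git bisect bad" until the first bad commit
# is reported.
```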

  4. Hi Alfred,

    so some trillions of tests later ;)
    Here are the results:

    Short:
    error starts with: "Fix task burst fairness issue."

    Long:
    bac03ec0a243 (HEAD, refs/bisect/bad) pds: Fix task burst fairness issue.
    4f431cdc66a8 (refs/bisect/good-4f431cdc66a8700629b607d1eec381e85130b2e1) pds: Fix sugov_kthread_create fail to set policy.
    7bc1a2c56bee Tag PDS 0.98z
    0f294a28dec1 (refs/bisect/good-0f294a28dec1adbcf3c3085204fa922ac2166b9b) pds: Re-mapping SCHED_DEADLINE to SCHED_FIFO
    4f4af01d7ac7 pds: Improve idle task SMT_NICE handling in ttwu.
    5a8fbadc4e03 pds: Don't balance on an idle task.
    76919a998aa1 pds: Replace task_queued() by task_on_rq_queued().
    0fb4e82d39ad (refs/bisect/good-0fb4e82d39ad1048aae987acc26d5039eb317dc3) Tag PDS 0.98y

    And now a surprise:
    I checked some rr_interval values with commit bac03ec0a243.
    Result: rr_interval from 1 to 3 seems to run fine (as already seen in the last tests); with rr_interval=4 the error with the hung imap sync queue appears from time to time.

    Hope that I could help you.
    It seems the error gets stronger with newer versions, because with the current version rr_interval doesn't help anymore.

    Regards sysitos

    Replies
    1. @sysitos
      Thanks for the bisect tests. A lower rr_interval just makes the issue less likely to be triggered, so it is not a solution.
      One more thing: is CONFIG_SCHED_HRTICK=y set in your kernel config file?

    2. Hi Alfred,

      CONFIG_SCHED_HRTICK=y is there.

      Should I test your new concept scheduler instead, if repairing this old bug in the soon-to-be-deprecated PDS isn't worth it?

      Regards sysitos

    3. @sysitos
      Would you please send me an email? I'd like to prepare a patch for your debugging.
      The new scheduler is based on the PDS code base, so it will most likely have the same issue.

    4. Hi Alfred,

      short: your patch helped a lot, but the problem still persists.

      long: I applied your patch on top of the pf-kernel, and then additionally on top of your pds git tree for 4.20 to rule out interference from other patches. The results are the same. It's way better than without the patch: most of the time there is now only 1 (or 0) hung imap sync process, where before it was 2-3. But now yield has an influence, which imho you already had in mind. So far I have only triggered the error with yield_type=1; 0 and 2 run fine without the error (and with no influence from rr_interval at all; I tested different values here). Btw. is the new default rr_interval=4? Wasn't it 6 some time ago?

      Regards sysitos

    5. @sysitos
      Sorry for the late reply over the weekend. Based on your testing, I believe the issue is caused by the user-land code's sched_yield() usage together with a bug in the PDS code, and https://gitlab.com/alfredchen/linux-pds/commit/2fab3ad028e396a9b0de760425052a2ab1444936 is the proper code fix in PDS. Adjusting the yield type would be the workaround for users running affected applications.

      As for rr_interval, it was changed to 4ms some time ago, and changing this value is not encouraged.

    6. Hi Alfred,

      I can't agree with you in this case. Yes, there are times when a scheduler triggers errors produced within other applications, but that doesn't seem to be the case here. That's why I tested your patch not only on the newest 4.20 git, but also on older trees, with these results:
      I could not trigger the error with kernel 4.18 and PDS 0.99a (or rather, with your last commit on the linux-4.18.y-pds branch); everything runs fine. The same holds for branch linux-4.19.y-pds with PDS 0.99b, commit 770c3b622528: no problems at all. But there are problems with your last commit on this branch; the error triggers instantly. I have not tested the other commits yet.

      Btw, no error with cfs and muqss with any yield value.

      Regards sysitos

    7. Hi Alfred,

      so I bisected the whole 4.19.y-pds tree (always with your fix applied) and here are my (shortened) results:

      51d8f8b86d81 (HEAD, refs/bisect/bad) pds: Rework time_slice_expired()
      a473f87a3bd1 (refs/bisect/good-a473f87a3bd13ca95b3838108aa8f3a2f7e0f8e6) pds: Fix cpu hot-plug Oops.
      55fdf19c03c1 (refs/bisect/good-55fdf19c03c121144717c95e9b0b177cf1cb883b) pds: [Sync]
      c377a2a8bf25 (refs/bisect/good-c377a2a8bf25e30707083156befda486b0e202b8) pds: Remove cpumask_weight() in best_mask_cpu().
      770c3b622528 (refs/bisect/good-770c3b6225288fb308631c3a1ede419bbe2d735a) Tag PDS 0.99b

      So I hope, that I could help.

      Regards sysitos

    8. @sysitos
      Thanks for all this further testing. Let me explain it this way: sched_yield() is an "evil" system call, which gives up the current task's run time so that other tasks in the system can run and make progress on the job. In modern days there are many ways to do IPC, so the current task can wait on something until other tasks get cpu time, finish the job and notify it. But it is legacy, and it is still used.
      It's "evil" b/c it is not reliable; how the yielded task is handled and which other tasks get to run depends on the scheduler. CFS uses a skip flag in the task structure, and BFS/muqss/VRQ/PDS use yield types. All the implementations differ, and none gives a guarantee (IMO). So an application using sched_yield() may behave differently under different schedulers/yield types.

      Back to PDS: I have checked 51d8f8b86d81 (HEAD, refs/bisect/bad) pds: Rework time_slice_expired(); the code change is correct and as expected. But it breaks some yield types for your application's sched_yield() usage. That still sounds acceptable to me, as other yield types can work around it.
      Maybe we should introduce a more reliable way to handle yield in the scheduler, but I believe it's too late for PDS. Even for the new incoming scheduler it will be a low-priority item. To be honest, if I could control user-land usage, I'd eliminate the sched_yield() system call, :)

    9. @Alfred:
      If one were to use a non-default yield_type value as a workaround, like sysitos, which one would you suggest/recommend?
      Do they have different impacts that you know of?

      Best regards,
      Manuel

      I have escaped weird behavior by setting the value to 2; it was mostly related to the intel graphics driver, if I remember correctly.
      BR, Eduardo

    11. @Eduardo:
      Thank you for adding this info!
      Several years ago I used an nvidia gfx card, where yield_type = 2 was the only way to operate it properly over a longer period. But the code has changed so much in between that I won't pinpoint any particular former scheduler/driver.

      I assume you use the normal kernel & X11 drivers for your Intel gfx ATM, right?

      BR, Manuel

    12. Hi Alfred,

      many thanks for the detailed explanation; it was time for me to google the topic (I'm not a programmer).
      But what does it mean if all yield_types lead to an error? With the new pf-kernel (your bug fix already included), the error now triggers with yield_type=0 too. So I'm left with yield_type=2 as the only workaround; I just have to check how to set it during boot or in the source code. But if your new concept scheduler works differently, maybe the error won't trigger there, so don't invest too much time in this. We will (hopefully) have Linux 5.0 next week ;)

      Regards sysitos

    13. @sysitos:
      I always call a script that changes openSUSE's defaults once my desktop is up. I know that's way too old-fashioned, but it leaves me in charge.
      Maybe you can place an appropriate script into the systemd folders and have it called during bootup?
      Unfortunately, I'm too inexperienced with this.

      BR, Manuel

    14. @Manuel,
      I use PDS exclusively on all machines I own (and some I don't).
      I have placed some tweaks in /etc/rc.local that are executed every time the computer starts; that includes yield_type as well.
      BR, Eduardo
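
      A minimal /etc/rc.local sketch along those lines (guarded, so it is harmless on non-PDS kernels):

```shell
#!/bin/sh
# /etc/rc.local excerpt: apply PDS tweaks at every boot.
if [ -w /proc/sys/kernel/yield_type ]; then
    echo 2 > /proc/sys/kernel/yield_type
fi
exit 0
```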

    15. @Manuel and Eduardo,

      thanks, I had some udev rule in mind. There are endless ways to do this in linux ;). But it wouldn't help, see below.

      @Alfred

      bad news: the error now gets triggered even with the last remaining setting, yield_type=2. So no workaround is possible anymore. Checked with the newest pf-kernel.

      Regards sysitos

    16. Hi Alfred,

      here I am again. I know, I'm a little bit insistent ;)

      Because there is no workaround for my problem and I don't want to go back to cfs, but on the other hand I need my mail too, I have checked the problem again and found 2 solutions:

      1. solution (the ugly one):
      I completely reverted your commit 51d8f8b8 on top of the current pf-kernel. It compiles fine and, even better, works without the mentioned errors. But I think you wouldn't like it, because of your rework within this commit.

      2. solution (the elegant one, I hope ;) ):
      I checked your commit again and changed only a single character:

      line 463 old: if (p->prio >= NORMAL_PRIO) {
      line 463 new: if (p->prio > NORMAL_PRIO) {

      Compiled and runs fine. Even tested with different yields (0,1,2). I couldn't trigger the error yet, and checked different situations. Haven't seen any drawbacks.

      Maybe you or someone else could double check it.

      Thanks for your help.
      Regards sysitos

    17. @sysitos
      I kind of expect that change to disable timeslice expiration for "normal tasks". Not that I am a programmer or an expert in any way, tho :)

      I.e. you would never reach "update_task_priodl(p);" when a task runs at "normal" prio (most tasks do).

      This would in turn probably work for you, but I am not entirely sure it is elegant for the rest of us? :)

    18. @sveinar

      thanks for the clarification; then the only clean solution (for me) is solution 1, the ugly one :/
      Alfred handled the normal-prio tasks differently in the old code, which had no drawback here.

      But sorry for the stupid question, maybe you could clarify it a little: does that mean a "normal prio task" would never be refreshed, and the process time assigned within a tick (and only for this tick) would stay the same?

      So do you have a workload example where I could check the wrong behavior?
      Thanks and regards
      Sysitos

      Hi (@Sveinar),

      so I reverted my blunder and modified pds.c (in the spirit of the old commit):


      if (p->prio >= NORMAL_PRIO) {
              if (p->prio == NORMAL_PRIO) {
                      p->deadline /= 2;
                      p->deadline += (rq->clock + task_deadline_diff(p)) / 2;
              } else
                      p->deadline = rq->clock + task_deadline_diff(p);

              update_task_priodl(p);
      }

      This works here, and should be elegant for the rest too ;).

      PS: This could all be moot, because Alfred is working on a new scheduler ;)

      Regards sysitos

    20. @sysitos
      I have to say that neither of your changes meets the design intention. 1) Reverting the commit is not a good idea, as there was a bug in the previous deadline calculation; that's why the time slice expiration was reworked. 2) Your one-character change would bypass the deadline update for NORMAL tasks entirely.

      Your last code change looks ok, but I'd suggest changing the deadline calculation to
      p->deadline /= 2;
      p->deadline += rq->clock / 2 + task_deadline_diff(p);
      If it works for you and your issue, you can keep the code change for yourself. I am not going to change PDS for now, b/c I believe this only fixes particular cases and may fail in other cases. I hope you can understand it.

      In the long term there will be no deadline concept in the new scheduler, so less trouble to worry about. But the yield problem will remain. I will see how to handle it later, as it will be a low-priority item.

    21. Hi Alfred,

      thx for the code change. It looks better and runs fine. (As I wrote, I used the formula from your old commit as a quick hack.)
      But not reusing the old p->deadline, ignoring it and recalculating from scratch, leads to the mentioned bug for me. Had you asked me 2 weeks ago, I would have said that there is no problem for me with PDS, because this hang is really difficult to identify. But this hang has already led to a bug fix that no one else had reported.

      But anyway, thanks for your help. It's ok for me to do a "quilt import" after every "git pull" ;). Don't worry about it. I now know the cause and the solution.

      But maybe you could tell me: will the new p->deadline always be bigger than the old one, or does it depend on the situation (load etc.)?

      Many Thanks and Regards
      sysitos

    Any way to implement this in PDS or the new scheduler?
    https://github.com/clearlinux-pkgs/linux/blob/master/0123-add-scheduler-turbo3-patch.patch

    Replies
      I did a pre-study of itmt last year on my intel gen8 cpu in a notebook, but it turned out that it doesn't support itmt. I will check again in the new scheduler when the principal features are done this year.

    2. What does this clearlinux patch actually do?

    3. It prefers higher clocking cores for tasks.
      https://www.intel.com/content/www/us/en/architecture-and-technology/turbo-boost/turbo-boost-max-technology.html
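
      On x86 kernels built with CONFIG_SCHED_MC_PRIO, ITMT support shows up as a sysctl; a quick check (guarded, since the file is absent without hardware/kernel support):

```shell
# Is ITMT (Turbo Boost Max 3.0 core ranking) available and enabled?
if [ -r /proc/sys/kernel/sched_itmt_enabled ]; then
    cat /proc/sys/kernel/sched_itmt_enabled   # 1 = enabled, 0 = disabled
else
    echo "sched_itmt_enabled not present (no ITMT support)"
fi
```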
