Tuesday, October 10, 2017

PDS 0.98b release

PDS 0.98b is released with the following changes:

1. UP compilation fix.
2. Task deadline catch-up algorithm V2. A simple deadline catch-up algorithm that gives a task with lower CPU usage an earlier deadline when it is woken up. This should help interactivity.
3. Sync up with the 4.13 mainline release. Hopefully this helps with suspend/resume stability.
4. Fix a long-standing hidden sync-up issue in get_user_cpu_mask().
5. Minor improvements.
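
The idea behind item 2 can be illustrated with a toy model. This is only a sketch and not the PDS code: the function names, the `cpu_share` input, and the "half to full offset" scaling are all hypothetical, chosen just to show how a lightly loaded task could end up ahead of a CPU-bound one at wake-up.

```python
def wakeup_deadline(now_ms, base_offset_ms, cpu_share):
    """Return a toy wake-up deadline: a smaller cpu_share gives an earlier deadline.

    cpu_share is a hypothetical fraction (0.0 to 1.0) of recent time the task ran.
    """
    if not 0.0 <= cpu_share <= 1.0:
        raise ValueError("cpu_share must be between 0.0 and 1.0")
    # Hypothetical scaling: light tasks get about half of the base offset,
    # heavy tasks get the full offset.
    return now_ms + base_offset_ms * (0.5 + 0.5 * cpu_share)


def pick_next(deadlines):
    """Pick the task name with the earliest deadline."""
    return min(deadlines, key=deadlines.get)


if __name__ == "__main__":
    now = 1000.0
    deadlines = {
        "editor": wakeup_deadline(now, 6.0, 0.05),    # mostly sleeping task
        "compiler": wakeup_deadline(now, 6.0, 0.95),  # CPU-bound task
    }
    # The task with the earlier deadline would be chosen first in this model.
    print(pick_next(deadlines))
```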

This is a bug-fix release; hopefully it will solve the issues reported against the previous release.

Enjoy PDS 0.98b for v4.13 kernel, :)

Code is available at
and also

An all-in-one patch is available too.


  1. @Alexandre Frade

    Can you please upload a fresh xanmod-pds build so I can retest whether the freezes on Atom with many "heavy" tabs in Chrome still persist?


  2. Here w/ PDS 0.98b:


    sudo apt update && sudo apt install linux-image-4.13.5-pds-xanmod5 linux-headers-4.13.5-pds-xanmod5


  3. Retested on Atom under the same conditions.

    Freezes still happen, but responsibility comes back much quicker. I am also able to ssh to that PC smoothly now. As kswapd0 is the third process in top (the first two are xorg and the vnc server), I guess the reason is aggressive swapping: that PC has only 2 GB of RAM (the motherboard does not support more) and zram is also in use.

    1. @unxed
      OK, if you have an old Atom with kswapd in the top 3 and zram as swap, that doesn't sound good. zram trades CPU power for space, but an Atom CPU is too weak to provide the CPU power that zram needs, especially in high-load scenarios.
      But anyway, I believe the scheduler is doing its best on your system.

    2. Retested without zram, now freezes are totally gone.

      I thought memory was the main bottleneck for that PC, but it looks like it actually was CPU power.

      Thanks again!

  4. This comment has been removed by the author.

  5. Awesome work, thanks Alfred, thanks Alexandre!

  6. responsiveness, not responsibility, of course :)

  7. Compiled/booted OK, will get back in case of issues.

  8. @Alfred,

    Compiled, used in i7 @ work for ~ 5 hrs, so far it's good except this one: https://pastebin.com/ZrnVzs8B
    LibreOffice is behaving badly here with a large document, stalling for at least 20 sec when editing it. According to "top", CPU usage for soffice.bin spikes up to exactly 700, but no more.
    Total or per task CPU usage calculation is clearly wrong. Can You please look at it?
    Will check (hopefully today) on Ryzen @ home.

    BR, Eduardo

    1. @Eduardo
      Have you cross-compared this usage with 098a and the mainline CFS scheduler? I don't think LibreOffice is any more special than other normal apps, unlike wine + D3.

    2. @Alfred,

      Just did a test with 4.13.5 mainline: https://pastebin.com/wxQdsGnG
      The same workload, very different result :) soffice.bin never uses more than 100-101%, which might be about right.

      So, based on my testing, it seems that the per-task CPU usage calculation is way off. Additionally, I did a test with 4.13.2-vrq97b: soffice.bin uses up to exactly 700 (it never goes higher, not even by a single point).
      It seems this issue has been in vrq/pds for some time; I just didn't have to edit large docs in the past :)

      If You need me to debug more, can You please specify VRQ versions to test, I hope I have them compiled somewhere.

      BR, Eduardo
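
For reference, the kind of per-task figure discussed here can be reproduced from the utime and stime counters in /proc/<pid>/stat. The sketch below is not how top or PDS is implemented; it only shows the usual arithmetic, assuming the common USER_HZ value of 100 ticks per second. The helper names and the sample numbers are hypothetical. A reading of 700 over an interval would mean seven CPUs' worth of accounted time.

```python
def parse_cpu_ticks(stat_line):
    """Return utime + stime (in clock ticks) from one /proc/<pid>/stat line."""
    # The command name (field 2) may contain spaces, so split after the last ')'.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 14 (utime) is rest[11]
    # and field 15 (stime) is rest[12].
    return int(rest[11]) + int(rest[12])


def cpu_percent(ticks_before, ticks_after, elapsed_s, clk_tck=100):
    """Per-task CPU usage in percent of one CPU over the sampling interval."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return 100.0 * (ticks_after - ticks_before) / (clk_tck * elapsed_s)


if __name__ == "__main__":
    # Hypothetical sample line, truncated after the stime field.
    line = "1234 (soffice.bin) S 1 1234 1234 0 -1 4194304 100 0 0 0 250 50"
    print(parse_cpu_ticks(line))
    # 300 ticks accounted in 3 seconds at 100 ticks/s is one fully used CPU.
    print(cpu_percent(0, 300, 3.0))
```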

    3. @Eduardo
      Besides the CPU usage calculation issue (top display), do CFS and 4.13.2-vrq97b also stall for 20s when editing the large file?
      As for the CPU usage calculation, as far as I remember there have been no code changes for a long time. So if you have previous vrq kernels built and installed, you can try testing with them; if you find a version that doesn't have this problem, then we have a good starting point.

    4. @Alfred,

      I cannot state this with confidence as I did not measure it exactly, but the feeling is that mainline is a tad better. I would say it's quite close between both.
      I'm inclined to think that only the accounting in pds/vrq is off.
      I'll try the kernels I have; let's see which is the first to show this issue.

      BR, Eduardo

    5. @Eduardo
      Thanks for the clarification. It looks like an accounting issue, not a major issue that would impact usage.
      I suggest you start with the oldest VRQ/PDS kernel you have installed.

    6. @Eduardo
      While you are testing old kernels, could you get more detailed information about how LibreOffice is running? I don't know how to do it in top, but in "htop" you have options to look at threads instead of processes and to turn on the UTIME and STIME display for tasks.
      If you are able to do so, please also capture a "normal" snapshot on CFS.
      Many thanks.

    7. @Alfred,

      Sent screenshots and a little explanation to You via e-mail. That includes 4.13.5 mainline and PDS. UTIME and STIME included.

      BR, Eduardo

  9. @Alfred:
    I'm not convinced of the recent changes/ design goals for PDS regarding load balancing.

    When I have 2 WCG clients running at IDLE as usual and then start compiling the kernel in parallel with -j2 on my dualcore, the kernel compilation only runs on the 2nd core, while the 1st core only serves some basic system tasks and the WCG processes set to IDLE.
    IMHO this is not the way that IDLE processes should be handled in the presence of -j2 normal priority tasks. {In prior VRQ releases the normally scheduled compilation threads always filled both cores with load, pushing the idle load on both cores back almost equally to <5%.}

    BR, Manuel Krause

    1. @Manuel
      Thanks for pointing it out. It's caused by 0997c734b6a1 pds: Reduce policy fairness balance overhead. The interval for next_balance seems to be too long. It is a trade-off between balance and overhead; I will try to find a reasonable value for the next release, but in the long term the overhead needs to be cut in another way.

    2. @Alfred:
      Thank you for taking it into your future considerations. :-)

      Are you sure you referenced the right commit? Shouldn't it rather be commit No. aded50b6d382 (@github) ?
      And is there something I can do when compiling my next kernel, e.g. reverting this specific commit, to get back to the previous balancing? (I haven't tried it yet.) ATM I'm not really concerned about overhead in my use scenario but rather want to fill both cores more equally.

      TIA and BR, Manuel Krause

    3. @Alfred:
      I've now re-tested it at my own risk. Reverted ef7acba77441 first, then reverted aded50b6d382.
      That brings back what I want to see (and watch in gkrellm) --> equally used cores. ATM kernel compilation performs faster and looks pretty.

      BR, Manuel Krause

    4. But kernel compilation still doesn't outperform the IDLE tasks. They still get ~50% on each of the 2 cores. Not nice. *MK

    5. @Manuel
      You can wait for my next_balance adjustment this weekend.

      Sorry, I have been a bit busy in recent days, so there are delays with debugging, blog replies, etc.

    6. @Manuel
      Please try with this adjustment (https://bitbucket.org/alfredchen/linux-gc/commits/cc769fc97f982dceb917900ea2e7560202f91152)

    7. @Alfred:
      I've tested with the three most recent pds commits applied. Kernel compilation still shows the unbalanced behaviour.
      Only stopping the IDLE WCG-clients makes "make -j2" fill both cores. When the WCG-clients are then restarted, still in the presence of the kernel compilation, the make job migrates back to the 2nd core again. This looks like no change to me.

      BR, Manuel Krause

    8. @Manuel
      Interesting. My tests show IDLE tasks dropping back to about 2~3% on each core while normal tasks are running.
      Would you please capture a top/htop output so I can inspect the task status?

    9. @Alfred:
      Sure! -- But, please, first advise me on how to provide most valuable info for you:
      * Screenshot? (With a screenshot I'd be able to send you the info of gkrellm graphs too.)
      * Special settings to top/htop?
      * What is better info for you: top or htop?
      * Something else useful for you?

      BR, Manuel Krause

    10. * htop would be best, as I am familiar with it; the default settings will work
      * What I want to check is the priority of the compiler tasks and the WCG-clients
      * A screenshot would be good
      Thanks in advance.

    11. O.k. -- I just made some test setup to provide the things you need.
      Within the next 30 minutes you'd get the screen captures to your email.

      ATM I'm observing that uptime or reusing processes does have some severe impact on balancing/core distribution, which a freshly booted system didn't show.

      BR, Manuel Krause

    12. @Manuel
      Try 1000HZ if you are not using it already. The imbalance issue may be caused by tick interval + next balance interval > rr interval.
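
The condition above is simple arithmetic once the tick length is known: a timer running at HZ ticks per second has a tick interval of 1000/HZ milliseconds. The sketch below only illustrates that arithmetic. The 6 ms rr interval is the default mentioned elsewhere in this thread, while the 5 ms balance interval is a made-up placeholder, since the real value is not given here.

```python
def tick_ms(hz):
    """Length of one scheduler tick in milliseconds for a given HZ setting."""
    if hz <= 0:
        raise ValueError("hz must be positive")
    return 1000.0 / hz


def balance_may_lag(hz, balance_interval_ms, rr_interval_ms=6.0):
    """True when tick interval + balance interval exceeds the rr interval."""
    return tick_ms(hz) + balance_interval_ms > rr_interval_ms


if __name__ == "__main__":
    hypothetical_balance_ms = 5.0  # placeholder, not taken from the PDS source
    for hz in (512, 1000):
        print(hz, tick_ms(hz), balance_may_lag(hz, hypothetical_balance_ms))
```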

    13. @Alfred:
      Although you were right with your assumption (I had 512HZ), setting it to 1000HZ unfortunately doesn't make a difference here. And no related settings like rr_interval were changed. Must be something else.

      BR, Manuel Krause

    14. @Manuel
      What's the last version of VRQ/PDS you consider as well balanced for your testing work load? If you don't have installed versions on your system, I recommend you test from 145204efa0ee Tag VRQ 0.98.

    15. @Alfred:
      I had written about good balancing results in "Anonymous -- October 18, 2017 at 2:18 PM", that was with reverted commits "pds: Fix UP compilation issue." (used for following patch reverting only) and "pds: Reduce policy fairness balance overhead." with PDS v0.98b. Please give me an hour or two to re-check that kernel and whether my follow-up observation that only 50% of each core got filled with compilation threads had been correct.
      (BTW. I'm a little confused by the commit numbers ATM. Do the older ones change each time you add new commits on top?)

      BR, Manuel Krause

    16. @Alfred:
      O.k. I remembered correctly. Indeed this was the kernel compiled after your suggestion from "October 17, 2017 at 5:50 PM", with the above mentioned commits taken out, so this kernel is also without your recent three improvements and still with my custom 512HZ. Kernel compilation reaches around 50-60% on each core, quickly varying upon what's done.
      Astonishing: I just did one kernel compilation with the old setup for the test, then decided to cancel it (and make clean) to change the .config to 1000HZ to better match your .config -- And now the newly started compilation fills both cores with ~90% each. I don't understand this behaviour, but maybe you find new debugging approaches from this finding.
      I'd report back soon, after the 1000HZ kernel with the reverted commits got ready and tested.

      Thanks for your work and BR, Manuel Krause

    17. @Manuel
      Nice to see you have got a good code base, but I am a little confused about which git revision your PDS code is at and which commits you have reverted. Would you run 'git log --oneline -- kernel/sched/pds.c' in your kernel tree to show which git revision you are at (the first 30 lines should be OK), and list which commits you have reverted (if they are not committed in your local copy)?

      Yes, the git commit hash IDs change each time I rebase onto a new kernel revision, and I have to force-update (--force) the repository branch. It is not a user-friendly strategy if you are heavily tracking the PDS changes, but if you just fetch the PDS code from time to time, that is OK, and the PDS commits are always at the top of the git log.

    18. @Alfred:
      I'm sorry that I cannot provide the version info in the way you suggested, but I'll try to keep it as simple as possible and use your current commit No.s of the time I write this here.
      I'm using the -pf9 patch from Oleksandr that includes PDS 0.98b up to 78339fe.
      Then reverted "pds: Fix UP compilation issue." 8c48c5e, only to let the next commit revert properly,
      that is reverting "pds: Reduce policy fairness balance overhead." 2847cfc,
      then applied inc. patch from kernel.org to 4.13.8.

      With this setup and the 1000HZ .config, kernel compilation balances both cores at ~95% equally, with the WCG IDLE tasks at ~3% already at the first compilation attempt. IMHO this is the desired behaviour, don't you agree?

      BR, Manuel Krause

  10. @Alfred

    First of all, thank you very much for this great scheduler. During my typical workloads it works very very well.

    However, recently I've tried playing a game through wine and I'm getting consistent lockups during the game startup. Interestingly enough, the same game + wine on another machine works just fine (didn't play much on it, I just verified that it works...).

    The machine that exhibits the lockup is an i5 Skylake CPU (doesn't support hyperthreading) with a Radeon 390x GPU. The machine that works is a laptop with i7 Haswell CPU and integrated Intel/Nvidia GPU.

    I would not put this down to faulty hardware just yet, as -ck and the stock kernels work just fine.

    Here are my findings:
    - this happens on versions as old as VRQ 0.89g with linux 4.9.x. This is where I stopped testing, but I can go further back in time if you think it's useful.
    - X including the window manager completely freezes; I can't ctrl-alt-switch to text consoles, I can't ssh into the machine
    - I managed to start the game and switch to a text console before the freeze happens -- the text console seems to be frozen as well
    - tried without the CONFIG_SMT_NICE option
    - usually I'm using the -pf kernel, but the lockup happens with only the PDS patches as well
    - tried without graysky's GCC patch (generic kernel)
    - tried without UKSM/KSM
    - stock and -ck kernels work fine

    If you have any idea how to debug this please let me know.

    Mitja Horvat

    1. @mitja
      At this stage, please try a different yield_type (see the reference in Documentation/sysctl/kernel.txt) and see if it helps.

    2. @alfred

      That was spot on. Both yield_type 0 and 2 seem to work. I haven't really given it much testing, but the game did start 6 times in a row without problems (first 3 runs with yield_type=0, last 3 runs with yield_type=2). Also yield_type=0 seems to give a slight FPS boost.

      If you intend to debug this issue I'm willing to try out custom patches. If not, I'm fine with setting the yield_type :)

      Mitja Horvat

    3. @Mitja
      Please use yield_type = 0 and see if there is any side effect.
      My current plan is to make 0 the default in the next release; if there are no complaints about it, I tend towards removing this option entirely.
      IMO, the yield* API should not exist. :)

    4. @Alfred Chen

      On an Atom CPU, the option yield_type=0 fixes the consistent lockups. Thanks.

    5. @Andrei
      Thanks. yield should be killed +1.

    6. Observations to consider.

      Building kernel linux make -j8 + ioQuake3 (urbanterror 4.3.2):
      rr_interval = 6 (~77 fps)
      rr_interval = 2 (~115 fps) (xanmod default)

      Test: 3 latest commits (cc769fc) and yield_type=0 (yield 1/2 same behavior).

    7. @Alexandre
      Sticking with the yield_type topic for this thread, after reconsidering the yield options:
      yield_type = 1, the current default, is the most useless type; it just triggers a schedule().
      yield_type = 2 is most likely the original yield design, which expires the time slice.
      yield_type = 0 does nothing. Refer to the comments on the function yield(void), and you will see why it is not recommended.
      In the next release, the default yield_type will be changed to 0.
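
The three values can be summarised in a small lookup, together with a helper that reports the current setting. This is a sketch under one assumption: that the option is exposed at /proc/sys/kernel/yield_type, which is where entries documented in Documentation/sysctl/kernel.txt normally live. On a kernel without PDS the file will not exist, and the helper then returns None. The function names are hypothetical.

```python
# Descriptions paraphrased from the comment above.
YIELD_TYPES = {
    0: "do nothing",
    1: "just trigger a schedule()",
    2: "expire the time slice",
}


def read_yield_type(path="/proc/sys/kernel/yield_type"):
    """Return the current yield_type as an int, or None if it cannot be read.

    The default path is an assumption based on the sysctl documentation
    location mentioned in this thread.
    """
    try:
        with open(path) as f:
            return int(f.read().strip())
    except OSError:
        return None


def describe(value):
    """Return a short description for a yield_type value."""
    return YIELD_TYPES.get(value, "unknown yield_type")


if __name__ == "__main__":
    current = read_yield_type()
    if current is None:
        print("yield_type is not available on this kernel")
    else:
        print(current, describe(current))
```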

      As for rr_interval, normally I don't recommend changing it, as it affects the balance between interactivity and throughput performance. But as you have done some tests, I would like to know the results for the scenarios below:
      1. Bare ioQuake3 (urbanterror 4.3.2), rr_interval = 6, fps?
      2. Bare ioQuake3 (urbanterror 4.3.2), rr_interval = 2, fps?
      3. Kernel compile as IDLE policy, make -j8 + ioQuake3 (urbanterror 4.3.2), rr_interval = 6, fps?
      4. Kernel compile as IDLE policy, make -j8 + ioQuake3 (urbanterror 4.3.2), rr_interval = 2, fps?

    8. System: i5-3470@3.20GHz / DDR3 1600@CL9 / Mesa DRI i915.

      Bare urbanterror 4.3.2 w/ heavy map ut4_suburbs:
      1. 6ms = ~88 fps
      2. 2ms = ~88 fps

      SCHED_IDLE (chrt -i 0 make -j8 + game)
      3. 6ms = ~74 fps
      4. 2ms = ~74 fps

      SCHED_NORMAL (make -j8 + game)
      6ms = 42~32 fps
      2ms = 50~28 fps

      In some map scenarios 6ms is better, but mostly 2ms gives a smoother game with the system under heavy load.
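
To put the numbers above side by side, the fps drop relative to the bare game can be computed directly. The figures are the ones reported in this comment. Using the midpoint of each reported SCHED_NORMAL range is my own simplification, and the helper names are hypothetical.

```python
def drop_percent(baseline_fps, loaded_fps):
    """Relative fps loss in percent compared to the baseline."""
    return 100.0 * (baseline_fps - loaded_fps) / baseline_fps


def midpoint(a, b):
    """Midpoint of a reported fps range (a simplification of the range)."""
    return (a + b) / 2.0


BARE_FPS = 88.0  # bare game, same for 6ms and 2ms

# Loaded results as reported: rr_interval label -> fps
LOADED = {
    "SCHED_IDLE 6ms": 74.0,
    "SCHED_IDLE 2ms": 74.0,
    "SCHED_NORMAL 6ms": midpoint(42, 32),
    "SCHED_NORMAL 2ms": midpoint(50, 28),
}

if __name__ == "__main__":
    for label, fps in LOADED.items():
        print(f"{label}: {fps:.1f} fps, {drop_percent(BARE_FPS, fps):.1f}% below bare")
```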

    9. @Alexandre
      Sorry for the late reply on this topic.
      So, based on your test results, 2ms doesn't help much with gaming fps in the above workload. Yes, I agree that a low rr interval helps with gaming interactivity under heavy load. But consider that a low rr interval also introduces switching overhead that impacts throughput performance. And as a general-purpose scheduler, PDS needs to consider all kinds of usage besides gaming, so 6ms should be kept as the default rr interval.