Friday, November 30, 2018

PDS 0.99f release

PDS 0.99f is released with the following changes:

1. Remove load balance code from scheduler_tick(). Based on recent code changes and testing, there appears to be no need to check load balance in scheduler_tick() any more.
2. Default 4ms rr_interval. Trades off a little throughput for an interactivity improvement (see the quick check after this list).
3. Rework sibling group balance. Reduce the overhead of sibling group balancing by reworking the code path.
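
For those curious about the new default, here is a quick runtime check (a minimal user-space sketch, assuming PDS keeps the usual /proc/sys/kernel/rr_interval knob known from the BFS lineage; "sysctl kernel.rr_interval" reads the same value):

/* rr.c - print the scheduler's round-robin interval; build with: cc rr.c -o rr */
#include <stdio.h>

int main(void)
{
    FILE *f = fopen("/proc/sys/kernel/rr_interval", "r");
    int ms;

    if (!f) {
        perror("/proc/sys/kernel/rr_interval");
        return 1;
    }
    if (fscanf(f, "%d", &ms) == 1)
        printf("rr_interval: %d ms\n", ms);
    fclose(f);
    return 0;
}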

This release includes the balance code rework, the biggest change in this kernel cycle. Please give it a try; feedback is welcome.

Enjoy PDS 0.99f for the v4.19 kernel. :)

Code is available at https://gitlab.com/alfredchen/linux-pds
An all-in-one patch is available too.

57 comments:

  1. Thank you very much.

  2. Hi Alfred.

    I got an error with 4.19.7-rc1

    ld: arch/x86/kernel/cpu/bugs.o:(__jump_table+0x28): undefined reference to `sched_smt_present'
    ld: arch/x86/kernel/cpu/bugs.o:(__jump_table+0x40): undefined reference to `sched_smt_present'
    ld: arch/x86/kernel/cpu/bugs.o:(__jump_table+0x58): undefined reference to `sched_smt_present'
    ld: arch/x86/kernel/cpu/bugs.o:(__jump_table+0x70): undefined reference to `sched_smt_present'
    make: *** [Makefile:1032: vmlinux] Error 1
    ==> ERROR: A failure occurred in build().
    Aborting...

    I've compiled 4.19.7-rc1 without PDS and the error doesn't occur.

    Regards

    Replies
    1. 4.19.7-rc1:

      https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/log/?h=linux-4.19.y

    2. It must be this one: https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/tree/queue-4.19/schedsmt_Expose_sched_smt_present_static_key.patch

    3. Thanks for the heads-up. I'm still on .4, will look into this soon.

    4. According to: http://ck-hack.blogspot.com/2018/11/linux-419-ck1-muqss-version-0180-for.html#comment-form there are 2 patches that might cause breakage.

    5. @Anonymous
      Thanks for the info. It should not be hard to provide a fix for PDS, so let's plan for it when 4.19.7 comes out next week.

      Meanwhile, I am looking at syncing up changes for the upcoming 4.20 kernel release.

    6. Thank you too.

    7. https://mirrors.edge.kernel.org/pub/linux/kernel/v4.x/stable-review/

      4.19.7 comes out in two days.

      Thanks for your answer.

  3. @Alfred:
    After several days of testing with PDS 0.99f (up to kernel 4.19.6) I can report that everything works very well, or has even improved: the occasional stuttering with audio/video under certain load situations has gone away, and the dual core's load distribution is OK (in the gkrellm display, e.g.).

    Thank you for your continued hard and obviously good work,

    BR, Manuel

    P.S.: Sorry that I had to discontinue my benchmarking some months ago, but it took too much time compared to the little evidence it provided about my system. So I'm back to simple usability tests under everyday conditions and to subjective judgements.

  4. 4.19.7 is out!

    https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/log/?h=linux-4.19.y

  5. I get this while compiling 4.19.7:

    ld: arch/x86/kernel/cpu/bugs.o:(__jump_table+0x28): undefined reference to `sched_smt_present'
    ld: arch/x86/kernel/cpu/bugs.o:(__jump_table+0x40): undefined reference to `sched_smt_present'
    ld: arch/x86/kernel/cpu/bugs.o:(__jump_table+0x58): undefined reference to `sched_smt_present'
    ld: arch/x86/kernel/cpu/bugs.o:(__jump_table+0x70): undefined reference to `sched_smt_present'

    Could this be related to the recent addition of new x86/speculation patches?

    Replies
    1. More precisely I'm thinking of this one:

      commit 340693ee912e29f51769ae4af3b11486ad921170
      sched/smt: Expose sched_smt_present static key

    2. Mind also reading this thread, not just writing?..

    3. You're right, sorry. I was doing too many things at the same time :/

    4. Be patient. I'm struggling with v4.20-rc* ... :(
      Will sync up with 4.19.7 and see what needs to be done this weekend; hopefully a new release next week.

    5. Please check this sync-up fix for 4.19.7. The change happens to be in 4.20-rc*, so I cherry-picked the sync-up fix from 4.20:

      https://gitlab.com/alfredchen/linux-pds/commit/55fdf19c03c121144717c95e9b0b177cf1cb883b

      It will be included in the next PDS release, maybe as the only code change.

    6. 4.19.7 seems to build fine and shows no ill effects after that patch, Alfred.

    7. It builds, but fails to boot on my laptop with an Intel SMT CPU. "nosmt" makes it boot, though.

      The panic happens somewhere in __schedule(): https://imgur.com/a/oUcut3k (sorry, I was only able to capture the very tail of the panic, but at least you can see %rip). Interestingly enough, it doesn't panic inside a QEMU VM with emulated SMT. So far, the panic happens only on a real machine.

    8. OK, now it is better. See the full panic message from the QEMU VM: https://gist.github.com/82ab9b6fc63438f70567cf6b6de9ad6e

    9. @pf
      Will double check the code today.

    10. @pf
      I don't have any clue from the panic message; it shows a consequence of the issue rather than the root cause. But it is definitely caused by the newly introduced "sched_smt_present". The code is the same as the mainline scheduler code in core.c. So, would you please answer the questions below?
      1. Is this issue 100% reproducible on your laptop and in qemu?
      2. Does this issue happen on mainline CFS?

    11. Alfred,

      please find the answers here: https://gist.github.com/2901a50728235927397363495fda4821

      (to avoid losing the formatting)

    12. @pf
      Thanks for the above info; it will be useful. I will investigate it.
      One thing from your log:
      [ 0.675222] CPU: 5 PID: 0 Comm: PDS/5 Tainted: G T 4.19.0-pf7+ #1
      Is this running on 4.19.0? If so, would you please recheck it on 4.19.7?

    13. No, it is v4.19.7. That's just me re-numbering things.

    14. I've compiled 4.19.8-rc1 with pds (and bfq-sq/bfq-mq and uksm):

      CONFIG_MHASWELL=y
      CONFIG_SMT_NICE=y
      bfq [bfq-mq] none
      Linux version 4.19.8.ll12-0.2-lucjan-git+ (lucjan@archlinux) (gcc version 8.2.1 20180831 (GCC)) #1 SMP PREEMPT Thu Dec 6 23:58:53 CET 2018
      Linux archlinux 4.19.8.ll12-0.2-lucjan-git+ #1 SMP PREEMPT Thu Dec 6 23:58:53 CET 2018 x86_64 GNU/Linux
      [ 0.000000] pds: PDS-mq CPU Scheduler 0.99f by Alfred Chen.
      [ 0.061023] pds: cpu #0 affinity check mask - smt 0x00000004
      [ 0.061024] pds: cpu #0 affinity check mask - coregroup 0x0000000e
      [ 0.061025] pds: cpu #1 affinity check mask - smt 0x00000008
      [ 0.061026] pds: cpu #1 affinity check mask - coregroup 0x0000000d
      [ 0.061026] pds: cpu #2 affinity check mask - smt 0x00000001
      [ 0.061027] pds: cpu #2 affinity check mask - coregroup 0x0000000b
      [ 0.061028] pds: cpu #3 affinity check mask - smt 0x00000002
      [ 0.061029] pds: cpu #3 affinity check mask - coregroup 0x00000007
      [ 0.381278] io scheduler noop registered
      [ 0.381302] io scheduler bfq registered
      [ 0.381320] io scheduler bfq-sq registered (default)
      [ 0.381321] BFQ I/O-scheduler: v9 (with cgroups support)
      [ 0.381339] io scheduler bfq-mq registered
      [ 0.381340] BFQ I/O-scheduler: v9 (with cgroups support)
      [ 0.533373] UKSM: relative memcmp_cost = 184 hash=1682298426 cmp_ret=0.

    15. @pf
      Thanks for your crashdump log. I believe "sched_sibling_cpu" is used before it is properly set up, and "threadirqs" helps to trigger the problem.
      I don't have qemu and the other required software installed on my machines right now (I will try to install them later), and I believe this issue also existed in 4.19.6 with the 0.99f PDS code. So you can help by testing QEMU + 4.19.6 + 0.99f PDS while I work on the issue and prepare a test patch.

    16. Indeed, v4.19.6 + PDS-mq v0.99f crashes too.

      (I guess the reason I have not spotted this before is either that CONFIG_SCHED_SMT was disabled before and is now enabled by default, or that I was using "nosmt" due to Spectre.)

    17. Since it seems QEMU is what's causing the crash, does it crash just as much with 4.19.7, default CFS, and the same config?

      I have SMT on, with an 8700K (6 cores/12 threads) and the above patch, and all threads are online without a hitch.. but I do not use QEMU.
      I use VirtualBox, though, and there are no hiccups there.

    18. Please read carefully: the issue is not limited to QEMU. The statement about CFS was also presented previously (hint: it works).

      +1 report from the physical machine of one of my users ("nosmt" did the trick there as well, as a temporary workaround).

    19. Indeed, it is an issue related to the #3 code change in 0.99f, not to the newly introduced "sched_smt_present" commit for 4.19.7. Working on a quick fix.

    20. @pf
      Please try this patch.

      diff --git a/kernel/sched/pds.c b/kernel/sched/pds.c
      index cddf591a1603..5a5ad448f23c 100644
      --- a/kernel/sched/pds.c
      +++ b/kernel/sched/pds.c
      @@ -619,7 +619,7 @@ static inline void update_sched_rq_queued_masks(struct rq *rq)
       		return;
       
       #ifdef CONFIG_SCHED_SMT
      -	if (~0 == per_cpu(sched_sibling_cpu, cpu))
      +	if (cpu == per_cpu(sched_sibling_cpu, cpu))
       		return;
       
       	if (SCHED_RQ_EMPTY == last_level) {
      @@ -6103,7 +6103,7 @@ void __init sched_init(void)
       		rq->queued_level = SCHED_RQ_EMPTY;
       		rq->pending_level = SCHED_RQ_EMPTY;
       #ifdef CONFIG_SCHED_SMT
      -		per_cpu(sched_sibling_cpu, i) = ~0;
      +		per_cpu(sched_sibling_cpu, i) = i;
       		rq->active_balance = 0;
       #endif
       #endif
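
      A note on the change: "sched_sibling_cpu" could be consulted before the topology was set up (see above), and the old ~0 sentinel is not a valid CPU index. Seeding each CPU's sibling with its own id keeps any premature lookup in range, and the "no sibling" test becomes cpu == per_cpu(sched_sibling_cpu, cpu). A toy user-space sketch of the idea (an illustration only, not the kernel code):

      /* sentinel.c - toy model of the fix; the arrays are stand-ins */
      #include <stdio.h>

      #define NR_CPUS 4

      static int sched_sibling_cpu[NR_CPUS];
      static int rq_load[NR_CPUS];

      int main(void)
      {
          /* Fixed init: before topology setup, each CPU is its own sibling.
           * With the old ~0 sentinel, rq_load[sched_sibling_cpu[cpu]] below
           * would index far out of range if reached before setup. */
          for (int cpu = 0; cpu < NR_CPUS; cpu++)
              sched_sibling_cpu[cpu] = cpu;

          for (int cpu = 0; cpu < NR_CPUS; cpu++) {
              if (cpu == sched_sibling_cpu[cpu])
                  continue;    /* no SMT sibling known, as the patched check does */
              printf("cpu%d sibling load: %d\n", cpu, rq_load[sched_sibling_cpu[cpu]]);
          }
          return 0;
      }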

    21. The VM is not panicking with this one. Thanks.

      I'll be able to test things on a real machine in 4 days since I'm travelling.

    22. The fix is also confirmed on a physical machine of one of my users.

    23. @pf
      Thanks for the feedback. It looks like the correct fix with the minimal code change. It has been pushed to git at

      https://gitlab.com/alfredchen/linux-pds/commit/72da04f0d96e12bb20beaa33a980f7fd279110eb

      I'd like to do the 0.99g PDS release early next week.

    24. Hi.
      But there is another bug with the latest kernel, 4.19.8:

      /usr/bin/ld: arch/x86/kernel/cpu/bugs.o: in function `arch_smt_update.part.2':
      bugs.c:(.text+0x183): undefined reference to `sched_smt_present'
      /usr/bin/ld: bugs.c:(.text+0x1c8): undefined reference to `sched_smt_present'
      Makefile:1032: recipe for target 'vmlinux' failed
      make[3]: *** [vmlinux] Error 1
      make[3]: Leaving directory '/usr/src/linux-4.19.8'

    25. Kernel 4.19.6 builds without problems and boots fine.

    26. After checking, this patch fixes the problem:

      https://github.com/hhoffstaette/kernel-patches/blob/4.19/4.19/pds-20181206-make-sched_smt_present-track-topology.patch

    27. @Micron
      I think you are supposed to use both these patches:
      https://gitlab.com/alfredchen/linux-pds/commit/55fdf19c03c121144717c95e9b0b177cf1cb883b
      and
      https://gitlab.com/alfredchen/linux-pds/commit/72da04f0d96e12bb20beaa33a980f7fd279110eb

      Until 0.99g comes out anyway :)

    28. Anyway, I use both and they work fine.

  6. It seems that Holger already fixed it:
    https://github.com/hhoffstaette/kernel-patches/commit/016ded9a8490fbc4a84a8dee9514bf8e50291748#diff-51b8bcf9172ce658921d37cd70eead0d

    Replies
    1. I don't think this is a correct fix. While it prevents the build failure, it doesn't represent the actual SMT state, because the key is never touched by the actual PDS-mq code.

  7. @Alfred:
    I'm seeing non-wishful behaviour on my dual core: non-multi-threaded applications only use 50% of each CPU core's power. Currently I'm not running any WCG or similar clients in the background. I first noticed the issue with heavy FreeCAD calculation operations, and a kernel make -j1 also shows that only ~50% is utilized. This testing is still on the 4.19.6 kernel with PDS 0.99f.
    I'm not convinced that this is the desired behaviour.

    TIA and BR,
    Manuel

    Replies
    1. @Manuel
      Maybe I don't get what the "non-wishful" behavior on your dual core is. A single thread using "50% of each CPU core's power", adding up to ~"100%", is normal behavior to me. And my Celeron server, which should have a similar topology to a dual core, runs well without unexpected behavior here.

    2. I don't get it either.

    3. @Alfred and @Anonymous:
      Thank you very much for your quick reply.
      Maybe I've misused the word "wishful" instead of "wished"?!

      I wished that a single thread would also reach 100% across both cores when nothing else inhibits it.
      Please tell me if I've generally misunderstood how a single thread is distributed over the two cores.
      For the current kernel (4.19.8 + PDS 0.99g) I've changed my .config to NR_CPUS=2 (vs. the former 4), which may be correct somehow, but somehow not... The CPU supports SMT/HT but the chipset forbids it. With this new kernel, "make -j1" remains at ~50% load on each core.
      Oh, and with the newly reduced NR_CPUS I get an error message:
      [ 0.071150] smpboot: 4 Processors exceeds NR_CPUS limit of 2

      But what would be the correct setting now?

      @Alfred:
      A question aside: What NR_CPUS do you set in your server's .config?

      Many thanks in advance for any helpful and clarifying answer!

      TIA,
      Manuel

    4. @Manuel
      One thread can only attach to (occupy/use) one CPU at a time, so one thread can't take more than 100% of one CPU's time.

      For your NR_CPUS issue, I think your system is trying to populate 4 CPUs but you limit it to 2. You can google your CPU type to see how many cores/threads it has, and set NR_CPUS accordingly.

    5. Yeah.
      Due to core migration, both CPUs will be at 50% max for a single-threaded task, not more.
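
      You can watch this yourself with a trivial single-threaded busy loop (a minimal demo, nothing PDS-specific). Run it, press 1 in top for the per-core view, and you'll see the one thread bounce between the cores, each core averaging ~50% while the task still gets a full core's worth of CPU time:

      /* busy.c - build with: cc busy.c -o busy; stop with Ctrl-C */
      int main(void)
      {
          volatile unsigned long spin = 0;

          for (;;)
              spin++;    /* one thread, 100% of exactly one CPU at any instant */
      }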

    6. @Alfred and @Anonymous:
      Thank you both very much for teaching me the right understanding of what's going on with single-threaded applications.

      Regarding my journey with NR_CPUS:
      It seems, at least for the kernel, that NR_CPUS=4 is the most error-free setting for my Core2Duo. The web search on this is astonishingly poor in the number of results, with useful ones ~0.

      I don't understand the differences around the added "hotplug CPUs", "pcpu-alloc" and "Max logical packages" when cross-testing NR_CPUS=2 vs. 4.
      Let me add a summary of the relevant dmesg log lines for both cases at the bottom of this post. Can you please have a look at it, to check whether it's correct (especially regarding the PDS affinity check)? What advantage do the 2 added hotplug CPUs have?

      TIA and BR,
      Manuel

      -> Summary:
      ----------- NR_CPUS=2 -----------
      [0.071150] smpboot: 4 Processors exceeds NR_CPUS limit of 2
      [0.071151] smpboot: Allowing 2 CPUs, 0 hotplug CPUs
      [0.180112] setup_percpu: NR_CPUS:2 nr_cpumask_bits:2 nr_cpu_ids:2 nr_node_ids:1
      [0.181395] percpu: Embedded 42 pages/cpu @(____ptrval____) s133336 r8192 d30504 u1048576
      [0.181404] pcpu-alloc: s133336 r8192 d30504 u1048576 alloc=1*2097152
      [0.181406] pcpu-alloc: [0] 0 1
      [0.295502] pds: PDS-mq CPU Scheduler 0.99g by Alfred Chen.
      [0.303137] smpboot: CPU0: Intel(R) Core(TM)2 Duo CPU P8400 @ 2.26GHz (family: 0x6, model: 0x17, stepping: 0x6)
      [0.319171] smpboot: Max logical packages: 1
      [0.319171] smpboot: Total of 2 processors activated (9043.30 BogoMIPS)
      [0.320153] pds: cpu #0 affinity check mask - coregroup 0x00000002
      [0.320156] pds: cpu #1 affinity check mask - coregroup 0x00000001

      ----------- NR_CPUS=4 -----------
      [0.071133] smpboot: Allowing 4 CPUs, 2 hotplug CPUs
      [0.180662] setup_percpu: NR_CPUS:4 nr_cpumask_bits:4 nr_cpu_ids:4 nr_node_ids:1
      [0.182086] percpu: Embedded 42 pages/cpu @(____ptrval____) s133336 r8192 d30504 u524288
      [0.182095] pcpu-alloc: s133336 r8192 d30504 u524288 alloc=1*2097152
      [0.182096] pcpu-alloc: [0] 0 1 2 3
      [0.296187] pds: PDS-mq CPU Scheduler 0.99g by Alfred Chen.
      [0.303841] smpboot: CPU0: Intel(R) Core(TM)2 Duo CPU P8400 @ 2.26GHz (family: 0x6, model: 0x17, stepping: 0x6)
      [0.319876] smpboot: Max logical packages: 2
      [0.319876] smpboot: Total of 2 processors activated (9043.75 BogoMIPS)
      [0.320866] pds: cpu #0 affinity check mask - coregroup 0x00000002
      [0.320868] pds: cpu #1 affinity check mask - coregroup 0x00000001
      <-

    7. @Manuel
      I don't think the P8400 has SMT capability, but your BIOS seems to have some problem, reporting 4 CPUs as available. Anyway, both 2 and 4 should work for you; to be safe, 4 will be a good choice.

    8. @Alfred:
      Many thanks for your added expertise!

      I've given the "limited" NR_CPUS=2 only one day of testing, but it didn't reveal any advantage or disadvantage in my normal use. Maybe I should test it a little longer.
      According to docs from Intel, the CPU itself ships with the "ht" flag (also shown by /proc/cpuinfo), but the chipset "officially" doesn't make use of it(?).
      My Core2Duo series (Penryn comparables) was also shipped in quad-core versions, or versions with dedicated cores turned off, depending e.g. on mobile power-saving use etc.

      I'm not sure about the consequences of 2 real and 2 additionally populated cores, the hotplug ones. Can you imagine any?

      In some dmesg logs found in my web search on the Core2Duo, I've often read about the "4 CPUs, 2 hotplug CPUs" line (btw, also with far higher allowed NR_CPUS). So maybe it's not an issue with (only my) BIOS.

      Would I see different "pds: cpu #? affinity check mask - coregroup 0x*" lines if the hotplug CPUs were accepted 'correctly'?

      Thanks for your guidance, TIA and BR,
      Manuel

    9. @Manuel
      pf just asked about the affinity check masks in the 0.99g thread; you can find the explanation there. And you can paste your affinity check mask log here.

    10. @Alfred:
      Yes, I've read @Oleksandr's posts and your replies. But I remain clueless:

      My affinity check masks:
      For NR_CPUS=4:
      [ 0.319876] smp: Brought up 1 node, 2 CPUs
      [ 0.319876] smpboot: Max logical packages: 2
      [ 0.319876] smpboot: Total of 2 processors activated (9043.75 BogoMIPS)
      [ 0.320866] pds: cpu #0 affinity check mask - coregroup 0x00000002
      [ 0.320868] pds: cpu #1 affinity check mask - coregroup 0x00000001

      For NR_CPUS=2:
      [ 0.319171] smp: Brought up 1 node, 2 CPUs
      [ 0.319171] smpboot: Max logical packages: 1
      [ 0.319171] smpboot: Total of 2 processors activated (9043.30 BogoMIPS)
      [ 0.320153] pds: cpu #0 affinity check mask - coregroup 0x00000002
      [ 0.320156] pds: cpu #1 affinity check mask - coregroup 0x00000001

      I absolutely don't know how your algorithm determines the above, and I don't understand the related code base.
      And... I don't see either what benefit my system wants to achieve with two additional hotplug CPUs that are not being recognized?!

      BR, Manuel

    11. @Manuel
      All looks good. Either 2 or 4 should work for PDS.
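
      In case the masks themselves are the confusing part: each one is a plain CPU bitmap, with bit N set meaning CPU N is in that group. So cpu #0's coregroup mask 0x00000002 points at cpu 1, and cpu #1's 0x00000001 points back at cpu 0, which is exactly what a dual core should show. A tiny decoder (an illustration only, not PDS code):

      /* maskdecode.c - list the CPUs named by an affinity check mask */
      #include <stdio.h>

      static void decode(int cpu, unsigned int mask)
      {
          printf("cpu #%d coregroup:", cpu);
          for (int bit = 0; bit < 32; bit++)
              if (mask & (1u << bit))
                  printf(" cpu%d", bit);
          printf("\n");
      }

      int main(void)
      {
          decode(0, 0x00000002);    /* -> cpu1 */
          decode(1, 0x00000001);    /* -> cpu0 */
          return 0;
      }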

    12. @Alfred:
      Many thanks for sacrificing your time to explain and clarify this topic for me.
      So my conclusion now is to just leave it at NR_CPUS=4, which has worked for years and didn't show error messages.
      I hope it won't produce additional overhead with the SMT & SMT_NICE .config settings enabled... Let me know if you are aware of any.

      Thanks and BR,
      Manuel
