Thursday, August 15, 2019

BMQ 0.99 release

BMQ 0.99 is released with the following changes:

1. Many refinements throughout the BMQ routines and code paths.
2. The LLC (last-level cache) is now considered in the sg balance and ttwu code paths. This should help on CPU topologies with multiple LLCs, such as Ryzen, but I don't have the hardware to test it further.

Your feedback is welcome, especially on change #2 for CPUs with multiple LLCs.

Enjoy BMQ 0.99 for the v5.2 kernel :)

The full kernel tree repository can be found at https://gitlab.com/alfredchen/linux-bmq
The all-in-one patch can also be found on GitLab.

26 comments:

  1. Up and running on multiple machines.

  2. I ran tests, this time on Ubuntu: Ryzen 1700 (OC to 3.9), 16GB RAM, Samsung 960 EVO SSD.
    I first checked out the pf git tree and copied the config, then for each run did "make clean" followed by "time make -j16 bzImage modules".

    5.2.8 + muqss(0.193) + nohz_idle
    real 11m0.875s
    user 170m58.492s
    sys 10m26.721s

    5.2.8 + bmq(0.98) + nohz_idle
    real 10m51.005s
    user 170m31.756s
    sys 10m31.877s

    5.2.5 + bmq(0.98) + nohz_full
    real 11m1,942s
    user 153m34,910s
    sys 16m58,260s

    5.2.8 + bmq(0.99) + nohz_idle
    real 10m48.307s
    user 152m7.931s
    sys 15m25.326s

    5.2.8 + bmq(0.99) + nohz_full
    real 10m56.546s
    user 152m37.428s
    sys 16m35.317s

    At least on Ubuntu, there is no regression at all, with either the previous or this version of BMQ (nohz_idle or nohz_full).

    On Manjaro I saw a different situation with BMQ 0.98. I assume either the default compilation parameters differ or the running kernel configuration is different enough.
    I still don't know why the same command on the same kernel tree with the same config performs so differently on Ubuntu than on Manjaro (I haven't looked into it much either :)).
    I will try to run compile benchmarks on Manjaro as time allows.

    Thanks Alfred!

    BR, Eduardo

    Replies
    1. Forgot to mention that it's all with the gcc 9.1 compiler on both Manjaro and Ubuntu.

    2. On an i7-7700K there is no real difference between nohz_full and nohz_idle, both with bmq 0.99.
      nohz_full:
      Command being timed: "make -j8 -s"
      User time (seconds): 904.29
      System time (seconds): 80.19
      Percent of CPU this job got: 768%
      Elapsed (wall clock) time (h:mm:ss or m:ss): 2:08.08
      Average shared text size (kbytes): 0
      Average unshared data size (kbytes): 0
      Average stack size (kbytes): 0
      Average total size (kbytes): 0
      Maximum resident set size (kbytes): 230232
      Average resident set size (kbytes): 0
      Major (requiring I/O) page faults: 106
      Minor (reclaiming a frame) page faults: 59808091
      Voluntary context switches: 148710
      Involuntary context switches: 161432
      Swaps: 0
      File system inputs: 0
      File system outputs: 0
      Socket messages sent: 0
      Socket messages received: 0
      Signals delivered: 0
      Page size (bytes): 4096
      Exit status: 0

      nohz_idle:
      Command being timed: "make -j8 -s"
      User time (seconds): 903.96
      System time (seconds): 78.14
      Percent of CPU this job got: 770%
      Elapsed (wall clock) time (h:mm:ss or m:ss): 2:07.44
      Average shared text size (kbytes): 0
      Average unshared data size (kbytes): 0
      Average stack size (kbytes): 0
      Average total size (kbytes): 0
      Maximum resident set size (kbytes): 230060
      Average resident set size (kbytes): 0
      Major (requiring I/O) page faults: 86
      Minor (reclaiming a frame) page faults: 59828057
      Voluntary context switches: 152816
      Involuntary context switches: 141838
      Swaps: 0
      File system inputs: 0
      File system outputs: 0
      Socket messages sent: 0
      Socket messages received: 0
      Signals delivered: 0
      Page size (bytes): 4096
      Exit status: 0

    3. Looks like there is an improvement of around 3 seconds from 0.98 to 0.99 on 5.2.8. That's a positive result. :)

    4. Thanks for your tests on Ryzen!

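For anyone who wants to compare the wall-clock figures posted in this thread, here is a minimal Python sketch. It only converts the two time formats quoted above (bash `time` "real" lines and GNU time "Elapsed" lines) into seconds and prints the change. The helper names `bash_real_seconds`, `gnu_elapsed_seconds` and `report` are made up for illustration and are not part of BMQ or any tool.

```python
import re


def bash_real_seconds(stamp):
    """Convert a bash `time` stamp such as '10m51.005s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:[.,]\d+)?)s", stamp.strip())
    if match is None:
        raise ValueError(f"unrecognised time stamp: {stamp!r}")
    minutes, seconds = match.groups()
    # Some locales print a decimal comma ('11m1,942s'); accept both forms.
    return int(minutes) * 60 + float(seconds.replace(",", "."))


def gnu_elapsed_seconds(stamp):
    """Convert a GNU time 'h:mm:ss' or 'm:ss.xx' stamp into seconds."""
    total = 0.0
    for part in stamp.strip().split(":"):
        total = total * 60 + float(part)
    return total


def report(label, before, after):
    """Print the wall-clock change from `before` to `after` and return it."""
    delta = after - before
    print(f"{label}: {before:.3f}s -> {after:.3f}s "
          f"({delta:+.3f}s, {delta / before:+.2%})")
    return delta


# Ryzen 1700, 5.2.8, nohz_idle: bmq 0.98 vs bmq 0.99 ("real" lines).
ryzen_delta = report("bmq 0.98 -> 0.99",
                     bash_real_seconds("10m51.005s"),
                     bash_real_seconds("10m48.307s"))

# i7-7700K, bmq 0.99: nohz_full vs nohz_idle ("Elapsed" lines).
i7_delta = report("nohz_full -> nohz_idle",
                  gnu_elapsed_seconds("2:08.08"),
                  gnu_elapsed_seconds("2:07.44"))
```

With the numbers posted here, the Ryzen build is about 2.7 seconds faster on 0.99, and the two i7-7700K runs differ by about 0.6 seconds (roughly 0.5%). These are single runs, so small differences may well be noise.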
  3. It seems like I am getting random system freezes after this update. I run pf-kernel which includes bmq, and the commit I built from only included this bmq update.

    Replies
    1. I have cleaned my kernel source dir and compiled the latest pf commit. I've had it running for longer than it was taking to freeze before so it seems fine. Not sure if the problem was pf or if I just needed to build fresh. Either way it looks like it wasn't bmq. I'll post another reply if it ends up crashing again. Thanks for your great work!

  4. Got an I/O stall, leading to a system lockup, while running 5.2.9. Not sure if it's the BMQ patch or just a bug in 5.2.9; I've been running 5.2.8 with the ck patches without issue. I will try with 5.2.8 and see if the issue persists.

  5. Version 0.99 has worked fine for me. No issues whatsoever. However, it does not apply successfully to kernel 5.2.10, released today.

    patching file kernel/sched/cpufreq_schedutil.c
    Hunk #1 succeeded at 180 (offset 4 lines).
    Hunk #2 succeeded at 288 (offset 4 lines).
    Hunk #3 FAILED at 434.
    Hunk #4 succeeded at 681 (offset 5 lines).
    Hunk #5 succeeded at 912 (offset 6 lines).
    Hunk #6 succeeded at 943 (offset 6 lines).
    1 out of 6 hunks FAILED -- saving rejects to file kernel/sched/cpufreq_schedutil.c.rej

    Replies
    1. The fix is trivial, just check what's rejected and adjust the hunk to match the upstream line.

    2. Sorry for the late reply; I've been very busy recently. Please check the updated all-in-one patch at https://gitlab.com/alfredchen/bmq/raw/master/5.2/v5.2.10+_bmq099.patch

      The git tree will not be updated as it's based on 5.2.0. This mainline change is also in 5.3.

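When a patch does not apply cleanly, the first step is finding which hunks were rejected and where. A minimal Python sketch, assuming the output format that `patch` printed in the comment above; `failed_hunks` is a hypothetical helper name, not an existing tool:

```python
import re


def failed_hunks(patch_output):
    """Return {file: [(hunk_number, line_number), ...]} for FAILED hunks."""
    failures = {}
    current = None
    for line in patch_output.splitlines():
        line = line.strip()
        if line.startswith("patching file "):
            # Remember which file the following hunk lines belong to.
            current = line[len("patching file "):]
            continue
        match = re.match(r"Hunk #(\d+) FAILED at (\d+)\.", line)
        if match and current is not None:
            failures.setdefault(current, []).append(
                (int(match.group(1)), int(match.group(2))))
    return failures


output = """\
patching file kernel/sched/cpufreq_schedutil.c
Hunk #1 succeeded at 180 (offset 4 lines).
Hunk #2 succeeded at 288 (offset 4 lines).
Hunk #3 FAILED at 434.
Hunk #4 succeeded at 681 (offset 5 lines).
1 out of 6 hunks FAILED -- saving rejects to file kernel/sched/cpufreq_schedutil.c.rej
"""

for path, hunks in failed_hunks(output).items():
    for number, lineno in hunks:
        print(f"{path}: hunk #{number} rejected near line {lineno}; "
              f"see {path}.rej")
```

The rejected hunk itself is saved in the `.rej` file next to the source file, so that is where to look when adjusting it by hand against the upstream code.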
  6. I decided to give BMQ a whirl with 5.2.11 using the above patch https://gitlab.com/alfredchen/bmq/raw/master/5.2/v5.2.10+_bmq099.patch

    Compared with my MuQSS config, the diff is basically a change from HZ_100 (MuQSS) to HZ_1000 (BMQ). Both are built with CONFIG_NO_HZ_FULL. My MuQSS kernel is built with the full -ck patchset and with CONFIG_RQ_SMT=y.

    I ran Monster Hunter Online Benchmark (d3d11 under Wine 4.14 with DXVK).
    MuQSS: score: 28162, max: 128.6 fps, min: 54.9 fps
    BMQ: score: 21395, max: 97.3 fps, min: 26.3 fps

    I experimented with "yield_type", setting it to 0, and got a slightly better result:
    BMQ: score: 25634, max: 117.2 fps, min: 35 fps

    I also tested "yield_type=2", but that was basically useless: just stutter, not really working.

    Any tips?

    One other odd thing when I used BMQ was dmesg reporting:
    [ 0.158114] MDS: Mitigation: Clear CPU buffers
    [ 0.241327] MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.

    When running with MuQSS, I only get the (correct):
    [ 0.200328] MDS: Mitigation: Clear CPU buffers

    Same microcode and OS, so there should be no reason why the BMQ scheduler would suddenly introduce this?

    I must have made a horrible mistake somewhere in my config, although there is not really much to configure when comparing the two schedulers. Any tips on what to try? The rest of my kernel config is basically the Ubuntu default, although I have disabled AMD processor support and build with -march=native and -O3 (Graysky's patchset).

    Replies
    1. For performance tuning, you can simply try "nice -5" or "nice -10" for the program you are focusing on. The result may depend on which other components your program depends on and what nice level they are running at. It's a system-level design, but for BMQ it's as simple as setting the nice level.

      For MDS, it's the first time I've heard about this; I will find some time to investigate. BTW, what CPU are you using?

    2. Intel 8700K. (6 core/12 threads).

      I run my benchmarks and stuff with "schedtool -n -5 -e wine ./whatever.exe" for nice level -5.

      When using MuQSS I also set "/proc/sys/kernel/interactive = 0" (default is 1).

      I did NOT use this patch when I built, though: https://gitlab.com/alfredchen/bmq/blob/master/5.2/enabled_CGROUP_CPUACCOUNTING.patch
      Is this "required"? (I was somewhat under the impression it was for debugging...)

    3. After reading https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html carefully, I think most Intel CPUs are affected. I also found "MDS CPU bug present and SMT on, data leak possible." in the dmesg of my Intel 8259U system. As I also have SMT enabled, I get the info below:
      cat /sys/devices/system/cpu/vulnerabilities/mds
      Mitigation: Clear CPU buffers; SMT vulnerable
      It's normal and nothing unexpected, IMO.

    4. The enabled_CGROUP_CPUACCOUNTING.patch is not necessary. As for the interactive setting: for gaming it should normally be left at its default, but it's up to you to test.

    5. Interesting... Running with MuQSS (CONFIG_RQ_SMT=y), this reads:
      cat /sys/devices/system/cpu/vulnerabilities/mds
      Mitigation: Clear CPU buffers; SMT disabled

      And according to https://wiki.ubuntu.com/SecurityTeam/KnowledgeBase/MDS, this means:
      The file will contain the following contents for processors that do not support Intel Hyper-Threading or where Hyper-Threading has been disabled:

      $ cat /sys/devices/system/cpu/vulnerabilities/mds
      Mitigation: Clear CPU buffers; SMT disabled

      But running lscpu I get:
      Architecture: x86_64
      CPU op-mode(s): 32-bit, 64-bit
      Byte Order: Little Endian
      CPU(s): 12
      On-line CPU(s) list: 0-11
      Thread(s) per core: 2
      Core(s) per socket: 6
      Socket(s): 1
      NUMA node(s): 1
      Vendor ID: GenuineIntel
      CPU family: 6
      Model: 158
      Model name: Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz

      So that was somewhat weird... I will check what this says when running the default Ubuntu kernel. (And Hyperthreading IS enabled in my BIOS and I HAVE 12 threads running.)

    6. Running the default Ubuntu 5.0.0-26-generic kernel gives this:

      cat /sys/devices/system/cpu/vulnerabilities/mds
      Mitigation: Clear CPU buffers; SMT vulnerable

      [ 0.141535] MDS: Mitigation: Clear CPU buffers
      [ 0.147123] MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.

      So I guess that is normal when actually having HT enabled.

      I will pop that question over at CK's blog then, and see what turns up :)

    7. Given the context, MuQSS might not be of particular interest here. Does BMQ show SMT disabled as well when it's actually not?

    8. Hmm... I thought I had replied to this, but seemingly not.

      Anyway, BMQ shows the same as the Ubuntu default kernel, and since I do NOT disable HT in the BIOS (or on the kernel command line), the correct behavior is:
      Mitigation: Clear CPU buffers; SMT vulnerable

      At least for my Intel CPU (8700K).

    9. Hmm... I obviously didn't put my reading glasses on :) Apologies.

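The MDS status strings compared in this thread all come from one sysfs file, so they are easy to check on any kernel. A minimal Python sketch, assuming the "mitigation; SMT state" layout seen in the outputs above; `parse_mds` and `read_mds` are hypothetical helper names:

```python
from pathlib import Path

# sysfs file quoted in the comments above.
MDS_PATH = Path("/sys/devices/system/cpu/vulnerabilities/mds")


def parse_mds(status):
    """Split an mds status line into (mitigation, smt_state).

    smt_state is None when the line carries no ';'-separated SMT part.
    """
    mitigation, _, smt = status.strip().partition(";")
    return mitigation.strip(), smt.strip() or None


def read_mds(path=MDS_PATH):
    """Read and parse the status from sysfs; None if the file is unreadable."""
    try:
        return parse_mds(path.read_text())
    except OSError:
        return None


# The two strings reported in this thread.
for sample in ("Mitigation: Clear CPU buffers; SMT vulnerable",
               "Mitigation: Clear CPU buffers; SMT disabled"):
    print(parse_mds(sample))

print("this machine:", read_mds())
```

Comparing the SMT part against "Thread(s) per core" from lscpu is a quick way to spot the kind of mismatch discussed above, where the file says "SMT disabled" while 12 threads are online.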
  7. Please check: https://gitlab.com/alfredchen/bmq/issues/8

  8. Hello, I use Gentoo on a 5.2 kernel. I tried your patches for 5.2 and my system hangs for several minutes when I compile (emerge) something. All compilations start with nice 10 and I only use -j 4, since my system is a quad-core Intel. After the freeze, the system operates normally. How can I debug this?
