Thursday, November 10, 2016

VRQ 0.89a released

VRQ 0.89a is released. It mainly fixes the imbalanced CPU usage issue: all CPUs can now run at 100%, and no issues have been found in daily usage.

There are also other code changes and refinements in the scheduler core, but I am afraid they are not very visible.

For this release, sanity kernel compilation tests also show a noticeable improvement over the previous vrq release at 100%~300% workload. At 50% workload, a small regression was found on the 4-core system. A debug patch already brings it back to the level of the previous vrq, but I'd like it to be well tested on other systems before officially committing it.

I am happy with the sanity test results. They show that this new release of vrq is already better than the previous one, and there are several performance improvement ideas not yet implemented.

Have fun with this release and look forward to the next. :)

BR Alfred

PS: The code is available at
https://bitbucket.org/alfredchen/linux-gc/commits/branch/linux-4.8.y-test
and also
https://github.com/cchalpha/linux-gc/commits/linux-4.8.y-test

4 comments:

  1. @Alfred:
    Could you please consider posting an all-in-one patch for this release as well?

    Thanks in advance,
    BR, Manuel Krause

  2. @Alfred,

    I have finally tested this version. The Unigine results are in the usual place: https://docs.google.com/spreadsheets/d/1EayezAsGlJdXjZbS3b9m7YtvtRF-DJ3xrT3hYCvfymQ/edit?usp=sharing

    A couple of things: I used the same config file for MuQSS and VRQ, but this VRQ release still does not allow my i7 to go into any sleep state deeper than C1. When I first wrote my e-mail, none of the cores did; now it seems rather random. Sometimes no cores reach C7, sometimes two do, sometimes one, but never all of them. So this is still an issue. The same config works fine with MuQSS.

    For comparison.
    VRQ:
    Core [core-id]  Actual Freq (Mult.)  C0%   Halt(C1)%  C3%  C6%  C7%   Temp  VCore
    Core 1 [0]      1384.42 (13.88x)     3.47  97.7       0    0    0     56    0.8656
    Core 2 [2]      1392.30 (13.96x)     6.95  95.4       0    0    0     57    0.8706
    Core 3 [4]      1427.95 (14.31x)     1     99.5       0    0    0     60    0.8706
    Core 4 [6]      1377.51 (13.81x)     3.75  97.5       0    0    0     60    0.8706

    MUX:
    Core [core-id]  Actual Freq (Mult.)  C0%   Halt(C1)%  C3%  C6%  C7%   Temp  VCore
    Core 1 [0]      1286.53 (12.90x)     5.63  12.5       1    0    83.1  55    0.8756
    Core 2 [2]      1385.62 (13.89x)     1.7   6.14       1    0    91.7  55    0.8756
    Core 3 [4]      1577.86 (15.82x)     2.04  2.3        1    0    95.2  56    0.8756
    Core 4 [6]      1336.10 (13.39x)     1.63  4.74       1    0    93.2  55    0.8756
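
Per-core idle-state figures like the ones above can be cross-checked against the kernel's own cpuidle counters in sysfs. The following is only a minimal sketch: the function name `cpuidle_report` and its optional root-directory argument are made up for illustration, and the state names reported by the cpuidle driver (for example C1 or C7) are software state requests, so they need not match hardware residency percentages exactly.

```shell
#!/bin/sh
# Hypothetical helper: list, for every CPU and every cpuidle state, the
# number of times the state was entered and the total time spent in it.
# The optional first argument overrides the sysfs root (useful for testing).
cpuidle_report() {
    root="${1:-/sys/devices/system/cpu}"
    for state in "$root"/cpu[0-9]*/cpuidle/state[0-9]*; do
        # Skip the unexpanded glob when no cpuidle directories exist.
        [ -r "$state/name" ] || continue
        cpu=${state%/cpuidle/*}   # strip the "/cpuidle/stateN" suffix
        cpu=${cpu##*/}            # keep only the "cpuN" component
        printf '%s %s usage=%s time_us=%s\n' \
            "$cpu" "$(cat "$state/name")" \
            "$(cat "$state/usage")" "$(cat "$state/time")"
    done
}
```

Running `cpuidle_report` once before and once after an idle period and comparing the `time_us` values per state gives a rough picture of which cores actually reach the deeper states.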

    The good news is that D3 does not stutter :)
    Thanks and keep up the good work; the Unigine results seem quite promising ;)

    Some questions, if I may...
    So you are going with a multiqueue design as well, right? What is your prediction of how it will influence interactivity?
    What are your recommended HZ values?
    Will you support a fully tickless kernel?

    Br, Eduardo

    Replies
    1. @Manuel
      Here is the all-in-one patch you asked for: https://bitbucket.org/alfredchen/linux-gc/downloads/v4.8_vrq089a.patch

      If you want to generate the vrq patch yourself, run "git diff xxxx > vrq_089a.patch", where xxxx is the hash of the commit just before the first bfs/vrq related commit.
      Use "git log --oneline" to find the commit hash; xxxx should currently be --> 6dd0b0c Turn BFQ-v7r11 into BFQ-v8r4 for 4.8.0
      ...
      605adc6 bfs: bfs472-skiplist.patch
      2d1d0ab bfs: 0480 skiplists.patch
      4804150 bfs: [Fix] 0472 SMT_NICE undefine crash fix
      0d2453a The Brain Fuck Scheduler v0.472 by Con Kolivas.
      6dd0b0c Turn BFQ-v7r11 into BFQ-v8r4 for 4.8.0
      c58a890 block, bfq: add Early Queue Merge (EQM) to BFQ-v7r11, to port to 4.8.0
      ...

      Then you can apply the all-in-one vrq_089a.patch to your own kernel tree.
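
The steps described above could be wrapped in a small helper, sketched here under a few assumptions: the function name `make_vrq_patch` is hypothetical, and the diff is taken against `HEAD` rather than the working tree, which gives the same result as the quoted command only when the checkout has no uncommitted changes.

```shell
#!/bin/sh
# Hypothetical helper: write every change made after commit $1 (the commit
# just before the first bfs/vrq commit) up to HEAD into the patch file $2.
make_vrq_patch() {
    base="$1"
    out="$2"
    git diff "$base" HEAD > "$out"
}

# Example, run inside a checkout of the linux-4.8.y-test branch:
#   make_vrq_patch 6dd0b0c vrq_089a.patch
```

In the target kernel tree, `git apply --check vrq_089a.patch` reports whether the patch would apply cleanly without changing any files, so conflicts show up before anything is modified.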

      BR Alfred

    2. @Eduardo
      From a quick check of two of my systems, cpu 0 is less likely to enter states deeper than C1. It looks like an issue, but as it doesn't break the system and I have no idea what is causing it, I may need to look back at the previous -vrq and -bfs code, so I'd like to put it at medium priority. When I finish the scheduler core improvements, I will work on it.

      Q&A session:
      Yes. As the sanity tests on 0.89a show very impressive results, vrq will keep the per-CPU run queue design. The previous vrq was also a semi-multiqueue design: it had an rq lock and a preempt task, but not a real queue data structure for tasks.

      The lock strategy change in __schedule() and the TTWU code path, together with the O(1) task lookup, should help with interactivity. I haven't checked the interactivity of 0.89a yet, and task policy fairness among run queues still needs to be adjusted.

      1000HZ, as the old bfs suggested, because there are no HZ code changes in the current -vrq and there is no plan to introduce such changes before the scheduler core changes have settled down.

      Same reason as above: one thing at a time. No new features (including fully tickless) will be added until the scheduler core changes are done (vrq 0.9).

      BR Alfred