Tuesday, November 4, 2014

What's new for 3.17 gc patch set

During the last kernel release cycle, I played around with some ideas for BFS, such as how to select a task to run, sticky tasks, etc. The results were not as good as expected, but all this experimenting helped me get to know the BFS code better. So there will be no huge changes in the BFS VRQ solution branch; I may need another round to think it all over again. Instead, some changes that can be applied directly on top of the original bfs grq lock solution have been back-ported to the -gc branch.

3e678e6 bfs: sync with mainline sched_setscheduler() logic.
-- A new parameter called sched_attr was introduced in 3.14 for sched_setscheduler() and related functions. BFS is not yet fully in sync with these changes.

f3a98f9 bfs: Refactory online_cpus() checking in try_preempt().
9c72372 bfs: Refactory needs_other_cpu().
-- Two changes in try_preempt().

65f06e8 bfs: priodl to speed up try_preempt().
-- Introduces the task priodl, a 64-bit value whose first 8 bits are the task prio and whose last 56 bits are the higher 56 bits of the task deadline, to speed up the try_preempt() calculation.
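
The packing described above can be sketched roughly as follows. This is only an illustration, not the code from the commit: the helper names make_priodl() and priodl_before() are hypothetical, and it assumes that "first 8 bits" means the most significant byte and that a smaller prio value and an earlier deadline both mean "runs first".

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): put prio in the top 8 bits
 * and the upper 56 bits of the deadline in the remaining 56 bits.
 * The low 8 bits of the deadline are dropped.
 */
static uint64_t make_priodl(uint8_t prio, uint64_t deadline)
{
	return ((uint64_t)prio << 56) | (deadline >> 8);
}

/*
 * One integer comparison orders by prio first, then by deadline.
 * Assumption: a smaller key means the task should run first.
 */
static int priodl_before(uint64_t a, uint64_t b)
{
	return a < b;
}
```

With such a key, a check like try_preempt() would need one 64-bit comparison instead of comparing prio and deadline separately. The trade-off in this sketch is that two deadlines differing only in their low 8 bits compare as equal.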

1fa3bb5 bfs: Full cpumask based and LLC sensitive cpu selection.
-- Rewrites the cpu selection logic for tasks using full cpumask based calculation. The benefit of cpumask based calculation is that the cost does not scale with the number of cpus as long as it stays within a certain range (64 cpus on a 64-bit system and 32 cpus on a 32-bit system). The cost of transferring tasks among cpus which share the same LLC (Last Level Cache) should be considered free.
The best_mask_cpu() cpu selection logic now follows the order below:
* Non scaled same cpu as task originally runs on
* Non scaled SMT of the cpu
* Non scaled cores/threads sharing the last level cache
* Scaled same cpu as task originally runs on
* Scaled SMT of the cpu
* Scaled cores/threads sharing the last level cache
* Non scaled cores within the same physical cpu
* Non scaled cpus/Cores within the local NODE
* Scaled cores within the same physical cpu
* Scaled cpus/Cores within the local NODE
* All cpus available
To implement the full cpumask calculation, non_scaled_cpumask is introduced in the grq structure. The plug-in code in the cpufreq and intel_pstate drivers has also been modified to avoid multiple triggers when scaling down from max cpu freq (the intel_pstate driver change has only passed a compile test; I have no hardware which runs the intel_pstate driver).
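
As a rough illustration of the ordering above, here is a minimal user-space sketch, not the actual best_mask_cpu() from the patch. It assumes at most 64 cpus so a mask fits in one uint64_t, and that "non scaled" means a cpu whose frequency is not currently scaled down. The struct cpu_topo type, its fields, and the function names are hypothetical; __builtin_ctzll is a GCC/Clang builtin.

```c
#include <stdint.h>

typedef uint64_t cpumask64;	/* one bit per cpu, up to 64 cpus */

/* Hypothetical per-cpu topology masks, from nearest to farthest. */
struct cpu_topo {
	cpumask64 self;	/* the cpu the task originally ran on */
	cpumask64 smt;	/* SMT siblings */
	cpumask64 llc;	/* cores/threads sharing the last level cache */
	cpumask64 pkg;	/* cores within the same physical cpu */
	cpumask64 node;	/* cpus within the local NODE */
};

static int first_cpu(cpumask64 m)
{
	return m ? __builtin_ctzll(m) : -1;	/* lowest set bit, or -1 */
}

/* Walk the levels in the order listed above; each step is a few ANDs. */
static int best_mask_cpu_sketch(const struct cpu_topo *t,
				cpumask64 candidates, cpumask64 non_scaled)
{
	const cpumask64 near[3] = { t->self, t->smt, t->llc };
	const cpumask64 far[2] = { t->pkg, t->node };
	const cpumask64 fast = candidates & non_scaled;
	cpumask64 hit;
	int i;

	for (i = 0; i < 3; i++)		/* non scaled: same cpu, SMT, LLC */
		if ((hit = fast & near[i]))
			return first_cpu(hit);
	for (i = 0; i < 3; i++)		/* scaled: same cpu, SMT, LLC */
		if ((hit = candidates & near[i]))
			return first_cpu(hit);
	for (i = 0; i < 2; i++)		/* non scaled: physical cpu, NODE */
		if ((hit = fast & far[i]))
			return first_cpu(hit);
	for (i = 0; i < 2; i++)		/* scaled: physical cpu, NODE */
		if ((hit = candidates & far[i]))
			return first_cpu(hit);
	return first_cpu(candidates);	/* any cpu available */
}
```

The number of mask operations in this sketch depends only on the number of topology levels, not on the number of cpus, which is the property the cpumask based approach is after.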

Here are the test results for 3.17-gc. Compared to vrq-02-baseline-test-result, these patches give another 3~4 seconds of improvement under low workload.

#1  3.17.2-gc 50% tasks/cores ratio
5m18.630s
5m18.509s
5m18.504s
5m18.494s
5m18.489s
5m18.487s
5m18.481s
5m18.461s
5m18.383s
5m18.339s

#2 3.17.2-gc 150% tasks/cores ratio
2m56.475s
2m56.447s
2m56.418s
2m56.410s
2m56.402s
2m56.367s
2m56.349s
2m56.197s
2m56.173s
2m56.128s

#3 3.17.2-gc 100% tasks/cores ratio
2m52.318s
2m50.698s
2m50.654s
2m50.596s
2m50.578s
2m50.543s
2m50.534s
2m50.508s
2m50.437s
2m50.430s

#4 3.17.2-gc 100%+50% tasks/cores ratio
2m52.115s
2m51.246s
2m50.892s
2m50.880s
2m50.847s
2m50.819s
2m50.805s
2m50.804s
2m50.789s
2m50.645s

If you want to try these new patches, please check my linux-3.17.y-gc git branch.

15 comments:

  1. Hi Alfred,
    I've been testing your 3.17 gc-patchset for 2 days now, that is 18 BFS related patches (I hope that I haven't missed one) plus 3 of your collection that I like and, of course, BFQ.
    Working well here, which means that I don't see any disadvantages so far! ;-)
    { To be honest, I had one X server crash 22h ago, but until it happens again, I'd have to blame intel_drv.so only, which showed up in the related Xorg.0.log. }

    I really miss updated TuxOnIce patches for 3.17. And in that sense I really want to thank you very much for your ongoing work on and for BFS,
    best regards,

    Manuel Krause

    Replies
    1. Thanks for testing. I am currently trying some new ideas and hopefully can catch up with the -rc6 release time. :)

  2. Are you planning to update / clean up this branch now that CK has published his 3.17 related BFS-457 patchset?

    Thanks, Manuel

    Replies
    1. I am working on it, but may wait for 3.17.3 and then release it as tag 3.17.3-gc. Stay tuned.

    2. I estimate that it won't take long for 3.17.3 to surface, so a little waiting would make it then easier to distinguish the different releases by tag and as well by the timestamps. Thank you for your continued work!!!
      And of course I'm curious about your mentioned "new ideas" finding a way into new patches. :-)

      Manuel

    3. *lurking around* ;-) Any progress with the update, hopefully including the bfs-458 fix? No need to hurry, TuxOnIce is far more outdated... ^^

      Good luck and best regards, Manuel

    4. Sure. Actually I have rebased on bfs-458 in my local branches. For the -gc branch, I will wait for the 3.17.3 stable release and then make a new tag. For the -vrq branch, I will stick with 3.17.2; since there are changes in both bfs itself and the C compiler version on my test system, I will kick off another round of testing soon.

    5. Now even 3.17.4 is out... Manuel

    6. Hehe, you've updated your gc to the 3.17.3 tag in the meantime. Thank you! I've been using it for some hours now. It applied and compiled fine and no errors, so far.

      Best regards, Manuel

    7. With 3.17.4. Manuel

    8. Maybe you'd need to update the top blog entry? ;-)
      You've updated the -gc branch for 3.17.4 (tag), but have you changed the patches since tag 3.17.3? The latter worked well also for 3.17.4 for me. Meaning: Do I need to fetch the newer patches or not?

      Another couple of questions, now regarding the -vrq branch: Would you consider it safe to apply these 6 (six) -vrq patches on top of the 3.17.3 tagged -gc patches on 3.17.4? I'd like to try it out after some coming 3.18-rc6 debugging days. With as little additional work as possible ;-) BTW, is it still v0.2 or has it advanced in the meantime?

      Best regards and thank you for your work!
      Manuel

    9. what i say, git is bullshit, a "broken out" patch set would be much easier

      just imagine, bfs, BFQ, tuxonice etc., everything would only be available via git...

    10. Just imagine, we would get honored for bug fixing help... just imagine we'd once get paid for gathering git patches... just imagine... The Man in the Moon...
      ;-) Manuel

    11. I have updated the blog; I am busy with debugging. You can have a look.
      I do not recommend -vrq for this release as I am still trying to fix known issues in it.

    12. Thank you for the warning. But I'll stay tuned and would like to test the following new revision of your -vrq patches.

      Keep up your good work, best regards, Manuel
