During the last kernel release cycle, I experimented with some ideas for BFS, such as how to select a task to run, sticky tasks, etc. The results were not as good as expected, but all these attempts helped me understand the BFS code better. So there will be no huge changes in the BFS VRQ solution branch; I may need another round to think it all over. Instead, some changes that can be applied directly on top of the original BFS grq lock solution have been back-ported to the -gc branch.
3e678e6 bfs: sync with mainline sched_setscheduler() logic.
-- A new parameter, sched_attr, was introduced in 3.14 in sched_setscheduler() and related functions. BFS is not fully in sync with these changes yet.
f3a98f9 bfs: Refactory online_cpus() checking in try_preempt().
9c72372 bfs: Refactory needs_other_cpu().
--Two changes in try_preempt().
65f06e8 bfs: priodl to speed up try_preempt().
-- Introduced task priodl, whose first 8 bits are the task prio and whose last 56 bits are the task deadline (its higher 56 bits), to speed up the try_preempt() calculation.
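To illustrate why a combined value speeds up the comparison, here is a minimal sketch, not the actual BFS code. It assumes "first 8 bits" means the most significant 8 bits of a 64-bit value and that a numerically lower value should run first; `make_priodl()`, `priodl_before()` and `priodl_selfcheck()` are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: pack an 8-bit prio into the top 8 bits and the
 * higher 56 bits of a 64-bit deadline into the low 56 bits.
 * The lowest 8 bits of the deadline are dropped (assumed layout).
 */
static inline uint64_t make_priodl(uint8_t prio, uint64_t deadline)
{
	return ((uint64_t)prio << 56) | (deadline >> 8);
}

/*
 * One unsigned compare replaces "compare prio, then compare deadline".
 * Returns non-zero when key a should run before key b.
 */
static inline int priodl_before(uint64_t a, uint64_t b)
{
	return a < b;
}

/* Small usage example: prio dominates, deadline breaks ties. */
static inline void priodl_selfcheck(void)
{
	/* Lower prio value wins even with a later deadline. */
	assert(priodl_before(make_priodl(1, 5000), make_priodl(2, 100)));
	/* Same prio: the earlier deadline wins. */
	assert(priodl_before(make_priodl(2, 100), make_priodl(2, 90000)));
}
```

One consequence of this layout is that two deadlines which differ only in their lowest 8 bits produce the same key, which is the price of fitting both fields into a single 64-bit compare.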
1fa3bb5 bfs: Full cpumask based and LLC sensitive cpu selection.
-- Rewrote the cpu selection logic for tasks to use full cpumask based calculation. The benefit of cpumask based calculation is that its cost does not scale with the number of cpus as long as that number stays within a certain range (64 cpus on a 64-bit system and 32 cpus on a 32-bit system). The cost of transferring tasks among cpus which share the same LLC (Last Level Cache) should be considered free.
The best_mask_cpu() cpu selection logic now follows the order below:
* Non-scaled same cpu as the task originally ran on
* Non-scaled SMT siblings of the cpu
* Non-scaled cores/threads sharing the last level cache
* Scaled same cpu as the task originally ran on
* Scaled SMT siblings of the cpu
* Scaled cores/threads sharing the last level cache
* Non-scaled cores within the same physical cpu
* Non-scaled cpus/cores within the local NODE
* Scaled cores within the same physical cpu
* Scaled cpus/cores within the local NODE
* All cpus available
To implement the full cpumask calculation, non_scaled_cpumask is introduced in the grq structure. The plug-in code in the cpufreq and intel_pstate drivers has also been modified to avoid multiple triggers when scaling down from the max cpu freq (the intel_pstate change has only passed a compile test; I have no hardware that runs the intel_pstate driver).
Here are the test results of 3.17-gc. Compared to vrq-02-baseline-test-result, under low workload these patches give another 3~4 seconds of improvement.
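The selection order above can be sketched with plain 64-bit masks. This is a simplified illustration under stated assumptions, not the real best_mask_cpu(): a `uint64_t` stands in for the kernel's cpumask type (so at most 64 cpus), the tier masks are precomputed for the cpu the task last ran on, and `best_mask_cpu_sketch()`, `scan_tiers()` and the `TIER_*` names are hypothetical. `__builtin_ctzll()` is a GCC/Clang builtin.

```c
#include <assert.h>
#include <stdint.h>

/* Topology tiers around the task's previous cpu, nearest first. */
enum { TIER_SELF, TIER_SMT, TIER_LLC, TIER_PKG, TIER_NODE, NR_TIERS };

/* Return the first cpu in cand that falls in tiers [from, to], or -1. */
static int scan_tiers(const uint64_t *tier, int from, int to, uint64_t cand)
{
	for (int i = from; i <= to; i++) {
		uint64_t hit = tier[i] & cand;	/* one AND per tier */

		if (hit)
			return __builtin_ctzll(hit);	/* lowest set bit */
	}
	return -1;
}

/*
 * candidates: cpus the task may be placed on.
 * non_scaled: cpus currently running at max frequency (assumed meaning
 *             of non_scaled_cpumask).
 */
static int best_mask_cpu_sketch(const uint64_t *tier, uint64_t candidates,
				uint64_t non_scaled)
{
	int cpu;

	/* Within the LLC: non-scaled cpus first, then scaled ones. */
	cpu = scan_tiers(tier, TIER_SELF, TIER_LLC, candidates & non_scaled);
	if (cpu >= 0)
		return cpu;
	cpu = scan_tiers(tier, TIER_SELF, TIER_LLC, candidates);
	if (cpu >= 0)
		return cpu;

	/* Beyond the LLC: same physical cpu, then local node. */
	cpu = scan_tiers(tier, TIER_PKG, TIER_NODE, candidates & non_scaled);
	if (cpu >= 0)
		return cpu;
	cpu = scan_tiers(tier, TIER_PKG, TIER_NODE, candidates);
	if (cpu >= 0)
		return cpu;

	/* Fall back to any candidate cpu at all. */
	return candidates ? __builtin_ctzll(candidates) : -1;
}
```

Each step is a fixed number of word-sized AND operations regardless of how many cpus are set in the masks, which is the property the post describes for systems with up to 64 cpus. Note that a scaled SMT sibling is preferred over a non-scaled cpu outside the LLC, matching the list order.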
#1 3.17.2-gc 50% tasks/cores ratio
5m18.630s
5m18.509s
5m18.504s
5m18.494s
5m18.489s
5m18.487s
5m18.481s
5m18.461s
5m18.383s
5m18.339s
#2 3.17.2-gc 150% tasks/cores ratio
2m56.475s
2m56.447s
2m56.418s
2m56.410s
2m56.402s
2m56.367s
2m56.349s
2m56.197s
2m56.173s
2m56.128s
#3 3.17.2-gc 100% tasks/cores ratio
2m52.318s
2m50.698s
2m50.654s
2m50.596s
2m50.578s
2m50.543s
2m50.534s
2m50.508s
2m50.437s
2m50.430s
#4 3.17.2-gc 100%+50% tasks/cores ratio
2m52.115s
2m51.246s
2m50.892s
2m50.880s
2m50.847s
2m50.819s
2m50.805s
2m50.804s
2m50.789s
2m50.645s
If you want to try these new patches, please check my linux-3.17.y-gc git branch.
Hi Alfred,
I've been testing your 3.17 gc-patchset for 2 days now; that is 18 BFS related patches (I hope that I haven't missed one) plus 3 of your collection that I like and, of course, BFQ.
Working well here. Which means that I don't see any disadvantages so far! ;-)
{ To be honest, I had one X server crash 22h ago, but until it happens again, I'd have to blame intel_drv.so only, what showed up in the related Xorg.0.log. }
I really miss actualised TuxOnIce patches for 3.17. And in that meaning I really want to thank you very much for your ongoing work on and for BFS,
best regards,
Manuel Krause
Thanks for testing. I am currently trying some new ideas, hopefully can catch up with -rc6 release time. :)
Are you planning to update / clean up this branch, now after CK published his 3.17 related BFS-457 patchset?
Thanks, Manuel
I am working on it, but may wait for 3.17.3 then release as tag 3.17.3-gc. Stay tuned.
I estimate that it won't take long for 3.17.3 to surface, so a little waiting would then make it easier to distinguish the different releases by tag as well as by timestamp. Thank you for your continued work!!!
And of course I'm curious about your mentioned "new ideas" finding a way into new patches. :-)
Manuel
*lurking around* ;-) Any efforts with the actualisation hopefully including the bfs-458 fix? No need to hurry, TuxOnIce is far more outdated... ^^
Good luck and best regards, Manuel
Sure. Actually I have rebased on bfs-458 in local branches, for the -gc branch, I will wait for 3.17.3 stable release then make a new tag. For the -vrq branch, I will stick with 3.17.2, since there are some changes in both bfs itself and the c compiler version upgrade in my test system, I will kick off another round testing soon.
Now even 3.17.4 is out... Manuel
Hehe, you've updated your gc to the 3.17.3 tag in the meantime. Thank you! I've been using it for some hours now. It applied and compiled fine, with no errors so far.
Best regards, Manuel
With 3.17.4. Manuel
Maybe you'd need to update the top blog entry? ;-)
You've updated the -gc branch for 3.17.4 (tag), but have you changed the patches since tag 3.17.3? The latter also worked well for 3.17.4 for me. Meaning: do I need to fetch the newer patches or not?
Another couple of questions, now regarding the -vrq branch: Would you consider it safe to apply these 6 (six) -vrq patches on top of the 3.17.3 tagged -gc patches on 3.17.4? I'd like to try it out after some coming 3.18-rc6 debugging days. With as little additional work as possible ;-) BTW, is it still v0.2 or has it advanced in the meantime?
Best regards and thank you for your work!
Manuel
what i say, git is bullshit, a "broken out" patch set would be much easier
just imagine, bfs, BFQ, tuxonice etc., everything would only be available via git...
Just imagine, we would get honored for bug fixing help... just imagine we'd once get paid for gathering git patches... just imagine... The Man in the Moon...
;-) Manuel
I have updated the blog, busy with debugging. You can have a check.
I do not recommend -vrq at this release, as I am still trying to fix known issues in it.
Thank you for the warning. But I'll stay tuned and would like to test the following new revision of your -vrq patches.
Keep up your good work, best regards, Manuel