Comments on Alfred Chen's Blog: PDS 0.99f release

Anonymous (Manuel), 2018-12-17 08:11:
@Alfred:
Many thanks for sacrificing your time to explain and clarify this topic for me.
So my conclusion is to just leave it at NR_CPUS=4, which has worked for years and didn't show error messages.
I hope it won't produce additional overhead with the SMT & SMT_NICE .config settings enabled... Let me know if you are aware of any.

Thanks and BR,
Manuel
Alfred Chen, 2018-12-15 06:59:
@Manuel
All looks good. Either 2 or 4 should work for PDS.

Anonymous (Manuel), 2018-12-14 12:35:
@Alfred:
Yes, I've read @Oleksandr's posts and your replies. But I remain clueless.

My affinity check masks:

For NR_CPUS=4:
[ 0.319876] smp: Brought up 1 node, 2 CPUs
[ 0.319876] smpboot: Max logical packages: 2
[ 0.319876] smpboot: Total of 2 processors activated (9043.75 BogoMIPS)
[ 0.320866] pds: cpu #0 affinity check mask - coregroup 0x00000002
[ 0.320868] pds: cpu #1 affinity check mask - coregroup 0x00000001

For NR_CPUS=2:
[ 0.319171] smp: Brought up 1 node, 2 CPUs
[ 0.319171] smpboot: Max logical packages: 1
[ 0.319171] smpboot: Total of 2 processors activated (9043.30 BogoMIPS)
[ 0.320153] pds: cpu #0 affinity check mask - coregroup 0x00000002
[ 0.320156] pds: cpu #1 affinity check mask - coregroup 0x00000001

I absolutely don't know how your algorithm determines the above, and I don't understand the related code base.
And neither do I know what benefit my system gets from two additional hotplug CPUs that are not being recognized?!

BR, Manuel
Alfred Chen, 2018-12-13 06:12:
@Manuel
pf just asked about the affinity check mask in the 0.99g thread; you can find the explanation there. And you can paste your affinity check mask log here.

Anonymous (Manuel), 2018-12-12 02:37:
@Alfred:
Many thanks for your added expertise!

I've given the "limited" NR_CPUS=2 only one day of testing, but it didn't reveal any advantage or disadvantage in my normal use. Maybe I should test it a little longer.
According to docs from Intel, the CPU itself ships with the "ht" flag (also shown by /proc/cpuinfo), but the chipset "officially" doesn't make use of it(?).
My Core2Duo series (Penryn comparables) was also shipped in quad-core versions, or in versions with dedicated cores turned off, depending e.g. on mobile power-saving use etc.

I'm not sure about the consequences of 2 real cores plus 2 additional populated cores, the hotplug ones. Do you imagine any?

In some dmesg logs found in my web search on the Core2Duo, I've often read about the "4 CPUs, 2 hotplug CPUs" (btw., also with far higher allowed NR_CPUS). So maybe it's not an issue with (only/only my) BIOS.

Would I see different "pds: cpu #? affinity check mask - coregroup 0x*" lines if the hotplug CPUs were accepted 'correctly'?

Thanks for your guidance, TIA and BR,
Manuel
Alfred Chen, 2018-12-11 17:42:
@Manuel
I don't think the P8400 has SMT capability. But your BIOS seems to have some problem, reporting 4 CPUs available. Anyway, both 2 and 4 should work for you; to be safe, 4 is a good choice.

Anonymous (Manuel), 2018-12-11 06:12:
@Alfred and @Anonymous:
Thank you both very much for teaching me the right understanding of what's going on with single-threaded applications.

Regarding my journey with NR_CPUS:
It seems, at least for the kernel, that NR_CPUS=4 is the most error-free for my Core2Duo. The web search on this is astonishingly poor in the number of results, useful ones ~0.

I don't understand the differences around the added "hotplug CPUs", "pcpu-alloc" and "Max logical packages" when cross-testing NR_CPUS=2 vs. 4.
Let me add a summary of the relevant dmesg log lines for both cases at the bottom of this post. Can you please have a look at it to check whether it's correct (especially regarding PDS' affinity check)? What advantage do the 2 added hotplug CPUs have?

TIA and BR,
Manuel

-> Summary:
----------- NR_CPUS=2 -----------
[0.071150] smpboot: 4 Processors exceeds NR_CPUS limit of 2
[0.071151] smpboot: Allowing 2 CPUs, 0 hotplug CPUs
[0.180112] setup_percpu: NR_CPUS:2 nr_cpumask_bits:2 nr_cpu_ids:2 nr_node_ids:1
[0.181395] percpu: Embedded 42 pages/cpu @(____ptrval____) s133336 r8192 d30504 u1048576
[0.181404] pcpu-alloc: s133336 r8192 d30504 u1048576 alloc=1*2097152
[0.181406] pcpu-alloc: [0] 0 1
[0.295502] pds: PDS-mq CPU Scheduler 0.99g by Alfred Chen.
[0.303137] smpboot: CPU0: Intel(R) Core(TM)2 Duo CPU P8400 @ 2.26GHz (family: 0x6, model: 0x17, stepping: 0x6)
[0.319171] smpboot: Max logical packages: 1
[0.319171] smpboot: Total of 2 processors activated (9043.30 BogoMIPS)
[0.320153] pds: cpu #0 affinity check mask - coregroup 0x00000002
[0.320156] pds: cpu #1 affinity check mask - coregroup 0x00000001

----------- NR_CPUS=4 -----------
[0.071133] smpboot: Allowing 4 CPUs, 2 hotplug CPUs
[0.180662] setup_percpu: NR_CPUS:4 nr_cpumask_bits:4 nr_cpu_ids:4 nr_node_ids:1
[0.182086] percpu: Embedded 42 pages/cpu @(____ptrval____) s133336 r8192 d30504 u524288
[0.182095] pcpu-alloc: s133336 r8192 d30504 u524288 alloc=1*2097152
[0.182096] pcpu-alloc: [0] 0 1 2 3
[0.296187] pds: PDS-mq CPU Scheduler 0.99g by Alfred Chen.
[0.303841] smpboot: CPU0: Intel(R) Core(TM)2 Duo CPU P8400 @ 2.26GHz (family: 0x6, model: 0x17, stepping: 0x6)
[0.319876] smpboot: Max logical packages: 2
[0.319876] smpboot: Total of 2 processors activated (9043.75 BogoMIPS)
[0.320866] pds: cpu #0 affinity check mask - coregroup 0x00000002
[0.320868] pds: cpu #1 affinity check mask - coregroup 0x00000001
<-
Anonymous, 2018-12-11 01:27:
Yeah.
Due to core migration, both CPUs will be at 50% max. for a single-threaded task, and not more.

Alfred Chen, 2018-12-10 22:21:
@Manuel
One thread can only attach to (occupy/use) one CPU at a time. So one thread can't take more than 100% of one CPU's time.

For your NR_CPUS issue, I think your system is trying to populate 4 CPUs but you limit it to 2. You can try to google your CPU type, see how many cores/threads it has, and set that in NR_CPUS.
Anonymous (Manuel), 2018-12-10 06:56:
@Alfred and @Anonymous:
Thank you very much for your quick reply.
Maybe I've misused the word "wishful" instead of "wished"?!

I wished that the single thread would also reach 100% across both cores when nothing else inhibits it.
Please tell me if I've generally misunderstood the single-thread distribution over the two cores.
For the current kernel (4.19.8 + PDS 0.99g) I've changed my .config to only use NR_CPUS=2 (vs. the former 4), which may be correct somehow, but somehow not... The CPU supports SMT/HT but the chipset forbids it. With this new kernel, a kernel "make -j1" remains at ~50% load on each core.
Oh, and with the newly reduced NR_CPUS I get an error message:
[ 0.071150] smpboot: 4 Processors exceeds NR_CPUS limit of 2

But what would now be the correct setting??

@Alfred:
A question aside: what NR_CPUS do you set in your server's .config?

Many thanks in advance for any helpful and clarifying answer!

TIA,
Manuel

Anonymous, 2018-12-09 22:37:
I don't get it either.

Alfred Chen, 2018-12-09 19:06:
@Manuel
Maybe I don't get what the non-wishful behaviour on your dual-core is. A single thread that "uses 50% of each CPU core's power", adding up to ~"100%", is normal behaviour to me. And my Celeron server, which should have a similar topology to a dual-core, runs well without unexpected behaviour here.

Anonymous (Manuel), 2018-12-09 09:34:
@Alfred:
I'm seeing non-wishful behaviour on my dual-core: non-multi-threaded applications only use 50% of each CPU core's power. Currently I'm not running any WCG or similar clients in the background. I first noticed the issue with heavy FreeCAD calculation operations, and a kernel make -j1 also shows that only ~50% is utilized. This testing is still from the 4.19.6 kernel with PDS 0.99f.
I'm not convinced that this is the desired behaviour.

TIA and BR,
Manuel

Micron, 2018-12-09 08:21:
Anyways, I use both and they work fine.
Sveinar Søpler, 2018-12-09 04:41:
@Micron
I think you are supposed to use both of these patches:
https://gitlab.com/alfredchen/linux-pds/commit/55fdf19c03c121144717c95e9b0b177cf1cb883b
and
https://gitlab.com/alfredchen/linux-pds/commit/72da04f0d96e12bb20beaa33a980f7fd279110eb

Until 0.99g comes out anyway :)

Micron, 2018-12-09 03:24:
After checking, this patch fixes the problem:

https://github.com/hhoffstaette/kernel-patches/blob/4.19/4.19/pds-20181206-make-sched_smt_present-track-topology.patch

Micron, 2018-12-09 03:05:
With kernel 4.19.6 it builds OK without problems and boots fine.
Micron, 2018-12-09 03:05:
Hi,
but I have another bug with the latest kernel 4.19.8:

/usr/bin/ld: arch/x86/kernel/cpu/bugs.o: in function `arch_smt_update.part.2':
bugs.c:(.text+0x183): undefined reference to `sched_smt_present'
/usr/bin/ld: bugs.c:(.text+0x1c8): undefined reference to `sched_smt_present'
Makefile:1032: recipe for target 'vmlinux' failed
make[3]: *** [vmlinux] Error 1
make[3]: Leaving directory '/usr/src/linux-4.19.8'

Oleksandr Natalenko, 2018-12-08 06:25:
kthx :-)

Alfred Chen, 2018-12-08 06:02:
@pf
Thanks for the feedback. It looks like this is the correct fix with the minimal code change. It has been pushed to git at

https://gitlab.com/alfredchen/linux-pds/commit/72da04f0d96e12bb20beaa33a980f7fd279110eb

I'd like to do the 0.99g PDS release early next week.

Oleksandr Natalenko, 2018-12-08 04:19:
The fix is also confirmed on a physical machine of one of my users.
I'...VM is not panicking with this one. Thanks.<br /><br />I'll be able to test things on a real machine in 4 days since I'm travelling.Oleksandr Natalenkohttps://www.blogger.com/profile/12098091624630953604noreply@blogger.comtag:blogger.com,1999:blog-2963790426029213933.post-63464458867432123282018-12-07T19:18:00.099-08:002018-12-07T19:18:00.099-08:00@pf
Pls try this patch.
diff --git a/kernel/sched...@pf<br />Pls try this patch.<br /><br />diff --git a/kernel/sched/pds.c b/kernel/sched/pds.c<br />index cddf591a1603..5a5ad448f23c 100644<br />--- a/kernel/sched/pds.c<br />+++ b/kernel/sched/pds.c<br />@@ -619,7 +619,7 @@ static inline void update_sched_rq_queued_masks(struct rq *rq)<br /> return;<br /> <br /> #ifdef CONFIG_SCHED_SMT<br />- if (~0 == per_cpu(sched_sibling_cpu, cpu))<br />+ if (cpu == per_cpu(sched_sibling_cpu, cpu))<br /> return;<br /> <br /> if (SCHED_RQ_EMPTY == last_level) {<br />@@ -6103,7 +6103,7 @@ void __init sched_init(void)<br /> rq->queued_level = SCHED_RQ_EMPTY;<br /> rq->pending_level = SCHED_RQ_EMPTY;<br /> #ifdef CONFIG_SCHED_SMT<br />- per_cpu(sched_sibling_cpu, i) = ~0;<br />+ per_cpu(sched_sibling_cpu, i) = i;<br /> rq->active_balance = 0;<br /> #endif<br /> #endif<br />Alfred Chenhttps://www.blogger.com/profile/03164306846702841944noreply@blogger.comtag:blogger.com,1999:blog-2963790426029213933.post-38241314604147682612018-12-07T16:38:46.536-08:002018-12-07T16:38:46.536-08:00Indeed, it is a issue related to #3 code change in...Indeed, it is a issue related to #3 code change in 0.99f, not related to the new new introduced "sched_smt_present" in new commit for 4.19.7. Working on a quick fix.Alfred Chenhttps://www.blogger.com/profile/03164306846702841944noreply@blogger.comtag:blogger.com,1999:blog-2963790426029213933.post-43365053344290215432018-12-07T08:53:15.591-08:002018-12-07T08:53:15.591-08:00Read carefully please, the issue is not limited to...Read carefully please, the issue is not limited to QEMU. The statement about CFS was also presented previously (hint: it works).<br /><br />+1 report from the physical machine of one of my users ("nosmt" did the trick as well as a temporary workaround).Oleksandr Natalenkohttps://www.blogger.com/profile/12098091624630953604noreply@blogger.com