Comments on Alfred Chen's Blog: Time to have fun with kernel 4.1
Blog author: Alfred Chen (http://www.blogger.com/profile/03164306846702841944)
25 comments, newest first; feed last updated 2024-02-29T00:33:07.382-08:00

2015-07-20T00:39:21.438-07:00
@kernelOfTruth
I have tested btrfs scrub on my production machine (187G size, 83G used, raid0 btrfs setup) and on all other partitions; all btrfs scrubs ran fine without a deadlock.

So I suggest you enable the kernel hacking config options below and see whether any useful log can be captured in dmesg when the deadlock happens.

CONFIG_SCHED_DEBUG
CONFIG_SCHEDSTATS
CONFIG_SCHED_STACK_END_CHECK
CONFIG_TIMER_STATS
CONFIG_PROVE_LOCKING
CONFIG_LOCK_STAT
CONFIG_DEBUG_LOCKDEP

Remember to use the latest -gc branch code; there are 2 fixes for when you enable these configs.
Posted by Alfred Chen

2015-07-19T20:23:08.090-07:00
Oh, o.k., fine!
I'm looking forward to the 4.1-vrq. And I hope to find more time to dig into systemd's/SuSE's internals to safely disable and re-enable the failing services+subprocesses on my running system. Still hoping I won't ever need it with your new release to come. ;-)

Manuel
Posted by Anonymous

2015-07-19T19:16:12.841-07:00
I am reworking the -vrq branch and have enabled some kernel hacking configs to help. While doing this, I hit some issues with those configs enabled, so these two commits are the fixes.
Posted by Alfred Chen

2015-07-19T19:08:57.389-07:00
No, kernelOfTruth, according to the edited message on top, at least two new patches have been added:
https://bitbucket.org/alfredchen/linux-gc/commits/96546670bc617a0d84b78664e9d3baf0f0c00de3?at=linux-4.1.y-gc
and
https://bitbucket.org/alfredchen/linux-gc/commits/b6e4eafcbc1bf5754d5703f80b76d22d53d6b3e6?at=linux-4.1.y-gc

So, we'd better wait for Alfred's answer.

Manuel Krause
Posted by Anonymous

2015-07-19T03:04:44.238-07:00
@Manuel Krause:

Looks like a rebase to me - so no functional changes or new patches
Posted by kernelOfTruth

2015-07-17T21:10:15.221-07:00
Hi, Alfred,
regarding your EDIT from July 17th, I want to thank you very much for your continued in-depth care for the BFS patches and for enhancing them!
As I don't think I understood it correctly, can you please clarify: do the newly added fixes only fix compile-time errors, or also errors in BFS itself?

So far, I can say, it's working well, and, yes, again: Thank you!

Manuel Krause
Posted by Anonymous

2015-07-15T11:20:59.090-07:00
Lol - that was too easy ;)

The change made the system more resilient to the load, but it wasn't enough ...

After roughly an hour it slowly hardlocked (while playing back music from youtube + browsing on github).

The music snippet kept repeating and then eventually the music stopped.

The system didn't respond to the Magic SysRq key :/
Posted by kernelOfTruth

2015-07-15T10:46:16.900-07:00
This was a rather "trivial" fix:

The fix for the hardlock from upstream (ck) was removed by the following commit: https://github.com/cchalpha/linux-gc/commit/911bac7b2fcd8a7ec9d1b82109e77d89cb025c24

After re-adding it, btrfs scrub has so far survived scanning 40 GB of data =)

https://github.com/kernelOfTruth/linux/commit/a9efc3e88854732b724f99767f525a9849beb274
Posted by kernelOfTruth

2015-07-10T11:13:57.669-07:00
cryptsetup with an aes key size of 512,

cryptsetup -y --cipher aes-xts-benbi:sha256 --key-size 512

for / (root), /usr/portage and /bak

after opening the luks container

the partitions are mounted with noatime,nodiratime,compress=lzo

The hardlock hardly ever occurs when scrub is run on / (root) [35 GB size, 19 GB used]
or /usr/portage [9.8 GB size, 4.7 GB used]
both are on an SSD

but pretty instantly and reliably on /bak [3 TB size, 1.9 TB used]
Posted by kernelOfTruth

2015-07-10T02:17:00.391-07:00
Update: I was unable to reproduce this with a btrfs USB partition, so what's your btrfs setup?
Posted by Alfred Chen

2015-07-09T03:22:16.433-07:00
linux-4.1.y-gc from https://github.com/cchalpha/linux-gc/commits/linux-4.1.y-gc

up to commit https://github.com/cchalpha/linux-gc/commit/cd356bf85dbdba7ba7066e20ebc2adc51d38155e was being used

Btrfs changes are always the latest integration or for-linus branches merged against stable from http://git.kernel.org/cgit/linux/kernel/git/mason/linux-btrfs.git
Posted by kernelOfTruth

2015-07-08T19:48:54.674-07:00
What was the BFS code base when you hit the btrfs scrub issue? Pure BFS, or BFS with some of the -gc commits?
When I first learned of this btrfs issue on ck's blog, around the 3.18 or 3.19 time frame, I tested on my machines but could not reproduce it.
Posted by Alfred Chen

2015-07-08T14:08:45.113-07:00
Seems like I'm not the only one:

http://ck-hack.blogspot.com/2015/04/bfs-462-linux-40-ck1.html?showComment=1432135190412#c8301870429764130044

http://ck-hack.blogspot.com/2015/04/bfs-462-linux-40-ck1.html?showComment=1436327448100#c8013470520406022151

Alexander has the exact same issue: when running Btrfs scrub - it crashes
Posted by kernelOfTruth

2015-07-08T13:39:31.821-07:00
There's an issue with current BFS and 4.1.

Today I fired up a

btrfs scrub start /

and it hardlocked

a few days back it did run for a few seconds and then also hardlocked when requesting the status of the running scrub:

btrfs scrub status /bak

This was already an issue a few kernel releases back with BFS, when some small bugs had to be fixed (e.g. my incomplete port to a newer kernel, which overall ran fine but also showed a hardlock when firing up a

btrfs scrub start)

Anyone else experiencing this?

Anyway - I'm back to a kernel with CFS for now - not much time for testing out new stuff

stability is priority no. 1 for now (I also lost enough time [2 days] troubleshooting a QT- & KDE-related non-working desktop :/ )

Hope the info helps to track this down
Posted by kernelOfTruth

2015-07-05T19:25:44.944-07:00
@kernelOfTruth
Got your reply.
Posted by Alfred Chen

2015-07-04T12:00:09.545-07:00
Hi Alfred,

(not sure why, but either my comment is awaiting moderation or the browser or the site ate my reply)

I took a deeper look at the kernel and realized I had forgotten that it included scheduler changes for 4.2 - hence the compilation issue.

Starting with a new base from scratch, it compiled fine with GCC 5.1 and is running great so far

(I did a few rounds of Mass Effect 3 Multiplayer with WINE staging in 1920x1080 - it really has come a long way, the BFS improvements clearly help)

So please ignore that false alarm =)
Posted by kernelOfTruth

2015-07-01T19:33:04.882-07:00
Have you confirmed that this happens only with GCC 5.1? Does an older gcc version work for you?
I used gcc 4.8.x and now use 4.9.2, and never had such a compile issue with these two functions.

BR Alfred
Posted by Alfred Chen

2015-07-01T16:20:03.506-07:00
Hi Alfred,

thanks a lot for your hard work !

Unfortunately it fails with GCC 5.1:

*kernel/sched/bfs.c:687:33: error: implicit declaration of function ‘cpu_sibling_mask’ [-Werror=implicit-function-declaration]
* cpumask_and(res_mask, cpumask, cpu_sibling_mask(cpu))
* ^

*kernel/sched/bfs.c:709:33: error: implicit declaration of function ‘cpu_core_mask’ [-Werror=implicit-function-declaration]
* cpumask_and(res_mask, cpumask, cpu_core_mask(cpu)) ||
* ^

* CC kernel/irq/dummychip.o
* CC arch/x86/kernel/process.o
* CC fs/btrfs/root-tree.o
* CC mm/migrate.o
* CC mm/huge_memory.o
* CC mm/memory-failure.o
*--
* CC kernel/sched/completion.o
* CC kernel/sched/idle.o
* CC kernel/sched/cpupri.o
* CC arch/x86/kernel/check.o
*kernel/sched/bfs.c: In function ‘llc_cpu_check’:
*kernel/sched/bfs.c:687:33: error: implicit declaration of function ‘cpu_sibling_mask’ [-Werror=implicit-function-declaration]
* cpumask_and(res_mask, cpumask, cpu_sibling_mask(cpu))
* ^
*kernel/sched/bfs.c:687:33: warning: passing argument 3 of ‘cpumask_and’ makes pointer from integer without a cast [-Wint-conversion]
*--
* from kernel/sched/bfs.c:31:
*include/linux/cpumask.h:351:19: note: expected ‘const struct cpumask *’ but argument is of type ‘int’
* static inline int cpumask_and(struct cpumask *dstp,
* ^
*kernel/sched/bfs.c: In function ‘nonllc_cpu_check’:
*kernel/sched/bfs.c:709:33: error: implicit declaration of function ‘cpu_core_mask’ [-Werror=implicit-function-declaration]
* cpumask_and(res_mask, cpumask, cpu_core_mask(cpu)) ||
* ^
*kernel/sched/bfs.c:709:33: warning: passing argument 3 of ‘cpumask_and’ makes pointer from integer without a cast [-Wint-conversion]
*--
* from kernel/sched/bfs.c:31:
*include/linux/cpumask.h:351:19: note: expected ‘const struct cpumask *’ but argument is of type ‘int’
* static inline int cpumask_and(struct cpumask *dstp,
* ^
*kernel/sched/bfs.c: In function ‘thread_cpumask’:
*kernel/sched/bfs.c:6930:9: error: implicit declaration of function ‘topology_thread_cpumask’ [-Werror=implicit-function-declaration]
* return topology_thread_cpumask(cpu);
* ^
*kernel/sched/bfs.c:6930:9: warning: return makes pointer from integer without a cast [-Wint-conversion]
*--
* CC arch/x86/kernel/cpu/perf_event_intel.o
* CC security/apparmor/domain.o
* CC security/apparmor/policy.o
* CC security/apparmor/policy_unpack.o
*arch/x86/kernel/cpu/perf_event_intel.c: In function ‘intel_pmu_cpu_starting’:
*arch/x86/kernel/cpu/perf_event_intel.c:2632:7: warning: unused variable ‘h’ [-Wunused-variable]
*--
* CC arch/x86/kernel/amd_gart_64.o
* CC arch/x86/kernel/aperture_64.o
* CC arch/x86/kernel/cpu/perf_event_intel_cqm.o
* CC arch/x86/kernel/cpu/perf_event_intel_pt.o
* CC arch/x86/kernel/cpu/perf_event_intel_bts.o
*cc1: some warnings being treated as errors
* CC arch/x86/kernel/cpu/perf_event_intel_uncore.o
*scripts/Makefile.build:258: recipe for target 'kernel/sched/bfs.o' failed
*make[2]: *** [kernel/sched/bfs.o] Error 1
*scripts/Makefile.build:403: recipe for target 'kernel/sched' failed
*make[1]: *** [kernel/sched] Error 2
*Makefile:946: recipe for target 'kernel' failed
*make: *** [kernel] Error 2
Posted by kernelOfTruth

2015-06-30T20:49:31.951-07:00
Sorry...it was showing double posts after the 'Publish' so I tried to delete
it.
Posted by jwh7 (https://www.blogger.com/profile/09659185315567537391)

2015-06-30T20:44:24.765-07:00
>We found an issue that UP is broken in BFS since kernel 3.18, investigation is going on but I put it in low priority than the kernel 4.1 -vrq branch release.

Thanks Alfred; looking forward to the UP panic getting fixed, so I can remove the SMP workaround from my kernel config!
Posted by jwh7

2015-06-30T20:41:03.328-07:00
This comment has been removed by the author.
Posted by jwh7

2015-06-30T19:42:08.758-07:00
Thanks for testing. I have pushed additional patches to the -gc branch and am still waiting for the new BFQ release.
Posted by Alfred Chen

2015-06-29T11:07:00.041-07:00
I've been running this 4.1-gc kernel for some hours now. I know that 22h plus only 4 suspends to disk is not sufficient testing.
But it seems to work very well so far.

In addition to -gc I have applied
* BFQ for 4.0.0 without any modifications
* Tuxonice for 4.1.0-rc8 (only 1 hunk needed modification)
* my usual patches to get my laptop fan working
* Alfred's "old" patches for cpu optimisations, XOR templates, fast strings

Many thanks for your great work,
best regards,
Manuel Krause
Posted by Anonymous

2015-06-28T19:30:54.650-07:00
I have checked the differences, all are expected. :)
Posted by Alfred Chen

2015-06-28T09:38:28.625-07:00
I ported BFS as well a bit earlier, and it seems that everything is OK. However, I've updated the pf-kernel tree against your changes, and here are some small differences:

https://github.com/pfactum/pf-kernel/commit/34b9112f8ca9175ed714465fd1b6495ddebec5c9

Thanks for your work!
Posted by Oleksandr Natalenko (https://www.blogger.com/profile/12098091624630953604)
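
Alfred Chen's 2015-07-20 comment at the top of this thread lists seven kernel hacking options to enable before trying to reproduce the deadlock. The following is a minimal Python sketch, not part of any kernel tooling, that reports which of those options are not yet enabled. It assumes the usual kernel `.config` text format, where an enabled option appears as `CONFIG_NAME=y`; the function name `missing_options` is hypothetical.

```python
import re

# Option names copied verbatim from the 2015-07-20 comment in this thread.
DEBUG_OPTIONS = [
    "CONFIG_SCHED_DEBUG",
    "CONFIG_SCHEDSTATS",
    "CONFIG_SCHED_STACK_END_CHECK",
    "CONFIG_TIMER_STATS",
    "CONFIG_PROVE_LOCKING",
    "CONFIG_LOCK_STAT",
    "CONFIG_DEBUG_LOCKDEP",
]


def missing_options(config_text, wanted=DEBUG_OPTIONS):
    """Return the options in `wanted` that are not set to 'y' in config_text.

    Hypothetical helper: it only looks for lines of the form CONFIG_NAME=y,
    so options that are absent or marked '# ... is not set' count as missing.
    """
    enabled = set()
    for line in config_text.splitlines():
        match = re.match(r"^(CONFIG_\w+)=y$", line.strip())
        if match:
            enabled.add(match.group(1))
    # Preserve the order of `wanted` so the report matches the list above.
    return [opt for opt in wanted if opt not in enabled]
```

A possible use is `missing_options(open(".config").read())` from the top of a configured kernel source tree; an empty list would mean every option from the comment is already enabled.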