There are no remarkable items this time; I have put all of them into commits, if I remember correctly.
Please check them on Bitbucket or GitHub.
PS: I recently got a Google Chromebook Pixel (2013), so I can do some testing with SMT once I have set up the system on it.
BR Alfred Chen
Edit: We found that UP has been broken in BFS since kernel 3.18. The investigation is ongoing, but I have given it lower priority than the kernel 4.1 -vrq branch release.
Edit (Jul 17): Updated the -gc branch, rebased onto kernel v4.1.2, with fixes for compile errors that occur when some kernel hacking configs are enabled.
9654667 bfs: [Fix] Fix undeclared sched_domains_mutex.
b6e4eaf bfs: [Fix] Fix wrong rcu_dereference_check() usage.
I have done a force update on the linux-4.1.y-gc branch, so if you have fetched it before, please delete the remote branch in your git repository and re-fetch it.
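A minimal sketch of the re-fetch step described above. The helper name refetch_branch and the remote name origin are assumptions for illustration; the post does not say what the remote is called in your clone.

```shell
# Hypothetical helper: discard the stale remote-tracking ref of a
# force-updated branch and fetch it again.
# $1 = remote name (assumed, e.g. "origin"), $2 = branch name.
refetch_branch() {
    remote="$1"
    branch="$2"
    # Drop the old remote-tracking ref; ignore the error if it is absent.
    git branch -r -d "$remote/$branch" 2>/dev/null || true
    # Fetch the rewritten branch; the leading "+" permits a non-fast-forward update.
    git fetch "$remote" "+refs/heads/$branch:refs/remotes/$remote/$branch"
}

# Example (not run here): refetch_branch origin linux-4.1.y-gc
```

A local branch that was created from the old history is not touched by this; it would still have to be moved by hand, for example with git checkout -B linux-4.1.y-gc origin/linux-4.1.y-gc, which discards local commits on that branch.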
I've ported BFS as well a bit earlier, and it seems that everything is OK. However, I've updated the pf-kernel tree against your changes, and here are some small differences:
https://github.com/pfactum/pf-kernel/commit/34b9112f8ca9175ed714465fd1b6495ddebec5c9
Thanks for your work!
I have checked the differences, all are expected. :)
I've been running this 4.1-gc kernel for some hours now. I know that 22h plus only 4 suspends to disk is not sufficient testing, but it seems to work very well so far.
In addition to -gc I have applied
* BFQ for 4.0.0 without any modifications
* Tuxonice for 4.1.0-rc8 (only 1 hunk needed modifying)
* my usual patches to get my laptop fan working
* Alfred's "old" patches for cpu optimisations, XOR templates, fast strings
Many thanks for your great work,
best regards,
Manuel Krause
Thanks for testing. I have pushed additional patches to the -gc branch and am still waiting for the new BFQ release.
This comment has been removed by the author.
Sorry...it was showing double posts after the 'Publish' so I tried to delete it.
>We found an issue that UP is broken in BFS since kernel 3.18, investigation is going on but I put it in low priority than the kernel 4.1 -vrq branch release.
Thanks Alfred; looking forward to the UP panic getting fixed, so I can remove the SMP workaround from my kernel config!
Hi Alfred,
thanks a lot for your hard work!
Unfortunately it fails with GCC 5.1:
*kernel/sched/bfs.c:687:33: error: implicit declaration of function ‘cpu_sibling_mask’ [-Werror=implicit-function-declaration]
* cpumask_and(res_mask, cpumask, cpu_sibling_mask(cpu))
* ^
*kernel/sched/bfs.c:709:33: error: implicit declaration of function ‘cpu_core_mask’ [-Werror=implicit-function-declaration]
* cpumask_and(res_mask, cpumask, cpu_core_mask(cpu)) ||
* ^
* CC kernel/irq/dummychip.o
* CC arch/x86/kernel/process.o
* CC fs/btrfs/root-tree.o
* CC mm/migrate.o
* CC mm/huge_memory.o
* CC mm/memory-failure.o
*--
* CC kernel/sched/completion.o
* CC kernel/sched/idle.o
* CC kernel/sched/cpupri.o
* CC arch/x86/kernel/check.o
*kernel/sched/bfs.c: In function ‘llc_cpu_check’:
*kernel/sched/bfs.c:687:33: error: implicit declaration of function ‘cpu_sibling_mask’ [-Werror=implicit-function-declaration]
* cpumask_and(res_mask, cpumask, cpu_sibling_mask(cpu))
* ^
*kernel/sched/bfs.c:687:33: warning: passing argument 3 of ‘cpumask_and’ makes pointer from integer without a cast [-Wint-conversion]
*--
* from kernel/sched/bfs.c:31:
*include/linux/cpumask.h:351:19: note: expected ‘const struct cpumask *’ but argument is of type ‘int’
* static inline int cpumask_and(struct cpumask *dstp,
* ^
*kernel/sched/bfs.c: In function ‘nonllc_cpu_check’:
*kernel/sched/bfs.c:709:33: error: implicit declaration of function ‘cpu_core_mask’ [-Werror=implicit-function-declaration]
* cpumask_and(res_mask, cpumask, cpu_core_mask(cpu)) ||
* ^
*kernel/sched/bfs.c:709:33: warning: passing argument 3 of ‘cpumask_and’ makes pointer from integer without a cast [-Wint-conversion]
*--
* from kernel/sched/bfs.c:31:
*include/linux/cpumask.h:351:19: note: expected ‘const struct cpumask *’ but argument is of type ‘int’
* static inline int cpumask_and(struct cpumask *dstp,
* ^
*kernel/sched/bfs.c: In function ‘thread_cpumask’:
*kernel/sched/bfs.c:6930:9: error: implicit declaration of function ‘topology_thread_cpumask’ [-Werror=implicit-function-declaration]
* return topology_thread_cpumask(cpu);
* ^
*kernel/sched/bfs.c:6930:9: warning: return makes pointer from integer without a cast [-Wint-conversion]
*--
* CC arch/x86/kernel/cpu/perf_event_intel.o
* CC security/apparmor/domain.o
* CC security/apparmor/policy.o
* CC security/apparmor/policy_unpack.o
*arch/x86/kernel/cpu/perf_event_intel.c: In function ‘intel_pmu_cpu_starting’:
*arch/x86/kernel/cpu/perf_event_intel.c:2632:7: warning: unused variable ‘h’ [-Wunused-variable]
*--
* CC arch/x86/kernel/amd_gart_64.o
* CC arch/x86/kernel/aperture_64.o
* CC arch/x86/kernel/cpu/perf_event_intel_cqm.o
* CC arch/x86/kernel/cpu/perf_event_intel_pt.o
* CC arch/x86/kernel/cpu/perf_event_intel_bts.o
*cc1: some warnings being treated as errors
* CC arch/x86/kernel/cpu/perf_event_intel_uncore.o
*scripts/Makefile.build:258: recipe for target 'kernel/sched/bfs.o' failed
*make[2]: *** [kernel/sched/bfs.o] Error 1
*scripts/Makefile.build:403: recipe for target 'kernel/sched' failed
*make[1]: *** [kernel/sched] Error 2
*Makefile:946: recipe for target 'kernel' failed
*make: *** [kernel] Error 2
Have you confirmed that this happens only with GCC 5.1? Does an older GCC version work for you?
I used GCC 4.8.x and now use 4.9.2, and have never had such a compile issue with these two functions.
BR Alfred
Hi Alfred,
(not sure why but either my comment is awaiting moderation or the browser or the site ate my reply)
I took a deeper look at the kernel and realized that I had forgotten that it included scheduler changes for 4.2 - therefore the compilation issue.
Starting with a new base from scratch - it compiled fine with GCC 5.1 and is running great so far
(I did a few rounds of Mass Effect 3 Multiplayer with WINE staging in 1920x1080 - it really has come a long way, the BFS improvements clearly help)
So please ignore that false alarm =)
@kernelOfTruth
Got your reply.
There's an issue with current BFS and 4.1.
Today I fired up a
btrfs scrub start /
and it hardlocked
a few days back it did run for a few seconds and then also hardlocked when requesting the status of the running scrub:
btrfs scrub status /bak
This was already an issue with BFS a few kernel releases back, when some small bugs had to be fixed (e.g. my incomplete port to a newer kernel, which overall ran fine but also showed a hardlock when
firing up a
btrfs scrub start)
Anyone else experiencing this ?
Anyway - I'm back to a kernel with CFS for now - not much time for testing out new stuff
stability is priority no.1 for now (also lost enough time [2 days] with troubleshooting QT & KDE-related non-working desktop :/ )
Hope the info helps to track this down
Seems like I'm not the only one:
http://ck-hack.blogspot.com/2015/04/bfs-462-linux-40-ck1.html?showComment=1432135190412#c8301870429764130044
http://ck-hack.blogspot.com/2015/04/bfs-462-linux-40-ck1.html?showComment=1436327448100#c8013470520406022151
Alexander has the exact same issue: when running Btrfs scrub - it crashes
What is the BFS code base when you hit the btrfs scrub issue? Pure BFS, or with some of the -gc commits?
When I first learned of this btrfs issue on ck's blog, around the 3.18 or 3.19 time frame, I tested on my machines but could not reproduce it.
linux-4.1.y-gc from https://github.com/cchalpha/linux-gc/commits/linux-4.1.y-gc
up to commit https://github.com/cchalpha/linux-gc/commit/cd356bf85dbdba7ba7066e20ebc2adc51d38155e was being used
Btrfs changes are always latest integration or for-linus branches merged against stable from http://git.kernel.org/cgit/linux/kernel/git/mason/linux-btrfs.git
Update: I was unable to reproduce it with a btrfs USB partition, so what is your btrfs setup?
cryptsetup with an AES key size of 512,
cryptsetup -y --cipher aes-xts-benbi:sha256 --key-size 512
for / (root), /usr/portage and /bak
after opening the luks Container
the partitions are mounted with noatime,nodiratime,compress=lzo
The hardlock hardly occurs when run on / (root) [35 GB size, 19 GB used]
or /usr/portage [9.8 GB size, 4.7 GB used]
both are on an SSD
but pretty instantly and reliably on /bak [3 TB size, 1.9 TB used]
This was a rather "trivial" fix:
The fix for the hardlock from upstream (ck) was removed by the following commit: https://github.com/cchalpha/linux-gc/commit/911bac7b2fcd8a7ec9d1b82109e77d89cb025c24
re-adding it and btrfs scrub so far has survived scanning 40 GB of data =)
https://github.com/kernelOfTruth/linux/commit/a9efc3e88854732b724f99767f525a9849beb274
Lol - that was too easy ;)
the change made the system more resilient to the load but it wasn't enough ...
after roughly an hour it slowly hardlocked (while playing back music from youtube + browsing on github)
the music snippet kept repeating and then eventually music stopped.
The system didn't respond to Magic SYSRQ key :/
@kernelOfTruth
I have tested btrfs scrub on my production machine (a raid0 btrfs setup, 187G size, 83G used) and on all other partitions; all btrfs scrub runs finished fine without deadlock.
So I suggest you enable the kernel hacking configs below and see whether any useful log can be captured in dmesg when the deadlock happens.
CONFIG_SCHED_DEBUG
CONFIG_SCHEDSTATS
CONFIG_SCHED_STACK_END_CHECK
CONFIG_TIMER_STATS
CONFIG_PROVE_LOCKING
CONFIG_LOCK_STAT
CONFIG_DEBUG_LOCKDEP
Remember to use the latest -gc branch code; it contains 2 fixes that are needed when you enable these configs.
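One way to switch on the options listed above is a config fragment, sketched here under assumptions: the file name bfs-debug.config is hypothetical, and some of these options depend on others (for example CONFIG_DEBUG_KERNEL), which the final make step is expected to resolve.

```shell
# Write the seven options from the list above into a config fragment.
# The file name bfs-debug.config is a hypothetical choice.
cat > bfs-debug.config <<'EOF'
CONFIG_SCHED_DEBUG=y
CONFIG_SCHEDSTATS=y
CONFIG_SCHED_STACK_END_CHECK=y
CONFIG_TIMER_STATS=y
CONFIG_PROVE_LOCKING=y
CONFIG_LOCK_STAT=y
CONFIG_DEBUG_LOCKDEP=y
EOF

# From the top of the kernel source tree, the fragment could then be merged
# into an existing .config and the dependencies resolved, for example:
#   scripts/kconfig/merge_config.sh -m .config bfs-debug.config
#   make olddefconfig
```

After rebuilding and booting, it is worth checking the resulting .config, since an option whose dependencies are not met is silently dropped.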
Hi, Alfred,
regarding your EDIT from July 17th, I want to thank you very much for your continued in-depth care for the BFS patches + enhancing them!
As I think that I don't get it correctly, can you please clarify: Do the newly added fixes only fix compile-time errors or also errors in the BFS?
So far, I can say, it's working well, and, yes, again: Thank you!
Manuel Krause
I am reworking the -vrq branch and enabled some kernel hacking configs to help. While doing so, I ran into some issues with those configs enabled, so these two commits are the fixes.
Oh, o.k., fine!
I'm looking forward to the 4.1-vrq. And I hope to find more time to dig into systemd's /SuSE internals to safely disable and reenable the failing services+subprocesses on my running system. Still hoping I won't ever need it with your new release to come. ;-)
Manuel
@Manuel Krause:
Looks like a rebase to me - so no functional changes or new patches
No, kernelOfTruth, according to the edited message on top, at least two new patches have been added:
https://bitbucket.org/alfredchen/linux-gc/commits/96546670bc617a0d84b78664e9d3baf0f0c00de3?at=linux-4.1.y-gc
and
https://bitbucket.org/alfredchen/linux-gc/commits/b6e4eafcbc1bf5754d5703f80b76d22d53d6b3e6?at=linux-4.1.y-gc
So, we'd better wait for Alfred's answer.
Manuel Krause