As the title says, this big commit is 117d783 "bfs: VRQ solution v0.5".
I think the most unstable issues in the previous VRQ release were caused by this, and I believe most known issues (on my machines) have now been fixed. It has been running stably for two weeks, so you are encouraged to give it a try.
Known issue:
BUG: using smp_processor_id() in preemptible code, with a call trace from sys_sched_yield().
There are still a few commits left that I haven't reworked yet. I plan to finish them within two weeks, before the new kernel release and another sync-up cycle begins.
BR Alfred
No problems during ~3h of uptime. :-)
BR, Manuel
Just to "send you some more energy" for your further reworking sessions, here some more POSITIVE feedback:
Kernel 4.1.5 with all these -gc & -vrq patches is now running fine for more than 24h. No issues at all, and nothing related to your mentioned "known issue: BUG: ..." like you've written above. I followed your opinion (and post-factum's as of his 4.1-pf2 announcement) to _not_ use the revert-unplugged-i/o patch and see how far I'd come without it.
This afternoon I've spent some time trying to reproduce the crashes with data transfers to/from external drives. Lacking a flash/SD card, I used my USB stick to copy larger files to and from, and to extract compressed files from and to, but got no problems at all. Not even stuttering video playback during all these processes. BTW, that's a really great point to mention! :-))) Now: maybe the data rates are too low with my dual-core notebook hardware and a USB 2.0 stick? (Approximate overall copying speed to it == 14MiB/s and from it == 28MiB/s; extracting takes much longer, of course.) Oh, and both file systems were not Btrfs (but ext4 and vfat).
After all of that good experience, I'm really convinced of the usefulness and effectiveness of the so-far-published -vrq patches to reduce desktop latency, improve interactivity and kind of "equalize" bottleneck situations, especially regarding high i/o and high cpu and also both together. That also includes my use of /dev/shm and resulting swapping to disk. I'm using the BFQ disk scheduler and TuxOnIce hibernation together with -vrq.
At the moment I don't feel the need to compare with plain&pure BFS/CK at all. And a recent test of the 4.2-rc5 kernel with the CFS scheduler (and without BFQ, too) at least showed how far away the plain kernel is from desirable AND achievable targets regarding low latency.
Dear Alfred, please keep up your wonderful work!
BR Manuel Krause
Hi Alfred,
great news, thanks for your great work!
I've quickly looked (via Google) for a description of VRQ but haven't found a short one:
is it an optimized locking mechanism which improves scalability and performance?
Are there any changes related to SMT_NICE and VRQ?
Will see if I find some time to update to BFS & VRQ within the next days (or even weeks) and give it some good testing.
Thanks!
@kernelOfTruth
Caught you! How is your "btrfs scrub" issue going? CK released a reverse_unplug patch which is similar to your previous "trial patch". I want to know how your issue is going and whether CK's patch resolves it.
Hi Alfred,
I wiped Btrfs from that particular backup drive - so now only ZFS remains as the filesystem for valuable data (Btrfs is still on / [root], /usr/portage and /var/tmp [in RAM]),
scrubs for ZFS and rsync haven't shown any issues so far
So yes that patch resolved it (using BFS 463) :)
Thanks
@kernelOfTruth: Have you already tried to remove Con's patch and use Alfred's modification from
https://bitbucket.org/alfredchen/linux-gc/downloads/sched_submit_work.patch
instead?
I've had it running for three days on top of the -vrq solution (as of before yesterday's updates) without related problems.
BR Manuel Krause
Hi Manuel,
I haven't, but I'm running the latest VRQ 0.5 now,
so that patch is needed to address the heavy I/O lockups?
Oh fun, another round of compiling :P
Thanks
Hi, kernelOfTruth,
I don't know if -vrq would fix the issue by itself. In the other threads (including CK's blog), post-factum has written that with Con's fix his machine worked well (and NOT without it), and he's testing Alfred's new one at the moment (...until something happens). He's using the plain -gc branch, I assume.
With the complete new -vrq as of today +Alfred's patch I have had no issues at all in ~4h of run time.
Maybe you can use & test the above-mentioned patch, in case you ever face the errors you've had in the past while running without Con's patch atm.
I just only wanted to invite you for another testing challenge. Never mind. ;-)
Hi Manuel,
I'm generally interested in testing new and exciting things =)
but I'm currently occupied with tracking down an issue with ZFS, and I've also got some deep scientific research as top priority - so not much additional time ;)
Yes, I had issues on Con's 463 BFS without his patch - so I assume that I also need Alfred's approach with VRQ
@kernelOfTruth
If you still have a similar "unplugged io" issue, try CK's patch first. If it fixes the problem, please try my replacement patch at https://bitbucket.org/alfredchen/linux-gc/downloads/sched_submit_work.patch and see whether it works too.
BR Alfred
@Alfred:
Seems like you're not completely convinced of your approach when you lead kernelOfTruth to test Con's patch first...? If he's too busy -- shouldn't he verify the usefulness of your patch first, especially given that post-factum already confirmed that -gc worked well on his machine with Con's patch?
BR Manuel
Well, I want both patches tested. But given the limited time for testing and having to prioritize these two patches, I'd suggest CK's first, as it's already tested and more likely to fix the issue - although there may be another, better way if there is more time to find it.
O.k., looking forward to the other "better" way, as always... :-))
BTW, the new complete -vrq has now been running fine for ~23h.
BR Manuel
My last message is now superseded by post-factum's posting in the other thread here: Alfred's patch would not heal the issue. :-(
That's weird - I did a full rsync of 2 TiB to a (newly created) Btrfs backup drive
and with https://bitbucket.org/alfredchen/linux-gc/downloads/sched_submit_work.patch applied on top of VRQ 0.5
no hardlocks or error messages observed
meanwhile I mostly modified the rsync settings to
rsync -ai --delete --stats
rsync -aiz --delete --stats
previously it was
rsync -ai -W --inplace --delete --stats
and
rsync -aiz -W --inplace --delete --stats
so that could have been the settings that led to lots of stress in the past
And... If you used the previous rsync commands with the new VRQ kernel, would you see hardlocks/errors again?
Don't know if that test makes sense for you atm.
BR, Manuel
I did try both (the old ones seemed to slow down rsync to some degree so I modified them to the new ones) - and still no hardlock
but I'll see if I find the time to let it run as one big job on the whole partition (hard drive) with that (old) command, to check whether that provokes something.
Usually I let it run split up into several smaller jobs for bigger (sub-)folders to have a better overview (and perhaps a speed-up).
The thing is:
I no longer have that data (or: partition) on Btrfs that seemed to provoke that hardlock;
for some time I ran ZFS on that harddrive but realized that it's better to have at least two different filesystems in case something gets messed up with one,
so I wiped that backup drive and re-formatted it with Btrfs.
The difference now is that the Btrfs code changed and the data is more up-to-date:
The trigger could have been either the Btrfs code or the data on the Btrfs partition.
If the latter was the case it might no longer trigger ...
Ah crap,
so it wasn't rsync (that had been several months or even years back) as the trigger,
but
btrfs scrub
only,
will see if I can let a btrfs scrub run overnight (I need the box right now for work)
Okay,
so I got the NULL pointer dereference error during a scrub of the said partition (close to 2 TiB)
https://github.com/kernelOfTruth/linux/commit/8e178422f08a112172c545d734acbb26d98ca3f6
it occurred pretty quickly, after 1-2 minutes
CONFIG_SCHED_DEBUG
CONFIG_SCHEDSTATS
CONFIG_SCHED_STACK_END_CHECK
CONFIG_TIMER_STATS
CONFIG_PROVE_LOCKING
CONFIG_LOCK_STAT
CONFIG_DEBUG_LOCKDEP
were also set
@kernelOfTruth:
Thank you very much for your additional testing time -- and for providing traces! Let's hope Alfred had a nice weekend and will find time to work out a fix (together with the trace from post-factum).
Best regards,
Manuel
Hope it helps; besides this, VRQ runs quite well (no lockup with ZFS scrub, if I recall correctly)
In the other thread the mount options were requested:
noatime,nodiratime,compress=lzo
the volume is on top of cryptsetup_luks
mkfs.btrfs
btrfs-progs v4.1.2
The Btrfs code is pretty bleeding edge, but that shouldn't make a difference since it also occurred with earlier Btrfs code
@kernelOfTruth
Thanks for testing and providing the trace. I guess sched_submit_work() doesn't work for BFS because BFS uses the grq lock instead of the task locking used in mainline (a combination of the task's pi_lock and rq->lock), so checking tsk_is_pi_blocked(tsk) alone is not enough for BFS. Here is an enhancement patch that adds more checking. Please apply it on top of the gc-branch code and see if it works.
https://bitbucket.org/alfredchen/linux-gc/downloads/sched_submit_work_02.patch
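(For reference only, a rough sketch and not the actual patch: this is roughly what the mainline-style hook looks like around v4.1. Because BFS serializes on the global grq lock rather than mainline's per-task pi_lock + rq->lock, it presumably needs a condition beyond tsk_is_pi_blocked(); bfs_safe_to_flush_plug() below is a purely hypothetical placeholder marking where such additional checking would sit.)

static inline void sched_submit_work(struct task_struct *tsk)
{
	if (!tsk->state || tsk_is_pi_blocked(tsk))
		return;
	/* hypothetical placeholder for the additional BFS-specific checking */
	if (!bfs_safe_to_flush_plug(tsk))
		return;
	/*
	 * If we are going to sleep with plugged I/O queued,
	 * submit it now to avoid deadlocks.
	 */
	if (blk_needs_flush_plug(tsk))
		blk_schedule_flush_plug(tsk);
}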
I assume we shouldn't use the first sched_submit_work.patch any more, right?
Yes. I have removed the sched_submit_work.patch from the download page.
Another possible trigger is to create a stage4 tarball (backing up the system)
via
time (tar -cp / -X /root/scripts/stage4.excl | 7z a -si -t7z -m0=lzma2 -mx=7 -mfb=64 -md=64m -ms=on -mmt=8 /home/system/stage4.t7z)
haven't tested if both that and the scrub lead to a positive (hard)lock
You name this as another possible trigger scenario? But not running it at the same time as btrfs scrub, right?
Some other questions before I try this:
* Your -X exclusion list... Is it a public one or your private one? I don't know what needs to be excluded.
* Is the -mmt value set according to your number of cores?
* Why is it called "stage4"?
Thanks for clarification and best regards,
Anonymous Manuel Krause ;-)
Mmmh. I was just too curious, so I've tried it on my own with a 1.2 GiB directory first.
time (tar -cp /directory/with/1_2GiB/ | 7z a -si -t7z -m0=lzma2 -mx=7 -mfb=64 -md=64m -ms=on -mmt=2 /place/for/1_2GiB.t7z)
No issues here. Also, video playback from a German "Mediathek" in Flash Player within Firefox showed only slight stuttering during that process.
Best regards,
Manuel
Yes, it's another possible trigger scenario,
not concurrently, but separately; there were also certain rsync jobs running, but that doesn't seem to apply here
sure:
/mnt/*
/boot/*
/tmp/*
/proc/*
/home/*
/sys/*
/usb/*
/var/cache/edb/dep/*
/var/cache/squid/*
/var/tmp/*
/media/*
/usr/portage/*
/usr/gentoo/*
There were issues with the restored system when including /dev/* in that list, so I deliberately left it out
Also, I have a separate backup command for /boot, but that doesn't matter for this purpose - it's simply about causing high i/o, cpu and scheduler load.
Yes, -mmt equals the number of cores; afaik it should do that automatically (?), but I remember having had issues in the past without it (less throughput).
It's rooted in Gentoo's stages and backup procedures
http://badpenguins.com/gentoo-build-test/
http://www.gentoo-wiki.info/HOWTO_Custom_Stage4
https://wiki.gentoo.org/wiki/Handbook:AMD64/Installation/Media#What_are_stages_then.3F
stage4 in that case would be a fully installed and configured system :)
stage3 is where you usually start when following the gentoo handbook
Need to add: all involved partitions are ext4. ^^ *MK
No one could ever count on crossposting. But especially on here? ;-)
You've seen that I've done some compressing of ext4 partitions' content without issues. It was only about 1.2 GiB, though.
Thank you for your added info.
Manuel
@kernelOfTruth & @post-factum:
Now it seems to be up to you to prove that the new https://bitbucket.org/alfredchen/linux-gc/downloads/sched_submit_work_02.patch
works for you even on btrfs scrub.
I'm running it on the -vrq branch, btw.
Thank you all for your participation,
Manuel
Will test perhaps at the weekend or earlier,
the lockups would mostly occur with Btrfs,
I haven't used ext4 for a long time so I'm not sure if there are still quirks with it
Crosses fingers that this fixes it =)
I don't see/feel any negative subjective experiences with -vrq and the new patch. Uptime ~9h.
BR Manuel
Compiling and testing sched_submit_work_02.patch, stay tuned.
Stupid Blogger interface?
Where did my post go?
@Alfred:
Great news!
It survived the first 2 minutes and finished without hardlocks (5-6 hours).
Once there are enough changes to the system, I'll attempt another stage4 backup and see whether that hardlocks the system - but I doubt it will :)
Awesome work!
Also:
===
pf@defiant:~ » uptime
16:57:31 up 5:43, 1 user, load average: 3.51, 1.92, 1.17
pf@defiant:~ » sudo btrfs scrub status /
scrub status for 14140a7f-23bc-4dab-b263-f2f46f5d70aa
scrub started at Tue Aug 25 16:55:10 2015 and finished after 00:02:15
total bytes scrubbed: 76.83GiB with 0 errors
===
Still works OK, but uptime is too small, need more time.
Thanks to all of you for testing. While waiting for pf's final confirmation, I'd like to prepare another patch for testing.
BR Alfred
Just had a hardlock during ZFS snapshot send:
Aug 26 00:29:13 morpheus kernel: [69082.418467] INFO: rcu_preempt detected stalls on CPUs/tasks:
Aug 26 00:29:13 morpheus kernel: [69082.418477] 4: (0 ticks this GP) idle=9f9/140000000000000/0 softirq=3923228/3923228 fqs=12328 last_accelerate: f53f/85c8, nonlazy_posted: 0, L.
Aug 26 00:29:13 morpheus kernel: [69082.418481] 5: (1 GPs behind) idle=8c7/140000000000001/0 softirq=2298621/2298622 fqs=12328 last_accelerate: f53f/85c8, nonlazy_posted: 0, L.
Aug 26 00:29:13 morpheus kernel: [69082.418482] (detected by 3, t=37002 jiffies, g=1688364, c=1688363, q=13497)
Aug 26 00:29:13 morpheus kernel: [69082.418485] Task dump for CPU 4:
Aug 26 00:29:13 morpheus kernel: [69082.418486] irq/23-ehci_hcd R running task 0 353 2 0x00000008
Aug 26 00:29:13 morpheus kernel: [69082.418488] ffffffff81e796ae ffffffff81e7b192 0000000000000003 ffff8807f9850000
Aug 26 00:29:13 morpheus kernel: [69082.418490] ffff8800cf1a0000 ffff8800cf19fd68 ffff8807f4b2cf00 ffff8807f4e40800
Aug 26 00:29:13 morpheus kernel: [69082.418492] ffff8807f4e40800 ffff8800cf1a0000 ffffffff8114d640 ffff8800cf19fd88
Aug 26 00:29:13 morpheus kernel: [69082.418494] Call Trace:
Aug 26 00:29:13 morpheus kernel: [69082.418508] [] ? __schedule+0x11ae/0x2c60
Aug 26 00:29:13 morpheus kernel: [69082.418510] [] ? schedule+0x32/0xc0
Aug 26 00:29:13 morpheus kernel: [69082.418513] [] ? irq_thread_fn+0x40/0x40
Aug 26 00:29:13 morpheus kernel: [69082.418516] [] ? usb_hcd_irq+0x21/0x40
Aug 26 00:29:13 morpheus kernel: [69082.418517] [] ? irq_forced_thread_fn+0x2e/0x70
Aug 26 00:29:13 morpheus kernel: [69082.418519] [] ? irq_thread+0x13f/0x170
Aug 26 00:29:13 morpheus kernel: [69082.418520] [] ? wake_threads_waitq+0x30/0x30
Aug 26 00:29:13 morpheus kernel: [69082.418521] [] ? irq_thread_dtor+0xb0/0xb0
Aug 26 00:29:13 morpheus kernel: [69082.418524] [] ? kthread+0xf2/0x110
Aug 26 00:29:13 morpheus kernel: [69082.418528] [] ? sched_clock+0x9/0x10
Aug 26 00:29:13 morpheus kernel: [69082.418530] [] ? kthread_create_on_node+0x2f0/0x2f0
Aug 26 00:29:13 morpheus kernel: [69082.418532] [] ? ret_from_fork+0x42/0x70
Aug 26 00:29:13 morpheus kernel: [69082.418533] [] ? kthread_create_on_node+0x2f0/0x2f0
Aug 26 00:29:13 morpheus kernel: [69082.418534] Task dump for CPU 5:
Aug 26 00:29:13 morpheus kernel: [69082.418535] irq/33-xhci_hcd R running task 0 840 2 0x00000008
Aug 26 00:29:13 morpheus kernel: [69082.418537] 0000000000000003 ffff88066ef1eb80 ffff8800be358000 00000000f9852300
Aug 26 00:29:13 morpheus kernel: [69082.418539] 00000000296b0ad0 ffff8807f5593d68 ffff8807f550d100 ffff8807f51c5a00
Aug 26 00:29:13 morpheus kernel: [69082.418541] ffff8807f51c5a00 ffff8807f50d4600 ffffffff8114d640 ffff8807f5593d88
Aug 26 00:29:13 morpheus kernel: [69082.418543] Call Trace:
Aug 26 00:29:13 morpheus kernel: [69082.418544] [] ? irq_thread_fn+0x40/0x40
Aug 26 00:29:13 morpheus kernel: [69082.418557] [] ? xhci_msi_irq+0xc/0x10 [xhci_hcd]
Aug 26 00:29:13 morpheus kernel: [69082.418558] [] ? irq_forced_thread_fn+0x2e/0x70
Aug 26 00:29:13 morpheus kernel: [69082.418559] [] ? irq_thread+0x13f/0x170
Aug 26 00:29:13 morpheus kernel: [69082.418561] [] ? wake_threads_waitq+0x30/0x30
Aug 26 00:29:13 morpheus kernel: [69082.418562] [] ? irq_thread_dtor+0xb0/0xb0
Aug 26 00:29:13 morpheus kernel: [69082.418563] [] ? kthread+0xf2/0x110
Aug 26 00:29:13 morpheus kernel: [69082.418565] [] ? sched_clock+0x9/0x10
Aug 26 00:29:13 morpheus kernel: [69082.418567] [] ? kthread_create_on_node+0x2f0/0x2f0
Aug 26 00:29:13 morpheus kernel: [69082.418568] [] ? ret_from_fork+0x42/0x70
Aug 26 00:29:13 morpheus kernel: [69082.418570] [] ? kthread_create_on_node+0x2f0/0x2f0
Aug 26 00:32:17 morpheus kernel: [ 0.000000] Initializing cgroup subsys cpuset
Looks like it's most likely not related to the scheduler, no?
@kernelOfTruth
Most likely not. But I'm sure it's not the unplugged_io issue we are tracing.
I think, no bad news from post-factum is good news? Isn't it?
What about the new patch you mentioned on August 25, 2015 at 8:20 AM -- or are you still investigating whether kernelOfTruth's traces may be scheduler-related or not?
BR Manuel
This comment has been removed by the author.
> no bad news from post-factum is good news? Isn't it?
Oh, jerk off with that :/. As if I bring bad news only.
Anyway, second patch still works OK for me.
@post-factum:
Sorry, you've definitely got me wrong. I meant: as long as we don't get lockup messages from your side, everything seems good for the time you've been testing up to now. Longer, but more precise.
I didn't intend to say that you're only bringing bad news.
I really appreciate your work and testing time and would never want to be impolite to you,
Best regards,
Manuel
Maybe I also misused the word "bad". I just see the other side of the coin, too: even "bad" news, i.e. reports of failures, is "good" news -- as it will lead to fixes, sooner or later, for our beloved Linux operating system.
Best regards,
Manuel
I'm still investigating the unplugged_io patch and trying to improve it. As for kernelOfTruth's new ZFS trace, I believe the rcu_preempt check most likely just happens at schedule time, so it's hard to tell whether it's a scheduler issue.
For the next patch for testing, I currently think preemption should be disabled around the additional checking, but that may impact performance, so I need a benchmark to see how it goes. I'll start a new post once it is done. This one is growing long and off-topic :)
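(Rough illustration only, not the real patch: if the additional checking needs per-CPU or run queue state - e.g. anything going through smp_processor_id() - it has to sit inside a preempt_disable()/preempt_enable() pair, and every call into schedule() then pays for that, which is why a benchmark is needed. bfs_needs_io_flush() below is a hypothetical placeholder for that check.)

static inline bool bfs_check_io_flush(struct task_struct *tsk)
{
	bool need_flush;

	/* keep the task on this CPU while we inspect per-CPU/run queue state */
	preempt_disable();
	need_flush = !tsk_is_pi_blocked(tsk) &&
		     bfs_needs_io_flush(tsk, smp_processor_id());
	preempt_enable();

	return need_flush;
}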
@Manuel, take it easy :).
@Alfred, more patches to test are coming?
Just wrote a new post about the issue. In short: no new patches for testing, the last one seems good.
@Alfred:
most likely related to threadirqs (as I expected), I got another hardlock during an attempt to transfer ZFS snapshots (around 400 GiB out of 2 TiB - so I have to start over again XD )
this time without threadirqs
related thread: https://lkml.org/lkml/2013/12/31/144 [3.13 <= rc6. Using USB 2.0 devices is braking the system when using "threadirqs" kernel optio]
@kernelOfTruth:
Although Alfred already said this thread is getting off-topic... here's some new off-topic comment ;-)
I'm also using the threadirqs kernel command line option and have not seen direct(!) negative effects. This refers to my postings, especially regarding my tests from August 24th onwards. These involved a USB 2.0 stick drive (FAT-formatted for compatibility reasons; friends^^).
Have you been able to finish the transfer without the "threadirqs" option successfully?
(The lkml thread is... some kind of... old? Do you think it's still relevant for the issue? Honest question.)
BTW, I'm still searching for "something" (a driver, setting, patch, e.g.) responsible for TuxOnIce being unreliable sometimes. What I've seen is that reliability got much better with a) kernel 4.1 up to 4.1.6, b) Alfred's -gc enhancements, and equal or better with c) the -vrq patches' add-ons. The -vrq-patched kernel fails really rarely, but when it did fail - once in ~one week with ~21 hibernations - the TuxOnIce image was gone.
Best regards,
Manuel Krause
@Manuel:
not sure where your post went: yes, that change "fixed" it for me,
@Alfred:
to calm your mind: the lockup I experienced during the ZFS send (twice) is not scheduler related - well, it appears to be to some extent, but the focus lies on other system parts (RCU, IRQs, hardware, drivers, etc.)
so it's not caused by BFS or your BFS changes :)
Thanks !
Mmmh. I've written a comment in the long thread above last night, but can't see it. And the comment count increased by 1 then (and by 2 by now). Can't see the reply. Strange interface.
BR Manuel
Aaaahhh. O.k. Forget my posting. I've just noticed the "Load more" switch at the very bottom of the page.