Monday, March 23, 2015

3.19.y-gc branch updates

Yes, the -gc branch was due for an update, but I personally wanted to wait for the v3.19.2 stable release and rebase on it, so this update is late.
Meanwhile, the -new branch that briefly appeared in git, which I used to sync my git tree between desktop and notebook, caused some trouble for friends who tried it. This reminds me to publish code carefully.

There are no new commits in the -gc branch, just minor bug/comment fixes. I have rewritten some commit titles and tagged them with [Fix]/[Sync], so it is easier to tell what each commit is for.

And I have done one *make no more friend* thing: I deleted the original linux-3.19.y-gc branch and created a new one with the same name, because I don't want to keep the old merge commit info in the branch. If anyone knows a cleaner way to do this, please let me know.
So you may have trouble pulling this new linux-3.19.y-gc branch, because your local copy differs from it and the old history no longer exists on the "server" side. Simply delete the old remote linux-3.19.y-gc branch in your git tree and pull again, and you will be fine.
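
One possible way to recover a local tree is sketched below. Note that it swaps in a different technique from the "delete and pull again" route: it force-fetches the recreated branch and resets the local branch to it. The helper name `resync_rewritten_branch` and the remote name `origin` are assumptions, and any local commits on that branch are discarded, so treat it as a sketch rather than the recommended procedure.

```shell
# Hypothetical helper (not from the original post): re-sync a local branch
# after the remote branch was deleted and recreated under the same name.
# WARNING: local commits on that branch are discarded.
resync_rewritten_branch() {
    remote="$1"
    branch="$2"
    # The leading "+" forces the remote-tracking ref to move even though
    # the new history is not a fast-forward of the old one.
    git fetch "$remote" "+refs/heads/$branch:refs/remotes/$remote/$branch" || return 1
    # Create the local branch, or reset it if it already exists, so that it
    # points at the freshly fetched remote branch, then check it out.
    git checkout -B "$branch" "$remote/$branch"
}

# Usage (the remote name "origin" is an assumption):
# resync_rewritten_branch origin linux-3.19.y-gc
```

If you carry your own patches on top of the old branch, export them first (for example with `git format-patch`) before resetting.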

Branch url:
https://bitbucket.org/alfredchen/linux-gc/commits/branch/linux-3.19.y-gc

17 comments:

  1. Hi, Alfred!
    Thank you for the "official update". :-) It's up and running well, so far. :-)))

    The only reason I tested the former -gc-new patches was that they were the first version of your patches after Con updated BFS/CK to v0.461 for kernel 3.19.
    I don't understand the "*make no more friend* thing" sentence.
    My workflow with your newly published patches is always to fetch and read each of them manually and apply them to the most recently published+installed openSUSE vanilla kernel source .rpm. Maybe not the fastest workflow, indeed. But I consider it some kind of "failure resistant".

    I have a question regarding your refreshed y-gc branch: While you claim that your patches are only syncs, fixes and enhancements to Con's BFS, do you introduce any tunable variables in the code you add?

    Thank you for the information and best regards, and keep up the good work!

    Manuel


    Replies
    1. BTW, Alfred, the refreshed -gc actually is your old -gc-new tree for 3.19, only without _all_ the VRQ patches.
      I've just re-checked that there is no code diff between the 15 current BFS/CK related patches and the former ones.

      Manuel Krause

    2. Just to be clear, the 15 first former patches are the same as the current 15.
      M*

    3. Yes. The -gc commits are the same first 15 commits from the -new branch. They are the most usable/valuable/stable changes. As for the VRQ patches, I am still working on something new for them.

      I don't see any tunable in the current code changes. So the only tunable variable is rr (rr_interval), as provided by BFS, and I haven't touched it.

    4. I wasn't aware that you had a blog where you wrote about the kernel changes & releases

      great work !

      Testing now the refreshed -gc, thanks

      the old -gc-new tree had no stability problems - so I doubt I'll run into anything unusual

  2. O.k. Then we may have a problem somewhere. I don't know where to begin to describe the problem most efficiently or even completely. So please ask in case of doubts.

    Involved: Kernel 3.19.2+gc-patches+BFQ. Firefox with two windows, 1 loaded with many open tabs, 1 playing a video stream via flash-plugin; 4GB RAM; 10GB swapfile on second disk; 4GB /dev/shm; recoding a 750MB .avi movie with avidemux in smart copy mode residing on /dev/shm to /dev/shm/otherfilename. Plus, I had worldcommunitygrid client running most probably at SCHED_BATCH.

    Problematic behaviour: During and after the 1-minute recode of the .avi video, the flash stream in FF stopped rendering video frames entirely (the sound kept playing without stuttering), the normal+system load dropped from about 53% to ~2% (watching gkrellm), the notebook's disks were very busy (judging by the notebook's disk LED), presumably due to swapping, and the FF windows did not update their content when clicked on.
    This situation persisted for many minutes, no matter which _other_ windows I clicked to update their content (which worked for them, but I thought: maybe something important for FF is in swap and blocking it?).

    My only working solution: Stopped the worldcommunitygrid client(s). -- And the system recovered within less than half a minute. Firefox + flash streamed video playback and all other desktop responsiveness came back.

    As I use the described workflow very often, and I haven't had these issues with the plain -ck1 patch applied, please, add your ideas and thoughts regarding your -gc patches in this context!

    Best regards,
    Manuel

    Replies
    1. The last kernel with your patches had an uptime of over 27h before it produced this misbehaviour; now I have even reproduced it with plain -ck1 within 1h. So, most probably 3.19.2 has an issue, and NOT YOUR OR ck's patches. Or only the combination with BFS 461. ... Clueless. :-(
      Another idea is, that the most recent flashplayer for linux is broken.

    2. Hi Manuel,

      During that incident or behavior, what do

      free -m

      and

      cat /proc/meminfo

      say?

      Do you use a composited window/desktop manager? Switching to non-compositing should help in cutting latencies down.

      What GPU driver?

    3. O.k. atm I would not be able to reproduce it, as I reduced tabs in the first open firefox window from 158 to 146, yesterday. Don't tell me this is too much. And I also changed from 3.19.2 to .3 and updated the graphics .rpms today, too.

      So, atm it doesn't make sense to include the memory info.

      The graphics driver from lspci: {
      00:02.0 VGA compatible controller: Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller (rev 07)
      Kernel driver in use: i915
      }
      from Xorg.0.log: {
      [ 36.418] (II) Loading /usr/lib64/xorg/modules/drivers/intel_drv.so
      [ 36.419] (II) Module intel: vendor="X.Org Foundation"
      [ 36.419] compiled for 1.17.99, module version = 2.99.917
      [ 36.419] Module class: X.Org Video Driver
      [ 36.419] ABI class: X.Org Video Driver, version 19.0
      ...
      [ 36.865] (II) intel(0): Using Kernel Mode Setting driver: i915, version 1.6.0 20141121
      [ 36.865] (--) intel(0): Integrated Graphics Chipset: Intel(R) GM45
      }

      I have compositing enabled in xorg.conf but definitely disabled all possible "effects" in my KDE and oxygen settings.

      But one maybe important thing remains: From time to time since years with BFS I do face this issue and it has to do with firefox' memory management and kernel not dealing well with that. I remember having had this issue many times before. (Former only solution: reduce FF tab number. What still works.)

      Do you as developers see any reason for this ONE process firefox to be able to be blocked by the system? Side info: The flash-plugin is running within the firefox thread on here. Some months ago I got rid of the (opensuse) plugin-wrapper, as it produced too much overhead/ latency.

      Please, let me know of any ideas you have, and ask for additional info you may need, best regards,

      Manuel Krause

      P.S.: I do set some parameters manually:
      ## opensuse default == 10:
      echo 4 > /proc/sys/vm/dirty_background_ratio
      ## opensuse default == 20:
      echo 9 > /proc/sys/vm/dirty_ratio
      ## opensuse default == 60:
      echo 70 > /proc/sys/vm/swappiness
      ## log2 of the number of pages read from swap at once: 1: 2 pages, 2: 4 pages, opensuse default == 3: 8 pages, 4: 16 pages, 5: 32 pages:
      echo 5 > /proc/sys/vm/page-cluster
      ## opensuse default == 0: (0 = autoselect, 1 = always overcommit, 2 = never overcommit):
      echo 2 > /proc/sys/vm/overcommit_memory

      If you see any issue with these settings, please comment!
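
A note on page-cluster, since it comes up repeatedly in this thread: the value is a base-2 logarithm of the number of pages read from swap in one attempt, not a size in kB. The small helper below is a sketch for converting a setting into kB. The function name `swap_readahead_kb` is made up for illustration, and the 4 kB default page size is an assumption (typical on x86; `getconf PAGESIZE` shows the real value).

```shell
# Hypothetical helper: convert a vm.page-cluster value into the amount of
# data (in kB) the kernel may read from swap in a single attempt.
swap_readahead_kb() {
    page_cluster="$1"
    page_kb="${2:-4}"   # assumption: 4 kB pages unless told otherwise
    # page-cluster N means up to 2^N consecutive pages per swap read.
    echo $(( (1 << page_cluster) * page_kb ))
}

swap_readahead_kb 3   # default value: 8 pages
swap_readahead_kb 5   # the value used above: 32 pages
```

Under the 4 kB assumption, the default of 3 corresponds to 32 kB per read and a setting of 5 to 128 kB.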

  3. Bottlenecks/latency, thoughts:

    - zswap -> enable lz4 compression here; append to boot
    zswap.enabled=1 zswap.compressor=lz4 zswap.max_pool_percent=50
    - swap -> try zram & lz4 compression + swap
    - Intel(R) GM45 - shared memory: reduce memory usage in Bios (?), or don't use anything too demanding on GPU
    - KDE is rather heavy and/or slow, consider either switching the window manager (kwin) or the desktop
    - consider appending threadirqs to boot - if you prefer to go lower latency & the possibility of using rtirq: https://bbs.archlinux.org/viewtopic.php?pid=1002264#p1002264
    - use the "nice" command or "schedtool" to prioritize things
    - Firefox is using an sqlite variant which uses fsync heavily, not sure if it's that much improved under heavy load - but should according to: https://bugzilla.mozilla.org/show_bug.cgi?id=421482
    - if it swaps - it has to do it in bursts and large amounts - this might lead to some thrashing but after a short while the system stabilizes: I'm using
    echo 12 > /proc/sys/vm/page-cluster
    that setting worked best from experience
    the notebook harddrive(s) might not like that too much - on the desktop it's ok
    - how about /proc/sys/vm/vfs_cache_pressure ? keep that at 1000 or 100000
    - at what setting is /proc/sys/kernel/rr_interval ?, lowering the value might reduce performance but also latency

    - if you're running your box/laptop full throttle or high load all the time - consider using the performance governor (contrary to belief it still clocks the gpu down once it's not needed anymore)
    - why overcommit_memory at 2 ? does everything work for you ? ; any instabilities with 0 ?


    - especially if you're using lots of instances or programs with similar data - uksm might prove pretty beneficial

    if there should occur any instabilities in the future consider tweaking:
    /proc/sys/vm/min_free_kbytes
    and
    /proc/sys/vm/mmap_min_addr


    There's probably more to mention

    but that should be enough for now

    Hope that helps
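
Before changing any of the tunables listed above, it may help to record their current values so they can be restored later. The following is only a sketch: the helper name `show_tunable` is made up, the zswap paths assume zswap is built into the running kernel, and rr_interval only exists on BFS kernels, so missing entries are reported rather than treated as errors.

```shell
# Hypothetical helper: print the current value of a /proc or /sys tunable,
# or report that it cannot be read on this kernel.
show_tunable() {
    if value=$(cat "$1" 2>/dev/null); then
        printf '%s = %s\n' "$1" "$value"
    else
        printf '%s: not available\n' "$1"
    fi
}

# Tunables mentioned in this thread; which ones exist depends on the kernel.
for path in \
    /proc/sys/vm/page-cluster \
    /proc/sys/vm/vfs_cache_pressure \
    /proc/sys/vm/swappiness \
    /proc/sys/vm/overcommit_memory \
    /proc/sys/vm/min_free_kbytes \
    /proc/sys/kernel/rr_interval \
    /sys/module/zswap/parameters/enabled \
    /sys/module/zswap/parameters/compressor \
    /sys/module/zswap/parameters/max_pool_percent
do
    show_tunable "$path"
done
```

Redirecting the output to a file gives a simple baseline to compare against after tuning.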

    Replies
    1. @kernelOfTruth
      Great reference. As my normal usage style is not heavy, I don't tune the system much.

      @Manuel
      Sorry, I can't help with your case; your usage seems too complicated to me.

      PS: as you are using GM45, you should know that 3.19 brings a nice improvement (maybe related to drm). mpv can now smoothly play full-HD mkv files using opengl that used to cause frame drops before 3.19.

    2. @kernelOfTruth:
      Thank you very much for your comprehensive advice. I've already adjusted some of my settings in the direction you proposed, maybe not as aggressively as you do. But the latency seems to have decreased already for my usage. Need to give it more uptime + testing:
      * /proc/sys/vm/page-cluster from former 5 to 8 (default 3)
      * /proc/sys/vm/vfs_cache_pressure from default 100 to 1000
      * /proc/sys/kernel/rr_interval from default 6 to 5
      * /proc/sys/vm/overcommit_memory left at default 0 (former 2==NO!)
      * kernel cmdline append: threadirqs

      Atm I don't want to change the DE or window manager, as I've been used to it for ages now. For my usual usage, plugged into a power supply, I already use the performance governor. Fiddling around with "nice" and "schedtool" is a no-go for me. I know from my old unicore machine that much trial-and-error work led to very little effect. Regarding the GM45 gfx -- I've specially reserved 256M in xorg.conf for it -- that only means 6.35% of my 4G of available RAM, so why reduce it. Also, I doubt that this setting really matters at all nowadays. Damned shared memory^^. Not the best solution at all...
      And regarding uksm, I can say, that the last time I've tested it, the information it provided upon my usage pattern showed only low benefit: very low amount of similar data to be merged over time.

      Two questions remain for me for now:
      * Do you really recommend zswap / zram? I disabled these together with frontswap as I suspected them to cause unneeded overhead some releases ago (in the KISS principle manner). Now, re-reading the kernel xconfig help, it also suggests, that I may have been wrong?
      * Does the "threadirqs" kernel cmdline append do a job itself -- or would I need to use the "rtirq" script on top as well?

      Thank you very much, again, for your hints, and best regards,

      Manuel Krause

      P.S.: Some weeks ago, you've had these Frederic Weisbecker scheduling related patches (also for BFS/CK) in your repository. Atm, I cannot find them. Would you, please, be so kind to point me to them?

    3. I am using zram with lzo. IMO, it's good for the use case where you have free CPU power and RAM left to share. For your usage, about 200 FF tabs, I think you don't have much RAM left, so zswap > zram, and as your CPU usage is also very high, maybe plain swap is the winner.

    4. @kernelOfTruth:
      Just before you answer: Don't bother yourself with my question for the Frederic Weisbecker patches. I was a bit non-creative yesterday. With a little help of post-factum's repository (tagged 3.19-pf3) I've found the traces. As it looks like, there had been more development on that in the meantime, mainly bugfixing, see: "https://github.com/torvalds/linux/commits/master/kernel/sched/core.c". Some of them can be directly used on BFS, too, after path-/filename change in the patch. E.g.:
      sched: Prevent recursion in io_schedule(),
      sched: Fix preempt_schedule_common() triggering tracing recursion,
      but for -
      sched: Pull resched loop to __schedule() callers -
      I'm not experienced enough to deal with it safely ( schedule(void) vs. __schedule(void) + schedule(void) difference in BFS vs. cfs ). post-factum uses this to deal with it: https://github.com/pfactum/pf-kernel/commit/ae54d408ed1c19fe53d0d68710756676d139c3e6
      Nor can I estimate any advantage from the above code snippets.

      Thanks,
      Manuel

    5. O.k. I must admit that I had tested the rtirq script some years ago, but didn't know back then that "threadirqs" is needed as a kernel command line append. Without this kernel setting rtirq doesn't do anything.

      I've now seen, that the rtirq script was on my system for years now, but just did ...nothing...
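
A quick way to avoid this kind of silent no-op is to check whether the running kernel was actually booted with the parameter. The sketch below reads /proc/cmdline; the helper name `has_boot_param` is made up for illustration, and the optional second argument (an alternative file to read) exists only to make the function easy to try out.

```shell
# Hypothetical helper: succeed if the given parameter appears on the kernel
# command line, either bare ("threadirqs") or with a value ("name=value").
# The body runs in a subshell so that "set -f" does not leak to the caller.
has_boot_param() (
    set -f   # disable globbing while the command line is word-split
    for word in $(cat "${2:-/proc/cmdline}" 2>/dev/null); do
        case "$word" in
            "$1"|"$1"=*) exit 0 ;;
        esac
    done
    exit 1
)

if has_boot_param threadirqs; then
    echo "threadirqs is set on the kernel command line"
else
    echo "threadirqs is not set on the kernel command line"
fi
```

The same check works for the zswap parameters mentioned earlier, for example `has_boot_param zswap.enabled`.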

      Now, with updated things in kernel command line and a refreshed rtirq, I still miss important documentation for rtirq. What is the difference between RTIRQ_NON_THREADED and RTIRQ_NAME_LIST added processes?

      Thanks,
      Manuel

  4. As the mentioned patches are very likely to go into kernel 4.0, you maybe should have a look at them, no matter, if it's before Con catches up with his patches..
    I currently use them (modified) on top of your -gc patches.
    They do work fine.
    The only thing with my usage pattern is, that I can see a slight imbalance between both CPU cores for system+normal load when having sched batch ones running.

    I just wanted to report it for your future development of your -gc patches, and not to complain.

    Thank you for your work!
    Manuel

    Replies
    1. Thanks for the info. I will check the -rc releases of 4.0; if those patches are going into 4.0, they should already be there. Based on my experience with 3.18, extracting __schedule() from the current schedule() caused very weird behavior, so I will see what happens this time.

      BTW, I have pre-released a VRQ 0.4 patch; you can give it a try and see if it works for you.
