Thursday, February 23, 2017

VRQ 0.93a release

VRQ 0.93a is released with the following changes:

1. UP compilation fix for task policy fairness (reported by jwh7)
2. Sync up mainline 4.10 scheduler changes
3. Remove unused stime_pc and utime_pc
4. Fix task cpu runtime accounting (reported by Eduardo)

This is a bug-fix release. Enjoy VRQ 0.93a for the v4.10 kernel. :)

Code is available at
https://bitbucket.org/alfredchen/linux-gc/commits/branch/linux-4.10.y-vrq
and also
https://github.com/cchalpha/linux-gc/commits/linux-4.10.y-vrq

An all-in-one patch is available too.
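
Applying the all-in-one patch follows the usual routine. A minimal sketch, assuming the patch file is saved as linux-4.10.y-vrq.patch (the actual file name may differ):

    # from the top of a clean v4.10 source tree
    cd linux-4.10
    # dry run first to confirm the patch applies cleanly (file name here is just an example)
    patch -p1 --dry-run < ../linux-4.10.y-vrq.patch
    # apply for real, then configure and build as usual
    patch -p1 < ../linux-4.10.y-vrq.patch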

BR Alfred 

22 comments:

  1. Hi,

    running nicely here on Skylake and Phenom II for some 10 hrs; it seems that the CPU accounting is fixed and the rest is OK.

    thanks,
    Eduardo

  2. Hi Alfred,


    thanks for the quick port and the fixes for 4.10!

    I'm switching from an rt-kernel (4.8 + 4.9) [4.10-rt isn't available yet, and there are issues with using VirtualBox with an rt-kernel (kernel not booting, etc.)],

    and latency in cyclictest is pretty low.

    While e.g. compiling vlc and watching a high-bitrate 1080p video in mpv (via baka-player),

    the behavior is roughly 30-50+% comparable to the rt-kernel - there were a few short interruptions, but playback continued almost immediately (the performance demand of a default mpv player shouldn't be very high; not sure whether baka-player uses mpv.conf for playback).

    With an rt-kernel you can cause rather high load (e.g. compiling and updating stuff) and the multi-threaded mpv videos + vapoursynth still work in lots of cases,

    will see how VRQ behaves in that regard ...

    So far no issues discovered,

    videos with motion interpolation, upscaling, 48 fps, superxbr, and debanding [using VapourSynth] appear to work optimally when passing schedtool -I -e to mpv while the system is idle - no priority inversion that would make the video stutter.

    I'll report when I have gathered some more feedback,

    so far - so good :)

    Replies
    1. Compilation in portage was NOT reniced, so it naturally causes interruptions (it seems the rt-kernel is less susceptible in that regard)

    2. @KernelOfTruth
      In your usage, I'd suggest using the IDLE policy for the compilation rather than using the ISO policy for mpv playback. An ISO policy task has higher priority than even kernel threads; if ISO tasks occupy too much cpu power, the whole system's interactivity would be impacted.
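
      For what it's worth, a minimal sketch of that setup with schedtool (the make and mpv invocations below are just placeholders):

        # run the compile job under the IDLE (idleprio) policy
        schedtool -D -e make -j4
        # run the player under the ISO policy
        schedtool -I -e mpv video.mkv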

    3. Hi Alfred,

      thanks for the tip,

      yes - it's not only impaired; in one instance the whole system sort of locked up, and in another instance it kept playing sound as if nothing had happened but did NOT react to input anymore (the screen output was frozen, too) ;)

      Still searching for how to apply SCHED_IDLE on Gentoo; I'm sure I'll find it soon-ish

  3. @Alfred:
    Thanks for the new release. It's running fine now for some hours with BFQv8r8, 512Hz and your older tune-ups.
    After recently upgrading my RAM from 4 to 8 GB and my openSUSE from 13.1 to 42.2, my humble TuxOnIce port got less reliable with 4.9.xy (50% hibernation success; the RAM upgrade itself is beyond doubt and does more good), so I've left TOI out for this kernel... and surprise... in-kernel hibernation now works as fast as TOI did before. Atm. I can't explain it to myself.

    BR, Manuel Krause

    Replies
    1. Most probably the former limited-RAM situation led to the issues I had faced. As this notebook uses shared RAM for gfx, I assume that parts of this gfx mem also went into swap. That would be supported by the fact that, before my RAM upgrade, every swappiness < 40 led to system starvation over time.
      I cannot estimate how much influence omitting the dead TuxOnIce code has, but if in-kernel hibernation keeps its reliability now, I won't look back. (Before the RAM upgrade, Firefox needed a very long time to get interactive with in-kernel hibernation.)

      BR, Manuel Krause

    2. @Manuel
      Good to know it works for you. On my side, suspend with the VRQ kernel has been very reliable for me for 2 or 3 kernel releases.

    3. @Alfred:
      Yes, man. I'm really glad that things have settled into working so well, given your hard work on VRQ (and also my successful upgrade marathon with Susy). ;-)

      Any news on what's in your pipeline for VRQ?

      Thank you and BR,
      Manuel Krause

    4. Btw., besides using BFQ-v8r8 and enabling (now in-kernel) WBT, I've also applied Con's "Swap-sucks" patch (http://ck.kolivas.org/patches/4.0/4.10/4.10-ck1/patches/0022-Swap-sucks.patch). With my former RAM it was a no-go; now it's working fine.
      BR, Manuel Krause

    5. @Manuel
      Things on my VRQ list are:
      1. <100% workload tests vs CFS on different machines (SMT and non-SMT)
      2. Remove rd and sd in VRQ (2k+ LOC)

  4. This comment has been removed by the author.

  5. @Alfred, thanks again... x64 compiled fine, but 32-bit UP fails with:
    kernel/sched/bfs.c: In function ‘update_rq_clock_task’:
    kernel/sched/bfs.c:2367:18: error: implicit declaration of function ‘irq_time_read’ [-Werror=implicit-function-declaration]
    s64 irq_delta = irq_time_read(cpu_of(rq)) - rq->prev_irq_time;

    Replies
    1. @jwh7
      Thanks for reporting this compilation issue. It's not related to UP but to disabled CPU_FREQ, and it's an issue in VRQ. I have pushed a fix to git. Please have a check, and if there are any other issues, please let me know.

      https://bitbucket.org/alfredchen/linux-gc/commits/6af80e60987f86c4ce41fd02e333a34d4cdb03c8
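
      In case anyone wants to check whether their own .config hits the same corner case, a rough check might look like this (the exact guard that was fixed is in the commit above; irq_time_read() itself is normally tied to CONFIG_IRQ_TIME_ACCOUNTING):

        # in the kernel source/build directory, show how the relevant options are set
        grep -E 'CONFIG_CPU_FREQ[ =]|CONFIG_IRQ_TIME_ACCOUNTING' .config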

    2. Ah, yeah, I disabled CPUFREQ in my latest config, since the netbook can't use it. And yes, VRQ compiles for it now. I picked up 075f828 and 6af80e6. Thanks @Alfred.

  6. Hi Alfred and thank you for your work.

    Here are the usual throughput benchmarks on 4.10:
    https://docs.google.com/spreadsheets/d/163U3H-gnVeGopMrHiJLeEY1b7XlvND2yoceKbOvQRm4/edit?usp=sharing

    This time I also tried VRQ with the recommended 1000Hz.
    I've added a test of 4 lame encodes in parallel, because it also shows the throughput regression under half load.
    I've noticed that when I set the affinity of the 4 lame processes to different cores with taskset, there is no regression, but I guess that's expected.
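
    (For reference, a pinned run of that kind might look roughly like this; the file names are just placeholders:)

      # pin 4 parallel lame encodes to cores 0-3, one core each
      for i in 0 1 2 3; do
          taskset -c $i lame input$i.wav output$i.mp3 &
      done
      wait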

    Pedro

    Replies
    1. @Pedro
      Thanks for the benchmarks. I have finished my sanity tests vs CFS on a non-SMT machine, and the result shows that there is no regression. The test is still running on an SMT notebook; I will let you know the result once it is done.
      But most likely (my best guess), VRQ selects an SMT cpu while another physical core is available.

      Once that is confirmed, I will start working out a solution for it.

    2. @Alfred:
      Thank you for the info, and also for the earlier one posted above. Good luck with the development. We all appreciate and enjoy the progress of your good work!

      BR, Manuel Krause

    3. @Pedro
      BTW, have you enabled SCHED_MC_PRIO in v4.10 for your tests? I am testing with this new kernel option in CFS and trying to get more test results from others.
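
      (If anyone else wants to check, a quick way, assuming a distro-style config in /boot or CONFIG_IKCONFIG_PROC enabled:)

        # check the installed kernel's config
        grep CONFIG_SCHED_MC_PRIO /boot/config-$(uname -r)
        # or, if /proc/config.gz is available
        zgrep CONFIG_SCHED_MC_PRIO /proc/config.gz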

    4. @Alfred
      SCHED_MC_PRIO is enabled, but the doc says it has no effect on CPUs without Intel Turbo Boost Max Technology 3.0, and my Intel 4770K doesn't have it.

      Pedro

    5. @all
      Here are the CFS vs VRQ test results I want to share:
      On a non-SMT machine, no regression is found from 50% to 300% workload.
      On an SMT machine, regressions are found from 50% to 300% workload in both the 4.9 and 4.10 kernels.
      The regression at >= 100% workload is beyond my expectation, so I have to trace it back as far as the 4.8 kernel; I will keep you updated when I have more findings.

  7. @Alfred:
    Any news from your VRQ programmer's front?! :-)

    Don't misunderstand me, I have absolutely no reason for any complaint. Atm. everything works fine and smooth with kernel 4.10.3 + VRQ 0.93a and BFQ v8r8+fix.
    The complete migration to openSUSE 42.2 needed much more cleanup than expected.
    Also, I've removed WBT from my .config completely, as I only use BFQ as the disk scheduler and the benefits of the combination are not proven. I've also reverted one runtime setting that seemed to help with my previous precarious memory setup, /proc/sys/vm/vfs_cache_pressure == 300, back to the system default of 100. This, and also the recent kernel patches, appear to improve resume-from-hibernation speed significantly (meaning a tabs-bloated Firefox gets responsive asap, like with the former TuxOnIce).
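
    (In case anyone wants to try the same knob: it's an ordinary sysctl; 300 was my old experimental value, 100 is the default.)

      # check the current value
      sysctl vm.vfs_cache_pressure
      # set it back to the default at runtime
      sudo sysctl -w vm.vfs_cache_pressure=100
      # or, equivalently
      echo 100 | sudo tee /proc/sys/vm/vfs_cache_pressure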

    BR, Manuel Krause
