Monday, June 11, 2018

PDS 0.98r release

PDS 0.98r is released with the following changes:

1. Fix a compilation error when building CFS.
2. Optimizations here and there.
3. Pick up one more remaining sync-up from mainline.

This is the second PDS release for 4.17. It's mainly bug fixes, continued sync-up pick-ups, and minor optimizations. After this, I'm reworking the balancing code, which will hopefully reduce the overhead.

Enjoy PDS 0.98r for the v4.17 kernel, :)

Code is available at
https://github.com/cchalpha/linux-gc/commits/linux-4.17.y-pds
and also
https://bitbucket.org/alfredchen/linux-gc/commits/branch/linux-4.17.y-pds (Will be deprecated soon)

An all-in-one patch is available too. (The link now points to the GitHub PDS-mq repository. Although Microsoft's purchase of GitHub makes GitHub's future uncertain, moving away from Bitbucket is certain: starting with this release, the all-in-one patch will no longer be uploaded to the download page of the Bitbucket repository.)

21 comments:

  1. I haven't tried this release yet, but I'm wondering whether you've considered moving things to GitLab instead.

    FYI, I've almost finished moving the pf-kernel repo; the only bit I'm missing is generating a plaintext diff between two tags for easy patch retrieval via an https link. However, it looks like that won't be a blocker for you. As for me, I've filed an RFE for it.

    Another issue I'm tracking is a thing that prevents listing lots of branches of a heavy repo (like the kernel) via the web UI, but I've also managed to file a report for this and attract the devs' attention.

    Also, GitLab has uploads, so you'll be able to provide an all-in-one patch (not that I personally need it at all).

  2. Sure, GitLab is the first choice if GitHub becomes evil. As for Bitbucket, I just don't like their repository size limitation, even for an open-source project like the kernel source code.

    I don't need the uploads; I'm happy with the current PDS-mq repository for storing the all-in-one patches.

    Yeah, I need to find some time to get hands-on with GitLab and see how things go there.

    1. I've tried GitLab; so far so good. Instead of importing the whole repository, I may keep just the -pds branches. I'll let you know when the new repository is ready on GitLab.

  3. @Pedro and @Alfred:
    I've just put my 4.16 kernel testing data from my local LibreOffice Calc file into a Google spreadsheet. It's quite ridiculous, and possibly anxiety-inducing, what Google wants to know about me.
    Now that it's there, it's only a simple import (yes, from the local Calc file), but I'm stuck on this: I don't see where to grant you read access (and only that) to it here.
    Can someone please help me with some quick advice?

    In the last days I've finished the 4.16 benchmarking with my xz approach, which allows comparisons to almost all tests done by Pedro. And I began the 4.17 testing last night, after TOI became available, and am keeping at it.

    TIA and BR,
    Manuel
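    In case it helps, the repeated-run timing approach could be sketched roughly like this (a hypothetical Python sketch, not my actual harness; the placeholder no-op workload stands in for a real command such as an xz compression of a fixed input):

```python
import statistics
import subprocess
import sys
import time

def time_command(cmd, runs):
    """Run `cmd` several times and return wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations

# Placeholder workload so the sketch is self-contained; a real benchmark
# would use something like ["xz", "-T2", "-k", "input.tar"] instead.
samples = time_command([sys.executable, "-c", "pass"], runs=5)
mean = statistics.mean(samples)
stdev = statistics.stdev(samples)
print(f"mean={mean:.4f}s stdev={stdev:.4f}s over {len(samples)} runs")
```

    Reporting the standard deviation alongside the mean is what makes the later confidence-interval comparisons possible.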

    1. OK, a first shot:
      https://docs.google.com/spreadsheets/d/e/2PACX-1vTHCUzZbdY7h82mZplRoj26mxNcHh3w9KkBj3jIOoW1mqbMQqVgN-sq8U4Aphi7zx_mv6zfzuFs-rs0/pubhtml

      Please review and don't hesitate with criticism (except for the font size!).

      BR, Manuel

    2. @Alfred, @Pedro and benchmarking-interested,
      I've just added the 4.17.1 xz benchmarking to my spreadsheet above, including CFS, PDS 0.98q and r, with a cross-kernel comparison 4.16/4.17. Just open it via the link above.
      (Unfortunately the 'publishing' function isn't working as accurately as I'd wish, namely it eats some text, but at least no relevant data is missing.)

      Hopefully this work is useful for you.
      BR, Manuel

    3. @Pedro: Hi,
      I've just noticed that you've changed your benchmarking spreadsheet's calculations a lot, e.g. now also taking the 95% CI into account for evaluation.
      Although this step should make our results (more) comparable, for which I want to thank you, I'm quite astonished when comparing the results of your 4.16 CFS, PDS, MuQSS-mc @1000Hz kernel compilation vs. my xz runs at different loads.
      To compare the right loads I've taken into account that you're running a quad-core and I a dual-core, but the fact that you get better results for CFS than for PDS-p (except at 100% load) and for MuQSS makes me stumble. I know that "make -j?" is never directly comparable with "xz -T?", but is your data correct?

      Please don't take this as an offence; I just want to understand why we get such different results.

      BR, Manuel

    4. @Manuel
      Indeed, I've been digging into a more robust way to compare results.
      I've run around 70 make -j1 and -j8 runs with CFS to check that the results are normally distributed, and indeed they are. The link you gave me a few weeks ago led me to the Welch t-test, which I believe is appropriate.

      Back to the results: yes, they are correct.
      First, as you said, xz and make are not the same thing.
      Second, the difference between your results and mine may be due to Hyper-Threading. MuQSS behaves strangely with HT under partial load.
      Some time ago, I ran make -j1 to -j4 with HT disabled and didn't see the regression at half load.
      And then there are a lot of other factors: I use intel_pstate+powersave while you use acpi+performance, AFAIR, ...

      Pedro
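      For anyone following along, Welch's t statistic and the Welch-Satterthwaite degrees of freedom can be computed with the Python standard library alone (a sketch only; the compile times below are made-up numbers, not my measurements):

```python
import math
import statistics

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    for two independent samples with possibly unequal variances."""
    ma, mb = statistics.mean(a), statistics.mean(b)
    va, vb = statistics.variance(a), statistics.variance(b)  # sample variances
    na, nb = len(a), len(b)
    se2 = va / na + vb / nb          # squared standard error of the difference
    t = (ma - mb) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df

# Hypothetical kernel compile times (seconds) under two schedulers:
cfs = [100.2, 101.1, 99.8, 100.5, 100.9]
pds = [98.9, 99.4, 99.1, 99.7, 98.8]
t, df = welch_t(cfs, pds)
print(f"t = {t:.2f}, df = {df:.1f}")
```

      The resulting t is then compared against the Student-t critical value for df degrees of freedom to decide whether the difference is significant.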

    5. @Pedro:
      Thank you for your reply. It tells me to dig deeper into this material myself (instead of simply taking over the first formula Google turned up). I made a definite mistake in using the z-confidence approach with 95% AND z=1.96, given the small number of 18 samples. I've already adjusted the sheet to use the correct ± t-confidence values. Atm I don't understand how you make practical use of the TTEST formula in your spreadsheet.

      +@Alfred:
      Is it useful for you if I continue my xz-based benchmarking? Due to the high average deviation of kernel "make -j?" runs on my system and the limited time overnight, I can't run the make tests long enough to reach a valuable result quality.

      I've just added the new MuQSS 0.172 for the 4.17 kernel to the spreadsheet. I hope that the published version is readable for you.

      BR, Manuel

    6. I've just reworked my local spreadsheet to reflect the change to the correct t=2.11 for n=18 samples.
      But in fact the evaluation (shown by the colouring) remains the same for me. (So I won't update the Google sheet for now.)

      BR, Manuel
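      For completeness, the t-based interval I mean looks like this (a sketch with made-up sample values; the critical value t = 2.110 for 17 degrees of freedom at 95% is hard-coded, since Python's standard library has no Student-t quantile function):

```python
import math
import statistics

def t_confidence_half_width(samples, t_crit):
    """Half-width of the interval mean +/- t_crit * s / sqrt(n)."""
    s = statistics.stdev(samples)
    return t_crit * s / math.sqrt(len(samples))

# 18 made-up throughput measurements; with n = 18 there are 17 degrees
# of freedom, and the two-sided 95% critical value is t = 2.110
# (versus z = 1.96 under the normal approximation, which understates
# the uncertainty for small samples).
samples = [10.0 + 0.1 * (i % 5) for i in range(18)]
mean = statistics.mean(samples)
half = t_confidence_half_width(samples, t_crit=2.110)
print(f"mean = {mean:.3f} +/- {half:.3f} (95% CI, n = {len(samples)})")
```

      Two runs whose intervals don't overlap are then a reasonably safe sign of a real difference.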

    7. @Manuel @Pedro
      Sorry that I haven't been active on the blog recently. Blog notifications are broken for me, so I don't get notified, and I've been enjoying the World Cup lately, :)

      I think any kind of benchmark (if the results are correct) is useful; it shows how the schedulers work on a given CPU and kernel configuration, which I may not have the time or resources to test myself.
      On my side, I normally run a kernel compilation sanity check and other latency tests. I think kernel compilation is a very general test, so I like to use it, but I keep an eye on your benchmarks too.
      Depending on your free time, IMO one benchmark per kernel release should be good enough to check for regressions/improvements during the kernel release cycle.

    8. @Alfred:
      Don't worry about my data collection. With your and Pedro's collaboration, my testing and data processing have matured enough that I'd call them safe. The only thing left to solve was, or maybe still is, how to evaluate the results so as to give you valuable/significant findings.
      The main thing to always keep in mind is that Pedro's and my systems behave differently, and the tests are different too. In addition to the make/xz difference, please always consider the dual-to-quad-core translation of the load levels.

      +@Pedro:
      I've updated my sheets to use the ± t-CI values with the appropriate multiplier now.
      Atm I'm really disappointed by Google spreadsheet's publishing capabilities; it keeps eating text. Can you please advise me on how to give you all the same read access to my sheet as you do for yours? I consider it useful for you all to see the formulas I use and to notice and point out possible errors.

      BR, Manuel

  4. I couldn't get 4.17 to boot (disk error messages, not PDS-related), even when excluding all my linux-block patches, so I switched to linux-next git (20180614) and am running that now on x64. Only two minor fixups were needed to get PDS running with it. :) x86-UP built, but I haven't tried booting it yet. Has anyone else had issues with 4.17? My system is probably a bit unique, in that I use a single-disk BIOS-setup RAID JBOD (which allows me to use AHCI that isn't otherwise available), and Linux is set up as dual-boot on a partition of that disk. I recall the errors I got in 4.17 mentioned my PARTUUID setup, but I need to check again.

    1. It's a little sad that I don't know how to help you. I haven't had issues with 4.17.1 so far. But as a starting point I always use the pf-kernel (patch), as Oleksandr most probably includes all available block-related commits he knows of and considers useful. I have no overview of how many (if any) are in -pf1. Hopefully it's not a hardware issue.

      BR, Manuel

  5. I had a nice vacation, so I was not active at all... Now I'm back and have installed this version on the i7 at work and the Ryzen at home; it seems to work fine.

    Thanks and BR,
    Eduardo

    1. @Eduardo:
      Welcome back! :-) It seems like everyone is on vacation right now.
      A few questions aside: Aren't you the one who gave info about the schedulers' behaviour in gaming (or at least on behalf of your child)? Have you had a chance to test MuQSS 4.16 kernels with the rqshare={mc,smt,smp,none} options with regard to interactivity? I only have the xz throughput testing results with them, plus some hours of usual desktop office work and video playback (without issues).

      @Alfred:
      I don't know whether Con's rqshare approach is applicable to PDS at all, but it looks like a promising new direction for both AMD and Intel architectures.

      BR, Manuel

    2. @Manuel,
      Yes, I was testing kernels for gaming myself, and occasionally I install them on the kids' computer, which has my trusty old Phenom II.
      He's currently using the standard kernel, as he says it's the overall best, but I haven't given him any new ones; I might do that.
      For MuQSS I tried gaming and the results were not stable; I could not choose just one. SMT was slow in my case, MC was better. I shared the results on Con's site but haven't done anything more. I like one kernel for all my computers, and I like it to perform well, so nowadays I use Oleksandr's git tree and apply the Ubuntu mainline config on top of it; works for me.

      Thanks Alfred for the stable kernel ;)

      BR, Eduardo

    3. @Eduardo:
      Ah, OK, thank you for the info. I haven't seen stability issues with 4.16 MuQSS and had omitted the SMT part. Nor have I seen latency degradation in the use I described. From the numbers I can't decide whether rqshare=smp or mc is really better on my dual-core without HT. From the logs they both establish the same number of runqueues (three), but they seem to utilize them slightly differently.

      Anyway, IMO: stability first, then low desktop latency, and then throughput, as long as no mission-critical I/O starves because of one of the first two points.

      BR, Manuel

    4. Hi,

      I have done some very basic tests of PDS vs. MuQSS, and the results are as follows:
      D3 location 1: PDS (105 FPS), MUX+MC (110 FPS), MUX+SMT (55 FPS)
      D3 location 2: PDS (67 FPS), MUX+MC (70 FPS), MUX+SMT (40 FPS)
      POE location 1: PDS (120 FPS), MUX+MC (120 FPS), MUX+SMT (120 FPS)

      So, as usual, D3 + Wine is something very special; POE + Wine is not. As you can see, the results are very comparable, except for D3 with MUX+SMT.
      These are actual in-game results; I haven't done any synthetic benchmark tests, but from previous experience those tend to be about the same as well.

      BR, Eduardo
