Sunday, July 15, 2018

PDS 0.98t release

PDS 0.98t is released with the following changes:

1. Rework pds_load_balance() based on the current implementation. This is the main feature of this release.
2. Minor optimization in task_preemptible_rq().

This is the last release for 4.17, with the rework of the balance code path as promised. The sanity tests show a minor improvement across all workloads, thanks to the reduced balance overhead. I'd like to call the balance overhead cutting work done for now and move on to the next mainline sync-up and to investigating other potential improvements.

Enjoy PDS 0.98t for v4.17 kernel, :)

Code is available at
https://github.com/cchalpha/linux-gc/commits/linux-4.17.y-pds
and also
https://gitlab.com/alfredchen/linux-pds

An all-in-one patch is available too.

PS: Here is the updated patch for the v4.17.12+ kernel:

https://gitlab.com/alfredchen/PDS-mq/raw/master/4.17/v4.17.12+_pds098t.patch
 

17 comments:

  1. Hi
    I built a kernel with the new patch, but dmesg shows this error:


    [ 0.110004] ------------[ cut here ]------------
    [ 0.110008] pds: 8 - 9, 102, 92325748 9, 102, 92325748
    [ 0.110020] WARNING: CPU: 0 PID: 1 at kernel/sched/pds.c:2936 scheduler_tick+0x63f/0x6a0
    [ 0.110024] Modules linked in:
    [ 0.110028] CPU: 0 PID: 1 Comm: PDS/0 Not tainted 4.17.6 #1
    [ 0.110030] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)/Standard PC (i440FX + PIIX, 1996), BIOS
    [ 0.110034] RIP: 0010:scheduler_tick+0x63f/0x6a0
    [ 0.110035] RSP: 0000:ffff88007d003ed8 EFLAGS: 00010096
    [ 0.110038] RAX: 000000000000002a RBX: ffff88007d01ec80 RCX: ffffffff8202b338
    [ 0.110040] RDX: 0000000000000001 RSI: 0000000000000082 RDI: ffffffff82435b64
    [ 0.110042] RBP: 0000000000000000 R08: 0000000000000097 R09: 000000000000002a
    [ 0.110044] R10: 0720072007200720 R11: ffffffff82437fca R12: 000000000001ec80
    [ 0.110045] R13: 0000000000000000 R14: 0000000000000009 R15: ffff88007a460078
    [ 0.110048] FS: 0000000000000000(0000) GS:ffff88007d000000(0000) knlGS:0000000000000000
    [ 0.110050] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 0.110055] CR2: 0000000000000000 CR3: 000000000200a000 CR4: 00000000000006f0
    [ 0.110057] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [ 0.110059] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [ 0.110061] Call Trace:
    [ 0.110064]
    [ 0.110069] ? rcu_check_callbacks+0x22c/0x3d0
    [ 0.110073] update_process_times+0x4a/0x60
    [ 0.110077] tick_handle_periodic+0x1b/0x60
    [ 0.110080] smp_apic_timer_interrupt+0x48/0x80
    [ 0.110084] apic_timer_interrupt+0xf/0x20
    [ 0.110086]
    [ 0.110089] RIP: 0010:do_xor_speed+0x3c/0xb0
    [ 0.110090] RSP: 0000:ffffc9000000be70 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
    [ 0.110093] RAX: 00000000fffffc22 RBX: 000000000000a274 RCX: ffff88007a460000
    [ 0.110095] RDX: ffff88007a4fc000 RSI: ffff88007a4f9000 RDI: 0000000000000000
    [ 0.110097] RBP: ffffffff8203e760 R08: 0000000000000096 R09: 0000000000000000
    [ 0.110099] R10: 0720072007200720 R11: 0720072007200720 R12: 000000000000a274
    [ 0.110101] R13: 0000000000000003 R14: 00000000fffffc22 R15: ffff88007a4f8000
    [ 0.110105] ? do_early_param+0x86/0x86
    [ 0.110107] calibrate_xor_blocks+0xb9/0x11e
    [ 0.110110] ? do_xor_speed+0xb0/0xb0
    [ 0.110112] do_one_initcall+0x2f/0x136
    [ 0.110114] ? do_early_param+0x86/0x86
    [ 0.110117] kernel_init_freeable+0x107/0x191
    [ 0.110120] ? rest_init+0xa0/0xa0
    [ 0.110123] kernel_init+0x5/0xee
    [ 0.110126] ret_from_fork+0x1f/0x30
    [ 0.110129] Code: 4a 83 fa 65 7e 74 41 b9 68 00 00 00 41 29 d1 4d 8b 47 f0 48 c7 c7 28 10 d5 81 41 8b 4f d8 56 44 89 d6 52 44 89 f2 e8 78 fc fd ff <0f> 0b 5a 59 e9 30 fd ff ff c6 83 c0 00 00 00 00 e9 95 fb ff ff
    [ 0.110153] ---[ end trace 200e413285c27d78 ]---

    Replies
    1. @Micron
      Thanks for reporting. I think I have found the cause; a patch will be provided soon.

    2. @Micron
      This kernel warning can be triggered in very rare cases. Here is the fix for it:
      https://github.com/cchalpha/linux-gc/commit/ee694d08363167d1dad25d578ababf01db7f1eb4

  2. Compiled and booted on a Ryzen CPU and on an APU as well. Gaming for a couple of hours went well; I have not tried any heavy workloads yet.
    I waited patiently for 4.17.7/8 to avoid the hard freeze issues which I think I had on i7@work. Will install 098t+4.17.8 on i7@work soon.
    Still running with nohz_full, and it seems fine so far.

    BR, Eduardo

  3. Hi,
    upstream commit 8ecf04e11283a28ca88b8b8049ac93c3a99fcd2c is interfering with pds-098t.
    It's already in stable-queue/queue-4.17 as sched-cpufreq-modify-aggregate-utilization-to-always-include-blocked-fair-utilization.patch

    --- a/kernel/sched/cpufreq_schedutil.c
    +++ b/kernel/sched/cpufreq_schedutil.c
    @@ -183,22 +183,21 @@ static void sugov_get_util(struct sugov_
     static unsigned long sugov_aggregate_util(struct sugov_cpu *sg_cpu)
     {
     	struct rq *rq = cpu_rq(sg_cpu->cpu);
    -	unsigned long util;

    -	if (rq->rt.rt_nr_running) {
    -		util = sg_cpu->max;
    -	} else {
    -		util = sg_cpu->util_dl;
    -		if (rq->cfs.h_nr_running)
    -			util += sg_cpu->util_cfs;
    -	}
    +	if (rq->rt.rt_nr_running)
    +		return sg_cpu->max;

     	/*
    +	 * Utilization required by DEADLINE must always be granted while, for
    +	 * FAIR, we use blocked utilization of IDLE CPUs as a mechanism to
    +	 * gracefully reduce the frequency when no tasks show up for longer
    +	 * periods of time.
    +	 *
     	 * Ideally we would like to set util_dl as min/guaranteed freq and
     	 * util_cfs + util_dl as requested freq. However, cpufreq is not yet
     	 * ready for such an interface. So, we only do the latter for now.
     	 */
    -	return min(util, sg_cpu->max);
    +	return min(sg_cpu->max, (sg_cpu->util_dl + sg_cpu->util_cfs));
     }

     static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time, unsigned int flags)
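
To make the behavioural change in the quoted upstream diff concrete, here is a minimal user-space sketch of the old and new aggregation logic. It is not kernel code: `struct util_sample`, `min_ul()`, `aggregate_util_old()` and `aggregate_util_new()` are hypothetical names invented for illustration, standing in for the `sg_cpu` and `rq` fields that the diff reads.

```c
/*
 * Hypothetical stand-in for the fields that sugov_aggregate_util()
 * reads from struct sugov_cpu and struct rq in the diff.
 */
struct util_sample {
	unsigned long max;             /* sg_cpu->max */
	unsigned long util_dl;         /* sg_cpu->util_dl (DEADLINE) */
	unsigned long util_cfs;        /* sg_cpu->util_cfs (FAIR) */
	unsigned int rt_nr_running;    /* rq->rt.rt_nr_running */
	unsigned int cfs_h_nr_running; /* rq->cfs.h_nr_running */
};

/* Plain helper replacing the kernel's min() macro. */
static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/* Logic of the removed lines: FAIR utilization counts only while
 * FAIR tasks are runnable. */
unsigned long aggregate_util_old(const struct util_sample *s)
{
	unsigned long util;

	if (s->rt_nr_running) {
		util = s->max;
	} else {
		util = s->util_dl;
		if (s->cfs_h_nr_running)
			util += s->util_cfs;
	}
	return min_ul(util, s->max);
}

/* Logic of the added lines: FAIR utilization (including blocked
 * utilization) is always added, then clamped to the maximum. */
unsigned long aggregate_util_new(const struct util_sample *s)
{
	if (s->rt_nr_running)
		return s->max;

	return min_ul(s->max, s->util_dl + s->util_cfs);
}
```

The two functions differ only when no RT task is running and no FAIR task is runnable: the old logic then drops `util_cfs`, while the new logic still adds it.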

  4. Hi. Could you update PDS for 4.17.12?
    https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/commit/?h=linux-4.17.y&id=1dec63738015ae9be3384d96f478aae9014f38fc

    The patch currently fails with this error:

    patching file kernel/sched/cpufreq_schedutil.c
    Hunk #1 succeeded at 173 with fuzz 2.
    Hunk #2 FAILED at 206.
    Hunk #3 succeeded at 279 (offset -1 lines).
    Hunk #4 succeeded at 482 (offset -1 lines).
    Hunk #5 succeeded at 515 (offset -1 lines).
    1 out of 5 hunks FAILED -- saving rejects to file kernel/sched/cpufreq_schedutil.c.rej

    Replies
    1. Once 4.17.12 is published in the stable kernel git tree, I'll release a patch to fix this. Thanks for the reminder.

    2. 4.17.12 is out https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-4.17.y - thx :)

    3. You can fetch the updated patch at https://gitlab.com/alfredchen/PDS-mq/raw/master/4.17/v4.17.12+_pds098t.patch

      When 4.18 is out, PDS updates for 4.17 will stop, as I only have bandwidth for the latest branch, :)

    4. Alfred, the GitLab path for the v4.17.17+ patch requires a user login. Would it be possible to make it publicly accessible? Thanks

    5. Sorry, I may have created it as a private project. It has been changed to public now. Please check it again.

    6. Second mirror: https://github.com/cchalpha/PDS-mq

    7. The GitHub repository will be abandoned soon.

  5. Thanks for the new 4.17.12+ patch.
