Thursday, August 27, 2015

The BFS unplugged io issue

We have been tracing the unplugged_io issue for the past two weeks; most of the discussion is in the replies to a-big-commit-added-to-41-vrq.

At first I thought that

"I guess the sched_submit_work() doesn't work for bfs b/c bfs use grq_lock instead task_lock() in mainline which a combine of task's pi_lock and rq->lock, the checking of tsk_is_pi_blocked(tsk) is not enough for BFS."

After investigation, it turns out that the tsk_is_pi_blocked() check was introduced in v3.3 by
3c7d518 sched/rt: Do not submit new work when PI-blocked
And it does not indicate that tsk->pi_lock is held, as I used to think.

So the question is back again. When sched_submit_work() was introduced in mainline 3.1, it moved the blk_schedule_flush_plug(tsk) call out of schedule(), but it also relaxed the checks that decide when not to call it. This code change is OK for mainline CFS, but somehow it is not for BFS.
Adding back those checks is the current solution. The last patch for this issue is unchanged. I'll update the -gc and -vrq branches soon to include it.
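
To make the difference concrete, here is a rough userspace model of the two decision shapes. It is only a sketch: struct task_model and all three function names are hypothetical, the conditions are written from my reading of the mainline history rather than copied from the kernel, and the "restored" variant only illustrates the idea of adding the checks back, it is not the actual patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the task state the decision depends on. */
struct task_model {
    long state;          /* 0 models TASK_RUNNING */
    bool preempt_active; /* models preempt_count() & PREEMPT_ACTIVE */
    bool signal_pending; /* models signal_pending_state(state, tsk) */
    bool pi_blocked;     /* models tsk_is_pi_blocked(tsk) */
    bool has_plugged_io; /* models blk_needs_flush_plug(tsk) */
};

/* Older shape (assumption): the flush sat inside schedule(), on the path
 * where the task is really deactivated, so it was implicitly guarded by
 * the preemption and pending-signal checks. */
bool flush_inside_schedule(const struct task_model *t)
{
    if (!t->state || t->preempt_active)
        return false;
    if (t->signal_pending)
        return false;
    return t->has_plugged_io;
}

/* Newer shape (assumption): sched_submit_work() runs before schedule()
 * and only looks at the task state, plus the PI check added in v3.3. */
bool flush_in_submit_work(const struct task_model *t)
{
    if (!t->state || t->pi_blocked)
        return false;
    return t->has_plugged_io;
}

/* Illustration of "adding back those checks": the newer shape with the
 * older guards restored. */
bool flush_with_checks_restored(const struct task_model *t)
{
    if (t->preempt_active || t->signal_pending)
        return false;
    return flush_in_submit_work(t);
}
```

In this model, a task that is being preempted while it still has plugged io is flushed by the newer shape but not by the older one; that gap is what the restored checks close.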

BR Alfred

Monday, August 24, 2015

4.1 -gc -vrq sanity test result and look forward

Since there was a toolchain upgrade in my distribution, the system now uses the new gcc 4.9.x etc., and it runs a little slower than with 4.8.x, so I had to overclock the test-bed system to get an acceptable run time for the sanity test. The result is as expected: compared to previous test results, no regression is introduced in this release. The results are attached at the end of this post.

The 4.2 official release is delayed by one week, which gives me a chance to list the todo items for the next cycle. Here they are:
1. Sync up with mainline 4.2. When previewing the code changes during the 4.2 cycle, I saw many changes in the scheduler code, over 1200 lines of diff.
2. Start work on a new commit which auto-adjusts the cpu cache size factor of the task caching; currently it is hard-coded to optimize for my test-bed machine.
3. Fix known bugs, add comments and try to finalize some of the commits in VRQ.
4. Test and tune SMT.
5. Introduce another benchmark test.

It seems there are enough things to keep me busy for weeks :)

BR Alfred

4.1 CFS
>>>>>spining up
>>>>>50% workload
>>>>>round 1
real    4m40.652s
user    8m39.005s
sys     0m35.902s
>>>>>round 2
real    4m40.688s
user    8m39.100s
sys     0m35.892s
>>>>>round 3
real    4m40.879s
user    8m39.041s
sys     0m35.881s
>>>>>100% workload
>>>>>round 1
real    2m30.750s
user    8m56.625s
sys     0m38.958s
>>>>>round 2
real    2m32.314s
user    9m2.696s
sys     0m39.169s
>>>>>round 3
real    2m32.873s
user    9m5.219s
sys     0m39.235s
>>>>>150% workload
>>>>>round 1
real    2m35.384s
user    9m13.719s
sys     0m40.464s
>>>>>round 2
real    2m34.874s
user    9m11.656s
sys     0m40.704s
>>>>>round 3
real    2m34.973s
user    9m10.739s
sys     0m40.397s
>>>>>200% workload
>>>>>round 1
real    2m36.812s
user    9m17.614s
sys     0m40.828s
>>>>>round 2
real    2m36.634s
user    9m18.383s
sys     0m40.701s
>>>>>round 3
real    2m36.992s
user    9m19.108s
sys     0m40.819s
>>>>>250% workload
>>>>>round 1
real    2m37.632s
user    9m21.271s
sys     0m41.163s
>>>>>round 2
real    2m38.446s
user    9m24.224s
sys     0m41.022s
>>>>>round 3
real    2m38.602s
user    9m24.575s
sys     0m41.436s
>>>>>300% workload
>>>>>round 1
real    2m39.867s
user    9m29.286s
sys     0m41.574s
>>>>>round 2
real    2m40.615s
user    9m29.444s
sys     0m41.578s
>>>>>round 3
real    2m40.111s
user    9m29.686s
sys     0m41.852s

4.1 BFS
>>>>>50% workload
>>>>>round 1
real    4m45.965s
user    8m53.304s
sys     0m32.862s
>>>>>round 2
real    4m45.964s
user    8m53.812s
sys     0m32.378s
>>>>>round 3
real    4m45.919s
user    8m53.194s
sys     0m32.927s
>>>>>100% workload
>>>>>round 1
real    2m30.846s
user    9m1.581s
sys     0m33.857s
>>>>>round 2
real    2m31.267s
user    9m2.822s
sys     0m34.096s
>>>>>round 3
real    2m31.666s
user    9m4.665s
sys     0m33.841s
>>>>>150% workload
>>>>>round 1
real    2m34.415s
user    9m16.511s
sys     0m34.483s
>>>>>round 2
real    2m34.530s
user    9m16.214s
sys     0m35.030s
>>>>>round 3
real    2m34.578s
user    9m17.104s
sys     0m34.456s
>>>>>200% workload
>>>>>round 1
real    2m35.951s
user    9m22.398s
sys     0m34.514s
>>>>>round 2
real    2m37.026s
user    9m22.704s
sys     0m34.639s
>>>>>round 3
real    2m36.158s
user    9m22.571s
sys     0m35.061s
>>>>>250% workload
>>>>>round 1
real    2m37.269s
user    9m25.792s
sys     0m35.212s
>>>>>round 2
real    2m37.058s
user    9m25.937s
sys     0m34.739s
>>>>>round 3
real    2m37.132s
user    9m25.538s
sys     0m35.453s
>>>>>300% workload
>>>>>round 1
real    2m37.935s
user    9m24.762s
sys     0m35.681s
>>>>>round 2
real    2m37.039s
user    9m25.452s
sys     0m35.822s
>>>>>round 3
real    2m38.103s
user    9m26.001s
sys     0m35.129s

4.1 GC
>>>>>50% workload
>>>>>round 1
real    4m43.899s
user    8m50.524s
sys     0m32.508s
>>>>>round 2
real    4m43.831s
user    8m50.031s
sys     0m32.868s
>>>>>round 3
real    4m43.810s
user    8m49.999s
sys     0m32.926s
>>>>>100% workload
>>>>>round 1
real    2m30.824s
user    9m1.669s
sys     0m34.747s
>>>>>round 2
real    2m31.382s
user    9m4.495s
sys     0m34.260s
>>>>>round 3
real    2m31.539s
user    9m5.008s
sys     0m34.470s
>>>>>150% workload
>>>>>round 1
real    2m35.457s
user    9m18.970s
sys     0m34.946s
>>>>>round 2
real    2m34.628s
user    9m18.050s
sys     0m34.884s
>>>>>round 3
real    2m34.648s
user    9m18.807s
sys     0m34.446s
>>>>>200% workload
>>>>>round 1
real    2m36.268s
user    9m23.971s
sys     0m35.149s
>>>>>round 2
real    2m36.410s
user    9m24.660s
sys     0m35.172s
>>>>>round 3
real    2m36.670s
user    9m25.137s
sys     0m35.346s
>>>>>250% workload
>>>>>round 1
real    2m37.606s
user    9m29.152s
sys     0m36.025s
>>>>>round 2
real    2m38.546s
user    9m27.398s
sys     0m35.950s
>>>>>round 3
real    2m38.509s
user    9m28.057s
sys     0m35.655s
>>>>>300% workload
>>>>>round 1
real    2m37.824s
user    9m28.526s
sys     0m36.302s
>>>>>round 2
real    2m37.473s
user    9m28.433s
sys     0m35.741s
>>>>>round 3
real    2m37.049s
user    9m27.219s
sys     0m35.622s

4.1 VRQ
>>>>>50% workload
>>>>>round 1
real    4m43.533s
user    8m49.706s
sys     0m32.653s
>>>>>round 2
real    4m43.630s
user    8m49.385s
sys     0m32.904s
>>>>>round 3
real    4m43.468s
user    8m49.845s
sys     0m32.537s
>>>>>100% workload
>>>>>round 1
real    2m30.467s
user    9m1.640s
sys     0m34.555s
>>>>>round 2
real    2m30.812s
user    9m1.790s
sys     0m34.305s
>>>>>round 3
real    2m30.675s
user    9m2.192s
sys     0m34.027s
>>>>>150% workload
>>>>>round 1
real    2m33.289s
user    9m12.513s
sys     0m34.640s
>>>>>round 2
real    2m33.166s
user    9m12.042s
sys     0m34.795s
>>>>>round 3
real    2m33.135s
user    9m12.005s
sys     0m35.120s
>>>>>200% workload
>>>>>round 1
real    2m36.200s
user    9m19.313s
sys     0m35.160s
>>>>>round 2
real    2m35.053s
user    9m18.936s
sys     0m35.322s
>>>>>round 3
real    2m34.917s
user    9m19.771s
sys     0m34.833s
>>>>>250% workload
>>>>>round 1
real    2m37.391s
user    9m23.886s
sys     0m35.097s
>>>>>round 2
real    2m35.889s
user    9m23.426s
sys     0m35.680s
>>>>>round 3
real    2m36.198s
user    9m23.343s
sys     0m35.443s
>>>>>300% workload
>>>>>round 1
real    2m36.724s
user    9m26.019s
sys     0m35.194s
>>>>>round 2
real    2m36.576s
user    9m25.513s
sys     0m35.794s
>>>>>round 3
real    2m36.759s
user    9m25.738s
sys     0m35.238s
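
For comparing the logs above across schedulers, the `real` times can be converted to seconds and averaged per workload. A minimal sketch, assuming the durations always have the `NmS.SSSs` form seen in these logs; parse_duration() and mean_of_rounds() are hypothetical helper names, not part of the test script.

```c
#include <stdio.h>

/* Convert a time(1)-style duration such as "2m30.750s" to seconds.
 * Returns -1.0 if the string does not match the expected form. */
double parse_duration(const char *s)
{
    int minutes;
    double seconds;
    char unit;

    if (sscanf(s, "%dm%lf%c", &minutes, &seconds, &unit) != 3 || unit != 's')
        return -1.0;
    return minutes * 60.0 + seconds;
}

/* Average the durations of n rounds; returns -1.0 on bad input. */
double mean_of_rounds(const char *const rounds[], int n)
{
    double sum = 0.0;

    if (n <= 0)
        return -1.0;
    for (int i = 0; i < n; i++) {
        double v = parse_duration(rounds[i]);
        if (v < 0.0)
            return -1.0;
        sum += v;
    }
    return sum / n;
}
```

Feeding it the three `real` values of one workload level gives a single number per scheduler, which is easier to compare than the raw rounds.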

Sunday, August 16, 2015

4.1 VRQ branch rework finished

Here are the new commits added to the vrq branch (in reverse order)

2a8eea0 bfs: vrq: grq.lock free schedule for deactivate code path
8e1ae7c bfs: vrq: grq.lock free context switch for prev==idle path
34c262f bfs: vrq: refine task_preemptable_rq().
22ce18c bfs: vrq: [3/3] preempt task solution, v1.2
79265ca bfs: vrq: [2/3] introduce xxxx_choose_task() in __schedule().
fc44466 bfs: vrq: [1/3] RQ on_cpu states v1.1
be4207e bfs: vrq: refine rq->prq/w_prq as rq->try_preempt_tsk
f4aeee0 bfs: vrq: remove unused unsticky_task.
9c53147 bfs: vrq: Fix vrq solution 0.5 UP compile issue

Both bitbucket and github are updated! The most important objective of this release is stability. I got a new HW platform that exposed stability issues which could not be found on the old platforms, and I believe the major ones have been fixed.

There are still three key features on the vrq branch, as mentioned in vrq-04-update-for-linux-40y, but the cache count solution has advanced a little. The responsible commit is now
ed20056 bfs: vrq: [2/2] scost for task caching v0.7

which is a replacement for the sticky_task design in the original BFS. I'll start another topic for it.

Now all commits are set for the vrq 4.1 branch. Benchmarks will be run this week, since there are many toolchain upgrades for my distribution in this release. Looking forward, 4.2 will be out next week, and hopefully there will be less sync-up work so that I can spend more time on new commits.

BR Alfred

Monday, August 10, 2015

A big commit added to 4.1 VRQ

As the title says, this big commit is 117d783 bfs: VRQ solution v0.5

I think most of the instability in the previous vrq release was caused by this commit, and I believe most known issues (on my machines) have been fixed. It has been running stably for two weeks, so you are encouraged to give it a try.

Known issue:
BUG: using smp_processor_id() in preemptible code, call trace from sys_sched_yield().

There are still a few commits left that I haven't reworked yet. I plan to finish them within two weeks, before the new kernel release and another sync-up cycle begins.

BR Alfred

Friday, August 7, 2015

4.1 vrq branch update -- reworking

The 4.1 vrq branch is updated, but no new commit has been added: since there is a new sync to pick up bfs0463 and kernel v4.1.4, new commits have to be postponed to next week.

A fix has been added to the last commit to fix the compile error on UP config.

BR Alfred

Wednesday, August 5, 2015

gc-branch update with CK's BFS 0463

CK finally released BFS 0463 against kernel 4.1 this week, so here comes the gc branch update.

What's new:
1. Based on BFS 0463 and kernel v4.1.4
2. Fix/Sync against BFS 0463
  • 3b14908 bfs: [Sync] 4.1 schedule_user().
  • 9f9dc34 bfs: [Fix] 0463 remove unused register_task_migration_notifier().
3. New Sync patches which pick up earlier mainline changes (mostly from 3.17 to 3.18, plus some fixes on top of previous patches)
  • 0145370 bfs: [Sync] TIF_POLLING_NRFLAG for wake_up_if_idle() and resched_curr().
  • 775e28a bfs: [Sync] sched_init_numa().
  • c6c5894 bfs: [Sync] task_sched_runtime().
  • 4a48abf bfs: [Sync] sched_setscheduler() logic, v3
4. Give a meaningful version name to this patch set: "v4.1_0463_1"
  • dc4fa45 bfs: -gc BFS enchancement patch set version.

The code has been force-pushed to bitbucket and github. For those who just want an easier way to apply the patches, here is an all-in-one patch that includes all BFS-related commits in my gc-branch: bfs_enhancement_v4.1_0463_1.patch

If you are using the gc-branch, I highly suggest that you upgrade to this gc release. An updated -vrq branch will be coming soon. No new commits are planned (they have to be delayed to next week because of the amount of sync-up work this week), but there will be some bug fixes for the existing ones.

BR Alfred Chen

Added one more commit to fix the RCU stall issue.

bfs: v4.1_0463_1 rcu stall fix.