Thursday, March 1, 2018

CacheHot experimental patch for PDS

Overview


In modern computer system design, cpu cache plays an important rule in performance but it "becomes a resource which is transparently used and administered by the processors.", which make it like a "black whole" and hard to be measured, especially from per task aspect in the kernel point of view. Despite of the difficulty and limitation, it is worthy to give it a shot based on recent test result of a prototype cache hot patch for PDS scheduler.

The Cache-Hot prototype patch here is trying to determine a task is cache hot or not when it is waking up, then choose different method to select a cpu to run on based on this cache hot information. If the task is cache hot, then it can use the accurate but complicated method to choose a cpu be affinity with the cache which the task resides in. If the task is cache cool, then it can simply choose any cpu it can run on. In this way, the overhead of selecting a cpu to run on is reduced.

The test result of kernel compilation result can be download from here. It shows that overhead is reduced in heavy workload, and the overhead cutting should be able to been seem in light workload if advance the measure model in the feature.

Limitation

1. The prefer setting/formula in this patch may just work for x86 arch.
2. The prefer setting/formula in this patch may just work for certain workload(linux kernel compilation)

Consider these limitations and this version is still a prototype patch, it will be better not official put into PDS and provide it as an experimental patch to try it out.

Try out the patch

The patch can be download from here, apply it upon pds098k.

Then adjust the SCHED_CACHE_HOT_SWITCHES_TH in pds.c to the reference value in the below formula according to your cpu topology

SCHED_CACHE_HOT_SWITCHES_TH = 8  * (LLC_SIZE / 3) * (1.8 / CPU_SPEED) * (4 / NUM_CPUS)

LLC_SIZE is last level cache size in MB
CPU_SPEED is the cpu speed in GHz
NUM_CPUS is the numbers of logical cpu(HT counts in)

Feel free to try other value than the reference one, and on different kind of workload. Your feedback will be welcome.

What's next

Don't worry about too much of the SCHED_CACHE_HOT_SWITCHES_TH, it's just a simple measurement in the prototype. Another model is working in progress and will be available in next version.