Wednesday, September 14, 2016

About skip list in BFS

It's great that CK release huge changes for BFS in 0480 release, the changes not only include cpufreq trigger and cpufreq governor change in recent kernel release but also the most fun part, use skip list to replace the bitmap mask linked queue in global run queue data structure.

May be there are many changes at a time, so BFS 0480 is not settle down yet, most issue are related to cpufreq trigger and cpufreq governor code. I was planed to rework the about 60 -vrq commits upon 0480, but now I'd had to wait till it is calmed down. Meanwhile, I have done some tests for the skiplist design in BFS. That means only three patches in 0480 is used, they are


bfs472-fix_set_task_cpu.patch
skiplists.patch
bfs472-skiplist.patch

For my kernel compilation sanity tests, sl is almost as same as bfs0472, only different is sl is a little bit better than 0472 at 300% workload.

For hackbench(from https://lwn.net/Articles/351058/), 20 groups, 0472 finished at about 4sec, sl finished at 3.6sec. Consider -vrq(0472 based) finished at 3.8sec and cfs at 3.2sec, I can't wait to see how it goes when sl combine with -vrq.

The skiplist design for BFS, IMO, as an initial release, it looks pretty good and very potential. Here are my comments for the skiplist in 0480

#1 In earliest_deadline_task(), RT tasks is now impacted by interactive setting. IMO, it should keep the same behaviors as it is at 0472.

#2 When interactivity is enabled and consider both normal policy tasks and idle policy tasks are in the queue, for example tasks in queue are in this order (N1, N2, D1, D2), in current implement, it's possible that earliest_deadline_task() go through all 4 tasks to pick up one among them. But normal tasks should always has higher priority than idle ones, so just go through the tasks with normal policy(that's the N1, N2 in this example) will be enough.

#3 kmalloc is used for current skip list implementation. It is not a good design for critical code like a scheduler. I have rewriten the skip list implementation to embedded the data structure into the task_struct etc. (PS, CK address this in 0497, meanwhile I have rewritten an embedded implementation too)

#4 Currently, there are same probability that randomLevel() returns 0~15. But "where an element in layer i appears in layer i+1 with some fixed probability p (two commonly used values for p are 1/2 or 1/4)" -- from https://en.wikipedia.org/wiki/Skip_list , so there should be some improvement to adjust the random level.

I have finished #1 and #2 propose enhancement patch and take it as a baseline for my rest SL related embedded/improvement skip list implementation patches. Once the code has been cleaned up, I will push the rest patches to the linux-4.7.y-sl branch in my git repositories.

BR Alfred

No comments:

Post a Comment