Here are the sanity test results for BFS, the -gc branch and the -vrq branch. No regression was found on the -gc branch; it is still doing better than the original BFS at 50% and 100% workload.
For the -vrq branch, there is no huge improvement over the -gc branch: performance at 50% and 300% workload is almost the same, and there is even a slight regression at 100% workload. The only good news is an improvement at 150% workload.
The reasons why vrq doesn't perform as well as I expected are:
1. Some additional rq lock sessions were introduced when implementing the new lock strategy.
2. grq lock contention doesn't seem to be a major problem for systems with few cores, at least not on my test hardware platform (4 cores).
I wish I had a chance to get my hands on some 30+ core monsters to prove that all the code in vrq is worth it. Until then, I'll continue with the unfinished features in vrq, such as cache_count, and see how much performance can be gained from these opened doors.
BFS0462:
>>>>>50% workload
>>>>>round 1
real 5m21.850s
user 9m55.977s
sys 0m41.537s
>>>>>round 2
real 5m21.653s
user 9m55.750s
sys 0m41.411s
>>>>>round 3
real 5m21.973s
user 9m56.570s
sys 0m41.192s
>>>>>100% workload
>>>>>round 1
real 2m52.203s
user 10m8.151s
sys 0m43.575s
>>>>>round 2
real 2m52.050s
user 10m8.423s
sys 0m43.515s
>>>>>round 3
real 2m50.865s
user 10m8.283s
sys 0m43.700s
>>>>>150% workload
>>>>>round 1
real 2m56.355s
user 10m29.334s
sys 0m44.955s
>>>>>round 2
real 2m56.189s
user 10m29.469s
sys 0m44.782s
>>>>>round 3
real 2m56.264s
user 10m29.485s
sys 0m44.845s
>>>>>300% workload
>>>>>round 1
real 3m0.412s
user 10m42.805s
sys 0m46.352s
>>>>>round 2
real 3m1.408s
user 10m42.618s
sys 0m46.341s
>>>>>round 3
real 3m0.287s
user 10m43.304s
sys 0m46.244s
linux-4.0.y-gc:
>>>>>50% workload
>>>>>round 1
real 5m18.823s
user 9m50.911s
sys 0m41.302s
>>>>>round 2
real 5m19.032s
user 9m51.597s
sys 0m40.984s
>>>>>round 3
real 5m18.960s
user 9m51.490s
sys 0m41.046s
>>>>>100% workload
>>>>>round 1
real 2m51.085s
user 10m8.806s
sys 0m43.699s
>>>>>round 2
real 2m50.870s
user 10m8.108s
sys 0m44.142s
>>>>>round 3
real 2m50.839s
user 10m8.290s
sys 0m43.979s
>>>>>150% workload
>>>>>round 1
real 2m56.285s
user 10m30.045s
sys 0m44.629s
>>>>>round 2
real 2m56.286s
user 10m30.054s
sys 0m44.866s
>>>>>round 3
real 2m56.333s
user 10m30.379s
sys 0m44.425s
>>>>>300% workload
>>>>>round 1
real 3m0.427s
user 10m43.455s
sys 0m46.739s
>>>>>round 2
real 3m0.222s
user 10m43.341s
sys 0m46.519s
>>>>>round 3
real 3m0.244s
user 10m43.029s
sys 0m46.608s
linux-4.0.y-vrq:
>>>>>50% workload
>>>>>round 1
real 5m18.905s
user 9m51.214s
sys 0m40.890s
>>>>>round 2
real 5m18.994s
user 9m51.203s
sys 0m41.029s
>>>>>round 3
real 5m18.818s
user 9m51.266s
sys 0m40.819s
>>>>>100% workload
>>>>>round 1
real 2m51.414s
user 10m7.739s
sys 0m43.785s
>>>>>round 2
real 2m51.146s
user 10m7.449s
sys 0m43.848s
>>>>>round 3
real 2m51.103s
user 10m7.721s
sys 0m43.499s
>>>>>150% workload
>>>>>round 1
real 2m54.407s
user 10m21.732s
sys 0m44.407s
>>>>>round 2
real 2m54.436s
user 10m21.212s
sys 0m44.824s
>>>>>round 3
real 2m55.156s
user 10m21.279s
sys 0m44.796s
>>>>>300% workload
>>>>>round 1
real 3m0.549s
user 10m43.723s
sys 0m46.342s
>>>>>round 2
real 3m0.475s
user 10m44.249s
sys 0m45.982s
>>>>>round 3
real 3m0.393s
user 10m44.088s
sys 0m46.114s
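To make the comparison easier to read than the raw logs, here is a minimal sketch that averages the three "real" times per kernel. It only covers the 150% workload, with the values copied from the results above; the helper names to_seconds and mean_real are made up for this illustration and are not part of any test script I used.

```python
import re
from statistics import mean


def to_seconds(stamp: str) -> float:
    """Convert a `time`-style duration such as '2m56.355s' to seconds."""
    m = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp.strip())
    if m is None:
        raise ValueError(f"unrecognised duration: {stamp!r}")
    return int(m.group(1)) * 60 + float(m.group(2))


def mean_real(rounds):
    """Average a list of 'real' durations, returning seconds."""
    return mean(to_seconds(r) for r in rounds)


# 'real' times of the three rounds at 150% workload, copied from the logs above.
real_150 = {
    "BFS0462": ["2m56.355s", "2m56.189s", "2m56.264s"],
    "linux-4.0.y-gc": ["2m56.285s", "2m56.286s", "2m56.333s"],
    "linux-4.0.y-vrq": ["2m54.407s", "2m54.436s", "2m55.156s"],
}

averages = {name: mean_real(rounds) for name, rounds in real_150.items()}

if __name__ == "__main__":
    for name, avg in averages.items():
        print(f"{name}: {avg:.3f}s")
```

The same dictionary layout can be repeated for the 50%, 100% and 300% workloads to get one average per kernel and workload.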
Drawing simple charts based on the numbers above would really make them easier to take in :).