
20140902

20140902174529 cdent

Data checks, mysql backend, mariadb 5.5.38

Sample setup:

start a fresh devstack with proper ceilo branch
create several instances with javelin2
poll at a high frequency until approximately 10,000 samples
kill off all services except ceilo-api
use ceilorunner.py to make queries
- meter-list: ceilorunner.py --num-iterations 50 --print-stats meter-list
- sample-list: ceilorunner.py --num-iterations 50 --print-stats sample-list -m vcpus
data presented as average of 5 consecutive runs of the above
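
The shape of the measurement above (50 iterations per run, averaged over 5 consecutive runs) can be sketched roughly as follows. This is not ceilorunner.py itself; the function names and the placeholder workload are assumptions made purely for illustration.

```python
import statistics
import time


def time_iterations(query, num_iterations=50):
    """Call query() num_iterations times; return (total, per-iteration) seconds."""
    start = time.perf_counter()
    for _ in range(num_iterations):
        query()
    total = time.perf_counter() - start
    return total, total / num_iterations


def average_runs(query, runs=5, num_iterations=50):
    """Average the total time of several consecutive runs."""
    totals = [time_iterations(query, num_iterations)[0] for _ in range(runs)]
    mean_total = statistics.mean(totals)
    return mean_total, mean_total / num_iterations


if __name__ == "__main__":
    # Hypothetical stand-in workload; a real run would issue an API query here.
    def fake_query():
        sum(range(1000))

    total, per = average_runs(fake_query)
    print(f"{total:.3f} seconds / {per:.5f} per iteration")
```

Averaging whole runs rather than individual calls matches how the numbers below are reported: a total for 50 iterations, then that total divided by 50.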

The hard data follows below but first:

Caveats and Conclusions

One thing we've realized from the various experiments being done is that the choice of database engine has a bigger impact than we would like. We've shown that if these same tests are run with MySQL 5.5 the results will be different. Given that, what we're trying to find is a combination of changes that improves the worst case in MySQL 5.5 but does not damage any other database (be that MariaDB, SQLite or PostgreSQL).

At a superficial level I think we can conclude:

A combination of the additional index plus Gordon's normalisation changes gives a reasonable improvement, especially given that write performance has been shown elsewhere to be much improved. The slight difference in sample-list performance can be attributed to noise.
My patch can be abandoned¹. It's not adding anything except uncertainty: the different results don't make much sense given the code diff. It probably means the testing is not robust enough (but we already know this).

Something we should keep in mind is that none of these tests include filtering the result set (by resource metadata and friends). That could potentially be a problem. A query which is fast without any WHERE clauses may behave entirely differently when there are some.
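
As a rough illustration of why filtering matters, the sketch below uses an in-memory SQLite database to show how the plan for a query changes once a WHERE clause is added, and again once a matching index exists. The table, column and index names are hypothetical and are not the ceilometer schema. MySQL and MariaDB have their own EXPLAIN output and optimizer behaviour, so this only demonstrates the general effect.

```python
import sqlite3

FILTERED = "SELECT * FROM sample WHERE resource_id = ?"


def build_db(num_samples=10000):
    """Create a hypothetical sample table with roughly the row count tested here."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sample (id INTEGER PRIMARY KEY, resource_id TEXT, volume REAL)"
    )
    conn.executemany(
        "INSERT INTO sample (resource_id, volume) VALUES (?, ?)",
        [(f"res-{i % 200}", float(i)) for i in range(num_samples)],
    )
    return conn


def query_plan(conn, sql, params=()):
    """Return SQLite's plan description (the last column of each plan row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


def compare_plans():
    conn = build_db()
    plans = {"unfiltered": query_plan(conn, "SELECT * FROM sample")}
    plans["filtered, no index"] = query_plan(conn, FILTERED, ("res-7",))
    conn.execute("CREATE INDEX ix_sample_resource_id ON sample (resource_id)")
    plans["filtered, indexed"] = query_plan(conn, FILTERED, ("res-7",))
    conn.close()
    return plans


if __name__ == "__main__":
    for label, plan in compare_plans().items():
        print(f"{label}: {plan}")
```

Running the equivalent EXPLAIN against the real backend, with realistic metadata filters, would be the way to find out whether the patches tested here still help in that case.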

We also need to keep in mind that none of these tests are working with production-level data. 10,000 samples is nothing. If/when time allows we need to be able to automate tests against much larger datasets.

Finally, in my experience the performance of any MySQL (or derivative) database server is highly dependent on the my.cnf settings. Painfully so. The defaults provided with most distributions are not very good because they try to save memory. That would not be the case on a dedicated database server, where all of the indexes and much of the table space would be in RAM all the time.

All this just means we must keep in mind that these tests are taking a very narrow view on performance.

Master

MasterSchema
samples: 10006
meters: 43
meter-list: 4.946 seconds / 0.0982 per iteration
sample-list: 3.62 seconds / 0.0724 per iteration
resource-list: 77.65 seconds / 1.553 per iteration (ouch)

Index Patch

IndexPatchSchema
samples: 10005
meters: 43
meter-list: 4.266 seconds / 0.08532 per iteration
sample-list: 3.87 seconds / 0.0774 per iteration
resource-list: 21.19 seconds / 0.4238 per iteration

Normalise Patch

NormalisedSchema
samples: 10003
meters: 43
resources: 242
meter-list: 3.3 seconds / 0.066 per iteration
sample-list: 3.58 seconds / 0.0716 per iteration
resource-list: 18.09 seconds / 0.3618 per iteration

Normalise + Index

NormalisePatchSchema
samples: 10125
meters: 43
resources: 236
meter-list: 2.92 seconds / 0.058 per iteration
sample-list: 3.65 seconds / 0.073 per iteration
resource-list: 17.24 seconds / 0.345 per iteration

cdent's WIP + Index

WipSchema (should be same as NormalisePatchSchema)
samples: 10130
meters: 43
resources: 230
meter-list: 3.698 seconds / 0.0739 per iteration
sample-list: 3.68 seconds / 0.0736 per iteration
resource-list: 19.71 seconds / 0.392 per iteration
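
To compare the five result sets above against Master at a glance, the totals can be turned into speedup ratios (Master time divided by variant time, so values above 1 are faster). The sketch below only rearranges the numbers already reported on this page; the dictionary layout and function name are illustrative assumptions.

```python
# Total seconds for 50 iterations, copied from the results on this page.
RESULTS = {
    "Master": {"meter-list": 4.946, "sample-list": 3.62, "resource-list": 77.65},
    "Index Patch": {"meter-list": 4.266, "sample-list": 3.87, "resource-list": 21.19},
    "Normalise Patch": {"meter-list": 3.3, "sample-list": 3.58, "resource-list": 18.09},
    "Normalise + Index": {"meter-list": 2.92, "sample-list": 3.65, "resource-list": 17.24},
    "cdent's WIP + Index": {"meter-list": 3.698, "sample-list": 3.68, "resource-list": 19.71},
}


def speedups(results, baseline="Master"):
    """Return baseline_time / variant_time for every query of every variant."""
    base = results[baseline]
    return {
        name: {query: base[query] / seconds for query, seconds in timings.items()}
        for name, timings in results.items()
        if name != baseline
    }


if __name__ == "__main__":
    for name, ratios in speedups(RESULTS).items():
        line = ", ".join(f"{query} {ratio:.2f}x" for query, ratio in ratios.items())
        print(f"{name}: {line}")
```

Ratios close to 1, as with sample-list, are within what the conclusions above treat as noise.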

¹ I'm not entirely certain I did the merge of various things properly, but since my change only affects the select used by get_meters, and the other changes seem to give a better improvement, we may as well use those.