Add some benchmarks for KFileItemModel, improve worst-case performance of KFileItemModel::insertItems() and KFileItemModel::removeItems()

Review Request #108540 - Created Jan. 22, 2013 and submitted

Information
Submitter: Frank Reininghaus
Repository: kde-baseapps
Branch: master
Reviewers
Groups: dolphin
I think it might be a good idea to make sure that Dolphin behaves better when handling many files, so I started looking at places in the code which have sub-optimal performance. There are probably quite a few of them - I'll start with KFileItemModel. Of course, everyone is invited to join the fun and look at other classes :-) My first target is inserting and removing items. The patch does two things:

1. Add a kfileitemmodelbenchmark that tests two things: inserting many files in two equally sized batches, and removing one half from a set of many files. The half that is added/removed can be the first or second half in a sorted list of files, or all odd files. I've made the benchmark a regular executable rather than a unit test: unit tests are often run automatically, and their results are only looked at if something goes wrong, so running benchmarks with every "make test" invocation might waste CPU cycles. Nonetheless, I do verify the model state for correctness; I think that can never hurt. A simplified sketch of how such a benchmark executable can look is shown after the next paragraph.

When passing the -xml option to the benchmark, the XML results can be transformed into nice graphs using qtestlib-tools:
http://blog.qt.digia.com/blog/2008/12/05/qtestlib-now-with-nice-graphs-pointing-upwards/
I'll attach screenshots of "before" and "after" benchmark results. The values on the y-axis are milliseconds.
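
For illustration, here is a minimal sketch of how such a benchmark executable can be structured with QtTest's QBENCHMARK macro. It uses a plain QVector<int> instead of KFileItemModel to stay self-contained - the class and file names are hypothetical, not the actual kfileitemmodelbenchmark code:

// Hypothetical, simplified benchmark sketch; builds as a regular
// executable rather than being registered as a unit test.
#include <QtTest>
#include <QVector>
#include <algorithm>

class InsertBenchmark : public QObject
{
    Q_OBJECT

private slots:
    // Worst case for a contiguous container: every insert at the
    // front shifts all existing items.
    void insertManyItemsAtFront()
    {
        QVector<int> items;
        QBENCHMARK {
            items.clear();
            for (int i = 10000; i > 0; --i) {
                items.insert(0, i);
            }
        }
        // Like the real benchmark, also verify correctness afterwards.
        QVERIFY(std::is_sorted(items.constBegin(), items.constEnd()));
    }
};

QTEST_GUILESS_MAIN(InsertBenchmark) // Qt 5; use QTEST_MAIN on Qt 4
#include "insertbenchmark.moc"      // assuming this file is insertbenchmark.cpp

Running the resulting executable with the -xml option then produces output that qtestlib-tools can turn into the graphs mentioned above.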

2. The problem with KFileItemModel::insertItems() and KFileItemModel::removeItems() is that they insert/remove the items one by one in the QList m_itemData (see the sketch below). This was easy to implement, but inserting/removing a single item is O(N) for QList, and if we do that N times, we can end up with O(N^2) complexity, which is not good. My test results indeed show that this happens in some cases. The only cases that perform well are "add many items at the back" and "remove many items from the back".
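
To make the problem concrete, here is a minimal sketch of the one-by-one pattern, with a sorted QVector<int> standing in for m_itemData (hypothetical code, not a quote from KFileItemModel):

// Merging sorted new items into a sorted container one by one.
// Every insert() shifts the tail of the container, so inserting N
// items near the front costs O(N^2) in total.
void insertOneByOne(QVector<int> &dest, const QVector<int> &newItems)
{
    int pos = 0;
    for (int i = 0; i < newItems.size(); ++i) {
        const int value = newItems.at(i);
        while (pos < dest.size() && dest.at(pos) < value) {
            ++pos;
        }
        dest.insert(pos, value); // O(dest.size() - pos) per call
        ++pos;
    }
}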

To resolve this, I propose to first only calculate the item ranges that are added/removed, and to do the actual inserting/removing only when this is finished, such that the final position of each item is known and every item has to be moved only once (a simplified sketch follows below). This does make the code a bit more complicated, but it ensures that adding/removing N items is O(N). One might be able to do it with less code by replacing the int-based indices in these functions with iterators and using std::copy and std::copy_backward to do some of the work, but I didn't want to change too many things at the same time.
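
A minimal sketch of this idea, again with a sorted QVector<int> standing in for m_itemData (hypothetical code, not the actual patch):

// Two-pass batch insert: grow the container once, then merge from
// the back so that every item is moved exactly once - O(N) in total.
// Both 'dest' and 'newItems' must be sorted.
void insertBatch(QVector<int> &dest, const QVector<int> &newItems)
{
    const int oldSize = dest.size();
    dest.resize(oldSize + newItems.size());

    int readOld = oldSize - 1;
    int readNew = newItems.size() - 1;
    int write = dest.size() - 1;

    // Merging back to front ensures that no element is overwritten
    // before it has been read (the same effect as std::copy_backward).
    while (readNew >= 0) {
        if (readOld >= 0 && dest.at(readOld) > newItems.at(readNew)) {
            dest[write--] = dest.at(readOld--);
        } else {
            dest[write--] = newItems.at(readNew--);
        }
    }
}

Once the loop has consumed all new items, the remaining old items are already at their final positions, so no further moves are needed.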

Any thoughts? If there is agreement that this makes sense, I'll start to look at other places in those functions and in other parts of KFileItemModel. I have some ideas already - for example, removing items is currently more expensive than inserting them (see the benchmark results), and I think we can make sure that it isn't.

Testing Done
Yes, unit tests still work fine. Benchmark results attached (obtained with a release build).

Files
(Attachments: "before" and "after" benchmark result screenshots.)


Status: Closed (submitted)
