Only initialize the hash m_items in KFileItemModel if it is needed

Review Request #115432 - Created Feb. 2, 2014 and submitted

Frank Reininghaus
KFileItemModel has a member

QHash<KUrl, int> m_items

which makes it possible to find the index of an item if the URL is known. m_items is updated every time an item is added to/removed from the current directory, or if the sort order changes. This is convenient in many situations, but sometimes, this hash also causes problems:

1. In some corner cases, the model can get into a state where the indexes in m_items are not correct. There were, e.g.,

which I could reproduce by doing a search in Details View and expanding folders, such that the same URL appeared twice in the view (but only once in m_items).

But I think that there are other ways to make the relationship between m_items and the model contents inconsistent. A crash that looks like it is most likely caused by such problems is

2. Initializing m_items costs CPU cycles and memory, which is wasteful if m_items is not used at all.

Therefore, I propose to make the following changes:

(a) Do not require that all URLs are in m_items any more. Instead, the new requirement is: if there are N URLs in m_items, it is guaranteed that they are the first N URLs in the model.

(b) Replace most calls of m_items.value(url) by index(url) in KFileItemModel. Note that some local variables 'index' had to be renamed for this to work.

(c) In KFileItemModel::index(const KUrl&), check if the URL is in m_items already. If that is not the case, add 1000 URLs to the hash, and check again. Repeat until either the URL is found, or all URLs are in the hash (-1 is returned in the latter case). Note that we always know where to continue adding URLs: if m_items.count() is N, then ">item.url()" is the first URL that is not in m_items.

BTW, my first version of index(const KUrl&) added URLs to the hash one by one, and checked every time if the next URL is the right one by comparing the URLs with ==. However, this was slow and required a lot more memory because comparing URLs with the == operator triggers a parsing of the URL, which needs memory and CPU cycles. I don't know if this is still the case in Qt 5, but in any case, adding new URLs in larger batches and then checking if the hash contains the searched URL should have a lower amortized cost in any case.

(d) In insertItems() and removeItems(), clear the hash m_items. It will be re-populated the next time index() is called. Note that Dolphin < 4.11 also cleared the entire hash in these cases (it re-created the entire hash immediately though). In contrast to that, 4.11 and 4.12 only update the URLs/indexes starting from the first added/removed item. This can be problematic though - in bug 329494, we get a crash when trying to remove one of the removed URLs from m_items.
I could not find any regressions. Even if I could never reproduce the crash in bug 329494, I'm confident that the crash is fixed because the line where it tries to access the URL of a removed KFileItem and then crashes is removed (and there is no other access to the KFileItem in that function).

Memory usage when loading a directory with 100,000 files is reduced by about 5 MiB in all view modes (from about 150 to 145) if previews are disabled. (If previews are enabled, KFileItemModelRolesUpdater calls the index(KUrl&) method, which fills m_items, and memory usage is as before).

The time for loading a directory with 100,000 files is reduced by about 200 ms on my system (to about 4.3 seconds in Icons View, 3.7 in Compact and 2.8 in Details).
Mark Gaiser
Emmanuel Pescosta
Commit Hook
This review has been submitted with commit e45fc620b4c0d629eadf1b13538020cf7e79a4e3 by Frank Reininghaus to branch master.
Frank Reininghaus
Review request changed

Status: Closed (submitted)