KFileItemModelRolesUpdater: improve performance, fix some bugs, and waste less resources

Review Request #110839 - Created June 5, 2013 and submitted - Latest diff uploaded

Frank Reininghaus
As I announced on the mailing list some time ago, I changed some parts of KFileItemModelRolesUpdater that I consider sub-optimal. Unfortunately, my progress was rather slow because I don't have that much free time for KDE, and much of that is eaten up by the endless stream of incoming bug reports and reviews. I have mostly finished it now and intend to commit it tonight - even though one could argue that it's technically not a new feature (it fixes a couple of bugs and performance problems), I think it needs testing in the beta releases, and one week of additonal testing by our brave master users to find any serious issues that might have slipped through my own testing might make sense.

I realize that I'm not really giving people much time to comment on this. However, I think that this review request is still useful because it's the most convenient way to view what has changed IMHO, and it also gives me the opportunity to describe the motivation for this change and the actual improvements in more detail than in a commit message.

1. Shortcomings of the existing implementation

(a) Icons/mime types/previews (and possibly other things) are determined for all items. This is a huge waste of resources in folders with many items because it's unlikely that the user will ever scroll the view through all of them.

One could argue that a benefit of the existing solution is that once the asynchronous loading of all data is finished, you always see the items in their final state when scrolling through them. However, it may take several minutes until that is the case in really huge folders, so it is questionable how useful that "feature" really is IMHO.

(b) Loading the icons for the visible icons is sometimes too slow:

If previews are enabled, we try to do a fast determination of the icons first for 200 ms. However, if that fails, no further icons are determined until previews and all other data are determined.

The result is that the user might get to see "unknown" icons for a rather long time when entering a folder.

(c) The "sort role" is determined together with all other roles. This can be a problem, e.g., when sorting by "Type", and the initial quick mime type determination fails to handle all files in the folder. The remaining mime types are then resolved only when previews for the respective items are received, which might cause a resorting of the items quite some time after the folder has been entered.

Moreover, this aspect currently requires that the code that notifies the KFileItemModel about the sort progress is mixed with many parts of KFileItemModelRolesUpdater, which made it (at least for me) sometimes hard to understand the code.

(d) The existing design has bugs, which sometimes cause items not to be updated at all. Use case:

* Open an empty folder in Details View
* Terminal Panel: mkdir 1 3
* Terminal Panel: mkdir 2 4

Note that the icon for '4' is not loaded, and an expansion toggle is shown even though there are no files inside.

The reason is that the function

startUpdating(const KItemRangeList& itemRanges)

treats the itemRanges as *existing* item ranges with indexes that correspond to current model indexes. However, the itemsInsertes(KItemRangeList) signal uses indexes which correspond to the positions of the *old* items before which the new items are inserted. It works if only one range is inserted, but if there are more, it goes wrong because the indexes of the 2nd, 3rd, ... ranges are incorrect.

We cannot really change the function startUpdating() though - this could break updating changed items (where the indexes really correspond to the *current* positions of the changed items).

Another symptom of this bug is that sometimes when loading a huge folder, and the items are received in multiple batches, some items do not get icons and other roles.

Strange that nobody has reported this bug yet, considering how many bug reports we get all the time ;-)

(e) Resorting the model has no effect on an ongoing update: try to load a huge folder with previews enabled. The visible previews are determined first, but if you revert the sort order before all items are resolved, you'll notice that the previews for the new visible area are determined slowly in random order.

I admit that this would be easy to fix even with the current design though.

I hope I've made it clear why I belive that we have to change some things in this class.

2. Design considerations

Resolving the items happens in several phases. I've added a new member m_state, which helps the asynchronously invoked functions to know what's currently going on (and to help them to know when to do nothing because something happend after the last asynchronous invocation), and which can take the following values:


The first two should be clear - 'Idle' means that there is nothing to do at the moment, and 'Paused' means that the view has notified us that resolving the roles should be continued later.

These are the steps that are taken to resolve all interesting roles for the "interesting" items - these are the visible items, the items which are on a few pages before/after the visible area, and on the first and last pages, i.e., those items which are visible or which might become visible quickly:

(a) First, the sort role is determined for all items (if it is a role that can be resolved by this class). This makes sense because all other steps are only done for visible items and items that can be reached quickly ('interesting' items), but

* the sort role must be known for all items, and
* we don't know which items will be finally visible before the sort role is known for all items.

If determining the sort role cannot be finished in 200 ms, the state is set to 'ResolvingSortRole', and the rest is done asynchronously by resolveNextSortRole().

(b) If determining the sort roles is done, or anything happens that affects the interesting items, the function startUpdating() is called. It does the following:

* Try to determine the icons by calling applyResolvedRoles(item, ResolveFast) for all visible items synchronously in 200 ms.

* Without previews: start an asynchronous resolving of all roles for the interesting items with resolveNextPendingRoles(). The state is set to 'ResolvingAllRoles'.

* With previews: start a preview job for the interesting items. All roles will be resolved for an item once the preview is received. At the same time, a fast asynchronous determination of the icons is started using resolveNextPendingRoles() to prevent that scrolling down before all previews have arrived will result in 'unknown' icons in the view. The state is set to 'PreviewJobRunning'.

(c) When all roles have been determined for the visible items, we look at the changed items and update these.

Some additinonal remarks on the purpose of some of the members:

m_finishedItems: a set that keeps all items which are done. Before any roles are requested for items by startUpdating(), it is checked whether the items are inside this set or not.

m_pendingSortRoleItems: the items for which the sort role is currently determined asynchronously by resolveNextSortRole().

m_pendingIndexes: the indexes of the items for which either all roles (if previews are disabled) or only the icon (if previews are enabled) are currently being determined asynchronously by resolveNextPendingRoles().

m_pendingPreviewItems: items that are left over from the last preview job. A new preview job will be started for them as soon as the current preview job is finished.

m_recentlyChangedItems: if items are changed repeatedly, they are put into this set first to prevent that we permanently try to update large files which are currently being downloaded or copied. These items are moved to m_changedItems when the m_recentlyChangedItemsTimer expires.

m_changedItems: items which have been changed and should be updated.

One might argue that the asynchronous invocation of slots is not the only way to perform expensive updates without blocking the GUI. Moving those things to a thread might be an obvious alternative, but synchronization issues might be very tricky to get right, and considering that KFileItem is not thread-safe AFAIK, this might cause lots of subtle problems or require workarounds which are ugly and harmful for the overall performance.

3. Known issues

When most of the stuff was working nicely, I realized that I went a bit too far with the "determine the sort role before anything else" idea. When you sort by type, and enter a large folder, it looks weird that there are "Unknown" icons for many items until the sorting is finished.

I tried to fix this with some hacks (determine icons for visible items also in resolveNextSortRole(), and try to determine the sort role for visible items first by using m_pendingSortRoleIndexes). This improved the situation, but does still not work perfectly.

I'll remove this hack and work on this until next week (maybe just always determine icons for the visible items when something interesting happens, no matter if we are still in ResolvingSortRole state). I think that this is better than waiting longer because it makes sense that people test the other parts of the change, and I also don't want to make a quick fix now that might break other things.

4. Final words

As I said, I realize that there might not much time for people to comment on this right now, but comments on possible problems or ideas for improvements are still welcome when this patch is in master, of course!

Even if it turns out that some basic ideas are nonsense, please don't hesitate to tell me. In the worst case, we can always just revert and go back to the KDE 4.10 state.
Works for me (except for the problem that icons for visible items are sometimes not loaded while the sort role is being determined).