Speed up natural sorting by first sorting the items according to KFileItem::text() with QString::operator<()
Review Request #113485 - Created Oct. 28, 2013 and submitted
We all know now that natural sorting can be very slow. There are two ways to improve this: 1. Make the "natural comparison" of two items faster. We might be able to do this when we can use QCollator from Qt5. 2. Reduce the number of (expensive) comparisons. This patch implements option 2 by first doing a very fast non-natural sorting of the items according to their name. The resulting sequence of items is partially sorted even according to the "natural" comparison, and this makes it possible to do the final sorting with a far lower number of comparisons.
Loaded a folder with 100,000 items with the names "a1.txt", ..., "a100000.txt". I guess that this is also one of the worst cases with respect to "difference between natural and non-natural sort order". This patch reduces the time needed to sort in a release build (commented out the profiling output in insertItems()) from 1940 ms to 714 ms, which is a saving of about 63%.
Hi Frank, That's one hell of a lot faster! It still doesn't beat..... <ok, i just skip it> ;) I do have one suggestion for you that might make it even faster. What you don't see doesn't have to be sorted, right? And to get that done, you might be able to use std::partial_sort  Lets say this is your entire folder (in ascii) 54 2 564 32 1 678 234 23 21 866 12 8 45 Lets say that this is your view: 54 2 564 ------- top part of view 32 1 678 234 23 21 ------- bottom part of view 866 12 8 45 Now partial_sort is able(?) to do this (not tested so i could be wrong). Same view partial sorted. ... 54 2 564 ------- top part of view ... 8 12 21 23 ... ------- bottom part of view 866 45 ... I know the above does work if you start from the top. I'm not 100% if it also works from the middle of a view. If it works then sorting can be reduced to the items you see. This would make the sorting insanely faster since all you have to sort would be about 100 entries at most, more aren't on the screen at any given time. Doing that would probably mean that you have to maintain which parts are sorted just to prevent needless resorting when the user scrolls up/down.  http://en.cppreference.com/w/cpp/algorithm/partial_sort
A "Ship It" from my side. Great idea btw. Another great thing about this patch is, that we can make use of these sequences of already "sorted" items to speed up the sorting, when we use Timsort in future (maybe?).