PLF C++ Library - plf::indiesort
Indiesort (portmanteau of 'indirect sort') is a sort-function wrapper which enables use of std::sort (or other random-access sort techniques) with non-random-access containers, and also improves sort performance for large/non-trivially-copyable types in random-access containers and arrays. It achieves this by sorting pointers/indexes to elements by the value of the elements they point to, then relocating the elements, rather than sorting the elements themselves. Elements will be moved a maximum of two times - most will be moved once or not at all.
It has a temporary memory cost of N * (sizeof(pointer) + sizeof(size_t)) for sorting non-random-access iterators/containers,
and a N * sizeof(U) cost for random-access iterators/containers, where U = the smallest unsigned integer able to store N. For example if the size of the range being sorted is <= 255, U will be unsigned char.
Similar techniques are implemented by Morwenn's Mountain sort, and elsewhere.
Indiesort should be used when:
- The temporary memory cost mentioned is non-problematic,
- The container or iterators are not random_access and therefore std::sort cannot be used, and/or
- The element type is large or non-trivially-movable/copyable.
It is, on average across all numbers of sorted elements:
- +146% faster than std::sort when used on vectors or arrays of large structs (496 bytes). Crossover point for increased performance over std::sort is any type larger than 152 bytes.
- +28% faster than std::list's internal sort, on types smaller than a large struct. Crossover point for increased performance over std::list's internal sort() is any type smaller than 272 bytes.
See the benchmarks for more information.
This technique developed out of the plf::list project, where it was used with the linked list to create a much faster sort than was available in std::list. From there it was ported into plf::colony, which, while having a large memory cost compared to plf::list's sort, was still useful. On Jens Maurer's suggestion, I separated the sort technique from colony, and in the process, streamlined and reduced the memory cost. It can now be used with any iterator/pointer range or container. Note: plf::list's internal sort is still superior to indiesort for a linked list, as it changes the previous and next pointers of each node rather than relocating elements, whereas indiesort will always reallocate elements.
The algorithm consists of gather, sort and scatter phases:
- Gather pointers:
- Obtain the size N of the range to be sorted, either via a container's size() function,
last - first for random-access iterator ranges, or ++ iteration for non-random-access iterator ranges. If size is < 2 we return.
- Construct an array of pointer+size_t pair structs of size N. Fill the array with pointers to elements in the range + the corresponding index of each element within that range. This is the
sort_array. It's pointer member is called
original_location, while it's size_t member is called
Note: For random_access iterators, original_location can be obtained from first + original_index, and can be omitted, with the size_t reduced to the smallest unsigned integer type whose std::numeric_limits::max() is larger than N. This significantly reduces memory usage in those scenarios.
- Sort the sort_array via the values of the elements which the original_location members point to, by default using std::sort. This is done via use of a sort_dereferencer functor of a type similar to that used in plf::list's pointer-sort hack section.
- Scatter the elements to their new locations: to do this we traverse the sort_array using index i in the loop below:
- If the end of the array has been reached at this point we exit the loop.
If sort_array[i].original_index == i this means element position is unchanged post-sort, so we increment i and repeat this step.
Likewise if the value of sort_array[i].original_index == std::numeric_limits<size_t>::max() that means this index has already been processed in a move-chain (see (c) and (d)), so we increment i and repeat this step.
- Otherwise we move the element which this pair's original_location points to, into a temporary variable of the same element type called
We then store i as
destination_index and sort_array[i].original_index as
- We move the element pointed to by sort_array[source_index].original_location, to the location pointed to by sort_array[destination_index].original_location. We then set destination_index to the value of source_index. and set source_index to sort_array[destination_index].original_index. Then set sort_array[destination_index].original_index to std::numeric_limits<size_t>::max(). At this point, if source_index == i, that means the end of this move-chain has been reached, so we move the element from end_value to the location pointed to by sort_array[destination_index].original_location, increment i, and return to step (a). Otherwise we repeat this step.
- Once the loop is over, we destroy and free the sort_array.
plf::indiesort is under a permissive zlib license. This means:
Software is used on 'as-is' basis. It can be modified and used in commercial
software. Authors are not liable for any damages arising from its use. The
distribution of a modified version of the software is subject to the following
- The authorship of the original software must not be misrepresented,
- Altered source versions must not be misrepresented as being the original
- The license notice must not be removed from source distributions.
Download here (6kb zip file) or
view the repository
The indiesort function library is a simple .h header file, to be included with a #include
In addition if you are interested in benchmarking you can also download the plf benchmark suite (64kb zip
file), which includes plf::nanotimer.
Function Descriptions and syntax
template <class iterator_type, class sort_comparison>
void indiesort (iterator_type first, iterator_type last, sort_comparison compare)
Sorts range using std::sort internally.
template <class iterator_type>
void indiesort (iterator_type first, iterator_type last)
Sorts range using std::sort with a less-than comparison internally.
template <class container_type, class sort_comparison>
void indiesort (container_type &container, sort_comparison compare)
Sorts container using std::sort internally.
template <class container_type>
void indiesort (container_type &container)
Sorts container using std::sort and a less-than comparison.
template <class iterator_type, class sort_comparison>
void indiesort (iterator_type first, iterator_type last, sort_comparison compare, std::size_t size)
Sorts range using std::sort internally. If the size of the range is known in advance and the iterators are not random_access, use this variant for better performance. If the size of the range isn't known, use the range-based functions above. If the range is begin() - end() of a container, use one of the container-based functions above. If the iterators are random_access, the range-based functions above will figure out the range from last - first.
Note: in the case of non-contiguous random-access containers, such as std::deque, the function above as it will be much faster than the others. Unfortunately it will have the same memory cost in this case as calling indiesort on a non-random-access containers.
Unfortunately there is no tag in C++ for matching iterators of contiguous containers, for reasons which are hard to explain.
To use a sort function other than std::sort, #define PLF_INDSORT_SORT_FUNCTION to the name of your sort function before calling plf::indiesort. Whatever sort function you use must take arguments in the same format and order as std::sort. If PLF_INDSORT_SORT_FUNCTION is not defined STL
algorithm.h will be included and std::sort will be used instead. To avoid the plf_indiesort.h pulling in algorithm.h PLF_INDSORT_SORT_FUNCTION must be defined before including plf_indiesort.h.
Basic code example
#include <functional> // std::greater
for (int i = 0; i != 60; ++i)
vec.push_back(60 - i);
int last = 0;
for (std::vector<int>::iterator it = vec.begin(); it != vec.end(); ++it)
if (last > *it)
std::cout << "Less-than full-container sort failed\n";
last = *it;
std::cout << "Less-than full-container sort passed\n";
std::vector<int>::iterator it1 = vec.begin() + 5, it2 = vec.begin() + 50;
plf::indiesort(it1, it2, std::greater<int>());
last = 60;
for (std::vector<int>::iterator it = it1; it != it2; ++it)
if (last < *it)
std::cout << "Greater-than range sort failed\n";
last = *it;
std::cout << "Greater-than range sort passed\n";
Benchmark results for indiesort v1.05 under GCC 9.2 x64 on an Intel Xeon E3-1241 (Haswell core) are here.
- 2020-9-17: v1.07 - Removal of destructor calls since all element locations inevitably get written to in the container/array so any necessary destructor calls for non-trivial objects will be handled by the move/copy operators of that element type.
- 2020-9-14: v1.06 - Removal of unnecessary branch checks, thanks to sentinel reduction, courtesy of github user morwenn.
- 2020-7-05: v1.05 - Alteration to make workaround for non-contiguous random-access containers.
- 2020-7-05: v1.04 - Addition of random-access range specialisation - 7% faster on average with reduction of temporary memory usage down to N * unsigned, where unsigned is the smallest unsigned integer able to hold N eg. if N is < 255, unsigned can be unsigned char. Improvement of 16% with vectors & large structs. Correction to test suite under C++03. Fixes & improvements for C++11 mode.
- 2020-6-28: v1.02 - Further compiler switch reduction. Correction to element destruction with non-move-semantics-supporting compilers.
- 2020-6-27: v1.01 - Comments cleanup, change to code to remove algorithm.h include if PLF_INDSORT_SORT_FUNCTION is defined. Reduction of compiler support biolerplate code to bare minimum. Reduction of test_suite to bare minimum. Stronger enforcement of internal namespace for C++11-alike functions/structs.
- 2020-6-26: v1.00 - First release
plf:: library and this page Copyright (c) 2020, Matthew Bentley