PLF C++ Library - plf::queue

Intro
Implementation
License
Download
Function Descriptions and Syntax
Benchmarks
Frequently-asked Questions
Version History
Contact

Intro

plf::queue is a container that gives the programmer the functionality of a FIFO (first-in, first-out) queue. It is a more efficient drop-in replacement for std::queue, and faster than all std:: containers in the context of a queue. Like plf::colony it never invalidates references/pointers to non-popped elements, making it at least potentially useful for programming with containers of inter-related data structures. It supports C++11 move and emplace semantics and operations.

In real-world scenario benchmarking it is on average:

20% faster for char
10% faster for int
15% faster for double
40% faster for small structs
65% faster for large structs

* Averaged across total numbers of stored elements ranging between 10 and 1000000, with the number of samples = 126 and the number of elements increasing by 10% per sample. The test in question is a pump test, where elements are pushed and popped consecutively with the overall number of elements fluctuating over time. Benchmarked on a 3rd gen i5, GCC 9.2, x64. Queue priority is set to plf::speed.

It obtains this performance difference by having a more limited/efficient structure compared to deque (the default underlying container of std::queue) implementations, and by using an adaptive capacity for memory blocks, which no deques to date do. This decreases the number of allocations/deallocations necessary, the number of block traversals when pushing/popping, and increases cache locality by enabling larger blocks. In addition, unlike most deque implementations, the block capacities are based on element sizeof(), not a fixed byte size. For most deque implementations, this means that larger element types effectively turn the deque into a linked list. Currently, libstdc++ uses fixed 512-byte block sizes, libc++ uses 4096 block sizes, and MSVC, perplexingly, 16-byte block sizes. For all three, if the element type is larger than the block size, the block size becomes the same as the element type, except for clang where the block size becomes 16 * element size.

In addition there are a few block optimisations implemented which only make sense in the context of a queue, which are detailed in the below.

Implementation

plf::queue uses the same chained-group memory allocation pattern as plf::colony and plf::stack ie. it is made up of a linked chain of 'groups', structs containing a memory block and metadata. New block capacities are based on the current number of elements in the container, not the capacity of previous blocks. This makes more sense in the context of a queue, because the first block in the queue may have a low number of elements left. In effect this makes the block capacities adaptive to the current number of stored elements.

If the plf::queue template parameter 'priority' is set to plf::memory, plf::queue aims for block capacities which are roughly == 1/4 of size(). This results in less unused memory during use as elements are pushed and popped. If set to plf::speed (default), it aims for block capacities == size(). This results in more cache locality during use. In addition, when priority is plf::memory, the default minimum block capacity is lower.

When creating a new block, if size() (or in the case of priority == plf::memory, 1/4 size()) is less than 2x the back block's capacity and greater than 0.5x the back block's capacity, the next block allocated will be the same capacity as the back block, rather than equal to the number of elements in the container. The latter feeds into the following optimisation: In the case where only two memory blocks are present, if the first and second memory block capacities are equal, and the first block is emptied via pop(), instead of being deallocated, the first block will be retained and will become the 'next' memory block in the sequence after the second block. This helps prevent unnecessary allocations/deallocations when a queue is receiving an equal number of push's and pop's. In terms of a consecutive sequence equal numbers of of pushes and pops, this means there will be no new block allocations, only recycling of front blocks.

So for example, a plf::queue with an initial single block of 8 elements, which is full, will create another block of 8 element capacity for the next push(), effectively doubling the potential size(). But another plf::queue with 3 blocks, with capacities == 8, 8, and 16 respectively, where the back block is full but the first block has had all but one element popped, making the size() == 25, the next block created by push() will be 16 elements (if Priority == plf::speed) - not a block of 32 elements.

The metadata within groups - 'elements' (pointer to the memory block), previous_group (pointer or NULL), next_group (pointer or NULL), and 'end' (pointer to last element in memory block) - aid respectively in the identification of when when a pop reaches the end of the memory block, iterating backwards to the previous block (upon an element construction exception), iterating forward to the next block, and identifying when a push reaches the capacity of a given memory block.

When capacity of the current memory block is reached, a new group is created and linked to via the next_group pointer. Reserve() can be called to create larger memory blocks ahead of time, and any unused memory blocks can be released via the trim() command. Lastly, minimum and maximum memory block sizes can be specified by the user. By default both the maximum and minimum capacities are algorithmically-determined based on sizeof(element_type), sizeof(queue) + sizeof(group metadata). The means a larger type will have a smaller default minimum size. In terms of the minimum default size, this avoids a scenario where plf::queue internal metadata data takes up more space initially than the stored type's memory blocks.

License

plf::queue is under a permissive zlib license. This means: Software is used on 'as-is' basis. It can be modified and used in commercial software. Authors are not liable for any damages arising from its use. The distribution of a modified version of the software is subject to the following restrictions:

The authorship of the original software must not be misrepresented,
Altered source versions must not be misrepresented as being the original software, and
The license notice must not be removed from source distributions.

Download

Download here or view the repository.
plf::queue is a simple .h header-only library, to be used with an #include command.

If you are interested in benchmarking you can also download the plf benchmark suite.

Function Descriptions and syntax

For the most part the syntax and semantics of plf::queue functions are the same as std::queue. However there are a few key differences, such as the ability to specify min/max block capacities in the constructor. Formal description is as follows:

template <class T, class Allocator = std::allocator <T> >
class queue

T - the element type. In general T must meet the requirements of Erasable, CopyAssignable and CopyConstructible.
However, if emplace is utilized to insert elements into the queue, and no functions which involve copying or moving are utilized, T is only required to meet the requirements of Erasable.
If move-insert is utilized instead of emplace, T must also meet the requirements of MoveConstructible .
Priority - an enum whose value is either plf::memory or plf::speed. plf::speed is the default.
Allocator - an allocator that is used to acquire memory to store the elements. The type must meet the requirements of Allocator. The behavior is undefined if Allocator::value_type is not the same as T.

Basic example of usage

#include <iostream>
#include "plf_queue.h"

int main(int argc, char **argv)
{
  plf::queue<int> i_queue;

  // Insert 100 ints:
  for (int i = 0; i != 100; ++i)
  {
    i_queue.push(i);
  }

  // Total the ints:
  int total = 0;

  while (!i_queue.empty())
  {
    total += i_queue.front();
    i_queue.pop();
  }

  std::cout << "Total: " << total << std::endl;
  std::cin.get();
  return 0;
}

Member types

Member type	Definition
`value_type`	`T`
`allocator_type`	`Allocator`
`size_type`	`Allocator::size_type (pre-c++11) std::allocator_traits<Allocator>::size_type (post-c++11)`
`reference`	`Allocator::reference (pre-c++11) value_type & (post-c++11)`
`const_reference`	`Allocator::const_reference (pre-c++11) const value_type & (post-c++11)`
`pointer`	`Allocator::pointer (pre-c++11) value_type & (post-c++11)`
`const_pointer`	`Allocator::const_pointer (pre-c++11) std::allocator_traits<Allocator>::const_pointer (post-c++11)`

Constructors

default	`explicit queue(const allocator_type &alloc = allocator_type()) explicit queue(const size_type min_group_size, const size_type max_group_size = std::numeric_limits<size_type>::max() / 2, const allocator_type &alloc = allocator_type())`
copy	`queue(const queue &source)`
move	`queue(const queue &&source) noexcept (C++11 and upwards)` Source queue has same status as an empty queue post-move.

Constructor usage examples

queue<T> a_queue
Default constructor - default minimum group size is 8, default maximum group size is std::numeric_limits<size_type>::max() / 2. The reason for this maximum limitation is that each new group doubles the number of existing elements in the queue. And since the maximum number of elements that can be possibly held in a queue is std::numeric_limits<size_type>::max(), the maximum group size must be half that.
Example: plf::queue<int> int_queue;
queue<T, custom_allocator<T> > a_queue
Using a custom allocator.
Example: plf::queue<int, tbb::allocator<int> > int_queue;
Example2:
// Using an instance of an allocator as well as it's type tbb::allocator<int> alloc_instance; plf::queue<int, tbb::allocator<int> > int_queue(alloc_instance);
queue<T> a_queue(const size_type min_group_size)
This sets the minimum group size - for example, to 50 (unlike a vector, these 50 elements are not constructed at this point). This may yield a minor performance advantage if you know roughly how many elements are going to be stored in your queue in advance. Minimum group size may not be lower than 3.
Example: plf::queue<int> int_queue(62);
queue<T> a_queue(const size_type min_group_size, const size_type max_group_size)
This sets the minimum group size as above, as well as the maximum group size - this can be useful if you want to limit memory usage. Minimum group size may not be lower than 3. Maximum group size may not be larger than std::numeric_limits<size_type>::max() / 2.
Example: plf::queue<int> int_queue(64, 512);
queue<T> a_queue(const queue &source)
Copies all contents from source. New minimum group size is the total size of source, unless source's minimum group size is larger than it's total size, in which case the new minimum group size is the source's minimum group size. If source's total size is larger than it's maximum group size, the new minimum group size is the same as the source's maximum group size.
Example: plf::queue<int> int_queue_2(int_queue_1);
queue<T> a_queue(queue && source) C++11 and upwards
Moves all contents from source, does not alter any group sizes. Source is now empty and can be used or destructed safely.
Example: plf::queue<int> int_queue_2(std::move(int_queue_1));

Member functions

void push(const the_type &element)
Push an element to the back of the queue.
Example: object_queue.push(object1);
void push(the_type &&element) C++11 and upwards
Move an element to the back of the queue.
Example: object_queue.push(std::move(object1));
void emplace(Arguments ...parameters) C++11 and upwards
Constructs an element directly on the queue using the the constructor arguments specified.
Example: object_queue.emplace(10, 200, "a");
void pop()
Pop the front element off the queue. In debug mode will test for queue emptiness prior to operation via an assert. Calling pop() on an empty queue in non-debug mode results in undefined behaviour.
Example: object_queue.pop();
the_type & front()
Return a reference to the first element on the queue. In debug mode will test for queue emptiness prior to operation via an assert. Calling front() on an empty queue in non-debug mode results in undefined behaviour.
Example: object2 = object_queue.front();
the_type & back()
Return a reference to the last element on the queue. In debug mode will test for queue emptiness prior to operation via an assert. Calling last() on an empty queue in non-debug mode results in undefined behaviour.
Example: object2 = object_queue.last();
size_type size()
Return the current number of elements stored on the queue.
Example: unsigned int j = static_cast<unsigned int>(object_queue.size());
size_type max_size()
Returns the maximum number of elements that the allocator can store in the container. This is an approximation as it does attempt to measure the memory overhead of the container's internal memory structures. It is not possible to measure the latter because a copy operation may change the number of groups utilized for the same amount of elements, if the maximum or minimum group sizes are different in the source container.
Example: std::cout << i_queue.max_size() << std::endl;
size_type capacity()
Returns total number of elements currently able to be stored in container. Ignores the fact that if elements have been popped off the front of the queue, that element memory space will not be able to be used.
Example: std::cout << i_queue.capacity() << std::endl;
void reserve(size_type reserve_amount)
Preallocates memory space sufficient to store the number of elements indicated by reserve_amount. Will not change minimum and maximum group sizes. Will be rounded up if reserve_amount is lower than the minimum group size.
Example: i_queue.reserve(15);
void shrink_to_fit()
Reduces container capacity to the amount necessary to store all currently stored elements. If the total number of elements is larger than the maximum group size, the resultant capacity will be equal to ((total_elements / max_group_size) + 1) * max_group_size. Invalidates all pointers, iterators and references to elements within the container.
Example: i_queue.shrink_to_fit();
void trim()
Removes any trailing memory blocks within the container substructure, which may be present due to reserve or due to pop() re-using a memory block. Unlike a regular shrink_to_fit, does not shrink capacity to the size needed to contain the currently-stored elements, simply frees up unused blocks. Runtime cost is significantly lower than shrink_to_fit, but may not free as much memory.
Example: i_queue.trim();
bool empty()
Returns a boolean indicating whether the queue is currently empty of elements.
Example: if (object_queue.empty()) return;
void clear()
Empties the queue and removes all elements and groups.
Example: object_queue.clear();
void reshape(const size_type min_group_size, const size_type max_group_size)
Changes the minimum and maximum internal group sizes, in terms of number of elements stored per group. If the queue is not empty and either min_group_size is larger than the smallest group in the queue, or max_group_size is smaller than the largest group in the queue, the queue will be internally copy-constructed into a new queue which uses the new group sizes, invalidating all pointers/iterators/references.
Example: object_queue.reshape(1000, 10000);
void swap(queue &source)
Swaps the queue's contents with that of source.
Example: object_queue.swap(other_queue);
friend void swap(queue &A, queue &B)
External friend function, swaps the queue A's contents with that of queue B (assumes both queues have same element type).
Example: swap(object_queue, other_queue);
queue & operator = (const queue &source)
Copy the elements from source queue to this queue, clearing this queue of existing elements first.
Example: object_queue = object_queue2;
queue & operator = (const queue &&source) C++11 and upwards
Move the elements from source queue to this queue, clearing this queue of existing elements first. Source queue becomes empty and safely be reused or destructed.
Example: object_queue = std::move(object_queue2);
bool operator == (const queue &source)
Compare contents of another queue to this queue. Returns a boolean to indicate whether they are equal.
Example: if (object_queue == object_queue2) return;
bool operator != (const queue &source)
Compare contents of another queue to this queue. Returns a boolean to indicate whether they are not equal.
Example: if (object_queue != object_queue2) return;
allocator_type get_allocator()
Returns a copy of the allocator used by this queue

Benchmarks

Benchmark results for queue under GCC 9.2 x64 on an Intel 3rd gen i5 are here.

Version history

2024-04-10: v2.0.4 - plf::make_move_iterator changed to be static. plf::equal_to uses references now and is entirely noexcept.
2024-03-24: v2.0.3 - Minor optimisations and corrections. queue == and != operators changed to be friend functions. Version numbering change format to be compatible with vcpkg.
2023-11-25: v2.02 - Swapped some manual swapping with std::swap where appropriate.
2023-10-06: v2.01 - Correction to min/max block support.
2023-08-10: v2.00 - Added iterators, for debugging purposes.
2023-07-05: v1.25 - Bypass buggy MSVC /W4 warning, comment corrections.
2023-06-23: v1.24 - Improvement for support on older compilers.
2023-02-03: v1.23 - More accurate compiler/library detection for C++20 support, C++20 support disabled for apple clang as it is not as up to date as non-apple clang.
2023-01-11: v1.22 - Support for disabling exceptions under GCC, Clang and MSVC. Will (a) now compile under gcc/clang when -fno-exceptions is specified and (b) will terminate in cases where exceptions would normally be thrown.
2022-10-02: v1.21 - Correction to move constructor. Correction for older compilers which partially support C++11 like MSVC2010. Removed plf::rand dependency from test suite.
2022-09-03: v1.20 - Correction for use with allocators which supply non-trivial pointers.
2022-08-27: v1.19 - Correction/reduction to compiler detection routines. Support for stateful allocators. Constructor corrections. Faulty <=> operator removed.
2021-12-11: v1.18 - Added spaceship operator for three-way comparison of queues.
2021-09-18: v1.17 - Correction to compiler feature detections.
2021-06-23: v1.16 - Priority default changed to plf::speed, as I believe most people using plf::queue are going to be more interested in that than memory usage.
2021-06-13: v1.15 - Further changes to make queue more memory-efficient. The older model is still available when template parameter Priority is set to plf::speed. The newer model, which is the default of Priority == plf::memory, reduces performance but decreases memory usage by up to 30%, depending on number of elements and size of the element type. Benchmarks are updated to show the results of both models.
2021-06-06: v1.10 - Re-work of block expansion and recycling algorithms based on real-world benchmarking. Default maximum block capacity changed to be lower amount based on sizeof(element_type), due to increased performance and improved memory usage.
2021-05-26: v1.02 - Correction to compiler detection routines for clang/gcc running in MSVC environment.
2021-05-20: v1.01 - Correction to compiler detection routines for clang - only libc++ 13 and above support cpp20 fully.
2021-05-17: v1.00 - First release.