Context: The STL deque is implemented in libc++ using a two-level array, the map
is a dynamic array that contains pointers to fixed-sized arrays called blocks
(4096 bytes in libc++, 512 bytes in libstdc++. For proof, see Appendix A). There is only one map
, which is unbounded in size, and an unbounded number of blocks
, which are fixed in size. For a detailed explanation of how the STL deque push_front
function is implemented in libc++ and libstdc++, see Appendix D and Appendix E respectively.
Here is an accurate visualization (for proof, see Appendix B) of how the map
looks after each new block
(the blocks
are named in order of allocation, so b01 was the first block
that was allocated, b02 is the second block
that was allocated, and so on) is added to the map
by calls to push_front
. Linear time operations such as copying or shifting the entire map
array via memmove
are labeled:
[]
[b01]
[b02, b01] copied
[b03, b02, b01, ___] copied
[b04, b03, b02, b01] shifted by 1
[b05, b04, b03, b02, b01, ___, ___, ___] copied
[___, b06, b05, b04, b03, b02, b01, ___] shifted by 2
[b07, b06, b05, b04, b03, b02, b01, ___]
[b08, b07, b06, b05, b04, b03, b02, b01] shifted by 1
[b09, b08, b07, b06, b05, b04, b03, b02, b01, ___, ___, ___, ___, ___, ___, ___] copied
[___, ___, ___, b10, b09, b08, b07, b06, b05, b04, b03, b02, b01, ___, ___, ___] shifted by 4
[___, ___, b11, b10, b09, b08, b07, b06, b05, b04, b03, b02, b01, ___, ___, ___]
[___, b12, b11, b10, b09, b08, b07, b06, b05, b04, b03, b02, b01, ___, ___, ___]
[b13, b12, b11, b10, b09, b08, b07, b06, b05, b04, b03, b02, b01, ___, ___, ___]
[___, b14, b13, b12, b11, b10, b09, b08, b07, b06, b05, b04, b03, b02, b01, ___] shifted by 2
[b15, b14, b13, b12, b11, b10, b09, b08, b07, b06, b05, b04, b03, b02, b01, ___]
[b16, b15, b14, b13, b12, b11, b10, b09, b08, b07, b06, b05, b04, b03, b02, b01] shifted by 1
For comparison, here is the same visualization (for proof, see Appendix C) for the libstdc++ deque implementation. Linear time operations such as copying or shifting the entire map
array are labeled:
[___, ___, ___, b01, ___, ___, ___, ___]
[___, ___, b02, b01, ___, ___, ___, ___]
[___, b03, b02, b01, ___, ___, ___, ___]
[b04, b03, b02, b01, ___, ___, ___, ___]
[___, ___, ___, ___, ___, ___, b05, b04, b03, b02, b01, ___, ___, ___, ___, ___, ___, ___] copied
[___, ___, ___, ___, ___, b06, b05, b04, b03, b02, b01, ___, ___, ___, ___, ___, ___, ___]
[___, ___, ___, ___, b07, b06, b05, b04, b03, b02, b01, ___, ___, ___, ___, ___, ___, ___]
[___, ___, ___, b08, b07, b06, b05, b04, b03, b02, b01, ___, ___, ___, ___, ___, ___, ___]
[___, ___, b09, b08, b07, b06, b05, b04, b03, b02, b01, ___, ___, ___, ___, ___, ___, ___]
[___, b10, b09, b08, b07, b06, b05, b04, b03, b02, b01, ___, ___, ___, ___, ___, ___, ___]
[b11, b10, b09, b08, b07, b06, b05, b04, b03, b02, b01, ___, ___, ___, ___, ___, ___, ___]
[___, ___, ___, ___, ___, ___, ___, ___, ___, ___, ___, ___, ___, b12, b11, ..., b01, ___x13] copied
[___, ___, ___, ___, ___, ___, ___, ___, ___, ___, ___, ___, b13, b12, b11, ..., b01, ___x13]
[___, ___, ___, ___, ___, ___, ___, ___, ___, ___, ___, b14, b13, b12, b11, ..., b01, ___x13]
[___, ___, ___, ___, ___, ___, ___, ___, ___, ___, b15, b14, b13, b12, b11, ..., b01, ___x13]
[___, ___, ___, ___, ___, ___, ___, ___, ___, b16, b15, b14, b13, b12, b11, ..., b01, ___x13]
[___, ___, ___, ___, ___, ___, ___, ___, b17, b16, b15, b14, b13, b12, b11, ..., b01, ___x13]
[___, ___, ___, ___, ___, ___, ___, b18, b17, b16, b15, b14, b13, b12, b11, ..., b01, ___x13]
[___, ___, ___, ___, ___, ___, b19, b18, b17, b16, b15, b14, b13, b12, b11, ..., b01, ___x13]
[___, ___, ___, ___, ___, b20, b19, b18, b17, b16, b15, b14, b13, b12, b11, ..., b01, ___x13]
[___, ___, ___, ___, b21, b20, b19, b18, b17, b16, b15, b14, b13, b12, b11, ..., b01, ___x13]
[___, ___, ___, b22, b21, b20, b19, b18, b17, b16, b15, b14, b13, b12, b11, ..., b01, ___x13]
[___, ___, b23, b22, b21, b20, b19, b18, b17, b16, b15, b14, b13, b12, b11, ..., b01, ___x13]
[___, b24, b23, b22, b21, b20, b19, b18, b17, b16, b15, b14, b13, b12, b11, ..., b01, ___x13]
[b25, b24, b23, b22, b21, b20, b19, b18, b17, b16, b15, b14, b13, b12, b11, ..., b01, ___x13]
Time complexity analysis
For the libc++ deque:
Each copy and each shift is a linear time operation, and since the number of shifts clearly dominates the number of copies, we will only look at the cost of the shifts.
To derive a lower bound we will only look at the shifts that happen after the last doubling of the map
array.
Shifts are done using the memmove
instruction which is a linear time operation. The cost of a shift (asymptotically) is directly proportional to the number of elements that are shifted.
Just after the map
is resized, its capacity is n and the number of elements inside it is n/2 + 1
. When another block
is added to it, the elements inside it are shifted by half of the remaining distance from the last element to the end of the map
array, that is to say, it is shifted by half of n/2 - 1
, rounded up. This leaves exactly n/4 - 1
free slots at both ends of the map
. When the free space at the front is filled up, the elements are shifted again, by the same formula as before.
That is to say, each time the shift happens, the amount of free space between the last element and the end of the array is reduced by half (rounded up). The total number of shifts, therefore, is on the order of log2(n/2)
which is equal to log2(n) - 1
.
So now that we have the number of shifts, we just need to calculate the cost of each shift.
The first shift moves n/2 + 1
elements, the second shift moves n/2 + n/4 + 1
, the third shift moves n/2 + n/4 + n/8 + 1
elements and so on. To calculate a lower bound, we’ll simply assume that each shift moves exactly n/2
elements.
So we have: the number of shifts is log2(n) - 1
and each shift has a cost of n/2
.
The total cost is:
(log2(n) - 1) * n/2
= 0.5 * n * (log2(n) - 1)
= 0.5 * n * log2(n) - 0.5 * n
To compute the amortized time complexity of the push_front
operation, we divide the total cost by the number of times we called push_front
, which is n * block_size
times (we called it n * block_size
times to insert n
blocks). Thus we have that the amortized cost is:
0.5 * n * (log2(n) - 1) / (n * block_size)
= 0.5 * (log2(n) - 1) / block_size
= (0.5 * log2(n) - 0.5) / block_size
Now, to get the Big Omicron, we get rid of the constants and the constant factors, and we replace log2 with log (since log base doesn’t matter in Big O analysis), in order to arrive at the final result:
O(log n)
And that is the amortized time complexity of the push_front
operation in the libc++ deque implementation. Do note that this is a lower bound. I do not think that the upper bound would be different, so I leave its calculation as an exercise for the reader.
Now we do the time complexity analysis for the libstdc++ deque implementation:
Here there are only copies and no shifts. The algorithm is similar to how a dynamic array typically resizes itself: once it reaches the end, it then allocates a new block of memory that is twice the size of its current capacity. The difference here is that the elements are copied not to the start of the new block of memory, but they are copied such that the number of empty slots in the front and back of the array are roughly equal.
Now, the capacity of the map roughly doubles (to be precise, the new capacity is double of the previous capacity plus two) every time it is reallocated. Thus, it begins at 8, then grows to 18, then to 38 and 78 so on. However, the elements are copied such that the number of free slots at the front of the map is roughly equal to the number of free slots at the back of the map. So, in order to determine how many reallocations there will be, we need to figure out how many free slots there are in front of the map every time the map is reallocated and the elements are copied over.
Now, when the map is first created, there are 4 free slots at the front side of the map, though one is taken up immediately and is not available for use by push_front
. In any case, once these 4 slots are filled up, the map is reallocated to size 18. The location where the 4 elements are copied to is calculated thusly:
offset = (this->_M_impl._M_map_size - __new_num_nodes) / 2 + (__add_at_front ? __nodes_to_add : 0);
= (new_map_capacity - (current_num_elements + 1)) / 2 + 1;
So, in this case, it is (18 - 5) / 2 + 1 = 13 / 2 + 1 = 6 + 1 = 7. Thus there are 7 free slots at the front. That’s why it then takes another 7 insertions before the front of the map is filled.
In the next case, the new map size is 38 and the number of elements is 11, so the calculation goes:
free slots at front = (38 - 12) / 2 + 1 = 26 / 2 + 1 = 13 + 1 = 14
So it takes another 14 insertions for the front of the map to be filled.
In the next case, the new map size is 78 and the number of elements is 11 + 14 = 25, so the calculation goes:
free slots at front = (78 - 26) / 2 + 1 = 52 / 2 + 1 = 26 + 1 = 27
So there are now 27 free slots at the front, followed by 25 elements, followed by 26 free slots at the back. Once the 26th element is inserted, it will be perfectly balanced.
With 26 free slots at the front, it would take 26 insertions to fill up the front. So the next time the map is filled at the front would be when it has 26 + 26 = 52 elements. Let’s calculate it again:
free slots at front = (158 - 53) / 2 + 1 = 105 / 2 + 1 = 52 + 1 = 53
With 53 free slots at the front followed by 52 elements, there are 53 elements at the back. Perfectly balanced. The next time the front will be filled will be when there are 53 + 52 = 105 elements in the map.
The reader may want to convince themself that if the map is reallocated when the number of elements in the map is equal to 2/3 of the capacity of the map, then once the map’s capacity is doubled, then the number of elements in the map will be equal to 1/3 of the capacity of the map. Thus, when the elements are copied over such that the amount of free space in the front of the map is equal to the amount of space at the back of the map, we have that the first 1/3 of the map will be empty, the middle 1/3 of the map will be filled with the elements, and the last 1/3 of the map will also be empty. As the elements in the map will start growing from the middle 1/3 to the front, once the front of the map is filled, then once again we have reached the point where the number of elements in the map is equal to 2/3 of the map’s capacity. Thus this cycle is stable and will go on forever.
This also means that a reallocation happens roughly when the number of elements in the map doubles (from 1/3 to 2/3). Thus, the number of elements that are copied over doubles with every reallocation. And so the total cost of all the copies is roughly this:
n + n/2 + n/4 + ... + 4 ~= 2n
Which is O(n). So, after inserting n blocks, the total cost of all of the copies is O(n), therefore the amortized time complexity of a push_front
in the libstdc++ deque implementation is just O(1). This is in contrast to the amortized time complexity of the same operation in the libc++ implementation, which is O(log n).
No benchmarks are attached because I can’t figure out how to write the code in such a way that the contents in memory are modified in exactly the way that I expect (in the optimized version I can’t figure out where the contents of the map are stored. Printing out the contents of the map did not give the results that I expected). The tests that I did run, initially the libstdc++ and the libc++ versions have roughly the same performance, but the performance diverges after a number of insertions somewhere between 6,400 and 12,800. After that the runtime of the libc++ version is roughly 2-3 times that of the libstdc++ version. I don’t think my benchmarking code is correct though, so I’m not going to attach it here. Anyways, the point of this post is to discuss the high level algorithms, not to do micro-benchmarks (the outcomes of which depend on the compiler version anyway).
Appendix A
Here is the source code that we will be using to find out the block size used by the libc++ and libstdc++ implementations of the STL deque:
#include <deque>
#include <stdint.h>
int main()
{
std::deque<int8_t> d8 = {};
std::deque<int16_t> d16 = {};
std::deque<int32_t> d32 = {};
std::deque<int64_t> d64 = {};
int8_t i8 = 5;
int16_t i16 = 5;
int32_t i32 = 5;
int64_t i64 = 5;
d8.push_front(i8);
d8.push_front(i8);
d16.push_front(i16);
d16.push_front(i16);
d32.push_front(i32);
d32.push_front(i32);
d64.push_front(i64);
d64.push_front(i64);
}
First we will look at the block size in the libc++ implementation of the STL deque. So compile it with libc++ and run it under gdb:
clang++ -stdlib=libc++ a.cpp -o a.out -g
gdb x.out
GNU gdb (Debian 10.1-1.7) 10.1.90.20210103-git
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from x.out...
(gdb) break main
Breakpoint 1 at 0x4011ef: file sizetest.cpp, line 6.
(gdb) r
Starting program: /home/x/src/x.out
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Breakpoint 1, main () at sizetest.cpp:6
6 std::deque<int8_t> d8 = {};
(gdb) layout src
(gdb) b 15
Breakpoint 2 at 0x401252: file sizetest.cpp, line 15.
(gdb) c
Continuing.
Breakpoint 2, main () at sizetest.cpp:15
┌─sizetest.cpp───────────────────────────────────────────────────────────────────────────────────────┐
│ 5 { │
│B+ 6 std::deque<int8_t> d8 = {}; │
│ 7 std::deque<int16_t> d16 = {}; │
│ 8 std::deque<int32_t> d32 = {}; │
│ 9 std::deque<int64_t> d64 = {}; │
│ 10 int8_t i8 = 5; │
│ 11 int16_t i16 = 5; │
│ 12 int32_t i32 = 5; │
│ 13 int64_t i64 = 5; │
│ 14 │
│B+>15 d8.push_front(i8); │
│ 16 d8.push_front(i8); │
│ 17 d16.push_front(i16); │
│ 18 d16.push_front(i16); │
│ 19 d32.push_front(i32); │
│ 20 d32.push_front(i32); │
│ 21 d64.push_front(i64); │
│ 22 d64.push_front(i64); │
│ 23 } │
│ │
└────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: main L15 PC: 0x401252
(gdb) s
std::__1::deque<signed char, std::__1::allocator<signed char> >::push_front (this=0x7fffffffe110,
__v=@0x7fffffffe07f: 5 '\005') at /usr/lib/llvm-11/bin/../include/c++/v1/deque:1940
(gdb) n
(gdb) n
┌─/usr/lib/llvm-11/bin/../include/c++/v1/deque───────────────────────────────────────────────────────┐
│ 1931 // __back_spare() >= 1 │
│ 1932 __alloc_traits::construct(__a, _VSTD::addressof(*__base::end()), __v); │
│ 1933 ++__base::size(); │
│ 1934 } │
│ 1935 │
│ 1936 template <class _Tp, class _Allocator> │
│ 1937 void │
│ 1938 deque<_Tp, _Allocator>::push_front(const value_type& __v) │
│ 1939 { │
│ 1940 allocator_type& __a = __base::__alloc(); │
│ 1941 if (__front_spare() == 0) │
│ >1942 __add_front_capacity(); │
│ 1943 // __front_spare() >= 1 │
│ 1944 __alloc_traits::construct(__a, _VSTD::addressof(*--__base::begin()), __v); │
│ 1945 --__base::__start_; │
│ 1946 ++__base::size(); │
│ 1947 } │
│ 1948 │
│ 1949 #ifndef _LIBCPP_CXX03_LANG │
│ 1950 template <class _Tp, class _Allocator> │
└────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: std::__1::deque<signed char, std::__1::allocat* L1942 PC: 0x40143e
(gdb) s
std::__1::deque<signed char, std::__1::allocator<signed char> >::__add_front_capacity (
this=0x7fffffffe110) at /usr/lib/llvm-11/bin/../include/c++/v1/deque:2425
(gdb) n
┌─/usr/lib/llvm-11/bin/../include/c++/v1/deque───────────────────────────────────────────────────────┐
│ 2416 │
│ 2417 } │
│ 2418 │
│ 2419 // Create front capacity for one block of elements. │
│ 2420 // Strong guarantee. Either do it or don't touch anything. │
│ 2421 template <class _Tp, class _Allocator> │
│ 2422 void │
│ 2423 deque<_Tp, _Allocator>::__add_front_capacity() │
│ 2424 { │
│ 2425 allocator_type& __a = __base::__alloc(); │
│ >2426 if (__back_spare() >= __base::__block_size) │
│ 2427 { │
│ 2428 __base::__start_ += __base::__block_size; │
│ 2429 pointer __pt = __base::__map_.back(); │
│ 2430 __base::__map_.pop_back(); │
│ 2431 __base::__map_.push_front(__pt); │
│ 2432 } │
│ 2433 // Else if __base::__map_.size() < __base::__map_.capacity() then we need to allo│
│ 2434 else if (__base::__map_.size() < __base::__map_.capacity()) │
│ 2435 { // we can put the new buffer into the map, but don't shift things around │
└────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: std::__1::deque<signed char, std::__1::allocat* L2426 PC: 0x404ed0
Breakpoint 2, main () at sizetest.cpp:15
(gdb) print __block_size
$1 = <optimized out>
Observe that if we try to print out the __block_size
gdb says “optimized out”. This is because libc++ was compiled with optimization enabled. It’s irrelevant whether our code was compiled with optimization enabled, so disabling optimization for our own code isn’t going to do anything.
What we can do is to use layout asm
and the ni
(next assembly instruction) command to step through the assembly code line by line. So let’s look at the assembly:
(gdb) layout asm
(gdb) ni
┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ 0x404ea0 <_ZNSt3__15dequeIaNS_9allocatorIaEEE20__add_front_capacityEv> push %rbp │
│ 0x404ea1 <_ZNSt3__15dequeIaNS_9allocatorIaEEE20__add_front_capacityEv+1> mov %rsp,%rbp │
│ 0x404ea4 <_ZNSt3__15dequeIaNS_9allocatorIaEEE20__add_front_capacityEv+4> sub $0x130,%rsp │
│ 0x404eab <_ZNSt3__15dequeIaNS_9allocatorIaEEE20__add_front_capacityEv+11> mov %rdi,-0x8(%rbp) │
│ 0x404eaf <_ZNSt3__15dequeIaNS_9allocatorIaEEE20__add_front_capacityEv+15> mov -0x8(%rbp),%rax │
│ 0x404eb3 <_ZNSt3__15dequeIaNS_9allocatorIaEEE20__add_front_capacityEv+19> mov %rax,%rcx │
│ 0x404eb6 <_ZNSt3__15dequeIaNS_9allocatorIaEEE20__add_front_capacityEv+22> mov %rcx,%rdi │
│ 0x404eb9 <_ZNSt3__15dequeIaNS_9allocatorIaEEE20__add_front_capacityEv+25> mov %rax,-0xb8(%rbp) │
│ 0x404ec0 <_ZNSt3__15dequeIaNS_9allocatorIaEEE20__add_front_capacityEv+32> call 0x403d10 <_ZNSt3__112__de │
│ 0x404ec5 <_ZNSt3__15dequeIaNS_9allocatorIaEEE20__add_front_capacityEv+37> mov %rax,-0x10(%rbp) │
│ 0x404ec9 <_ZNSt3__15dequeIaNS_9allocatorIaEEE20__add_front_capacityEv+41> mov -0xb8(%rbp),%rdi │
│ 0x404ed0 <_ZNSt3__15dequeIaNS_9allocatorIaEEE20__add_front_capacityEv+48> call 0x4053d0 <_ZNKSt3__15dequ │
│ >0x404ed5 <_ZNSt3__15dequeIaNS_9allocatorIaEEE20__add_front_capacityEv+53> cmp $0x1000,%rax │
│ 0x404edb <_ZNSt3__15dequeIaNS_9allocatorIaEEE20__add_front_capacityEv+59> jb 0x404f2d <_ZNSt3__15deque │
│ 0x404ee1 <_ZNSt3__15dequeIaNS_9allocatorIaEEE20__add_front_capacityEv+65> mov -0xb8(%rbp),%rax │
│ 0x404ee8 <_ZNSt3__15dequeIaNS_9allocatorIaEEE20__add_front_capacityEv+72> mov 0x20(%rax),%rcx │
│ 0x404eec <_ZNSt3__15dequeIaNS_9allocatorIaEEE20__add_front_capacityEv+76> add $0x1000,%rcx │
│ 0x404ef3 <_ZNSt3__15dequeIaNS_9allocatorIaEEE20__add_front_capacityEv+83> mov %rcx,0x20(%rax) │
│ 0x404ef7 <_ZNSt3__15dequeIaNS_9allocatorIaEEE20__add_front_capacityEv+87> mov %rax,%rdi │
│ 0x404efa <_ZNSt3__15dequeIaNS_9allocatorIaEEE20__add_front_capacityEv+90> call 0x405420 <_ZNSt3__114__sp │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
Unfortunately the name of the called function has been cut off due to the width of the screen. From now on I will be showing the assembly listing scrolled to the right so that the called function name is visible:
┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ nt_capacityEv> push %rbp │
│ nt_capacityEv+1> mov %rsp,%rbp │
│ nt_capacityEv+4> sub $0x130,%rsp │
│ nt_capacityEv+11> mov %rdi,-0x8(%rbp) │
│ nt_capacityEv+15> mov -0x8(%rbp),%rax │
│ nt_capacityEv+19> mov %rax,%rcx │
│ nt_capacityEv+22> mov %rcx,%rdi │
│ nt_capacityEv+25> mov %rax,-0xb8(%rbp) │
│ nt_capacityEv+32> call 0x403d10 <_ZNSt3__112__deque_baseIaNS_9allocatorIaEEE7__allocEv> │
│ nt_capacityEv+37> mov %rax,-0x10(%rbp) │
│ nt_capacityEv+41> mov -0xb8(%rbp),%rdi │
│ nt_capacityEv+48> call 0x4053d0 <_ZNKSt3__15dequeIaNS_9allocatorIaEEE12__back_spareEv> │
│ >nt_capacityEv+53> cmp $0x1000,%rax │
│ nt_capacityEv+59> jb 0x404f2d <_ZNSt3__15dequeIaNS_9allocatorIaEEE20__add_front_capacityEv+141> │
│ nt_capacityEv+65> mov -0xb8(%rbp),%rax │
│ nt_capacityEv+72> mov 0x20(%rax),%rcx │
│ nt_capacityEv+76> add $0x1000,%rcx │
│ nt_capacityEv+83> mov %rcx,0x20(%rax) │
│ nt_capacityEv+87> mov %rax,%rdi │
│ nt_capacityEv+90> call 0x405420 <_ZNSt3__114__split_bufferIPaNS_9allocatorIS1_EEE4backEv> │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
As you can see immediately after the call instruction call 0x4053d0 <_ZNKSt3__15dequeIaNS_9allocatorIaEEE12__back_spareEv>
there is a cmp instruction with the immediate value 0x1000, which is 4096 in decimal. That means immediately after the call to __back_spare()
the value in the rax register is compared to 4096. By convention, integer return values are put into rax. But here I’m going to prove to you that the function puts the return value into the rax register:
┌─/usr/lib/llvm-11/bin/../include/c++/v1/deque───────────────────────────────────────────────────────────────────┐
│ 1504 _LIBCPP_INLINE_VISIBILITY │
│ 1505 size_type __back_spare() const │
│ 1506 { │
│ >1507 return __capacity() - (__base::__start_ + __base::size()); │
│ 1508 } │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(gdb) disas __back_spare
Dump of assembler code for function _ZNKSt3__15dequeIaNS_9allocatorIaEEE12__back_spareEv:
0x00000000004053d0 <+0>: push %rbp
0x00000000004053d1 <+1>: mov %rsp,%rbp
0x00000000004053d4 <+4>: sub $0x20,%rsp
0x00000000004053d8 <+8>: mov %rdi,-0x8(%rbp)
0x00000000004053dc <+12>: mov -0x8(%rbp),%rax
0x00000000004053e0 <+16>: mov %rax,%rdi
=> 0x00000000004053e3 <+19>: mov %rax,-0x10(%rbp)
0x00000000004053e7 <+23>: call 0x406490 <_ZNKSt3__15dequeIaNS_9allocatorIaEEE10__capacityEv>
0x00000000004053ec <+28>: mov -0x10(%rbp),%rcx
0x00000000004053f0 <+32>: mov 0x20(%rcx),%rdx
0x00000000004053f7 <+39>: mov %rax,-0x18(%rbp)
0x00000000004053fb <+43>: mov %rdx,-0x20(%rbp)
0x00000000004053ff <+47>: call 0x4064f0 <_ZNKSt3__112__deque_baseIaNS_9allocatorIaEEE4sizeEv>
0x0000000000405404 <+52>: mov -0x20(%rbp),%rcx
0x0000000000405408 <+56>: add (%rax),%rcx
0x000000000040540b <+59>: mov -0x18(%rbp),%rax
0x000000000040540f <+63>: sub %rcx,%rax
0x0000000000405412 <+66>: add $0x20,%rsp
0x0000000000405416 <+70>: pop %rbp
0x0000000000405417 <+71>: ret
The first few lines are the prologue of the function (https://en.wikipedia.org/wiki/Function_prologue_and_epilogue). In the disassembly above, the prologue is:
0x00000000004053d0 <+0>: push %rbp
0x00000000004053d1 <+1>: mov %rsp,%rbp
0x00000000004053d4 <+4>: sub $0x20,%rsp
The first instruction pushes the base pointer (stored in the rbp register) onto the stack, so it can be later restored with a pop instruction. We do this because we are about to overwrite the rbp register.
The second instruction copies the value held in the rsp register into the rbp register, overwriting it.
The third instruction subtracts 0x20 from the value held in the rsp register. Conceptually this allocates 0x20 bytes of stack space in order to store the local variables. Subtraction is used because on x86 systems the stack grows down.
The epilogue is:
0x0000000000405412 <+66>: add $0x20,%rsp
0x0000000000405416 <+70>: pop %rbp
0x0000000000405417 <+71>: ret
This basically just undoes everything that the prologue did: it first adds 0x20 to the rsp register, freeing the space that we allocated for the local variables, then it restores the rbp pointer that we pushed onto the stack. Finally, we return to the caller by popping the stored program counter from the stack and jumping to it.
gdb uses AT&T syntax by default (though you can set it to Intel syntax). Here the assembly is in AT&T syntax, so the sub %rcx,%rax
instruction means subtract the value stored in the rcx register from the value stored in the rax register and the store the resulting value in the rax register. The instructions after that are just the epilogue of the function. Hence it can be seen that the results of the computation (the subtraction is the last step - see source code above) are stored in the rax register.
So we have established that the call instruction call 0x4053d0 <_ZNKSt3__15dequeIaNS_9allocatorIaEEE12__back_spareEv>
indeed puts the return value into the rax register. The instruction immediately following that, cmp $0x1000,%rax
, then, compares the return value of the call to __back_spare
with the immediate 0x100. Let’s revisit the source code and place the relevant assembly instructions next to it:
if (__back_spare() >= __base::__block_size)
│ nt_capacityEv+48> call 0x4053d0 <_ZNKSt3__15dequeIaNS_9allocatorIaEEE12__back_spareEv> │
│ >nt_capacityEv+53> cmp $0x1000,%rax │
│ nt_capacityEv+59> jb 0x404f2d <_ZNSt3__15dequeIaNS_9allocatorIaEEE20__add_front_capacityEv+141> │
Taken together, what these lines of assembly do is, first the call instruction puts the result of calling __back_spare()
into the rax register. Then the cmp instruction compares the value in the rax register is compared to 0x1000
. It then sets the ZF and CF flags, which the jb instruction looks at to determine whether or not to jump. It is clear, therefore, that the 0x1000 is in fact the “optimized out” value of the __base::__block_size
variable.
Now that we have established that the call to __back_spare
, cmp, and jb instruction sequence corresponds exactly to the if statement, we shall look at the disassembly of that exact if statement in deques containing differently sized elements.
So far we have looked at the deque containing int8_t values. Now we will move on to the deque containing int16_t values:
┌─sizetest.cpp───────────────────────────────────────────────────────────────────────────────────────────────────┐
│ 6 std::deque<int8_t> d8 = {}; │
│ 7 std::deque<int16_t> d16 = {}; │
│ 8 std::deque<int32_t> d32 = {}; │
│ 9 std::deque<int64_t> d64 = {}; │
│ 10 int8_t i8 = 5; │
│ 11 int16_t i16 = 5; │
│ 12 int32_t i32 = 5; │
│ 13 int64_t i64 = 5; │
│ 14 │
│B+ 15 d8.push_front(i8); │
│ 16 d8.push_front(i8); │
│ >17 d16.push_front(i16); │
│ 18 d16.push_front(i16); │
│ 19 d32.push_front(i32); │
│ 20 d32.push_front(i32); │
│ 21 d64.push_front(i64); │
│ 22 d64.push_front(i64); │
│ 23 } │
│ │
│ │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: main L17 PC: 0x40127c
(gdb) s
std::__1::deque<short, std::__1::allocator<short> >::push_front (this=0x7fffffffe0e0, __v=@0x7fffffffe07c: 5)
at /usr/lib/llvm-11/bin/../include/c++/v1/deque:1940
(gdb) n
(gdb) n
┌─/usr/lib/llvm-11/bin/../include/c++/v1/deque───────────────────────────────────────────────────────────────────┐
│ 1931 // __back_spare() >= 1 │
│ 1932 __alloc_traits::construct(__a, _VSTD::addressof(*__base::end()), __v); │
│ 1933 ++__base::size(); │
│ 1934 } │
│ 1935 │
│ 1936 template <class _Tp, class _Allocator> │
│ 1937 void │
│ 1938 deque<_Tp, _Allocator>::push_front(const value_type& __v) │
│ 1939 { │
│ 1940 allocator_type& __a = __base::__alloc(); │
│ 1941 if (__front_spare() == 0) │
│ >1942 __add_front_capacity(); │
│ 1943 // __front_spare() >= 1 │
│ 1944 __alloc_traits::construct(__a, _VSTD::addressof(*--__base::begin()), __v); │
│ 1945 --__base::__start_; │
│ 1946 ++__base::size(); │
│ 1947 } │
│ 1948 │
│ 1949 #ifndef _LIBCPP_CXX03_LANG │
│ 1950 template <class _Tp, class _Allocator> │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: std::__1::deque<short, std::__1::allocator<short> >::push_* L1942 PC: 0x4014fe
(gdb) s
std::__1::deque<short, std::__1::allocator<short> >::__add_front_capacity (this=0x7fffffffe0e0)
at /usr/lib/llvm-11/bin/../include/c++/v1/deque:2425
(gdb) n
┌─/usr/lib/llvm-11/bin/../include/c++/v1/deque───────────────────────────────────────────────────────────────────┐
│ 2416 │
│ 2417 } │
│ 2418 │
│ 2419 // Create front capacity for one block of elements. │
│ 2420 // Strong guarantee. Either do it or don't touch anything. │
│ 2421 template <class _Tp, class _Allocator> │
│ 2422 void │
│ 2423 deque<_Tp, _Allocator>::__add_front_capacity() │
│ 2424 { │
│ 2425 allocator_type& __a = __base::__alloc(); │
│ >2426 if (__back_spare() >= __base::__block_size) │
│ 2427 { │
│ 2428 __base::__start_ += __base::__block_size; │
│ 2429 pointer __pt = __base::__map_.back(); │
│ 2430 __base::__map_.pop_back(); │
│ 2431 __base::__map_.push_front(__pt); │
│ 2432 } │
│ 2433 // Else if __base::__map_.size() < __base::__map_.capacity() then we need to allocate 1 buffe│
│ 2434 else if (__base::__map_.size() < __base::__map_.capacity()) │
│ 2435 { // we can put the new buffer into the map, but don't shift things around │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: std::__1::deque<short, std::__1::allocator<short> >::__add* L2426 PC: 0x4073f0
(gdb) layout asm
(gdb) ni
┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ apacityEv+4> sub $0x130,%rsp │
│ apacityEv+11> mov %rdi,-0x8(%rbp) │
│ apacityEv+15> mov -0x8(%rbp),%rax │
│ apacityEv+19> mov %rax,%rcx │
│ apacityEv+22> mov %rcx,%rdi │
│ apacityEv+25> mov %rax,-0xb8(%rbp) │
│ apacityEv+32> call 0x4031c0 <_ZNSt3__112__deque_baseIsNS_9allocatorIsEEE7__allocEv> │
│ apacityEv+37> mov %rax,-0x10(%rbp) │
│ apacityEv+41> mov -0xb8(%rbp),%rdi │
│ apacityEv+48> call 0x4078f0 <_ZNKSt3__15dequeIsNS_9allocatorIsEEE12__back_spareEv> │
│ >apacityEv+53> cmp $0x800,%rax │
│ apacityEv+59> jb 0x40744d <_ZNSt3__15dequeIsNS_9allocatorIsEEE20__add_front_capacityEv+141> │
│ apacityEv+65> mov -0xb8(%rbp),%rax │
│ apacityEv+72> mov 0x20(%rax),%rcx │
│ apacityEv+76> add $0x800,%rcx │
│ apacityEv+83> mov %rcx,0x20(%rax) │
│ apacityEv+87> mov %rax,%rdi │
│ apacityEv+90> call 0x407940 <_ZNSt3__114__split_bufferIPsNS_9allocatorIS1_EEE4backEv> │
│ apacityEv+95> mov (%rax),%rax │
│ apacityEv+98> mov %rax,-0x18(%rbp) │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: std::__1::deque<short, std::__1::allocator<short> >::__add* L2426 PC: 0x4073f5
Now, notice the call to __back_spare
, cmp, and jb instruction sequence that we noted down earlier. Notice how this time the cmp instruction has the immediate 0x800 rather than 0x1000. In fact, 0x800 is half of 0x1000: while 0x1000 is equal to 4096 in decimal, 0x800 is equal to 2048 in decimal. What this means is that as we have moved from the deque containing int8_t values to the deque containing int16_t values, the __block_size
has decreased from 4096 to 2048.
I skip the gdb logs here since they are basically the same as what we have already seen and just present to you the assembly listing for the same if statement except for deques containing int32_t values and int64_t values. Here is the assembly listing for the int32_t deque:
(gdb) layout asm
(gdb) ni
┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ ont_capacityEv+4> sub $0x130,%rsp │
│ ont_capacityEv+11> mov %rdi,-0x8(%rbp) │
│ ont_capacityEv+15> mov -0x8(%rbp),%rax │
│ ont_capacityEv+19> mov %rax,%rcx │
│ ont_capacityEv+22> mov %rcx,%rdi │
│ ont_capacityEv+25> mov %rax,-0xb8(%rbp) │
│ ont_capacityEv+32> call 0x402670 <_ZNSt3__112__deque_baseIiNS_9allocatorIiEEE7__allocEv> │
│ ont_capacityEv+37> mov %rax,-0x10(%rbp) │
│ ont_capacityEv+41> mov -0xb8(%rbp),%rdi │
│ ont_capacityEv+48> call 0x409c70 <_ZNKSt3__15dequeIiNS_9allocatorIiEEE12__back_spareEv> │
│ >ont_capacityEv+53> cmp $0x400,%rax │
│ ont_capacityEv+59> jb 0x4097cd <_ZNSt3__15dequeIiNS_9allocatorIiEEE20__add_front_capacityEv+141> │
│ ont_capacityEv+65> mov -0xb8(%rbp),%rax │
│ ont_capacityEv+72> mov 0x20(%rax),%rcx │
│ ont_capacityEv+76> add $0x400,%rcx │
│ ont_capacityEv+83> mov %rcx,0x20(%rax) │
│ ont_capacityEv+87> mov %rax,%rdi │
│ ont_capacityEv+90> call 0x409cc0 <_ZNSt3__114__split_bufferIPiNS_9allocatorIS1_EEE4backEv> │
│ ont_capacityEv+95> mov (%rax),%rax │
│ ont_capacityEv+98> mov %rax,-0x18(%rbp) │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: std::__1::deque<int, std::__1::allocator<int> >::__add_fro* L2426 PC: 0x409775
As you can see, rax is now compared to 0x400, which is half of 0x800.
Here is the assembly listing for the int64_t deque:
(gdb) layout asm
(gdb) ni
┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ ont_capacityEv+4> sub $0x130,%rsp │
│ ont_capacityEv+11> mov %rdi,-0x8(%rbp) │
│ ont_capacityEv+15> mov -0x8(%rbp),%rax │
│ ont_capacityEv+19> mov %rax,%rcx │
│ ont_capacityEv+22> mov %rcx,%rdi │
│ ont_capacityEv+25> mov %rax,-0xb8(%rbp) │
│ ont_capacityEv+32> call 0x401a60 <_ZNSt3__112__deque_baseIlNS_9allocatorIlEEE7__allocEv> │
│ ont_capacityEv+37> mov %rax,-0x10(%rbp) │
│ ont_capacityEv+41> mov -0xb8(%rbp),%rdi │
│ ont_capacityEv+48> call 0x40bff0 <_ZNKSt3__15dequeIlNS_9allocatorIlEEE12__back_spareEv> │
│ >ont_capacityEv+53> cmp $0x200,%rax │
│ ont_capacityEv+59> jb 0x40bb4d <_ZNSt3__15dequeIlNS_9allocatorIlEEE20__add_front_capacityEv+141> │
│ ont_capacityEv+65> mov -0xb8(%rbp),%rax │
│ ont_capacityEv+72> mov 0x20(%rax),%rcx │
│ ont_capacityEv+76> add $0x200,%rcx │
│ ont_capacityEv+83> mov %rcx,0x20(%rax) │
│ ont_capacityEv+87> mov %rax,%rdi │
│ ont_capacityEv+90> call 0x40c040 <_ZNSt3__114__split_bufferIPlNS_9allocatorIS1_EEE4backEv> │
│ ont_capacityEv+95> mov (%rax),%rax │
│ ont_capacityEv+98> mov %rax,-0x18(%rbp) │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: std::__1::deque<long, std::__1::allocator<long> >::__add_f* L2426 PC: 0x40baf5
And here the value of the immediate is 0x200.
So we have seen that for the int8_t, the block size is 4096, for the int16_t, the block size is 2048, for the int32_t, the block size is 1024, and for the int64_t the block size is 512.
This agrees with the hypothesis that the block size is 4 kilobytes or 4096 bytes. Since a byte is 8 bits, a block can contain 4096 int8_ts, which each occupy only 1 byte. Since the int16_t value occupies 16 bits or 2 bytes, a block can only contain half as many, so only 2048. And so on for the int32_t and int64_t.
Now let’s look at the block size in the libstdc++ implementation of the deque:
g++ sizetest.cpp -o x.out -g
gdb x.out
GNU gdb (Debian 10.1-1.7) 10.1.90.20210103-git
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from x.out...
(gdb) break main
Breakpoint 1 at 0x11c1: file sizetest.cpp, line 8.
(gdb) r
Starting program: /home/x/src/x.out
Breakpoint 1, main () at sizetest.cpp:8
8 std::deque<int8_t> d8 = {};
(gdb) layout src
(gdb) b 17
Breakpoint 2 at 0x5555555553ad: file sizetest.cpp, line 17.
(gdb) c
Continuing.
Breakpoint 2, main () at sizetest.cpp:17
┌─sizetest.cpp───────────────────────────────────────────────────────────────────────────────────────────────────┐
│ 7 { │
│B+ 8 std::deque<int8_t> d8 = {}; │
│ 9 std::deque<int16_t> d16 = {}; │
│ 10 std::deque<int32_t> d32 = {}; │
│ 11 std::deque<int64_t> d64 = {}; │
│ 12 int8_t i8 = 5; │
│ 13 int16_t i16 = 5; │
│ 14 int32_t i32 = 5; │
│ 15 int64_t i64 = 5; │
│ 16 │
│B+>17 d8.push_front(i8); │
│ 18 d8.push_front(i8); │
│ 19 d16.push_front(i16); │
│ 20 d16.push_front(i16); │
│ 21 d32.push_front(i32); │
│ 22 d32.push_front(i32); │
│ 23 d64.push_front(i64); │
│ 24 d64.push_front(i64); │
│ 25 } │
│ │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
native process 50097 In: main L17 PC: 0x5555555553ad
(gdb) s
std::deque<signed char, std::allocator<signed char> >::push_front (this=0x7fffffffe0e0,
__x=@0x7fffffffdfef: 5 '\005') at /usr/include/c++/10/bits/stl_deque.h:1458 L17 PC: 0x5555555553ad
(gdb) n
┌─/usr/include/c++/10/bits/stl_deque.h───────────────────────────────────────────────────────────────────────────┐
│ 1449 * │
│ 1450 * This is a typical stack operation. The function creates an │
│ 1451 * element at the front of the %deque and assigns the given │
│ 1452 * data to it. Due to the nature of a %deque this operation │
│ 1453 * can be done in constant time. │
│ 1454 */ │
│ 1455 void │
│ 1456 push_front(const value_type& __x) │
│ 1457 { │
│ 1458 if (this->_M_impl._M_start._M_cur != this->_M_impl._M_start._M_first) │
│ 1459 { │
│ 1460 _Alloc_traits::construct(this->_M_impl, │
│ 1461 this->_M_impl._M_start._M_cur - 1, │
│ 1462 __x); │
│ 1463 --this->_M_impl._M_start._M_cur; │
│ 1464 } │
│ 1465 else │
│ >1466 _M_push_front_aux(__x); │
│ 1467 } │
│ 1468 │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
native process 50097 In: std::deque<signed char, std::allocator<signed char> >::push_fr* L1466 PC: 0x555555555b00
(gdb) s
std::deque<signed char, std::allocator<signed char> >::_M_push_front_aux<signed char const&> (
this=0x7fffffffe0e0) at /usr/include/c++/10/bits/deque.tcc:528
(gdb) n
(gdb) n
┌─/usr/include/c++/10/bits/deque.tcc─────────────────────────────────────────────────────────────────────────────┐
│ 523 void │
│ 524 deque<_Tp, _Alloc>:: │
│ 525 _M_push_front_aux(const value_type& __t) │
│ 526 #endif │
│ 527 { │
│ 528 if (size() == max_size()) │
│ 529 __throw_length_error( │
│ 530 __N("cannot create std::deque larger than max_size()")); │
│ 531 │
│ 532 _M_reserve_map_at_front(); │
│ >533 *(this->_M_impl._M_start._M_node - 1) = this->_M_allocate_node(); │
│ 534 __try │
│ 535 { │
│ 536 this->_M_impl._M_start._M_set_node(this->_M_impl._M_start._M_node │
│ 537 - 1); │
│ 538 this->_M_impl._M_start._M_cur = this->_M_impl._M_start._M_last - 1; │
│ 539 #if __cplusplus >= 201103L │
│ 540 _Alloc_traits::construct(this->_M_impl, │
│ 541 this->_M_impl._M_start._M_cur, │
│ 542 std::forward<_Args>(__args)...); │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
native process 58209 In: std::deque<signed char, std::allocator<signed char> >::_M_push* L533 PC: 0x555555556a2b
(gdb) s
std::_Deque_base<signed char, std::allocator<signed char> >::_M_allocate_node (this=0x7fffffffe0e0)
at /usr/include/c++/10/bits/stl_deque.h:559
┌─/usr/include/c++/10/bits/stl_deque.h───────────────────────────────────────────────────────────────────────────┐
│ 555 _Ptr │
│ 556 _M_allocate_node() │
│ 557 { │
│ 558 typedef __gnu_cxx::__alloc_traits<_Tp_alloc_type> _Traits; │
│ >559 return _Traits::allocate(_M_impl, __deque_buf_size(sizeof(_Tp))); │
│ 560 } │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
native process 58209 In: std::_Deque_base<signed char, std::allocator<signed char> >::_* L559 PC: 0x555555557842
(gdb) print _M_impl
$1 = {<std::allocator<signed char>> = {<__gnu_cxx::new_allocator<signed char>> = {<No data fields>}, <No data fiel
ds>}, <std::_Deque_base<signed char, std::allocator<signed char> >::_Deque_impl_data> = {
_M_map = 0x555555571eb0, _M_map_size = 8, _M_start = 0 '\000', _M_finish = 0 '\000'}, <No data fields>}
(gdb) print __deque_buf_size(sizeof(_Tp))
$2 = 512
(gdb) s
std::__deque_buf_size (__size=1) at /usr/include/c++/10/bits/stl_deque.h:98
┌─/usr/include/c++/10/bits/stl_deque.h───────────────────────────────────────────────────────────────────────────┐
│ 90 │
│ 91 #ifndef _GLIBCXX_DEQUE_BUF_SIZE │
│ 92 #define _GLIBCXX_DEQUE_BUF_SIZE 512 │
│ 93 #endif │
│ 94 │
│ 95 _GLIBCXX_CONSTEXPR inline size_t │
│ 96 __deque_buf_size(size_t __size) │
│ 97 { return (__size < _GLIBCXX_DEQUE_BUF_SIZE │
│ >98 ? size_t(_GLIBCXX_DEQUE_BUF_SIZE / __size) : size_t(1)); } │
│ 99 │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
native process 58209 In: std::__deque_buf_size L98 PC: 0x555555555525
As you can see, the _M_allocate_node
function, which allocates a “node” aka a block, here uses the signed char allocator, and here it allocates enough memory to hold 512 signed chars.
Stepping into the __deque_buf_size
function we can see that it simply divides the defined _GLIBCXX_DEQUE_BUF_SIZE by the parameter passed in. The sizeof operator returns the number of bytes occupied by an object. Here, a signed char is the type, and a signed char occupies one byte, thus sizeof returns 1 in this case. Thus it is the case that sizeof(_tp) is 1 and that _GLIBCXX_DEQUE_BUF_SIZE
is 512.
Next we move on to the deque of int16_t values:
┌─sizetest.cpp───────────────────────────────────────────────────────────────────────────────────────────────────┐
│ 9 std::deque<int16_t> d16 = {}; │
│ 10 std::deque<int32_t> d32 = {}; │
│ 11 std::deque<int64_t> d64 = {}; │
│ 12 int8_t i8 = 5; │
│ 13 int16_t i16 = 5; │
│ 14 int32_t i32 = 5; │
│ 15 int64_t i64 = 5; │
│ 16 │
│B+ 17 d8.push_front(i8); │
│ 18 d8.push_front(i8); │
│ >19 d16.push_front(i16); │
│ 20 d16.push_front(i16); │
│ 21 d32.push_front(i32); │
│ 22 d32.push_front(i32); │
│ 23 d64.push_front(i64); │
│ 24 d64.push_front(i64); │
│ 25 } │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
native process 58209 In: main L19 PC: 0x5555555553d9
(gdb) s
std::deque<short, std::allocator<short> >::push_front (this=0x7fffffffe090, __x=@0x7fffffffdfec: 5)
at /usr/include/c++/10/bits/stl_deque.h:1458
(gdb) n
┌─/usr/include/c++/10/bits/stl_deque.h───────────────────────────────────────────────────────────────────────────┐
│ 1449 * │
│ 1450 * This is a typical stack operation. The function creates an │
│ 1451 * element at the front of the %deque and assigns the given │
│ 1452 * data to it. Due to the nature of a %deque this operation │
│ 1453 * can be done in constant time. │
│ 1454 */ │
│ 1455 void │
│ 1456 push_front(const value_type& __x) │
│ 1457 { │
│ 1458 if (this->_M_impl._M_start._M_cur != this->_M_impl._M_start._M_first) │
│ 1459 { │
│ 1460 _Alloc_traits::construct(this->_M_impl, │
│ 1461 this->_M_impl._M_start._M_cur - 1, │
│ 1462 __x); │
│ 1463 --this->_M_impl._M_start._M_cur; │
│ 1464 } │
│ 1465 else │
│ >1466 _M_push_front_aux(__x); │
│ 1467 } │
│ 1468 │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
native process 58209 In: std::deque<short, std::allocator<short> >::push_front L1466 PC: 0x555555555b70
(gdb) s
std::deque<short, std::allocator<short> >::_M_push_front_aux<short const&> (this=0x7fffffffe090)
at /usr/include/c++/10/bits/deque.tcc:528
(gdb) n
(gdb) n
┌─/usr/include/c++/10/bits/deque.tcc─────────────────────────────────────────────────────────────────────────────┐
│ 519 void │
│ 520 deque<_Tp, _Alloc>:: │
│ 521 _M_push_front_aux(_Args&&... __args) │
│ 522 #else │
│ 523 void │
│ 524 deque<_Tp, _Alloc>:: │
│ 525 _M_push_front_aux(const value_type& __t) │
│ 526 #endif │
│ 527 { │
│ 528 if (size() == max_size()) │
│ 529 __throw_length_error( │
│ 530 __N("cannot create std::deque larger than max_size()")); │
│ 531 │
│ 532 _M_reserve_map_at_front(); │
│ >533 *(this->_M_impl._M_start._M_node - 1) = this->_M_allocate_node(); │
│ 534 __try │
│ 535 { │
│ 536 this->_M_impl._M_start._M_set_node(this->_M_impl._M_start._M_node │
│ 537 - 1); │
│ 538 this->_M_impl._M_start._M_cur = this->_M_impl._M_start._M_last - 1; │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
native process 58209 In: std::deque<short, std::allocator<short> >::_M_push_front_aux<s* L533 PC: 0x555555556b33
(gdb) s
std::_Deque_base<short, std::allocator<short> >::_M_allocate_node (this=0x7fffffffe090)
at /usr/include/c++/10/bits/stl_deque.h:559
┌─/usr/include/c++/10/bits/stl_deque.h───────────────────────────────────────────────────────────────────────────┐
│ 554 │
│ 555 _Ptr │
│ 556 _M_allocate_node() │
│ 557 { │
│ 558 typedef __gnu_cxx::__alloc_traits<_Tp_alloc_type> _Traits; │
│ >559 return _Traits::allocate(_M_impl, __deque_buf_size(sizeof(_Tp))); │
│ 560 } │
│ 561 │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
native process 58209 In: std::_Deque_base<short, std::allocator<short> >::_M_allocate_n* L559 PC: 0x555555557950
(gdb) print _M_impl
$5 = {<std::allocator<short>> = {<__gnu_cxx::new_allocator<short>> = {<No data fields>}, <No data fields>}, <std::
_Deque_base<short, std::allocator<short> >::_Deque_impl_data> = {_M_map = 0x555555572110, _M_map_size = 8,
_M_start = 0, _M_finish = 0}, <No data fields>}
(gdb) print sizeof(_Tp)
$7 = 2
(gdb) print __deque_buf_size(sizeof(_Tp))
$6 = 256
Here we can see that _M_impl
is the allocator for the short type, which occupies 2 bytes. Here __deque_buf_size
function returns 256. Clearly, it is the result of _GLIBCXX_DEQUE_BUF_SIZE
, which is 512, being divided by sizeof(_Tp)
, which is 2.
Now I will omit the gdb logs for the other two deques, except for the last few lines, since it mostly just repeats what we have already seen so far. Here it is for the int32_t deque:
┌─/usr/include/c++/10/bits/stl_deque.h───────────────────────────────────────────────────────────────────────────┐
│ 554 │
│ 555 _Ptr │
│ 556 _M_allocate_node() │
│ 557 { │
│ 558 typedef __gnu_cxx::__alloc_traits<_Tp_alloc_type> _Traits; │
│ >559 return _Traits::allocate(_M_impl, __deque_buf_size(sizeof(_Tp))); │
│ 560 } │
│ 561 │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
native process 58209 In: std::_Deque_base<int, std::allocator<int> >::_M_allocate_node L559 PC: 0x555555557a5c
(gdb) print _M_impl
$9 = {<std::allocator<int>> = {<__gnu_cxx::new_allocator<int>> = {<No data fields>}, <No data fields>}, <std::_Deq
ue_base<int, std::allocator<int> >::_Deque_impl_data> = {_M_map = 0x555555572370, _M_map_size = 8, _M_start = 0,
_M_finish = 0}, <No data fields>}
(gdb) print sizeof(_Tp)
$10 = 4
(gdb) print __deque_buf_size(sizeof(_Tp))
$11 = 128
(gdb)
And for the int64_t deque:
┌─/usr/include/c++/10/bits/stl_deque.h───────────────────────────────────────────────────────────────────────────┐
│ 554 │
│ 555 _Ptr │
│ 556 _M_allocate_node() │
│ 557 { │
│ 558 typedef __gnu_cxx::__alloc_traits<_Tp_alloc_type> _Traits; │
│ >559 return _Traits::allocate(_M_impl, __deque_buf_size(sizeof(_Tp))); │
│ 560 } │
│ 561 │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
native process 58209 In: std::_Deque_base<long, std::allocator<long> >::_M_allocate_node L559 PC: 0x555555557b6a
(gdb) print _M_impl
$13 = {<std::allocator<long>> = {<__gnu_cxx::new_allocator<long>> = {<No data fields>}, <No data fields>}, <std::_
Deque_base<long, std::allocator<long> >::_Deque_impl_data> = {_M_map = 0x5555555725d0, _M_map_size = 8,
_M_start = 0, _M_finish = 0}, <No data fields>}
(gdb) print sizeof(_Tp)
$14 = 8
(gdb) print __deque_buf_size(sizeof(_Tp))
$15 = 64
So here we can see that the block size for the libc++ deque is 4096 bytes while the block size for the libstdc++ deque is only 512 bytes - only one eighth the size! Thus while a block in a libc++ deque can hold 4096 values of type int8_t, a block in a libstdc++ deque can only hold 512 values of type int8_t. As for the int64_t, a block in a libc++ deque can hold 512 values, whereas a block in a libstdc++ deque can only hold 64.
Appendix B
Note: If you are not already familiar with the libc++ and libstdc++ implementations of the deque push_front function, then I would advise reading appendix D and E where I explain how it all works.
We will be using this program to look at when blocks are added to the map in libc++ and how they are arranged in the map:
#include <deque>
long mylong = 13370000;
void my_pf(std::deque<long>& d){
d.push_front(mylong);
++mylong;
}
void pushn(std::deque<long>& d, int n){
for (int i = 0; i < n; i++){
my_pf(d);
}
}
int main()
{
std::deque<long> d = {};
// zero blocks
pushn(d, 256);
// one block
my_pf(d);
// two blocks
pushn(d, 511);
// still two blocks
my_pf(d);
// three blocks
pushn(d, 511);
// still three blocks
my_pf(d); // 4th block is added
pushn(d, 511);
my_pf(d); // 5th block is added
pushn(d, 511);
my_pf(d); // 6th block is added
pushn(d, 511);
my_pf(d); // 7th block is added
pushn(d, 511);
my_pf(d); // 8th block is added
pushn(d, 511);
my_pf(d); // 9th block is added
pushn(d, 511);
my_pf(d); // 10th block is added
pushn(d, 511);
my_pf(d); // 11th block is added
pushn(d, 511);
my_pf(d); // 12th block is added
pushn(d, 511);
my_pf(d); // 13th block is added
pushn(d, 511);
my_pf(d); // 14th block is added
pushn(d, 511);
my_pf(d); // 15th block is added
pushn(d, 511);
my_pf(d); // 16th block is added
pushn(d, 511);
my_pf(d); // 17th block is added
}
Here I dump the contents of the map in gdb after every new block is added, starting at the first block:
(gdb) print __map_.capacity()
$3 = 1
(gdb) print *__map_.__first_@1
$4 = {0x4092c0}
(gdb) print __map_.capacity()
$1 = 2
(gdb) print *__map_.__first_@2
$2 = {0x40a2f0, 0x4092c0}
(gdb) print __map_.capacity()
$3 = 4
(gdb) print *__map_.__first_@4
$4 = {0x40b330, 0x40a2f0, 0x4092c0, 0x0}
(gdb) print __map_.capacity()
$5 = 4
(gdb) print __map_
$1 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x40b300, __begin_ = 0x40b300,
__end_ = 0x40b320, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x40b320}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<
long*>> = {<No data fields>}, <No data fields>}, <No data fields>}}
(gdb) print *__map_.__first_@4
$6 = {0x40c340, 0x40b330, 0x40a2f0, 0x4092c0}
(gdb) print __map_
$2 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x40d350, __begin_ = 0x40d350,
__end_ = 0x40d378, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x40d390}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<
long*>> = {<No data fields>}, <No data fields>}, <No data fields>}}
(gdb) print *__map_.__first_@8
$8 = {0x40d3a0, 0x40c340, 0x40b330, 0x40a2f0, 0x4092c0, 0x0, 0x0, 0x0}
(gdb) print __map_
$11 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x40d350, __begin_ = 0x40d358,
__end_ = 0x40d388, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x40d390}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<
long*>> = {<No data fields>}, <No data fields>}, <No data fields>}}
(gdb) print *__map_.__first_@8
$10 = {0x40d3a0, 0x40e3b0, 0x40d3a0, 0x40c340, 0x40b330, 0x40a2f0, 0x4092c0, 0x0}
(gdb) print __map_
$3 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x40d350, __begin_ = 0x40d350,
__end_ = 0x40d388, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x40d390}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<
long*>> = {<No data fields>}, <No data fields>}, <No data fields>}}
(gdb) print *__map_.__first_@8
$4 = {0x40f3c0, 0x40e3b0, 0x40d3a0, 0x40c340, 0x40b330, 0x40a2f0, 0x4092c0, 0x0}
(gdb) print __map_
$5 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x40d350, __begin_ = 0x40d350,
__end_ = 0x40d390, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x40d390}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<
long*>> = {<No data fields>}, <No data fields>}, <No data fields>}}
(gdb) print *__map_.__first_@8
$6 = {0x4103d0, 0x40f3c0, 0x40e3b0, 0x40d3a0, 0x40c340, 0x40b330, 0x40a2f0, 0x4092c0}
(gdb) print __map_
$7 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x4113e0, __begin_ = 0x4113e0,
__end_ = 0x411428, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x411460}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<
long*>> = {<No data fields>}, <No data fields>}, <No data fields>}}
(gdb) print __map_.capacity()
$8 = 16
(gdb) print *__map_.__first_@16
$9 = {0x411470, 0x4103d0, 0x40f3c0, 0x40e3b0, 0x40d3a0, 0x40c340, 0x40b330, 0x40a2f0, 0x4092c0, 0x0, 0x0, 0x0, 0x0, 0x0,
0x0, 0x0}
(gdb) print __map_
$10 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x4113e0, __begin_ = 0x4113f8,
__end_ = 0x411448, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x411460}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<
long*>> = {<No data fields>}, <No data fields>}, <No data fields>}}
(gdb) print *__map_.__first_@16
$11 = {0x411470, 0x4103d0, 0x40f3c0, 0x412480, 0x411470, 0x4103d0, 0x40f3c0, 0x40e3b0, 0x40d3a0, 0x40c340, 0x40b330,
0x40a2f0, 0x4092c0, 0x0, 0x0, 0x0}
(gdb) print __map_
$12 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x4113e0, __begin_ = 0x4113f0,
__end_ = 0x411448, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x411460}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<
long*>> = {<No data fields>}, <No data fields>}, <No data fields>}}
(gdb) print *__map_.__first_@16
$13 = {0x411470, 0x4103d0, 0x413490, 0x412480, 0x411470, 0x4103d0, 0x40f3c0, 0x40e3b0, 0x40d3a0, 0x40c340, 0x40b330,
0x40a2f0, 0x4092c0, 0x0, 0x0, 0x0}
(gdb) print __map_
$14 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x4113e0, __begin_ = 0x4113e8,
__end_ = 0x411448, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x411460}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<
long*>> = {<No data fields>}, <No data fields>}, <No data fields>}}
(gdb) print *__map_.__first_@16
$15 = {0x411470, 0x4144a0, 0x413490, 0x412480, 0x411470, 0x4103d0, 0x40f3c0, 0x40e3b0, 0x40d3a0, 0x40c340, 0x40b330,
0x40a2f0, 0x4092c0, 0x0, 0x0, 0x0}
(gdb) print __map_
$16 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x4113e0, __begin_ = 0x4113e0,
__end_ = 0x411448, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x411460}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<
long*>> = {<No data fields>}, <No data fields>}, <No data fields>}}
(gdb) print *__map_.__first_@16
$17 = {0x4154b0, 0x4144a0, 0x413490, 0x412480, 0x411470, 0x4103d0, 0x40f3c0, 0x40e3b0, 0x40d3a0, 0x40c340, 0x40b330,
0x40a2f0, 0x4092c0, 0x0, 0x0, 0x0}
(gdb) print __map_
$18 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x4113e0, __begin_ = 0x4113e8,
__end_ = 0x411458, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x411460}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<
long*>> = {<No data fields>}, <No data fields>}, <No data fields>}}
(gdb) print *__map_.__first_@16
$19 = {0x4154b0, 0x4164c0, 0x4154b0, 0x4144a0, 0x413490, 0x412480, 0x411470, 0x4103d0, 0x40f3c0, 0x40e3b0, 0x40d3a0,
0x40c340, 0x40b330, 0x40a2f0, 0x4092c0, 0x0}
(gdb) print __map_
$20 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x4113e0, __begin_ = 0x4113e0,
__end_ = 0x411458, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x411460}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<
long*>> = {<No data fields>}, <No data fields>}, <No data fields>}}
(gdb) print *__map_.__first_@16
$21 = {0x4174d0, 0x4164c0, 0x4154b0, 0x4144a0, 0x413490, 0x412480, 0x411470, 0x4103d0, 0x40f3c0, 0x40e3b0, 0x40d3a0,
0x40c340, 0x40b330, 0x40a2f0, 0x4092c0, 0x0}
(gdb) print __map_
$22 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x4113e0, __begin_ = 0x4113e0,
__end_ = 0x411460, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x411460}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<
long*>> = {<No data fields>}, <No data fields>}, <No data fields>}}
(gdb) print *__map_.__first_@16
$23 = {0x4184e0, 0x4174d0, 0x4164c0, 0x4154b0, 0x4144a0, 0x413490, 0x412480, 0x411470, 0x4103d0, 0x40f3c0, 0x40e3b0,
0x40d3a0, 0x40c340, 0x40b330, 0x40a2f0, 0x4092c0}
Appendix C
Here is the source code I used for the same for libstdc++ (compile with clang++ -stdlib=libstdc++ h.cpp -o a.out -g
):
#include <deque>
long mylong = 13370000;
void my_pf(std::deque<long>& d){
d.push_front(mylong);
++mylong;
}
void pushn(std::deque<long>& d, int n){
for (int i = 0; i < n; i++){
my_pf(d);
}
}
int main()
{
std::deque<long> d = {}; // starts off with 1 block
my_pf(d); // 2nd block is added
pushn(d, 63);
my_pf(d); // 3rd block is added
pushn(d, 63);
my_pf(d); // 4th block is added
pushn(d, 63);
my_pf(d); // 5th block is added
pushn(d, 63);
my_pf(d); // 6th block is added
pushn(d, 63);
my_pf(d); // 7th block is added
pushn(d, 63);
my_pf(d); // 8th block is added
pushn(d, 63);
my_pf(d); // 9th block is added
pushn(d, 63);
my_pf(d); // 10th block is added
pushn(d, 63);
my_pf(d); // 11th block is added
pushn(d, 63);
my_pf(d); // 12th block is added
pushn(d, 63);
my_pf(d); // 13th block is added
pushn(d, 63);
my_pf(d); // 14th block is added
pushn(d, 63);
my_pf(d); // 15th block is added
pushn(d, 63);
my_pf(d); // 16th block is added
pushn(d, 63);
my_pf(d); // 17th block is added
pushn(d, 63);
my_pf(d); // 18th block is added
pushn(d, 63);
my_pf(d); // 19th block is added
pushn(d, 63);
my_pf(d); // 20th block is added
pushn(d, 63);
my_pf(d); // 21th block is added
pushn(d, 63);
my_pf(d); // 22th block is added
pushn(d, 63);
my_pf(d); // 23th block is added
pushn(d, 63);
my_pf(d); // 24th block is added
pushn(d, 63);
my_pf(d); // 25th block is added
}
Here are the gdb logs where I once again dump the contents of the map after every new block is added (note that unlike in the libc++ version, in the libstdc++ version the first block is added to the map when the deque is first created, before any elements have been pushed into the deque):
print this->_M_impl._M_map_size
print *this->_M_impl._M_map@38
(gdb) print this->_M_impl._M_map
$1 = (std::_Deque_base<long, std::allocator<long> >::_Map_pointer) 0x418eb0
(gdb) print this->_M_impl._M_map_size
$2 = 8
(gdb) print *this->_M_impl._M_map@8
$3 = {0x0, 0x0, 0x0, 0x418f00, 0x0, 0x0, 0x0, 0x0}
(gdb) print this->_M_impl._M_map
$4 = (std::_Deque_base<long, std::allocator<long> >::_Map_pointer) 0x418eb0
(gdb) print *this->_M_impl._M_map@8
$5 = {0x0, 0x0, 0x419110, 0x418f00, 0x0, 0x0, 0x0, 0x0}
(gdb) print this->_M_impl._M_map
$6 = (std::_Deque_base<long, std::allocator<long> >::_Map_pointer) 0x418eb0
(gdb) print *this->_M_impl._M_map@8
$7 = {0x0, 0x419320, 0x419110, 0x418f00, 0x0, 0x0, 0x0, 0x0}
(gdb) print this->_M_impl._M_map
$8 = (std::_Deque_base<long, std::allocator<long> >::_Map_pointer) 0x418eb0
(gdb) print *this->_M_impl._M_map@8
$9 = {0x419530, 0x419320, 0x419110, 0x418f00, 0x0, 0x0, 0x0, 0x0}
(gdb) print this->_M_impl._M_map_size
$10 = 8
(gdb) print this->_M_impl._M_map
$12 = (std::_Deque_base<long, std::allocator<long> >::_Map_pointer) 0x419740
(gdb) print this->_M_impl._M_map_size
$13 = 18
(gdb) print *this->_M_impl._M_map@18
$14 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x4197e0, 0x419530, 0x419320, 0x419110, 0x418f00, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}
(gdb)
(gdb) print this->_M_impl._M_map
$15 = (std::_Deque_base<long, std::allocator<long> >::_Map_pointer) 0x419740
(gdb) print *this->_M_impl._M_map@18
$16 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x4199f0, 0x4197e0, 0x419530, 0x419320, 0x419110, 0x418f00, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
0x0}
(gdb) print this->_M_impl._M_map
$17 = (std::_Deque_base<long, std::allocator<long> >::_Map_pointer) 0x419740
(gdb) print *this->_M_impl._M_map@18
$18 = {0x0, 0x0, 0x0, 0x0, 0x419c00, 0x4199f0, 0x4197e0, 0x419530, 0x419320, 0x419110, 0x418f00, 0x0, 0x0, 0x0, 0x0, 0x0,
0x0, 0x0}
(gdb) print this->_M_impl._M_map
$19 = (std::_Deque_base<long, std::allocator<long> >::_Map_pointer) 0x419740
(gdb) print *this->_M_impl._M_map@18
$20 = {0x0, 0x0, 0x0, 0x419e10, 0x419c00, 0x4199f0, 0x4197e0, 0x419530, 0x419320, 0x419110, 0x418f00, 0x0, 0x0, 0x0, 0x0,
0x0, 0x0, 0x0}
(gdb) print this->_M_impl._M_map
$21 = (std::_Deque_base<long, std::allocator<long> >::_Map_pointer) 0x419740
(gdb) print *this->_M_impl._M_map@18
$22 = {0x0, 0x0, 0x41a020, 0x419e10, 0x419c00, 0x4199f0, 0x4197e0, 0x419530, 0x419320, 0x419110, 0x418f00, 0x0, 0x0, 0x0,
0x0, 0x0, 0x0, 0x0}
(gdb) print this->_M_impl._M_map
$23 = (std::_Deque_base<long, std::allocator<long> >::_Map_pointer) 0x419740
(gdb) print *this->_M_impl._M_map@18
$24 = {0x0, 0x41a230, 0x41a020, 0x419e10, 0x419c00, 0x4199f0, 0x4197e0, 0x419530, 0x419320, 0x419110, 0x418f00, 0x0, 0x0,
0x0, 0x0, 0x0, 0x0, 0x0}
(gdb) print this->_M_impl._M_map
$25 = (std::_Deque_base<long, std::allocator<long> >::_Map_pointer) 0x419740
(gdb) print *this->_M_impl._M_map@18
$26 = {0x41a440, 0x41a230, 0x41a020, 0x419e10, 0x419c00, 0x4199f0, 0x4197e0, 0x419530, 0x419320, 0x419110, 0x418f00, 0x0,
0x0, 0x0, 0x0, 0x0, 0x0, 0x0}
(gdb) print this->_M_impl._M_map
$27 = (std::_Deque_base<long, std::allocator<long> >::_Map_pointer) 0x41a650
(gdb) print this->_M_impl._M_map_size
$28 = 38
(gdb) print *this->_M_impl._M_map@38
$29 = {0x0 <repeats 13 times>, 0x41a790, 0x41a440, 0x41a230, 0x41a020, 0x419e10, 0x419c00, 0x4199f0, 0x4197e0, 0x419530,
0x419320, 0x419110, 0x418f00, 0x0 <repeats 13 times>}
(gdb) print this->_M_impl._M_map
$30 = (std::_Deque_base<long, std::allocator<long> >::_Map_pointer) 0x41a650
(gdb) print *this->_M_impl._M_map@38
$31 = {0x0 <repeats 12 times>, 0x41a9a0, 0x41a790, 0x41a440, 0x41a230, 0x41a020, 0x419e10, 0x419c00, 0x4199f0, 0x4197e0,
0x419530, 0x419320, 0x419110, 0x418f00, 0x0 <repeats 13 times>}
(gdb) print this->_M_impl._M_map
$32 = (std::_Deque_base<long, std::allocator<long> >::_Map_pointer) 0x41a650
(gdb) print *this->_M_impl._M_map@38
$33 = {0x0 <repeats 11 times>, 0x41abb0, 0x41a9a0, 0x41a790, 0x41a440, 0x41a230, 0x41a020, 0x419e10, 0x419c00, 0x4199f0,
0x4197e0, 0x419530, 0x419320, 0x419110, 0x418f00, 0x0 <repeats 13 times>}
(gdb) print this->_M_impl._M_map
$34 = (std::_Deque_base<long, std::allocator<long> >::_Map_pointer) 0x41a650
(gdb) print *this->_M_impl._M_map@38
$35 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x41adc0, 0x41abb0, 0x41a9a0, 0x41a790, 0x41a440, 0x41a230,
0x41a020, 0x419e10, 0x419c00, 0x4199f0, 0x4197e0, 0x419530, 0x419320, 0x419110, 0x418f00, 0x0 <repeats 13 times>}
(gdb) print this->_M_impl._M_map
$36 = (std::_Deque_base<long, std::allocator<long> >::_Map_pointer) 0x41a650
(gdb) print *this->_M_impl._M_map@38
$37 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x41afd0, 0x41adc0, 0x41abb0, 0x41a9a0, 0x41a790, 0x41a440, 0x41a230,
0x41a020, 0x419e10, 0x419c00, 0x4199f0, 0x4197e0, 0x419530, 0x419320, 0x419110, 0x418f00, 0x0 <repeats 13 times>}
(gdb) print this->_M_impl._M_map
$38 = (std::_Deque_base<long, std::allocator<long> >::_Map_pointer) 0x41a650
(gdb) print *this->_M_impl._M_map@38
$39 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x41b1e0, 0x41afd0, 0x41adc0, 0x41abb0, 0x41a9a0, 0x41a790, 0x41a440,
0x41a230, 0x41a020, 0x419e10, 0x419c00, 0x4199f0, 0x4197e0, 0x419530, 0x419320, 0x419110, 0x418f00,
0x0 <repeats 13 times>}
(gdb) print this->_M_impl._M_map
$40 = (std::_Deque_base<long, std::allocator<long> >::_Map_pointer) 0x41a650
(gdb) print *this->_M_impl._M_map@38
$41 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x41b3f0, 0x41b1e0, 0x41afd0, 0x41adc0, 0x41abb0, 0x41a9a0, 0x41a790, 0x41a440,
0x41a230, 0x41a020, 0x419e10, 0x419c00, 0x4199f0, 0x4197e0, 0x419530, 0x419320, 0x419110, 0x418f00,
0x0 <repeats 13 times>}
(gdb) print this->_M_impl._M_map
$42 = (std::_Deque_base<long, std::allocator<long> >::_Map_pointer) 0x41a650
(gdb) print *this->_M_impl._M_map@38
$43 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x41b600, 0x41b3f0, 0x41b1e0, 0x41afd0, 0x41adc0, 0x41abb0, 0x41a9a0, 0x41a790,
0x41a440, 0x41a230, 0x41a020, 0x419e10, 0x419c00, 0x4199f0, 0x4197e0, 0x419530, 0x419320, 0x419110, 0x418f00,
0x0 <repeats 13 times>}
(gdb) print this->_M_impl._M_map
$44 = (std::_Deque_base<long, std::allocator<long> >::_Map_pointer) 0x41a650
(gdb) print *this->_M_impl._M_map@38
$45 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x41b810, 0x41b600, 0x41b3f0, 0x41b1e0, 0x41afd0, 0x41adc0, 0x41abb0, 0x41a9a0, 0x41a790,
0x41a440, 0x41a230, 0x41a020, 0x419e10, 0x419c00, 0x4199f0, 0x4197e0, 0x419530, 0x419320, 0x419110, 0x418f00,
0x0 <repeats 13 times>}
(gdb) print this->_M_impl._M_map
$46 = (std::_Deque_base<long, std::allocator<long> >::_Map_pointer) 0x41a650
(gdb) print *this->_M_impl._M_map@38
$47 = {0x0, 0x0, 0x0, 0x0, 0x41ba20, 0x41b810, 0x41b600, 0x41b3f0, 0x41b1e0, 0x41afd0, 0x41adc0, 0x41abb0, 0x41a9a0,
0x41a790, 0x41a440, 0x41a230, 0x41a020, 0x419e10, 0x419c00, 0x4199f0, 0x4197e0, 0x419530, 0x419320, 0x419110, 0x418f00,
0x0 <repeats 13 times>}
(gdb) print this->_M_impl._M_map
$48 = (std::_Deque_base<long, std::allocator<long> >::_Map_pointer) 0x41a650
(gdb) print *this->_M_impl._M_map@38
$49 = {0x0, 0x0, 0x0, 0x41bc30, 0x41ba20, 0x41b810, 0x41b600, 0x41b3f0, 0x41b1e0, 0x41afd0, 0x41adc0, 0x41abb0, 0x41a9a0,
0x41a790, 0x41a440, 0x41a230, 0x41a020, 0x419e10, 0x419c00, 0x4199f0, 0x4197e0, 0x419530, 0x419320, 0x419110, 0x418f00,
0x0 <repeats 13 times>}
(gdb) print this->_M_impl._M_map
$50 = (std::_Deque_base<long, std::allocator<long> >::_Map_pointer) 0x41a650
(gdb) print *this->_M_impl._M_map@38
$51 = {0x0, 0x0, 0x41be40, 0x41bc30, 0x41ba20, 0x41b810, 0x41b600, 0x41b3f0, 0x41b1e0, 0x41afd0, 0x41adc0, 0x41abb0,
0x41a9a0, 0x41a790, 0x41a440, 0x41a230, 0x41a020, 0x419e10, 0x419c00, 0x4199f0, 0x4197e0, 0x419530, 0x419320, 0x419110,
0x418f00, 0x0 <repeats 13 times>}
(gdb) print this->_M_impl._M_map
$52 = (std::_Deque_base<long, std::allocator<long> >::_Map_pointer) 0x41a650
(gdb) print *this->_M_impl._M_map@38
$53 = {0x0, 0x41c050, 0x41be40, 0x41bc30, 0x41ba20, 0x41b810, 0x41b600, 0x41b3f0, 0x41b1e0, 0x41afd0, 0x41adc0, 0x41abb0,
0x41a9a0, 0x41a790, 0x41a440, 0x41a230, 0x41a020, 0x419e10, 0x419c00, 0x4199f0, 0x4197e0, 0x419530, 0x419320, 0x419110,
0x418f00, 0x0 <repeats 13 times>}
(gdb) print this->_M_impl._M_map
$54 = (std::_Deque_base<long, std::allocator<long> >::_Map_pointer) 0x41a650
(gdb) print *this->_M_impl._M_map@38
$55 = {0x41c260, 0x41c050, 0x41be40, 0x41bc30, 0x41ba20, 0x41b810, 0x41b600, 0x41b3f0, 0x41b1e0, 0x41afd0, 0x41adc0,
0x41abb0, 0x41a9a0, 0x41a790, 0x41a440, 0x41a230, 0x41a020, 0x419e10, 0x419c00, 0x4199f0, 0x4197e0, 0x419530, 0x419320,
0x419110, 0x418f00, 0x0 <repeats 13 times>}
Appendix D
Here I explain how the libc++ implementation of the STL deque push_front function works.
First we write a program which creates an empty deque of longs and then pushes some longs into it:
#include <deque>
int main()
{
std::deque<long> d = {};
long mylong = 5;
d.push_front(mylong);
d.push_front(mylong);
d.push_front(mylong);
}
We compile it with libc++ like this (the -g flag gives debugging symbols):
clang++ -stdlib=libc++ a.cpp -o a.out -g
Now let’s run it with gdb, making a breakpoint at the first call to push_front:
$ gdb a.out
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from a.out...
(gdb) set trace-commands on
(gdb) set logging on
+set logging on
Copying output to gdb.txt.
Copying debug output to gdb.txt.
(gdb) break main
Breakpoint 1 at 0x4011ec: file a.cpp, line 5.
(gdb) r
Starting program: a.out
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Breakpoint 1, main () at a.cpp:5
5 std::deque<long> d = {};
(gdb) layout src
┌──a.cpp────────────────────────────────────────────────────────────────────┐
│ 1 #include <deque> │
│ 2 │
│ 3 int main() │
│ 4 { │
│B+>5 std::deque<long> d = {}; │
│ 6 long mylong = 5; │
│ 7 d.push_front(mylong); │
│ 8 d.push_front(mylong); │
│ 9 d.push_front(mylong); │
│ 10 } │
└───────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b297 In: main L5 PC: 0x4011ec
(gdb) break 7
Breakpoint 2 at 0x401208: file a.cpp, line 7.
┌──a.cpp────────────────────────────────────────────────────────────────────┐
│ 1 #include <deque> │
│ 2 │
│ 3 int main() │
│ 4 { │
│B+>5 std::deque<long> d = {}; │
│ 6 long mylong = 5; │
│b+ 7 d.push_front(mylong); │
│ 8 d.push_front(mylong); │
│ 9 d.push_front(mylong); │
│ 10 } │
└───────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b297 In: main L5 PC: 0x4011ec
(gdb) c
Continuing.
Breakpoint 2, main () at a.cpp:7
┌──a.cpp────────────────────────────────────────────────────────────────────┐
│ 1 #include <deque> │
│ 2 │
│ 3 int main() │
│ 4 { │
│B+ 5 std::deque<long> d = {}; │
│ 6 long mylong = 5; │
│B+>7 d.push_front(mylong); │
│ 8 d.push_front(mylong); │
│ 9 d.push_front(mylong); │
│ 10 } │
└───────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b297 In: main L7 PC: 0x401208
Now we can step into the call
(gdb) s
std::__1::deque<long, std::__1::allocator<long> >::push_front (
this=0x7fffffffdf30, __v=@0x7fffffffdf28: 5)
at /usr/lib/llvm-10/bin/../include/c++/v1/deque:1938
┌──/usr/lib/llvm-10/bin/../include/c++/v1/deque─────────────────────────────┐
│ 1933 │
│ 1934 template <class _Tp, class _Allocator> │
│ 1935 void │
│ 1936 deque<_Tp, _Allocator>::push_front(const value_type& __v) │
│ 1937 { │
│ >1938 allocator_type& __a = __base::__alloc(); │
│ 1939 if (__front_spare() == 0) │
│ 1940 __add_front_capacity(); │
│ 1941 // __front_spare() >= 1 │
│ 1942 __alloc_traits::construct(__a, _VSTD::addressof(*--__bas│
│ 1943 --__base::__start_; │
│ 1944 ++__base::size(); │
│ 1945 } │
└───────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b297 In: std::__1::deque<long,* L1938 PC: 0x401294
Let’s step into this __alloc()
call and see what it does:
(gdb) s
std::__1::__deque_base<long, std::__1::allocator<long> >::__alloc (this=0x7fffffffe110) at /usr/lib/llvm-11/bin/../i
┌─/usr/lib/llvm-11/bin/../include/c++/v1/deque─────────────────────────────────────────────────────────────────────┐
│ 1039 _LIBCPP_INLINE_VISIBILITY size_type& size() {return __size_.first();} │
│ 1040 _LIBCPP_INLINE_VISIBILITY │
│ 1041 const size_type& size() const _NOEXCEPT {return __size_.first();} │
│ >1042 _LIBCPP_INLINE_VISIBILITY allocator_type& __alloc() {return __size_.second();} │
│ 1043 _LIBCPP_INLINE_VISIBILITY │
│ 1044 const allocator_type& __alloc() const _NOEXCEPT {return __size_.second();} │
│ 1045 │
│ 1046 _LIBCPP_INLINE_VISIBILITY │
│ 1047 __deque_base() │
│ 1048 _NOEXCEPT_(is_nothrow_default_constructible<allocator_type>::value); │
│ 1049 _LIBCPP_INLINE_VISIBILITY │
│ 1050 explicit __deque_base(const allocator_type& __a); │
│ 1051 public: │
│ 1052 ~__deque_base(); │
│ 1053 │
│ 1054 #ifndef _LIBCPP_CXX03_LANG │
│ 1055 __deque_base(__deque_base&& __c) │
│ 1056 _NOEXCEPT_(is_nothrow_move_constructible<allocator_type>::value); │
│ 1057 __deque_base(__deque_base&& __c, const allocator_type& __a); │
│ 1058 #endif // _LIBCPP_CXX03_LANG │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: std::__1::__deque_base<long, std::__1::allocator<long> >::__* L1042 PC: 0x401650
(gdb) s
std::__1::__compressed_pair<unsigned long, std::__1::allocator<long> >::second (this=0x7fffffffe138)
at /usr/lib/llvm-11/bin/../include/c++/v1/memory:2213
┌─/usr/lib/llvm-11/bin/../include/c++/v1/memory────────────────────────────────────────────────────────────────────┐
│ 2210 │
│ 2211 _LIBCPP_INLINE_VISIBILITY │
│ 2212 typename _Base2::reference second() _NOEXCEPT { │
│ >2213 return static_cast<_Base2&>(*this).__get(); │
│ 2214 } │
│ 2215 │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: std::__1::__compressed_pair<unsigned long, std::__1::allocat* L2213 PC: 0x401c70
(gdb) s
std::__1::__compressed_pair_elem<std::__1::allocator<long>, 1, true>::__get (this=0x7fffffffe138)
at /usr/lib/llvm-11/bin/../include/c++/v1/memory:2155
┌─/usr/lib/llvm-11/bin/../include/c++/v1/memory────────────────────────────────────────────────────────────────────┐
│ 2154 │
│ >2155 _LIBCPP_INLINE_VISIBILITY reference __get() _NOEXCEPT { return *this; } │
│ 2156 _LIBCPP_INLINE_VISIBILITY │
│ 2157 const_reference __get() const _NOEXCEPT { return *this; } │
│ 2158 }; │
│ 2159 │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: std::__1::__compressed_pair_elem<std::__1::allocator<long>, * L2155 PC: 0x401c8c
(gdb) print *this
$1 = {<std::__1::allocator<long>> = {<No data fields>}, <No data fields>}
(gdb)
Note: It seems that you can’t actually scroll your history in gdb, which means that the gdb commands that you run, as well as the output, are lost when they are scrolled off screen. A workaround is to set logging on but you have to re-enable logging when you enter TUI mode otherwise gdb won’t log them. This is a known gdb bug.
So this __alloc()
call simply calls the allocator for the long type. Basically it just allocates some memory to store a new element of type long.
(gdb) n
std::__1::deque<long, std::__1::allocator<long> >::push_front (this=0x7fffffffe110, __v=@0x7fffffffe108: 5)
at /usr/lib/llvm-11/bin/../include/c++/v1/deque:1941
┌─/usr/lib/llvm-11/bin/../include/c++/v1/deque─────────────────────────────────────────────────────────────────────┐
│ 1935 │
│ 1936 template <class _Tp, class _Allocator> │
│ 1937 void │
│ 1938 deque<_Tp, _Allocator>::push_front(const value_type& __v) │
│ 1939 { │
│ 1940 allocator_type& __a = __base::__alloc(); │
│ >1941 if (__front_spare() == 0) │
│ 1942 __add_front_capacity(); │
│ 1943 // __front_spare() >= 1 │
│ 1944 __alloc_traits::construct(__a, _VSTD::addressof(*--__base::begin()), __v); │
│ 1945 --__base::__start_; │
│ 1946 ++__base::size(); │
│ 1947 } │
│ 1948 │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(gdb) print __front_spare()
$2 = 0
Since this is the first ever call to push_front, and we created an empty deque, there is NO spare capacity in the deque whatsoever, hence naturally __front_spare()
returns 0. But let’s step into it to find out exactly what it does:
(gdb) s
│ 1495 _LIBCPP_INLINE_VISIBILITY │
│ 1496 size_type __front_spare() const │
│ 1497 { │
│ >1498 return __base::__start_; │
│ 1499 }
This function simply returns the __start_
variable. Thus, the __start_
variable stores the amount of spare space in the front of the deque.
(gdb) n
std::__1::deque<long, std::__1::allocator<long> >::push_front (this=0x7fffffffe110, __v=@0x7fffffffe108: 5)
at /usr/lib/llvm-11/bin/../include/c++/v1/deque:1942
(gdb) s
std::__1::deque<long, std::__1::allocator<long> >::__add_front_capacity (this=0x7fffffffe110)
at /usr/lib/llvm-11/bin/../include/c++/v1/deque:2425
┌─/usr/lib/llvm-11/bin/../include/c++/v1/deque─────────────────────────────────────────────────────────────────────┐
│ 2418 │
│ 2419 // Create front capacity for one block of elements. │
│ 2420 // Strong guarantee. Either do it or don't touch anything. │
│ 2421 template <class _Tp, class _Allocator> │
│ 2422 void │
│ 2423 deque<_Tp, _Allocator>::__add_front_capacity() │
│ 2424 { │
│ >2425 allocator_type& __a = __base::__alloc(); │
│ 2426 if (__back_spare() >= __base::__block_size) │
│ 2427 { │
│ 2428 __base::__start_ += __base::__block_size; │
│ 2429 pointer __pt = __base::__map_.back(); │
│ 2430 __base::__map_.pop_back(); │
│ 2431 __base::__map_.push_front(__pt); │
│ 2432 } │
│ 2433 // Else if __base::__map_.size() < __base::__map_.capacity() then we need to allocate 1 buffer │
│ 2434 else if (__base::__map_.size() < __base::__map_.capacity()) │
│ 2435 { // we can put the new buffer into the map, but don't shift things around │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
You will notice that this line is exactly identical to the one we saw earlier, which allocated space for a long. So this line also allocates space for an element.
(gdb) n
std::__1::deque<long, std::__1::allocator<long> >::__add_front_capacity (this=0x7fffffffe110)
at /usr/lib/llvm-11/bin/../include/c++/v1/deque:2426
┌─/usr/lib/llvm-11/bin/../include/c++/v1/deque─────────────────────────────────────────────────────────────────────┐
│ 2418 │
│ 2419 // Create front capacity for one block of elements. │
│ 2420 // Strong guarantee. Either do it or don't touch anything. │
│ 2421 template <class _Tp, class _Allocator> │
│ 2422 void │
│ 2423 deque<_Tp, _Allocator>::__add_front_capacity() │
│ 2424 { │
│ 2425 allocator_type& __a = __base::__alloc(); │
│ >2426 if (__back_spare() >= __base::__block_size) │
│ 2427 { │
│ 2428 __base::__start_ += __base::__block_size; │
│ 2429 pointer __pt = __base::__map_.back(); │
│ 2430 __base::__map_.pop_back(); │
│ 2431 __base::__map_.push_front(__pt); │
│ 2432 } │
│ 2433 // Else if __base::__map_.size() < __base::__map_.capacity() then we need to allocate 1 buffer │
│ 2434 else if (__base::__map_.size() < __base::__map_.capacity()) │
│ 2435 { // we can put the new buffer into the map, but don't shift things around │
│ 2436 // until all buffers are allocated. If we throw, we don't need to fix │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(gdb) print __back_spare()
$1 = 0
(gdb) s
┌─/usr/lib/llvm-11/bin/../include/c++/v1/deque─────────────────────────────────────────────────────────────────────┐
│ 1504 _LIBCPP_INLINE_VISIBILITY │
│ 1505 size_type __back_spare() const │
│ 1506 { │
│ >1507 return __capacity() - (__base::__start_ + __base::size()); │
│ 1508 } │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(gdb) s
┌─/usr/lib/llvm-11/bin/../include/c++/v1/deque─────────────────────────────────────────────────────────────────────┐
│ 1484 _LIBCPP_INLINE_VISIBILITY │
│ 1485 size_type __capacity() const │
│ 1486 { │
│ >1487 return __base::__map_.size() == 0 ? 0 : __base::__map_.size() * __base::__block_size - 1; │
│ 1488 } │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(gdb) finish
Run till exit from #0 std::__1::deque<long, std::__1::allocator<long> >::__capacity (this=0x7fffffffe110)
at /usr/lib/llvm-11/bin/../include/c++/v1/deque:1487
0x000000000040277c in std::__1::deque<long, std::__1::allocator<long> >::__back_spare (this=0x7fffffffe110)
at /usr/lib/llvm-11/bin/../include/c++/v1/deque:1507
Value returned is $3 = 0
(gdb) print __block_size
$2 = <optimized out>
So we calculate __back_spare
by getting the difference between the capacity and the sum of __start_
and the size of the deque (i.e the amount of data actually stored in the deque currently). Recall from earlier that the __start
variable stores the amount of spare space at the front of the deque. Of course! If we add together the size of the deque and the amount of spare space at the front of the deque, then the remaining space must be the spare space at the back of the deque.
In Appendix A I explained how to see the actual value of the __block_size
variable by looking at the assembly, so I won’t do that here. It suffices to know that in the libc++ deque, the block size is (at the time of this writing) 4096 bytes and in the libstdc++ deque the block size is 512 bytes.
Anyways, let’s go back to __add_front_capacity
. Since the block size is 512 (since a long is 8 bytes and 4096 divided by 8 is 512) and the return value of __back_spare()
is 0, that means the condition fails which means we drop into the next test:
(gdb) n
┌─/usr/lib/llvm-11/bin/../include/c++/v1/deque─────────────────────────────────────────────────────────────────────┐
│ 2422 void │
│ 2423 deque<_Tp, _Allocator>::__add_front_capacity() │
│ 2424 { │
│ 2425 allocator_type& __a = __base::__alloc(); │
│ 2426 if (__back_spare() >= __base::__block_size) │
│ 2427 { │
│ 2428 __base::__start_ += __base::__block_size; │
│ 2429 pointer __pt = __base::__map_.back(); │
│ 2430 __base::__map_.pop_back(); │
│ 2431 __base::__map_.push_front(__pt); │
│ 2432 } │
│ 2433 // Else if __base::__map_.size() < __base::__map_.capacity() then we need to allocate 1 buffer │
│ >2434 else if (__base::__map_.size() < __base::__map_.capacity()) │
│ 2435 { // we can put the new buffer into the map, but don't shift things around │
│ 2436 // until all buffers are allocated. If we throw, we don't need to fix │
│ 2437 // anything up (any added buffers are undetectible) │
│ 2438 if (__base::__map_.__front_spare() > 0) │
│ 2439 __base::__map_.push_front(__alloc_traits::allocate(__a, __base::__block_size)); │
│ 2440 else │
│ 2441 { │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(gdb) print __map_.size()
$8 = 0
(gdb) print __map_.capacity()
$9 = 0
Since the size is not less than the capacity (they are both 0), we drop to the else block:
┌─/usr/lib/llvm-11/bin/../include/c++/v1/deque─────────────────────────────────────────────────────────────────────┐
│ 2452 // Else need to allocate 1 buffer, *and* we need to reallocate __map_. │
│ 2453 else │
│ 2454 { │
│ 2455 __split_buffer<pointer, typename __base::__pointer_allocator&> │
│ >2456 __buf(max<size_type>(2 * __base::__map_.capacity(), 1), │
│ 2457 0, __base::__map_.__alloc()); │
│ 2458 │
│ 2459 typedef __allocator_destructor<_Allocator> _Dp; │
│ 2460 unique_ptr<pointer, _Dp> __hold( │
│ 2461 __alloc_traits::allocate(__a, __base::__block_size), │
│ 2462 _Dp(__a, __base::__block_size)); │
│ 2463 __buf.push_back(__hold.get()); │
│ 2464 __hold.release(); │
│ 2465 │
│ 2466 for (typename __base::__map_pointer __i = __base::__map_.begin(); │
│ 2467 __i != __base::__map_.end(); ++__i) │
│ 2468 __buf.push_back(*__i); │
│ 2469 _VSTD::swap(__base::__map_.__first_, __buf.__first_); │
│ 2470 _VSTD::swap(__base::__map_.__begin_, __buf.__begin_); │
│ 2471 _VSTD::swap(__base::__map_.__end_, __buf.__end_); │
│ 2472 _VSTD::swap(__base::__map_.__end_cap(), __buf.__end_cap()); │
│ 2473 __base::__start_ = __base::__map_.size() == 1 ? │
│ 2474 __base::__block_size / 2 : │
│ 2475 __base::__start_ + __base::__block_size; │
│ 2476 } │
│ 2477 } │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
So in this constructor call, we first pass it the max of twice the __map_
capacity and 1. Since the capacity is 0, 1 is used. Next we pass it 0. Finally we pass it the allocator for long pointers.
So what is the constructor being called here?
(gdb) s
std::__1::__split_buffer<long*, std::__1::allocator<long*>&>::__split_buffer (this=0x7fffffffe038, __cap=1,
__start=0, __a=...) at /usr/lib/llvm-11/bin/../include/c++/v1/__split_buffer:316
┌─/usr/lib/llvm-11/bin/../include/c++/v1/__split_buffer──────────────────────────────────────────────────────────────┐
│ 313 │
│ 314 template <class _Tp, class _Allocator> │
│ 315 __split_buffer<_Tp, _Allocator>::__split_buffer(size_type __cap, size_type __start, __alloc_rr& __a) │
│ >316 : __end_cap_(nullptr, __a) │
│ 317 { │
│ 318 __first_ = __cap != 0 ? __alloc_traits::allocate(__alloc(), __cap) : nullptr; │
│ 319 __begin_ = __end_ = __first_ + __start; │
│ 320 __end_cap() = __first_ + __cap; │
│ 321 } │
│ 322 │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(gdb) print __cap
$16 = 1
(gdb) print __start
$17 = 0
(gdb) print __a
$18 = (std::__1::__split_buffer<long*, std::__1::allocator<long*>&>::__alloc_rr &) @0x7fffffffe128: {<No data fields>}
(gdb) fin
Run till exit from #0 std::__1::allocator_traits<std::__1::allocator<long*> >::allocate (__a=..., __n=1)
at /usr/lib/llvm-11/bin/../include/c++/v1/memory:1525
0x0000000000403082 in std::__1::__split_buffer<long*, std::__1::allocator<long*>&>::__split_buffer (
this=0x7fffffffe038, __cap=1, __start=0, __a=...) at /usr/lib/llvm-11/bin/../include/c++/v1/__split_buffer:318
Value returned is $21 = (long **) 0x4092a0
Let’s look at what this constructor does, since it’s quite simple.
So what does this __alloc_traits::allocate(__alloc(), __cap)
do? It simply calls __a.allocate(__n)
where __a
is the allocator returned by __alloc()
and __n
is the __cap
variable. That is, it allocates __cap
long pointers. Here __cap
is just 1, so it just allocates one pointer to long.
The function checks to see if __cap
is 0. If it is, then __first_
is set to nullptr. Otherwise it is set to the pointer returned by the allocator, i.e a pointer-to-pointer-to-long.
The __begin_
AND __end_
variables are then set to __first_
plus __start
. Remember how we just set __first_
to the pointer-to-pointer-to-long? So __first_
really is the start of __map_
, as in, it is the start of the underlying array. Now what is __start
? Well it is zero. The function call explicitly passed 0 as the argument.
Finally, __end_cap()
is set to the value of __first_
+ __cap
, which makes perfect sense. The source code of __end_cap()
is simply this: {return __end_cap_.first();}
. We can simply assume that __end_cap()
always returns a pointer that points to the end of the __map_
array.
Okay, now let’s return to __add_front_capacity
:
┌─/usr/lib/llvm-11/bin/../include/c++/v1/deque───────────────────────────────────────────────────────────────────────┐
│ 2453 else │
│ 2454 { │
│ 2455 __split_buffer<pointer, typename __base::__pointer_allocator&> │
│ 2456 __buf(max<size_type>(2 * __base::__map_.capacity(), 1), │
│ 2457 0, __base::__map_.__alloc()); │
│ 2458 │
│ 2459 typedef __allocator_destructor<_Allocator> _Dp; │
│ 2460 unique_ptr<pointer, _Dp> __hold( │
│ >2461 __alloc_traits::allocate(__a, __base::__block_size), │
│ 2462 _Dp(__a, __base::__block_size)); │
│ 2463 __buf.push_back(__hold.get()); │
│ 2464 __hold.release(); │
│ 2465 │
│ 2466 for (typename __base::__map_pointer __i = __base::__map_.begin(); │
│ 2467 __i != __base::__map_.end(); ++__i) │
│ 2468 __buf.push_back(*__i); │
│ 2469 _VSTD::swap(__base::__map_.__first_, __buf.__first_); │
│ 2470 _VSTD::swap(__base::__map_.__begin_, __buf.__begin_); │
│ 2471 _VSTD::swap(__base::__map_.__end_, __buf.__end_); │
│ 2472 _VSTD::swap(__base::__map_.__end_cap(), __buf.__end_cap()); │
│ 2473 __base::__start_ = __base::__map_.size() == 1 ? │
│ 2474 __base::__block_size / 2 : │
│ 2475 __base::__start_ + __base::__block_size; │
│ 2476 } │
│ 2477 } │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(gdb) print __a
$27 = (std::__1::deque<long, std::__1::allocator<long> >::allocator_type &) @0x7fffffffe138: {<No data fields>}
(gdb) fin
Run till exit from #0 std::__1::allocator_traits<std::__1::allocator<long> >::allocate (__a=..., __n=512)
at /usr/lib/llvm-11/bin/../include/c++/v1/memory:1525
0x0000000000402465 in std::__1::deque<long, std::__1::allocator<long> >::__add_front_capacity (this=0x7fffffffe110)
at /usr/lib/llvm-11/bin/../include/c++/v1/deque:2461
Value returned is $1 = (long *) 0x4092c0
(gdb) print __buf
$4 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x4092a0, __begin_ = 0x4092a0,
__end_ = 0x4092a0, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x4092a8}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>&, 1, false>> = {
__value_ = @0x7fffffffe128}, <No data fields>}}
(gdb) whatis __buf
type = std::__1::__split_buffer<long*, std::__1::allocator<long*>&>
There are 3 function calls in this one function call which takes two arguments, each argument being computed by a function call. So let’s look at the first argument. It is a call to allocate - what does it do? Well it simply allocates 512 longs. Recall that the STL deque is an array of pointers to arrays, each “subarray” is called a “block” and contains 512 elements. So this allocate call allocates a fresh new empty block. The _Dp is a destructor, which destroys the element. In this case it does nothing. Then it creates the __hold
unique_ptr from the block pointer and the destructor.
I noted down that the address returned by the block allocator here is 0x4092c0.
The call to __hold.get()
simply returns a pointer-to-long, address 0x4092c0. The exact same address that was just returned by the block allocator. So I think we can safely say that the __buf.push_back(__hold.get());
line just pushes the pointer to the newly allocated block to the back of the __buf
which is a __split_buffer
.
After the call to release there is a for loop. We begin iterating from __map_.begin()
, which just returns __begin_
, and we finish at __map_.end()
, which simply returns __end_
.
(gdb) print __map_.begin()
$7 = (long **) 0x0
(gdb) print __map_.end()
$8 = (long **) 0x0
As you can see, both of these are zero to begin with. So we don’t enter the for loop. But one can imagine that the for loop simply copies every element from the old split_buffer to the new one.
Next there are the four calls to swap. These calls change the __first_
, __begin_
, __end_
, and __end_cap()
variables to that of the newly allocated split_buffer, effectively replacing the old split buffer with the new one.
(gdb) print __start_
$10 = 0
Finally, we set the __start_
variable to __base::__map_.size() == 1 ? __base::__block_size / 2 : __base::__start_ + __base::__block_size;
. Now, __map_.size()
is simply __end_ - __begin_
. This gets you the number of elements stored in __map_
, i.e the number of blocks. So if there is exactly 1 block then __start_
is set to half of block size (256), otherwise it is set to __start_
plus block size (512).
Now we return to the push_front
function:
┌─/usr/lib/llvm-11/bin/../include/c++/v1/deque───────────────────────────────────────────────────────────────────────┐
│ 1935 │
│ 1936 template <class _Tp, class _Allocator> │
│ 1937 void │
│ 1938 deque<_Tp, _Allocator>::push_front(const value_type& __v) │
│ 1939 { │
│ 1940 allocator_type& __a = __base::__alloc(); │
│ 1941 if (__front_spare() == 0) │
│ 1942 __add_front_capacity(); │
│ 1943 // __front_spare() >= 1 │
│ >1944 __alloc_traits::construct(__a, _VSTD::addressof(*--__base::begin()), __v); │
│ 1945 --__base::__start_; │
│ 1946 ++__base::size(); │
│ 1947 } │
│ 1948 │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
It appears that this line first decrements the pointer returned by begin()
and then puts the new element there. What exactly does begin()
do?
│ 1138 template <class _Tp, class _Allocator> │
│ 1139 typename __deque_base<_Tp, _Allocator>::iterator │
│ 1140 __deque_base<_Tp, _Allocator>::begin() _NOEXCEPT │
│ 1141 { │
│ >1142 __map_pointer __mp = __map_.begin() + __start_ / __block_size; │
│ 1143 return iterator(__mp, __map_.empty() ? 0 : *__mp + __start_ % __block_size); │
│ 1144 }
Now, __map_.begin()
simply returns __begin_
. So the map pointer __mp
is the location of the first element in __map_
plus the __start_ / __block_size
. Here __start_
is 256 which when divided by the block size of 512 is zero.
(gdb) print __map_.begin()
$3 = (long **) 0x4092a0
(gdb) print __mp
$1 = (std::__1::__deque_base<long, std::__1::allocator<long> >::__map_pointer) 0x4092a0
(gdb) print *__mp
$4 = (long *) 0x4092c0
(gdb) print **__mp
$7 = 0
(gdb) print __start_
$2 = 256
Finally, we decrement __start_
and increment the size()
.
So, now we have run push_front for the first time. Let’s run it again.
Since we allocated a new block last time, this time it should not call __add_front_capacity()
. Let’s have a look:
┌─/usr/lib/llvm-11/bin/../include/c++/v1/deque───────────────────────────────────────────────────────────────────────┐
│ 1935 │
│ 1936 template <class _Tp, class _Allocator> │
│ 1937 void │
│ 1938 deque<_Tp, _Allocator>::push_front(const value_type& __v) │
│ 1939 { │
│ 1940 allocator_type& __a = __base::__alloc(); │
│ >1941 if (__front_spare() == 0) │
│ 1942 __add_front_capacity(); │
│ 1943 // __front_spare() >= 1 │
│ 1944 __alloc_traits::construct(__a, _VSTD::addressof(*--__base::begin()), __v); │
│ 1945 --__base::__start_; │
│ 1946 ++__base::size(); │
│ 1947 } │
│ 1948 │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(gdb) print __start_
$12 = 255
(gdb)
So now there are 255 spare places at the front of the deque. Since we just allocated a block of 512 capacity, and we put the element into the middle, that means there are 255 spare spaces left in the block. So we skip the call to __add_front_capacity
and just put the item into the block. The same happens for the next call where there are 254 spare spaces left at the front of the deque - not very interesting so we’ll skip all that.
Now let’s write some code to put 254 elements into the front of the deque so we can skip all of the uninteresting stuff:
#include <deque>
int main()
{
std::deque<long> d = {};
long mylong = 5;
for (int i = 0; i < 255; i++){
d.push_front(mylong);
}
// at this point there is still 1 spare space left at the front of the deque
d.push_front(mylong);
// now there are no spare spaces left at the front of the deque at all
d.push_front(mylong); // <- we step into the function here
d.push_front(mylong);
d.push_front(mylong);
}
So anyways we skip past 256 calls to push_front to get to where there is NO spare capacity at the front left at all:
┌─/usr/lib/llvm-11/bin/../include/c++/v1/deque───────────────────────────────────────────────────────────────────────┐
│ 1935 │
│ 1936 template <class _Tp, class _Allocator> │
│ 1937 void │
│ 1938 deque<_Tp, _Allocator>::push_front(const value_type& __v) │
│ 1939 { │
│ 1940 allocator_type& __a = __base::__alloc(); │
│ >1941 if (__front_spare() == 0) │
│ 1942 __add_front_capacity(); │
│ 1943 // __front_spare() >= 1 │
│ 1944 __alloc_traits::construct(__a, _VSTD::addressof(*--__base::begin()), __v); │
│ 1945 --__base::__start_; │
│ 1946 ++__base::size(); │
│ 1947 } │
│ 1948 │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(gdb) call __front_spare()
$2 = 0
Okay, so now condition is going to evaluate true, and we are going to see __add_front_capacity()
called for the second time, so let’s have a look:
┌─/usr/lib/llvm-11/bin/../include/c++/v1/deque───────────────────────────────────────────────────────────────────────┐
│ 2418 │
│ 2419 // Create front capacity for one block of elements. │
│ 2420 // Strong guarantee. Either do it or don't touch anything. │
│ 2421 template <class _Tp, class _Allocator> │
│ 2422 void │
│ 2423 deque<_Tp, _Allocator>::__add_front_capacity() │
│ 2424 { │
│ 2425 allocator_type& __a = __base::__alloc(); │
│ >2426 if (__back_spare() >= __base::__block_size) │
│ 2427 { │
│ 2428 __base::__start_ += __base::__block_size; │
│ 2429 pointer __pt = __base::__map_.back(); │
│ 2430 __base::__map_.pop_back(); │
│ 2431 __base::__map_.push_front(__pt); │
│ 2432 } │
│ 2433 // Else if __base::__map_.size() < __base::__map_.capacity() then we need to allocate 1 buffer │
│ 2434 else if (__base::__map_.size() < __base::__map_.capacity()) │
│ 2435 { // we can put the new buffer into the map, but don't shift things around │
│ 2436 // until all buffers are allocated. If we throw, we don't need to fix │
│ 2437 // anything up (any added buffers are undetectible) │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(gdb) print __back_spare()
$3 = 255
(gdb)
There are only 255 spare spaces at the back so we skip this block.
┌─/usr/lib/llvm-11/bin/../include/c++/v1/deque───────────────────────────────────────────────────────────────────────┐
│ 2430 __base::__map_.pop_back(); │
│ 2431 __base::__map_.push_front(__pt); │
│ 2432 } │
│ 2433 // Else if __base::__map_.size() < __base::__map_.capacity() then we need to allocate 1 buffer │
│ >2434 else if (__base::__map_.size() < __base::__map_.capacity()) │
│ 2435 { // we can put the new buffer into the map, but don't shift things around │
│ 2436 // until all buffers are allocated. If we throw, we don't need to fix │
│ 2437 // anything up (any added buffers are undetectible) │
│ 2438 if (__base::__map_.__front_spare() > 0) │
│ 2439 __base::__map_.push_front(__alloc_traits::allocate(__a, __base::__block_size)); │
│ 2440 else │
│ 2441 { │
│ 2442 __base::__map_.push_back(__alloc_traits::allocate(__a, __base::__block_size)); │
│ 2443 // Done allocating, reorder capacity │
│ 2444 pointer __pt = __base::__map_.back(); │
│ 2445 __base::__map_.pop_back(); │
│ 2446 __base::__map_.push_front(__pt); │
│ 2447 } │
│ 2448 __base::__start_ = __base::__map_.size() == 1 ? │
│ 2449 __base::__block_size / 2 : │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(gdb) print __map_.size()
$4 = 1
(gdb) print __map_.capacity()
$5 = 1
So that makes sense, since we have just allocated one block so __map_
contains only one block. So there is no spare space left in __map_
so we need to allocate another block. The source code calls a block a buffer, but they are the same thing. Since it allocates a block of __block_size
memory I will continue calling it a block.
Anyways, since the map has no more room for additional blocks, we will need to reallocate it. So we continue to the else block.
┌─/usr/lib/llvm-11/bin/../include/c++/v1/deque───────────────────────────────────────────────────────────────────────┐
│ 2450 __base::__start_ + __base::__block_size; │
│ 2451 } │
│ 2452 // Else need to allocate 1 buffer, *and* we need to reallocate __map_. │
│ 2453 else │
│ 2454 { │
│ 2455 __split_buffer<pointer, typename __base::__pointer_allocator&> │
│ >2456 __buf(max<size_type>(2 * __base::__map_.capacity(), 1), │
│ 2457 0, __base::__map_.__alloc()); │
│ 2458 │
│ 2459 typedef __allocator_destructor<_Allocator> _Dp; │
│ 2460 unique_ptr<pointer, _Dp> __hold( │
│ 2461 __alloc_traits::allocate(__a, __base::__block_size), │
│ 2462 _Dp(__a, __base::__block_size)); │
│ 2463 __buf.push_back(__hold.get()); │
│ 2464 __hold.release(); │
│ 2465 │
│ 2466 for (typename __base::__map_pointer __i = __base::__map_.begin(); │
│ 2467 __i != __base::__map_.end(); ++__i) │
│ 2468 __buf.push_back(*__i); │
│ 2469 _VSTD::swap(__base::__map_.__first_, __buf.__first_); │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(gdb) call __map_.capacity()
$3 = 1
Max of 2 and 1 is obviously 2. From this point on, 2 * __map_.capacity()
will always be bigger than 1, so every time this else branch is taken, a new split buffer will be allocated which has exactly twice the capacity of the previous one. Let’s review the ctor code:
│ 314 template <class _Tp, class _Allocator> │
│ 315 __split_buffer<_Tp, _Allocator>::__split_buffer(size_type __cap, size_type __start, __alloc_rr& __a) │
│ 316 : __end_cap_(nullptr, __a) │
│ 317 { │
│ 318 __first_ = __cap != 0 ? __alloc_traits::allocate(__alloc(), __cap) : nullptr; │
│ 319 __begin_ = __end_ = __first_ + __start; │
│ >320 __end_cap() = __first_ + __cap; │
│ 321 }
As you can see, it allocates a new split buffer with capacity equal to __cap
. It also sets __begin_
and __end_
equal to __first_ + __start
. Now, __first_
is the pointer pointing to the start of the just-allocated split buffer, and __start
is the parameter passed in. In the context of the else block, the second argument is ALWAYS zero, which means __start
is always zero which means both the __begin_
and __end_
are ALWAYS initialized to the value of __first_
, that is, the start of the split buffer.
Then, as previously discussed, it allocates a block and pushes the pointer to that block to the end of the new split buffer. Let’s visit that push_back
function:
┌─/usr/lib/llvm-11/bin/../include/c++/v1/__split_buffer──────────────────────────────────────────────────────────────┐
│ 568 template <class _Tp, class _Allocator> │
│ 569 void │
│ 570 __split_buffer<_Tp, _Allocator>::push_back(value_type&& __x) │
│ 571 { │
│ >572 if (__end_ == __end_cap()) │
│ 573 { │
│ 574 if (__begin_ > __first_) │
│ 575 { │
│ 576 difference_type __d = __begin_ - __first_; │
│ 577 __d = (__d + 1) / 2; │
│ 578 __end_ = _VSTD::move(__begin_, __end_, __begin_ - __d); │
│ 579 __begin_ -= __d; │
│ 580 } │
│ 581 else │
│ 582 { │
│ 583 size_type __c = max<size_type>(2 * static_cast<size_t>(__end_cap() - __first_), 1); │
│ 584 __split_buffer<value_type, __alloc_rr&> __t(__c, __c / 4, __alloc()); │
│ 585 __t.__construct_at_end(move_iterator<pointer>(__begin_), │
│ 586 move_iterator<pointer>(__end_)); │
│ 587 _VSTD::swap(__first_, __t.__first_); │
│ 588 _VSTD::swap(__begin_, __t.__begin_); │
│ 589 _VSTD::swap(__end_, __t.__end_); │
│ 590 _VSTD::swap(__end_cap(), __t.__end_cap()); │
│ 591 } │
│ 592 } │
│ 593 __alloc_traits::construct(__alloc(), _VSTD::__to_address(__end_), │
│ 594 _VSTD::move(__x)); │
│ 595 ++__end_; │
│ 596 } │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(gdb) print __end_
$2 = (std::__1::__split_buffer<long*, std::__1::allocator<long*>&>::pointer) 0x40a2d0
(gdb) print __end_cap()
$3 = (std::__1::__split_buffer<long*, std::__1::allocator<long*>&>::pointer &) @0x7fffffffe040: 0x40a2e0
So the push_back function simply puts the element at the location pointed to by the __end_
pointer, and then the __end_
pointer is incremented. We can check this by looking at the value of the __end_
pointer before and after the push_back
call:
...before the push_back call...
(gdb) print __buf.__first_
$1 = (std::__1::__split_buffer<long*, std::__1::allocator<long*>&>::pointer) 0x40a2d0
(gdb) print __buf.__begin_
$2 = (std::__1::__split_buffer<long*, std::__1::allocator<long*>&>::pointer) 0x40a2d0
(gdb) print __buf.__end_
$3 = (std::__1::__split_buffer<long*, std::__1::allocator<long*>&>::pointer) 0x40a2d0
(gdb) call __buf.__end_cap()
$5 = (std::__1::__split_buffer<long*, std::__1::allocator<long*>&>::pointer &) @0x7fffffffe040: 0x40a2e0
...after the push_back call...
(gdb) print __buf.__first_
$6 = (std::__1::__split_buffer<long*, std::__1::allocator<long*>&>::pointer) 0x40a2d0
(gdb) print __buf.__begin_
$7 = (std::__1::__split_buffer<long*, std::__1::allocator<long*>&>::pointer) 0x40a2d0
(gdb) print __buf.__end_
$8 = (std::__1::__split_buffer<long*, std::__1::allocator<long*>&>::pointer) 0x40a2d8
(gdb) call __buf.__end_cap()
$9 = (std::__1::__split_buffer<long*, std::__1::allocator<long*>&>::pointer &) @0x7fffffffe040: 0x40a2e0
(gdb) print __map_.__first_
$11 = (std::__1::__split_buffer<long*, std::__1::allocator<long*> >::pointer) 0x4092a0
(gdb) print __map_.__begin_
$12 = (std::__1::__split_buffer<long*, std::__1::allocator<long*> >::pointer) 0x4092a0
(gdb) print __map_.__end_
$13 = (std::__1::__split_buffer<long*, std::__1::allocator<long*> >::pointer) 0x4092a8
(gdb) call __map_.__end_cap()
$15 = (std::__1::__split_buffer<long*, std::__1::allocator<long*> >::pointer &) @0x7fffffffe120: 0x4092a8
...after the for loop, which runs only one iteration...
(gdb) print __buf.__first_
$16 = (std::__1::__split_buffer<long*, std::__1::allocator<long*>&>::pointer) 0x40a2d0
(gdb) print __buf.__begin_
$17 = (std::__1::__split_buffer<long*, std::__1::allocator<long*>&>::pointer) 0x40a2d0
(gdb) print __buf.__end_
$18 = (std::__1::__split_buffer<long*, std::__1::allocator<long*>&>::pointer) 0x40a2e0
(gdb) call __buf.__end_cap()
$19 = (std::__1::__split_buffer<long*, std::__1::allocator<long*>&>::pointer &) @0x7fffffffe040: 0x40a2e0
(gdb) print __map_.__first_
$21 = (std::__1::__split_buffer<long*, std::__1::allocator<long*> >::pointer) 0x4092a0
(gdb) print __map_.__begin_
$22 = (std::__1::__split_buffer<long*, std::__1::allocator<long*> >::pointer) 0x4092a0
(gdb) print __map_.__end_
$23 = (std::__1::__split_buffer<long*, std::__1::allocator<long*> >::pointer) 0x4092a8
(gdb) call __map_.__end_cap()
$24 = (std::__1::__split_buffer<long*, std::__1::allocator<long*> >::pointer &) @0x7fffffffe120: 0x4092a8
...after the swaps...
(gdb) print __map_.__first_
$25 = (std::__1::__split_buffer<long*, std::__1::allocator<long*> >::pointer) 0x40a2d0
(gdb) print __map_.__begin_
$26 = (std::__1::__split_buffer<long*, std::__1::allocator<long*> >::pointer) 0x40a2d0
(gdb) print __map_.__end_
$27 = (std::__1::__split_buffer<long*, std::__1::allocator<long*> >::pointer) 0x40a2e0
(gdb) call __map_.__end_cap()
$28 = (std::__1::__split_buffer<long*, std::__1::allocator<long*> >::pointer &) @0x7fffffffe120: 0x40a2e0
Obviously the __map_
pointers don’t change before and after the push_back and the for loop. We can also see that for __buf
the __first_
, __begin_
, and __end_cap()
pointers don’t change either. Only the __end_
pointer changes. This makes sense, since the only thing that the push_back
function changes is the __end_
pointer - and intuitively it makes sense too - pushing an element onto an array is not going to change anything other than the location of the last element of the array.
As you can see, before the call to push_back, the values of __first_
, __begin_
, and __end
in the __buf
split buffer are all identical. The value of __end_cap
however is different - it is greater than the other variables. Actually, it is exactly 16 bytes greater than the other variables, indicating that __buf
has a capacity of 16 bytes, as opposed to __map_
which only has 8 bytes.
After the call to push_back, the values of __first_
and __begin_
are unchanged but the value of __end_
has been incremented by 8 (it’s 8 presumably because a pointer is 8 bytes long). This shows that the __end_
variable points to the end of the contents in the split buffer, rather than the actual end of the array, which is pointed to by the __end_cap_
variable.
This also proves that the split buffer named __buf
was empty prior to the call to push_back
(since __begin_
and __end_
pointed to the same location) and contains only one element after the call. Since the call happens immediately after __buf
was allocated, that means that this will ALWAYS happen. An empty split buffer will be created and then a new block is allocated and then a pointer to that block is pushed onto the split buffer from the left, that is to say, the VERY FIRST space in the split buffer will be occupied by that pointer-to-the-newly-allocated-block.
After the pointer is pushed onto the new split buffer, THEN the original contents of the old split buffer are copied to the new split buffer. That is to say, in this particular situation it looks like this:
__map_ [old1]
__buf [new1, old1]
In the general case it looks like this (because push_back starts at the front):
old split_buffer: [old1, old2, old3, old4]
new split_buffer: [new1, old1, old2, old3, old4, ___, ___, ___]
After the swaps, the pointers of __map_
are swapped with those of the __buf
.
Finally, the __start_
variable is updated with the addition of the __block_size. Since __start_
was 0 to begin with, after this line, __start_
will have a value of 512.
After that these lines are executed:
│ 1943 // __front_spare() >= 1 │
│ >1944 __alloc_traits::construct(__a, _VSTD::addressof(*--__base::begin()), __v); │
│ 1945 --__base::__start_; │
│ 1946 ++__base::size(); │
│ 1947 } │
Note that the __base::begin()
function returns the location of where the actual first element in the deque is. In this case, it is in the block pointed to by the second pointer in the map, that is, the pointer residing at the address 0x40a2d8
.
Here is some more context:
(gdb) print (long*)*0x40a2d8
$6 = (long *) 0x4092c0
(gdb) print (long*)*0x40a2d0
$7 = (long *) 0x40a2f0
(gdb) fin
Run till exit from #0 std::__1::addressof<long> (__x=@0x40b2e8: 0)
at /usr/lib/llvm-11/bin/../include/c++/v1/type_traits:607
0x0000000000401358 in std::__1::deque<long, std::__1::allocator<long> >::push_front (this=0x7fffffffe108,
__v=@0x7fffffffe100: 5) at /usr/lib/llvm-11/bin/../include/c++/v1/deque:1944
Value returned is $11 = (long *) 0x40b2e8
Now, there is a lot of work being done by the decrement operator --
here, so let’s examine it:
┌─/usr/lib/llvm-11/bin/../include/c++/v1/deque───────────────────────────────────────────────────────────────────────┐
│ 339 │
│ 340 _LIBCPP_INLINE_VISIBILITY __deque_iterator& operator--() │
│ 341 { │
│ >342 if (__ptr_ == *__m_iter_) │
│ 343 { │
│ 344 --__m_iter_; │
│ 345 __ptr_ = *__m_iter_ + __block_size; │
│ 346 } │
│ 347 --__ptr_; │
│ 348 return *this; │
│ 349 } │
│ 350 │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: std::__1::__deque_iterator<long, long*, long&, long**, long, 5* L342 PC: 0x40276c
(gdb) s
std::__1::__deque_iterator<long, long*, long&, long**, long, 512l>::operator-- (this=0x7fffffffe0a8)
at /usr/lib/llvm-11/bin/../include/c++/v1/deque:342
(gdb) print __ptr_
$2 = (std::__1::__deque_iterator<long, long*, long&, long**, long, 512>::pointer) 0x4092c0
(gdb) print *__m_iter_
$3 = (long *) 0x4092c0
(gdb) print __m_iter_
$4 = (std::__1::__deque_iterator<long, long*, long&, long**, long, 512>::__map_iterator) 0x40a2d8
(gdb) print *(__m_iter_ - 1)
$5 = (long *) 0x40a2f0
(gdb) print (long *)(0x40a2f0 + 0x200)
$7 = (long *) 0x40a4f0
The block_size here is 4096. We can check this by looking at the value of ptr before and after the line is executed:
┌─/usr/lib/llvm-11/bin/../include/c++/v1/deque───────────────────────────────────────────────────────────────────────┐
│ 341 { │
│ 342 if (__ptr_ == *__m_iter_) │
│ 343 { │
│ 344 --__m_iter_; │
│ >345 __ptr_ = *__m_iter_ + __block_size; │
│ 346 } │
│ 347 --__ptr_; │
│ 348 return *this; │
│ 349 } │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: std::__1::__deque_iterator<long, long*, long&, long**, long, 5* L347 PC: 0x4027a6
(gdb) print *__m_iter_
$3 = (long *) 0x40a2f0
(gdb) n
(gdb) print __ptr_
$3 = (std::__1::__deque_iterator<long, long*, long&, long**, long, 512>::pointer) 0x40b2f0
(gdb) n
(gdb) print __ptr_
$4 = (std::__1::__deque_iterator<long, long*, long&, long**, long, 512>::pointer) 0x40b2e8
So here we found the location of the block pointed to by the first pointer in the split buffer __map_
. Then we added 4096 to it. Finally we decremented it by 1. This means that the pointer should now point to the last position in the first block of the deque.
In general, this decrement function decrements the pointer __ptr_
from the first position in a block to the last position of the previous block. That is, intuitively, what we would expect the decrement function to do. Basically:
[___, ___, ... , ___] [___, ___, ... , ___]
^ __ptr_
goes to
[___, ___, ... , ___] [___, ___, ... , ___]
^ __ptr_
Except the arrays are NOT next to each other in memory as the diagram above suggests. They are logically next to each other since they are pointed to by consecutive pointers in the __map_
split buffer but they are allocated separately and therefore do not necessarily reside next to each other in memory.
Anyways, now we know that the new element is placed at the end of the first block, which is the address 0x40b2e8.
Remember this memory address for the next call to push_front
.
Then, __start_
is decremented from 512 to 511. This means the number of spare spaces in the front of the deque has been decremented by that much, which is also the amount of spare spaces in the block at the front of the deque (conceptually).
Now let’s move on to the next call to push_front
. The address where the next element is pushed is 0x40b2e0. Notice how this is just 8 less than the address where the previous variable was pushed. __start_
is also decremented to 510. So this is the “normal” case - most of the time, all that push_front
does is simply fill up the first block from the back to the front. So in that sense it does the opposite of what push_back
does.
Okay, let’s skip ahead by 510 elements so we get to the next situation where the first block is full. To do that, let’s edit our code to make things easier to read.
#include <deque>
long mylong = 13370000;
void my_pf(std::deque<long>& d){
d.push_front(mylong);
++mylong;
}
void pushn(std::deque<long>& d, int n){
for (int i = 0; i < n; i++){
my_pf(d);
}
}
int main()
{
std::deque<long> d = {};
// zero blocks
pushn(d, 256);
// one block
my_pf(d);
// two blocks
pushn(d, 511);
// still two blocks
my_pf(d);
// three blocks
my_pf(d);
}
So now let’s see what happens when there are two blocks and they are both full:
┌─/usr/lib/llvm-11/bin/../include/c++/v1/deque───────────────────────────────────────────────────────────────────────┐
│ 2418 │
│ 2419 // Create front capacity for one block of elements. │
│ 2420 // Strong guarantee. Either do it or don't touch anything. │
│ 2421 template <class _Tp, class _Allocator> │
│ 2422 void │
│ 2423 deque<_Tp, _Allocator>::__add_front_capacity() │
│ 2424 { │
│ >2425 allocator_type& __a = __base::__alloc(); │
│ 2426 if (__back_spare() >= __base::__block_size) │
│ 2427 { │
│ 2428 __base::__start_ += __base::__block_size; │
│ 2429 pointer __pt = __base::__map_.back(); │
│ 2430 __base::__map_.pop_back(); │
│ 2431 __base::__map_.push_front(__pt); │
│ 2432 } │
│ 2433 // Else if __base::__map_.size() < __base::__map_.capacity() then we need to allocate 1 buffer │
│ 2434 else if (__base::__map_.size() < __base::__map_.capacity()) │
│ 2435 { // we can put the new buffer into the map, but don't shift things around │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(gdb) print __back_spare()
$3 = 255
...
┌─/usr/lib/llvm-11/bin/../include/c++/v1/deque─────────────────────────────────────────────────────────────────────────────┐
│ 2425 allocator_type& __a = __base::__alloc(); │
│ 2426 if (__back_spare() >= __base::__block_size) │
│ 2427 { │
│ 2428 __base::__start_ += __base::__block_size; │
│ 2429 pointer __pt = __base::__map_.back(); │
│ 2430 __base::__map_.pop_back(); │
│ 2431 __base::__map_.push_front(__pt); │
│ 2432 } │
│ 2433 // Else if __base::__map_.size() < __base::__map_.capacity() then we need to allocate 1 buffer │
│ >2434 else if (__base::__map_.size() < __base::__map_.capacity()) │
│ 2435 { // we can put the new buffer into the map, but don't shift things around │
│ 2436 // until all buffers are allocated. If we throw, we don't need to fix │
│ 2437 // anything up (any added buffers are undetectible) │
│ 2438 if (__base::__map_.__front_spare() > 0) │
│ 2439 __base::__map_.push_front(__alloc_traits::allocate(__a, __base::__block_size)); │
│ 2440 else │
│ 2441 { │
│ 2442 __base::__map_.push_back(__alloc_traits::allocate(__a, __base::__block_size)); │
│ 2443 // Done allocating, reorder capacity │
│ 2444 pointer __pt = __base::__map_.back(); │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: std::__1::deque<long, std::__1::allocator<long> >::__add_front_capac* L2434 PC: 0x402174
(gdb) print __map_.size()
$3 = 2
(gdb) print __map_.capacity()
$4 = 2
Since the __map_
is full and there are no empty blocks at the back, we drop to the else block:
As before, we allocate a new split buffer of twice the current one’s capacity. Since the current capacity is 2, that means we allocate a new split buffer of capacity 4. Next we allocate a new empty block and push a pointer to it onto the new empty split buffer.
Here are the memory addresses of the newly allocated split buffer:
(gdb) print __buf.__begin_
$8 = (std::__1::__split_buffer<long*, std::__1::allocator<long*>&>::pointer) 0x40b300
(gdb) print __buf.__first_
$9 = (std::__1::__split_buffer<long*, std::__1::allocator<long*>&>::pointer) 0x40b300
(gdb) print __buf.__end_
$10 = (std::__1::__split_buffer<long*, std::__1::allocator<long*>&>::pointer) 0x40b308
(gdb) print __buf.__end_cap()
$11 = (std::__1::__split_buffer<long*, std::__1::allocator<long*>&>::pointer &) @0x7fffffffe050: 0x40b320
As you can see, the new buffer is 0x20 bytes which is 32 bytes long (subtract __end_cap
from __begin_
), which can hold 4 8-byte pointers, as expected.
Next, as before, the contents of the previous split buffer are pushed onto the new split buffer in order, like this:
old: [old1, old2]
new: [new1, ____, ____, ___]
new: [new1, old1, ____, ___]
new: [new1, old1, old2, ___]
And we can check by looking at the contents of __buf
after every call to push_back
:
┌─/usr/lib/llvm-11/bin/../include/c++/v1/deque─────────────────────────────────────────────────────────────────────────────┐
│ 2455 __split_buffer<pointer, typename __base::__pointer_allocator&> │
│ 2456 __buf(max<size_type>(2 * __base::__map_.capacity(), 1), │
│ 2457 0, __base::__map_.__alloc()); │
│ 2458 │
│ 2459 typedef __allocator_destructor<_Allocator> _Dp; │
│ 2460 unique_ptr<pointer, _Dp> __hold( │
│ 2461 __alloc_traits::allocate(__a, __base::__block_size), │
│ 2462 _Dp(__a, __base::__block_size)); │
│ >2463 __buf.push_back(__hold.get()); │
│ 2464 __hold.release(); │
│ 2465 │
│ 2466 for (typename __base::__map_pointer __i = __base::__map_.begin(); │
│ 2467 __i != __base::__map_.end(); ++__i) │
│ 2468 __buf.push_back(*__i); │
│ 2469 _VSTD::swap(__base::__map_.__first_, __buf.__first_); │
│ 2470 _VSTD::swap(__base::__map_.__begin_, __buf.__begin_); │
│ 2471 _VSTD::swap(__base::__map_.__end_, __buf.__end_); │
│ 2472 _VSTD::swap(__base::__map_.__end_cap(), __buf.__end_cap()); │
│ 2473 __base::__start_ = __base::__map_.size() == 1 ? │
│ 2474 __base::__block_size / 2 : │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(gdb) print __buf
$1 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x40b300, __begin_ = 0x40b300,
__end_ = 0x40b300, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x40b320}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>&, 1, false>> = {
__value_ = @0x7fffffffe128}, <No data fields>}}
(gdb) print *(__buf.__first_)
$2 = (long *) 0x0
(gdb) n
(gdb) print __buf
$4 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x40b300, __begin_ = 0x40b300,
__end_ = 0x40b308, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x40b320}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>&, 1, false>> = {
__value_ = @0x7fffffffe128}, <No data fields>}}
(gdb) print *(__buf.__first_)
$5 = (long *) 0x40b330
(gdb) print *(0x40b330)
$6 = 0
(gdb) print __buf.__first_+1
$7 = (long **) 0x40b308
(gdb) print *(__buf.__first_+1)
$8 = (long *) 0x0
As you can see, at this point we have just pushed an empty block onto __buf
, so that __buf
now looks like this:
[new1, ____, ____, ___]
This new block is empty, which is why when we print out the first value from it, we just get zero.
(gdb) n
(gdb) n
(gdb) n
(gdb) n
┌─/usr/lib/llvm-11/bin/../include/c++/v1/deque─────────────────────────────────────────────────────────────────────────────┐
│ 2457 0, __base::__map_.__alloc()); │
│ 2458 │
│ 2459 typedef __allocator_destructor<_Allocator> _Dp; │
│ 2460 unique_ptr<pointer, _Dp> __hold( │
│ 2461 __alloc_traits::allocate(__a, __base::__block_size), │
│ 2462 _Dp(__a, __base::__block_size)); │
│ 2463 __buf.push_back(__hold.get()); │
│ 2464 __hold.release(); │
│ 2465 │
│ 2466 for (typename __base::__map_pointer __i = __base::__map_.begin(); │
│ 2467 __i != __base::__map_.end(); ++__i) │
│ >2468 __buf.push_back(*__i); │
│ 2469 _VSTD::swap(__base::__map_.__first_, __buf.__first_); │
│ 2470 _VSTD::swap(__base::__map_.__begin_, __buf.__begin_); │
│ 2471 _VSTD::swap(__base::__map_.__end_, __buf.__end_); │
│ 2472 _VSTD::swap(__base::__map_.__end_cap(), __buf.__end_cap()); │
│ 2473 __base::__start_ = __base::__map_.size() == 1 ? │
│ 2474 __base::__block_size / 2 : │
│ 2475 __base::__start_ + __base::__block_size; │
│ 2476 } │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: std::__1::deque<long, std::__1::allocator<long> >::__add_front_capac* L2468 PC: 0x40230c
(gdb) print __buf
$10 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x40b300, __begin_ = 0x40b300,
__end_ = 0x40b310, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x40b320}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>&, 1, false>> = {
__value_ = @0x7fffffffe128}, <No data fields>}}
(gdb) print *(__buf.__first_)
$11 = (long *) 0x40b330
(gdb) print *(0x40b330)
$13 = 0
(gdb) print *(__buf.__first_+1)
$14 = (long *) 0x40a2f0
As you can see, the first element in __buf
has not changed, but the second element has - it was previously 0 and now it is 0x40a2f0
. The push_back
call, as expected, does not modify the first element of the split buffer but simply appends an element to the back. At this point, __buf
should look like this:
[new1, old1, ____, ___]
We can check this by inspecting the value held at 0x40a2f0
:
(gdb) print *(0x40a2f0)
$3 = 13370767
(gdb) print *(0x40a2f0+1)
$4 = 52229
(gdb) print *(0x40a2f0+7)
$5 = -872051200
(gdb) print *(0x40a2f0+8)
$6 = 13370766
(gdb) print *(0x40a2f0+16)
$7 = 13370765
(gdb) print *(0x40a2f0+2)
$8 = 204
(gdb) print *(0x40a2f0+4)
$9 = 0
(gdb) print *(0x40a2f0+4096)
$10 = 0
(gdb) print *(0x40a2f0+4096-8)
$11 = 13370256
(gdb) print *(0x40a2f0+4096-16)
$12 = 13370257
As you can see, the blocks fill up from the back to the front. We skip through the block by jumping 8 bytes at a time, since each element in the block is 8 bytes in size (as you can see from the gdb trace).
The last element in the block has the value 13370256 and the first element has the value 13370767. So altogether there are 512 elements in the block as expected, and as the gdb trace shows, the last element has the lowest value, meaning it was added first, and the first element has the highest value, meaning it was added last, which means the block was filled up from the back to the front, like this:
[511, 510, 509, ..., 2, 1, 0]
Continuing on:
(gdb) print *(__buf.__first_)
$13 = (long *) 0x40b330
(gdb) print *(__buf.__first_+1)
$14 = (long *) 0x40a2f0
(gdb) print *(__buf.__first_+2)
$15 = (long *) 0x0
(gdb) n
(gdb) n
(gdb) n
(gdb) n
(gdb) print *(__buf.__first_)
$16 = (long *) 0x40b330
(gdb) print *(__buf.__first_+1)
$17 = (long *) 0x40a2f0
(gdb) print *(__buf.__first_+2)
$18 = (long *) 0x4092c0
(gdb) print *(__buf.__first_+3)
$19 = (long *) 0x0
Here we have stepped through another iteration of the for loop when pushes elements onto the __buf
split buffer. Now we can clearly see that a new element has been added to the back, so that __buf
now looks like this:
[new1, old1, old2, ___]
Let’s have a look at what is inside 0x4092c0
:
(gdb) print *(0x4092c0)
$20 = 13370255
(gdb) print *(0x4092c0+8)
$21 = 13370254
(gdb) print *(0x4092c0+16)
$22 = 13370253
(gdb) print *(0x4092c0+4096)
$23 = 0
(gdb) print *(0x4092c0+4096-8)
$24 = 0
(gdb) print *(0x4092c0+2048)
$30 = 0
(gdb) print *(0x4092c0+2048-8)
$31 = 13370000
(gdb) print *(0x4092c0+2048-16)
$32 = 13370001
Only the addresses from 0x4092c0 to 0x4092c0+2048-8 have been filled, which is 256 values in total: the block contains precisely the values from 13370000 to 13370255, and these elements are stored from position 255 to position 0, like this:
[255, 254, 253, ..., 2 , 1 , 0 , ...]
0 1 2 253 254 255
So this block was filled from position 255, back to front (since we used the push_front
function to fill it).
So far we have only filled two blocks (we have pushed 768 elements into the deque prior to this point. Currently we are pushing the 769th element, so we will be pushing it to the third block) so there are no more blocks left to copy from the old split buffer to the new one.
Next we swap pointers (__first_
, __begin_
, __end_
, and __buf.__end_cap()
) from the newly allocated __buf
to the __map_
which is the old split buffer, as usual.
Finally we change the __base::__start_
pointer like this:
(gdb) print __start_
$34 = 0
(gdb) n
(gdb) print __start_
$35 = 512
(gdb)
As you can see, the __start_
pointer is just an index (here, it is a value from 0 to 512) which points to the last empty position in the current block.
Recall from earlier that at this point, the block at the front of the split buffer has the address 0x40b330
. We fill up the blocks from the back to the front, and this block is currently empty, so we should add 4096 to this address in order to get the address where the newly pushed element will be stored.
┌─/usr/lib/llvm-11/bin/../include/c++/v1/deque─────────────────────────────────────────────────────────────────────────────┐
│ 1139 typename __deque_base<_Tp, _Allocator>::iterator │
│ 1140 __deque_base<_Tp, _Allocator>::begin() _NOEXCEPT │
│ 1141 { │
│ 1142 __map_pointer __mp = __map_.begin() + __start_ / __block_size; │
│ >1143 return iterator(__mp, __map_.empty() ? 0 : *__mp + __start_ % __block_size); │
│ 1144 } │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: std::__1::__deque_base<long, std::__1::allocator<long> >::begin L1143 PC: 0x4017d0
(gdb) print __map_.begin()
$36 = (long **) 0x40b300
(gdb) print __start_ / __block_size
value has been optimized out
(gdb) print __mp
$37 = (std::__1::__deque_base<long, std::__1::allocator<long> >::__map_pointer) 0x40b308
(gdb) print *__mp
$39 = (long *) 0x40a2f0
As we noted earlier, 0x40a2f0
is the address of the second block in the split buffer. This block is already filled with the values from 13370256 to 13370767 and therefore should not be used. Instead we should be putting the new value in the newly allocated block 0x40b330
which is currently empty. The --
operator here does a job, it gets the address of the block that is before the one that the begin()
function returns.
┌─/usr/lib/llvm-11/bin/../include/c++/v1/deque─────────────────────────────────────────────────────────────────────────────┐
│ 1935 │
│ 1936 template <class _Tp, class _Allocator> │
│ 1937 void │
│ 1938 deque<_Tp, _Allocator>::push_front(const value_type& __v) │
│ 1939 { │
│ 1940 allocator_type& __a = __base::__alloc(); │
│ 1941 if (__front_spare() == 0) │
│ 1942 __add_front_capacity(); │
│ 1943 // __front_spare() >= 1 │
│ >1944 __alloc_traits::construct(__a, _VSTD::addressof(*--__base::begin()), __v); │
│ 1945 --__base::__start_; │
│ 1946 ++__base::size(); │
│ 1947 } │
│ 1948 │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: std::__1::deque<long, std::__1::allocator<long> >::push_front L1944 PC: 0x401343
std::__1::deque<long, std::__1::allocator<long> >::push_front (this=0x7fffffffe110, __v=@0x408088: 13370768)
at /usr/lib/llvm-11/bin/../include/c++/v1/deque:1940
(gdb) print *(0x40b330)
$1 = 0
(gdb) print *(0x40b330+4096-8)
$2 = 0
(gdb) print *(0x40b330+4096-16)
$3 = 0
(gdb) n
┌─/usr/lib/llvm-11/bin/../include/c++/v1/deque─────────────────────────────────────────────────────────────────────────────┐
│ 1936 template <class _Tp, class _Allocator> │
│ 1937 void │
│ 1938 deque<_Tp, _Allocator>::push_front(const value_type& __v) │
│ 1939 { │
│ 1940 allocator_type& __a = __base::__alloc(); │
│ 1941 if (__front_spare() == 0) │
│ 1942 __add_front_capacity(); │
│ 1943 // __front_spare() >= 1 │
│ 1944 __alloc_traits::construct(__a, _VSTD::addressof(*--__base::begin()), __v); │
│ >1945 --__base::__start_; │
│ 1946 ++__base::size(); │
│ 1947 } │
│ 1948 │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: std::__1::deque<long, std::__1::allocator<long> >::push_front L1945 PC: 0x40138c
(gdb) print __v
$40 = (const std::__1::deque<long, std::__1::allocator<long> >::value_type &) @0x408088: 13370768
(gdb) print *(0x40b330)
$42 = 0
(gdb) print *(0x40b330+4096-8)
$43 = 13370768
(gdb) print *(0x40b330+4096-16)
$44 = 0
As expected, the new element 13370768
is placed at the very end of the newly allocated block 0x40b330
. Then the __start_
pointer is decremented:
(gdb) print __start_
$4 = 512
(gdb) n
(gdb) print __start_
$5 = 511
Since we have just placed an element into the block, the __start_
pointer is moved so that the next element that is inserted via push_front
is placed in front of the current element.
We have reached the end of our program, so let’s add some more lines so that we can see the behavior when more blocks are added:
#include <deque>
long mylong = 13370000;
void my_pf(std::deque<long>& d){
d.push_front(mylong);
++mylong;
}
void pushn(std::deque<long>& d, int n){
for (int i = 0; i < n; i++){
my_pf(d);
}
}
int main()
{
std::deque<long> d = {};
// zero blocks
pushn(d, 256);
// one block
my_pf(d);
// two blocks
pushn(d, 511);
// still two blocks
my_pf(d);
// three blocks
pushn(d, 511);
// still three blocks
my_pf(d); // 4th block is added
pushn(d, 511);
my_pf(d); // 5th block is added
pushn(d, 511);
my_pf(d); // 6th block is added
pushn(d, 511);
my_pf(d); // 7th block is added
pushn(d, 511);
my_pf(d); // 8th block is added
pushn(d, 511);
my_pf(d); // 9th block is added
pushn(d, 511);
my_pf(d); // 10th block is added
pushn(d, 511);
my_pf(d); // 11th block is added
pushn(d, 511);
my_pf(d); // 12th block is added
pushn(d, 511);
my_pf(d); // 13th block is added
pushn(d, 511);
my_pf(d); // 14th block is added
pushn(d, 511);
my_pf(d); // 15th block is added
pushn(d, 511);
my_pf(d); // 16th block is added
pushn(d, 511);
my_pf(d); // 17th block is added
}
So now we know the behavior up to 3 blocks. Let’s skip ahead to where the 4th block is added
┌─e.cpp────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ 29 pushn(d, 511); │
│ 30 // still three blocks │
│B+>31 my_pf(d); // 4th block is added │
│ 32 pushn(d, 511); │
│ 33 my_pf(d); // 5th block is added │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: main L31 PC: 0x4012d1
(gdb) s
my_pf (d=...) at e.cpp:7
(gdb) s
std::__1::deque<long, std::__1::allocator<long> >::push_front (this=0x7fffffffe110, __v=@0x408088: 13371280)
at /usr/lib/llvm-11/bin/../include/c++/v1/deque:1940
(gdb) n
(gdb) n
(gdb) s
std::__1::deque<long, std::__1::allocator<long> >::__add_front_capacity (this=0x7fffffffe110)
at /usr/lib/llvm-11/bin/../include/c++/v1/deque:2425
┌─/usr/lib/llvm-11/bin/../include/c++/v1/deque─────────────────────────────────────────────────────────────────────────────┐
│ 2425 allocator_type& __a = __base::__alloc(); │
│ 2426 if (__back_spare() >= __base::__block_size) │
│ 2427 { │
│ 2428 __base::__start_ += __base::__block_size; │
│ 2429 pointer __pt = __base::__map_.back(); │
│ 2430 __base::__map_.pop_back(); │
│ 2431 __base::__map_.push_front(__pt); │
│ 2432 } │
│ 2433 // Else if __base::__map_.size() < __base::__map_.capacity() then we need to allocate 1 buffer │
│ >2434 else if (__base::__map_.size() < __base::__map_.capacity()) │
│ 2435 { // we can put the new buffer into the map, but don't shift things around │
│ 2436 // until all buffers are allocated. If we throw, we don't need to fix │
│ 2437 // anything up (any added buffers are undetectible) │
│ 2438 if (__base::__map_.__front_spare() > 0) │
│ 2439 __base::__map_.push_front(__alloc_traits::allocate(__a, __base::__block_size)); │
│ 2440 else │
│ 2441 { │
│ 2442 __base::__map_.push_back(__alloc_traits::allocate(__a, __base::__block_size)); │
│ 2443 // Done allocating, reorder capacity │
│ 2444 pointer __pt = __base::__map_.back(); │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: std::__1::deque<long, std::__1::allocator<long> >::__add_front_capac* L2434 PC: 0x402174
(gdb) print __map_.size()
$2 = 3
(gdb) print __map_.capacity()
$3 = 4
Notice that for the first time ever, we are in a situation where the map size is smaller than the map’s capacity (in the context of adding a new block). In the previous two situations, the map size and capacity were both the same:
- When the first block was added, both size and capacity were 0
- When the second block was added, both size and capacity were 1
- When the third block was added, both size and capacity were 2
Going from 0 to 1, then from 1 to 2, the capacity only increased by 1 each time, and each time the newly added capacity was immediately used up by the newly added block.
However, when we added a new block when the map was of size 2, the map capacity was doubled from 2 to 4, which meant that capacity was increased by 2 but only one new block was added (the third block, which was added at the front of the map), so now there is one empty slot at the back of the map. So now when we add the 4th block, there is one empty slot in the map where it can be added, but it’s at the back. So we drop into this else if
block:
┌─/usr/lib/llvm-11/bin/../include/c++/v1/deque─────────────────────────────────────────────────────────────────────────────┐
│ 2434 else if (__base::__map_.size() < __base::__map_.capacity()) │
│ 2435 { // we can put the new buffer into the map, but don't shift things around │
│ 2436 // until all buffers are allocated. If we throw, we don't need to fix │
│ 2437 // anything up (any added buffers are undetectible) │
│ >2438 if (__base::__map_.__front_spare() > 0) │
│ 2439 __base::__map_.push_front(__alloc_traits::allocate(__a, __base::__block_size)); │
│ 2440 else │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(gdb) print __map_.__front_spare()
$5 = 0
As we know, the map is currently filled at the front. There is only one empty slot, and it’s at the back. So we drop to the else block:
┌─/usr/lib/llvm-11/bin/../include/c++/v1/deque─────────────────────────────────────────────────────────────────────────────┐
│ 2440 else │
│ 2441 { │
│ >2442 __base::__map_.push_back(__alloc_traits::allocate(__a, __base::__block_size)); │
│ 2443 // Done allocating, reorder capacity │
│ 2444 pointer __pt = __base::__map_.back(); │
│ 2445 __base::__map_.pop_back(); │
│ 2446 __base::__map_.push_front(__pt); │
│ 2447 } │
│ 2448 __base::__start_ = __base::__map_.size() == 1 ? │
│ 2449 __base::__block_size / 2 : │
│ 2450 __base::__start_ + __base::__block_size; │
│ 2451 } │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: std::__1::deque<long, std::__1::allocator<long> >::__add_front_capac* L2442 PC: 0x4021f7
In this else block, we first allocate a new block, then we push_back
it into the __map_
, then we get a pointer to the back of the map and we call pop_back
on the __map_
and the we push_front
the pointer onto the map.
Let’s enter the push_back function:
┌─/usr/lib/llvm-11/bin/../include/c++/v1/__split_buffer────────────────────────────────────────────────────────────────────┐
│ 568 template <class _Tp, class _Allocator> │
│ 569 void │
│ 570 __split_buffer<_Tp, _Allocator>::push_back(value_type&& __x) │
│ 571 { │
│ >572 if (__end_ == __end_cap()) │
│ 573 { │
│ 574 if (__begin_ > __first_) │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: std::__1::__split_buffer<long*, std::__1::allocator<long*> >::push_b* L572 PC: 0x402c27
(gdb) print __end_
$2 = (std::__1::__split_buffer<long*, std::__1::allocator<long*> >::pointer) 0x40b318
(gdb) print __end_cap()
$3 = (std::__1::__split_buffer<long*, std::__1::allocator<long*> >::pointer &) @0x7fffffffe128: 0x40b320
As we know, there is still one slot left at the back of the split buffer, so the __end_cap()
is 8 bytes greater than __end_
. Thus, when we hit n
, we skip that if block straight to the end of the function:
┌─/usr/lib/llvm-11/bin/../include/c++/v1/__split_buffer────────────────────────────────────────────────────────────────────┐
│ 592 } │
│ >593 __alloc_traits::construct(__alloc(), _VSTD::__to_address(__end_), │
│ 594 _VSTD::move(__x)); │
│ 595 ++__end_; │
│ 596 } │
│ 597 │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: std::__1::__split_buffer<long*, std::__1::allocator<long*> >::push_b* L593 PC: 0x402e3a
So it really is a bog standard push_back function that just places the block at the end of the map. We return from the function and now the map looks like this:
(gdb) print __map_
$1 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x40b300, __begin_ = 0x40b300,
__end_ = 0x40b320, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x40b320}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<
long*>> = {<No data fields>}, <No data fields>}, <No data fields>}}
(gdb) print *(__map_.__first_)
$6 = (long *) 0x40b330
(gdb) print *(__map_.__first_+1)
$7 = (long *) 0x40a2f0
(gdb) print *(__map_.__first_+2)
$8 = (long *) 0x4092c0
(gdb) print *(__map_.__first_+3)
$9 = (long *) 0x40c340
As you can see from the memory addresses, the newly allocated block has the address 0x40c340 and is at the end of the map. This is also the address that is returned from __map_.back()
:
┌─/usr/lib/llvm-11/bin/../include/c++/v1/__split_buffer────────────────────────────────────────────────────────────────────┐
│ 94 │
│ 95 _LIBCPP_INLINE_VISIBILITY reference front() {return *__begin_;} │
│ 96 _LIBCPP_INLINE_VISIBILITY const_reference front() const {return *__begin_;} │
│ >97 _LIBCPP_INLINE_VISIBILITY reference back() {return *(__end_ - 1);} │
│ 98 _LIBCPP_INLINE_VISIBILITY const_reference back() const {return *(__end_ - 1);} │
│ 99 │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: std::__1::__split_buffer<long*, std::__1::allocator<long*> >::back L97 PC: 0x40266c
┌─/usr/lib/llvm-11/bin/../include/c++/v1/deque─────────────────────────────────────────────────────────────────────────────┐
│ 2443 // Done allocating, reorder capacity │
│ 2444 pointer __pt = __base::__map_.back(); │
│ >2445 __base::__map_.pop_back(); │
│ 2446 __base::__map_.push_front(__pt); │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: std::__1::deque<long, std::__1::allocator<long> >::__add_front_capac* L2445 PC: 0x40223d
(gdb) print __pt
$13 = (std::__1::deque<long, std::__1::allocator<long> >::pointer) 0x40c340
And now we move on to the pop_back
function:
(gdb) print __map_
$2 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x40b300, __begin_ = 0x40b300,
__end_ = 0x40b320, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x40b320}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<
long*>> = {<No data fields>}, <No data fields>}, <No data fields>}}
(gdb) n
┌─/usr/lib/llvm-11/bin/../include/c++/v1/deque─────────────────────────────────────────────────────────────────────────────┐
│ 2439 __base::__map_.push_front(__alloc_traits::allocate(__a, __base::__block_size)); │
│ 2440 else │
│ 2441 { │
│ 2442 __base::__map_.push_back(__alloc_traits::allocate(__a, __base::__block_size)); │
│ 2443 // Done allocating, reorder capacity │
│ 2444 pointer __pt = __base::__map_.back(); │
│ 2445 __base::__map_.pop_back(); │
│ >2446 __base::__map_.push_front(__pt); │
│ 2447 } │
│ 2448 __base::__start_ = __base::__map_.size() == 1 ? │
│ 2449 __base::__block_size / 2 : │
│ 2450 __base::__start_ + __base::__block_size; │
│ 2451 } │
│ 2452 // Else need to allocate 1 buffer, *and* we need to reallocate __map_. │
│ 2453 else │
│ 2454 { │
│ 2455 __split_buffer<pointer, typename __base::__pointer_allocator&> │
│ 2456 __buf(max<size_type>(2 * __base::__map_.capacity(), 1), │
│ 2457 0, __base::__map_.__alloc()); │
│ 2458 │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: std::__1::deque<long, std::__1::allocator<long> >::__add_front_capac* L2446 PC: 0x40224c
(gdb) print __map_
$3 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x40b300, __begin_ = 0x40b300,
__end_ = 0x40b318, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x40b320}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<
long*>> = {<No data fields>}, <No data fields>}, <No data fields>}}
What this pop_back function does is to set the __end_
pointer back to the value it had prior to the call to push_back. This means that the __end_
pointer now points to the pointer to the newly allocated empty block, rather than pointing to a null value.
(gdb) print __begin_
$12 = (std::__1::__split_buffer<long*, std::__1::allocator<long*> >::pointer) 0x40b300
(gdb) print __first_
$13 = (std::__1::__split_buffer<long*, std::__1::allocator<long*> >::pointer) 0x40b300
(gdb) n
┌─/usr/lib/llvm-11/bin/../include/c++/v1/__split_buffer────────────────────────────────────────────────────────────────────┐
│ 472 │
│ 473 template <class _Tp, class _Allocator> │
│ 474 void │
│ 475 __split_buffer<_Tp, _Allocator>::push_front(const_reference __x) │
│ 476 { │
│ 477 if (__begin_ == __first_) │
│ 478 { │
│ >479 if (__end_ < __end_cap()) │
│ 480 { │
│ 481 difference_type __d = __end_cap() - __end_; │
│ 482 __d = (__d + 1) / 2; │
│ 483 __begin_ = _VSTD::move_backward(__begin_, __end_, __end_ + __d); │
│ 484 __end_ += __d; │
│ 485 } │
│ 486 else │
│ 487 { │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: std::__1::__split_buffer<long*, std::__1::allocator<long*> >::push_f* L479 PC: 0x4026dc
(gdb) print __end_
$16 = (std::__1::__split_buffer<long*, std::__1::allocator<long*> >::pointer) 0x40b318
(gdb) print __end_cap()
$17 = (std::__1::__split_buffer<long*, std::__1::allocator<long*> >::pointer &) @0x7fffffffe128: 0x40b320
(gdb) n
(gdb) print __d
$18 = 1
(gdb) n
(gdb) print __d
$19 = 1
┌─/usr/lib/llvm-11/bin/../include/c++/v1/__split_buffer────────────────────────────────────────────────────────────────────┐
│ 472 │
│ 473 template <class _Tp, class _Allocator> │
│ 474 void │
│ 475 __split_buffer<_Tp, _Allocator>::push_front(const_reference __x) │
│ 476 { │
│ 477 if (__begin_ == __first_) │
│ 478 { │
│ 479 if (__end_ < __end_cap()) │
│ 480 { │
│ 481 difference_type __d = __end_cap() - __end_; │
│ 482 __d = (__d + 1) / 2; │
│ >483 __begin_ = _VSTD::move_backward(__begin_, __end_, __end_ + __d); │
│ 484 __end_ += __d; │
│ 485 } │
│ 486 else │
│ 487 { │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: std::__1::__split_buffer<long*, std::__1::allocator<long*> >::push_f* L483 PC: 0x402736
Let’s step into move_backward to find out exactly how it works:
(gdb) s
std::__1::move_backward<long**, long**> (__first=0x40b300, __last=0x40b318, __result=0x40b320)
at /usr/lib/llvm-11/bin/../include/c++/v1/algorithm:1933
┌─/usr/lib/llvm-11/bin/../include/c++/v1/algorithm─────────────────────────────────────────────────────────────────────────┐
│ 1926 │
│ 1927 template <class _BidirectionalIterator1, class _BidirectionalIterator2> │
│ 1928 inline _LIBCPP_INLINE_VISIBILITY │
│ 1929 _BidirectionalIterator2 │
│ 1930 move_backward(_BidirectionalIterator1 __first, _BidirectionalIterator1 __last, │
│ 1931 _BidirectionalIterator2 __result) │
│ 1932 { │
│ >1933 return _VSTD::__move_backward(__unwrap_iter(__first), __unwrap_iter(__last), __unwrap_iter(__result)); │
│ 1934 } │
│ 1935 │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: std::__1::move_backward<long**, long**> L1933 PC: 0x4037a4
(gdb) s
std::__1::__move_backward<long*, long*> (__first=0x40b300, __last=0x40b318, __result=0x40b320)
at /usr/lib/llvm-11/bin/../include/c++/v1/algorithm:1918
┌─/usr/lib/llvm-11/bin/../include/c++/v1/algorithm─────────────────────────────────────────────────────────────────────────┐
│ 1907 │
│ 1908 template <class _Tp, class _Up> │
│ 1909 inline _LIBCPP_INLINE_VISIBILITY │
│ 1910 typename enable_if │
│ 1911 < │
│ 1912 is_same<typename remove_const<_Tp>::type, _Up>::value && │
│ 1913 is_trivially_copy_assignable<_Up>::value, │
│ 1914 _Up* │
│ 1915 >::type │
│ 1916 __move_backward(_Tp* __first, _Tp* __last, _Up* __result) │
│ 1917 { │
│ >1918 const size_t __n = static_cast<size_t>(__last - __first); │
│ 1919 if (__n > 0) │
│ 1920 { │
│ 1921 __result -= __n; │
│ 1922 _VSTD::memmove(__result, __first, __n * sizeof(_Up)); │
│ 1923 } │
│ 1924 return __result; │
│ 1925 } │
│ 1926 │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: _ZNSt3__115__move_backwardIPlS1_EENS_9enable_ifIXaasr7is_sameINS_12r* L1918 PC: 0x403964
As you can see, the __move_backward
function here ACTUALLY calls memmove, which is a O(n) algorithm. And here I thought it would have used some smart optimization - nope, it actually just shifts the entire array, moving every element in the array. Proof:
┌─/usr/lib/llvm-11/bin/../include/c++/v1/__split_buffer────────────────────────────────────────────────────────────────────┐
│ 481 difference_type __d = __end_cap() - __end_; │
│ >482 __d = (__d + 1) / 2; │
│ 483 __begin_ = _VSTD::move_backward(__begin_, __end_, __end_ + __d); │
│ 484 __end_ += __d; │
│ 485 } │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: std::__1::__split_buffer<long*, std::__1::allocator<long*> >::push_f* L482 PC: 0x40271e
(gdb) print __begin_
$1 = (std::__1::__split_buffer<long*, std::__1::allocator<long*> >::pointer) 0x40b300
(gdb) print __first_
$2 = (std::__1::__split_buffer<long*, std::__1::allocator<long*> >::pointer) 0x40b300
(gdb) print __end_
$3 = (std::__1::__split_buffer<long*, std::__1::allocator<long*> >::pointer) 0x40b318
(gdb) print __end_cap()
$4 = (std::__1::__split_buffer<long*, std::__1::allocator<long*> >::pointer &) @0x7fffffffe128: 0x40b320
(gdb) print *(__begin_)
$1 = (long *) 0x40b330
(gdb) print *(__begin_+1)
$2 = (long *) 0x40a2f0
(gdb) print *(__begin_+2)
$3 = (long *) 0x4092c0
(gdb) print *(__begin_+3)
$4 = (long *) 0x40c340
(gdb) n
┌─/usr/lib/llvm-11/bin/../include/c++/v1/__split_buffer────────────────────────────────────────────────────────────────────┐
│ 481 difference_type __d = __end_cap() - __end_; │
│ 482 __d = (__d + 1) / 2; │
│ 483 __begin_ = _VSTD::move_backward(__begin_, __end_, __end_ + __d); │
│ >484 __end_ += __d; │
│ 485 } │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: std::__1::__split_buffer<long*, std::__1::allocator<long*> >::push_f* L484 PC: 0x40275d
(gdb) print __begin_
$5 = (std::__1::__split_buffer<long*, std::__1::allocator<long*> >::pointer) 0x40b308
(gdb) print __first_
$6 = (std::__1::__split_buffer<long*, std::__1::allocator<long*> >::pointer) 0x40b300
(gdb) print __end_
$7 = (std::__1::__split_buffer<long*, std::__1::allocator<long*> >::pointer) 0x40b318
(gdb) print __end_cap()
$8 = (std::__1::__split_buffer<long*, std::__1::allocator<long*> >::pointer &) @0x7fffffffe128: 0x40b320
(gdb) print *(__first_)
$9 = (long *) 0x40b330
(gdb) print *(__first_+1)
$10 = (long *) 0x40b330
(gdb) print *(__first_+2)
$11 = (long *) 0x40a2f0
(gdb) print *(__first_+3)
$12 = (long *) 0x4092c0
As you can see, the move_backward
function has indeed moved the contents of the split buffer backwards (by exactly one position in this case, since __d
was 1). Essentially, this has happened:
map before: [0x40b330, 0x40a2f0, 0x4092c0, 0x40c340]
map after: [0x40b330, 0x40b330, 0x40a2f0, 0x4092c0]
As you can see, the newly allocated block 0x40c340
that was pushed onto the back of the map has now fallen off the back. Not to worry though, we are going to push it back onto the front now:
┌─/usr/lib/llvm-11/bin/../include/c++/v1/__split_buffer────────────────────────────────────────────────────────────────────┐
│ 496 } │
│ 497 } │
│ 498 __alloc_traits::construct(__alloc(), _VSTD::__to_address(__begin_-1), __x); │
│ 499 --__begin_; │
│ >500 } │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: std::__1::__split_buffer<long*, std::__1::allocator<long*> >::push_f* L497 PC: 0x4028c6
(gdb) print __begin_
$13 = (std::__1::__split_buffer<long*, std::__1::allocator<long*> >::pointer) 0x40b300
(gdb) print __first_
$14 = (std::__1::__split_buffer<long*, std::__1::allocator<long*> >::pointer) 0x40b300
(gdb) print __end_
$15 = (std::__1::__split_buffer<long*, std::__1::allocator<long*> >::pointer) 0x40b320
(gdb) print __end_cap()
$16 = (std::__1::__split_buffer<long*, std::__1::allocator<long*> >::pointer &) @0x7fffffffe128: 0x40b320
(gdb) print *(__first_)
$17 = (long *) 0x40c340
(gdb) print *(__first_+1)
$18 = (long *) 0x40b330
(gdb) print *(__first_+2)
$19 = (long *) 0x40a2f0
(gdb) print *(__first_+3)
$20 = (long *) 0x4092c0
...here's an easier way to do it...
(gdb) print *__first_@5
$22 = {0x40c340, 0x40b330, 0x40a2f0, 0x4092c0, 0x0}
The __x
parameter that was passed in was the address of the newly allocated block that fell off the back of the map. So here we added it to the front of the map. We shifted the entire map backwards by one slot just so that we could do this. That is basically what the push_front
function does.
After that, we fix up the __start_
pointer so that it’s 512 instead of 0, pointing to the last element of the first block, which is where we want our current element to be inserted. After that we decrement __start_
again, as usual.
So, now we have 4 blocks and the map is full. The next time we insert a new block, the map will need to be resized. It will double to a capacity of 8 and the new block will be added at the front of the map. But there will now be more space available, so let’s see how it works:
┌─e.cpp────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ 32 pushn(d, 511); │
│ >33 my_pf(d); // 5th block is added │
│ 34 pushn(d, 511); │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: main L33 PC: 0x4012f2 (gdb) n
(gdb) n
(gdb) s
std::__1::deque<long, std::__1::allocator<long> >::__add_front_capacity (this=0x7fffffffe110)
at /usr/lib/llvm-11/bin/../include/c++/v1/deque:2425
(gdb) print __map_
$25 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x40b300, __begin_ = 0x40b300,
__end_ = 0x40b320, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x40b320}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<
long*>> = {<No data fields>}, <No data fields>}, <No data fields>}}
...there is no spare space anywhere so we drop to the else block...
┌─/usr/lib/llvm-11/bin/../include/c++/v1/deque─────────────────────────────────────────────────────────────────────────────┐
│ 2452 // Else need to allocate 1 buffer, *and* we need to reallocate __map_. │
│ 2453 else │
│ 2454 { │
│ 2455 __split_buffer<pointer, typename __base::__pointer_allocator&> │
│ 2456 __buf(max<size_type>(2 * __base::__map_.capacity(), 1), │
│ 2457 0, __base::__map_.__alloc()); │
│ 2458 │
│ 2459 typedef __allocator_destructor<_Allocator> _Dp; │
│ 2460 unique_ptr<pointer, _Dp> __hold( │
│ >2461 __alloc_traits::allocate(__a, __base::__block_size), │
│ 2462 _Dp(__a, __base::__block_size)); │
│ 2463 __buf.push_back(__hold.get()); │
│ 2464 __hold.release(); │
│ 2465 │
│ 2466 for (typename __base::__map_pointer __i = __base::__map_.begin(); │
│ 2467 __i != __base::__map_.end(); ++__i) │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: std::__1::deque<long, std::__1::allocator<long> >::__add_front_capac* L2461 PC: 0x402307
(gdb) print __buf
$26 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x40d350, __begin_ = 0x40d350,
__end_ = 0x40d350, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x40d390}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>&, 1, false>> = {
__value_ = @0x7fffffffe128}, <No data fields>}}
...this newly allocated split buffer has capacity 0x40 or 64 bytes which is enough for 8 longs...
A new block is then allocated and pushed onto __buf
:
(gdb) print *__buf.__first_@5
$28 = {0x0, 0x0, 0x0, 0x0, 0x0}
┌─/usr/lib/llvm-11/bin/../include/c++/v1/deque─────────────────────────────────────────────────────────────────────────────┐
│ 2463 __buf.push_back(__hold.get()); │
│ >2464 __hold.release(); │
│ 2465 │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: std::__1::deque<long, std::__1::allocator<long> >::__add_front_capac* L2464 PC: 0x402389
(gdb) print *__buf.__first_@5
$29 = {0x40d3a0, 0x0, 0x0, 0x0, 0x0}
(gdb) print __buf
$30 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x40d350, __begin_ = 0x40d350,
__end_ = 0x40d358, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x40d390}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>&, 1, false>> = {
__value_ = @0x7fffffffe128}, <No data fields>}}
...first element pushed from old map to new map...
(gdb) print __buf
$31 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x40d350, __begin_ = 0x40d350,
__end_ = 0x40d360, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x40d390}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>&, 1, false>> = {
__value_ = @0x7fffffffe128}, <No data fields>}}
(gdb) print *__buf.__first_@5
$32 = {0x40d3a0, 0x40c340, 0x0, 0x0, 0x0}
...after all elements pushed...
(gdb) print __buf
$33 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x40d350, __begin_ = 0x40d350,
__end_ = 0x40d378, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x40d390}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>&, 1, false>> = {
__value_ = @0x7fffffffe128}, <No data fields>}}
(gdb) print *__buf.__first_@8
$35 = {0x40d3a0, 0x40c340, 0x40b330, 0x40a2f0, 0x4092c0, 0x0, 0x0, 0x0}
...after swaps...
(gdb) print __map_
$36 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x40d350, __begin_ = 0x40d350,
__end_ = 0x40d378, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x40d390}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<
long*>> = {<No data fields>}, <No data fields>}, <No data fields>}}
(gdb) print *__map_.__first_@8
$37 = {0x40d3a0, 0x40c340, 0x40b330, 0x40a2f0, 0x4092c0, 0x0, 0x0, 0x0}
So, there you go. As you can see, every time the map is reallocated, it doubles in size, a new block is added at the very front of the map, and then the old blocks are added to the map from the front to the back. The result is that every time the map doubles, the first element of the map is always a pointer to a newly allocated block, followed by pointers to the older blocks.
Now let’s move on to the next function call, which is the adding of the 6th block to the deque. Since we have 3 empty slots at the back, we fall through to the else if
block:
┌─/usr/lib/llvm-11/bin/../include/c++/v1/deque─────────────────────────────────────────────────────────────────────────────┐
│ 2433 // Else if __base::__map_.size() < __base::__map_.capacity() then we need to allocate 1 buffer │
│ 2434 else if (__base::__map_.size() < __base::__map_.capacity()) │
│ 2435 { // we can put the new buffer into the map, but don't shift things around │
│ 2436 // until all buffers are allocated. If we throw, we don't need to fix │
│ 2437 // anything up (any added buffers are undetectible) │
│ 2438 if (__base::__map_.__front_spare() > 0) │
│ 2439 __base::__map_.push_front(__alloc_traits::allocate(__a, __base::__block_size)); │
│ 2440 else │
│ 2441 { │
│ >2442 __base::__map_.push_back(__alloc_traits::allocate(__a, __base::__block_size)); │
│ 2443 // Done allocating, reorder capacity │
│ 2444 pointer __pt = __base::__map_.back(); │
│ 2445 __base::__map_.pop_back(); │
│ 2446 __base::__map_.push_front(__pt); │
│ 2447 } │
│ 2448 __base::__start_ = __base::__map_.size() == 1 ? │
│ 2449 __base::__block_size / 2 : │
│ 2450 __base::__start_ + __base::__block_size; │
│ 2451 } │
│ 2452 // Else need to allocate 1 buffer, *and* we need to reallocate __map_. │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
multi-thre Thread 0x7ffff7b557 In: std::__1::deque<long, std::__1::allocator<long> >::__add_front_capac* L2442 PC: 0x4021f7
(gdb) print __map_
$39 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x40d350, __begin_ = 0x40d350,
__end_ = 0x40d378, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x40d390}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<
long*>> = {<No data fields>}, <No data fields>}, <No data fields>}}
(gdb) print *__map_.__first_@8
$40 = {0x40d3a0, 0x40c340, 0x40b330, 0x40a2f0, 0x4092c0, 0x0, 0x0, 0x0}
(gdb) n
(gdb) print __map_
$41 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x40d350, __begin_ = 0x40d350,
__end_ = 0x40d380, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x40d390}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<
long*>> = {<No