Exploring std::string
(2029 words) Mon, Sep 12, 2016Every C++ developer knows that std::string
represents a sequence of characters
in memory. It manages its own memory, and is very intuitive to use.
Today we’ll explore std::string
as defined by the C++ Standard, and also by
looking at 4 major implementations.
Quick note: in this post I use the notation of compilers to inspect implementations. This is technically incorrect, as a Standard library is not necessarily tied to a specific compiler. So when I say, for example, ‘GCC’ I really mean ‘GCC with its default Standard library’, which is libstdc++. For clang the library is libc++. For Microsoft Visual Studio its the library that ships with it, which is implemented by Dinkumware (I think).
Size
We’ll start with the most basic thing - what’s the sizeof
of a std::string
(not including any dynamic allocation)?
A naive implementation would require 3 fields, each the size of a pointer:
- Pointer to the allocated memory;
- Logical size of string;
- Size of allocated memory (which must be bigger than or equal to logical size).
On a 64-bit machine the above would consume 24 bytes. Let’s examine the output of the following code on different compilers, all using 64-bit architecture:
#include <string>
#include <iostream>
int main() {
std::cout << sizeof(std::string) << std::endl;
}
Visual Studio 14:
40
GCC < 5:
$ g++ -std=c++11 main.cpp && ./a.out
8
GCC >= 5:
$ g++-5 -std=c++11 main.cpp && ./a.out
32
clang:
$ clang++ -std=c++11 -stdlib=libc++ main.cpp && ./a.out
24
Keep these very different numbers in mind, and let’s continue exploring.
Small Object Optimization
One particular optimization found its way to pretty much all implementations:
small objects optimization (aka small buffer optimization). Simply put,
Small Object Optimization means that the std::string
object has a small
buffer for small strings, which saves dynamic allocations.
One might add a buffer on top of existing fields of std::string
. However,
there are some smart tricks to better use existing size.
#include <cstdlib>
#include <iostream>
#include <string>
// replace operator new and delete to log allocations
void* operator new(std::size_t n) {
std::cout << "[Allocating " << n << " bytes]";
return malloc(n);
}
void operator delete(void* p) throw() {
free(p);
}
int main() {
for (size_t i = 0; i < 24; ++i) {
std::cout << i << ": " << std::string(i, '=') << std::endl;
}
}
Visual Studio 14:
0:
1: =
2: ==
3: ===
4: ====
5: =====
6: ======
7: =======
8: ========
9: =========
10: ==========
11: ===========
12: ============
13: =============
14: ==============
15: ===============
[Allocating 32 bytes]16: ================
[Allocating 32 bytes]17: =================
[Allocating 32 bytes]18: ==================
[Allocating 32 bytes]19: ===================
[Allocating 32 bytes]20: ====================
[Allocating 32 bytes]21: =====================
[Allocating 32 bytes]22: ======================
[Allocating 32 bytes]23: =======================
With a size of 40 bytes (5 pointers), it appears as if there are 16 bytes dedicated to hold small strings. As we will see below, this is a somewhat wasteful implementation, and more juice can be squeezed from even smaller structs.
GCC < 5:
0:
1: [Allocating 26 bytes]=
2: [Allocating 27 bytes]==
3: [Allocating 28 bytes]===
4: [Allocating 29 bytes]====
5: [Allocating 30 bytes]=====
6: [Allocating 31 bytes]======
7: [Allocating 32 bytes]=======
8: [Allocating 33 bytes]========
9: [Allocating 34 bytes]=========
10: [Allocating 35 bytes]==========
11: [Allocating 36 bytes]===========
12: [Allocating 37 bytes]============
13: [Allocating 38 bytes]=============
14: [Allocating 39 bytes]==============
15: [Allocating 40 bytes]===============
16: [Allocating 41 bytes]================
17: [Allocating 42 bytes]=================
18: [Allocating 43 bytes]==================
19: [Allocating 44 bytes]===================
20: [Allocating 45 bytes]====================
21: [Allocating 46 bytes]=====================
22: [Allocating 47 bytes]======================
23: [Allocating 48 bytes]=======================
As we will see soon, older libstdc++ implements copy-on-write, and so it makes sense for them to not utilize small objects optimization.
GCC >= 5:
$ g++-5 -std=c++11 main.cpp && ./a.out
0:
1: =
2: ==
3: ===
4: ====
5: =====
6: ======
7: =======
8: ========
9: =========
10: ==========
11: ===========
12: ============
13: =============
14: ==============
15: ===============
[Allocating 17 bytes]16: ================
[Allocating 18 bytes]17: =================
[Allocating 19 bytes]18: ==================
[Allocating 20 bytes]19: ===================
[Allocating 21 bytes]20: ====================
[Allocating 22 bytes]21: =====================
[Allocating 23 bytes]22: ======================
[Allocating 24 bytes]23: =======================
Recent GCC versions use a union
of buffer (16 bytes) and capacity (8 bytes) to
store small strings. Since reserve()
is mandatory (more on this later), the
internal pointer to the beginning of the string either points to this union
or to the dynamically allocated string.
clang:
0:
1: =
2: ==
3: ===
4: ====
5: =====
6: ======
7: =======
8: ========
9: =========
10: ==========
11: ===========
12: ============
13: =============
14: ==============
15: ===============
16: ================
17: =================
18: ==================
19: ===================
20: ====================
21: =====================
22: ======================
23: [Allocating 32 bytes]=======================
clang is by-far the smartest and coolest. While std::string
has the size
of 24 bytes, it allows strings up to 22 bytes(!!) with no allocation. To achieve
this libc++ uses a neat trick: the size of the string is not saved as-is but
rather in a special way: if the string is short (< 23 bytes) then it stores
size() * 2
. This way the least significant bit is always 0. The long form
always bitwise-ors the LSB with 1, which in theory might have meant
unnecessarily larger allocations, but this implementation always rounds
allocations to be of form 16*n - 1
(where n
is an integer). By the way, the
allocated string is actually of form 16*n
, the last character being '\0'
. So
freaking cool :).
Copy on Write
In the past it used to be valid (and some might even say encouraged) to
implement std::string
as a copy-on-write (using a reference-count) object.
The Standard had careful wording to allow that first call to non-const methods
(like operator[]
, begin()
, at()
) invalidate existing iterators.
As multi-threaded computing became more popular, it turned out that
copy-on-write is slower than naive implementations due to necessary locking in
almost every non-const
operation. C++11 changed the wording, making
copy-on-write incompliant.
GCC < 5 remained incompliant many years after the Standard had changed as they didn’t want to introduce an ABI change on such a critical component. Microsoft’s Visual Studio’s stl also dropped copy-on-write from as long as I can remember.
In the following piece of code I create a std::string
with 50
times the
character c
, then change the first character without the std::string
being
aware of it (and of course don’t use this in production code!). I use 50
to
make sure I don’t come across any small-object-optimization.
#include <string>
#include <iostream>
int main() {
std::string a(50, 'c');
std::string b = a;
*const_cast<char*>(a.c_str()) = 'A';
std::cout << "a: " << a << "\nb: " << b << std::endl;
}
Visual Studio 14:
a: Accccccccccccccccccccccccccccccccccccccccccccccccc
b: cccccccccccccccccccccccccccccccccccccccccccccccccc
GCC < 5:
$ g++ -std=c++11 main.cpp && ./a.out
a: Accccccccccccccccccccccccccccccccccccccccccccccccc
b: Accccccccccccccccccccccccccccccccccccccccccccccccc
Aha! Copy-on-write in action.
GCC >= 5:
$ g++-5 -std=c++11 main.cpp && ./a.out
a: Accccccccccccccccccccccccccccccccccccccccccccccccc
b: cccccccccccccccccccccccccccccccccccccccccccccccccc
clang:
$ clang++ -std=c++11 -stdlib=libc++ main.cpp && ./a.out
a: Accccccccccccccccccccccccccccccccccccccccccccccccc
b: cccccccccccccccccccccccccccccccccccccccccccccccccc
Allocate on .reserve()
?
As you may know, std::string
has a .size()
method returning the size of
the string it contains. This is not necessarily the allocated size, which might
be bigger (but can’t be smaller).
One might wonder if there are any guarantees on capacity()
and reserve()
. In
other words: can we implement a Standard compliant std::string
with 2 members
(pointer and size)? The answer to this question is a straight no. To quote the
Standard:
After
reserve()
,capacity()
is greater or equal to the argument of reserve.
So calling reserve()
with a number bigger than size()
forces std::string
to grow. By the way, calling reserve()
with a number <= size()
is similar
to calling shrink_to_fit()
, which is a non-binding request.
.c_str()
and .data()
are the same
Calling .data()
in C++98 would not necessarily return a null-terminated
string. C++11 changes this and now both data()
and c_str()
return a string
that must terminate with a '\0'
.
Another thing C++11 brings is the ability to dereference the pointer returned
from c_str()
/data()
even if empty() == true
. No longer is it undefined to
do for (const char* c = s.data(); *c != 0; ++c) ...
, as now c_str()
/data()
will return a pointer to '\0'
.
Binary data
std::string
can hold binary data. That is, a string with '\0'
in any
position. Note that it’s not entirely trivial to create such a string. For
example, the following will create a string with the content "hello"
by-design, as we’re passing in a C-string:
std::string s = "hello\0world";
Even doing something like the following will fail for the exact same reason:
auto cstr = "hello\0world";
std::string s(cstr, strlen(cstr));
The following, however, will work:
// We could use const char[] or decltype(auto) here, but not auto. Why?
// Because auto will be decayed to const char*, which has the wrong size.
decltype(auto) cstr = "Hello\0World!";
std::string s(cstr, sizeof(cstr));
Also note that printing such strings may also be problematic, as some
implementations stop at '\0'
.
Growth strategy
Like std::vector
, it’s important for std::string
to grow in an efficient
way. Let’s examine the following:
#include <cstdlib>
#include <iostream>
#include <string>
// replace operator new and delete to log allocations
void* operator new(std::size_t n) {
std::cout << "Allocating " << n << " bytes" << std::endl;
return malloc(n);
}
void operator delete(void* p) throw() {
free(p);
}
int main() {
std::string s;
for (size_t i = 0; i != 1000000; ++i) {
s += '.';
}
}
Visual Studio 14:
Allocating 32 bytes
Allocating 48 bytes
Allocating 71 bytes
Allocating 106 bytes
Allocating 158 bytes
Allocating 236 bytes
Allocating 353 bytes
Allocating 529 bytes
Allocating 793 bytes
Allocating 1189 bytes
Allocating 1783 bytes
Allocating 2674 bytes
Allocating 4010 bytes
Allocating 6053 bytes
Allocating 9059 bytes
Allocating 13568 bytes
Allocating 20332 bytes
Allocating 30478 bytes
Allocating 45697 bytes
Allocating 68525 bytes
Allocating 102767 bytes
Allocating 154130 bytes
Allocating 231175 bytes
Allocating 346742 bytes
Allocating 520093 bytes
Allocating 780119 bytes
Allocating 1170158 bytes
Growth strategy is 1.5x.
GCC < 5:
$ g++ -std=c++11 main.cpp && ./a.out
Allocating 26 bytes
Allocating 27 bytes
Allocating 29 bytes
Allocating 33 bytes
Allocating 41 bytes
Allocating 57 bytes
Allocating 89 bytes
Allocating 153 bytes
Allocating 281 bytes
Allocating 537 bytes
Allocating 1049 bytes
Allocating 2073 bytes
Allocating 8160 bytes
Allocating 16352 bytes
Allocating 32736 bytes
Allocating 65504 bytes
Allocating 131040 bytes
Allocating 262112 bytes
Allocating 524256 bytes
Allocating 1048544 bytes
Growth strategy is a bit tricky. On my machine the page size is 4096, and so for strings larger than that it rounds up to fit to a page and grows 2x. For small strings, growth is done by adding multiples of 2 (thanks Marek!).
GCC >= 5:
$ g++-5 -std=c++11 main.cpp && ./a.out
Allocating 31 bytes
Allocating 61 bytes
Allocating 121 bytes
Allocating 241 bytes
Allocating 481 bytes
Allocating 961 bytes
Allocating 1921 bytes
Allocating 3841 bytes
Allocating 7681 bytes
Allocating 15361 bytes
Allocating 30721 bytes
Allocating 61441 bytes
Allocating 122881 bytes
Allocating 245761 bytes
Allocating 491521 bytes
Allocating 983041 bytes
Allocating 1966081 bytes
Growth strategy is 2x.
clang:
$ clang++ -std=c++11 -stdlib=libc++ main.cpp && ./a.out
Allocating 48 bytes
Allocating 96 bytes
Allocating 192 bytes
Allocating 384 bytes
Allocating 768 bytes
Allocating 1536 bytes
Allocating 3072 bytes
Allocating 6144 bytes
Allocating 12288 bytes
Allocating 24576 bytes
Allocating 49152 bytes
Allocating 98304 bytes
Allocating 196608 bytes
Allocating 393216 bytes
Allocating 786432 bytes
Allocating 1572864 bytes
Growth strategy is 2x.
Move semantics
Taking advantage of C++11’s move semantics will benefit every library. Apache’s retired stdcxx implementation is the only one I know of to not support them.
One interesting this to note is that with small objects, where Small Object Optimization exists, a move constructor is implemented as a copy constructor. This of course is purely semantic, as the data structure’s fields need to be copied anyway, but is nevertheless interesting to think about.
Memory
The Standard instructs implementations to use contiguous memory for the entire string, which is convenient when working with some C-APIs.
Summary
C++’s built-in std::string
is a well-designed class with some great
implementations. Today we looked at some important features and hopefully
learned a thing or two. Please let me know if there are more things you’re
interested in, either via a comment below or through the ‘About’ page.