Modern C++ Programming Cookbook Notes 2: Working with Numbers and Strings

Chapter 2 Working with Numbers and Strings

2.1 Converting between numeric and string types

  • Use std::to_string() to convert a number (of an integral or floating-point type) to a string.

  • Use std::stoi() to convert a string to an int (std::stol(), std::stoll(), std::stoul() and std::stoull() cover the other integer types). Besides the string, it accepts two optional parameters: the address of a variable that receives the number of characters processed, and the numeric base (10 by default).

    Note that a 0 prefix (octal) is only recognized when the base is 0 or 8, and a 0x or 0X prefix (hexadecimal) only when the base is 0 or 16.

  • Use std::stod() to convert a string to a double. It takes no base parameter, but the string may still take several forms: decimal floating point (optionally with an e exponent), hexadecimal floating point (a 0x prefix and a p exponent), inf and nan.

  • The string-to-number functions can throw two exceptions: std::invalid_argument when no conversion can be performed, and std::out_of_range when the converted value does not fit in the result type.
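  • To make the points above concrete, here is a minimal sketch of a checked conversion. The helper name parse_int and its "whole string must be consumed" policy are assumptions made for illustration, not something taken from the book:

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the book): parse all of `text` as an int in
// the given base, returning std::nullopt instead of letting exceptions escape.
inline std::optional<int> parse_int(const std::string& text, int base = 10) {
    try {
        std::size_t pos = 0;  // receives the number of characters processed
        const int value = std::stoi(text, &pos, base);
        if (pos != text.size()) {
            return std::nullopt;  // trailing characters, e.g. "42abc"
        }
        return value;
    } catch (const std::invalid_argument&) {
        return std::nullopt;  // no conversion could be performed
    } catch (const std::out_of_range&) {
        return std::nullopt;  // the value does not fit in an int
    }
}

// Usage sketch.
inline void parse_int_examples() {
    assert(parse_int("42") == 42);
    assert(parse_int("0x2A", 16) == 42);  // 0x prefix is accepted for base 16
    assert(parse_int("052", 0) == 42);    // base 0 detects the octal prefix
    assert(!parse_int("42abc"));          // rejected by the pos check
}
```

    Without the pos check, std::stoi("42abc") would simply return 42, which may or may not be what the caller wants.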

2.2 Limits and other properties of numeric types

  • std::numeric_limits is a class template that describes the properties of numeric types; the most commonly used members are min() and max().
  • Since C++11, all static members of std::numeric_limits are constexpr, so they can be used anywhere, including in constant expressions, and the C-style macros for numeric properties (INT_MAX and the like) are no longer needed.
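  • A short sketch of both points, assuming a hypothetical helper max_of that is not part of the book. It also shows one detail that is easy to trip over: for floating-point types min() is the smallest positive normalized value, while lowest() is the most negative one.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// numeric_limits members are constexpr, so they work in constant expressions.
constexpr std::size_t byte_values = std::numeric_limits<unsigned char>::max() + 1;
static_assert(byte_values >= 256, "unsigned char has at least 8 bits");

// For floating-point types min() is positive; lowest() is the most negative value.
static_assert(std::numeric_limits<double>::min() > 0.0, "min() is positive");
static_assert(std::numeric_limits<double>::lowest() < 0.0, "lowest() is negative");

// Hypothetical helper: largest element of a vector, seeded with lowest() so
// that any real element replaces the initial value.
template <typename T>
T max_of(const std::vector<T>& values) {
    T result = std::numeric_limits<T>::lowest();
    for (const T& v : values) {
        if (v > result) {
            result = v;
        }
    }
    return result;
}

// Usage sketch.
inline void max_of_example() {
    assert(max_of(std::vector<double>{-2.5, -1.5}) == -1.5);
}
```

    Seeding with min() instead of lowest() would give a wrong answer for the all-negative double vector in the usage sketch.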

2.3 Generating pseudo-random numbers

  • When talking about random numbers in modern C++, we need to be clear about two concepts: engines and distributions:

    • Engines are used to produce random numbers with a uniform distribution.
    • Distributions are used to convert the output of engine to a specified distribution.
  • The recipe is therefore: choose an engine to produce random numbers and use a distribution to map them onto, say, the range we want:

    c++
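    // requires <random>
    std::random_device rd{};                            // source of the seed
    auto mtgen = std::mt19937{ rd() };                  // Mersenne Twister engine
    auto ud = std::uniform_int_distribution<>{ 1, 6 };  // closed range [1, 6]
    for (auto i = 0; i < 20; ++i) {
        auto number = ud(mtgen);                        // one value in [1, 6]
    }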

    First, we use std::random_device to produce a random number that serves as the seed. We use it to seed a second engine, std::mt19937, which the distribution will draw from. Then we define a uniform integer distribution over the closed range 1 to 6. Finally, we invoke the distribution with the engine to produce random numbers in that range.
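  • The same steps can be wrapped into small functions. This is only a sketch: make_engine and roll_dice are hypothetical names, and passing the engine by reference is a design choice made here so that callers can also pass a fixed-seed engine when they need reproducible output.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: a Mersenne Twister seeded from std::random_device.
inline std::mt19937 make_engine() {
    std::random_device rd{};
    return std::mt19937{rd()};
}

// Hypothetical helper: roll a six-sided die `count` times with the given engine.
inline std::vector<int> roll_dice(std::size_t count, std::mt19937& engine) {
    std::uniform_int_distribution<int> die{1, 6};  // closed range [1, 6]
    std::vector<int> rolls;
    rolls.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        rolls.push_back(die(engine));
    }
    return rolls;
}

// Usage sketch.
inline void roll_dice_example() {
    auto engine = make_engine();
    for (int value : roll_dice(20, engine)) {
        assert(value >= 1 && value <= 6);
    }
}
```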

2.5 Creating cooked user-defined literals

  • Since C++11, we can create cooked user-defined literals with operator"":

    c++
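    using byte = unsigned char;  // assumed alias; the snippet uses byte without defining it

    // no space before _KB: in operator"" _KB the separate token _KB is a reserved identifier
    constexpr size_t operator""_KB(const unsigned long long size) {
        return static_cast<size_t>(size * 1024);
    }

    auto size{ 4_KB };                       // size_t size = 4096;
    auto buffer = std::array<byte, 1_KB>{};  // requires <array>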
  • There are some points to mention:

    • For integral literals the parameter must be unsigned long long, and for floating-point literals it must be long double, i.e. the operator has to handle the largest possible values.
    • It’s recommended to define the literal operator in a separate namespace and bring it into scope with a using directive where needed, to avoid name collisions.
    • The user-defined suffix should start with an underscore (_): suffixes without one are reserved for the standard library, which has defined its own since C++14 (such as s, min and so on).
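  • A minimal sketch of these recommendations, assuming a hypothetical namespace units and a made-up _percent suffix for the floating-point case:

```cpp
#include <cassert>
#include <cstddef>

namespace units {  // hypothetical namespace name

// Integral literals arrive as unsigned long long.
constexpr std::size_t operator""_KB(unsigned long long n) {
    return static_cast<std::size_t>(n * 1024);
}

// Floating-point literals arrive as long double.
constexpr long double operator""_percent(long double value) {
    return value / 100.0L;
}

}  // namespace units

// Usage sketch.
inline void literal_examples() {
    using namespace units;  // the suffixes are visible in this scope only
    static_assert(4_KB == 4096u, "4 KB is 4096 bytes");
    assert(50.0_percent == 0.5L);
}
```

    Note that 50_percent (an integer literal) would not compile here, because only the long double overload of _percent is defined.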

2.6 Creating raw user-defined literals

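  • Raw literal operators serve as a fallback for cooked literal operators on integer and floating-point literals: instead of the parsed value, they receive the literal’s characters, either as a null-terminated const char* or as a template parameter pack of char: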

    c++
    T operator""_suffix(const char*);            // characters as a null-terminated string
    template<char...> T operator""_suffix();     // characters as template arguments
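  • As an illustration of the const char* form, here is a sketch of a binary literal. The namespace binary_literals and the suffix _bin are hypothetical, and the code assumes C++14 or later because of the loop inside a constexpr function:

```cpp
#include <cassert>
#include <stdexcept>

namespace binary_literals {  // hypothetical namespace name

// Raw literal operator: for 1010_bin it is called with the string "1010".
constexpr unsigned operator""_bin(const char* digits) {
    unsigned value = 0;
    for (; *digits != '\0'; ++digits) {
        if (*digits != '0' && *digits != '1') {
            throw std::invalid_argument("binary literal may contain only 0 and 1");
        }
        value = value * 2 + static_cast<unsigned>(*digits - '0');
    }
    return value;
}

// Evaluated at compile time; reaching the throw in a constant expression
// would be a compile error instead.
static_assert(1010_bin == 10u, "1010 in binary is 10");

}  // namespace binary_literals

// Usage sketch.
inline void binary_literal_example() {
    using namespace binary_literals;
    assert(11111111_bin == 255u);
}
```

    A cooked operator taking unsigned long long could not do this, since it would receive the already parsed decimal value 1010.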

2.7 Using raw string literals to avoid escaping characters

  • Raw string literals have two forms:

    c++
    R"( literal )"
    R"delimiter( literal )delimiter"

    The principle is what you see is what you get, e.g.:

    c++
    auto sqlselect { 
    R"(SELECT *
    FROM Books
    WHERE Publisher='Paktpub'
    ORDER BY PubDate DESC)"s
    };

    Line breaks in the source become part of the string, so the value above contains three newline characters; no \n escapes are needed.
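  • A small sketch of both forms and of the line-break behaviour. The function names and the strings are made up for illustration:

```cpp
#include <cassert>
#include <string>

// Backslashes are kept as they are: no \t or \n escape processing happens.
inline std::string windows_path() {
    return R"(C:\temp\new.txt)";
}

// The text itself contains )" so a delimiter (here !!) is needed to mark the real end.
inline std::string quoted_call() {
    return R"!!(print("(ok)"))!!";
}

// A line break inside the literal becomes a newline character in the string.
inline std::string two_lines() {
    return R"(first
second)";
}

// Usage sketch: each raw literal compared with its escaped equivalent.
inline void raw_string_examples() {
    assert(windows_path() == "C:\\temp\\new.txt");
    assert(quoted_call() == "print(\"(ok)\")");
    assert(two_lines() == "first\nsecond");
}
```

    Any indentation placed before "second" would also end up in the string, which is why that line starts at column 0.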

2.8 Creating a library of string helpers

  • One thing worth noting is that std::remove() does not shrink the container: it moves the kept elements to the front and returns an iterator to the new logical end, so an extra erase() is needed:

    c++
    std::string str = "Text with some   spaces";
    str.erase(std::remove(str.begin(), str.end(), ' '), str.end());
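  • In a helper library this idiom would typically be wrapped in a function. A minimal sketch, where the name remove_char is an assumption rather than the book's helper:

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical helper: return a copy of `text` with every occurrence of `ch` removed.
inline std::string remove_char(std::string text, char ch) {
    // std::remove only shifts the kept characters forward and returns the new logical end...
    const auto new_end = std::remove(text.begin(), text.end(), ch);
    // ...so the leftover tail still has to be erased.
    text.erase(new_end, text.end());
    return text;
}

// Usage sketch.
inline void remove_char_example() {
    assert(remove_char("Text with some   spaces", ' ') == "Textwithsomespaces");
}
```

    Taking the string by value lets the function modify its own copy and leaves the caller's string untouched.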

2.9 Verifying the format of a string using regular expressions

  • Use a regular expression to match against a string:

    c++
    auto pattern {R"(^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}$)"s};
    auto rx = std::regex{pattern};
    // false as written: matching is case-sensitive by default and the pattern lists only uppercase letters
    auto valid = std::regex_match("marius@domain.com"s, rx);
  • When constructing the std::regex, we can pass extra option flags, e.g. std::regex_constants::icase to ignore letter case, which the pattern above needs in order to accept lowercase addresses:

    c++
    auto rx = std::regex{pattern, std::regex_constants::icase};
  • std::regex_match() has several overloads, one of which also returns the matched subexpressions:

    c++
    // email is assumed to be a std::string holding the address to check
    auto rx = std::regex{R"(^([A-Z0-9._%+-]+)@([A-Z0-9.-]+)\.([A-Z]{2,})$)"s, std::regex_constants::icase};
    auto result = std::smatch{};
    auto success = std::regex_match(email, result, rx);

    Note the three pairs of parentheses in the regular expression: each one marks a subexpression (capture group) to be captured. After calling std::regex_match(), the results can be queried from the std::smatch:

    c++
    cout << result[0].str() << endl;  // the entire expression
    cout << result[1].str() << endl; // subexpression 1
    cout << result[2].str() << endl; // subexpression 2
    cout << result[3].str() << endl; // subexpression 3
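  • Put together, the validation and the extraction of the parts could look like the sketch below. EmailParts and parse_email are hypothetical names, and the icase flag is used so that lowercase addresses are accepted:

```cpp
#include <cassert>
#include <optional>
#include <regex>
#include <string>

// Hypothetical result type for the three captured subexpressions.
struct EmailParts {
    std::string local;   // subexpression 1
    std::string domain;  // subexpression 2
    std::string tld;     // subexpression 3
};

// Hypothetical helper: std::nullopt when the whole string is not a valid address.
inline std::optional<EmailParts> parse_email(const std::string& email) {
    static const std::regex rx{
        R"(^([A-Z0-9._%+-]+)@([A-Z0-9.-]+)\.([A-Z]{2,})$)",
        std::regex_constants::icase};
    std::smatch match;
    if (!std::regex_match(email, match, rx)) {
        return std::nullopt;
    }
    // match[0] is the entire match; match[1] to match[3] are the capture groups.
    return EmailParts{match[1].str(), match[2].str(), match[3].str()};
}

// Usage sketch.
inline void parse_email_example() {
    const auto parts = parse_email("marius@domain.com");
    assert(parts.has_value());
    assert(parts->local == "marius");
}
```

    The parts are copied into std::string members because the sub_match objects in std::smatch only refer to the input string and must not outlive it.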

2.10 Parsing the content of a string using regular expressions

  • Whereas std::regex_match() requires the whole string to match, std::regex_search() looks for a match anywhere inside it, which makes it suitable for parsing the content of a string:

    c++
    // text is a std::string; rx is a std::regex with two capture groups (key and value)
    auto match = std::smatch{};
    if (std::regex_search(text, match, rx)) {
        std::cout << match[1] << '=' << match[2] << std::endl;
    }
  • However, std::regex_search() stops at the first match, i.e. it won’t iterate over the string to find every matching substring. To do that, we can use std::sregex_iterator or std::sregex_token_iterator:

    c++
    auto end = std::sregex_iterator{};
    for (auto it = std::sregex_iterator{ std::begin(text), std::end(text), rx };
         it != end; ++it) {
        std::cout << (*it)[1] << '=' << (*it)[2] << std::endl;
    }
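  • A self-contained sketch of the iterator approach, collecting every key=value pair into a map. The function name parse_settings and the simplified pattern are assumptions; the book's pattern is more elaborate:

```cpp
#include <cassert>
#include <map>
#include <regex>
#include <string>

// Hypothetical helper: collect every key=value pair found in `text`.
inline std::map<std::string, std::string> parse_settings(const std::string& text) {
    // Simplified pattern (an assumption): word, optional spaces, '=', optional spaces, word.
    static const std::regex rx{R"((\w+)\s*=\s*(\w+))"};
    std::map<std::string, std::string> settings;
    const auto end = std::sregex_iterator{};
    for (auto it = std::sregex_iterator{text.begin(), text.end(), rx}; it != end; ++it) {
        settings[(*it)[1].str()] = (*it)[2].str();  // group 1 is the key, group 2 the value
    }
    return settings;
}

// Usage sketch.
inline void parse_settings_example() {
    const auto settings = parse_settings("timeout=30, retries = 5");
    assert(settings.size() == 2);
    assert(settings.at("timeout") == "30");
}
```

    The regex is a static object because std::sregex_iterator keeps a pointer to it, so it must outlive the iteration.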

2.11 Replacing the content of a string using regular expressions

  • Use std::regex_replace() to replace the content of a string. Its parameters are as follows:

    • the input string on which the replacement will be performed,
    • the std::basic_regex to match against,
    • the format string that describes the replacement,
    • and optional flags.
    c++
    auto text{ "bancila, marius"s };
    auto rx = std::regex{ R"((\w+),\s*(\w+))"s };
    auto newtext = std::regex_replace(text, rx, "$2 $1"s);  // "marius bancila"
  • The last two parameters are worth a closer look. The format string can use match identifiers to refer to parts of the match, e.g. $1 is the first matched subexpression, $& is the entire match, $' is the part of the input that follows the match, and so on.

    The flags parameter can be something like std::regex_constants::format_first_only, which replaces only the first match.
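  • The effect of the flag and of $& can be seen in the following sketch. The helper names swap_names and bracket_words are made up for illustration:

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: turn "last, first" into "first last", optionally only once.
inline std::string swap_names(const std::string& text, bool first_only = false) {
    static const std::regex rx{R"((\w+),\s*(\w+))"};
    const auto flags = first_only ? std::regex_constants::format_first_only
                                  : std::regex_constants::format_default;
    return std::regex_replace(text, rx, "$2 $1", flags);
}

// Hypothetical helper: wrap every word in brackets, using $& for the entire match.
inline std::string bracket_words(const std::string& text) {
    static const std::regex rx{R"(\w+)"};
    return std::regex_replace(text, rx, "[$&]");
}

// Usage sketch.
inline void regex_replace_examples() {
    const std::string names = "bancila, marius; doe, john";
    assert(swap_names(names) == "marius bancila; john doe");
    assert(swap_names(names, true) == "marius bancila; doe, john");
    assert(bracket_words("ab cd") == "[ab] [cd]");
}
```

    Text that is not part of any match, such as the "; " separator, is copied to the output unchanged by default.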

2.12 Using string_view instead of constant string references

  • C++17 introduces std::string_view, a non-owning (it doesn’t manage the lifetime of the data), constant (it cannot modify the data) reference to a string, which avoids the performance cost of creating temporary std::string objects.

    std::string_view provides an interface that is almost the same as that of std::string, so we can usually replace const std::string & with std::string_view unless a std::string is really needed.

  • Essentially, std::string_view just holds a pointer to the start of the character sequence and its length.

    It provides remove_prefix() and remove_suffix() methods to shrink the view from either end.

  • A std::string_view can be constructed from a std::string (implicitly), and a std::string can be constructed from a std::string_view (explicitly).
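  • A typical use of these members is trimming without copying. The sketch below assumes a hypothetical trim helper that only strips spaces:

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper: strip leading and trailing spaces without copying any characters.
inline std::string_view trim(std::string_view text) {
    const auto first = text.find_first_not_of(' ');
    if (first == std::string_view::npos) {
        return {};  // empty input or spaces only
    }
    text.remove_prefix(first);                   // drop the leading spaces
    const auto last = text.find_last_not_of(' ');
    text.remove_suffix(text.size() - last - 1);  // drop the trailing spaces
    return text;
}

// Usage sketch.
inline void trim_example() {
    const std::string owner = "  hello  ";
    const std::string_view view = trim(owner);  // std::string converts implicitly
    assert(view == "hello");
    const std::string copy{view};               // the other direction is explicit
    assert(copy == "hello");
}
```

    The returned view refers to the caller's characters, so it must not outlive the string it was created from.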

Author: Gusabary
Link: http://gusabary.cn/2020/11/05/Modern-C++-Programming-Cookbook-Notes/Modern-C++-Programming-Cookbook-Notes-2/
Copyright Notice: All articles in this blog are licensed under CC BY-NC-SA 4.0 unless stating additionally.
