HDK
String-related utilities.
Namespaces | |
fmt | |
old | |
Classes | |
class | StringHash |
struct | StringEqual |
C++ functor for comparing two strings for equality of their characters.
struct | StringIEqual |
struct | StringLess |
C++ functor for comparing the ordering of two strings.
struct | StringILess |
Enumerations | |
enum | QuoteBehavior { DeleteQuotes, KeepQuotes } |
enum | EditDistMetric { EditDistMetric::Levenshtein } |
std::string OIIO_UTIL_API Strutil::base64_encode | ( | string_view | str | ) |
Encode the string in base64. https://en.wikipedia.org/wiki/Base64
std::string OIIO_UTIL_API Strutil::concat | ( | string_view | s, |
string_view | t | ||
) |
Concatenate two strings, returning a std::string, implemented carefully to not perform any redundant copies or allocations. This is semantically equivalent to Strutil::sprintf("%s%s", s, t), but is more efficient.
bool OIIO_UTIL_API Strutil::contains | ( | string_view | a, |
string_view | b | ||
) |
Does 'a' contain the string 'b' within it?
bool OIIO_UTIL_API Strutil::contains_any_char | ( | string_view | a, |
string_view | set | ||
) |
Does 'a' contain any of the characters within set?
OIIO_UTIL_API size_t Strutil::edit_distance | ( | string_view | a, |
string_view | b, | ||
EditDistMetric | metric = EditDistMetric::Levenshtein |
||
) |
Compute an edit distance metric between strings a and b: roughly speaking, the number of changes that would be made to transform one string into the other. Identical strings have a distance of 0. The metric parameter selects among possible algorithms, which may have different distance metrics or allow different types of edits. (Currently, the only metric supported is Levenshtein; this parameter is for future expansion.)
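As a rough illustration of what a Levenshtein distance measures, here is a minimal standalone sketch. The name levenshtein_sketch is hypothetical, and the two-row dynamic-programming approach is an assumption for illustration, not necessarily how Strutil::edit_distance is implemented.

```cpp
#include <algorithm>
#include <cstddef>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical helper for illustration only; not the library's implementation.
// Counts the single-character insertions, deletions, and substitutions
// needed to turn a into b.
std::size_t levenshtein_sketch(std::string_view a, std::string_view b)
{
    // prev[j] is the distance between the first i-1 chars of a and the
    // first j chars of b; cur[j] is the same for the first i chars of a.
    std::vector<std::size_t> prev(b.size() + 1), cur(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j)
        prev[j] = j;  // "" -> first j chars of b takes j insertions
    for (std::size_t i = 1; i <= a.size(); ++i) {
        cur[0] = i;  // first i chars of a -> "" takes i deletions
        for (std::size_t j = 1; j <= b.size(); ++j) {
            std::size_t subst = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            cur[j] = std::min({ prev[j] + 1, cur[j - 1] + 1, subst });
        }
        std::swap(prev, cur);
    }
    return prev[b.size()];
}
```

With this sketch, levenshtein_sketch("kitten", "sitting") evaluates to 3, and identical strings give 0.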
bool OIIO_UTIL_API Strutil::ends_with | ( | string_view | a, |
string_view | b | ||
) |
Does 'a' end with the string 'b', with a case-sensitive comparison?
std::string OIIO_UTIL_API Strutil::escape_chars | ( | string_view | unescaped | ) |
Take a string that may have embedded newlines, tabs, etc., and turn those characters into escape sequences like \n, \t, \v, \b, \r, \f, \a, \\, \".
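The mapping described above can be sketched as a simple character switch. The function name escape_chars_sketch is hypothetical, and the code only mirrors the escape sequences listed in the description; it is not the library's source.

```cpp
#include <string>
#include <string_view>

// Hypothetical stand-in for illustration: replaces each special character
// with its two-character escape sequence, leaving everything else as-is.
std::string escape_chars_sketch(std::string_view unescaped)
{
    std::string out;
    out.reserve(unescaped.size());
    for (char c : unescaped) {
        switch (c) {
        case '\n': out += "\\n"; break;
        case '\t': out += "\\t"; break;
        case '\v': out += "\\v"; break;
        case '\b': out += "\\b"; break;
        case '\r': out += "\\r"; break;
        case '\f': out += "\\f"; break;
        case '\a': out += "\\a"; break;
        case '\\': out += "\\\\"; break;
        case '"':  out += "\\\""; break;
        default:   out += c; break;
        }
    }
    return out;
}
```

For instance, a four-character input consisting of a, a tab, b, and a newline comes back as the six characters a\tb\n.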
OIIO_UTIL_API std::string Strutil::excise_string_after_head | ( | std::string & | str, |
string_view | head | ||
) |
Look within str for the pattern: head nonwhitespace_chars whitespace. Remove that full pattern from str and return the nonwhitespace part that followed the head (or return the empty string and leave str unmodified, if the head was never found).
int Strutil::extract_from_list_string | ( | std::vector< T, Allocator > & | vals, |
string_view | list, | ||
string_view | sep = "," |
||
) |
Given a string containing values separated by a comma (or optionally another separator), extract the individual values, placing them into vals[], which is presumed to already contain defaults. If only a single value was in the list, replace all elements of vals[] with the value. Otherwise, replace them in the same order. A missing value will simply not be replaced. Return the number of values found in the list (including blank or malformed ones). If the vals vector was empty initially, grow it as necessary.
For example, if T=float and initially vals[] = {0, 1, 2}, then
    "3.14" results in vals[] = {3.14, 3.14, 3.14}
    "3.14,,-2.0" results in vals[] = {3.14, 1, -2.0}
This can work for type T = int, float, or any type that has an explicit constructor from a std::string.
std::vector<T> Strutil::extract_from_list_string | ( | string_view | list, |
size_t | nvals = 0 , |
||
T | val = T() , |
||
string_view | sep = "," |
||
) |
Given a string containing values separated by a comma (or optionally another separator), extract the individual values, returning them as a std::vector<T>. The vector will be initialized with nvals elements with default value val. If only a single value was in the list, replace all elements of the vector with the value. Otherwise, replace them in the same order. A missing value will simply not be replaced and will retain the initialized default value. If the string contains more than nvals values, the extra values will be appended, growing the vector.
For example, if T=float,
    extract_from_list_string ("", 3, 42.0f) --> {42.0, 42.0, 42.0}
    extract_from_list_string ("3.14", 3, 42.0f) --> {3.14, 3.14, 3.14}
    extract_from_list_string ("3.14,,-2.0", 3, 42.0f) --> {3.14, 42.0, -2.0}
    extract_from_list_string ("1,2,3,4", 3, 42.0f) --> {1.0, 2.0, 3.0, 4.0}
This can work for type T = int, float, or any type that has an explicit constructor from a std::string.
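The replacement rules above (fill on a single value, skip blanks, append extras) can be modeled with a small float-only sketch. The names extract_floats_sketch and field_to_float are hypothetical, the separator is simplified to a single char, and padding blank fields beyond nvals with the default value is an assumption of this sketch rather than documented behavior.

```cpp
#include <cstddef>
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: parse one field as a float using the classic "C"
// locale; yields 0 if the field is not a number.
static float field_to_float(const std::string& field)
{
    std::istringstream in(field);
    in.imbue(std::locale::classic());
    float v = 0.0f;
    in >> v;
    return in.fail() ? 0.0f : v;
}

// Hypothetical float-only model of the rules described above.
std::vector<float> extract_floats_sketch(const std::string& list,
                                         std::size_t nvals = 0,
                                         float val = 0.0f, char sep = ',')
{
    std::vector<float> vals(nvals, val);
    if (list.empty())
        return vals;  // nothing to extract: keep the defaults

    // Split on sep, keeping empty fields so positions stay aligned.
    std::vector<std::string> fields;
    std::size_t start = 0;
    for (;;) {
        std::size_t pos = list.find(sep, start);
        if (pos == std::string::npos) {
            fields.push_back(list.substr(start));
            break;
        }
        fields.push_back(list.substr(start, pos - start));
        start = pos + 1;
    }

    if (fields.size() == 1) {
        // A single value replaces every element.
        vals.assign(vals.empty() ? 1 : vals.size(), field_to_float(fields[0]));
        return vals;
    }
    for (std::size_t i = 0; i < fields.size(); ++i) {
        if (i < vals.size()) {
            if (!fields[i].empty())  // blank fields keep the default
                vals[i] = field_to_float(fields[i]);
        } else {
            // Assumption: extra blank fields are appended as the default.
            vals.push_back(fields[i].empty() ? val : field_to_float(fields[i]));
        }
    }
    return vals;
}
```

Calling extract_floats_sketch("2.5,,-2.0", 3, 42.0f) under these assumptions yields {2.5, 42.0, -2.0}, matching the pattern of the documented examples.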
size_t OIIO_UTIL_API Strutil::find | ( | string_view | a, |
string_view | b | ||
) |
Return the position of the first occurrence of b within a, or npos if not found.
bool OIIO_UTIL_API Strutil::get_rest_arguments | ( | const std::string & | str, |
std::string & | base, | ||
std::map< std::string, std::string > & | result | ||
) |
Get a map with RESTful arguments extracted from the given string 'str'. Add it into the 'result' argument (Warning: the 'result' argument may be changed even if 'get_rest_arguments ()' returns an error!). Return true on success, false on error. Acceptable forms:
bool OIIO_UTIL_API Strutil::icontains | ( | string_view | a, |
string_view | b | ||
) |
Does 'a' contain the string 'b' within it, using a case-insensitive comparison? Caveat: the case-sensitive contains() is about 20x faster than this case-insensitive function.
bool OIIO_UTIL_API Strutil::iends_with | ( | string_view | a, |
string_view | b | ||
) |
Does 'a' end with the string 'b', with a case-insensitive comparison? For speed, this always uses a static locale that doesn't require a mutex. Caveat: the case-sensitive ends_with() is about 20x faster than this case-insensitive function.
bool OIIO_UTIL_API Strutil::iequals | ( | string_view | a, |
string_view | b | ||
) |
Case-insensitive comparison of strings. For speed, this always uses a static locale that doesn't require a mutex. Caveat: the case-sensitive == of string_views is about 20x faster than this case-insensitive function.
size_t OIIO_UTIL_API Strutil::ifind | ( | string_view | a, |
string_view | b | ||
) |
Return the position of the first occurrence of b within a, with a case-insensitive comparison, or npos if not found. Caveat: the case-sensitive find() is about 20x faster than this case-insensitive function.
bool OIIO_UTIL_API Strutil::iless | ( | string_view | a, |
string_view | b | ||
) |
Case-insensitive ordered comparison of strings. For speed, this always uses a static locale that doesn't require a mutex.
|
inline |
Does 'a' contain the string 'b' within it? But start looking at the end! This can be a bit faster than contains() if you think that the substring b will tend to be close to the end of a. Caveat: the case-sensitive rcontains() is about 20x faster than this case-insensitive function.
size_t OIIO_UTIL_API Strutil::irfind | ( | string_view | a, |
string_view | b | ||
) |
Return the position of the last occurrence of b within a, with a case-insensitive comparison, or npos if not found. Caveat: the case-sensitive rfind() is about 20x faster than this case-insensitive function.
|
inline |
Is the character a whitespace character (space, linefeed, tab, carriage return)? Note: this is safer than C isspace(), which has undefined behavior for negative char values. Also note that it differs from C isspace by not detecting form feed or vertical tab, because who cares.
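A check with the properties described above can be written as four plain comparisons, which stay well defined for any char value, including negative ones. The name isspace_sketch is hypothetical; this is a sketch of the described behavior, not the library's code.

```cpp
// Hypothetical stand-in: true only for space, linefeed, tab, and carriage
// return. Plain comparisons avoid the table lookup that makes C isspace()
// undefined for negative char values.
inline bool isspace_sketch(char c) noexcept
{
    return c == ' ' || c == '\n' || c == '\t' || c == '\r';
}
```

Under this definition, form feed and vertical tab are not treated as whitespace.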
bool OIIO_UTIL_API Strutil::istarts_with | ( | string_view | a, |
string_view | b | ||
) |
Does 'a' start with the string 'b', with a case-insensitive comparison? For speed, this always uses a static locale that doesn't require a mutex. Caveat: the case-sensitive starts_with() is about 20x faster than this case-insensitive function.
std::string Strutil::join | ( | const Sequence & | seq, |
string_view | sep = "" |
||
) |
Join all the strings in 'seq' into one big string, separated by the 'sep' string. The Sequence can be any iterable collection of items that are able to convert to string via stream output. Examples include: std::vector<string_view>, std::vector<std::string>, std::set<ustring>, std::vector<int>, etc.
std::string Strutil::join | ( | const Sequence & | seq, |
string_view | sep, | ||
size_t | len | ||
) |
Join all the strings in 'seq' into one big string, separated by the 'sep' string. The Sequence can be any iterable collection of items that are able to convert to string via stream output. Examples include: std::vector<string_view>, std::vector<std::string>, std::set<ustring>, std::vector<int>, etc. Values will be rendered into the string in a locale-independent manner (i.e., '.' for decimal in floats). If the optional len is nonzero, exactly that number of elements will be output (truncating or default-value-padding the sequence).
|
inline |
string_view OIIO_UTIL_API Strutil::lstrip | ( | string_view | str, |
string_view | chars = string_view() |
||
) |
Return a reference to the section of str that has all consecutive characters in chars removed from the beginning (left side). If chars is empty, it will be interpreted as " \t\n\r\f\v" (whitespace).
std::string OIIO_UTIL_API Strutil::memformat | ( | long long | bytes, |
int | digits = 1 |
||
) |
Return a string expressing a number of bytes, in human readable form.
|
noexcept |
If str's first character is c (or first non-whitespace char is c, if skip_whitespace is true), return true and additionally modify str to skip over that first character if eat is also true. Otherwise, if str does not begin with character c, return false and don't modify str.
|
noexcept |
If str's first non-whitespace characters form a valid float, return true, place the float's value in val, and additionally modify str to skip over the parsed float if eat is also true. Otherwise, if no float is found at the beginning of str, return false and don't modify val or str.
|
noexcept |
If str's first non-whitespace characters form a valid C-like identifier, return the identifier, and additionally modify str to skip over the parsed identifier if eat is also true. Otherwise, if no identifier is found at the beginning of str, return an empty string_view and don't modify str.
|
noexcept |
If str's first non-whitespace characters form a valid C-like identifier, return the identifier, and additionally modify str to skip over the parsed identifier if eat is also true. Otherwise, if no identifier is found at the beginning of str, return an empty string_view and don't modify str. The 'allowed' parameter may specify additional characters accepted that would not ordinarily be allowed in C identifiers; for example, parse_identifier (blah, "$:") would allow "identifiers" containing dollar signs and colons as well as the usual alphanumeric and underscore characters.
|
noexcept |
If the C-like identifier at the head of str exactly matches id, return true, and also advance str if eat is true. If it is not a match for id, return false and do not alter str.
|
noexcept |
If str's first non-whitespace characters form a valid integer, return true, place the integer's value in val, and additionally modify str to skip over the parsed integer if eat is also true. Otherwise, if no integer is found at the beginning of str, return false and don't modify val or str.
|
noexcept |
Return the prefix of str up to and including the first newline ('\n') character, or all of str if no newline is found within it. If eat is true, then str will be modified to trim off this returned prefix (including the newline character).
|
noexcept |
Assuming the string str starts with either '(', '[', or '{', return the head, up to and including the corresponding closing character (')', ']', or '}', respectively), recognizing nesting structures. For example, parse_nested("(a(b)c)d") should return "(a(b)c)", NOT "(a(b)". Return an empty string if str doesn't start with one of those characters, or doesn't contain a correctly matching nested pair. If eat==true, str will be modified to trim off the part of the string that is returned as the match.
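The nesting rule can be sketched with a depth counter. The name parse_nested_sketch is hypothetical, and counting only the bracket kind that opened the string is an assumption of this sketch, not a statement about the library.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for illustration: returns the balanced head of str,
// or an empty view if str does not open with a bracket or never balances.
std::string_view parse_nested_sketch(std::string_view& str, bool eat = true)
{
    if (str.empty())
        return {};
    const char opener = str[0];
    char closer = 0;
    switch (opener) {
    case '(': closer = ')'; break;
    case '[': closer = ']'; break;
    case '{': closer = '}'; break;
    default: return {};  // does not start with an opening character
    }
    int depth = 1;  // how many openers are currently unmatched
    for (std::size_t i = 1; i < str.size(); ++i) {
        if (str[i] == opener) {
            ++depth;
        } else if (str[i] == closer && --depth == 0) {
            std::string_view head = str.substr(0, i + 1);
            if (eat)
                str.remove_prefix(i + 1);  // trim off the returned match
            return head;
        }
    }
    return {};  // no correctly matching close was found; str is unchanged
}
```

Given a string_view holding "(a(b)c)d", the sketch returns "(a(b)c)" and leaves "d" in the view when eat is true.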
|
noexcept |
If str's first non-whitespace characters are the prefix, return true and additionally modify str to skip over that prefix if eat is also true. Otherwise, if str doesn't start with optional whitespace and the prefix, return false and don't modify str.
|
noexcept |
If str's first non-whitespace characters form a valid string (either a single word separated by whitespace or anything inside a double-quoted ("") or single-quoted ('') string), return true, place the string's value (not including surrounding double quotes) in val, and additionally modify str to skip over the parsed string if eat is also true. Otherwise, if no string is found at the beginning of str, return false and don't modify val or str. If keep_quotes is true, the surrounding double quotes (if present) will be kept in val.
|
noexcept |
Return the longest prefix of str that does not contain any characters found in set (which defaults to the set of common whitespace characters). If eat is true, then str will be modified to trim off this returned prefix (but not the separator character).
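A minimal sketch of this prefix extraction, built on std::string_view::find_first_of. The name parse_until_sketch and the default whitespace set " \t\r\n" are assumptions for illustration.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for illustration: returns the longest prefix of str
// containing no character from set; the separator itself stays in str.
std::string_view parse_until_sketch(std::string_view& str,
                                    std::string_view set = " \t\r\n",
                                    bool eat = true)
{
    std::size_t n = str.find_first_of(set);
    std::string_view head = str.substr(0, n);  // npos means "to the end"
    if (eat)
        str.remove_prefix(head.size());
    return head;
}
```

Applied to a view holding "key=value" with set "=", the sketch returns "key" and leaves "=value" behind.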
|
noexcept |
Modify str to trim all characters up to (but not including) the first occurrence of c, and return true if c was found or false if the whole string was trimmed without ever finding c. But if eat is false, then don't modify str, just return true if any c is found, false if no c is found.
|
noexcept |
Parse from str: a prefix, a series of int values separated by the sep string, and a postfix, placing the values in the elements of mutable span values, where the span length indicates the number of values to read. Any of the prefix, separator, or postfix may be empty strings. If eat is true and the parse was successful, str will be updated in place to trim everything that was parsed, but if any part of the parse failed, str will not be altered from its original state.
|
noexcept |
parse_values for int.
|
noexcept |
Return the longest prefix of str that contains only characters found in set. If eat is true, then str will be modified to trim off this returned prefix.
|
noexcept |
Return the first "word" (set of contiguous alphabetical characters) in str, and additionally modify str to skip over the parsed word if eat is also true. Otherwise, if no word is found at the beginning of str, return an empty string_view and don't modify str.
|
inline |
Strutil::print (fmt, ...) Strutil::print (FILE*, fmt, ...) Strutil::print (ostream& fmt, ...)
Output formatted strings to stdout, a FILE*, or a stream, respectively. All use "Python-like" formatting description (as {fmt} does, and some day, std::format), are type-safe, are thread-safe (the outputs are "atomic", at least versus other calls to Strutil::*printf), and automatically flush their outputs. They are all locale-independent by default (use {:n} for locale-aware formatting).
|
inline |
Strutil::printf (fmt, ...) Strutil::fprintf (FILE*, fmt, ...) Strutil::fprintf (ostream& fmt, ...)
Output formatted strings to stdout, a FILE*, or a stream, respectively. All use printf-like formatting rules, are type-safe, are thread-safe (the outputs are "atomic", at least versus other calls to Strutil::*printf), and automatically flush their outputs. They are all locale-independent (forcing classic "C" locale).
|
inline |
Does 'a' contain the string 'b' within it? But start looking at the end! This can be a bit faster than contains() if you think that the substring b will tend to be close to the end of a.
|
noexcept |
Modify str to trim any trailing whitespace (space, tab, linefeed, cr) from the back.
std::string OIIO_UTIL_API Strutil::repeat | ( | string_view | str, |
int | n | ||
) |
Return a string formed by concatenating str n times.
std::string OIIO_UTIL_API Strutil::replace | ( | string_view | str, |
string_view | pattern, | ||
string_view | replacement, | ||
bool | global = false |
||
) |
Replace a pattern inside a string and return the result. If global is true, replace all instances of the pattern, otherwise just the first.
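The first-versus-all distinction controlled by global can be sketched as follows. The name replace_sketch is hypothetical, and returning the input unchanged for an empty pattern is an assumption of this sketch.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical stand-in for illustration: replaces the first occurrence of
// pattern (or every occurrence, if global is true) and returns the result.
std::string replace_sketch(std::string_view str, std::string_view pattern,
                           std::string_view replacement, bool global = false)
{
    if (pattern.empty())
        return std::string(str);  // assumption: nothing to match
    std::string result;
    for (;;) {
        std::size_t pos = str.find(pattern);
        if (pos == std::string_view::npos)
            break;
        result.append(str.substr(0, pos));  // text before the match
        result.append(replacement);
        str.remove_prefix(pos + pattern.size());
        if (!global)
            break;  // only the first instance was requested
    }
    result.append(str);  // whatever remains after the last match
    return result;
}
```

For the input "a-b-c" with pattern "-" and replacement "+", the sketch gives "a+b-c" by default and "a+b+c" when global is true.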
size_t OIIO_UTIL_API Strutil::rfind | ( | string_view | a, |
string_view | b | ||
) |
Return the position of the last occurrence of b within a, or npos if not found.
string_view OIIO_UTIL_API Strutil::rstrip | ( | string_view | str, |
string_view | chars = string_view() |
||
) |
Return a reference to the section of str that has all consecutive characters in chars removed from the ending (right side). If chars is empty, it will be interpreted as " \t\n\r\f\v" (whitespace).
|
noexcept |
Copy at most size characters (including terminating 0 character) from src into dst[], filling any remaining characters with 0 values. Returns dst. Note that this behavior is identical to strncpy, except that it guarantees that there will be a terminating 0 character.
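The guarantee described above (always terminated, remainder zero-filled) can be sketched like this. The name safe_strcpy_sketch and the choice of std::string_view for src are assumptions; the original signature is not shown here.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for illustration: copies at most size-1 characters
// of src into dst, then fills the rest of dst (at least one byte) with 0.
char* safe_strcpy_sketch(char* dst, std::string_view src,
                         std::size_t size) noexcept
{
    if (!dst || size == 0)
        return dst;  // no room even for the terminator
    const std::size_t n = src.size() < size - 1 ? src.size() : size - 1;
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = src[i];
    for (std::size_t i = n; i < size; ++i)
        dst[i] = 0;  // terminator plus zero padding
    return dst;
}
```

Copying "hello" into a 4-byte buffer this way leaves "hel" followed by the terminating 0, where strncpy would leave the buffer unterminated.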
OIIO_UTIL_API bool Strutil::scan_datetime | ( | string_view | str, |
int & | year, | ||
int & | month, | ||
int & | day, | ||
int & | hour, | ||
int & | min, | ||
int & | sec | ||
) |
Scan a string for date and time information. Return true upon success, false if the string did not appear to contain a valid date/time. If, after parsing a valid date/time (including out of range values), str contains more characters after that, it is not considered a failure.
Valid date/time formats include:
|
noexcept |
Modify str to trim any leading whitespace (space, tab, linefeed, cr) from the front.
void OIIO_UTIL_API Strutil::split | ( | string_view | str, |
std::vector< string_view > & | result, | ||
string_view | sep = string_view() , |
||
int | maxsplit = -1 |
||
) |
Fills the "result" list with the words in the string, using sep as the delimiter string. If maxsplit is > -1, the string will be split into at most maxsplit pieces (a negative value will impose no maximum). If sep is "", any whitespace string is a separator. If the source str is empty, there will be zero pieces.
void OIIO_UTIL_API Strutil::split | ( | string_view | str, |
std::vector< std::string > & | result, | ||
string_view | sep = string_view() , |
||
int | maxsplit = -1 |
||
) |
OIIO_UTIL_API std::vector<std::string> Strutil::splits | ( | string_view | str, |
string_view | sep = "" , |
||
int | maxsplit = -1 |
||
) |
Split the contents of str using sep as the delimiter string. If sep is "", any whitespace string is a separator. If maxsplit > -1, at most maxsplit split fragments will be produced (for example, maxsplit=2 will split at only the first separator, yielding at most two fragments). The result is returned as a vector of std::string (for splits()) or a vector of string_view (for splitsv()). If the source str is empty, there will be zero pieces.
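The splitting rules above can be modeled with a short standalone sketch. The name splits_sketch is hypothetical, and two details are assumptions of this sketch: a maxsplit below 1 is treated as "no maximum", and in whitespace mode the last permitted piece keeps any trailing whitespace.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for illustration of the sep / maxsplit rules.
std::vector<std::string> splits_sketch(std::string_view str,
                                       std::string_view sep = "",
                                       int maxsplit = -1)
{
    constexpr std::size_t npos = std::string_view::npos;
    std::vector<std::string> pieces;
    if (str.empty())
        return pieces;  // empty source: zero pieces
    // Assumption: values below 1 impose no maximum in this sketch.
    const std::size_t limit = maxsplit > 0
                                  ? static_cast<std::size_t>(maxsplit)
                                  : npos;
    if (sep.empty()) {
        // Any run of whitespace acts as one separator.
        const std::string_view ws = " \t\n\r\f\v";
        std::size_t pos = str.find_first_not_of(ws);
        while (pos != npos) {
            // The last permitted piece takes everything that is left.
            std::size_t end = (pieces.size() + 1 == limit)
                                  ? npos
                                  : str.find_first_of(ws, pos);
            pieces.emplace_back(str.substr(pos, end == npos ? npos : end - pos));
            pos = (end == npos) ? npos : str.find_first_not_of(ws, end);
        }
    } else {
        std::size_t pos = 0;
        for (;;) {
            std::size_t end = (pieces.size() + 1 == limit)
                                  ? npos
                                  : str.find(sep, pos);
            pieces.emplace_back(str.substr(pos, end == npos ? npos : end - pos));
            if (end == npos)
                break;
            pos = end + sep.size();
        }
    }
    return pieces;
}
```

With sep "," the input "a,b,c" splits into three pieces, while maxsplit = 2 stops after the first separator and yields "a" and "b,c".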
OIIO_UTIL_API std::vector<string_view> Strutil::splitsv | ( | string_view | str, |
string_view | sep = "" , |
||
int | maxsplit = -1 |
||
) |
bool OIIO_UTIL_API Strutil::starts_with | ( | string_view | a, |
string_view | b | ||
) |
Does 'a' start with the string 'b', with a case-sensitive comparison?
OIIO_UTIL_API double Strutil::stod | ( | string_view | s, |
size_t * | pos = 0 |
||
) |
OIIO_UTIL_API double Strutil::stod | ( | const std::string & | s, |
size_t * | pos = 0 |
||
) |
OIIO_UTIL_API double Strutil::stod | ( | const char * | s, |
size_t * | pos = 0 |
||
) |
OIIO_UTIL_API float Strutil::stof | ( | string_view | s, |
size_t * | pos = 0 |
||
) |
stof() returns the float conversion of text from several string types. No exceptions or errors – parsing errors just return 0.0. These always use '.' for the decimal mark (versus atof and std::strtof, which are locale-dependent).
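One way to get the two properties described above (no exceptions, '.' as the decimal mark regardless of the global locale) is to parse through a stream imbued with the classic locale. This is a sketch of the behavior only: the name stof_sketch is hypothetical, the pos parameter is omitted, and the stream-based approach is an assumption, not the library's technique.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical stand-in for illustration: converts the leading text of s to
// a float, returning 0.0 on any parse error instead of throwing.
float stof_sketch(const std::string& s)
{
    std::istringstream in(s);
    in.imbue(std::locale::classic());  // always '.' as the decimal mark
    float v = 0.0f;
    if (!(in >> v))
        return 0.0f;  // parse errors just yield 0.0
    return v;
}
```

Here stof_sketch("2.5") gives 2.5 and stof_sketch("hello") gives 0.0, whatever locale the program has installed globally.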
OIIO_UTIL_API float Strutil::stof | ( | const std::string & | s, |
size_t * | pos = 0 |
||
) |
OIIO_UTIL_API float Strutil::stof | ( | const char * | s, |
size_t * | pos = 0 |
||
) |
OIIO_UTIL_API int Strutil::stoi | ( | string_view | s, |
size_t * | pos = 0 , |
||
int | base = 10 |
||
) |
OIIO_UTIL_API unsigned int Strutil::stoui | ( | string_view | s, |
size_t * | pos = 0 , |
||
int | base = 10 |
||
) |
|
inline |
Hash a string_view. This is OIIO's default favorite string hasher. Currently, it uses farmhash, is constexpr (for C++14), and works in Cuda. This is rigged, though, so that empty strings always hash to 0 (that isn't what a raw farmhash would give you, but it's a useful property, especially for trivial initialization).
OIIO_UTIL_API bool Strutil::string_is_float | ( | string_view | s | ) |
Return true if the string is exactly (other than leading or trailing whitespace) a valid float. This operates in a locale-independent manner, i.e., it assumes '.' as the decimal mark.
OIIO_UTIL_API bool Strutil::string_is_int | ( | string_view | s | ) |
Return true if the string is exactly (other than leading and trailing whitespace) a valid int.
string_view OIIO_UTIL_API Strutil::strip | ( | string_view | str, |
string_view | chars = string_view() |
||
) |
Return a reference to the section of str that has all consecutive characters in chars removed from the beginning and ending. If chars is empty, it will be interpreted as " \t\n\r\f\v" (whitespace).
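Because the result is a view into the original characters, stripping needs no allocation. A minimal sketch using std::string_view searches follows; the name strip_sketch is hypothetical and the code only models the description above.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for illustration: returns the part of str left after
// removing leading and trailing characters found in chars.
std::string_view strip_sketch(std::string_view str, std::string_view chars = {})
{
    if (chars.empty())
        chars = " \t\n\r\f\v";  // empty means "strip whitespace"
    const std::size_t b = str.find_first_not_of(chars);
    if (b == std::string_view::npos)
        return {};  // every character was stripped
    const std::size_t e = str.find_last_not_of(chars);
    return str.substr(b, e - b + 1);
}
```

For example, strip_sketch("xxhixx", "x") yields "hi". The returned view is only valid while the underlying characters remain alive.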
|
noexcept |
|
noexcept |
strtod/strtof equivalents that are "locale-independent", always using '.' as the decimal separator. This should be preferred for I/O and other situations where you want the same standard formatting regardless of locale.
void OIIO_UTIL_API Strutil::sync_output | ( | FILE * | file, |
string_view | str | ||
) |
Output the string to the file/stream in a synchronized fashion, so that buffers are flushed and an internal mutex is used to prevent threads from clobbering each other. Output strings coming from concurrent threads may be interleaved, but each string is "atomic" and will never be spliced with another character by character.
void OIIO_UTIL_API Strutil::sync_output | ( | std::ostream & | file, |
string_view | str | ||
) |
std::string OIIO_UTIL_API Strutil::timeintervalformat | ( | double | secs, |
int | digits = 1 |
||
) |
Return a string expressing an elapsed time, in human readable form. e.g. "0:35.2"
void OIIO_UTIL_API Strutil::to_upper | ( | std::string & | a | ) |
Convert to upper case in place, faster than std::toupper because we use a static locale that doesn't require a mutex lock.
std::string OIIO_UTIL_API Strutil::unescape_chars | ( | string_view | escaped | ) |
Take a string that has embedded escape sequences (\\, \", \n, etc.) and collapse them into the 'real' characters.
|
noexcept |
Conversion from wstring UTF-16 to a UTF-8 std::string. This is the standard way to convert from Windows wide character strings used for filenames into the UTF-8 strings OIIO expects for filenames when passed to functions like ImageInput::open().
void OIIO_UTIL_API Strutil::utf8_to_unicode | ( | string_view | str, |
std::vector< uint32_t > & | uvec | ||
) |
Converts utf-8 string to vector of unicode codepoints. This function will not stop on invalid sequences. It will let through some invalid utf-8 sequences like: 0xfdd0-0xfdef, 0x??fffe/0x??ffff. It does not support 5-6 bytes long utf-8 sequences. Will skip trailing character if there are not enough bytes for decoding a codepoint.
N.B. Following should probably return u32string instead of taking vector, but C++11 support is not yet stabilized across compilers. We will eventually add that and deprecate this one, after everybody is caught up to C++11.
|
noexcept |
Conversion of normal char-based strings (presumed to be UTF-8 encoding) to wide char string, wstring.
std::string OIIO_UTIL_API Strutil::vformat | ( | const char * | fmt, |
va_list | ap | ||
) |
Return a std::string formatted like Strutil::format, but passed already as a va_list. This is not guaranteed type-safe and is not extensible like format(). Use with caution!
std::string OIIO_UTIL_API Strutil::vsprintf | ( | const char * | fmt, |
va_list | ap | ||
) |
Return a std::string formatted from printf-like arguments – passed already as a va_list. This is not guaranteed type-safe and is not extensible like format(). Use with caution!
std::string OIIO_UTIL_API Strutil::wordwrap | ( | string_view | src, |
int | columns = 80 , |
||
int | prefix = 0 , |
||
string_view | sep = " " , |
||
string_view | presep = "" |
||
) |
Word-wrap string src to no more than columns width, starting with an assumed position of prefix on the first line and indenting by prefix blanks before all lines other than the first.
Words may be split AT any characters in sep or immediately AFTER any characters in presep. After the break, any extra sep characters will be deleted.
By illustration, wordwrap("0 1 2 3 4 5 6 7 8", 10, 4) should return: "0 1 2\n    3 4 5\n    6 7 8"