|
Thread System 0.3.1
High-performance C++20 thread pool with work stealing and DAG scheduling
|
Provides utilities for string encoding conversion, Base64 encoding/decoding, and substring operations like splitting or replacing.
#include <convert_string.h>

Public Member Functions | |
| template<typename FromType , typename ToType > | |
| auto | convert (const FromType &value, const std::string &from_encoding, const std::string &to_encoding) -> std::tuple< std::optional< ToType >, std::optional< std::string > > |
Static Public Member Functions | |
| static auto | to_string (const std::wstring &value) -> std::tuple< std::optional< std::string >, std::optional< std::string > > |
Converts a std::wstring to a std::string using the system encoding. | |
| static auto | to_string (std::wstring_view value) -> std::tuple< std::optional< std::string >, std::optional< std::string > > |
Converts a std::wstring_view to a std::string using the system encoding. | |
| static auto | to_wstring (const std::string &value) -> std::tuple< std::optional< std::wstring >, std::optional< std::string > > |
Converts a std::string (system-encoded) to a std::wstring. | |
| static auto | to_wstring (std::string_view value) -> std::tuple< std::optional< std::wstring >, std::optional< std::string > > |
Converts a std::string_view (system-encoded) to a std::wstring. | |
| static auto | get_system_code_page () -> int |
| Retrieves the system code page used for conversions. | |
| static auto | system_to_utf8 (const std::string &value) -> std::tuple< std::optional< std::string >, std::optional< std::string > > |
| Converts a system-encoded string to UTF-8. | |
| static auto | utf8_to_system (const std::string &value) -> std::tuple< std::optional< std::string >, std::optional< std::string > > |
| Converts a UTF-8 encoded string to the system encoding. | |
| static auto | split (const std::string &source, const std::string &token) -> std::tuple< std::optional< std::vector< std::string > >, std::optional< std::string > > |
| Splits a string by a given delimiter. | |
| static auto | to_array (const std::string &value) -> std::tuple< std::optional< std::vector< uint8_t > >, std::optional< std::string > > |
| Converts a system-encoded string to a UTF-8 byte array. | |
| static auto | to_string (const std::vector< uint8_t > &value) -> std::tuple< std::optional< std::string >, std::optional< std::string > > |
| Converts a UTF-8 byte array to a system-encoded string. | |
| static auto | to_base64 (const std::vector< uint8_t > &value) -> std::tuple< std::optional< std::string >, std::optional< std::string > > |
| Encodes a byte array into a Base64 string. | |
| static auto | from_base64 (const std::string &base64_str) -> std::tuple< std::vector< uint8_t >, std::optional< std::string > > |
| Decodes a Base64 string into a byte array. | |
| static auto | replace (std::string &source, const std::string &token, const std::string &target) -> std::optional< std::string > |
Replaces all occurrences of token in source with target, in place. | |
| static auto | replace2 (const std::string &source, const std::string &token, const std::string &target) -> std::tuple< std::optional< std::string >, std::optional< std::string > > |
Replaces all occurrences of token in source with target, returning a new string. | |
Private Types | |
| enum class | endian_types { little , big , unknown } |
| Possible endianness values for UTF-16 or UTF-32 data. | |
| enum class | encoding_types { utf8 , utf16 , utf32 } |
| Supported encoding types for Unicode conversion. | |
Static Private Member Functions | |
| static auto | get_code_page_name (int code_page) -> std::string |
| Retrieves a textual name for a code page (e.g., "CP_ACP" or a locale-based name). | |
| static auto | get_encoding_name (encoding_types encoding, endian_types endian=endian_types::little) -> std::string |
| Returns the encoding name string for the given encoding type and endianness. | |
| static auto | get_wchar_encoding (endian_types endian=endian_types::little) -> std::string |
| Derives the wchar_t encoding name based on its size and endianness. | |
| template<typename FromType , typename ToType > | |
| static auto | convert (const FromType &value, const std::string &from_encoding, const std::string &to_encoding) -> std::tuple< std::optional< ToType >, std::optional< std::string > > |
| Converts from one encoding to another using simdutf. | |
| static auto | detect_endian (const std::u16string &value) -> endian_types |
| Detects the endianness of a UTF-16 string by checking for BOM or content patterns. | |
| static auto | detect_endian (const std::u32string &value) -> endian_types |
| Detects the endianness of a UTF-32 string by checking for BOM or content patterns. | |
| static auto | has_utf8_bom (const std::string &value) -> bool |
| Checks if a string has a UTF-8 BOM (Byte Order Mark). | |
| static auto | remove_utf8_bom (const std::string &value) -> std::string |
| Removes a leading UTF-8 BOM from a string, if present. | |
| static auto | add_utf8_bom (const std::string &value) -> std::string |
| Adds a UTF-8 BOM to a string if it doesn't already have one. | |
| static auto | base64_encode (const std::vector< uint8_t > &data) -> std::string |
| Encodes a byte array into a Base64 string. | |
| static auto | base64_decode (const std::string &base64_str) -> std::tuple< std::vector< uint8_t >, std::optional< std::string > > |
| Decodes a Base64 string into a byte array. | |
Provides utilities for string encoding conversion, Base64 encoding/decoding, and substring operations like splitting or replacing.
This class uses simdutf for SIMD-accelerated Unicode transcoding between UTF-8, UTF-16, and UTF-32 on all platforms. It also provides Base64 encoding/decoding and basic string manipulation (split, replace).
Most functions return a std::tuple of a result and an error: either a valid result (std::optional<T>) and an empty error (std::nullopt), or std::nullopt with an error message describing the failure.
from_base64 differs in that its first tuple element is the decoded data itself (std::vector<uint8_t>) rather than a std::optional.
Definition at line 52 of file convert_string.h.
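The result/error convention described above pairs naturally with structured bindings. The following is a minimal sketch of that pattern; `first_token` and `first_token_or` are hypothetical stand-ins that only mimic the tuple shape and are not part of convert_string.

```cpp
#include <optional>
#include <string>
#include <tuple>

// Hypothetical helper (not part of convert_string) that follows the
// documented convention: {value, std::nullopt} on success,
// {std::nullopt, message} on failure.
inline auto first_token(const std::string& source, const std::string& token)
    -> std::tuple<std::optional<std::string>, std::optional<std::string>>
{
    if (token.empty()) {
        return { std::nullopt, std::string("token is empty") };
    }
    // find() returns npos when token is absent, so substr yields the whole string.
    return { source.substr(0, source.find(token)), std::nullopt };
}

// Typical call site: inspect the error slot first, then use the value.
inline auto first_token_or(const std::string& source, const std::string& token,
                           const std::string& fallback) -> std::string
{
    auto [value, error] = first_token(source, token);
    if (error.has_value()) {
        return fallback;
    }
    return value.value();
}
```

Checking the error element before touching the value keeps the call site explicit about which of the two states it is handling.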
|
strongprivate |
Supported encoding types for Unicode conversion.
| Enumerator | |
|---|---|
| utf8 | UTF-8 encoding. |
| utf16 | UTF-16 encoding. |
| utf32 | UTF-32 encoding. |
Definition at lines 216 and 222 of file convert_string.h.
|
strongprivate |
Possible endianness values for UTF-16 or UTF-32 data.
| Enumerator | |
|---|---|
| little | Little-endian. |
| big | Big-endian. |
| unknown | Unknown or not applicable. |
Definition at lines 208 and 214 of file convert_string.h.
|
staticprivate |
Adds a UTF-8 BOM to a string if it doesn't already have one.
| value | The input string (UTF-8 encoded). |
References base64_decode(), and base64_encode().

|
staticprivate |
Decodes a Base64 string into a byte array.
| base64_str | The Base64 string. |
std::vector<uint8_t>: The decoded bytes (empty on failure).
std::optional<std::string>: An error message if decoding fails, otherwise std::nullopt.
Definition at line 554 of file convert_string.cpp.
Referenced by add_utf8_bom().

|
staticprivate |
Encodes a byte array into a Base64 string.
| data | The raw byte array. |
Definition at line 520 of file convert_string.cpp.
Referenced by add_utf8_bom().

|
staticprivate |
Converts from one encoding to another using simdutf.
| FromType | The input string type. |
| ToType | The output string type. |
| value | The input string. |
| from_encoding | Source encoding name (e.g., "UTF-16LE"). |
| to_encoding | Target encoding name (e.g., "UTF-8"). |
std::optional<ToType>: The converted string on success, or std::nullopt on failure.
std::optional<std::string>: The error message if conversion fails.
auto utility_module::convert_string::convert (const FromType &value, const std::string &from_encoding, const std::string &to_encoding) -> std::tuple<std::optional<ToType>, std::optional<std::string>>
Definition at line 36 of file convert_string.cpp.
|
staticprivate |
Detects the endianness of a UTF-16 string by checking for BOM or content patterns.
| value | The UTF-16 string. |
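The BOM half of this check can be illustrated at the byte level. This is only a sketch under assumptions: `endian_kind` and `detect_utf16_bom` are hypothetical names, the function works on raw bytes rather than on a std::u16string, and the content-pattern fallback mentioned above is not attempted.

```cpp
#include <string_view>

// Hypothetical mirror of the private endian_types enum.
enum class endian_kind { little, big, unknown };

// Inspects the first two raw bytes of a UTF-16 stream for a byte order mark.
inline auto detect_utf16_bom(std::string_view bytes) -> endian_kind
{
    if (bytes.size() < 2) {
        return endian_kind::unknown;
    }
    const auto b0 = static_cast<unsigned char>(bytes[0]);
    const auto b1 = static_cast<unsigned char>(bytes[1]);
    if (b0 == 0xFF && b1 == 0xFE) {
        return endian_kind::little;  // U+FEFF stored low byte first
    }
    if (b0 == 0xFE && b1 == 0xFF) {
        return endian_kind::big;     // U+FEFF stored high byte first
    }
    return endian_kind::unknown;     // no BOM; a real detector may inspect content
}
```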
|
staticprivate |
Detects the endianness of a UTF-32 string by checking for BOM or content patterns.
| value | The UTF-32 string. |
|
static |
Decodes a Base64 string into a byte array.
| base64_str | The Base64-encoded string. |
std::vector<uint8_t>: The decoded data. May be empty if decoding fails.
std::optional<std::string>: An error message on failure, or std::nullopt on success.
Definition at line 470 of file convert_string.cpp.
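To make the return shape concrete, here is a small standalone decoder for the standard Base64 alphabet. It is a sketch, not the library's implementation: `base64_decode_sketch` is a hypothetical name, and its validation (length multiple of four, no data after padding) is an assumption about reasonable behavior.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <tuple>
#include <vector>

inline auto base64_decode_sketch(const std::string& input)
    -> std::tuple<std::vector<uint8_t>, std::optional<std::string>>
{
    static const std::string alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    if (input.size() % 4 != 0) {
        return { std::vector<uint8_t>{}, std::string("length is not a multiple of 4") };
    }

    std::vector<uint8_t> out;
    uint32_t buffer = 0;   // accumulates 6-bit groups
    int bits = 0;          // number of not-yet-emitted bits in buffer
    bool padding_seen = false;

    for (char c : input) {
        if (c == '=') {
            padding_seen = true;
            continue;
        }
        const auto pos = alphabet.find(c);
        if (padding_seen || pos == std::string::npos) {
            return { std::vector<uint8_t>{}, std::string("invalid Base64 input") };
        }
        buffer = (buffer << 6) | static_cast<uint32_t>(pos);
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            out.push_back(static_cast<uint8_t>((buffer >> bits) & 0xFF));
        }
    }
    return { out, std::nullopt };
}
```

Because the first element is a plain vector, callers should rely on the error element, not on emptiness, to tell a failed decode from a legitimately empty input.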
|
staticprivate |
Retrieves a textual name for a code page (e.g., "CP_ACP" or a locale-based name).
| code_page | The code page integer. |
Definition at line 374 of file convert_string.cpp.
|
staticprivate |
Returns the encoding name string for the given encoding type and endianness.
| encoding | The base encoding (UTF-8, UTF-16, or UTF-32). |
| endian | The byte order (little, big) if applicable. |
Definition at line 248 of file convert_string.cpp.
|
static |
Retrieves the system code page used for conversions.
On Unix-like systems, returns UTF-8 (65001) as the default. On Windows, uses GetACP() to detect the system code page.
Definition at line 365 of file convert_string.cpp.
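The platform split described above might look roughly like the following. `system_code_page_sketch` is a hypothetical name; the only facts taken from the text are the use of GetACP() on Windows and the value 65001 elsewhere.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

inline auto system_code_page_sketch() -> int
{
#ifdef _WIN32
    // Active ANSI code page reported by the operating system.
    return static_cast<int>(::GetACP());
#else
    // Code page identifier for UTF-8, the documented default on Unix-like systems.
    return 65001;
#endif
}
```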
|
staticprivate |
Derives the wchar_t encoding name based on its size and endianness.
| endian | The byte order. Defaults to little-endian. |
Definition at line 274 of file convert_string.cpp.
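One plausible way to derive such a name is to branch on sizeof(wchar_t). This is a sketch under assumptions: `endian_kind` and `wchar_encoding_sketch` are hypothetical, the name strings follow the "UTF-16LE" style used elsewhere on this page, and treating an unknown byte order as little-endian is a guess rather than documented behavior.

```cpp
#include <string>

// Hypothetical mirror of the private endian_types enum.
enum class endian_kind { little, big, unknown };

inline auto wchar_encoding_sketch(endian_kind endian = endian_kind::little) -> std::string
{
    // wchar_t is 2 bytes on Windows and typically 4 bytes on Unix-like systems.
    const std::string base = (sizeof(wchar_t) == 2) ? "UTF-16" : "UTF-32";
    // Assumption: anything other than big is reported as little-endian.
    return base + (endian == endian_kind::big ? "BE" : "LE");
}
```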
|
staticprivate |
Checks if a string has a UTF-8 BOM (Byte Order Mark).
| value | The input string (UTF-8 encoded). |
Returns true if the string starts with the UTF-8 BOM (0xEF, 0xBB, 0xBF), otherwise false.
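The three BOM helpers on this page (check, remove, add) reduce to a prefix comparison against the bytes 0xEF 0xBB 0xBF. The sketch below uses hypothetical names (`has_bom`, `remove_bom`, `add_bom`) and is not the library's code.

```cpp
#include <string>

// The UTF-8 byte order mark.
inline const std::string utf8_bom = "\xEF\xBB\xBF";

inline auto has_bom(const std::string& value) -> bool
{
    // compare() clamps the length, so strings shorter than the BOM never match.
    return value.compare(0, utf8_bom.size(), utf8_bom) == 0;
}

inline auto remove_bom(const std::string& value) -> std::string
{
    return has_bom(value) ? value.substr(utf8_bom.size()) : value;
}

inline auto add_bom(const std::string& value) -> std::string
{
    // Idempotent: a string that already has a BOM is returned unchanged.
    return has_bom(value) ? value : utf8_bom + value;
}
```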
|
staticprivate |
Removes a leading UTF-8 BOM from a string, if present.
| value | The input string (UTF-8 encoded). |
|
static |
Replaces all occurrences of token in source with target, in place.
| source | The string to modify. |
| token | The substring to find. |
| target | The substring to replace token with. |
std::optional<std::string> containing an error message if replacement fails, or std::nullopt on success.
This function modifies source directly. If token is empty, it returns an error.
Definition at line 476 of file convert_string.cpp.
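An in-place replace-all with the documented empty-token error can be sketched as follows. `replace_all_sketch` is a hypothetical name and the error text is made up; only the contract (modify source, return an error for an empty token, std::nullopt otherwise) comes from the description above.

```cpp
#include <cstddef>
#include <optional>
#include <string>

inline auto replace_all_sketch(std::string& source, const std::string& token,
                               const std::string& target) -> std::optional<std::string>
{
    if (token.empty()) {
        // An empty token would match at every position and never terminate.
        return std::string("token is empty");
    }
    std::size_t pos = 0;
    while ((pos = source.find(token, pos)) != std::string::npos) {
        source.replace(pos, token.size(), target);
        pos += target.size();  // skip the inserted text so it is not rescanned
    }
    return std::nullopt;
}
```

Advancing past the inserted text matters when target contains token, for example replacing "-" with "--".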
|
static |
Replaces all occurrences of token in source with target, returning a new string.
| source | The source string (unchanged). |
| token | The substring to find. |
| target | The substring to replace token with. |
std::optional<std::string>: The modified string on success.
std::optional<std::string>: An error message if replacement fails.
Definition at line 490 of file convert_string.cpp.
|
static |
Splits a string by a given delimiter.
| source | The source string to split. |
| token | The delimiter substring. |
std::optional<std::vector<std::string>>: A vector of tokens on success.
std::optional<std::string>: An error message if splitting fails.
If token is empty, splitting cannot proceed and an error is returned.
Definition at line 409 of file convert_string.cpp.
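The splitting contract can be illustrated with a short standalone function. This is a sketch: `split_sketch` is a hypothetical name, and keeping empty fields between adjacent delimiters is an assumption, since the page does not say how the library treats them.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <tuple>
#include <vector>

inline auto split_sketch(const std::string& source, const std::string& token)
    -> std::tuple<std::optional<std::vector<std::string>>, std::optional<std::string>>
{
    if (token.empty()) {
        return { std::nullopt, std::string("token is empty") };
    }
    std::vector<std::string> parts;
    std::size_t start = 0;
    for (;;) {
        const std::size_t end = source.find(token, start);
        if (end == std::string::npos) {
            parts.push_back(source.substr(start));  // trailing field
            break;
        }
        parts.push_back(source.substr(start, end - start));
        start = end + token.size();
    }
    return { parts, std::nullopt };
}
```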
|
static |
Converts a system-encoded string to UTF-8.
| value | The input string in system encoding. |
std::optional<std::string>: The UTF-8 encoded string on success.
std::optional<std::string>: An error message on failure.
Definition at line 385 of file convert_string.cpp.
|
static |
Converts a system-encoded string to a UTF-8 byte array.
| value | The input string in system encoding. |
std::optional<std::vector<uint8_t>>: The UTF-8 byte array on success.
std::optional<std::string>: An error message on failure.
Definition at line 433 of file convert_string.cpp.
|
static |
Encodes a byte array into a Base64 string.
| value | The raw byte array to encode. |
std::optional<std::string>: The Base64-encoded string on success.
std::optional<std::string>: An error message if encoding fails.
Typically, Base64 encoding should not fail unless the input is extremely large and memory allocation fails.
Definition at line 456 of file convert_string.cpp.
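For reference, standard Base64 encoding with '=' padding can be written in a few lines. The function below is a sketch with a hypothetical name (`base64_encode_sketch`); it returns a plain string and does not reproduce the library's tuple wrapper or error handling.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

inline auto base64_encode_sketch(const std::vector<uint8_t>& data) -> std::string
{
    static const char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    std::string out;
    out.reserve(((data.size() + 2) / 3) * 4);

    // Each group of up to 3 input bytes becomes 4 output characters.
    for (std::size_t i = 0; i < data.size(); i += 3) {
        const bool has_b1 = i + 1 < data.size();
        const bool has_b2 = i + 2 < data.size();
        const uint32_t b0 = data[i];
        const uint32_t b1 = has_b1 ? data[i + 1] : 0;
        const uint32_t b2 = has_b2 ? data[i + 2] : 0;
        const uint32_t triple = (b0 << 16) | (b1 << 8) | b2;

        out.push_back(alphabet[(triple >> 18) & 0x3F]);
        out.push_back(alphabet[(triple >> 12) & 0x3F]);
        out.push_back(has_b1 ? alphabet[(triple >> 6) & 0x3F] : '=');
        out.push_back(has_b2 ? alphabet[triple & 0x3F] : '=');
    }
    return out;
}
```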
|
static |
Converts a UTF-8 byte array to a system-encoded string.
| value | The input byte array. |
std::optional<std::string>: The system-encoded string on success.
std::optional<std::string>: An error message on failure.
Definition at line 447 of file convert_string.cpp.
|
static |
Converts a std::wstring to a std::string using the system encoding.
| value | The wide-string input. |
std::optional<std::string>: The converted narrow string on success, or std::nullopt on failure.
std::optional<std::string>: The error message if conversion fails, otherwise std::nullopt.
Uses simdutf for SIMD-accelerated UTF-8/UTF-16/UTF-32 transcoding on all platforms.
Definition at line 188 of file convert_string.cpp.
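To show what a wide-to-narrow conversion involves when the target is UTF-8, here is a hand-written scalar encoder. It is explicitly not the simdutf-based implementation described above: `append_utf8` and `wide_to_utf8_sketch` are hypothetical names, and the sketch assumes the system encoding is UTF-8.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <tuple>

// Appends one Unicode code point to out as 1 to 4 UTF-8 bytes.
inline void append_utf8(std::string& out, char32_t cp)
{
    if (cp < 0x80) {
        out.push_back(static_cast<char>(cp));
    } else if (cp < 0x800) {
        out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {
        out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else {
        out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
}

inline auto wide_to_utf8_sketch(const std::wstring& value)
    -> std::tuple<std::optional<std::string>, std::optional<std::string>>
{
    std::string out;
    for (std::size_t i = 0; i < value.size(); ++i) {
        char32_t cp = static_cast<char32_t>(value[i]);
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: expected when wchar_t holds UTF-16 code units.
            if (i + 1 >= value.size()) {
                return { std::nullopt, std::string("unpaired high surrogate") };
            }
            const char32_t low = static_cast<char32_t>(value[i + 1]);
            if (low < 0xDC00 || low > 0xDFFF) {
                return { std::nullopt, std::string("unpaired high surrogate") };
            }
            cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
            ++i;
        } else if ((cp >= 0xDC00 && cp <= 0xDFFF) || cp > 0x10FFFF) {
            return { std::nullopt, std::string("invalid code point") };
        }
        append_utf8(out, cp);
    }
    return { out, std::nullopt };
}
```

A SIMD library processes many code units per instruction, but the validity rules it enforces are the same ones this scalar loop checks one unit at a time.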
|
static |
Converts a std::wstring_view to a std::string using the system encoding.
| value | The wide-string view. |
See also to_string(const std::wstring&).
This is a convenience overload for handling string views without copying the entire wide string first.
Definition at line 203 of file convert_string.cpp.
|
static |
Converts a std::string (system-encoded) to a std::wstring.
| value | The narrow string in system encoding. |
std::optional<std::wstring>: The converted wide string on success.
std::optional<std::string>: An error message if conversion fails.
Definition at line 218 of file convert_string.cpp.
Referenced by std::formatter< kcenon::thread::job_types, wchar_t >::format(), std::formatter< kcenon::thread::thread_conditions, wchar_t >::format(), std::formatter< kcenon::thread::thread_pool, wchar_t >::format(), std::formatter< kcenon::thread::thread_worker, wchar_t >::format(), std::formatter< kcenon::thread::typed_thread_pool_t< job_type >, wchar_t >::format(), and std::formatter< kcenon::thread::typed_thread_worker_t< job_type >, wchar_t >::format().

|
static |
Converts a std::string_view (system-encoded) to a std::wstring.
| value | The narrow string view in system encoding. |
See also to_wstring(const std::string&).
Definition at line 233 of file convert_string.cpp.
|
static |
Converts a UTF-8 encoded string to the system encoding.
| value | The UTF-8 encoded input string. |
See also system_to_utf8().
Definition at line 397 of file convert_string.cpp.