UTF8
-
class HPS::UTF8
The UTF8 class encapsulates a utf8 encoded array of characters and allows for easy encoding and decoding.
Public Functions
-
UTF8 &Assign(UTF8 &&in_utf8)
Moves the source UTF8 object to this object. This method is functionally equivalent to the overloaded assignment operator.
- Parameters
in_utf8 – The source of the move.
- Returns
A reference to this object.
-
UTF8 &Assign(UTF8 const &in_utf8)
Copies the source UTF8 object to this object. This method is functionally equivalent to the overloaded assignment operator.
- Parameters
in_utf8 – The source of the copy.
- Returns
A reference to this object.
-
inline char At(size_t in_index) const
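Retrieves the byte at the specified index of the utf8 encoded character array. Because indexing is by byte rather than by code point, the returned value may be only one part of a multi-byte code point.
- Parameters
in_index – The index of the byte to retrieve.
- Returns
The byte at the specified index.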
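To illustrate why byte indexing can land inside a code point, the following is a minimal standalone sketch in standard C++. It does not use the HPS API; `is_continuation_byte` and `starts_code_point` are hypothetical helper names introduced only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper (not part of HPS): a UTF-8 continuation byte has
// the bit pattern 10xxxxxx, so it never begins a code point.
inline bool is_continuation_byte(char c) {
    return (static_cast<unsigned char>(c) & 0xC0) == 0x80;
}

// Hypothetical helper (not part of HPS): reports whether the byte at
// `index` is the first byte of a code point in a null-terminated string.
inline bool starts_code_point(char const* utf8, std::size_t index) {
    assert(utf8 != nullptr && index < std::strlen(utf8));
    return !is_continuation_byte(utf8[index]);
}
```

For the string "h\xC3\xA9" (the letter h followed by U+00E9), indices 0 and 1 begin code points, while index 2 is a continuation byte. A check of this kind can be applied to a byte retrieved by index before treating it as a complete character.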
-
void Clear()
Resets all string data.
-
inline bool Empty() const
Indicates whether this utf8 string is empty.
- Returns
true if the UTF8 string is empty, false otherwise.
-
inline char const *GetBytes() const
Retrieves the raw, utf8 encoded character array.
- Returns
The utf8 encoded character array.
-
size_t GetHash() const
Returns a hash code for the utf8 encoded characters.
- Returns
The size_t hash code.
-
inline size_t GetLength() const
Retrieves the number of bytes in the utf8 encoded string up to but not including the null terminator. This will return 0 if the utf8 object is uninitialized.
- Returns
The number of bytes.
-
inline size_t GetWStrLength() const
Retrieves the number of wide characters in the wchar_t string up to but not including the null terminator. This will return 0 if the utf8 object is uninitialized.
- Returns
The number of wide characters.
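The byte length and the wide character length of the same text generally differ once non-ASCII characters are involved. The sketch below shows the distinction in standard C++ without using the HPS API; `byte_length` and `code_point_count` are hypothetical names, and the input is assumed to be well-formed utf8.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper (not part of HPS): number of bytes before the
// null terminator.
inline std::size_t byte_length(char const* utf8) {
    assert(utf8 != nullptr);
    return std::strlen(utf8);
}

// Hypothetical helper (not part of HPS): number of code points, found by
// counting every byte that is not a 10xxxxxx continuation byte.
// Assumes well-formed input; malformed sequences are not diagnosed.
inline std::size_t code_point_count(char const* utf8) {
    assert(utf8 != nullptr);
    std::size_t count = 0;
    for (; *utf8 != '\0'; ++utf8) {
        if ((static_cast<unsigned char>(*utf8) & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

For "na\xC3\xAF" "ve" the byte length is 6 and the code point count is 5. Note that on platforms where wchar_t is 16 bits wide, characters outside the Basic Multilingual Plane occupy two wide characters, so a wide character count may exceed the code point count there.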
-
inline bool IsValid() const
Indicates whether this utf8 string has been initialized.
- Returns
true if the UTF8 string has been initialized, false otherwise.
-
inline operator char const*() const
Allows typecasting to const char * by retrieving the raw, utf8 encoded character array.
- Returns
The utf8 encoded character array.
-
inline bool operator!=(char const *in_utf8) const
This function is used to check whether a utf8-encoded character string is not equivalent to this object.
- Parameters
in_utf8 – The utf8-encoded character string to compare to this object.
- Returns
true if the objects are not equivalent, false otherwise.
-
inline bool operator!=(UTF8 const &in_utf8) const
This function is used to check an object for equivalence to this.
- Parameters
in_utf8 – The object to compare to this.
- Returns
true if the objects are not equivalent, false otherwise.
-
UTF8 operator+(char const *in_utf8) const
Creates a new UTF8 object by appending a utf8 encoded string to the end of this object.
- Parameters
in_utf8 – A string, assumed to be utf8 encoded, used as the tail end of the new string.
- Returns
A new UTF8 object representing the concatenation of 2 strings.
-
UTF8 operator+(UTF8 const &in_utf8) const
Creates a new UTF8 object by appending a UTF8 object to the end of this object.
- Parameters
in_utf8 – The tail end of the new string.
- Returns
A new UTF8 object representing the concatenation of 2 strings.
-
UTF8 &operator+=(char const *in_utf8)
Appends a utf8 encoded string to the end of this object.
- Parameters
in_utf8 – A string, assumed to be utf8 encoded, used as the tail end of the new string.
- Returns
A reference to this object.
-
UTF8 &operator+=(UTF8 const &in_utf8)
Appends a UTF8 object to the end of this object.
- Parameters
in_utf8 – The tail end of the new string.
- Returns
A reference to this object.
-
inline UTF8 &operator=(UTF8 &&in_utf8)
The move assignment operator takes control of the underlying data from the source utf8 string.
- Parameters
in_utf8 – The source of the move.
- Returns
A reference to this object.
-
inline UTF8 &operator=(UTF8 const &in_utf8)
Copies the source UTF8 object to this object.
- Parameters
in_utf8 – The source of the copy.
- Returns
A reference to this object.
-
bool operator==(char const *in_utf8) const
This function is used to check whether a utf8-encoded character string is equivalent to this object.
- Parameters
in_utf8 – The utf8-encoded character string to compare to this object.
- Returns
true if the objects are equivalent, false otherwise.
-
bool operator==(UTF8 const &in_utf8) const
This function is used to check an object for equivalence to this.
- Parameters
in_utf8 – The object to compare to this.
- Returns
true if the objects are equivalent, false otherwise.
-
inline void Reset()
Resets this object to its initial, uninitialized state.
-
size_t ToWStr(wchar_t *out_wide_string) const
Decodes the utf8 encoded string into a wide character buffer.
- Parameters
out_wide_string – The wide character buffer that receives the decoded string.
- Returns
The number of wide characters (code points) in the wide string.
-
size_t ToWStr(WCharArray &out_wide_string) const
Decodes the utf8 encoded string into a wide character array.
- Parameters
out_wide_string – The wide character array that receives the decoded string.
- Returns
The number of wide characters (code points) in the wide string.
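As a rough picture of what decoding utf8 into code points involves, here is a minimal standalone decoder in standard C++. It is a sketch of the general technique, not the HPS implementation: `decode_utf8` is a hypothetical name, the output type is std::u32string rather than a wchar_t buffer so the result does not depend on the platform's wchar_t width, and the input is assumed to be well-formed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of HPS): decodes well-formed,
// null-terminated UTF-8 into UTF-32 code points.
// Malformed input is not diagnosed.
inline std::u32string decode_utf8(char const* utf8) {
    assert(utf8 != nullptr);
    std::u32string out;
    auto const* p = reinterpret_cast<unsigned char const*>(utf8);
    while (*p != 0) {
        char32_t cp;
        int extra;  // number of continuation bytes that follow the lead byte
        if (*p < 0x80)      { cp = *p;        extra = 0; }  // 0xxxxxxx
        else if (*p < 0xE0) { cp = *p & 0x1F; extra = 1; }  // 110xxxxx
        else if (*p < 0xF0) { cp = *p & 0x0F; extra = 2; }  // 1110xxxx
        else                { cp = *p & 0x07; extra = 3; }  // 11110xxx
        ++p;
        // Each continuation byte contributes its low six bits.
        for (; extra > 0 && *p != 0; --extra, ++p) {
            cp = (cp << 6) | (*p & 0x3F);
        }
        out.push_back(cp);
    }
    return out;
}
```

With this sketch, the three bytes "\xE2\x82\xAC" decode to the single code point U+20AC, so the returned length counts code points rather than bytes.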
-
UTF8(char const *in_string, char const *in_locale = 0)
This constructor can be used to encode a string from any known locale to utf8. Be careful not to re-encode a string that’s already utf8 encoded.
- Parameters
in_string – The string to be encoded.
in_locale – A string identifying the source locale of in_string. If none is specified, the default locale on the local machine will be used. If in_string is already utf8 encoded, specify the locale as “utf8” to prevent re-encoding.
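The warning about re-encoding can be made concrete with a small standalone sketch in standard C++. It does not call the HPS constructor; `latin1_to_utf8` is a hypothetical helper that stands in for "encode from a source locale", with Latin-1 (ISO-8859-1) chosen as the assumed source locale for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of HPS): treats every input byte as a
// Latin-1 (ISO-8859-1) character and encodes it as UTF-8.
inline std::string latin1_to_utf8(char const* latin1) {
    assert(latin1 != nullptr);
    std::string out;
    for (; *latin1 != '\0'; ++latin1) {
        auto const c = static_cast<unsigned char>(*latin1);
        if (c < 0x80) {
            // ASCII maps to itself.
            out.push_back(static_cast<char>(c));
        } else {
            // U+0080..U+00FF become a two-byte sequence.
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```

Encoding the Latin-1 byte 0xE9 once yields the two bytes C3 A9, which is the correct utf8 form of U+00E9. Passing that result through the same function again yields C3 83 C2 A9, which displays as two unrelated characters. This is the kind of corruption that identifying already-encoded input as utf8 is meant to avoid.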
-
UTF8(UTF8 &&in_that)
The move constructor takes control of the underlying data from the source utf8 string.
- Parameters
in_that – The source of the move.
-
UTF8(UTF8 const &in_that)
The copy constructor copies the source utf8 string.
- Parameters
in_that – The source to be copied.
-
UTF8(wchar_t const *in_string)
This constructor can be used to encode a wide character string to utf8.
- Parameters
in_string – The string to be encoded.
Friends
-
inline friend bool operator!=(char const *in_left, UTF8 const &in_right)
This function is used to check a utf8-encoded character string for equivalence to a UTF8 object.
- Parameters
in_left – A utf8-encoded character string.
in_right – A UTF8 object.
- Returns
true if the objects are not equivalent, false otherwise.
-
inline friend bool operator!=(wchar_t const *in_left, UTF8 const &in_right)
This function is used to check a wide character string for equivalence to a UTF8 object.
- Parameters
in_left – A wide character string.
in_right – A UTF8 object.
- Returns
true if the objects are not equivalent, false otherwise.
-
inline friend UTF8 operator+(char const *in_left, UTF8 const &in_right)
Creates a new UTF8 object by appending a UTF8 object to the end of a utf8-encoded character string.
-
inline friend UTF8 operator+(wchar_t const *in_left, UTF8 const &in_right)
Creates a new UTF8 object by appending a UTF8 object to the end of a wide character string.
-
inline friend bool operator==(char const *in_left, UTF8 const &in_right)
This function is used to check a utf8-encoded character string for equivalence to a UTF8 object.
- Parameters
in_left – A utf8-encoded character string.
in_right – A UTF8 object.
- Returns
true if the objects are equivalent, false otherwise.