HDK
TfUtf8CodePointView Class Reference final

#include <unicodeUtils.h>

Public Types

using const_iterator = TfUtf8CodePointIterator
 

Public Member Functions

 TfUtf8CodePointView ()=default
 
 TfUtf8CodePointView (const std::string_view &view)
 
const_iterator begin () const
 
TfUtf8CodePointIterator::PastTheEndSentinel end () const
 
const_iterator cbegin () const
 
TfUtf8CodePointIterator::PastTheEndSentinel cend () const
 
bool empty () const
 Returns true if the underlying view is empty.
 
const_iterator EndAsIterator () const
 

Detailed Description

Wrapper for a UTF-8 encoded std::string_view that can be iterated over as code points instead of bytes.

Because of the variable-length encoding, the TfUtf8CodePointView iterator is a ForwardIterator and is read-only.

std::string value{"∫dx"};
for (const auto codePoint : TfUtf8CodePointView{value}) {
    if (codePoint == TfUtf8InvalidCodePoint) {
        TF_WARN("String cannot be decoded.");
        break;
    }
}

The sentinel returned by TfUtf8CodePointView::end() is compatible with range-based for loops and the C++20 ranges library, and it avoids storing the end position three times. EndAsIterator() can be used for algorithms that require the begin and end iterators to have the same type, at the cost of storing redundant copies of the end position.

if (std::any_of(std::cbegin(codePointView), codePointView.EndAsIterator(),
                [](const auto c) { return c == TfUtf8InvalidCodePoint; }))
{
    TF_WARN("String cannot be decoded");
}
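
The trade-off between the sentinel and EndAsIterator() can be illustrated with a self-contained sketch. The types below (CodePointIter, PastTheEnd, CodePointView), the helper functions, and the constant kInvalidCodePoint are hypothetical stand-ins, not the Tf classes, and the decoder is deliberately simplified. The sketch assumes the iterator keeps both a current and an end position, which is one plausible reading of why an iterator-typed end costs more than an empty sentinel.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <string_view>

// Hypothetical stand-ins for illustration only; these are not the Tf types.
constexpr std::uint32_t kInvalidCodePoint = 0xFFFD;  // assumed marker value

struct PastTheEnd {};  // empty: carries no position at all

// Bytes claimed by a lead byte; 0 if the byte cannot start a sequence.
inline int SequenceLength(unsigned char b) {
    if (b < 0x80) return 1;
    if ((b & 0xE0) == 0xC0) return 2;
    if ((b & 0xF0) == 0xE0) return 3;
    if ((b & 0xF8) == 0xF0) return 4;
    return 0;
}

class CodePointIter {
public:
    using iterator_category = std::forward_iterator_tag;
    using value_type = std::uint32_t;
    using difference_type = std::ptrdiff_t;
    using pointer = void;
    using reference = std::uint32_t;

    CodePointIter(const char* cur, const char* end) : _cur(cur), _end(end) {}

    // Simplified decode: overlong and surrogate forms are not rejected.
    std::uint32_t operator*() const {
        assert(_cur != _end);  // dereferencing at the end is a caller error
        const auto lead = static_cast<unsigned char>(*_cur);
        const int n = SequenceLength(lead);
        if (n == 0 || _end - _cur < n) return kInvalidCodePoint;
        if (n == 1) return lead;
        std::uint32_t cp = lead & (0xFF >> (n + 1));  // payload bits of the lead byte
        for (int i = 1; i < n; ++i) {
            const auto c = static_cast<unsigned char>(_cur[i]);
            if ((c & 0xC0) != 0x80) return kInvalidCodePoint;
            cp = (cp << 6) | (c & 0x3F);
        }
        return cp;
    }
    // Skip one byte after a bad lead byte, otherwise the claimed length
    // (clamped so the iterator never moves past the end).
    CodePointIter& operator++() {
        const int n = SequenceLength(static_cast<unsigned char>(*_cur));
        _cur += (n == 0) ? 1 : std::min<std::ptrdiff_t>(n, _end - _cur);
        return *this;
    }
    CodePointIter operator++(int) { CodePointIter old = *this; ++*this; return old; }

    friend bool operator==(const CodePointIter& a, const CodePointIter& b) { return a._cur == b._cur; }
    friend bool operator!=(const CodePointIter& a, const CodePointIter& b) { return a._cur != b._cur; }
    // The sentinel comparison only needs the end the iterator already holds.
    friend bool operator==(const CodePointIter& a, PastTheEnd) { return a._cur == a._end; }
    friend bool operator!=(const CodePointIter& a, PastTheEnd) { return a._cur != a._end; }

private:
    const char* _cur;
    const char* _end;
};

static_assert(sizeof(PastTheEnd) < sizeof(CodePointIter),
              "the sentinel is smaller than an iterator-typed end");

class CodePointView {
public:
    explicit CodePointView(std::string_view v) : _view(v) {}
    CodePointIter begin() const { return {_view.data(), _view.data() + _view.size()}; }
    PastTheEnd end() const { return {}; }
    CodePointIter EndAsIterator() const {
        const char* e = _view.data() + _view.size();
        return {e, e};  // the end position is repeated inside this iterator
    }

private:
    std::string_view _view;
};

// Range-based for accepts distinct begin()/end() types since C++17.
inline std::size_t CountCodePoints(std::string_view text) {
    std::size_t n = 0;
    for (const std::uint32_t cp : CodePointView{text}) { (void)cp; ++n; }
    return n;
}

// Classic <algorithm> calls need both iterators to share one type.
inline bool HasInvalid(std::string_view text) {
    const CodePointView view{text};
    return std::any_of(view.begin(), view.EndAsIterator(),
                       [](std::uint32_t c) { return c == kInvalidCodePoint; });
}
```

In this stand-in, the range-based loop compares each iterator against the empty sentinel, while the std::any_of call has to build a second full iterator just to mark the end.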

Definition at line 338 of file unicodeUtils.h.

Member Typedef Documentation

Constructor & Destructor Documentation

TfUtf8CodePointView::TfUtf8CodePointView ( )
default

TfUtf8CodePointView::TfUtf8CodePointView ( const std::string_view & view)
inline explicit

Definition at line 343 of file unicodeUtils.h.

Member Function Documentation

const_iterator TfUtf8CodePointView::begin ( void  ) const
inline

Definition at line 345 of file unicodeUtils.h.

const_iterator TfUtf8CodePointView::cbegin ( ) const
inline

Definition at line 357 of file unicodeUtils.h.

TfUtf8CodePointIterator::PastTheEndSentinel TfUtf8CodePointView::cend ( ) const
inline

The sentinel compares equal to any iterator at the end of the underlying string_view.

Definition at line 364 of file unicodeUtils.h.

bool TfUtf8CodePointView::empty ( void  ) const
inline

Returns true if the underlying view is empty.

Definition at line 370 of file unicodeUtils.h.
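
A caller might use empty() as a cheap guard before dereferencing begin(). The following is a minimal sketch, written as a template so that it compiles without the Tf headers. FirstCodePointOr and FirstOfDecoded are hypothetical helper names, and the sketch assumes only that the view exposes empty() and begin() as listed above. std::u32string_view is used as a stand-in because it has the same empty()/begin() shape.

```cpp
#include <cstdint>
#include <string_view>

// Hypothetical helper: returns the first element of a code point view,
// or `fallback` when the view reports itself empty.
template <class View>
std::uint32_t FirstCodePointOr(const View& view, std::uint32_t fallback) {
    if (view.empty()) {
        return fallback;  // nothing to decode, so begin() is never dereferenced
    }
    return static_cast<std::uint32_t>(*view.begin());
}

// Stand-in call site using already-decoded UTF-32 text instead of a Tf view.
inline std::uint32_t FirstOfDecoded(std::u32string_view decoded) {
    return FirstCodePointOr(decoded, 0xFFFDu);
}
```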

TfUtf8CodePointIterator::PastTheEndSentinel TfUtf8CodePointView::end ( void  ) const
inline

The sentinel compares equal to any iterator at the end of the underlying string_view.

Definition at line 352 of file unicodeUtils.h.

const_iterator TfUtf8CodePointView::EndAsIterator ( ) const
inline

Returns an iterator of the same type as begin that identifies the end of the string.

Because the end position is stored three times, this is slightly heavier than using the PastTheEndSentinel and should be avoided in performance-critical code paths. It is provided for convenience when an algorithm requires both iterators to have the same type.

As C++20 ranges expose more sentinel-friendly algorithms, this function can likely be deprecated in the future.

Definition at line 385 of file unicodeUtils.h.
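
For algorithms that take a same-typed iterator pair, the call shape might look like the sketch below. IndexOfFirstInvalid is a hypothetical helper, written as a template so that it compiles without the Tf headers. With the real class it would presumably be called as IndexOfFirstInvalid(view.begin(), view.EndAsIterator(), TfUtf8InvalidCodePoint), assuming the iterator meets the usual forward-iterator requirements.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <iterator>

// Hypothetical helper: position (counted in code points) of the first element
// equal to `invalid`, or the total number of elements if none matches.
// Both arguments must have the same type, which is why an iterator-typed
// end is needed here rather than a sentinel.
template <class Iter>
std::size_t IndexOfFirstInvalid(Iter first, Iter last, std::uint32_t invalid) {
    const Iter hit = std::find(first, last, invalid);
    // std::distance walks forward iterators one step at a time.
    return static_cast<std::size_t>(std::distance(first, hit));
}
```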


The documentation for this class was generated from the following file: