PACS System 0.1.0
PACS DICOM system library
character_set.h File Reference

DICOM Character Set registry, ISO 2022 parser, and string decoder.

#include <cstdint>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

Classes

struct  kcenon::pacs::encoding::character_set_info
 Information about a DICOM character set.
 
struct  kcenon::pacs::encoding::specific_character_set
 Parsed representation of a multi-valued Specific Character Set.
 
struct  kcenon::pacs::encoding::text_segment
 A text segment with its associated character set.
 

Namespaces

namespace  kcenon
 
namespace  kcenon::pacs
 
namespace  kcenon::pacs::encoding
 

Functions

Character Set Registry
const character_set_info * kcenon::pacs::encoding::find_character_set (std::string_view defined_term) noexcept
 Look up character set info by DICOM Defined Term.
 
const character_set_info * kcenon::pacs::encoding::find_character_set_by_ir (int iso_ir_number) noexcept
 Look up character set info by ISO-IR number.
 
const character_set_info & kcenon::pacs::encoding::default_character_set () noexcept
 Get the default character set (ISO-IR 6, ASCII).
 
std::vector< const character_set_info * > kcenon::pacs::encoding::all_character_sets () noexcept
 Get all registered character sets.
 
Specific Character Set Parsing
specific_character_set kcenon::pacs::encoding::parse_specific_character_set (std::string_view value)
 Parse a Specific Character Set (0008,0005) value.
 
ISO 2022 Escape Sequence Detection
std::vector< text_segment > kcenon::pacs::encoding::split_by_escape_sequences (std::string_view text, const specific_character_set &scs)
 Split a string into segments by ISO 2022 escape sequences.
 
String Decoding
std::string kcenon::pacs::encoding::decode_to_utf8 (std::string_view text, const specific_character_set &scs)
 Decode a DICOM string to UTF-8 using the given character set.
 
std::string kcenon::pacs::encoding::convert_to_utf8 (std::string_view text, const character_set_info &charset)
 Decode a single segment from a specific encoding to UTF-8.
 
std::string kcenon::pacs::encoding::decode_person_name (std::string_view pn_value, const specific_character_set &scs)
 Decode a Person Name (PN) value to UTF-8.
 
std::string kcenon::pacs::encoding::encode_from_utf8 (std::string_view utf8_text, const specific_character_set &scs)
 Encode a UTF-8 string to the target character set encoding.
 
std::string kcenon::pacs::encoding::convert_from_utf8 (std::string_view utf8_text, const character_set_info &charset)
 Encode a single UTF-8 segment to a specific character set.
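
The single-segment conversions listed above can be illustrated with the simplest case, ISO-IR 100 (Latin-1), where every byte value equals its Unicode code point. The following is a minimal sketch under that assumption; `latin1_to_utf8` is a hypothetical helper written for this example and is not the library's `convert_to_utf8`, whose implementation is not shown on this page.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper (not part of character_set.h): converts ISO-IR 100
// (ISO 8859-1) bytes to UTF-8. Each Latin-1 byte is its own code point,
// so only the UTF-8 encoding step is needed.
std::string latin1_to_utf8(std::string_view bytes) {
    std::string out;
    out.reserve(bytes.size() * 2);  // worst case: every byte expands to two
    for (char ch : bytes) {
        const auto cp = static_cast<unsigned char>(ch);
        if (cp < 0x80) {
            // ASCII range: copied unchanged as a single byte.
            out.push_back(static_cast<char>(cp));
        } else {
            // U+0080..U+00FF: two-byte UTF-8 form 110xxxxx 10xxxxxx.
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

For instance, the Latin-1 byte 0xFC (ü) becomes the UTF-8 pair 0xC3 0xBC. Multi-byte repertoires such as ISO-IR 87 or ISO-IR 149 need real mapping tables and cannot be handled this way.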
 

Detailed Description

DICOM Character Set registry, ISO 2022 parser, and string decoder.

Provides support for decoding and encoding DICOM strings in international character sets, including values that switch repertoires through ISO 2022 escape sequences (code extensions), as specified in DICOM PS3.5 Section 6.1 and PS3.3 Section C.12.1.1.2.
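
To make the escape-sequence mechanism concrete, here is a minimal sketch of how a byte string might be cut into segments at each ISO 2022 escape sequence. It relies only on the general ISO 2022 layout (ESC, then intermediate bytes 0x20 to 0x2F, then one final byte 0x30 to 0x7E). The `raw_segment` type and `split_on_escapes` function are hypothetical names for this example; the library's `text_segment` and `split_by_escape_sequences` may be structured differently.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical type for illustration only.
struct raw_segment {
    std::string escape;  // bytes after ESC that introduced this segment; empty for the first
    std::string text;    // raw bytes up to the next ESC (still undecoded)
};

// Splits `in` at every ESC (0x1B) and records the escape sequence
// that precedes each run of text.
std::vector<raw_segment> split_on_escapes(std::string_view in) {
    constexpr unsigned char kEsc = 0x1B;
    std::vector<raw_segment> out;
    raw_segment cur;
    std::size_t i = 0;
    while (i < in.size()) {
        if (static_cast<unsigned char>(in[i]) != kEsc) {
            cur.text.push_back(in[i++]);
            continue;
        }
        // Close the segment collected so far, if it holds anything.
        if (!cur.text.empty() || !cur.escape.empty()) out.push_back(cur);
        cur = raw_segment{};
        ++i;  // skip ESC itself
        // Intermediate bytes: 0x20..0x2F (for example '$', '(', ')').
        while (i < in.size()) {
            const auto b = static_cast<unsigned char>(in[i]);
            if (b < 0x20 || b > 0x2F) break;
            cur.escape.push_back(in[i++]);
        }
        // Final byte: 0x30..0x7E (for example 'B', 'C', 'J').
        if (i < in.size()) {
            const auto b = static_cast<unsigned char>(in[i]);
            if (b >= 0x30 && b <= 0x7E) cur.escape.push_back(in[i++]);
        }
    }
    if (!cur.text.empty() || !cur.escape.empty()) out.push_back(cur);
    return out;
}
```

With the input `"Hong=\x1B$)C\xC8\xAB"`, the sketch yields one segment with an empty escape and the text `Hong=`, followed by a segment whose escape is `$)C` (the sequence DICOM assigns to ISO 2022 IR 149) and whose text is the two Korean-encoded bytes. Mapping each escape to a character set and decoding the bytes is a separate step that this sketch does not attempt.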

Supported character sets:

  • ISO-IR 6 (ASCII, default repertoire)
  • ISO-IR 100 (Latin-1, Western European)
  • ISO-IR 101 (Latin-2, Central European)
  • ISO-IR 126 (Greek)
  • ISO-IR 127 (Arabic)
  • ISO-IR 138 (Hebrew)
  • ISO-IR 144 (Cyrillic)
  • ISO-IR 166 (Thai, TIS 620-2533)
  • ISO-IR 192 (UTF-8, Unicode)
  • ISO-IR 149 (Korean, KS X 1001 / EUC-KR)
  • ISO-IR 87 (Japanese Kanji, JIS X 0208)
  • ISO-IR 13 (Japanese Katakana, JIS X 0201)
  • ISO-IR 58 (Chinese, GB2312)
  • GB18030 (Chinese, full character set)
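
The character sets above are selected through the Specific Character Set (0008,0005) attribute, whose values are separated by backslashes and may be padded with spaces. As a rough sketch of that first parsing step, the code below splits such a value and extracts the ISO-IR number from the Defined Term prefixes `ISO_IR ` and `ISO 2022 IR `. The names `trim_spaces`, `split_values`, and `iso_ir_number` are hypothetical helpers for this example, not the library's `parse_specific_character_set`, and treating an empty value as ISO-IR 6 follows the DICOM default-repertoire rule rather than anything documented on this page.

```cpp
#include <charconv>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <system_error>
#include <vector>

// Hypothetical helper: strips leading and trailing space padding.
std::string_view trim_spaces(std::string_view s) {
    while (!s.empty() && s.front() == ' ') s.remove_prefix(1);
    while (!s.empty() && s.back() == ' ') s.remove_suffix(1);
    return s;
}

// Hypothetical helper: splits a multi-valued attribute on '\'.
// Empty values are kept, because an empty value is meaningful here.
std::vector<std::string> split_values(std::string_view value) {
    std::vector<std::string> out;
    std::size_t start = 0;
    for (;;) {
        const std::size_t pos = value.find('\\', start);
        const std::size_t len =
            (pos == std::string_view::npos) ? std::string_view::npos : pos - start;
        out.emplace_back(trim_spaces(value.substr(start, len)));
        if (pos == std::string_view::npos) break;
        start = pos + 1;
    }
    return out;
}

// Hypothetical helper: returns the ISO-IR number of a Defined Term.
// An empty term is treated as the default repertoire (ISO-IR 6).
// Terms without a number, such as "GB18030", yield std::nullopt.
std::optional<int> iso_ir_number(std::string_view term) {
    if (term.empty()) return 6;
    constexpr std::string_view prefixes[] = {"ISO_IR ", "ISO 2022 IR "};
    for (std::string_view prefix : prefixes) {
        if (term.substr(0, prefix.size()) != prefix) continue;
        const std::string_view digits = term.substr(prefix.size());
        const char* last = digits.data() + digits.size();
        int n = 0;
        const auto [ptr, ec] = std::from_chars(digits.data(), last, n);
        if (ec == std::errc{} && ptr == last) return n;
        return std::nullopt;
    }
    return std::nullopt;
}
```

For the value `\ISO 2022 IR 149`, `split_values` returns an empty first term followed by `ISO 2022 IR 149`, which `iso_ir_number` maps to 6 and 149 respectively. A real parser would additionally need to record whether each term uses code extensions and to handle terms like GB18030 that carry no ISO-IR number.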
See also
DICOM PS3.5 Section 6.1 - Support of Character Repertoires
DICOM PS3.3 Section C.12.1.1.2 - Specific Character Set
Author
kcenon
Since
1.0.0

Definition in file character_set.h.