Character encoding converter
[UCS-2 Character/String API]

The character encoding converter converts a string from one character encoding to another.

Functions

UCS2API int ucs2setenc (const char *encoding)
 Set the encoding for multi-byte characters (for iconv/libiconv).
UCS2API const char * ucs2getenc ()
 Get the encoding for multi-byte characters (for iconv/libiconv).
UCS2API void ucs2setcp (int cp)
 Set the code page for multi-byte characters (for Win32).
UCS2API int ucs2getcp ()
 Get the code page for multi-byte characters (for Win32).
UCS2API size_t ucs2tombs (char *mbstr, size_t mbs_size, const ucs2char_t *ucs2str, size_t ucs_size)
 Convert a UCS-2 string to multi-byte characters.
UCS2API size_t mbstoucs2 (ucs2char_t *ucs2str, size_t ucs_size, const char *mbstr, size_t mbs_size)
 Convert multi-byte characters to a UCS-2 string.
UCS2API size_t mbstoucs2_charset (ucs2char_t *ucs2str, size_t ucs_size, const char *mbstr, size_t mbs_size, const char *charset)
 Convert multi-byte characters in a specific encoding to a UCS-2 string.
UCS2API size_t ucs2toutf8 (char *mbstr, size_t mbs_size, const ucs2char_t *ucs2str, size_t ucs_size)
 Convert a UCS-2 string to a UTF-8 string.
UCS2API size_t utf8toucs2 (ucs2char_t *ucs2str, size_t ucs_size, const char *mbstr, size_t mbs_size)
 Convert a UTF-8 string to a UCS-2 string.
UCS2API char * ucs2dupmbs (const ucs2char_t *ucs2str)
 Convert and duplicate a UCS-2 string to multi-byte characters.
UCS2API ucs2char_t * mbsdupucs2 (const char *mbstr)
 Convert and duplicate multi-byte characters to a UCS-2 string.
UCS2API ucs2char_t * mbsdupucs2_charset (const char *mbstr, const char *charset)
 Convert and duplicate multi-byte characters in a specific encoding to a UCS-2 string.
UCS2API char * ucs2duputf8 (const ucs2char_t *ucs2str)
 Convert and duplicate a UCS-2 string to a UTF-8 string.
UCS2API ucs2char_t * utf8dupucs2 (const char *utf8str)
 Convert and duplicate a UTF-8 string to a UCS-2 string.

Detailed Description

The character encoding converter converts a string from one character encoding to another.

This API subset supports conversion in both directions between UCS-2 and multi-byte characters (i.e., char), and between UCS-2 and UTF-8. Conversion is performed by the MultiByteToWideChar() and WideCharToMultiByte() functions of the Win32 API (for Windows environments), or by the iconv() function in libc or libiconv (for POSIX environments).
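To illustrate the kind of iconv() call that the POSIX path relies on, the following minimal sketch converts UTF-8 to little-endian UCS-2 by calling iconv() directly. It is not taken from the library's source: the function name demo_iconv_utf8_to_ucs2le and the choice of the "UCS-2LE" encoding name are assumptions made for this example, and some platforms need to link against libiconv.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper, not part of the UCS-2 API.
 * Converts in_len bytes of UTF-8 into little-endian UCS-2.
 * Returns the number of bytes written to out, or (size_t)-1 on failure. */
size_t demo_iconv_utf8_to_ucs2le(const char *in, size_t in_len,
                                 char *out, size_t out_size)
{
    iconv_t cd;
    char *src = (char *)in;   /* iconv() takes char ** even for its input */
    char *dst = out;
    size_t in_left = in_len;
    size_t out_left = out_size;
    size_t rc;

    assert(in != NULL && out != NULL);

    /* iconv_open() takes the target encoding first, then the source. */
    cd = iconv_open("UCS-2LE", "UTF-8");
    if (cd == (iconv_t)-1) {
        return (size_t)-1;
    }

    /* iconv() advances src/dst and decrements the remaining counters. */
    rc = iconv(cd, &src, &in_left, &dst, &out_left);
    iconv_close(cd);

    if (rc == (size_t)-1) {
        return (size_t)-1;    /* invalid input or output buffer too small */
    }
    return out_size - out_left;
}
```

The return value here is a byte count, whereas the functions in this module report sizes of UCS-2 strings in UCS-2 characters.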


Function Documentation

UCS2API ucs2char_t* mbsdupucs2 ( const char *  mbstr  ) 

Convert and duplicate multi-byte characters to a UCS-2 string.

Parameters:
mbstr The pointer to multi-byte characters.
Return values:
ucs2char_t* The pointer to the duplicated string. Call ucs2free() to free the memory block allocated by this function.

UCS2API ucs2char_t* mbsdupucs2_charset ( const char *  mbstr,
const char *  charset 
)

Convert and duplicate multi-byte characters in a specific encoding to a UCS-2 string.

Parameters:
mbstr The pointer to multi-byte characters.
charset The pointer to the string specifying the encoding of the multi-byte characters.
Return values:
ucs2char_t* The pointer to the duplicated string. Call ucs2free() to free the memory block allocated by this function.

UCS2API size_t mbstoucs2 ( ucs2char_t *  ucs2str,
size_t  ucs_size,
const char *  mbstr,
size_t  mbs_size 
)

Convert multi-byte characters to a UCS-2 string.

Parameters:
ucs2str The pointer to the buffer that receives the UCS-2 string converted from the multi-byte characters. If ucs_size is zero, this argument is not used.
ucs_size The size, in number of UCS-2 characters, of the buffer pointed to by the ucs2str argument. If this value is zero, the function returns the number of UCS-2 characters required for the buffer.
mbstr The pointer to the multi-byte characters to be converted.
mbs_size The size, in bytes, of the multi-byte characters, mbstr.
Return values:
size_t The number of UCS-2 characters written to the ucs2str buffer if the conversion is successful. If ucs_size is zero, the return value is the required size, in number of UCS-2 characters, for a buffer to receive the converted string. This function returns zero if an error occurred.
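The size-query behaviour described above suggests a two-call pattern: call once with ucs_size set to zero to learn the required size, allocate, then call again to convert. The sketch below shows that pattern against a function pointer with the same parameter list as mbstoucs2(). Everything prefixed with demo_ is hypothetical: demo_ucs2char_t assumes that ucs2char_t is a 16-bit unsigned integer, and demo_latin1_to_ucs2 is a stand-in converter (ISO-8859-1 widening) used only so the example runs without the library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: ucs2char_t is a 16-bit unsigned integer. */
typedef uint16_t demo_ucs2char_t;

/* Same parameter list as mbstoucs2() and utf8toucs2() in this module. */
typedef size_t (*demo_to_ucs2_fn)(demo_ucs2char_t *ucs2str, size_t ucs_size,
                                  const char *mbstr, size_t mbs_size);

/* Two-call pattern: query the size, allocate, convert.
 * Returns a malloc()ed buffer (release with free()) or NULL on failure. */
demo_ucs2char_t *demo_convert_alloc(demo_to_ucs2_fn conv, const char *mbstr,
                                    size_t mbs_size, size_t *out_len)
{
    size_t need, got;
    demo_ucs2char_t *buf;

    assert(conv != NULL && out_len != NULL);

    need = conv(NULL, 0, mbstr, mbs_size);   /* first call: size query */
    if (need == 0) {
        return NULL;                          /* zero is the error value */
    }
    buf = malloc(need * sizeof *buf);
    if (buf == NULL) {
        return NULL;
    }
    got = conv(buf, need, mbstr, mbs_size);   /* second call: conversion */
    if (got == 0) {
        free(buf);
        return NULL;
    }
    *out_len = got;
    return buf;
}

/* Hypothetical stand-in converter: each ISO-8859-1 byte maps to the
 * UCS-2 character with the same value. */
size_t demo_latin1_to_ucs2(demo_ucs2char_t *ucs2str, size_t ucs_size,
                           const char *mbstr, size_t mbs_size)
{
    size_t i;

    if (ucs_size == 0) {
        return mbs_size;                      /* size query */
    }
    if (ucs_size < mbs_size) {
        return 0;                             /* buffer too small */
    }
    for (i = 0; i < mbs_size; ++i) {
        ucs2str[i] = (unsigned char)mbstr[i];
    }
    return mbs_size;
}
```

Because zero is both the error value and the natural result for empty input, a wrapper written this way cannot tell the two apart; callers that need to accept empty strings may want to check mbs_size themselves before converting.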

UCS2API size_t mbstoucs2_charset ( ucs2char_t *  ucs2str,
size_t  ucs_size,
const char *  mbstr,
size_t  mbs_size,
const char *  charset 
)

Convert multi-byte characters in a specific encoding to a UCS-2 string.

Parameters:
ucs2str The pointer to the buffer that receives the UCS-2 string converted from the multi-byte characters. If ucs_size is zero, this argument is not used.
ucs_size The size, in number of UCS-2 characters, of the buffer pointed to by the ucs2str argument. If this value is zero, the function returns the number of UCS-2 characters required for the buffer.
mbstr The pointer to the multi-byte characters to be converted.
mbs_size The size, in bytes, of the multi-byte characters, mbstr.
charset The pointer to the string specifying the encoding of the multi-byte characters.
Return values:
size_t The number of UCS-2 characters written to the ucs2str buffer if the conversion is successful. If ucs_size is zero, the return value is the required size, in number of UCS-2 characters, for a buffer to receive the converted string. This function returns zero if an error occurred.
Note:
charset is ignored on Win32 environments.

UCS2API char* ucs2dupmbs ( const ucs2char_t *  ucs2str  )

Convert and duplicate a UCS-2 string to multi-byte characters.

Parameters:
ucs2str The pointer to a UCS-2 string.
Return values:
char* The pointer to the duplicated string. Call ucs2free() to free the memory block allocated by this function.

UCS2API char* ucs2duputf8 ( const ucs2char_t *  ucs2str  )

Convert and duplicate a UCS-2 string to a UTF-8 string.

Parameters:
ucs2str The pointer to a UCS-2 string.
Return values:
char* The pointer to the duplicated string. Call ucs2free() to free the memory block allocated by this function.

UCS2API int ucs2getcp (  ) 

Get the code page for multi-byte characters (for Win32).

This function returns the default code page for multi-byte characters.

Note:
This function is effective only on Win32 environments.

UCS2API const char* ucs2getenc (  ) 

Get the encoding for multi-byte characters (for iconv/libiconv).

This function returns the default encoding for multi-byte characters used in the UCS-2 API.

Return values:
const char* The pointer to the string of the character encoding.
Note:
This function is effective only on environments with iconv (libiconv).

UCS2API void ucs2setcp ( int  cp  ) 

Set the code page for multi-byte characters (for Win32).

This function changes the default encoding for multi-byte characters to the code page specified by the cp argument.

Parameters:
cp The code page.
Note:
This function is effective only on Win32 environments.

UCS2API int ucs2setenc ( const char *  encoding  ) 

Set the encoding for multi-byte characters (for iconv/libiconv).

This function changes the default encoding for multi-byte characters to the character encoding specified by the encoding argument.

Parameters:
encoding The pointer to the string specifying the character encoding.
Note:
This function is effective only on environments with iconv (libiconv).

UCS2API size_t ucs2tombs ( char *  mbstr,
size_t  mbs_size,
const ucs2char_t *  ucs2str,
size_t  ucs_size 
)

Convert a UCS-2 string to multi-byte characters.

Parameters:
mbstr The pointer to the buffer that receives the multi-byte characters converted from the UCS-2 string. If mbs_size is zero, this argument is not used.
mbs_size The size, in bytes, of the buffer pointed to by the mbstr argument. If this value is zero, the function returns the number of bytes required for the buffer.
ucs2str The pointer to the UCS-2 string to be converted.
ucs_size The size, in number of UCS-2 characters, of the UCS-2 string, ucs2str.
Return values:
size_t The number of bytes written to the mbstr buffer if the conversion is successful. If mbs_size is zero, the return value is the required size, in bytes, for a buffer to receive the converted string. This function returns zero if an error occurred.

UCS2API size_t ucs2toutf8 ( char *  mbstr,
size_t  mbs_size,
const ucs2char_t *  ucs2str,
size_t  ucs_size 
)

Convert a UCS-2 string to a UTF-8 string.

Parameters:
mbstr The pointer to the buffer that receives the UTF-8 string converted from the UCS-2 string. If mbs_size is zero, this argument is not used.
mbs_size The size, in bytes, of the buffer pointed to by the mbstr argument. If this value is zero, the function returns the number of bytes required for the buffer.
ucs2str The pointer to the UCS-2 string to be converted.
ucs_size The size, in number of UCS-2 characters, of the UCS-2 string, ucs2str.
Return values:
size_t The number of bytes written to the mbstr buffer if the conversion is successful. If mbs_size is zero, the return value is the required size, in bytes, for a buffer to receive the converted string. This function returns zero if an error occurred.
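To make the byte counts above concrete, here is a self-contained sketch of a UCS-2 to UTF-8 encoder that follows the same contract: a size query when mbs_size is zero, and zero on error. It is a hypothetical stand-in named demo_ucs2toutf8, not the library's implementation (which delegates to Win32 or iconv), and it assumes that ucs2char_t is a 16-bit unsigned integer. Whether the real function writes a terminating NUL is not stated in this section; the stand-in converts exactly ucs_size characters and adds nothing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: ucs2char_t is a 16-bit unsigned integer. */
typedef uint16_t demo_ucs2char_t;

/* Hypothetical stand-in with the same parameter list as ucs2toutf8().
 * Surrogate code units (0xD800..0xDFFF) are not rejected in this sketch;
 * they are encoded like any other 3-byte value. */
size_t demo_ucs2toutf8(char *mbstr, size_t mbs_size,
                       const demo_ucs2char_t *ucs2str, size_t ucs_size)
{
    size_t i, n = 0;

    assert(ucs2str != NULL || ucs_size == 0);

    for (i = 0; i < ucs_size; ++i) {
        unsigned c = ucs2str[i];
        /* A UCS-2 character needs 1, 2, or 3 bytes in UTF-8. */
        size_t len = (c < 0x80) ? 1 : (c < 0x800) ? 2 : 3;

        if (mbs_size != 0) {
            if (n + len > mbs_size) {
                return 0;                     /* buffer too small: error */
            }
            if (len == 1) {
                mbstr[n] = (char)c;
            } else if (len == 2) {
                mbstr[n]     = (char)(0xC0 | (c >> 6));
                mbstr[n + 1] = (char)(0x80 | (c & 0x3F));
            } else {
                mbstr[n]     = (char)(0xE0 | (c >> 12));
                mbstr[n + 1] = (char)(0x80 | ((c >> 6) & 0x3F));
                mbstr[n + 2] = (char)(0x80 | (c & 0x3F));
            }
        }
        n += len;                             /* counted even in query mode */
    }
    return n;
}
```

For the three characters U+0041, U+00E9, and U+20AC, the size query reports 1 + 2 + 3 = 6 bytes, so a caller would allocate six bytes (plus one if it wants to append its own terminator) before the second call.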

UCS2API ucs2char_t* utf8dupucs2 ( const char *  utf8str  ) 

Convert and duplicate a UTF-8 string to a UCS-2 string.

Parameters:
utf8str The pointer to a UTF-8 string.
Return values:
ucs2char_t* The pointer to the duplicated string. Call ucs2free() to free the memory block allocated by this function.

UCS2API size_t utf8toucs2 ( ucs2char_t *  ucs2str,
size_t  ucs_size,
const char *  mbstr,
size_t  mbs_size 
)

Convert a UTF-8 string to a UCS-2 string.

Parameters:
ucs2str The pointer to the buffer that receives the UCS-2 string converted from the UTF-8 string. If ucs_size is zero, this argument is not used.
ucs_size The size, in number of UCS-2 characters, of the buffer pointed to by the ucs2str argument. If this value is zero, the function returns the number of UCS-2 characters required for the buffer.
mbstr The pointer to the UTF-8 string to be converted.
mbs_size The size, in bytes, of the UTF-8 string, mbstr.
Return values:
size_t The number of UCS-2 characters written to the ucs2str buffer if the conversion is successful. If ucs_size is zero, the return value is the required size, in number of UCS-2 characters, for a buffer to receive the converted string. This function returns zero if an error occurred.
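The reverse direction can be sketched in the same way. The decoder below is a hypothetical stand-in named demo_utf8toucs2 that mirrors the documented contract (size query when ucs_size is zero, zero on error); it is not the library's implementation and assumes that ucs2char_t is a 16-bit unsigned integer. It also shows one reason such a conversion can fail: UTF-8 sequences of four bytes encode characters outside the range that a single UCS-2 character can hold.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: ucs2char_t is a 16-bit unsigned integer. */
typedef uint16_t demo_ucs2char_t;

/* Hypothetical stand-in with the same parameter list as utf8toucs2().
 * Overlong encodings are not rejected in this sketch. */
size_t demo_utf8toucs2(demo_ucs2char_t *ucs2str, size_t ucs_size,
                       const char *mbstr, size_t mbs_size)
{
    const unsigned char *p = (const unsigned char *)mbstr;
    size_t i = 0, n = 0;

    assert(mbstr != NULL || mbs_size == 0);

    while (i < mbs_size) {
        unsigned c = p[i];
        size_t len, k;

        /* The lead byte determines the sequence length. */
        if (c < 0x80) {
            len = 1;
        } else if ((c & 0xE0) == 0xC0) {
            len = 2; c &= 0x1F;
        } else if ((c & 0xF0) == 0xE0) {
            len = 3; c &= 0x0F;
        } else {
            return 0;    /* stray continuation byte, or beyond UCS-2 */
        }
        if (i + len > mbs_size) {
            return 0;    /* sequence truncated at end of input */
        }
        for (k = 1; k < len; ++k) {
            if ((p[i + k] & 0xC0) != 0x80) {
                return 0;    /* not a continuation byte */
            }
            c = (c << 6) | (p[i + k] & 0x3Fu);
        }
        if (ucs_size != 0) {
            if (n >= ucs_size) {
                return 0;    /* output buffer too small */
            }
            ucs2str[n] = (demo_ucs2char_t)c;
        }
        ++n;                 /* counted even in query mode */
        i += len;
    }
    return n;
}
```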