C/C++ I18N mbstowcs question

I am working on internationalizing input for a C/C++ application. I am currently running into a problem converting from a multibyte string to a wide character string.

The code needs to be cross-platform, so I use mbstowcs and wcstombs as much as possible.

I am currently working on a WIN32 machine and I have set the locale to non-English (Japanese).

When I try to convert a multibyte character string, I seem to be having conversion problems.

Here's some sample code:

#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <wchar.h>
#include <windows.h>

int main(int argc, char** argv)
{
    wchar_t *wcsVal = NULL;
    char *mbsVal = NULL;

    if (argc < 2)
    {
        fprintf(stderr, "Usage: %s <multibyte string>\n", argv[0]);
        return 1;
    }

    /* Get the current code page, in my case 932; runs only on Windows */
    TCHAR szCodePage[10];
    int cch = GetLocaleInfo(
            GetSystemDefaultLCID(),
            LOCALE_IDEFAULTANSICODEPAGE,
            szCodePage,
            sizeof(szCodePage) / sizeof(szCodePage[0])); /* size in TCHARs, not bytes */

    /* verify locale is set */
    if (setlocale(LC_CTYPE, "") == NULL)
    {
        fprintf(stderr, "Failed to set locale\n");
        return 1;
    }

    mbsVal = argv[1];
    /* validate multibyte string and get the wide character count */
    size_t size = mbstowcs(NULL, mbsVal, 0);
    if (size == (size_t)-1)
    {
        fprintf(stderr, "Invalid multibyte\n");
        return 1;
    }
    wcsVal = (wchar_t*) malloc(sizeof(wchar_t) * (size + 1));
    if (wcsVal == NULL)
    {
        fprintf(stderr, "memory issue \n");
        return 1;
    }

    /* convert to wide character, including the terminating null */
    mbstowcs(wcsVal, mbsVal, size + 1);
    wprintf(L"%ls \n", wcsVal);
    free(wcsVal);
    return 0;
}

      

At the end of execution, the wide character string contains no converted data. I believe there is a problem with the code page settings, because the conversion succeeds when I call MultiByteToWideChar with the current code page instead of calling mbstowcs. For example:

MultiByteToWideChar(CP_ACP, 0, mbsVal, -1, wcsVal, size + 1);

My question is: how do I use the generic mbstowcs call instead of the MultiByteToWideChar call?

2 answers


What do you get if you print the string returned by setlocale()? This will indicate which locale was actually selected, which may not be what you expect.
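One way to run this check is to log the value that setlocale() hands back. The following is only a minimal sketch: the helper name select_and_report_ctype is hypothetical and exists purely for illustration. With the Microsoft C runtime the returned string typically ends in the selected code page number (for example "Japanese_Japan.932"), but the exact format depends on the runtime.

```c
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper (not a library function): selects the LC_CTYPE
   locale named by `requested` ("" means the environment default) and
   reports what the C runtime actually chose.
   Returns the runtime's locale string, or NULL if the request failed. */
const char *select_and_report_ctype(const char *requested)
{
    const char *actual = setlocale(LC_CTYPE, requested);

    if (actual == NULL)
        fprintf(stderr, "setlocale(LC_CTYPE, \"%s\") failed\n", requested);
    else
        fprintf(stderr, "LC_CTYPE locale in effect: %s\n", actual);

    return actual;
}
```

Calling select_and_report_ctype("") at the top of main, before any mbstowcs call, would show whether the runtime picked the Japanese locale and code page that the conversion is expected to use.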



MSDN specifies that on Windows, passing "" as the locale selects the default, which is "the user's default ANSI code page derived from the operating system." Perhaps this is a different beast from the current ANSI code page?


Calling mbstowcs is never as good an idea as MultiByteToWideChar on Windows. Don't worry about it, just stick to the Win32 API.
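If the rest of the code still has to build on other platforms, one possible arrangement is to hide the platform choice behind a single function. The sketch below assumes this approach; mb_to_wide is a hypothetical name, and the use of CP_ACP with MB_ERR_INVALID_CHARS is an illustrative choice rather than anything prescribed by the question.

```c
#include <stdlib.h>
#include <wchar.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: converts a null-terminated multibyte string to a
   newly allocated wide string. Returns NULL on invalid input or on
   allocation failure; the caller frees the result with free(). */
wchar_t *mb_to_wide(const char *mbs)
{
#ifdef _WIN32
    /* With -1 as the input length, the returned count includes the
       terminating null. MB_ERR_INVALID_CHARS makes bad input fail. */
    int count = MultiByteToWideChar(CP_ACP, MB_ERR_INVALID_CHARS,
                                    mbs, -1, NULL, 0);
    if (count == 0)
        return NULL;

    wchar_t *wcs = malloc(sizeof(wchar_t) * (size_t)count);
    if (wcs == NULL)
        return NULL;

    if (MultiByteToWideChar(CP_ACP, MB_ERR_INVALID_CHARS,
                            mbs, -1, wcs, count) == 0) {
        free(wcs);
        return NULL;
    }
    return wcs;
#else
    /* Uses the current LC_CTYPE locale; the program is expected to have
       called setlocale() beforehand if it needs a non-"C" locale. */
    size_t count = mbstowcs(NULL, mbs, 0);
    if (count == (size_t)-1)
        return NULL;

    wchar_t *wcs = malloc(sizeof(wchar_t) * (count + 1));
    if (wcs == NULL)
        return NULL;

    if (mbstowcs(wcs, mbs, count + 1) == (size_t)-1) {
        free(wcs);
        return NULL;
    }
    return wcs;
#endif
}
```

Callers then use mb_to_wide(argv[1]) everywhere and only this one function needs to know which conversion API is in play.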


