Invalid file read in Unicode (fread) in C++

Question

I am trying to load the content of a file saved on disk into a string. The file is .CS code generated in Visual Studio, so I am assuming it is saved in UTF-8 encoding. This is what I'm doing:

FILE *fConnect = _wfopen(connectFilePath, _T("r,ccs=UTF-8"));
if (!fConnect)
    return;
fseek(fConnect, 0, SEEK_END);
lSize = ftell(fConnect);
rewind(fConnect);

LPTSTR lpContent = (LPTSTR)malloc(sizeof(TCHAR) * lSize + 1);
fread(lpContent, sizeof(TCHAR), lSize, fConnect);

But the result is strange: the first part (about half of the string) is the content of the .CS file, and then strange characters like 췍 췍췍 췍췍 췍췍 췍췍 췍췍 췍췍 췍췍 췍췍 췍췍 췍췍 췍췍 appear. So I think I am reading the content incorrectly. How do I do it right? Thank you, I look forward to your answers!

+2

c++ malloc unicode fread

mimic May 17 '10 at 20:59

2 answers

Does the string contain the entire contents of the .cs file followed by additional funny characters? It is probably just not null-terminated, as fread will not do this automatically. You need to set the character following the string content to zero:

lpContent[lSize] = 0;
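A minimal sketch of that idea, assuming a plain byte buffer rather than TCHAR: allocate one extra element, then write the terminator after the number of items fread actually returned, since that can be smaller than the size ftell reported. The helper name read_terminated is hypothetical and not part of the original code. The 췍 characters (U+CDCD) look like the 0xCD pattern that the MSVC debug heap puts into freshly allocated memory, which would suggest that part of the buffer was never written.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Hypothetical helper: reads a whole file as raw bytes and appends '\0'.
// Returns a malloc'ed buffer (the caller frees it) or nullptr on failure.
char *read_terminated(const char *path, long *outSize)
{
    assert(path != nullptr && outSize != nullptr);

    FILE *f = std::fopen(path, "rb");   // binary mode: no newline translation
    if (!f)
        return nullptr;

    std::fseek(f, 0, SEEK_END);
    long size = std::ftell(f);          // size in bytes
    std::rewind(f);
    if (size < 0) {
        std::fclose(f);
        return nullptr;
    }

    // One extra byte for the terminator.
    char *buf = static_cast<char *>(std::malloc(static_cast<std::size_t>(size) + 1));
    if (!buf) {
        std::fclose(f);
        return nullptr;
    }

    std::size_t got = std::fread(buf, 1, static_cast<std::size_t>(size), f);
    std::fclose(f);

    buf[got] = '\0';                    // terminate after the bytes actually read
    *outSize = static_cast<long>(got);
    return buf;
}
```

Terminating at the returned count rather than at lSize keeps the string valid even when the read comes up short.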

+1

stephan May 17 '10 at 21:10

Remy Lebeau · Accepted Answer · 2010-05-17T21:11:10+0000

ftell(), fseek() and fread() only work on bytes, not characters. In a Unicode build, TCHAR is at least 2 bytes, so lSize (a byte count) multiplied by sizeof(TCHAR) makes you allocate and request at least twice as much memory as the file actually contains.

I've never seen fopen() or _wfopen() support the "ccs" attribute. You should use "rb" as the read mode, read the raw bytes into memory, and then decode them once you have them all available, for example:

FILE *fConnect = _wfopen(connectFilePath, _T("rb"));
if (!fConnect)
  return;
fseek(fConnect, 0, SEEK_END);
lSize = ftell(fConnect);   // file size in bytes
rewind(fConnect);

LPBYTE lpContent = (LPBYTE) malloc(lSize);
fread(lpContent, 1, lSize, fConnect);   // read raw bytes, no translation
fclose(fConnect);   // close the FILE*, not the buffer

// ... decode lpContent as needed ...
free(lpContent);
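The "decode as needed" step could look roughly like the sketch below, assuming the file really is UTF-8 and the target is UTF-16 (the width of wchar_t on Windows). The function name utf8_to_utf16 is hypothetical. This is a simplified portable decoder: it skips a leading byte order mark and replaces malformed sequences with U+FFFD, but it does not reject overlong encodings. On Windows, calling MultiByteToWideChar with CP_UTF8 would be the more usual route.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: converts UTF-8 bytes to UTF-16, skipping a leading BOM.
// Malformed input becomes U+FFFD; overlong forms are not rejected in this sketch.
std::u16string utf8_to_utf16(const unsigned char *data, std::size_t size)
{
    assert(data != nullptr || size == 0);

    std::u16string out;
    std::size_t i = 0;
    if (size >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF)
        i = 3;  // skip the UTF-8 byte order mark if present

    while (i < size) {
        unsigned char b = data[i++];
        char32_t cp;
        int extra;  // number of continuation bytes expected
        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else                         { cp = 0xFFFD;   extra = 0; }

        for (; extra > 0; --extra) {
            if (i >= size || (data[i] & 0xC0) != 0x80) {
                cp = 0xFFFD;  // truncated or invalid continuation byte
                break;
            }
            cp = (cp << 6) | (data[i++] & 0x3F);
        }
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            cp = 0xFFFD;      // outside Unicode or an encoded surrogate

        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            cp -= 0x10000;    // encode as a surrogate pair
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

With the buffer from the answer above, the call would be something like utf8_to_utf16(lpContent, lSize), made before free(lpContent).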
