Replace a letter wchar_t by another in a file. (Unicode)
I'd like to know exactly how to replace one letter with another in a file containing Unicode characters (wide characters, wchar_t). I tried the following code, but it does not work; I get an endless loop every time:
Code:
do
{
    letter = fgetwc(file);
    //printf("%d\n", debug++);
    if (letter == 'a')
    {
        //puts("OK");
        fseek(file, - sizeof(wchar_t), SEEK_CUR);
        fputwc('b', file);
    }
}
while (letter != WEOF);
Re: Replace a letter wchar_t by another in a file. (Unicode)
if( letter == 'a')
I have my doubts about this line. letter is a wchar_t, so I would be surprised if it could compare equal to a char.
Re: Replace a letter wchar_t by another in a file. (Unicode)
Yes, sizeof(wchar_t) actually depends on the system. On Mac OS X it is 4; on Windows it is 2.
Windows uses UTF-16, which comes in two versions: big endian and little endian. However, if the file is created and read on the same machine, you should be fine.
In short, you should be careful about what you read. Add a printf to check that the value you get really is the ASCII code of 'a' and not something distorted by the endianness.
That's why I prefer UTF-8: with it you do not have to worry about such details.
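To make the endianness check above concrete, here is a minimal sketch that looks at the first bytes of a file for a byte order mark. The names bom_kind and detect_bom are made up for this illustration, and it assumes the file actually starts with a BOM, which many files do not.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of any library: classify a buffer by its
   leading byte order mark. UTF-32 BOMs are deliberately not handled. */
enum bom_kind { BOM_NONE, BOM_UTF8, BOM_UTF16_LE, BOM_UTF16_BE };

enum bom_kind detect_bom(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return BOM_UTF8;        /* EF BB BF */
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
        return BOM_UTF16_LE;    /* FF FE: low byte first */
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF)
        return BOM_UTF16_BE;    /* FE FF: high byte first */
    return BOM_NONE;            /* no BOM: the encoding has to be known otherwise */
}
```

You would read the first few bytes with fread on a stream opened in binary mode ("rb") and pass them to detect_bom before deciding how to decode the rest.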
Re: Replace a letter wchar_t by another in a file. (Unicode)
@ void
We are heading into a technical subject that the average programmer understands less well...
1. wchar_t is not necessarily Unicode.
2. Whether you use a narrow or a wide stream, the file is stored in a multibyte encoding that depends on the locale, so two different characters may take up a different number of bytes in the file. The only reliable solution is to copy the file while applying the transformation you want, then delete (or rename) the original, then rename the copy.
3. Although locales that use a fixed-width encoding are provided for, I do not know of any; the locales I know of that use Unicode for wchar_t use UTF-8 as the file encoding.
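The copy-then-rename approach from point 2 could look roughly like the sketch below. The function names copy_replacing and replace_in_file, and the idea of passing the temporary file name as a parameter, are my own choices for the example. It assumes the caller has already selected a suitable locale (for instance with setlocale(LC_ALL, "")) so that fgetwc and fputwc decode and encode the file correctly.

```c
#include <assert.h>
#include <stdio.h>
#include <wchar.h>

/* Copy every wide character from `in` to `out`, replacing `from` with `to`.
   Returns 0 on success, -1 on a read or write error. */
int copy_replacing(FILE *in, FILE *out, wchar_t from, wchar_t to)
{
    wint_t c;

    assert(in != NULL && out != NULL);
    while ((c = fgetwc(in)) != WEOF) {
        if (c == (wint_t)from)
            c = (wint_t)to;
        if (fputwc((wchar_t)c, out) == WEOF)
            return -1;
    }
    return ferror(in) ? -1 : 0;   /* WEOF can mean end of file or an error */
}

/* Rewrite `path` through the temporary file `tmp_path`, then put the copy
   in place of the original. Returns 0 on success. */
int replace_in_file(const char *path, const char *tmp_path,
                    wchar_t from, wchar_t to)
{
    FILE *in = fopen(path, "r");
    FILE *out;
    int rc;

    if (in == NULL)
        return -1;
    out = fopen(tmp_path, "w");
    if (out == NULL) {
        fclose(in);
        return -1;
    }
    rc = copy_replacing(in, out, from, to);
    if (fclose(out) != 0)
        rc = -1;
    fclose(in);
    if (rc != 0) {
        remove(tmp_path);         /* leave the original untouched on failure */
        return -1;
    }
    /* ISO C leaves rename() onto an existing file implementation-defined,
       so the original is removed first. */
    if (remove(path) != 0)
        return -1;
    return rename(tmp_path, path);
}
```

A call such as replace_in_file("text.txt", "text.tmp", L'a', L'b') would then do the replacement without ever seeking in the file. Note that there is a short window between remove and rename in which the original no longer exists; on POSIX systems rename alone replaces the target atomically.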
Quote:
fseek(file, - sizeof(wchar_t), SEEK_CUR);
1. In a text file, you can only portably seek to positions previously returned by ftell (or to the beginning of the file).
2. sizeof(wchar_t) is a compile-time constant; even if your locale uses a fixed-size encoding, there is a priori no connection between that size and sizeof(wchar_t).
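To illustrate point 2, here is a small sketch assuming the file is encoded in UTF-8 (that is an assumption; the real encoding depends on the locale). The helpers utf8_encoded_len and step_back_offset are invented for this example. They show that the number of bytes a character occupies in the file varies from 1 to 4 and has nothing to do with sizeof(wchar_t).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of bytes UTF-8 uses for code point `cp`,
   or 0 if `cp` is not a valid Unicode scalar value. */
size_t utf8_encoded_len(unsigned long cp)
{
    if (cp < 0x80UL)
        return 1;
    if (cp < 0x800UL)
        return 2;
    if (cp >= 0xD800UL && cp <= 0xDFFFUL)
        return 0;                 /* surrogates cannot be encoded */
    if (cp < 0x10000UL)
        return 3;
    if (cp <= 0x10FFFFUL)
        return 4;
    return 0;
}

/* Hypothetical helper: byte offset that steps back over one character that
   was just read from a UTF-8 file. The cast happens before the negation,
   because negating an unsigned size_t gives a huge positive value. */
long step_back_offset(unsigned long cp)
{
    size_t n = utf8_encoded_len(cp);

    assert(n != 0);
    return -(long)n;
}
```

Even with the right offset, seeking like this is only meaningful on a stream opened in binary mode, and it only allows replacing a character with another one of the same encoded length, which is why copying the file remains the safer route.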