Mis-coded Chinese filenames

I frequently run into Windows filenames that are displayed as if they were encoded in CP437 (OEM) when the actual encoding is GB2312.

I’d like to use 010 Editor to make Windows treat such a filename as GB2312.

I’ve used 010 Editor’s ConvertString function before. Could an 010 Editor script use it for this purpose, and could I then run that script from the Explorer right-click menu or from a Windows command line?

You should be able to do this in 010 Editor, and there are a few different ways. You could write a simple script, something like:

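// Name of the file currently open in 010 Editor
string filename = GetFileName();
// Convert the name from the OEM character set to Simplified Chinese
string newFilename = ConvertString( filename, CHARSET_OEM, CHARSET_CHINESE_S );
// Save the file under the converted name, then close it
FileSave( newFilename );
FileClose();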

and then run 010 Editor from the command line with:

010editor FileToConvert.dat -script:ConvertFilename.1sc

You could also use the FindFiles function in your script to loop through and process all files in a certain directory. You may have to do some other conversions related to UTF-8 in the script, but it’s hard to say without seeing a sample of what needs to be converted. Let us know if you can’t get it to work.
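
To make the underlying conversion concrete, here is a minimal sketch in Python rather than 010 Editor script, so it is an illustration of the idea and not the script above. It assumes the garbled name is exactly what you get when GB2312 bytes are decoded as CP437. The function names `repair_name` and `repair_directory` are hypothetical, chosen only for this example.

```python
import os


def repair_name(garbled: str) -> str:
    """Undo a GB2312 filename that was mistakenly decoded as CP437."""
    raw = garbled.encode("cp437")   # recover the original bytes
    return raw.decode("gb2312")     # read those bytes as Simplified Chinese


def repair_directory(directory: str) -> list:
    """Rename each repairable entry in `directory`; return (old, new) pairs."""
    renamed = []
    for name in os.listdir(directory):
        try:
            fixed = repair_name(name)
        except (UnicodeEncodeError, UnicodeDecodeError):
            continue  # name is not this kind of mis-decoded text
        target = os.path.join(directory, fixed)
        if fixed == name or os.path.exists(target):
            continue  # nothing to change, or the target name is taken
        os.rename(os.path.join(directory, name), target)
        renamed.append((name, fixed))
    return renamed
```

Names that are plain ASCII pass through unchanged, so they are skipped. It may be worth trying this on a copy of the directory first, since a name that happens to be valid in both encodings would also be renamed.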

Graeme
SweetScape Software

Why do I still get garbled filenames when I use this code?

Right now Printf is expecting strings in UTF-8 format, so you would have to convert the string to UTF-8 before printing it. Let us know if you need some code to show how to do this. Cheers!
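
As a rough illustration of why the conversion matters, the sketch below (Python, not 010 Editor script) shows that the same text has different bytes in GB2312 and in UTF-8, so a consumer that expects UTF-8 needs the bytes re-encoded first. The helper name `gb2312_to_utf8` is hypothetical and exists only for this example.

```python
def gb2312_to_utf8(raw: bytes) -> bytes:
    """Re-encode GB2312 bytes as UTF-8 for a consumer that expects UTF-8."""
    text = raw.decode("gb2312")    # bytes -> characters
    return text.encode("utf-8")    # characters -> UTF-8 bytes


if __name__ == "__main__":
    gb_bytes = "测试".encode("gb2312")
    utf8_bytes = gb2312_to_utf8(gb_bytes)
    # The two byte sequences differ even though the text is identical.
    print(gb_bytes.hex(), utf8_bytes.hex())
```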

Graeme
SweetScape Software

With the following code, the string prints without garbled characters in the Output window, but the string displayed in the Value column of the Template Results panel is still garbled:

local char filename[] = GetFileName();

Printf("测试一下 ~ %s\n",filename);

Can you provide me with some code examples to ensure that the string displayed in the Value column is not garbled? Thank you.

Handling strings with different character encodings is always tricky. The ‘Value’ column of the Template Results displays strings using the character set selected under ‘View > Character Set’ on the main menu. You can force this to UTF-8 by selecting ‘View > Character Set > UTF-8’. If you want the filename to display correctly even when another character set is chosen for the file, another option is to switch to wide strings like this:

local wstring filename = GetFileNameW();

Graeme
SweetScape Software