View Ticket
Ticket UUID: d22946aa0c8052c4dc6edd4b296cb2cb3bc80d7d
Title: Adding files with non-English file names is not cross-platform
Status: Open Type: Code_Defect
Severity: Important Priority:
Subsystem: Resolution: Open
Last Modified: 2012-12-03 21:06:31
Version Found In: [3873aa7ecb]
Description & Comments:
Reproducing the problem, step by step:
  1. Create an empty repository on Linux.
  2. Add a file with a non-English file name, say "Текст-01.txt" (the word "text" in Russian with a numeric suffix), and commit the changes.
  3. Clone the repository to a Windows machine. When the cloned repository is opened, Fossil creates a file whose name appears garbled instead of "Текст-01.txt", which is definitely not the file name the user expects.
  4. On Windows, add another file with a non-English file name (for the above example it would be "Текст-02.txt"), commit the changes and sync with the parent repository.
  5. Update the working copy on Linux. As a result, a new file appears with the name "�����-02.txt" rather than the original Windows file name. In addition, the manifest file now contains file names in mixed encodings, which is a considerable problem when you try to parse it with console-based tools.

From examining the sources, I see that Fossil does not distinguish between file name encodings on different platforms and always assumes UTF-8. In contrast, console applications on Windows always use the national code page for file names. With the Russian localization of Windows, file names are encoded in Windows-1251.
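
The mismatch can be illustrated outside Fossil. The following is a minimal Python sketch, not Fossil code, and it assumes that Python's "cp1251" codec matches the Windows-1251 code page mentioned above. It decodes the bytes of each name with the "wrong" encoding to show what the other platform ends up displaying.

```python
# Illustration only: reproduces both symptoms using Python's codecs.
name = "Текст-01.txt"

# Linux side: the name is stored as UTF-8 bytes. A reader that assumes
# Windows-1251 turns every Cyrillic letter into two unrelated characters.
utf8_bytes = name.encode("utf-8")
seen_on_windows = utf8_bytes.decode("cp1251")

# Windows side: the second name arrives as Windows-1251 bytes. A reader
# that assumes UTF-8 finds invalid sequences and substitutes U+FFFD
# (the replacement character) for each Cyrillic letter.
cp1251_bytes = "Текст-02.txt".encode("cp1251")
seen_on_linux = cp1251_bytes.decode("utf-8", errors="replace")
```

Here seen_on_linux consists of five replacement characters followed by "-02.txt", which matches the file name reported in step 5.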

In src/db.c the sqlite3_win32_mbcs_to_utf8() function is used in some places when reading the repository database, so the repository opens even if its file name or path contains non-English characters. However, no further conversions (MBCS to UTF-8, UTF-8 to MBCS) are made when working with the files in the repository.
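
One possible shape of the missing conversions is sketched below in Python, purely as an illustration of the idea. The helper names native_to_repo and repo_to_native are hypothetical and do not exist in Fossil; the sketch assumes that names are always stored in the repository as UTF-8 and are converted only at the file system boundary.

```python
# Hypothetical helpers (not part of Fossil): keep UTF-8 inside the
# repository and convert only when touching the local file system.
REPO_ENCODING = "utf-8"

def native_to_repo(native_name: bytes, native_encoding: str) -> bytes:
    """Convert a file name from the platform encoding to UTF-8 for storage."""
    return native_name.decode(native_encoding).encode(REPO_ENCODING)

def repo_to_native(repo_name: bytes, native_encoding: str) -> bytes:
    """Convert a UTF-8 name from the repository to the platform encoding."""
    return repo_name.decode(REPO_ENCODING).encode(native_encoding)

# A name added on a Windows-1251 system, then checked out on a UTF-8 system.
added_on_windows = "Текст-02.txt".encode("cp1251")
stored = native_to_repo(added_on_windows, "cp1251")
checked_out_on_linux = repo_to_native(stored, "utf-8")
```

With both conversions in place, the manifest would hold a single encoding and each platform would see the name in its own encoding. Note that in this sketch a name containing characters that the target code page cannot represent raises UnicodeEncodeError, so a real implementation would need a policy for that case.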