Unable to upload file with german characters in filename
Posted: 2020-08-03 13:14
by test
Hi,
First of all, thanks again for your prompt help.
I have customised FileZilla v3.48.1.
It works for filenames with English characters.
However, I am unable to upload files whose names contain characters from other languages, for example German.
I have a file named Hörbücher.txt.
When I try to upload it over the Storj connection that I customised, it displays this error:
Code: Select all
Command: list 3a1
Command: put 3a1 "E:\Hörbücher.txt" "Hörbücher.txt"
Error: Unknown eventType 54
Error: File transfer failed
I am confused: if Hörbücher is printed correctly in the status bar, why does the upload fail?
The same file uploads successfully to any FTP/SFTP server (I have not edited that part; I have edited only the Storj part).
It seems to be a character encoding issue related to UTF-8. I am totally new to that.
So, how do I find out what FileZilla's default character encoding is, and how can I change it? Further, which encoding should I use to handle filenames in any language?
Thanks
Re: Unable to upload file with german characters in filename
Posted: 2020-08-03 16:51
by botg
You need to convert from the communication encoding between the stub and FileZilla (which is UTF-8) to the system's native encoding.
Re: Unable to upload file with german characters in filename
Posted: 2020-08-03 17:15
by botg
Remember that Windows, and Windows only, uses UTF-16 internally.
Re: Unable to upload file with german characters in filename
Posted: 2020-08-04 07:48
by test
I just tested it on Mac & Ubuntu.
No such issues occurred on either of them when uploading or downloading such files.
However, on Ubuntu I need to run the following before starting FileZilla:
export LC_ALL="en_US.UTF-8"
(because otherwise, if the most recently opened local directory in FileZilla contains a filename with such characters, a warning pop-up sometimes opens with the message
Code: Select all
A local filename could not be decoded.
Please make sure the LC_CTYPE (or LC_ALL) environment variable is set correctly.
Unless you fix this problem, files might be missing in the file listings.
No further warning will be displayed this session.
)
Re: Unable to upload file with german characters in filename
Posted: 2020-08-04 08:58
by test
So, the issue occurs only on Windows.
However, I wonder: if Windows uses UTF-16 internally and FileZilla uses UTF-8 internally, shouldn't Windows handle this, since UTF-16 is a superset of UTF-8?
Re: Unable to upload file with german characters in filename
Posted: 2020-08-04 09:01
by botg
UTF-16 is not a superset of UTF-8.
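To make this concrete, here is a minimal sketch (not FileZilla code) that encodes a single code point both ways. The helper names `encode_utf8` and `encode_utf16le` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Encode one Unicode code point as UTF-8 bytes.
std::vector<std::uint8_t> encode_utf8(char32_t cp)
{
    assert(cp <= 0x10FFFF);
    std::vector<std::uint8_t> out;
    if (cp < 0x80) {
        out.push_back(static_cast<std::uint8_t>(cp));
    } else if (cp < 0x800) {
        out.push_back(static_cast<std::uint8_t>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<std::uint8_t>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {
        out.push_back(static_cast<std::uint8_t>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<std::uint8_t>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<std::uint8_t>(0x80 | (cp & 0x3F)));
    } else {
        out.push_back(static_cast<std::uint8_t>(0xF0 | (cp >> 18)));
        out.push_back(static_cast<std::uint8_t>(0x80 | ((cp >> 12) & 0x3F)));
        out.push_back(static_cast<std::uint8_t>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<std::uint8_t>(0x80 | (cp & 0x3F)));
    }
    return out;
}

// Encode one Unicode code point as UTF-16 bytes in little-endian order.
std::vector<std::uint8_t> encode_utf16le(char32_t cp)
{
    assert(cp <= 0x10FFFF);
    std::vector<std::uint8_t> out;
    auto push_unit = [&out](std::uint16_t u) {
        out.push_back(static_cast<std::uint8_t>(u & 0xFF)); // low byte first
        out.push_back(static_cast<std::uint8_t>(u >> 8));
    };
    if (cp < 0x10000) {
        push_unit(static_cast<std::uint16_t>(cp));
    } else {
        cp -= 0x10000; // split into a surrogate pair
        push_unit(static_cast<std::uint16_t>(0xD800 + (cp >> 10)));
        push_unit(static_cast<std::uint16_t>(0xDC00 + (cp & 0x3FF)));
    }
    return out;
}
```

For "ö" (U+00F6), `encode_utf8` gives the bytes C3 B6 while `encode_utf16le` gives F6 00. Even plain "A" differs: 41 in UTF-8 versus 41 00 in UTF-16. The two are different encodings of the same code points, so neither contains the other.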
Re: Unable to upload file with german characters in filename
Posted: 2020-08-08 22:58
by test
botg wrote: ↑2020-08-03 16:51
You need to convert from the communication encoding between the stub and FileZilla (which is UTF-8) to the system's native encoding.
I read quite a lot about UTF-8 & UTF-16 in various resources and tried editing /filezilla/src/engine/storj/file_transfer.cpp
My earlier code, which works with every filename (ASCII/non-ASCII) on Mac/Ubuntu and with all ASCII filenames on Windows, is:
Code: Select all
int CStorjFileTransferOpData::Send()
{
    ......
    case filetransfer_transfer:
        .......
        if (download_) {
            ........
        }
        else {
            std::wstring path = remotePath_.GetPath();
            auto pos = path.find('/', 1);
            if (pos == std::string::npos) {
                path.clear();
            }
            else {
                path = path.substr(pos + 1) + L"/";
            }
            std::wstring localpath = localFile_; //LINE-A
            std::wstring remotepath = path + remoteFile_; //LINE-B
            return controlSocket_.SendCommand(L"put " + bucket_ + L" " + controlSocket_.QuoteFilename(localpath) + L" " + controlSocket_.QuoteFilename(remotepath));
        }
    ....
}
Re: Unable to upload file with german characters in filename
Posted: 2020-08-08 23:01
by test
I have tried a lot of options, but none of them works.
Below are the approaches (each replacing lines A and B) and the results when trying to upload a file with German characters in its name on Windows FileZilla:
1) Approach:
Code: Select all
std::wstring localpath = fz::to_native(localFile_); //LINE-A
std::wstring remotepath = path + remoteFile_; //LINE-B
Result:
Code: Select all
Status: Starting upload of E:\Hörbücher.txt
Status: Retrieving directory listing of "/0000"...
Command: list 0000
Command: put 0000 "E:\Hörbücher.txt" "Hörbücher.txt"
Error: Unknown eventType 54
Error: File transfer failed
2) Approach:
Code: Select all
std::wstring localpath = fz::to_native(localFile_); //LINE-A
std::wstring remotepath = fz::to_utf8(path + remoteFile_); //LINE-B
Result:
3) Approach:
Code: Select all
std::wstring localpath = fz::to_wstring_from_utf8(fz::to_string(localFile_));
std::wstring remotepath = fz::to_wstring_from_utf8(fz::to_string(path + remoteFile_));
Result:
Code: Select all
Status: Starting upload of E:\Office Work\Hörbücher.txt
Status: Retrieving directory listing of "/0000"...
Command: list 0000
Command: put 0000 "" ""
Error: Bad arguments
Error: File transfer failed
4) Approach:
Code: Select all
std::wstring localpath = fz::to_native(fz::to_string(localFile_));
std::wstring remotepath = fz::to_native(fz::to_string(path + remoteFile_));
Result:
Same error as in 1)
5) Approach:
Code: Select all
std::wstring localpath = fz::to_wstring(fz::to_utf8(fz::to_native(localFile_)));
std::wstring remotepath = fz::to_wstring(fz::to_utf8(fz::to_native(path + remoteFile_)));
Result:
Code: Select all
Status: Starting upload of E:\Hörbücher.txt
Status: Retrieving directory listing of "/0000"...
Command: list 0000
Command: put 0000 "E:\Hörbücher.txt" "Hörbücher.txt"
Error: Unknown eventType 54
Error: File transfer failed
Re: Unable to upload file with german characters in filename
Posted: 2020-08-08 23:11
by test
I also tried #include <codecvt> and related facilities.
However, FileZilla uses C++17, in which <codecvt> is deprecated.
I also tried converting wide-character strings (std::wstring) to multibyte strings (std::string).
However, it seems that both approaches would be useless in my case anyway, because of these constraints:
a) on the one hand, FileZilla supports UTF-8 only,
whereas
b) the controlSocket_.SendCommand() and controlSocket_.QuoteFilename() functions, as well as the localFile_ and remoteFile_ variables, accept only std::wstring.
The issue is probably further complicated because wchar_t, the character type of std::wstring, is 2 bytes wide on Windows but typically 4 bytes wide on Mac/Unix.
However, I wonder how the same file gets uploaded to an SFTP server.
I can't figure this out. I am really confused.
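On the size question: wchar_t is typically 2 bytes on Windows (UTF-16 code units) and 4 bytes on Linux and macOS (UTF-32 code points), and never 1 byte. A small sketch to check this on a given platform follows. `wide_storage_bytes` is a hypothetical helper written for this post, not part of FileZilla.

```cpp
#include <cstddef>
#include <string>

// Bytes used by the characters of a wide string (terminator excluded).
// Hypothetical helper, for illustration only.
std::size_t wide_storage_bytes(const std::wstring& s)
{
    return s.size() * sizeof(wchar_t);
}
```

For `L"H\u00f6rb\u00fccher.txt"` (13 characters) this returns 26 where wchar_t is 2 bytes wide and 52 where it is 4 bytes wide. The same std::wstring source code therefore holds differently encoded data on each platform.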
Re: Unable to upload file with german characters in filename
Posted: 2020-08-10 09:33
by botg
The return type of fz::to_native is fz::native_string, which is a variable typedef depending on target operating system.
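A minimal sketch of what such a platform-dependent typedef looks like. The `sketch` namespace and `native_is_wide` constant are hypothetical names for this illustration. The code only mirrors the idea and is not libfilezilla's actual source.

```cpp
#include <string>
#include <type_traits>

namespace sketch {
#ifdef _WIN32
// On Windows the native encoding is UTF-16, so the native string is wide.
using native_string = std::wstring;
#else
// Elsewhere the native string is a narrow string in the locale's encoding.
using native_string = std::string;
#endif

// True when the native string type is already std::wstring.
constexpr bool native_is_wide = std::is_same_v<native_string, std::wstring>;
} // namespace sketch
```

Under this assumption, converting a std::wstring "to native" on Windows yields another std::wstring with the same content. That would explain why approach 1 produces exactly the same command and the same error as the original code.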
Re: Unable to upload file with german characters in filename
Posted: 2020-08-10 10:27
by test
botg wrote: ↑2020-08-10 09:33
The return type of fz::to_native is fz::native_string, which is a variable typedef depending on target operating system.
Yes, I understood that
--------------------------------------------------------------
Following the code in /src/engine/sftp/filetransfer.cpp, I also tried removing the "L" prefix from the string literals, as in this pseudocode:
control_socket.sendCommand("put" + " " + <std::wstring bucket converted to std::string> + " " + <std::wstring localFile_ converted to std::string> + " " + <std::wstring remote_ converted to std::string>);
However, that does not even compile, since control_socket.sendCommand() accepts only std::wstring arguments.
-------------------------------------------------------------
However, I would like to know what I am doing wrong.
Is there any check that my approach is missing?
Or could you point me to a specific part of the code in /src/engine/sftp or /src/engine/ftp that I could use as a reference?
Thanks again
Re: Unable to upload file with german characters in filename
Posted: 2020-08-10 11:17
by botg
Don't modify anything in src/engine/storj. When commands are issued, they are converted from wstring to UTF-8, which is the encoding of the inter-process communication.
In the stub, convert from UTF-8 to the system's native encoding.
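As a rough sketch of that stub-side step, assuming the stub receives the filename as UTF-8 bytes and needs UTF-16 for Windows file APIs: the function name `utf8_to_utf16` is hypothetical. In a real stub, an existing conversion such as fz::to_wstring_from_utf8 or the Win32 MultiByteToWideChar call with CP_UTF8 would likely be the better choice. The portable decoder below only shows what the conversion does.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>

// Decode UTF-8 bytes into UTF-16 code units. Returns std::nullopt on
// malformed input (invalid lead byte, truncated sequence, bad continuation
// byte, surrogate, or out-of-range code point). Overlong encodings are not
// rejected in this sketch.
std::optional<std::u16string> utf8_to_utf16(std::string_view in)
{
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const auto b0 = static_cast<std::uint8_t>(in[i]);
        char32_t cp = 0;
        std::size_t extra = 0; // number of continuation bytes expected
        if (b0 < 0x80)                { cp = b0;        extra = 0; }
        else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; extra = 1; }
        else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; extra = 2; }
        else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; extra = 3; }
        else return std::nullopt;

        if (in.size() - i - 1 < extra) return std::nullopt; // truncated
        for (std::size_t k = 1; k <= extra; ++k) {
            const auto b = static_cast<std::uint8_t>(in[i + k]);
            if ((b & 0xC0) != 0x80) return std::nullopt;
            cp = (cp << 6) | (b & 0x3F);
        }
        i += extra + 1;

        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return std::nullopt;
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            cp -= 0x10000; // encode as a surrogate pair
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

With this, the UTF-8 bytes for "Hörbücher.txt" received over the inter-process channel become 13 UTF-16 code units, which is the form that Windows wide-character file APIs expect. On Mac and Ubuntu with a UTF-8 locale the native encoding already matches the channel, which would fit the observation that uploads work there without any conversion.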