botg wrote:
Puritan, I've just released 2.2.23a which fixes a problem with UTF-8 detection. Please try this version. If it still won't work, would it be possible to provide some logs or even better, provide a temporary account on your server so I can test myself?

Yes, 2.2.23a works perfectly! Thanks so much.
Server not writing unicode names correctly (b/c of utf8?)
botg wrote:
Which ANSI? If a client from Germany connects to a Chinese server for example it tries to interpret the chinese characters as German. FTP has only been specified for 7 bit US ASCII.

The problem is that although the German clients can't see what the ASCII means, they can still enter the directory when using the correct ASCII name (instead, if they use some Chinese viewing tools, they can see the correct name).
Nonstandardized encoding guessing versus specified and universal UTF-8.
However, under the current implementation, even if Chinese clients connect to a Chinese server, they can't see the names correctly and can't enter the directories. This is the problem.
Moreover, this breaks support for all download accelerators such as FlashGet, because none of these clients use UTF-8 to handle FTP requests (filenames contain ? characters).
If a client does not support UTF-8, it should still be possible to browse the server. Unless of course the client has some more bugs, like trying to interpret the characters > 127.
Moreover, this breaks support for all download accelerators such as FlashGet, because none of these clients use UTF-8 to handle FTP requests (filenames contain ? characters).

If these so-called "download accelerators" replace high-ASCII characters with question marks, they are incredibly broken. Feel free to mail them RFC 959 and RFC 2640.
botg wrote:
If a client does not support UTF-8, it should still be possible to browse the server. Unless of course the client has some more bugs like trying to interpret the characters > 127.

No, you don't understand. The problem is that when a client does not support UTF-8, FileZilla Server still sends UTF-8 filenames to the clients. As a result, when the client interprets the filenames, certain bytes < 127 end up in the first-byte position of a pair (in a double-byte OS, messing up the first byte destroys two bytes and results in a ?). This may not be a bug in the clients, but most likely a limitation of the double-byte OS (or the DBCS architecture).
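The damage described here can be reproduced in a few lines of Python. This is a generic illustration of the DBCS mojibake mechanism, not FileZilla code; Big5 merely stands in for the client's local codepage:

```python
# The server stores a Chinese name and sends it UTF-8 encoded.
wire = "文件".encode("utf-8")

# An old DBCS client knows nothing about UTF-8 and decodes the bytes
# with its local codepage (Big5 here). Byte pairs that are invalid in
# Big5 collapse into replacement characters -- the '?' the users see.
seen = wire.decode("big5", errors="replace")

# The round trip is broken: re-encoding what the client displays does
# not reproduce the original bytes, so the client can no longer
# request the file or directory by its real name.
back = seen.encode("big5", errors="replace")
```

Since the first byte of each damaged pair is gone, there is no way for the client to reconstruct the original name, which is exactly why sending the directory back to the server fails.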
RFC 2640 has a misleading paragraph:

The character set used to store files SHALL remain a local decision
and MAY depend on the capability of local operating systems. Prior to
the exchange of pathnames they SHOULD be converted into a ISO/IEC
10646 format and UTF-8 encoded. This approach, while allowing
international exchange of pathnames, will still allow backward
compatibility with older systems because the code set positions for
ASCII characters are identical to the one byte sequence in UTF-8.

The code set is ONLY identical for 7-bit ASCII and UTF-8. For double-byte characters the code points are completely different, so sending non-English UTF-8 filenames directly to older clients DOES break compatibility.
- Servers which support this specification, when presented a pathname
from an old client (one which does not support this specification),
can nearly always tell whether the pathname is in UTF-8 (see B.1)
or in some other code set. In order to support these older clients,
servers may wish to default to a non UTF-8 code set. However, how a
server supports non UTF-8 is outside the scope of this
specification.

If the server sends UTF-8 names to old clients, it is impossible for the clients to send the correct names back to the server on a DBCS OS, because the string is already damaged.
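The "can nearly always tell" claim in B.1 amounts to checking whether a byte sequence is valid UTF-8, since DBCS byte pairs rarely happen to form valid UTF-8 multi-byte sequences. A minimal sketch (the function name is mine, not FileZilla's):

```python
def looks_like_utf8(raw: bytes) -> bool:
    """RFC 2640 B.1 heuristic: a pathname that decodes as valid
    UTF-8 almost certainly is UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

utf8_name = "中文目录".encode("utf-8")  # valid UTF-8 by construction
gbk_name = "中文目录".encode("gbk")     # GBK lead/trail byte pairs
```

The "nearly" matters: a pure-ASCII name passes the check regardless of the client's real encoding, and the occasional short DBCS string can also accidentally be valid UTF-8, which is why the RFC says servers may still want to default to a non-UTF-8 code set.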
starkwong, the part of RFC 2640 refers to compatibility with the old specified charset, which was US-ASCII.

RFC 959 for FTP:

control connection
The communication path between the USER-PI and SERVER-PI for
the exchange of commands and replies. This connection follows
the Telnet Protocol.

RFC 854 for Telnet:

THE NETWORK VIRTUAL TERMINAL
1. When a TELNET connection is first established, each end is
assumed to originate and terminate at a "Network Virtual Terminal",
or NVT. [...] The code
set is seven-bit USASCII in an eight-bit field, except as modified
herein. Any code conversion and timing considerations are local
problems and do not affect the NVT.

So UTF-8 support does not break any compatibility, as servers using other non-ASCII charsets are not covered in the specification.

botg wrote:
starkwong, the part of RFC 2640 refers to compatibility with the old specified charset, which was US-ASCII.

Yes, so since the RFC does not cover non-US-ASCII, you just don't want to deal with the compatibility of non-US-ASCII, right?
You don't understand. The situation is that compatibility of FileZilla Server with non-UTF-8 clients IS ALREADY BROKEN in DBCS versions of Windows.
OK, I give up. You win. I'll just look for other FTP server software.
I could boldly claim that all servers not using the German ANSI character set are broken, because I can't see filenames on, for example, Chinese servers. There has never been compatibility between non-US-ASCII character sets; it was pure luck that it worked in the case where server and client were using the same local character set.

How should a client know a server is using DBCS (and vice versa)? The only thing a client knows (and, according to the specs, can rely on) is that a server is supposed to use 7-bit US-ASCII. Unless the UTF-8 specification is used, every other character set is unspecified, nonstandard behaviour and broken by definition.
Yes, you're right about the RFC thingy, and I understand your reasons.
But what you need to understand, too, is that apparently the Local Encoding method worked very well for most people, including me.
I think what people want is not 100% compliance with the RFC; they want it to work.
Only very few clients I found on the net support UTF-8, so having a pure UTF-8 server is not a good idea in this case.
I can understand the other people, too. You can't tell everyone: "Hey, your client is broken, update it or get another one!" What they'll do is leave your server and never return...
I don't want to offend anyone here, but all I get since UTF-8 is problems. And I guess I'm not the only one...
boco
I just upgraded from 0.9.2 to 0.9.17b today and got the same issue.
I totally agree with starkwong that you should only enable UTF-8 after the client says it wants it, or have an option on the server to default to the traditional encoding scheme.
I have been using it for more than a year and I love it. However, I can say FileZilla Server is totally useless in a DBCS society unless we have the above option.
Can you clearly let us know whether there will be such an option in a future release?
Thank you.
--
Derek
There should be an option with three choices, like:

[ ] Pure Unicode mode <-- currently, the server works in this mode
[ ] Use Local Encoding until OPTS UTF8 ON is sent <-- this would be my choice
[ ] Use Local Encoding only <-- 0.9.14a and below

Don't know if it's possible though...
boco
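The three modes map naturally onto a per-session flag that flips when a capable client sends `OPTS UTF8 ON` (the opt-in command from RFC 2640). A rough sketch; the class, constant names, and the GBK default are invented for illustration and are not FileZilla Server's actual code:

```python
PURE_UTF8, LOCAL_UNTIL_OPTS, LOCAL_ONLY = range(3)

class FtpSession:
    """Tracks which encoding one control connection should use."""

    def __init__(self, mode: int, local_codec: str = "gbk"):
        self.mode = mode
        self.local_codec = local_codec
        # Pure Unicode mode talks UTF-8 from the first byte;
        # the other two modes start with the local encoding.
        self.utf8_on = mode == PURE_UTF8

    def handle_opts(self, line: str) -> str:
        # Only the middle mode lets a client opt in; LOCAL_ONLY refuses.
        if line.strip().upper() == "OPTS UTF8 ON" and self.mode == LOCAL_UNTIL_OPTS:
            self.utf8_on = True
            return "200 UTF8 mode enabled"
        return "504 Command not implemented for that parameter"

    def encode_name(self, name: str) -> bytes:
        codec = "utf-8" if self.utf8_on else self.local_codec
        return name.encode(codec, errors="replace")
```

Under the middle mode, an old FlashGet-style client keeps getting local-codepage names forever, while a new client that announces `OPTS UTF8 ON` gets UTF-8 on the same server, which is exactly the backward compatibility people in this thread are asking for.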
Hi,
I'm having trouble with FileZilla Server 0.9.18, because of UTF-8 I think.
I realized I can, with an old FileZilla client, issue OPTS UTF8 OFF and then the server sends the right filenames, BUT I have to deal with browsers, too.
When I use an FTP link containing accented (Italian) characters with Mozilla Firefox 1.5.0.4, I still get an error about a directory not found: it shows my filename WITHOUT the accented characters, and of course THAT filename doesn't exist...
I think we need a server option too (maybe an XML one), or a default of OFF.
Until then, I have to downgrade my FileZilla Server or give up on it.
Which is the last version that doesn't comply so strictly with the RFCs?