
July 17th, 2000, 04:54 AM
Does anyone know whether there is any way to determine the character encoding of data sent from forms by POST or GET?
Japanese has several encodings for the JIS character set, only one of which is 7-bit safe. We need to convert to that 7-bit safe encoding (ISO-2022-JP) to cover some weird situations that shouldn't occur but do, where the 8th bit gets stripped. The standard approach tries to automatically detect which of the common encodings was used (either EUC-JP or Shift_JIS), but that leaves some loose ends (and the possibility of a flubbed conversion).
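To make the detect-then-convert step concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not what we actually run: the function name `to_iso2022jp` and the order in which the candidates are tried are made up for this example. It tries a strict decode with each candidate encoding and re-encodes the first one that succeeds as ISO-2022-JP.

```python
# Hypothetical helper, for illustration only: guess between EUC-JP and
# Shift_JIS by strict trial decoding, then re-encode as 7-bit ISO-2022-JP.
CANDIDATES = ("euc_jp", "shift_jis")  # assumed order; not authoritative


def to_iso2022jp(raw: bytes) -> bytes:
    """Return raw form data re-encoded as ISO-2022-JP.

    Raises ValueError if no candidate encoding decodes the input cleanly.
    """
    for name in CANDIDATES:
        try:
            text = raw.decode(name)  # strict: invalid bytes raise an error
        except UnicodeDecodeError:
            continue
        # The first clean decode wins. Some byte strings are valid in both
        # encodings, so this can still pick the wrong one.
        return text.encode("iso2022_jp")
    raise ValueError("input is neither valid EUC-JP nor valid Shift_JIS")


if __name__ == "__main__":
    sample = "日本語".encode("shift_jis")
    converted = to_iso2022jp(sample)
    # Every byte of the result fits in 7 bits.
    print(all(b < 0x80 for b in converted))
```

The weak point is exactly the one mentioned above: short inputs in particular can decode without error under both candidates, and then the guess depends only on the order of the list.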
We are using the (accursed) Shift_JIS in our HTML because Windows and Mac use it at the system level and GoLive produces it without effort. So far, we have been safe in assuming Shift_JIS from the form, but I would like to refer to some authoritative discussion of which character encoding will be sent. In other words, is it guaranteed to be the encoding of the page, or might some browser setting cause the form data to be sent in one of the other encodings?
(I am not talking about UUENCODE here; that is an entirely separate issue.)