If I have UTF-8 encoded data, is it safe to send it in an HTTP body? The thing is that UTF-8 data could include control characters, including the null character (binary zero), which as far as I know are not allowed by the HTTP RFC. So what should I do with such data? Encode it with Base64?
On the other hand, the data I have in UTF-8 is XML, and the XML specification forbids most control characters, including the null character (http://www.w3.org/TR/2006/REC-xml-20060816/#charsets)…
So I guess that UTF-8 in general is not safe, but XML in UTF-8 is safe and can be embedded directly in the HTTP body, e.g. in a MIME multipart body, without needing something like quoted-printable encoding.
BR
STeN
HTTP allows the sending of ARBITRARY data. So yes, UTF-8 is safe for HTTP, but on the gripping hand, 0x00 isn’t really “safe” anywhere, because plenty of software treats it as a string terminator. Both HTTP request and response bodies have a method for dealing with arbitrary data, namely the Content-Length header (or chunked transfer encoding). MIME multipart, which is often used for HTTP POST bodies, delimits its parts with a boundary string instead.
There is no control character that can cause a compliant HTTP implementation to assume that the body is done before it has read the number of bytes given by Content-Length.
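To make the framing concrete, here is a rough Python sketch of the idea, not a real HTTP implementation. The helper names build_request and read_body, the /upload path and the example.invalid host are all made up for illustration. It builds a POST whose body contains a NUL byte and even a blank line, then reads it back by counting bytes rather than by looking for a terminator.

```python
import io


def build_request(body: bytes, path: str = "/upload") -> bytes:
    """Frame `body` as an HTTP/1.1 POST (path and host are placeholders)."""
    head = (
        f"POST {path} HTTP/1.1\r\n"
        "Host: example.invalid\r\n"
        "Content-Type: application/octet-stream\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    ).encode("ascii")
    return head + body


def read_body(stream) -> bytes:
    """Parse the headers, then read exactly Content-Length bytes of body."""
    length = 0
    stream.readline()  # skip the request line
    while True:
        line = stream.readline()
        if line in (b"\r\n", b""):  # a blank line ends the header section
            break
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    # The body is delimited purely by its length, so its content is opaque.
    return stream.read(length)


# UTF-8 text containing a NUL byte and a CRLF CRLF sequence.
payload = "caf\u00e9\x00\r\n\r\nmore".encode("utf-8")
received = read_body(io.BytesIO(build_request(payload)))
print(received == payload)
```

The receiver never inspects the body bytes to decide where the message ends, which is why no Base64 or quoted-printable step is needed for the HTTP layer itself.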