I’m sending a gzipped string from C# (using SharpZipLib) to PHP, where I decompress it with readgzfile. This works; however, each character in the string is followed by two strange characters (in vim on the console they are displayed as ^@). I also tried gzopen/gzread, with the same result.
When I strip the non-ASCII characters from the string with $clean = preg_replace('/[^(\x20-\x7F)]*/','', $string); the $clean string is identical to the one in C#.
While this works, I would like to know what is happening and why so I can make sure this will always work or come up with a better solution.
Given that the string is created on Windows, it’s likely that some multibyte encoding is being used.
You can verify this yourself by using bin2hex($string) and checking the hexadecimal representation instead of relying on vim. If either UTF-16 or UCS-2 is being used, you can convert them like so:
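A minimal sketch of such a conversion, assuming the C# side encoded the string as UTF-16LE (which is what Encoding.Unicode produces) and that the mbstring extension is available. The $raw value is a hypothetical stand-in for the decompressed data, not the asker's actual payload:

```php
<?php
// Hypothetical stand-in for the decompressed payload: "Hi" encoded as UTF-16LE.
$raw = "H\x00i\x00";

// Inspect the raw bytes; 00 bytes interleaved with the text suggest UTF-16.
echo bin2hex($raw), PHP_EOL; // 48006900

// Convert to UTF-8. The source encoding is an assumption: use 'UTF-16BE'
// or 'UCS-2' instead if the hex dump indicates otherwise.
$utf8 = mb_convert_encoding($raw, 'UTF-8', 'UTF-16LE');

echo $utf8, PHP_EOL; // Hi
```

Unlike stripping bytes with preg_replace, converting the encoding should also preserve characters outside the ASCII range. If mbstring is not installed, iconv('UTF-16LE', 'UTF-8', $raw) is a possible alternative.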