By readable UTF, I mean any string that is valid UTF-8; the user does not, of course, need a font that can display it.
Examples of readable strings:
$readable_str0 = "Mary had a little lamb.";
$readable_str1 = "Příšerně žluťoučký kůň úpěl ďábelské ódy.";
$readable_str4 = "صِف خَلقَ خَودِ كَمِثلِ الشَمسِ إِذ بَزَغَت يَحظى الضَجيعُ بِها نَجلاءَ مِعطارِ";
$readable_str5 = "ཨ་ཡིག་དཀར་མཛེས་ལས་འཁྲུངས་ཤེས་བློའི་གཏེར༎"; //(Dzongkha)
$readable_str7 = "とりなくこゑす ゆめさませ みよあけわたる";
$readable_str8 = "TWFyeSBoYWQgYSBsaXR0bGUgbGFtYi4=";
Examples of strings that are not readable:
$not_readable_str0 = "�M,�T�HLQHT��,)�IU�I�M�";
$not_readable_str1 = "9��Příšerně žluťoučký kůň úpěl ďábelské ódy.";
// this has some odd characters at the beginning, so it should count as unreadable
// it was the result of gzdeflate on readable str 1
$not_readable_str4 = "ŹĎ5ůĹńŁV»×~1xâţöÚkkąő«¶’ŢáJ";
// a random selection of bytes from a GIF file
A kind of dirty hack that will most likely fail in some cases:
and compare the lengths of $str and $str2.
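A more direct alternative to comparing lengths might be to validate the bytes themselves. The following is only a sketch under stated assumptions: `is_readable_utf8` is a hypothetical helper name (not from the post), it relies on `mb_check_encoding()` when the mbstring extension is loaded, and otherwise falls back to `preg_match()` with the `u` modifier, which rejects subjects that are not valid UTF-8. The invalid samples are written as explicit byte escapes so that they survive copy and paste.

```php
<?php
// Hypothetical helper: returns true when $str is a well-formed UTF-8 byte sequence.
function is_readable_utf8(string $str): bool
{
    if (function_exists('mb_check_encoding')) {
        // Validates the raw bytes against the UTF-8 encoding rules.
        return mb_check_encoding($str, 'UTF-8');
    }
    // Fallback: with the /u modifier, PCRE fails on invalid UTF-8 subjects,
    // so the empty pattern matches (returns 1) only for valid input.
    return preg_match('//u', $str) === 1;
}

var_dump(is_readable_utf8("Mary had a little lamb."));           // bool(true)
var_dump(is_readable_utf8("TWFyeSBoYWQgYSBsaXR0bGUgbGFtYi4="));  // bool(true)
var_dump(is_readable_utf8("9\xFF\xFEP"));                        // bool(false)
var_dump(is_readable_utf8("\x9C\xCD\x2C\xAA"));                  // bool(false)
```

Note that this only tests well-formedness: binary data that happens to consist of valid sequences (for example, bytes that are all plain ASCII) would still pass.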