I would like to make my debug handler (installed with qInstallMsgHandler) handles UTF-8, however it seems it can only be defined as void myMessageOutput(QtMsgType type, const char *msg) and const char* doesn’t handle UTF-8 (once displayed, it’s just random characters).
Is there some way to define this function as void myMessageOutput(QtMsgType type, QString msg), or maybe some other way to make it work?
This is my current code:
void myMessageOutput(QtMsgType type, const char *msg) {
QString message = "";
QString test = QString::fromUtf8(msg);
// If I break into the debugger here. both "test" and "msg" contain a question mark.
switch (type) {
case QtDebugMsg:
message = QString("[Debug] %1").arg(msg);
break;
case QtWarningMsg:
message = QString("[Warning] %1").arg(msg);
break;
case QtCriticalMsg:
message = QString("[Critical] %1").arg(msg);
break;
case QtFatalMsg:
message = QString("[Fatal] %1").arg(msg);
abort();
}
Application::instance()->debugDialog()->displayMessage(message);
}
Application::Application(int argc, char *argv[]) : QApplication(argc, argv) {
debugDialog_ = new DebugDialog();
debugDialog_->show();
qInstallMsgHandler(myMessageOutput);
qDebug() << QString::fromUtf8("我");
}
If you step through the code in the debugger you will find out that QDebug and qt_message first construct a QString from the const char* and then use toLocal8Bit on this string.
The only way I can think of to circumvent this: Use your own coding (something like “[E68891]”) or some other coding like uu-encode or base64-encoding that uses only ASCII characters and decode the string in your message handler.
You should also consider to use the version qDebug(“%s”, “string”) to avoid quotes and additional whitespace (see this question).
Edit: the toLocal8Bit happens in the destructor of QDebug that is call at the end of a qDebug statement (qdebug.h line 85). At least on the Windows platform this calls toLatin1 thus misinterpreting the string. You can prevent this by calling the following lines at the start of your program:
On some platforms UTF-8 seems to be the default text codec.