The Archive Base

Editorial Team
Asked: June 3, 2026

I realize this has been asked before, on this very forum no less, but

I realize this has been asked before, on this very forum no less, but the proposed solution was not reliable for me.

I have been working on this for a week or more by now, and I stayed up ’till 3am yesterday working on it… But I digress, let me get to the issue at hand:

For those unaware, mIRC uses ASCII control codes to control character color, underline, weight, and italics. In hexadecimal, the code for color is 03, bold is 02, underline is 1F, italic is 1D, and reverse (white text on a black background) is 16.

As an example of the form this data is going to come in, we have (in regex escape notation, because those characters will not print):

\x034this text is red\x033this text is green\x03 \x02bold text\x02
\x034,3this text is red with a green background\x03

Et-cetera.

Below are the two functions I have attempted to modify for my own use, but they have returned unreliable results. To be specific about ‘unreliable’ before I get into the code: sometimes the text would parse, and other times there would still be control codes left in it, and I can’t figure out why.

function mirc2html($x) {
    $c = array("FFF","000","00007F","009000","FF0000","7F0000","9F009F","FF7F00","FFFF00","00F800","00908F","00FFFF","0000FF","FF00FF","7F7F7F","CFD0CF");
    $x = preg_replace("/\x02(.*?)((?=\x02)\x02|$)/", "<b>$1</b>", $x);
    $x = preg_replace("/\x1F(.*?)((?=\x1F)\x1F|$)/", "<u>$1</u>", $x);
    $x = preg_replace("/\x1D(.*?)((?=\x1D)\x1D|$)/", "<i>$1</i>", $x);
    $x = preg_replace("/\x03(\d\d?),(\d\d?)(.*?)(?(?=\x03)|$)/e", "'</span><span style=\"color: #'.\$c[$1].'; background-color: #'.\$c[$2].';\">$3</span>'", $x);
    $x = preg_replace("/\x03(\d\d?)(.*?)(?(?=\x03)|$)/e", "'</span><span style=\"color: #'.\$c[$1].';\">$2</span>'", $x);
    //$x = preg_replace("/(\x0F|\x03)(.*?)/", "<span style=\"color: #000; background-color: #FFF;\">$2</span>", $x);
    //$x = preg_replace("/\x16(.*?)/", "<span style=\"color: #FFF; background-color: #000;\">$1</span>", $x);
    //$x = preg_replace("/\<\/span\>/","",$x,1);
    //$x = preg_replace("/(\<\/span\>){2}/","</span>",$x);
    return $x;
}

function color_rep($matches) {
    $matches[2] = ltrim($matches[2], "0");
    $bindings = array(0=>'white',1=>'black',2=>'blue',3=>'green',4=>'red',5=>'brown',6=>'purple',7=>'orange',8=>'yellow',9=>'lightgreen',10=>'#00908F',
        11=>'lightblue',12=>'blue',13=>'pink',14=>'grey',15=>'lightgrey');
    $preg = preg_match_all('/(\d\d?),(\d\d?)/',$matches[2], $col_arr);
    //print_r($col_arr);
    $fg = isset($bindings[$matches[2]]) ? $bindings[$matches[2]] : 'transparent';
    if ($preg == 1) {
        $fg = $bindings[$col_arr[1][0]];
        $bg = $bindings[$col_arr[2][0]];
    }
    else {
        $bg = 'transparent';
    }


    return '<span style="color: '.$fg.'; background: '.$bg.';">'.$matches[3].'</span>';
}

And, in case it is relevant, where the code is called:

$logln = preg_replace_callback("/(\x03)(\d\d?,\d\d?|\d\d?)(\s?.*?)(?(?=\x03)|$)/","color_rep",$logln);

Sources: First, Second

I’ve of course also looked at the methods used by various PHP/AJAX-based IRC clients, without any success. As to doing this mIRC-side, I’ve looked there as well, and although the results have been more reliable than with PHP, the data sent to the server increases exponentially to the point that the socket times out on upload, so it isn’t a viable option.

As always, any help in this matter would be appreciated.

1 Answer

  1. Editorial Team
    Added an answer on June 3, 2026 at 2:56 am

    You should divide the problem, for example with a tokenizer. A tokenizer will scan the input string and turn the special parts into named tokens, so the rest of your script can identify them. Usage example:

    $mirc = "\x034this text is red\x033this text is green\x03 \x02bold text\x02
    \x034,3this text is red with a green background\x03";
    
    $tokenizer = new Tokenizer($mirc);
    
    while(list($token, $data) = $tokenizer->getNext())
    {
        switch($token)
        {
            case 'color-fgbg':
                printf('<%s:%d,%d>', $token, $data[1], $data[2]);
                break;
    
            case 'color-fg':
                printf('<%s:%d>', $token, $data[1]);
                break;
    
            case 'color-reset':
            case 'style-bold':
                printf('<%s>', $token);
                break;
    
            case 'catch-all':
                echo $data[0];
                break;
    
            default:
                throw new Exception(sprintf('Unknown token <%s>.', $token));
        }
    }
    

    This does not do much yet beyond identifying the interesting parts and their (sub-)values, as the output demonstrates:

    <color-fg:4>this text is red<color-fg:3>this text is green<color-reset> <style-bold>bold text<style-bold>
    <color-fgbg:4,3>this text is red with a green background<color-reset>
    

    It should be relatively easy for you to modify the loop above and handle the states like opening/closing color and font-variant tags like bold.
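
    As a starting point, here is a minimal sketch of what such a state-handling step could look like. It is not part of the original example: the function names renderTokens() and styledRun() are made up for illustration, the palette is copied from the mirc2html() function in the question, and it assumes PHP 7.1 or later. It also assumes that a foreground-only color code leaves the current background untouched. Instead of opening and closing tags as codes arrive, it remembers the current state and wraps each run of plain text in a single span, which avoids mis-nested tags:

    ```php
    <?php
    // Hypothetical helper: wrap one run of text in a <span> for the given state.
    function styledRun(string $text, ?string $fg, ?string $bg, bool $bold): string
    {
        if ($text === '') {
            return '';
        }
        $style = array();
        if ($fg !== null) $style[] = 'color: #' . $fg . ';';
        if ($bg !== null) $style[] = 'background-color: #' . $bg . ';';
        if ($bold)        $style[] = 'font-weight: bold;';

        // Escape the text so that "<" or "&" in the log cannot break the markup.
        $escaped = htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
        if (!$style) {
            return $escaped;
        }
        return '<span style="' . implode(' ', $style) . '">' . $escaped . '</span>';
    }

    // Hypothetical renderer: $tokens is a list of array($name, $data) pairs,
    // in the same shape that getNext() returns them.
    function renderTokens(array $tokens): string
    {
        // Palette taken from the question; the index is the mIRC color number.
        $palette = array("FFF","000","00007F","009000","FF0000","7F0000","9F009F","FF7F00",
                         "FFFF00","00F800","00908F","00FFFF","0000FF","FF00FF","7F7F7F","CFD0CF");
        $fg = null;
        $bg = null;
        $bold = false;
        $html = '';
        $text = '';

        foreach ($tokens as $token) {
            list($name, $data) = $token;
            if ($name === 'catch-all') {
                $text .= $data[0];
                continue;
            }
            // Any control token ends the current run of text.
            $html .= styledRun($text, $fg, $bg, $bold);
            $text = '';
            switch ($name) {
                case 'color-fgbg':
                    $fg = $palette[(int) $data[1]] ?? null;
                    $bg = $palette[(int) $data[2]] ?? null;
                    break;
                case 'color-fg':
                    $fg = $palette[(int) $data[1]] ?? null;
                    break;
                case 'color-reset':
                    $fg = $bg = null;
                    break;
                case 'style-bold':
                    $bold = !$bold; // the same code switches bold on and off
                    break;
                default:
                    throw new Exception(sprintf('Unknown token <%s>.', $name));
            }
        }
        // Flush whatever text is left after the last control code.
        return $html . styledRun($text, $fg, $bg, $bold);
    }
    ```

    To use it with the tokenizer, collect the pairs returned by getNext() into an array and pass that array to renderTokens(). Color numbers outside the 16-entry palette are simply treated as "no color" in this sketch.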

    The tokenizer itself defines a set of tokens and tries to find them one after the other at a certain offset (starting at the beginning of the string). The tokens are defined by regular expressions:

    /**
     * regular expression based tokenizer,
     * first token wins.
     */
    class Tokenizer
    {
        private $subject;
        private $offset = 0;
        private $tokens = array(
            'color-fgbg'  => '\x03(\d{1,2}),(\d{1,2})',
            'color-fg'    => '\x03(\d{1,2})',
            'color-reset' => '\x03',
            'style-bold'  => '\x02',
            'catch-all' => '.|\n',
        );
        public function __construct($subject)
        {
            $this->subject = (string) $subject;
        }
        ...
    

    As this private array shows, the tokens are simple regular expressions, and each gets its name from its array key. That’s the name used in the switch statement above.

    The getNext() function looks for a token at the current offset and, if one is found, advances the offset and returns the token including all subgroup matches. Because offsets are involved, the more detailed $matches array is simplified (offsets removed), as the main routine normally does not need to know about offsets.

    The principle is simple: the first pattern wins. So you need to place the pattern that matches the most (in terms of string length) on top for this to work. In your case, the longest one is the token for the foreground and background color, <color-fgbg>.

    In case no token can be found, NULL is returned. Here is the getNext() function:
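
    To see why the order matters, here is a small standalone sketch. The helper firstMatch() is hypothetical and only exists for this illustration. It uses the A (anchored) pattern modifier so that each pattern is tried at the very start of the input only, then reports which name wins for the same input under two different orderings:

    ```php
    <?php
    // Hypothetical helper: return the name of the first pattern that matches
    // at the very start of $input, or null if none does.
    function firstMatch(array $patterns, string $input): ?string
    {
        foreach ($patterns as $name => $pattern) {
            // The "A" modifier anchors the match at the start of the subject.
            if (preg_match("~$pattern~A", $input) === 1) {
                return $name;
            }
        }
        return null;
    }

    $input = "\x034,3red on green";

    // Longest pattern first, as in the tokenizer.
    $longestFirst = array(
        'color-fgbg'  => '\x03(\d{1,2}),(\d{1,2})',
        'color-fg'    => '\x03(\d{1,2})',
        'color-reset' => '\x03',
    );
    // Same patterns in the opposite order.
    $shortestFirst = array_reverse($longestFirst, true);

    echo firstMatch($longestFirst, $input), "\n";  // color-fgbg
    echo firstMatch($shortestFirst, $input), "\n"; // color-reset
    ```

    With the short pattern on top, the bare \x03 is consumed as a reset and the digits "4,3" would end up in the output as plain text, which is one way stray characters can survive a conversion.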

    ...
    /**
     * @return array|null
     */
    public function getNext()
    {
        if ($this->offset >= strlen($this->subject))
            return NULL;
    
        foreach($this->tokens as $name => $token)
        {
            if (FALSE === $r = preg_match("~$token~", $this->subject, $matches, PREG_OFFSET_CAPTURE, $this->offset))
                throw new RuntimeException(sprintf('Pattern for token %s failed (regex error).', $name));
            if ($r === 0)
                continue;
            // preg_match() searches from the offset onwards, so only accept
            // a match that starts exactly at the current offset.
            if ($matches[0][1] !== $this->offset)
                continue;
            $data = array();
            foreach($matches as $match)
            {
                list($data[]) = $match;
            }
    
            $this->offset += strlen($data[0]);
            return array($name, $data);
        }
        return NULL;
    }
    ...
    

    So the tokenization of the string is now encapsulated in the Tokenizer class, and the parsing of the tokens is something you can do on your own in some other part of your application. That should make it easier for you to change the way of styling (HTML output, CSS-based HTML output, or something different like BBCode or Markdown) and also to support new codes in the future. And in case something is missing, you can fix things more easily, because it is either a non-recognized code or something missing in the transformation.
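
    As an illustration of adding new codes, the sketch below extends the token table with the underline, italic and reverse codes mentioned in the question. The token names style-underline, style-italic and style-reverse are my own choice, and tokenize() is a hypothetical, compact function-style stand-in for the Tokenizer class so that the snippet runs on its own. It assumes PHP 7.0 or later and uses the A modifier to anchor each match at the current offset:

    ```php
    <?php
    // Hypothetical compact tokenizer: returns a list of array($name, $matches).
    function tokenize(string $subject, array $tokens): array
    {
        $result = array();
        $offset = 0;
        $length = strlen($subject);
        while ($offset < $length) {
            foreach ($tokens as $name => $pattern) {
                // "A" anchors the match at $offset; the first pattern wins.
                if (preg_match("~$pattern~A", $subject, $m, 0, $offset) === 1) {
                    $result[] = array($name, $m);
                    $offset += strlen($m[0]);
                    continue 2; // look for the next token
                }
            }
            throw new RuntimeException(sprintf('No token matches at offset %d.', $offset));
        }
        return $result;
    }

    $tokens = array(
        'color-fgbg'      => '\x03(\d{1,2}),(\d{1,2})',
        'color-fg'        => '\x03(\d{1,2})',
        'color-reset'     => '\x03',
        'style-bold'      => '\x02',
        'style-underline' => '\x1F', // new (name is an assumption)
        'style-italic'    => '\x1D', // new (name is an assumption)
        'style-reverse'   => '\x16', // new (name is an assumption)
        'catch-all'       => '.|\n',
    );

    // Collect just the token names for a short test string.
    $names = array_column(tokenize("\x1Fu\x1F \x1Di\x1D", $tokens), 0);
    print_r($names);
    ```

    Each new name then needs a matching case in the switch of the consuming loop, otherwise the default branch will throw for it.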

    The full example as gist: Tokenizer Example of Mirc Color and Style (bold) Codes.

    Related resources:

    • Very rudimentary, regex based tokenizer routine example
    • http://www.mirc.com/colors.html
    • http://en.wikipedia.org/wiki/Control_key