Im parsing a SCPI string, which looks something like: HEADER:HEADER:HEADER:CMD NUMBER MULTIPLIER UNIT; The

Question


Editorial Team

Asked: June 8, 2026 (2026-06-08T08:07:01+00:00)


I'm parsing a SCPI string, which looks something like:

HEADER:HEADER:HEADER:CMD NUMBER MULTIPLIER UNIT;

The spaces between the tokens NUMBER, MULTIPLIER and UNIT are not necessarily there, nor are the tokens of a fixed length. I have been able to parse (from left to right) as far as the end of NUMBER. However, the MULTIPLIER and UNIT tokens are each optional and can contain the same characters.

e.g. suffix could be ‘P’ (where P could mean pico [mult] or poise [unit])

or ‘MA’ (could be mega [mult] or milli-Amp [mult-unit])

Does anyone have experience parsing this kind of syntax, or any ideas on how to split the suffix into its correct tokens?

EDIT: For the pedant, I guess this is more lexical analysis than parsing.


1 Answer

Editorial Team · Answer 1 · 2026-06-08T08:07:02+00:00

For an example this simple, a couple of nested ifs is probably easier than a more powerful method. If you don’t want to write that by hand, or if the actual problem is a bit bigger, you can try matching your input against a regular expression (standard lexer stuff).

On a POSIX system, you can use regcomp and regexec from <regex.h>.
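As a rough illustration of the regular expression route, here is a minimal sketch using POSIX regcomp/regexec. The names split_suffix and struct suffix, and the short multiplier and unit lists inside the pattern, are made up for this example and are not the real SCPI tables. It also assumes UNIT is mandatory, so that a suffix like MA has only one possible split, and that text points just past NUMBER.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical result type: the MULT and UNIT text found in a suffix. */
struct suffix {
    char mult[4];  /* "" when no multiplier is present */
    char unit[8];
};

/* Illustrative, incomplete token lists (assumptions, not the SCPI tables).
 * Group 1 is the optional MULT, group 2 the mandatory UNIT. */
static const char *SUFFIX_RE =
    "^[ \t]*(MA|M|K|U|N|P)?[ \t]*(A|V|HZ|OHM|P)[ \t]*;?[ \t]*$";

/* Split the text that follows NUMBER. Returns 0 on success, -1 otherwise. */
int split_suffix(const char *text, struct suffix *out)
{
    regex_t re;
    regmatch_t m[3];  /* m[0] whole match, m[1] MULT, m[2] UNIT */
    int rc;

    assert(text != NULL && out != NULL);
    if (regcomp(&re, SUFFIX_RE, REG_EXTENDED) != 0)
        return -1;
    rc = regexec(&re, text, 3, m, 0);
    regfree(&re);
    if (rc != 0)
        return -1;

    /* An optional group that did not take part in the match has rm_so == -1. */
    out->mult[0] = '\0';
    if (m[1].rm_so != -1 && m[1].rm_eo > m[1].rm_so)
        snprintf(out->mult, sizeof out->mult, "%.*s",
                 (int)(m[1].rm_eo - m[1].rm_so), text + m[1].rm_so);
    snprintf(out->unit, sizeof out->unit, "%.*s",
             (int)(m[2].rm_eo - m[2].rm_so), text + m[2].rm_so);
    return 0;
}
```

With these lists, " MA;" can only split as M followed by A, because a multiplier MA would leave no unit behind it. If UNIT were optional as well, the pattern would become ambiguous and the result would depend on how the regex implementation assigns subexpressions.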

Edit: How to do it with if (and switch):

I assume your input is in text and you have already read up to the end of NUMBER, so your index i points just past it.

// helper function: scan backwards from *end and return the previous
// non-whitespace character, or '\0' once the scan passes index i
char get_prev(const char *text, int *end, int i)
{
    for (; *end >= i; --*end)
        // or `if (text[*end] > ' ')` if ASCII
        if (text[*end] != ' ' && text[*end] != '\t'
            && text[*end] != '\n' && text[*end] != '\r')
            return text[(*end)--];
    return '\0';
}

... your function...
    // read up to i (assumed: i is the index of the first character after
    // NUMBER, and any terminating ';' has already been stripped from text)
    int end = (int)strlen(text) - 1;  // last character, not the '\0'
    int power_of_10 = 0;  // for MULT
    enum unit unit = UNKNOWN; // for UNIT
    switch (get_prev(text, &end, i))
    {
        case 'P':
            unit = POISE;
            break;
        case 'A':
            unit = AMP;
            break;
        ...
        default: // unforeseen character
        case '\0':
            // neither UNIT nor MULT exist
            break;
    }
    if (unit != UNKNOWN)
        switch (get_prev(text, &end, i))
        {
            case 'M':
                power_of_10 = -3;  // milli
                break;
            case 'A':
                switch (get_prev(text, &end, i))
                {
                    case 'M':
                        power_of_10 = 6;  // mega
                        break;
                    ...
                }
                break;
            ...
            default: // unforeseen character
            case '\0':
                // MULT doesn't exist
                break;
        }

Note that in this case I assumed UNIT is mandatory. I’m not sure how you can distinguish between mega and milliamp in 10MA if both MULT and UNIT are optional. However, you can add more cases to the first switch that correspond to values of MULT, and set power_of_10 there too. For example, if you see k in the first switch, you know that UNIT doesn’t exist and power_of_10 is 3.
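One way to handle a suffix that is only a multiplier is a pair of lookup tables instead of nested switches. This is only a sketch: the function name split_suffix_tables, the tiny tables, and above all the tie-break rule (a trailing UNIT wins over a bare MULT, so MA reads as milliamp) are assumptions for illustration, not something SCPI dictates here. It also expects whitespace and the trailing ';' to have been removed already.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum unit { UNKNOWN, AMP, POISE };

struct unit_entry { const char *text; enum unit unit; };
struct mult_entry { const char *text; int power_of_10; };

/* Illustrative, incomplete tables (assumptions, not the SCPI lists). */
static const struct unit_entry UNITS[] = { {"A", AMP}, {"P", POISE} };
static const struct mult_entry MULTS[] = {
    {"MA", 6}, {"M", -3}, {"K", 3}, {"U", -6}, {"N", -9}, {"P", -12}
};

/* Look up the first len characters of s as a complete multiplier. */
static int find_mult(const char *s, size_t len, int *power_of_10)
{
    for (size_t k = 0; k < sizeof MULTS / sizeof MULTS[0]; k++)
        if (strlen(MULTS[k].text) == len
            && strncmp(s, MULTS[k].text, len) == 0) {
            *power_of_10 = MULTS[k].power_of_10;
            return 1;
        }
    return 0;
}

/* Split a whitespace-free suffix such as "MA" or "K".
 * Assumed policy: a trailing UNIT wins over a bare MULT.
 * Returns 0 on success, -1 if the suffix cannot be split. */
int split_suffix_tables(const char *s, int *power_of_10, enum unit *unit)
{
    size_t len;

    assert(s != NULL && power_of_10 != NULL && unit != NULL);
    len = strlen(s);
    *power_of_10 = 0;
    *unit = UNKNOWN;
    if (len == 0)
        return 0;  /* neither MULT nor UNIT */

    /* First try: a known unit at the end, preceded by nothing or a MULT. */
    for (size_t k = 0; k < sizeof UNITS / sizeof UNITS[0]; k++) {
        size_t ulen = strlen(UNITS[k].text);
        if (ulen > len || strcmp(s + len - ulen, UNITS[k].text) != 0)
            continue;
        if (ulen == len || find_mult(s, len - ulen, power_of_10)) {
            *unit = UNITS[k].unit;
            return 0;
        }
    }
    /* Second try: no unit at all, the whole suffix is a bare MULT. */
    return find_mult(s, len, power_of_10) ? 0 : -1;
}
```

Under this policy "K" yields power_of_10 = 3 with no unit, "MA" yields milli plus AMP, and "MAP" yields mega plus POISE. Swapping the order of the two tries would make "MA" mean mega instead, which is exactly the ambiguity described above.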
