The Archive Base

Editorial Team
Asked: June 14, 2026

I took the example below partially from SO and changed it to my needs. It almost fits, but I want the first string in the commaSep expr to always be parsed as an Identifier, while all subsequent strings are parsed as plain strings.

Currently they are all parsed as Identifiers.

*Parser> parse expr "" "rd (isFib, test2, 100.1, ?BOOL)"
Right (FuncCall "rd" [Identifier "isFib",Identifier "test2",Number 100.1,Query "?BOOL"])

I have tried a number of solutions, but in the end they all boil down to parsing the whole input without using commaSep. That means I would have to ignore the structure and do something like

expr_parse = do
    name <- resvd_cmd
    char '('
    skipMany space
    worker <- ident
    char ','
    skipMany1 space
    args <- commaSep expr --not fully worked this out yet
    query <- theQuery
    skipMany space
    char ')'
    return (name, worker, args, query)

That looks suboptimal and very clunky to me. Is there any way to refactor expr in the code below to achieve what I need and keep it simple?

module Parser where

import Control.Monad (liftM)
import Text.Parsec
import Text.Parsec.String (Parser)
import Lexer
import AST

expr = ident <|>  astring <|> number <|> theQuery <|> callOrIdent

astring = liftM String stringLiteral <?> "String"

number = liftM Number float <?> "Number"

ident = liftM Identifier identifier <?> "WorkerName"

questionm :: Parser Char
questionm = oneOf "?"

theQuery :: Parser AST
theQuery = do first <- questionm
              rest <- many1 letter
              let query = first:rest
              return ( Query query )

resvd_cmd = do { reserved "rd"; return ("rd") }
            <|> do { reserved "eval"; return ("eval") }
            <|> do { reserved "read"; return ("read") }
            <|> do { reserved "in"; return ("in") }
            <|> do { reserved "out"; return ("out") }
            <?> "LINDA-like Tuple"

callOrIdent = do
    name <- resvd_cmd
    liftM (FuncCall name)(parens $ commaSep expr) <|> return (Identifier name)

AST.hs

{-# LANGUAGE DeriveDataTypeable #-}

module AST where

import Data.Typeable

data AST
    = Number Double
    | Identifier String
    | String String
    | FuncCall String [AST]
    | Query String
    deriving (Show, Eq, Typeable)

Lexer.hs

module Lexer (
            identifier, reserved, operator, reservedOp, charLiteral, stringLiteral,
            natural, integer, float, naturalOrFloat, decimal, hexadecimal, octal,
            symbol, lexeme, whiteSpace, parens, braces, angles, brackets, semi,
            comma, colon, dot, semiSep, semiSep1, commaSep, commaSep1
    )where

import Text.Parsec
import qualified Text.Parsec.Token as P
import Text.Parsec.Language (haskellStyle)

lexer = P.makeTokenParser ( haskellStyle
                            {P.reservedNames = ["rd", "in", "out", "eval", "take"]}
                         )


identifier = P.identifier lexer
reserved = P.reserved lexer
operator = P.operator lexer
reservedOp = P.reservedOp lexer
charLiteral = P.charLiteral lexer
stringLiteral = P.stringLiteral lexer
natural = P.natural lexer
integer = P.integer lexer
float = P.float lexer
naturalOrFloat = P.naturalOrFloat lexer
decimal = P.decimal lexer
hexadecimal = P.hexadecimal lexer
octal = P.octal lexer
symbol = P.symbol lexer
lexeme = P.lexeme lexer
whiteSpace = P.whiteSpace lexer
parens = P.parens lexer
braces = P.braces lexer
angles = P.angles lexer
brackets = P.brackets lexer
semi = P.semi lexer
comma = P.comma lexer
colon = P.colon lexer
dot = P.dot lexer
semiSep = P.semiSep lexer
semiSep1 = P.semiSep1 lexer
commaSep = P.commaSep lexer
commaSep1 = P.commaSep1 lexer
1 Answer

    Editorial Team
    Added an answer on June 14, 2026 at 10:40 pm

    First, I’d like to introduce you to the function lexeme, which wraps a parser so that it also eats trailing whitespace. You’re encouraged to use it rather than skipping whitespace explicitly. The difficulty with commaSep is that after the last plain string it consumes the following , and then fails on the query, and because input has already been consumed Parsec does not backtrack. It would be nice to write a less optimistic commaSep, but let’s solve your problem directly.
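    For reference, a less optimistic separator combinator can be sketched by wrapping "separator, then item" in try, so that a separator not followed by another item is handed back. The name sepBy1Try and the stand-in parsers word and commaTok are made up for this illustration; they are not part of Parsec or of the code in the question.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical helper (not a Parsec function): like sepBy1, but the
-- separator and the item after it are parsed under try, so a trailing
-- separator with no item behind it is left unconsumed.
sepBy1Try :: Parser a -> Parser sep -> Parser [a]
sepBy1Try p sep = (:) <$> p <*> many (try (sep *> p))

-- Stand-in item and separator parsers, for illustration only.
word :: Parser String
word = many1 letter <* spaces

commaTok :: Parser Char
commaTok = char ',' <* spaces

main :: IO ()
main = do
  -- Plain sepBy1 eats the last comma, then fails on '?'.
  parseTest (sepBy1 word commaTok <* commaTok <* char '?') "a, b, ?"
  -- The backtracking variant stops before that comma, so the rest
  -- of the parser can consume it.
  parseTest (sepBy1Try word commaTok <* commaTok <* char '?') "a, b, ?"
```

    The first call reports a parse error at the ?, while the second returns the two words. The cost of try here is some re-parsing and less precise error positions, which is one reason to prefer the direct approach that follows.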

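    Let’s apply lexeme to comma (the comma from Text.Parsec.Token is already a lexeme parser, so this is harmless but not strictly necessary):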

    acomma = lexeme comma
    

    One of the problems with your code was that you expected test2 to be parsed as String "test2", but the astring parser expects its strings to begin and end with ". Let’s make a parser for bald (unquoted) strings, making sure they don’t start with ? and don’t contain spaces, commas or closing parentheses:

    baldString = lexeme $ do
       x <- noneOf "? ,)"
       xs <- many (noneOf " ,)")   -- caution: also accepts tabs, newlines and quotes
       return . String $ x:xs
    

    The breakthrough came when I realised that, because there has to be a query at the end, there is always a comma after a baldString:

    baldStringComma = do 
            s <- baldString
            acomma
            return s
    

    Now let’s make a parser for one or more queries at the end of the tuple:

    queries = commaSep1 (lexeme theQuery)
    

    And now we can take the identifier, the baldStrings and the queries:

    therest = do
       name <- lexeme ident 
       acomma
       args <- many baldStringComma
       qs <- queries
       return (name,args,qs)
    

    finally giving

    tuple = do
        name <- lexeme resvd_cmd
        stuff <- parens therest
        return (name,stuff)
    

    So you get

    *Parser> parseTest tuple "rd (isFib, test2, 100.1, ?BOOL)"
    ("rd",(Identifier "isFib",[String "test2",String "100.1"],[Query "?BOOL"]))
    

    But if you want to lump the strings with the queries, you can return (name,args++qs) at the end of therest.
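    As a sketch of that args++qs variant, the pieces can be folded back into the AST from the question as a single FuncCall. To keep the file self-contained, the token parser is built inline rather than imported from Lexer, and the names funcCall and resvdCmd are illustrative assumptions rather than part of the original code. Note that 100.1 still comes out as a String, not a Number, because baldString accepts it first.

```haskell
module Main where

import Text.Parsec
import Text.Parsec.String (Parser)
import qualified Text.Parsec.Token as P
import Text.Parsec.Language (haskellStyle)

-- Same shape as the AST in the question (Typeable omitted).
data AST
  = Number Double
  | Identifier String
  | String String
  | FuncCall String [AST]
  | Query String
  deriving (Show, Eq)

-- Inline stand-in for the Lexer module.
lexer :: P.TokenParser ()
lexer = P.makeTokenParser haskellStyle
  { P.reservedNames = ["rd", "in", "out", "eval", "take"] }

lexeme :: Parser a -> Parser a
lexeme = P.lexeme lexer

-- Try each command keyword in turn; P.reserved backtracks on failure.
resvdCmd :: Parser String
resvdCmd = choice
  [ name <$ P.reserved lexer name
  | name <- ["rd", "eval", "read", "in", "out"] ]

ident :: Parser AST
ident = Identifier <$> P.identifier lexer

theQuery :: Parser AST
theQuery = lexeme $ Query <$> ((:) <$> char '?' <*> many1 letter)

-- Unquoted string: must not start with '?', stops at space, ',' or ')'.
baldString :: Parser AST
baldString = lexeme $ do
  x  <- noneOf "? ,)"
  xs <- many (noneOf " ,)")
  return (String (x : xs))

-- Worker name first, then bald strings (each followed by a comma),
-- then one or more queries, all lumped into one FuncCall.
funcCall :: Parser AST
funcCall = do
  name <- resvdCmd
  P.parens lexer $ do
    w    <- ident <* P.comma lexer
    args <- many (baldString <* P.comma lexer)
    qs   <- P.commaSep1 lexer theQuery
    return (FuncCall name (w : args ++ qs))

main :: IO ()
main = parseTest funcCall "rd (isFib, test2, 100.1, ?BOOL)"
```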

    Applicative is Less Ugly

    I found it frustrating to be tied to the Monad interface, when there are lovely things like <$>, <*> etc, so first

    import Control.Applicative hiding (many, (<|>))
    

    Then

    baldString = lexeme . fmap String $
       (:) <$> noneOf "? ,)"   
           <*> many (noneOf " ,)")   -- caution: also accepts tabs, newlines and quotes
    

    Here <$> is an infix version of fmap, so (:) is applied to the output of noneOf "? ,)", giving a parser that returns a function such as ('c':). That function is then applied to the output of many (noneOf " ,)") using <*>, giving the string we want.

    baldStringComma = baldString <* acomma
    

    This one’s nice because <* runs both parsers but discards the output of acomma and returns only the output of baldString. If we wanted it the other way round, we could use *>, but you may as well use >> for that, which likewise discards the output of the first parser.
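    A small stand-alone sketch of these operators may help. The parsers keepLeft, keepRight and both are made-up names for illustration, and the import of Control.Applicative is not needed here because recent versions of base export these operators from the Prelude.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- <* runs both parsers and keeps the result of the left one.
keepLeft :: Parser String
keepLeft = many1 letter <* char ','

-- *> runs both parsers and keeps the result of the right one.
keepRight :: Parser String
keepRight = char '?' *> many1 letter

-- <$> and <*> feed both results to an ordinary function, here (,).
both :: Parser (String, String)
both = (,) <$> keepLeft <*> keepRight

main :: IO ()
main = do
  parseTest keepLeft  "abc,"
  parseTest keepRight "?BOOL"
  parseTest both      "abc,?BOOL"
```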

    therest = (,,) <$> 
       lexeme ident <* acomma
       <*> many baldStringComma
       <*> queries
    

    and

    tuple = (,) <$> lexeme resvd_cmd 
                <*> parens therest
    

    But wouldn’t it be nicer if we did

    data Tuple = Tuple {cmd :: String, 
                        id :: AST,
                        argumentList :: [AST],
                        queryList :: [AST]} deriving Show
    

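    (One caveat: a field named id clashes with Prelude’s id, so any unqualified use of id elsewhere in the module becomes ambiguous; renaming the field avoids this.) With that type we could do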

    niceTuple = Tuple <$> lexeme resvd_cmd <* lexeme (char '(')
                      <*> lexeme ident <* acomma
                      <*> many baldStringComma
                      <*> queries <* lexeme (char ')')
    

    which gives (with a little manual pretty-printing to get it into the width)

    *Parser> parseTest niceTuple "rd (isFib, test2, 100.1, ?BOOL)"
    Tuple {cmd = "rd", 
           id = Identifier "isFib", 
           argumentList = [String "test2",String "100.1"], 
           queryList = [Query "?BOOL"]}
    

    I also think your current AST is more of an abstract syntax store than an abstract syntax tree, and that you might get more mileage from designing your own Tuple type and using that. Use

    newtype Command = Cmd String  deriving Show
    

    and suchlike to ensure type safety, then roll them together into your Tuple type with a parser to generate them.
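    One way this could look, as a rough sketch only: each part of the tuple gets its own newtype, and the parser builds the record directly. Everything apart from Command and Cmd is an assumption made for this example (the Worker, Arg and QueryVar types, the field names, and the small tok and sym helpers used in place of the Lexer module). The query is stored without its leading ?.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

newtype Command  = Cmd String      deriving (Show, Eq)
newtype Worker   = Worker String   deriving (Show, Eq)
newtype Arg      = Arg String      deriving (Show, Eq)
newtype QueryVar = QueryVar String deriving (Show, Eq)

data Tuple = Tuple
  { command   :: Command
  , worker    :: Worker
  , arguments :: [Arg]
  , queryVars :: [QueryVar]
  } deriving (Show, Eq)

-- Minimal lexeme helpers standing in for the Lexer module.
tok :: Parser a -> Parser a
tok p = p <* spaces

sym :: Char -> Parser Char
sym = tok . char

-- A command keyword that is not immediately followed by more letters or digits.
commandP :: Parser Command
commandP = tok $ Cmd <$> choice (map (try . keyword) ["rd", "eval", "read", "in", "out"])
  where keyword w = string w <* notFollowedBy alphaNum

workerP :: Parser Worker
workerP = tok $ Worker <$> ((:) <$> letter <*> many (alphaNum <|> oneOf "_'"))

-- Unquoted argument: must not start with '?', stops at space, ',' or ')'.
argP :: Parser Arg
argP = tok $ Arg <$> ((:) <$> noneOf "? ,)" <*> many (noneOf " ,)"))

-- The leading '?' is checked but dropped from the stored name.
queryP :: Parser QueryVar
queryP = tok $ QueryVar <$> (char '?' *> many1 letter)

tupleP :: Parser Tuple
tupleP = Tuple <$> commandP <* sym '('
               <*> workerP <* sym ','
               <*> many (argP <* sym ',')
               <*> (queryP `sepBy1` sym ',') <* sym ')'

main :: IO ()
main = parseTest tupleP "rd (isFib, test2, 100.1, ?BOOL)"
```

    With distinct types, passing a worker name where an argument is expected becomes a compile-time error instead of a runtime surprise.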
