The Archive Base

Editorial Team
Asked: June 11, 2026 (2026-06-11T22:16:07+00:00)

I need to parse sentences. I have implemented an Earley parser and a grammar for it, and everything works fine when a sentence contains no misspellings. The problem is that many of the sentences I have to deal with are very noisy. Is there an algorithm that combines parsing with error correction? The possible errors are:

  • typos, such as ‘cheker’ instead of ‘checker’
  • missing spaces, such as ‘spellchecker’ instead of ‘spell checker’
  • contractions, such as ‘Ear par’ instead of ‘Earley parser’

If you know of an article that answers my question, I would appreciate a link to it.

1 Answer

  1. Editorial Team
    Added an answer on June 11, 2026 at 10:16 pm (2026-06-11T22:16:09+00:00)

    I assume you are using a tagger (or lexer) stage that is applied before the Earley parser, i.e. an algorithm that splits the input string into tokens and looks each token up in a dictionary to determine its part-of-speech (POS) tag(s):

    John     --> PN
    loves    --> V
    a        --> DT
    woman    --> NN
    named    --> JJ,VPP
    Mary     --> PN
    

    It should be possible to build approximate string lookup (also known as fuzzy string lookup) into that stage. When it is presented with a misspelled token, such as ‘lobes’ instead of ‘loves’, it will then return not only the tags found by exact string matching (‘lobes’ as the plural of the noun ‘lobe’), but also the tags of similarly spelled dictionary entries (‘loves’ as the third-person singular of the verb ‘love’).
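
As a rough illustration of such an approximate lookup, here is a minimal Python sketch. The lexicon contents, the `NNS` tag for ‘lobes’, the function names, and the distance threshold of 1 are assumptions made for this example; they do not come from any particular tagger.

```python
from typing import Dict, List, Set, Tuple

# Toy lexicon based on the example table, plus "lobes" (tag NNS is an assumption)
# so that the exact match and the fuzzy match can both be seen.
LEXICON: Dict[str, Set[str]] = {
    "john": {"PN"},
    "loves": {"V"},
    "lobes": {"NNS"},
    "a": {"DT"},
    "woman": {"NN"},
    "named": {"JJ", "VPP"},
    "mary": {"PN"},
}


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def fuzzy_tags(token: str, max_distance: int = 1) -> List[Tuple[str, str, int]]:
    """Return (lexicon word, tag, distance) for every entry within max_distance."""
    token = token.lower()
    hits = []
    for word, tags in LEXICON.items():
        d = levenshtein(token, word)
        if d <= max_distance:
            hits.extend((word, tag, d) for tag in sorted(tags))
    # Exact matches (distance 0) come first.
    return sorted(hits, key=lambda h: (h[2], h[0], h[1]))


print(fuzzy_tags("lobes"))
```

With this toy lexicon, ‘lobes’ yields both its own noun reading and the verb reading of ‘loves’. The linear scan over the lexicon is only suitable for small dictionaries.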

    This means you will generally get a larger number of candidate tags for each token, and therefore a larger number of possible parse results. Whether this produces the desired result depends on how comprehensive the grammar is, and on how good the parser is at identifying the correct analysis when presented with many possible parse trees. A probabilistic parser may be better suited to this, as it assigns every candidate parse tree a probability (or confidence score), which can be used to select the most likely analysis.
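
To make the growth in candidates and the idea of score-based selection concrete, here is a small Python sketch. It is not a parser: it only enumerates tag sequences and ranks them with invented tag-bigram weights, as a stand-in for the tree probabilities a probabilistic parser would supply. The `NNS` tag, the weights, and all names are assumptions.

```python
from itertools import product
from math import prod

# Candidate tags per token for "John lobes a woman named Mary".
# "lobes" keeps its exact-match tag (NNS, assumed) and gains V from the fuzzy match "loves".
CANDIDATES = [
    ["PN"],         # John
    ["NNS", "V"],   # lobes / loves
    ["DT"],         # a
    ["NN"],         # woman
    ["JJ", "VPP"],  # named
    ["PN"],         # Mary
]

# Invented weights for adjacent tag pairs; purely illustrative, not estimated from data.
BIGRAM = {
    ("PN", "V"): 0.6, ("PN", "NNS"): 0.1,
    ("V", "DT"): 0.5, ("NNS", "DT"): 0.1,
    ("DT", "NN"): 0.7,
    ("NN", "VPP"): 0.3, ("NN", "JJ"): 0.05,
    ("VPP", "PN"): 0.4, ("JJ", "PN"): 0.2,
}
UNSEEN = 0.01  # fallback weight for tag pairs missing from the table


def score(tags):
    """Product of the weights of all adjacent tag pairs."""
    return prod(BIGRAM.get(pair, UNSEEN) for pair in zip(tags, tags[1:]))


# Every extra candidate tag multiplies the number of analyses to consider.
analyses = list(product(*CANDIDATES))
best = max(analyses, key=score)

print(len(analyses), best)
```

Two ambiguous tokens already give four tag sequences. Under these made-up weights the verb reading of the misspelled token wins.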

    If this is the solution you’d like to try, there are several possible implementation strategies. Firstly, if tokenization and tagging are performed as a simple dictionary lookup (i.e. in the style of a lexer), you may simply use a data structure for the dictionary that supports approximate string matching. General methods for approximate string comparison are described in “Approximate string matching algorithms”, while methods for approximate string lookup in larger dictionaries are discussed in “Quickly compare a string against a Collection in Java”.
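
One well-known data structure for this kind of lookup is the BK-tree, which uses the triangle inequality of the edit distance to skip most of the dictionary during a search. The sketch below is one possible choice, not necessarily what the linked discussions recommend; the class name, method names, and word list are assumptions.

```python
from typing import Callable, Dict, List, Optional, Tuple


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


class BKTree:
    """Minimal BK-tree; each node is a (word, {distance: child}) pair."""

    def __init__(self, distance: Callable[[str, str], int]) -> None:
        self.distance = distance
        self.root: Optional[Tuple[str, Dict[int, tuple]]] = None

    def add(self, word: str) -> None:
        if self.root is None:
            self.root = (word, {})
            return
        node = self.root
        while True:
            d = self.distance(word, node[0])
            if d == 0:
                return  # word is already stored
            child = node[1].get(d)
            if child is None:
                node[1][d] = (word, {})
                return
            node = child

    def search(self, word: str, max_distance: int) -> List[Tuple[int, str]]:
        """Return sorted (distance, word) pairs within max_distance of word."""
        results = []
        stack = [self.root] if self.root is not None else []
        while stack:
            node_word, children = stack.pop()
            d = self.distance(word, node_word)
            if d <= max_distance:
                results.append((d, node_word))
            # Triangle inequality: only subtrees whose edge label lies in
            # [d - max_distance, d + max_distance] can contain matches.
            for edge, child in children.items():
                if d - max_distance <= edge <= d + max_distance:
                    stack.append(child)
        return sorted(results)


tree = BKTree(levenshtein)
for w in ["checker", "spell", "parser", "earley", "cheque", "checkers"]:
    tree.add(w)

print(tree.search("cheker", 1))
```

In a lexer, each stored word would additionally map to its POS tags, so a search result can be turned directly into candidate tags.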

    If, however, you use an actual tagger, as opposed to a lexer, i.e. something that performs POS disambiguation in addition to mere dictionary lookup, you will have to build the approximate dictionary lookup into that tagger. There must be a dictionary lookup function, which is used to generate candidate tags before disambiguation is applied, somewhere in the tagger. That dictionary lookup will have to be replaced with one that enables approximate string lookup.
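
The following sketch shows the idea of swapping the lookup function inside a tagger. The `Tagger` class is hypothetical and leaves out disambiguation entirely; the approximate lookup uses `difflib.get_close_matches` from the Python standard library, and the cutoff of 0.75 is an arbitrary assumption.

```python
import difflib
from typing import Callable, Dict, List, Set

# Toy lexicon taken from the example table.
LEXICON: Dict[str, Set[str]] = {
    "john": {"PN"},
    "loves": {"V"},
    "a": {"DT"},
    "woman": {"NN"},
    "named": {"JJ", "VPP"},
    "mary": {"PN"},
}


def exact_lookup(token: str) -> Set[str]:
    """Plain dictionary lookup: unknown tokens get no tags."""
    return set(LEXICON.get(token.lower(), set()))


def approximate_lookup(token: str, cutoff: float = 0.75) -> Set[str]:
    """Exact tags plus the tags of similarly spelled entries."""
    token = token.lower()
    tags = set(LEXICON.get(token, set()))
    for word in difflib.get_close_matches(token, list(LEXICON), n=3, cutoff=cutoff):
        tags |= LEXICON[word]
    return tags


class Tagger:
    """Hypothetical tagger skeleton; only the candidate-generation step is shown."""

    def __init__(self, lookup: Callable[[str], Set[str]]) -> None:
        self.lookup = lookup  # the replaceable dictionary lookup

    def candidate_tags(self, tokens: List[str]) -> List[Set[str]]:
        # A real tagger would run POS disambiguation on these candidates afterwards.
        return [self.lookup(t) for t in tokens]


tokens = ["John", "lovs", "Mary"]
print(Tagger(exact_lookup).candidate_tags(tokens))
print(Tagger(approximate_lookup).candidate_tags(tokens))
```

With the exact lookup the misspelled ‘lovs’ receives no candidate tags; with the approximate lookup it receives the tag of ‘loves’, and the rest of the tagger is unchanged.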


© 2021 The Archive Base. All Rights Reserved