The Archive Base


Editorial Team
Asked: May 24, 2026 (2026-05-24T12:19:11+00:00)

I have a string that has several date values in it, and I want


I have a string that has several date values in it, and I want to parse them all out. The string is natural language, so the best thing I’ve found so far is dateutil.

Unfortunately, if a string has multiple date values in it, dateutil throws an error:

>>> from dateutil.parser import parse
>>> s = "I like peas on 2011-04-23, and I also like them on easter and my birthday, the 29th of July, 1928"
>>> parse(s, fuzzy=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/pymodules/python2.7/dateutil/parser.py", line 697, in parse
    return DEFAULTPARSER.parse(timestr, **kwargs)
  File "/usr/lib/pymodules/python2.7/dateutil/parser.py", line 303, in parse
    raise ValueError, "unknown string format"
ValueError: unknown string format

Any thoughts on how to parse all dates from a long string? Ideally, a list would be created, but I can handle that myself if I need to.

I’m using Python, but at this point, other languages are probably OK, if they get the job done.

PS – I guess I could recursively split the input file in the middle and try, try again until it works, but it’s a hell of a hack.


1 Answer

  1. Editorial Team
    Added an answer on May 24, 2026 at 12:19 pm

    Looking at it, the least hacky way would be to modify the dateutil parser to add a fuzzy-multiple option.

    parser._parse takes your string, tokenizes it with _timelex and then compares the tokens with data defined in parserinfo.

    Here, if a token doesn’t match anything in parserinfo, the parse will fail unless fuzzy is True.

    What I suggest is that you allow non-matches while you don’t have any processed time tokens. Then, when you hit a non-match after collecting some time tokens, process the parsed data at that point and start looking for time tokens again.

    Shouldn’t take too much effort.
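
The grouping idea described above can be sketched without touching dateutil at all. This is only an illustration under stated assumptions: `MONTHS`, `FILLERS`, `is_date_token` and `group_date_runs` are hypothetical names invented here, and the token test is a crude stand-in for the lookups that parserinfo provides.

```python
import re

# Hypothetical stand-ins for dateutil's parserinfo tables (illustration only).
MONTHS = {"january", "february", "march", "april", "may", "june", "july",
          "august", "september", "october", "november", "december"}
FILLERS = {"of", "the"}  # words allowed inside a run without ending it


def is_date_token(token):
    """Crude test: a number (optionally with an ordinal suffix) or a month name."""
    return bool(re.match(r"^\d+(st|nd|rd|th)?$", token)) or token.lower() in MONTHS


def group_date_runs(tokens):
    """Yield each run of consecutive date-like tokens as a list."""
    run = []
    for token in tokens:
        if is_date_token(token):
            run.append(token)
        elif token.lower() in FILLERS and run:
            continue  # filler inside a run: keep the run open
        elif run:
            # First non-match after a run: hand the run over and start again.
            yield run
            run = []
    if run:
        yield run


text = ("I like peas on 2011-04-23, and I also like them on easter "
        "and my birthday, the 29th of July, 1928")
runs = list(group_date_runs(re.findall(r"\w+", text)))
```

Each list in `runs` is a candidate that could then be handed to a real date parser. Non-matches are ignored while no run is open, and the first non-match after a run closes it, which is the behaviour a fuzzy-multiple option would need.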


    Update

    While you’re waiting for your patch to get rolled in…

    This is a little hacky, uses non-public functions in the library, but doesn’t require modifying the library and is not trial-and-error. You might have false positives if you have any lone tokens that can be turned into floats. You might need to filter the results some more.

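    from dateutil.parser import _timelex, parser

    a = "I like peas on 2011-04-23, and I also like them on easter and my birthday, the 29th of July, 1928"

    p = parser()
    info = p.info

    def timetoken(token):
        # Any token that converts to a float is treated as a possible date part.
        try:
            float(token)
            return True
        except ValueError:
            pass
        # Otherwise, ask the parserinfo lookups whether they recognise the token.
        # Note: a lookup that returns index 0 (for example info.weekday for
        # Monday) is falsy and is therefore not counted here.
        return any(f(token) for f in (info.jump, info.weekday, info.month, info.hms,
                                      info.ampm, info.pertain, info.utczone, info.tzoffset))

    def timesplit(input_string):
        # Collect consecutive time tokens and yield each run as one string.
        batch = []
        for token in _timelex(input_string):
            if timetoken(token):
                # Jump tokens (separators and filler words) keep a run open
                # but are not stored.
                if info.jump(token):
                    continue
                batch.append(token)
            else:
                if batch:
                    yield " ".join(batch)
                    batch = []
        if batch:
            yield " ".join(batch)

    for item in timesplit(a):
        print("Found: %s" % item)
        print("Parsed: %s" % p.parse(item))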
    

    Yields:

    Found: 2011 04 23
    Parsed: 2011-04-23 00:00:00
    Found: 29 July 1928
    Parsed: 1928-07-29 00:00:00
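
The answer warns that lone tokens which convert to floats can show up as false positives. One possible filter is sketched below, assuming the candidates are space-joined token strings like the "Found:" values above. The helper names `is_lone_number` and `drop_lone_numbers` are hypothetical, and the sample list is made up for illustration.

```python
def is_lone_number(candidate):
    """Return True if the candidate is a single token that parses as a float."""
    parts = candidate.split()
    if len(parts) != 1:
        return False
    try:
        float(parts[0])
        return True
    except ValueError:
        return False


def drop_lone_numbers(candidates):
    """Keep only candidates that are more than one bare number."""
    return [c for c in candidates if not is_lone_number(c)]


# Made-up candidates: two plausible dates and two stray numbers.
candidates = ["2011 04 23", "42", "29 July 1928", "3.5"]
filtered = drop_lone_numbers(candidates)
```

This trades recall for precision: a bare year such as "1928" is dropped as well, so whether the filter is appropriate depends on the input text.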

    Update for Dieter

    Dateutil 2.1 appears to be written for compatibility with Python 3 and uses a compatibility library called six. Something isn’t right with it, and it’s not treating str objects as text.

    This solution works with dateutil 2.1 if you pass strings as unicode or as file-like objects:

    from cStringIO import StringIO
    for item in timesplit(StringIO(a)):
        print("Found: %s" % item)
        print("Parsed: %s" % p.parse(StringIO(item)))
    

    If you want to set options on the parserinfo, instantiate a parserinfo and pass it to the parser object. For example:

    from dateutil.parser import _timelex, parser, parserinfo
    info = parserinfo(dayfirst=True)
    p = parser(info)
    

© 2021 The Archive Base. All Rights Reserved