Given an arbitrary string, for example "I'm going to play croquet next Friday" or "Gadzooks, is it 17th June already?", how would you go about extracting the dates from it?
If this is looking like a good candidate for the too-hard basket, perhaps you could suggest an alternative. I want to be able to parse Twitter messages for dates. The tweets I’d be looking at would be ones which users are directing at this service, so they could be coached into using an easier format; however, I’d like it to be as transparent as possible. Is there a good middle ground you could think of?
If you have the horsepower, you could try the following algorithm. I’m showing an example, and leaving the tedious work up to you 🙂
And we can assume that strtotime("17th June") is more accurate than strtotime("17th") simply because it contains more words… i.e. “next Friday” will always be more accurate than “Friday”.
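A minimal sketch of one way to read this approach, assuming the idea is to feed every run of consecutive words to strtotime() and keep the longest run that parses. The helper name longestDatePhrase() and its $parser and $now parameters are made up for illustration; only strtotime(), preg_split(), array_slice(), implode() and trim() are real PHP functions.

```php
<?php
// Hypothetical helper: returns the longest run of consecutive words that the
// parser accepts, or null if no run parses. Not part of PHP itself.
function longestDatePhrase(string $text, ?callable $parser = null, ?int $now = null): ?array
{
    // Assumption: by default, phrases are resolved with strtotime() relative
    // to $now. Any callable returning a timestamp or false can be swapped in.
    $now = $now ?? time();
    $parser = $parser ?? (fn(string $s) => strtotime($s, $now));

    $words = preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    $count = count($words);

    // Longest runs first, so the first hit is the one containing the most words.
    for ($len = $count; $len >= 1; $len--) {
        for ($start = 0; $start + $len <= $count; $start++) {
            $phrase = implode(' ', array_slice($words, $start, $len));
            // Drop punctuation hanging off either end, e.g. "already?".
            $phrase = trim($phrase, " \t\n\r.,;:!?\"'()");
            if ($phrase === '') {
                continue;
            }
            $timestamp = $parser($phrase);
            if ($timestamp !== false) {
                return ['phrase' => $phrase, 'timestamp' => $timestamp];
            }
        }
    }

    return null; // no run of words parsed as a date
}
```

Calling longestDatePhrase("I'm going to play croquet next Friday") would then return the matched phrase and its timestamp, if any run parses. This is where the horsepower goes: a message of n words costs on the order of n² parser calls, which is tolerable for tweet-sized input. strtotime() may also accept short strings you would not think of as dates, so it is worth checking the matches against real messages before trusting the "more words wins" rule.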