I have a PHP application that is supposed to parse uploaded text files that have a format similar to this:
| | | |
| -----------------------------------------------------|
| Sample | Data | |
| -----------------------------------------------------|
| Sample | Data | |
| -----------------------------------------------------|
| Sample | Data | |
| -----------------------------------------------------|
| Accepts | |
| --------------------------------------------------------|
| All | Yes |
| --------------------------------------------------------|
| More | Yes |
| --------------------------------------------------------|
| | | Years | | |
| ---------------------------------------------------------------|
| 1998 | 1999 | 2000 | 2001 | 2002 |
| ---------------------------------------------------------------|
| 2003 | 2004 | 2005 | 2006 | 2007 |
| ---------------------------------------------------------------|
| 2008 | 2009 | 2010 | 2011 | 2012 |
| ---------------------------------------------------------------|
What I need to do is basically isolate each “block” by itself in the same order, so I can loop over them one by one. A “solution” could be doing
preg_split("/\n{4,}/", $text);
However, that would produce unwanted results if the person submitting the text decides that the unnecessary newlines don’t belong and removes them. I tried playing around with preg_match_all(), but it has been years since I did any real regex, so I couldn’t come up with a usable solution.
The first line of a “block” always contains | and spaces, but fields may contain text. The last line of a “block” is always a pipe followed by a space, dashes to fill the row, ending with a |.
If this is what the content of the text file looks like, I would write something like:
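A minimal sketch of one possible approach, reading the text line by line instead of relying on a fixed number of newlines. It rests on two assumptions that are not guaranteed by the question: a blank line always ends a block, and when the blank lines have been removed, a change in the number of dashes in the separator line signals a new block (as in the sample, where each block has a different width). The function name splitBlocks is made up for illustration.

```php
<?php
// Hypothetical helper: returns a list of blocks, each block being a list of lines.
function splitBlocks(string $text): array
{
    $blocks  = [];   // finished blocks
    $current = [];   // lines of the block being built
    $pending = [];   // rows seen since the last separator line
    $width   = null; // dash count of the current block's separator lines

    foreach (preg_split('/\R/', $text) as $line) {
        $line = rtrim($line);

        if (trim($line) === '') {
            // Assumption: a blank line always closes the current block.
            $current = array_merge($current, $pending);
            if ($current !== []) {
                $blocks[] = $current;
            }
            $current = [];
            $pending = [];
            $width   = null;
            continue;
        }

        // Separator line: pipe, optional spaces, only dashes, closing pipe.
        if (preg_match('/^\|\s*(-+)\s*\|$/', $line, $m)) {
            $len = strlen($m[1]);
            if ($width !== null && $len !== $width) {
                // Assumption: a different dash count means the pending rows
                // belong to a new block, so close the previous one first.
                $blocks[] = $current;
                $current  = [];
            }
            $width   = $len;
            $current = array_merge($current, $pending, [$line]);
            $pending = [];
            continue;
        }

        // Any other line is a row; hold it until its separator arrives.
        $pending[] = $line;
    }

    // Flush whatever is left at the end of the input.
    $current = array_merge($current, $pending);
    if ($current !== []) {
        $blocks[] = $current;
    }

    return $blocks;
}
```

The result can then be walked with a plain foreach over splitBlocks($text), in the same order as in the file. Note that two adjacent blocks with identical separator widths and no blank line between them would be merged into one; the format as described does not seem to offer a way to tell them apart.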
I’m not sure if this is the most elegant or even reliable way, though, since it’s hard to guess what exactly the content might look like.