Ok, so here’s the thing. I have notes in an old SQL Server text format that puts all the notes for a record in one big blob of data. I need to take that blob of text and parse it out to create one row for each note entry, with separate columns for timestamp, user, and note text. The only way to do this that I can think of is to use a regex to locate the Unix timestamp for each note and parse on that. I know there is a split function for parsing on delimiters, but that removes the delimiter. I need to split on \d{10} but also retain the 10-digit number. Here is some sample data.
create table test_table
(
job_number number,
notes varchar2(4000)
);
insert into test_table values
(12345, '1234567890 username notes text notes text notes text notes text 5468204562 username notes text notes text notes text notes text 1025478510 username notes text notes text notes text notes text'),
(12346, '2345678901 username notes text notes text notes text notes text 1523024512 username notes text notes text notes text notes text 1578451236 username notes text notes text notes text notes text'),
(12347, '2345678902 username notes text notes text notes text notes text 2365201214 username notes text notes text notes text notes text 1202154215 username notes text notes text notes text notes text');
I would like to see one record for each note, like this:
JOB_NUMBER DTTM       USER     NOTES_TEXT
---------- ---------- -------- -------------------------------------------
12345      1234567890 USERNAME notes text notes text notes text notes text
12345      5468204562 USERNAME notes text notes text notes text notes text
12345      1025478510 USERNAME notes text notes text notes text notes text
12346      2345678901 USERNAME notes text notes text notes text notes text
12346      1523024512 USERNAME notes text notes text notes text notes text
12346      1578451236 USERNAME notes text notes text notes text notes text
12347      2345678902 USERNAME notes text notes text notes text notes text
12347      2365201214 USERNAME notes text notes text notes text notes text
12347      1202154215 USERNAME notes text notes text notes text notes text
Thank you for any help you can provide.
Text::ParseWords can handle the quoted strings and split on commas. You can skip ahead in the input by using the flip-flop operator, 1 .. /values/. This particular skip method may need to be revised.

Then it is just a matter of parsing the strings, which can be done by splitting on a lookahead assertion and then capturing the various entries in each substring. The regex in the split has a negative lookbehind assertion to avoid matching at the start of the string (^), and a lookahead assertion to match 10 digits. This will effectively split at the numbers and keep them.

The DATA file handle is used for demonstration; simply replace <DATA> with <> to read from a file named as an argument.
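For anyone who wants to try the split-and-keep idea outside Perl, here is a minimal Python sketch of the same technique. It is not the code from this answer. It assumes Python 3.7 or later (where re.split accepts zero-width matches), a username with no spaces, and notes that never contain a run of 10 or more digits. The names SPLIT_RE, NOTE_RE and split_notes are made up for illustration.

```python
import re

# Zero-width split point: not at the start of the string (negative lookbehind
# on ^), and immediately before 10 digits (lookahead). Nothing is consumed,
# so the 10-digit timestamp stays at the front of each piece.
SPLIT_RE = re.compile(r"(?<!^)(?=\d{10})")

# Capture timestamp, user (assumed to be a single word), and the remaining text.
NOTE_RE = re.compile(r"(\d{10})\s+(\S+)\s+(.*)", re.S)


def split_notes(job_number, blob):
    """Return one (job_number, dttm, user, notes_text) tuple per note in blob."""
    rows = []
    for chunk in SPLIT_RE.split(blob):
        match = NOTE_RE.match(chunk.strip())
        if match:
            dttm, user, text = match.groups()
            rows.append((job_number, dttm, user, text))
    return rows


if __name__ == "__main__":
    sample = ("1234567890 username notes text notes text "
              "5468204562 username notes text notes text")
    for row in split_notes(12345, sample):
        print(*row)
```

Because both assertions are zero-width, the split does not remove anything from the string, which is what distinguishes it from splitting on an ordinary delimiter.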