I have a feed that is populating a single text field in a table with statistics.
I need to pull this data into multiple fields in another table
but the strange format makes importing automatically difficult.
The file format is flat text but an example is below:
08:34:52 Checksum=180957248,TicketType=6,InitialUserType=G,InitialUserID=520,CommunicationType=Incoming,Date=26-03-2012,Time=08:35:00,Service=ST,Duration=00:00:14,Cost=0.12
Effectively it’s made up of:
[timestamp] [Field1 name]=[Field1 value],[Field2 name]=[Field2 value],[Field4 name]=[Field4 value]...[CR]
All fields are always in the same order but not always present.
Total columns could be anywhere from 5 to 30.
I’ve tried the below function to translate it which seems to work mostly but seems to randomly skip fields:
Parsing the data:
(SELECT [Data].[dbo].[GetFromTextString] ( 'Checksum=' ,',' ,RAWTEXT)) AS RowCheckSum,
(SELECT [Data].[dbo].[GetFromTextString] ( 'TicketType=' ,',' ,RAWTEXT)) AS TicketType,
And the Function:
CREATE FUNCTION [dbo].[GetFromTextString]
-- Input start and end and return value.
(@uniqueprefix VARCHAR(100),
@commonsuffix VARCHAR(100),
@datastring VARCHAR(MAX) )
RETURNS VARCHAR(MAX) -- Picked Value.
AS
BEGIN
DECLARE @ADJLEN INT = LEN(@uniqueprefix)
SET @datastring = @datastring + @commonsuffix
RETURN (
CASE WHEN (CHARINDEX(@uniqueprefix,@datastring) > 0)
AND (CHARINDEX(@uniqueprefix + @commonsuffix,@datastring) = 0)
THEN SUBSTRING(@datastring, PATINDEX('%' + @uniqueprefix + '%',@datastring)+@ADJLEN, CHARINDEX(@commonsuffix,@datastring,PATINDEX('%' + @uniqueprefix + '%',@datastring))- PATINDEX('%' + @uniqueprefix + '%',@datastring)-@ADJLEN) ELSE NULL END
)
END
Could anyone suggest a better/cleaner way to strip out the data or could someone work out why this formula skips rows?
Any help really appreciated.
NOTE – THE FIRST SOLUTION IS RUBBISH. I HAVE LEFT IN IT FOR HISTORICAL REASONS, BUT A BETTER SOLUTION IS CONTAINED BELOW
I am not even sure if this will be faster than your current method, but it is the way I would approach the issue (If i was forced into an SQL only solution). The first thing that is required is a table valued function that will perform a split function:
For my example I am working from a temp table @Worklist, you may need to adapt yours accordingly, or you could just insert the relevant data into @Worklist where I have used dummy data:
The main bit of the query is done here. It is quite long, so I have tried to comment it as well as possible. If further clarification is required I can add more comments.
EDIT (4 YEARS LATER)
A recent upvote brought this to my attention and the I hate myself a little bit for ever posting this answer in its current form.
A much better split function would be:
Then there is no need for looping at all, you just have a proper set based solution by calling the split function twice to get your EAV style data set: