I have a table with a column called ExcelLinks that contain records like this:


Asked: June 7, 2026


I have a table with a column called ExcelLinks that contains records like this:

=INDEX('\\san1\engData[BT_500.0_Structural_Position.xls]Concrete'!$B$4:$IK$83,MATCH($K$9,'\\san1\engData[BT_500.0_Structural_Position.xls]Concrete'!$A$4:$A$83,0),MATCH(C212,'\\san1\engData[BT_500.0_Structural_Position.xls]Concrete'!$B$3:$IK$3,0))/1000000

=INDEX('\\san1\engData[GK_600.0_Pumps.xls]Pumps'!$B$4:$BD$39,MATCH($K$9,'\\san1\engData[TT_640.0_Generator.xls]Generator'!$A$4:$A$39,0),MATCH(C214,'\\san1\engData[GK_600.0_Pumps.xls]Pumps'!$B$3:$BD$3,0))/1000000

=INDEX('\\san1\engData[TT_640.0_Generator.xls]Generator'!$B$4:$HU$83,MATCH($K$9,'\\san1\engData[GK_600.0_Pumps.xls]Pumps'!$A$4:$A$83,0),MATCH(C218,'\\san1\engData[TT_640.0_Generator.xls]Generator'!$B$3:$HU$3,0))/1000000

The ideal output would be:

_______________________________________
| Row  |  LinkCount |  UniqueLinkCount |
| 1    |     3      |        1         |
| 2    |     3      |        2         |
| 3    |     3      |        2         |

I want to query this data and see the number of files and unique files used per record.

I did a search online and couldn’t find anything that does this.

I’m thinking I’ll write a cursor and, for each record, detect the substrings that start with \\ and end with '!$, then count the number of files.

The hard part is the ExcelLinks values whose =INDEX and MATCH functions contain multiple links, which may point to different files.

There are over 12 million records in this table, so I am concerned about the performance of a cursor.

Oracle offers better ways to do this with regular expressions. I know that SQL Server doesn’t have RegEx support, and I am willing to write or use a CLR stored procedure if that’s the easiest option.


1 Answer

Editorial Team · Answer 1 · 2026-06-07T09:21:45+00:00

First, grab this string-splitting CLR function from Adam Machanic. Compile the code into a DLL (csc works if you don’t have Visual Studio), copy the DLL to your server, and then register it as follows. You’ll have to replace the variable parts here, such as the file path and the name you want to give the assembly:

CREATE ASSEMBLY CLRStuff 
  FROM 'C:\DLLs\CLRStuff.dll'  
  WITH PERMISSION_SET = SAFE;
GO

CREATE FUNCTION dbo.SplitStrings
(
   @List      NVARCHAR(MAX),
   @Delimiter NVARCHAR(255)
)
RETURNS TABLE ( Item NVARCHAR(4000) )
  EXTERNAL NAME CLRStuff.UserDefinedFunctions.SplitString_Multi;
GO
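Before moving on, it may be worth confirming that CLR execution is enabled and that the function splits on the two-character delimiter as expected. This is a minimal sketch: it assumes the assembly and function were created under the names used above, and the test string 'a!$b!$c' is made up purely for illustration. On SQL Server 2017 and later, the clr strict security option is on by default, so even a SAFE assembly may need to be signed or added to the trusted-assembly list before CREATE ASSEMBLY succeeds.

```sql
-- Enable CLR execution at the instance level (requires ALTER SETTINGS permission).
EXEC sys.sp_configure N'clr enabled', 1;
RECONFIGURE;
GO

-- Smoke test with a made-up string: if the function splits on the whole
-- delimiter '!$', this should come back as three items (a, b, c).
SELECT Item
  FROM dbo.SplitStrings(N'a!$b!$c', N'!$');
```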

With that in place, the query itself is quite easy. Let’s create a simple table variable holding a few rows (I shortened the paths for brevity):

DECLARE @x TABLE(i INT, ExcelLink VARCHAR(MAX));

INSERT @x

    -- 3 files, 1 unique: 
    SELECT 1,'=INDEX(''\\san1\a.xls''!$B$4:$IK$83,MATCH($K$9,''\\san1\a.xls'
    + '''!$A$4:$A$83,0),MATCH(C212,''\\san1\a.xls''!$B$3:$IK$3,0))/1000000'

UNION ALL 

    -- 3 files, 3 unique:
    SELECT 2,'=INDEX(''\\san1\a.xls''!$B$4:$BD$39,MATCH($K$9,''\\san1\b.xls'
    + '''!$A$4:$A$39,0),MATCH(C214,''\\san1\c.xls''!$B$3:$BD$3,0))/1000000'

UNION ALL 

    -- 3 files, 2 unique:
    SELECT 3,'=INDEX(''\\san1\b.xls''!$B$4:$HU$83,MATCH($K$9,''\\san1\c.xls'
    + '''!$A$4:$A$83,0),MATCH(C218,''\\san1\c.xls''!$B$3:$HU$3,0))/1000000'

UNION ALL 

    -- 1 file, 1 unique:
    SELECT 4,'=INDEX(''\\san1\foo.xls''!$B$4:$HU$83,0)';

-- the above was just inserts; the remainder is all of the query:

;WITH x(i,part) AS 
(
  SELECT x.i, SUBSTRING(t.Item, CHARINDEX('''\\', t.Item), 2048) 
    FROM @x AS x CROSS APPLY dbo.SplitStrings(x.ExcelLink, '!$') AS t
)
SELECT i, [file_count] = COUNT(part), [unique_files] = COUNT(DISTINCT part)
  FROM x WHERE part LIKE '''\\%'
  GROUP BY i ORDER BY i;

Results:

i   file_count  unique_files
--  ----------  ------------
1   3           1
2   3           3
3   3           2
4   1           1

This relies on \\ not appearing in the data other than at the beginning of a file path, and on all file paths residing on a network share.
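One way to test that assumption before trusting the counts is to compare, per row, how often \\ occurs anywhere with how often it occurs directly after a quote. The sketch below is self-contained and uses two made-up rows; the second deliberately contains a stray \\ that is not a path start. It assumes a VARCHAR column, so each of these ASCII characters occupies one byte; for NVARCHAR the divisors would need doubling.

```sql
-- Hypothetical sample rows; row 2 contains a \\ that does not start a quoted path.
DECLARE @x TABLE (i INT, ExcelLink VARCHAR(MAX));

INSERT @x (i, ExcelLink)
VALUES (1, '=INDEX(''\\san1\a.xls''!$B$4:$IK$83,MATCH($K$9,''\\san1\b.xls''!$A$4:$A$83,0))'),
       (2, '=''\\san1\a.xls''!$B$4 & "see \\san1 share"');

;WITH c AS
(
  SELECT i,
         -- occurrences of \\ anywhere in the formula (2 bytes removed per match)
         slashes = (DATALENGTH(ExcelLink) - DATALENGTH(REPLACE(ExcelLink, '\\', ''))) / 2,
         -- occurrences of '\\ (a quoted path start; 3 bytes removed per match)
         paths   = (DATALENGTH(ExcelLink) - DATALENGTH(REPLACE(ExcelLink, '''\\', ''))) / 3
    FROM @x
)
SELECT i, slashes, paths
  FROM c
  WHERE slashes <> paths;  -- rows that break the assumption; only row 2 here
```

As a side effect, the paths column gives the per-row file count without any splitting, which may be enough when the unique count is not needed.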

This is probably not the most efficient approach. I’m sure some RegEx wizard can improve on it using regular expressions instead of splitting (here is a good article to get you started), but that’s not my forte. In any case, a large portion of the cost is going to be the I/O required to scan the entire table, rather than the splitting or the counting.

If you can’t use CLR, you can replace that function with any number of non-CLR versions (here is an example that would be a functionally suitable replacement), but keep in mind that other approaches will likely perform worse.
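As a rough illustration of what a non-CLR route could look like, the sketch below skips the split function entirely and uses an ad hoc numbers table to locate each '\\ and the '! that closes it. This is not the replacement function referred to above, only one possible pure T-SQL variant. The sample rows are made up, and the 8000-character cap on the numbers table and the 2048-character cap on a path are assumptions.

```sql
-- Hypothetical sample rows, shortened in the same way as the earlier examples.
DECLARE @x TABLE (i INT, ExcelLink VARCHAR(MAX));

INSERT @x (i, ExcelLink)
VALUES (1, '=INDEX(''\\san1\a.xls''!$B$4:$IK$83,MATCH($K$9,''\\san1\a.xls''!$A$4:$A$83,0))'),  -- 2 files, 1 unique
       (2, '=INDEX(''\\san1\a.xls''!$B$4:$BD$39,MATCH($K$9,''\\san1\b.xls''!$A$4:$A$39,0))');  -- 2 files, 2 unique

;WITH n(n) AS
(
  -- ad hoc numbers table; 8000 is an assumed upper bound on formula length
  SELECT TOP (8000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM sys.all_objects AS a CROSS JOIN sys.all_objects AS b
),
p(i, part) AS
(
  SELECT x.i,
         -- text from the opening '\\ through the quote that precedes !
         -- NULLIF yields NULL (instead of an invalid length) when no closing '! exists
         CONVERT(VARCHAR(2048),
           SUBSTRING(x.ExcelLink, n.n,
             NULLIF(CHARINDEX('''!', x.ExcelLink, n.n + 1), 0) - n.n + 1))
    FROM @x AS x
    INNER JOIN n
      ON n.n <= LEN(x.ExcelLink)
     AND SUBSTRING(x.ExcelLink, n.n, 3) = '''\\'  -- every position where a quoted path starts
)
SELECT i, [file_count] = COUNT(part), [unique_files] = COUNT(DISTINCT part)
  FROM p
  GROUP BY i ORDER BY i;
```

Like the CLR query, this treats the whole quoted reference (path, workbook and sheet) as the unit being counted. On 12 million rows the join against the numbers table is likely to cost noticeably more CPU than the CLR split, so it is worth timing on a sample first.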
