The Archive Base

Editorial Team
Asked: May 23, 2026

I have a column on an SQL Server 2005 table called BIO – the data in the BIO column is formatted like this:

<HTML><HEAD><TITLE></TITLE></HEAD><BODY><STRONG><A name=SN>AARTS</A>, <A name=GN>Michelle Marie</A>, </STRONG><A name=HO>B.Sc.</A>, <A name=HO>M.Sc.</A>, <A name=HO>Ph.D.</A>; <A name=OC>scientist, professor</A>; b. <A name=BC>St. Marys</A>, Ont. <A name=BY>1970</A>; <A name=PA>d. Wm. and H. Aarts</A>; <A name=ED>e. Univ. of Western Ont. B.Sc.(Hons.) 1994, M.Sc. 1997</A>; <A name=ED>McGill Univ. Ph.D. 2002</A>; <A name=MA>m. L. MacManus</A>; two children; <A name=PO>CANADA RESEARCH CHAIR IN SIGNAL TRANSDUCTION IN ISCHEMIA</A> and <A name=PO>ASST. PROF., DEPT. OF BIOL. SCI., UNIV. OF TORONTO SCARBOROUGH 2006&ndash;&nbsp;&nbsp;</A>; Postdoctoral Fellow, Toronto Western Hosp. 2000&ndash;06; Expert Cons., Auris Med. SAS, Montpellier, France; mem., Centre for the Neurobiol. of Stress; named INMHA Brainstar of the Year 2003; Bd. of Dirs. &amp; Fundraising Chair, N'Sheemaehn Childcare; mem., Soc. for Neurosci.; Cdn. Physiol. Soc.; Cdn. Assn. for Neurosci.; <A name=WK>co-author: 'Therapeutic Tools in Brain Damage' in <EM>Proteomics and Protein Interactions: Biology, Chemistry, Bioinformatics and Drug Design </EM>2005; 18 pub. journal articles</A>; Office: <A name=OF1_L1>1265 Military Trail</A>, <A name=OF1_CT>Scarborough</A>, <A name=OF1_PR>Ont.</A> <A name=OF1_PC>M1C 1A4</A>. </BODY></HTML>

I need to extract the value from each of the anchor tags, e.g.:

<A name=SN>AARTS</A> 

I would need to have AARTS in a column called SN in the result set.

This is what I have so far…

SELECT  CONTACT_ID
    ,dbo.udf_StripHTML(SUBSTRING([BIO], (CHARINDEX('<A name=SN>', [BIO]) + 11), (CHARINDEX('</A>', [BIO], CHARINDEX('<A name=SN>', [BIO])) - CHARINDEX('<A name=SN>', [BIO])-11))) AS 'SN'
    ,dbo.udf_StripHTML(SUBSTRING([BIO], (CHARINDEX('<A name=GN>', [BIO]) + 11), (CHARINDEX('</A>', [BIO], CHARINDEX('<A name=GN>', [BIO])) - CHARINDEX('<A name=GN>', [BIO])-11))) AS 'GN'
    ,dbo.udf_StripHTML(SUBSTRING([BIO], (CHARINDEX('<A name=HO>', [BIO]) + 11), (CHARINDEX('</A>', [BIO], CHARINDEX('<A name=HO>', [BIO])) - CHARINDEX('<A name=HO>', [BIO])-11))) AS 'HO'
    ,dbo.udf_StripHTML(SUBSTRING([BIO], (CHARINDEX('<A name=OC>', [BIO]) + 11), (CHARINDEX('</A>', [BIO], CHARINDEX('<A name=OC>', [BIO])) - CHARINDEX('<A name=OC>', [BIO])-11))) AS 'OC'
    ,dbo.udf_StripHTML(SUBSTRING([BIO], (CHARINDEX('<A name=PO>', [BIO]) + 11), (CHARINDEX('</A>', [BIO], CHARINDEX('<A name=PO>', [BIO])) - CHARINDEX('<A name=PO>', [BIO])-11))) AS 'PO'
    ,dbo.udf_StripHTML(SUBSTRING([BIO], (CHARINDEX('<A name=BD>', [BIO]) + 11), (CHARINDEX('</A>', [BIO], CHARINDEX('<A name=BD>', [BIO])) - CHARINDEX('<A name=BD>', [BIO])-11))) AS 'BD'
    ,dbo.udf_StripHTML(SUBSTRING([BIO], (CHARINDEX('<A name=PA>', [BIO]) + 11), (CHARINDEX('</A>', [BIO], CHARINDEX('<A name=PA>', [BIO])) - CHARINDEX('<A name=PA>', [BIO])-11))) AS 'PA'
    ,dbo.udf_StripHTML(SUBSTRING([BIO], (CHARINDEX('<A name=BY>', [BIO]) + 11), (CHARINDEX('</A>', [BIO], CHARINDEX('<A name=BY>', [BIO])) - CHARINDEX('<A name=BY>', [BIO])-11))) AS 'BY'
    ,dbo.udf_StripHTML(SUBSTRING([BIO], (CHARINDEX('<A name=ED>', [BIO]) + 11), (CHARINDEX('</A>', [BIO], CHARINDEX('<A name=ED>', [BIO])) - CHARINDEX('<A name=ED>', [BIO])-11))) AS 'ED'
FROM [cww].[dbo].[Contacts]
ORDER BY CONTACT_ID

The results I get from that look like this:

CONTACT_ID  SN  GN  HO  OC  PO  BD  PA  BY  ED
3   AARON   Raymond Leonard B.Sc.   business coach, professional speaker, real estate entrepreneur  D>AARON
5   AATAMI  Pita    C.Q.    business executive; Kuujjuaq
7   ABBOTT  Anthony C.  P.C.    lawyer  Montreal
8   ABBOTT  Elizabeth   M.A.    historian   Ottawa
9   ABBOTT  (Caroline) Louise   D>ABBOTT    writer, photographer, filmmaker Montreal

I can keep going and manually add a substring expression for each differently named anchor, but I do not know all of the ‘names’ that are used in the anchors, and there are 22000+ records in this table that I would have to look through to make sure I catch them all. As well, not all BIOs have all the anchors: the record for ‘ABBOTT (Caroline) Louise’ has no ‘HO’ anchor, so the query returns incorrect data (‘D>ABBOTT’). I haven’t seen this yet in the limited results I’m bringing up, but some records have multiple anchors with the same name, such as 2 ‘HO’s, which I imagine will cause problems.

One last problem is that not all anchor names are 2 letters, so the 11 I’m using as the CHARINDEX offset would be wrong for those.

Is there a better way to do this? Any help would be appreciated.

UPDATE – I’ve added CASE statements to remove incorrect data when the anchor name doesn’t exist for the current record.

SELECT  CONTACT_ID
    ,'SN' = 
        CASE
            WHEN CHARINDEX('<A name=SN>', [BIO]) = 0 THEN NULL
            ELSE dbo.udf_StripHTML(SUBSTRING([BIO], (CHARINDEX('<A name=SN>', [BIO]) + 11), (CHARINDEX('</A>', [BIO], CHARINDEX('<A name=SN>', [BIO])) - CHARINDEX('<A name=SN>', [BIO])-11)))
        END     
    ,'GN' = 
        CASE
            WHEN CHARINDEX('<A name=GN>', [BIO]) = 0 THEN NULL
            ELSE dbo.udf_StripHTML(SUBSTRING([BIO], (CHARINDEX('<A name=GN>', [BIO]) + 11), (CHARINDEX('</A>', [BIO], CHARINDEX('<A name=GN>', [BIO])) - CHARINDEX('<A name=GN>', [BIO])-11)))
        END
    ,'HO' = 
        CASE
            WHEN CHARINDEX('<A name=HO>', [BIO]) = 0 THEN NULL
            ELSE dbo.udf_StripHTML(SUBSTRING([BIO], (CHARINDEX('<A name=HO>', [BIO]) + 11), (CHARINDEX('</A>', [BIO], CHARINDEX('<A name=HO>', [BIO])) - CHARINDEX('<A name=HO>', [BIO])-11)))
        END
    ,'OC' = 
        CASE
            WHEN CHARINDEX('<A name=OC>', [BIO]) = 0 THEN NULL
            ELSE dbo.udf_StripHTML(SUBSTRING([BIO], (CHARINDEX('<A name=OC>', [BIO]) + 11), (CHARINDEX('</A>', [BIO], CHARINDEX('<A name=OC>', [BIO])) - CHARINDEX('<A name=OC>', [BIO])-11)))
        END
    ,'PO' = 
        CASE
            WHEN CHARINDEX('<A name=PO>', [BIO]) = 0 THEN NULL
            ELSE dbo.udf_StripHTML(SUBSTRING([BIO], (CHARINDEX('<A name=PO>', [BIO]) + 11), (CHARINDEX('</A>', [BIO], CHARINDEX('<A name=PO>', [BIO])) - CHARINDEX('<A name=PO>', [BIO])-11)))
        END
    ,'BD' = 
        CASE
            WHEN CHARINDEX('<A name=BD>', [BIO]) = 0 THEN NULL
            ELSE dbo.udf_StripHTML(SUBSTRING([BIO], (CHARINDEX('<A name=BD>', [BIO]) + 11), (CHARINDEX('</A>', [BIO], CHARINDEX('<A name=BD>', [BIO])) - CHARINDEX('<A name=BD>', [BIO])-11)))
        END
    ,'PA' = 
        CASE
            WHEN CHARINDEX('<A name=PA>', [BIO]) = 0 THEN NULL
            ELSE dbo.udf_StripHTML(SUBSTRING([BIO], (CHARINDEX('<A name=PA>', [BIO]) + 11), (CHARINDEX('</A>', [BIO], CHARINDEX('<A name=PA>', [BIO])) - CHARINDEX('<A name=PA>', [BIO])-11)))
        END
    ,'BY' = 
        CASE
            WHEN CHARINDEX('<A name=BY>', [BIO]) = 0 THEN NULL
            ELSE dbo.udf_StripHTML(SUBSTRING([BIO], (CHARINDEX('<A name=BY>', [BIO]) + 11), (CHARINDEX('</A>', [BIO], CHARINDEX('<A name=BY>', [BIO])) - CHARINDEX('<A name=BY>', [BIO])-11)))
        END
    ,'ED' = 
        CASE
            WHEN CHARINDEX('<A name=ED>', [BIO]) = 0 THEN NULL
            ELSE dbo.udf_StripHTML(SUBSTRING([BIO], (CHARINDEX('<A name=ED>', [BIO]) + 11), (CHARINDEX('</A>', [BIO], CHARINDEX('<A name=ED>', [BIO])) - CHARINDEX('<A name=ED>', [BIO])-11)))
        END
--INTO [cww].[dbo].[BioDetails]
FROM [cww].[dbo].[Contacts]
ORDER BY CONTACT_ID

1 Answer

  1. Editorial Team
    Added an answer on May 23, 2026 at 5:27 pm

    I don’t know how you could do this purely in T-SQL.

    If you can retrieve the CONTACT_ID and BIO columns into an application, you could iterate over the result set, parse the BIO data as XML, then use XPath to get the name attribute value and the anchor body, building a map of the data to be inserted into your new table. Since you don’t know all the different names that could exist, you’ll probably need to recreate the table each time it’s run, so store names found in a Set and after iterating over all the rows use the Set to generate your create table statement.

    The DB code is pure fantasy, but here’s a snippet showing how you could do it using the XOM XML library for Java. I’m not positive this would work since your attribute values aren’t quoted, but you might be able to find a parser that isn’t too picky, and I’m sure you could do something similar in .NET.

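    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.LinkedHashSet;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;
    import nu.xom.Builder;
    import nu.xom.Document;
    import nu.xom.Element;
    import nu.xom.Nodes;

    // db, ResultSet and ResultRow are placeholders for whatever data-access layer you use
    ResultSet results = db.query("select CONTACT_ID, BIO from [cww].[dbo].[Contacts]");

    // Set is an interface; LinkedHashSet keeps the columns in first-seen order
    Set<String> newTableColumns = new LinkedHashSet<String>();
    newTableColumns.add("CONTACT_ID");

    List<Map<String,String>> dataToInsert = new ArrayList<Map<String,String>>();
    Builder parser = new Builder();

    for (ResultRow resultRow : results) { // iterate over the result set

        Map<String,String> rowDataToInsert = new HashMap<String,String>();
        rowDataToInsert.put("CONTACT_ID", resultRow.get("CONTACT_ID"));

        // parse the BIO data as an XML document
        // (build() throws ParsingException and IOException)
        Document doc = parser.build(resultRow.get("BIO"), "");

        // query the document using XPath; XML names are case-sensitive and the
        // sample data uses <A>, so match the element name in either case
        Nodes namedAnchors = doc.query("//*[(local-name()='A' or local-name()='a') and @name]");

        for (int nItr = 0; nItr < namedAnchors.size(); nItr++) {

            Element anchor = (Element) namedAnchors.get(nItr);
            String name = anchor.getAttributeValue("name");
            String anchorBody = anchor.getValue();

            newTableColumns.add(name);
            // note: a repeated name (e.g. two HO anchors) overwrites the earlier value
            rowDataToInsert.put(name, anchorBody);

        }

        // we've stored all the anchor data from this row, so put it away
        dataToInsert.add(rowDataToInsert);
    }

    // create your table
    db.createTable("NEW_TABLE_NAME", newTableColumns);

    // insert into your new table
    db.batchInsert("NEW_TABLE_NAME", dataToInsert);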
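
If no lenient parser is at hand, a regular expression may be enough for markup as regular as the sample BIO. The following is only a sketch, assuming anchors are never nested inside each other and that names contain no spaces or quotes. The class name AnchorExtractor and the method extract are made up for illustration. It handles names of any length and keeps every value when a name repeats, but it does not decode entities such as &ndash; or &amp;.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class AnchorExtractor {

    // Matches <A name=XX>body</A>. The name may be unquoted or double-quoted
    // and of any length. Assumes anchors are never nested inside each other.
    private static final Pattern NAMED_ANCHOR = Pattern.compile(
            "<a\\s+name\\s*=\\s*\"?([^\\s\">]+)\"?\\s*>(.*?)</a>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Matches any tag left inside an anchor body, such as <EM>.
    private static final Pattern ANY_TAG = Pattern.compile("<[^>]+>");

    /** Returns anchor name -> every body found for that name, in document order. */
    static Map<String, List<String>> extract(String bio) {
        Map<String, List<String>> found = new LinkedHashMap<>();
        Matcher m = NAMED_ANCHOR.matcher(bio);
        while (m.find()) {
            String name = m.group(1);
            // Strip inner markup; entities are left as they are.
            String body = ANY_TAG.matcher(m.group(2)).replaceAll("").trim();
            found.computeIfAbsent(name, k -> new ArrayList<>()).add(body);
        }
        return found;
    }

    public static void main(String[] args) {
        String bio = "<STRONG><A name=SN>AARTS</A>, <A name=GN>Michelle Marie</A>, </STRONG>"
                + "<A name=HO>B.Sc.</A>, <A name=HO>M.Sc.</A>; Office: "
                + "<A name=OF1_L1>1265 Military Trail</A>";
        for (Map.Entry<String, List<String>> e : extract(bio).entrySet()) {
            // Repeated anchors are joined here; how to store them is a design choice.
            System.out.println(e.getKey() + " = " + String.join("; ", e.getValue()));
        }
    }
}
```

In the example input, HO collects both degrees and OF1_L1 is picked up even though its name is longer than two letters. A record with no HO anchor simply has no HO key, so nothing from a neighbouring anchor leaks into that column. The keys seen across all rows could feed the same Set of column names used above.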
    