The Archive Base

Editorial Team
Asked: June 5, 2026 (2026-06-05T05:43:17+00:00)

Intro

I’ve been given a messy Excel dump loaded straight into a table. Now I need to turn that mess into something useful.
The dump has duplicates and inconsistencies… good times!

I’ve struck out on every approach so far 🙁 and hope you can help me out.

Given this example data set:

ExcelDump
+----+------+------+------+
| ID | Col1 | Col2 | Col3 |
+----+------+------+------+
|  1 |      |      | C    |
|  1 |      | B    | C    |
|  1 | A    | B    | D    |
|  1 | E    | B    | C    |
|  2 | A    | B    | C    |
|  2 | A    | B    | C    |
|  3 | A    | B    | C    |
|  3 | A    | B    | F    |
|  4 | A    | B    | C    |
|  4 | G    | B    | C    |
+----+------+------+------+

One possible result could be:

OutputTable
+----+------+------+------+
| ID | Col1 | Col2 | Col3 |
+----+------+------+------+
|  1 | A    | B    | C    |
|  2 | A    | B    | C    |
|  3 | A    | B    | C    |
|  4 | A    | B    | C    |
+----+------+------+------+

Nice and neat.
Unique ID key and data merged together in a way that makes sense.

How to choose which data is correct?

You’ve probably noticed that another possible result could be:

+----+------+------+------+
| ID | Col1 | Col2 | Col3 |
+----+------+------+------+
|  1 | E    | B    | C    |
|  2 | A    | B    | C    |
|  3 | A    | B    | F    |
|  4 | G    | B    | C    |
+----+------+------+------+

This is where it gets complicated. I want to be able to choose the set that makes the most sense based on some conditions I can manipulate.

For instance, I want to set up a condition that says: “Choose the most common non-null value; if there is no single most common value, take the first non-null value found.”
This condition should be applied to each group of rows that share an ID.
The result of that condition would be:

+----+------+------+------+
| ID | Col1 | Col2 | Col3 |
+----+------+------+------+
|  1 | A    | B    | C    |
|  2 | A    | B    | C    |
|  3 | A    | B    | C    |
|  4 | A    | B    | C    |
+----+------+------+------+

If I later find out that this assumption was wrong and the condition should instead be “Choose the most common non-null value; if there is no single most common value, take the last non-null value found,” the result would be:

+----+------+------+------+
| ID | Col1 | Col2 | Col3 |
+----+------+------+------+
|  1 | E    | B    | C    |
|  2 | A    | B    | C    |
|  3 | A    | B    | F    |
|  4 | G    | B    | C    |
+----+------+------+------+

So basically I want to select values based on a set of conditions applied to each group of IDs.

1 Answer

  1. Editorial Team
    Added an answer on June 5, 2026 at 5:43 am

    I’ve updated my solution to account for the extra information added to the question. The query below implements the second rule you specified: take the most common non-null value and, on a tie, the last value found. To get the first rule instead, change the “max” in each outer apply to “min” and change “sortOrder desc” to “sortOrder asc”. Keep in mind how ties for the most frequent value are resolved. For the values A, A, B, B, C, with the A’s coming first, the code below picks B, because B shares the highest count and its last occurrence comes after both A’s.

    -- setup test table
    create table ExcelDump(
        id int
    ,   Col1 char(1)
    ,   Col2 char(1)
    ,   Col3 char(1)
    )
    
    insert into ExcelDump values(1,null,null,'C')
    insert into ExcelDump values(1,null,'B','C')
    insert into ExcelDump values(1,'A','B','D')
    insert into ExcelDump values(1,'E','B','C')
    insert into ExcelDump values(2,'A','B','C')
    insert into ExcelDump values(2,'A','B','C')
    insert into ExcelDump values(3,'A','B','C')
    insert into ExcelDump values(3,'A','B','F')
    insert into ExcelDump values(4,'A','B','C')
    insert into ExcelDump values(4,'G','B','C')
    
    -- create temp tables to make it easier to debug
    select distinct
        id
    into #distinct
    from ExcelDump
    
    -- "order by (select 1)" does not guarantee any order; on a table with no indexes the rows
    -- usually come back in insert order, but treat that as an assumption, not a guarantee
    select
        row_number() over(order by (select 1)) as numberOrder
    ,   ID
    ,   Col1
    ,   Col2
    ,   Col3
    into #sorted
    from ExcelDump
    
    -- actual query
    select
        ui.Id
    ,   col1.Col1
    ,   col2.Col2
    ,   col3.Col3
    from #distinct ui
      outer apply (
            select top 1
                ed.Col1
            ,   count(*) as cnt
            ,   max(ed.numberOrder) as sortOrder
            from #sorted ed
            where ed.id = ui.id
            and ed.Col1 is not null -- ignore nulls
            group by ed.Col1
            order by cnt desc, sortOrder desc -- get most common value, then get last one found if there are multiple
        ) col1
      outer apply (
            select top 1
                ed.Col2
            ,   count(*) as cnt
            ,   max(ed.numberOrder) as sortOrder
            from #sorted ed
            where ed.id = ui.id
            and ed.Col2 is not null -- ignore nulls
            group by ed.Col2
            order by cnt desc, sortOrder desc -- get most common value, then get last one found if there are multiple
        ) col2
      outer apply (
            select top 1
                ed.Col3
            ,   count(*) as cnt
            ,   max(ed.numberOrder) as sortOrder
            from #sorted ed
            where ed.id = ui.id
            and ed.Col3 is not null -- ignore nulls
            group by ed.Col3
            order by cnt desc, sortOrder desc -- get most common value, then get last one found if there are multiple
        ) col3
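
The two tie-break rules can also be switched with a variable instead of editing the query. The following is only a sketch under stated assumptions, not a drop-in replacement: the table #ExcelDumpOrdered, its rowId identity column, and the @takeLast variable are hypothetical names introduced here for illustration. The identity column records load order explicitly, so "first" and "last" no longer depend on an unordered row_number().

```sql
-- Hypothetical staging table: rowId is an assumption, not part of the original dump.
-- It is assigned in insert order, which makes "first" and "last" well defined.
create table #ExcelDumpOrdered(
    rowId int identity(1,1)
,   id int
,   Col1 char(1)
,   Col2 char(1)
,   Col3 char(1)
);

-- same sample rows as the question, inserted one at a time to keep their order
insert into #ExcelDumpOrdered (id, Col1, Col2, Col3) values (1, null, null, 'C');
insert into #ExcelDumpOrdered (id, Col1, Col2, Col3) values (1, null, 'B', 'C');
insert into #ExcelDumpOrdered (id, Col1, Col2, Col3) values (1, 'A', 'B', 'D');
insert into #ExcelDumpOrdered (id, Col1, Col2, Col3) values (1, 'E', 'B', 'C');
insert into #ExcelDumpOrdered (id, Col1, Col2, Col3) values (2, 'A', 'B', 'C');
insert into #ExcelDumpOrdered (id, Col1, Col2, Col3) values (2, 'A', 'B', 'C');
insert into #ExcelDumpOrdered (id, Col1, Col2, Col3) values (3, 'A', 'B', 'C');
insert into #ExcelDumpOrdered (id, Col1, Col2, Col3) values (3, 'A', 'B', 'F');
insert into #ExcelDumpOrdered (id, Col1, Col2, Col3) values (4, 'A', 'B', 'C');
insert into #ExcelDumpOrdered (id, Col1, Col2, Col3) values (4, 'G', 'B', 'C');

-- Hypothetical switch: 0 = first value wins a tie, 1 = last value wins a tie
declare @takeLast bit = 0;

select
    ui.id
,   c1.Col1
,   c2.Col2
,   c3.Col3
from (select distinct id from #ExcelDumpOrdered) ui
  outer apply (
        select top 1 ed.Col1
        from #ExcelDumpOrdered ed
        where ed.id = ui.id
        and ed.Col1 is not null -- ignore nulls
        group by ed.Col1
        order by
            count(*) desc -- most common value first
        ,   case when @takeLast = 1 then max(ed.rowId) end desc -- tie: last value found
        ,   case when @takeLast = 0 then min(ed.rowId) end asc  -- tie: first value found
    ) c1
  outer apply (
        select top 1 ed.Col2
        from #ExcelDumpOrdered ed
        where ed.id = ui.id
        and ed.Col2 is not null
        group by ed.Col2
        order by
            count(*) desc
        ,   case when @takeLast = 1 then max(ed.rowId) end desc
        ,   case when @takeLast = 0 then min(ed.rowId) end asc
    ) c2
  outer apply (
        select top 1 ed.Col3
        from #ExcelDumpOrdered ed
        where ed.id = ui.id
        and ed.Col3 is not null
        group by ed.Col3
        order by
            count(*) desc
        ,   case when @takeLast = 1 then max(ed.rowId) end desc
        ,   case when @takeLast = 0 then min(ed.rowId) end asc
    ) c3
order by ui.id;
```

With @takeLast = 0 this should return A, B, C for every ID, matching the first result table in the question. With @takeLast = 1 it should return E/B/C, A/B/C, A/B/F and G/B/C for IDs 1 to 4, matching the second. Only one of the two case expressions is non-null for a given setting, so the other contributes nothing to the sort.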
    