The Archive Base

Editorial Team
Asked: May 22, 2026

I’m looking to compare two big sets of csv files and/or a csv file

I’m looking to compare two big sets of csv files and/or a csv file and a .txt file. I “think” the .txt file may need to be converted to a csv file for simplicity’s sake, but that may or may not be needed. I want to use either Excel, C++, or Python. I need to compare one “accepted” list of values to a measured list and find the difference between them, if there is one. Excel may be the easiest way to do this, but Python or C++ may work just as well. This is not homework, so don’t worry about that sort of thing. Code advice, templates, or links to websites are greatly appreciated.

EDIT 1

I’ve read about Python’s difflib module and its Differ class, but I’m unfamiliar with how to use it, and it may be more than I need.

EDIT 2

Both files will have a series of “named” columns (not with lines drawn between them or anything), and below those column names there will be numbers. I need to compare the number in column 1, spot 1 of file 1 to the number in column 1, spot 1 of file 2 and, if there is a difference, show the difference in another csv file.

1 Answer

  1. Editorial Team
    Added an answer on May 22, 2026 at 8:57 pm

    You can use ADO (ODBC/JET/OLEDB Text Driver) to treat ‘decent’ .txt/.csv/.tab/.flr files as tables in a SQL database from any COM-enabled language. The comparisons can then be done using the power of SQL (DISTINCT, GROUP BY, (LEFT) JOINs, …).

    Added with regard to your comment:

    It’s your problem and I don’t want to push you where you don’t want to go. But SQL is a good (the best?) tool if you need to compare tabular data. As evidence, here is the output of a script that spots the differences between two .txt files:

    ======= The .txt files to play with
    ------- file1.txt
    "AC";"AM"
    40000;-19083,00
    40100;20000,00
    40200;350004,00
    40300;3498,99
    
    ------- file2.txt
    "AC";"AM"
    40000;-19083,00
    40300;3498,99
    40105;-234567,00
    40200;350,00
    
    ======= Some diagnostic SQL
    ------- <NULL> indicates: In F1 but not in F2 (LEFT JOIN)
    SELECT T1.AC, T1.AM, T2.AM FROM [file1.txt] AS T1 LEFT JOIN [file2.txt] AS T2 ON (T1.AC = T2.AC)
    ------- Result
    AC      File1   File2
    40000   -19083  -19083
    40100   20000   <NULL>
    40200   350004  350
    40300   3498,99 3498,99
    
    ------- <NULL> indicates: Not in the other file (LEFT JOIN, UNION)
    SELECT T1.AC, T1.AM, T2.AM FROM [file1.txt] AS T1 LEFT JOIN [file2.txt] AS T2 ON (T1.AC = T2.AC)
    UNION
    SELECT T2.AC, T1.AM, T2.AM FROM [file2.txt] AS T2 LEFT JOIN [file1.txt] AS T1 ON (T1.AC = T2.AC)
    ------- Result
    AC      File1   File2
    40000   -19083  -19083
    40100   20000   <NULL>
    40105   <NULL>  -234567
    40200   350004  350
    40300   3498,99 3498,99
    
    ------- the problems: missing, different values
    SELECT T1.AC, T1.AM, T2.AM FROM [file1.txt] AS T1 LEFT JOIN [file2.txt] AS T2 ON (T1.AC = T2.AC) WHERE T2.AM IS NULL OR T1.AM <> T2.AM
    UNION
    SELECT T2.AC, T1.AM, T2.AM FROM [file2.txt] AS T2 LEFT JOIN [file1.txt] AS T1 ON (T1.AC = T2.AC) WHERE T1.AM IS NULL OR T1.AM <> T2.AM
    ------- Result
    AC      File1   File2
    40100   20000   <NULL>
    40105   <NULL>  -234567
    40200   350004  350
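
Since Python was one of the options in the question, the same "missing or different" query can be tried without ADO. This is only a rough sketch, not the script that produced the output above: it swaps the Jet Text Driver for Python's built-in sqlite3 module, loads the two sample files into in-memory tables, and runs the LEFT JOIN/UNION statement. The names `load`, `find_differences`, `file1` and `file2` are made up for illustration, and it assumes semicolon-delimited files with decimal commas like the samples.

```python
import csv
import io
import sqlite3

# The two sample files from the answer, inlined so the sketch runs on its own.
FILE1 = '"AC";"AM"\n40000;-19083,00\n40100;20000,00\n40200;350004,00\n40300;3498,99\n'
FILE2 = '"AC";"AM"\n40000;-19083,00\n40300;3498,99\n40105;-234567,00\n40200;350,00\n'

# Same idea as the third diagnostic statement: rows missing on either side
# or present in both files with different amounts.
DIFF_SQL = """
SELECT T1.AC, T1.AM, T2.AM FROM file1 AS T1
LEFT JOIN file2 AS T2 ON T1.AC = T2.AC
WHERE T2.AM IS NULL OR T1.AM <> T2.AM
UNION
SELECT T2.AC, T1.AM, T2.AM FROM file2 AS T2
LEFT JOIN file1 AS T1 ON T1.AC = T2.AC
WHERE T1.AM IS NULL OR T1.AM <> T2.AM
ORDER BY 1
"""


def load(conn, table, text):
    """Create `table` (hypothetical name) and fill it from delimited text."""
    conn.execute(f"CREATE TABLE {table} (AC INTEGER, AM REAL)")
    reader = csv.reader(io.StringIO(text), delimiter=";")
    next(reader)  # skip the header row
    rows = [(int(ac), float(am.replace(",", "."))) for ac, am in reader]
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)


def find_differences(text1, text2):
    """Return (AC, amount in file 1, amount in file 2) for every problem row."""
    conn = sqlite3.connect(":memory:")
    try:
        load(conn, "file1", text1)
        load(conn, "file2", text2)
        return conn.execute(DIFF_SQL).fetchall()
    finally:
        conn.close()


for row in find_differences(FILE1, FILE2):
    print(row)  # None plays the role of <NULL>
```

The amounts are compared as exact floating point values here; for measured data you may prefer a tolerance in the WHERE clause.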
    

    Further additions:

    This article deals with ADO and text files; look for a file adoNNN.chm
    (NNN=Version number, e.g. 210) on your computer; this is a good book about
    ADO.

    You can use Access or OpenOffice Base to experiment with SQL statements
    applied to a linked/referenced (not imported!) text database.

    A script/program will be easy once you have mastered the initial hurdle: connecting
    to the database, i.e. to a folder containing the files and a schema.ini
    file to define the structure of the files=tables.
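
If you go the Python route instead, there is no schema.ini, so the structure of the file has to be declared in the code. A minimal sketch of that idea, assuming a semicolon-delimited file with a header row and decimal commas; `SCHEMA` and `read_table` are hypothetical names, not part of any library:

```python
import csv
import io

# Hypothetical column spec: it plays the role schema.ini plays for the
# Text Driver, mapping each column name to a converter.
SCHEMA = {
    "AC": int,
    "AM": lambda s: float(s.replace(",", ".")),  # decimal comma -> float
}


def read_table(text, schema, delimiter=";"):
    """Parse delimited text with a header row into a list of typed dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [
        {name: convert(row[name]) for name, convert in schema.items()}
        for row in reader
    ]


sample = '"AC";"AM"\n40000;-19083,00\n40300;3498,99\n'
rows = read_table(sample, SCHEMA)
print(rows)
```

For a file on disk, pass the contents of `open(path, newline="").read()` instead of the inline sample.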

    The output above was generated by:

      Const adClipString = 2
    
      Dim oFS  : Set oFS = CreateObject( "Scripting.FileSystemObject" )
      Dim sDir : sDir    = oFS.GetAbsolutePathName( ".\txt" )
      Dim oDB  : Set oDb = CreateObject( "ADODB.Connection" )
      oDB.Open "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & sDir & ";Extended Properties=""text"""
      Dim sSQL
      Dim sFiNa
      WScript.Echo "=======", "The .txt files to play with"
      For Each sFiNa In Array( "file1.txt", "file2.txt"  )
          WScript.Echo "-------", sFiNa
          WScript.Echo oFS.OpenTextFile( "txt\" & sFiNa ).ReadAll()
      Next
    
      WScript.Echo "=======", "Some diagnostic SQL"
      Dim aSQL
      For Each aSQL In Array( _
           Array(   "<NULL> indicates: In F1 but not in F2 (LEFT JOIN)" _
                  , Join( Array( _
                         "SELECT T1.AC, T1.AM, T2.AM FROM" _
                       , "[file1.txt] AS T1" _
                       , "LEFT JOIN [file2.txt] AS T2 ON (T1.AC = T2.AC)" _
                    ), " " ) ) _
         , Array(   "<NULL> indicates: Not in the other file (LEFT JOIN, UNION)" _
                  , Join( Array( _
                         "SELECT T1.AC, T1.AM, T2.AM FROM" _
                       , "[file1.txt] AS T1" _
                       , "LEFT JOIN [file2.txt] AS T2 ON (T1.AC = T2.AC)" _
                       , "UNION" _
                       , "SELECT T2.AC, T1.AM, T2.AM FROM" _
                       , "[file2.txt] AS T2" _
                       , "LEFT JOIN [file1.txt] AS T1 ON (T1.AC = T2.AC)" _
                    ), " " ) ) _
         , Array(   "the problems: missing, different values" _
                  , Join( Array( _
                         "SELECT T1.AC, T1.AM, T2.AM FROM" _
                       , "[file1.txt] AS T1" _
                       , "LEFT JOIN [file2.txt] AS T2 ON (T1.AC = T2.AC)" _
                       , "WHERE T2.AM IS NULL OR T1.AM <> T2.AM" _
                       , "UNION" _
                       , "SELECT T2.AC, T1.AM, T2.AM FROM" _
                       , "[file2.txt] AS T2" _
                       , "LEFT JOIN [file1.txt] AS T1 ON (T1.AC = T2.AC)" _
                       , "WHERE T1.AM IS NULL OR T1.AM <> T2.AM" _
                    ), " " ) ) _
         )
         sSQL = aSQL( 1 )
         WScript.Echo "-------", aSQL( 0 )
         WScript.Echo sSQL
         Dim oRS : Set oRS = oDB.Execute( sSQL )
         WScript.Echo "------- Result"
         WScript.Echo Join( Array( "AC", "File1", "File2" ), vbTab )
         WScript.Echo oRS.GetString( adClipString, , vbTab, vbCrLf, "<NULL>" )
      Next
      oDB.Close
    

    If you delete/ignore the fat (creating the SQL statements, diagnostic output), it boils
    down to 6 lines:

      Dim oDB  : Set oDb = CreateObject( "ADODB.Connection" )
      oDB.Open "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & sDir & ";Extended Properties=""text"""
      sSQL = "..."
      Dim oRS : Set oRS = oDB.Execute( sSQL )
      WScript.Echo oRS.GetString( adClipString, , vbTab, vbCrLf, "<NULL>" )
      oDB.Close
    

    which can be ‘ported’ easily to any COM-enabled language, because the ADO
    objects do all the heavy lifting. The .GetString method comes in handy when you
    want to save a resultset: just twiddle the separator/delimiter/Null arguments
    and dump it to a file:

      oFS.CreateTextFile( ... ).WriteLine oRS.GetString( _
        adClipString, , ",", vbCrLf, "" _
      )
    

    (don’t forget to add a definition for that table to your schema.ini). Of
    course you can also use a “SELECT/INSERT INTO”, but such statements may not
    be easy to get right and past the ADO Text Driver’s parser.
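
For comparison, the "dump the resultset to a file" step looks roughly like this in Python. It is a sketch with the standard csv module rather than a port of .GetString; `dump_rows` and its `null` parameter are hypothetical names, and the two sample rows are taken from the diagnostic output above:

```python
import csv
import io


def dump_rows(out, header, rows, delimiter=",", null=""):
    """Write a header and rows to `out`, replacing None with `null`."""
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\r\n")
    writer.writerow(header)
    for row in rows:
        writer.writerow([null if value is None else value for value in row])


# An in-memory buffer stands in for the output file in this sketch.
buffer = io.StringIO()
dump_rows(
    buffer,
    ["AC", "File1", "File2"],
    [(40100, 20000, None), (40105, None, -234567)],
)
print(buffer.getvalue())
```

To write a real file, open it with `open(path, "w", newline="")` and pass that object as `out`.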

    Addition with regard to computations:

    Start with a 5 x 2 master/approved file containing:

    Num0    Num1    Num2    Num3    Num4
    7,6     6,1     3,8     0,9     8,9
    0,9     9,4     4,7     8,8     9,9
    

    transform it to expected.txt

    Num0    Num1    Num2    Num3    Num4    Spot
    7,6     6,1     3,8     0,9     8,9     1
    0,9     9,4     4,7     8,8     9,9     2
    

    by appending the Spot column so it conforms to

    [expected.txt]
    ColNameHeader=True
    CharacterSet=1252
    Format=Delimited(;)
    Col1=Num0 Float
    Col2=Num1 Float
    Col3=Num2 Float
    Col4=Num3 Float
    Col5=Num4 Float
    Col6=Spot Integer
    

    in your schema.ini file. Similarly, transform a measurement file like:

    Num0    Num1    Num2    Num3    Num4
    7,1     1,1     3,8     0,9     8,9
    0,9     9,4     4,7     8,8     9,9
    

    to measured.txt

    Num0    Num1    Num2    Num3    Num4    Spot
    7,1     1,1     3,8     0,9     8,9     1
    0,9     9,4     4,7     8,8     9,9     2
    

    Apply

      sSQL = Join( Array( _
             "SELECT E.Num0 - M.Num0 AS Dif0" _
          ,       ", E.Num1 - M.Num1 AS Dif1" _
          ,       ", E.Num2 - M.Num2 AS Dif2" _
          ,       ", E.Num3 - M.Num3 AS Dif3" _
          ,       ", E.Num4 - M.Num4 AS Dif4" _
          ,       ", E.Spot          AS Spot" _
          ,  "FROM [expected.txt] AS E" _
          ,  "INNER JOIN [measured.txt] AS M" _
          ,  "ON E.Spot = M.Spot" _
      ), " " )
    

    Write the resultset to differences.txt

      aFNames = Array( "Num0", ... "Spot" )
      oFS.CreateTextFile( sFSpec ).Write _
        Join( aFNames, sFSep ) & sRSep & oRS.GetString( adClipString, , sFSep, sRSep, "" )

    and you get:

    Num0    Num1    Num2    Num3    Num4    Spot
    0,5     5       0       0       0       1
    0       0       0       0       0       2
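
The same spot-by-spot subtraction can be sketched in plain Python, which may be closer to what EDIT 2 of the question asks for. This is an illustration, not the ADO approach: it assumes semicolon-delimited files (as the schema.ini above declares) with decimal commas, pairs the rows by position instead of by an explicit Spot column, and uses Decimal so that 7,6 - 7,1 comes out as exactly 0.5. The names `read_numbers` and `differences` are hypothetical.

```python
import csv
import io
from decimal import Decimal

# The expected and measured samples from above, inlined for the sketch.
EXPECTED = "Num0;Num1;Num2;Num3;Num4\n7,6;6,1;3,8;0,9;8,9\n0,9;9,4;4,7;8,8;9,9\n"
MEASURED = "Num0;Num1;Num2;Num3;Num4\n7,1;1,1;3,8;0,9;8,9\n0,9;9,4;4,7;8,8;9,9\n"


def read_numbers(text):
    """Return (header, rows) with every cell parsed as a Decimal."""
    reader = csv.reader(io.StringIO(text), delimiter=";")
    header = next(reader)
    rows = [
        [Decimal(cell.replace(",", ".")) for cell in row]
        for row in reader
        if row  # ignore blank lines
    ]
    return header, rows


def differences(expected_text, measured_text):
    """Subtract measured from expected, cell by cell, and append the spot."""
    header, expected = read_numbers(expected_text)
    _, measured = read_numbers(measured_text)
    if len(expected) != len(measured):
        raise ValueError("files have different numbers of rows")
    result = []
    for spot, (e_row, m_row) in enumerate(zip(expected, measured), start=1):
        result.append([e - m for e, m in zip(e_row, m_row)] + [spot])
    return header + ["Spot"], result


header, rows = differences(EXPECTED, MEASURED)
print(";".join(header))
for row in rows:
    print(";".join(str(value) for value in row))
```

Pairing by position only works if both files list the spots in the same order; if they can differ, join on a key as in the SQL version.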
    