The Archive Base

Editorial Team
Asked: June 4, 2026

I have a Python script that generates a CSV (data parsed from a website).

Here is an example of the CSV file:

File1.csv

China;Beijing;Auralog Software Development (Deijing) Co. Ltd.;;;
United Kingdom;Oxford;Azad University (Ir) In Oxford Ltd;;;
Italy;Bari;Bari, The British School;;Yes;
China;Beijing;Beijing Foreign Enterprise Service Group Co Ltd;;;
China;Beijing;Beijing Ying Biao Human Resources Development Limited;;Yes;
China;Beijing;BeiwaiOnline BFSU;;;
Italy;Curno;Bergamo, Anderson House;;Yes;

File2.csv

China;Beijing;Auralog Software Development (Deijing) Co. Ltd.;;;
United Kingdom;Oxford;Azad University (Ir) In Oxford Ltd;;;
Italy;Bari;Bari, The British School;;Yes;
China;Beijing;Beijing Foreign Enterprise Service Group Co Ltd;;;
China;Beijing;Beijing Ying Biao Human Resources Development Limited;;Yes;
This;Is;A;New;Line;;
Italy;Curno;Bergamo, Anderson House;;Yes;

As you can see:

China;Beijing;BeiwaiOnline BFSU;;; ==> This line from File1.csv is no longer present in File2.csv.
This;Is;A;New;Line;; ==> This line from File2.csv is new (it is not present in File1.csv).

I am looking for a way to compare these two CSV files. One important thing to know is that the order of the lines doesn't matter: they can be anywhere.

What I'd like to have is a script which can tell me:
– One new line: This;Is;A;New;Line;;
– One removed line: China;Beijing;BeiwaiOnline BFSU;;;
And so on!

I've tried the following, without success:

#!/usr/bin/python
# -*- coding: utf-8 -*-

import csv

f1 = file('now.csv', 'r')
f2 = file('past.csv', 'r')

c1 = csv.reader(f1)
c2 = csv.reader(f2)

now = [row for row in c2]
past = [row for row in c1]

for row in now:
    #print row
    lol = past.index(row)
    print lol

f1.close()
f2.close()

It fails with:

_csv.Error: new-line character seen in unquoted field - do you need to open the file in universal-newline mode?

Any idea of the best way to proceed? Thank you so much in advance 😉

EDIT:

import csv

f1 = file('now.csv', 'r')
f2 = file('past.csv', 'r')

c1 = csv.reader(f1)
c2 = csv.reader(f2)

s1 = set(c1)
s2 = set(c2)

lol = s1 - s2
print type(lol)
print lol

This seems like a good idea, but:

Traceback (most recent call last):
  File "compare.py", line 20, in <module>
    s1 = set(c1)
TypeError: unhashable type: 'list'

EDIT 2 (please ignore what is above):
*With your help, here is the script I'm writing:*

#!/usr/bin/python
# -*- coding: utf-8 -*-
import os
import csv


### COMPARISON THING ###
x=0

fichiers = os.listdir('/me/CSV')
for fichier in fichiers:
    if '.csv' in fichier:
        print('%s -----> %s' % (x,fichier))
        x=x+1

choice = raw_input("Which file do you want to compare with the new output ? ->>>")
past_file = fichiers[int(choice)]
print 'We gonna compare %s to our output' % past_file

s_now = frozenset(tuple(row) for row in csv.reader(open('/me/CSV/now.csv', 'r'), delimiter=';')) ## OUR OUTPUT
s_past = frozenset(tuple(row) for row in csv.reader(open('/me/CSV/'+past_file, 'r'), delimiter=';')) ## CHOOSEN ONE

added = [";".join(row) for row in s_now - s_past]  # in "now" but not in "past"
removed = [";".join(row) for row in s_past - s_now]  # in "past" but not in "now"

c = csv.writer(open("CHANGELOG.csv", "a"), delimiter=";")
for item_added in added:
    c.writerow(['AD', item_added])  # one changelog row per added line

for item_removed in removed:
    c.writerow(['RM', item_removed])  # one changelog row per removed line

I get two kinds of errors:

  File "programcompare.py", line 21, in <genexpr>
    s_past = frozenset(tuple(row) for row in csv.reader(open('/me/CSV/'+past_file, 'r'), delimiter=';')) ## CHOOSEN ONE
_csv.Error: line contains NULL byte

or

  File "programcompare.py", line 21, in <genexpr>
    s_past = frozenset(tuple(row) for row in csv.reader(open('/me/CSV/'+past_file, 'r'), delimiter=';')) ## CHOOSEN ONE
_csv.Error: newline inside string

It was working a few minutes ago, but I've changed the CSV files to test with different data, and here I am 🙂

Sorry, last question!

1 Answer

  1. Editorial Team
    Added an answer on June 4, 2026 at 10:33 am

    If your data is not prohibitively large, loading it into a set (or frozenset) is an easy approach:

    import csv

    s_now = frozenset(tuple(row) for row in csv.reader(open('now.csv', 'r'), delimiter=';'))
    s_past = frozenset(tuple(row) for row in csv.reader(open('past.csv', 'r'), delimiter=';'))
    

    To get the list of entries that were added:

    added = [";".join(row) for row in s_now - s_past]  # in "now" but not in "past"
    # Or, simply "added = list(s_now - s_past)" to keep them as tuples.
    

    Similarly, to get the list of entries that were removed:

    removed = [";".join(row) for row in s_past - s_now]  # in "past" but not in "now"
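The pieces above can be put together as a minimal Python 3 sketch. The helper names `load_rows` and `diff_rows` are made up for illustration, and the two in-memory strings stand in for past.csv and now.csv using lines from the question's sample; with real files you would pass handles opened with `newline=""` instead.

```python
import csv
import io


def load_rows(handle, delimiter=";"):
    """Return the rows of an open CSV file as a frozenset of tuples."""
    return frozenset(tuple(row) for row in csv.reader(handle, delimiter=delimiter))


def diff_rows(now, past):
    """Return (added, removed) as sorted lists of ';'-joined lines."""
    added = sorted(";".join(row) for row in now - past)
    removed = sorted(";".join(row) for row in past - now)
    return added, removed


# In-memory stand-ins for past.csv and now.csv (sample lines from the question).
past_csv = io.StringIO(
    "China;Beijing;BeiwaiOnline BFSU;;;\n"
    "Italy;Curno;Bergamo, Anderson House;;Yes;\n"
)
now_csv = io.StringIO(
    "This;Is;A;New;Line;;\n"
    "Italy;Curno;Bergamo, Anderson House;;Yes;\n"
)

added, removed = diff_rows(load_rows(now_csv), load_rows(past_csv))
print("added:", added)
print("removed:", removed)
```

Because sets ignore order and duplicates, this reports a line as unchanged no matter where it appears in either file, which matches the requirement that line order does not matter.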
    

    As for why you're seeing TypeError: unhashable type: 'list' in your updated question: csv.reader yields each row as a list when iterated. Lists are not hashable and therefore cannot be inserted into a set.

    To fix this, convert each row into a tuple before adding it to the set. See the first part of my answer for an example of how this can be done.
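As a small illustration of the difference (a sketch with made-up rows, not data from your files), building a set from lists fails while building it from tuples works and also collapses duplicate rows:

```python
rows = [["Italy", "Bari"], ["Italy", "Bari"], ["China", "Beijing"]]

try:
    set(rows)  # each row is a list, which is mutable and therefore unhashable
    list_error = None
except TypeError as exc:
    list_error = str(exc)

# Tuples are immutable and hashable, so they can be set members.
unique_rows = set(tuple(row) for row in rows)

print(list_error)
print(unique_rows)
```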


    The additional errors you're seeing are both due to the content of your CSV files.

    _csv.Error: newline inside string

    It looks like you have quote characters (") somewhere in your data, which confuses the parser. I'm not familiar enough with the csv module to tell you exactly what has gone wrong, at least not without having a peek at your data.

    I did, however, manage to reproduce the error like this:

    >>> [e for e in csv.reader(['hello;wo;"rld'], delimiter=";")]
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    _csv.Error: newline inside string
    

    In this case, it can be fixed by instructing the reader not to do any special processing with quotes (see csv.QUOTE_NONE). Do note that this disables the handling of quoted data, whereby delimiters can appear within a quoted string without the string being split into separate entries.

    >>> [e for e in csv.reader(['hello;wo;"rld'], delimiter=";", quoting=csv.QUOTE_NONE)]
    [['hello', 'wo', '"rld']]
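To make that caveat concrete, here is a short Python 3 sketch using a made-up line (not taken from your files) in which a delimiter sits inside a quoted field. With the default quoting the field stays whole; with csv.QUOTE_NONE the quote characters are kept as literal text and the field is split at the delimiter:

```python
import csv

# Hypothetical sample line: the second field is quoted and contains a ';'.
line = ['Italy;"Bari; The British School";Yes']

default_rows = list(csv.reader(line, delimiter=";"))
raw_rows = list(csv.reader(line, delimiter=";", quoting=csv.QUOTE_NONE))

print(default_rows)
print(raw_rows)
```

So QUOTE_NONE is only a safe choice if none of your fields rely on quoting to protect an embedded delimiter.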
    

    _csv.Error: line contains NULL byte

    I’m guessing this might be down to the encoding of your CSV files. See the following questions:

    • Python CSV error: line contains NULL byte
    • "Line contains NULL byte" in CSV reader (Python)
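If the NULL bytes do come from the encoding, one common cause is a file saved as UTF-16 being read as if it were a single-byte encoding. The following Python 3 sketch works under that assumption only: the helper name `read_rows` is hypothetical, it guesses UTF-16 from a byte-order mark and otherwise falls back to UTF-8, and it strips any stray NUL characters before parsing. Check how your files are actually produced before relying on it.

```python
import codecs
import csv
import os
import tempfile


def read_rows(path, delimiter=";"):
    """Read CSV rows as tuples, guessing UTF-16 from a byte-order mark."""
    with open(path, "rb") as fh:
        head = fh.read(2)
    # Assumption: a UTF-16 BOM means the NUL bytes are part of UTF-16 text.
    is_utf16 = head in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)
    encoding = "utf-16" if is_utf16 else "utf-8"
    with open(path, newline="", encoding=encoding) as fh:
        # Drop any stray NUL characters that survive decoding.
        cleaned = (line.replace("\x00", "") for line in fh)
        return [tuple(row) for row in csv.reader(cleaned, delimiter=delimiter)]


# Demo: write one sample line as UTF-16, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "past.csv")
    with open(path, "w", newline="", encoding="utf-16") as fh:
        fh.write("Italy;Bari;Bari, The British School;;Yes;\n")
    rows = read_rows(path)

print(rows)
```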