Editorial Team
Asked: June 16, 2026

Is there a simple way to find and remove duplicate rows from a CSV

Is there a simple way to find and remove duplicate rows from a CSV file?

Sample test.csv file:

row1 test tyy......
row2 tesg ghh
row2 tesg ghh
row2 tesg ghh
....
row3 tesg ghh
row3 tesg ghh
...
row4 tesg ghh

Expected results:

row1 test tyy......
row2 tesg ghh
....
row3 tesg ghh
...
row4 tesg ghh

Where can I start to accomplish this with PHP?

1 Answer

    Editorial Team
    Added an answer on June 16, 2026 at 1:29 pm

    A straightforward method would be to read the file line by line and keep track of each row you’ve already seen. If the current row has already been seen, skip it.

    Something like the following (untested) code may work:

    <?php
    // array to hold all "seen" rows, keyed by the re-joined line
    $lines = array();
    
    // open the csv file
    if (($handle = fopen("test.csv", "r")) !== false) {
        // read each row into an array (length 0 = no limit on line length)
        while (($data = fgetcsv($handle, 0, ",")) !== false) {
            // fgetcsv() returns array(null) for a blank line; skip those
            if ($data === array(null)) continue;
    
            // build a "line" from the parsed data to use as a lookup key
            $line = join(",", $data);
    
            // if the line has been seen, skip it
            if (isset($lines[$line])) continue;
    
            // save the parsed row under its key
            $lines[$line] = $data;
        }
        fclose($handle);
    }
    
    // save the unique rows to a new file; fputcsv() re-quotes fields as needed
    if (($saveHandle = fopen("test_unique.csv", "w")) !== false) {
        foreach ($lines as $data) fputcsv($saveHandle, $data);
        fclose($saveHandle);
    }
    ?>
    

    This code uses fgetcsv() with a comma as the column delimiter (based on the sample data in your comment on the question).

    Storing every line that has been seen, as above, removes all duplicate lines in the file regardless of whether they directly follow one another. If the duplicates are always back-to-back, a simpler (and more memory-conscious) method would be to store only the last-seen line and compare it against the current one.
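
    The back-to-back variant mentioned above could look roughly like the following. This is only a sketch: dedupeAdjacentRows() is a hypothetical helper name, and it assumes comma-delimited input in which duplicates always sit next to each other. Duplicates separated by other rows are kept.

```php
<?php
// Sketch: drop a row only when it is identical to the row right before it.
// dedupeAdjacentRows() is a hypothetical helper, not part of the original answer.
function dedupeAdjacentRows(string $inPath, string $outPath): int
{
    $in = fopen($inPath, "r");
    if ($in === false) {
        throw new RuntimeException("Could not open the input file.");
    }
    $out = fopen($outPath, "w");
    if ($out === false) {
        fclose($in);
        throw new RuntimeException("Could not open the output file.");
    }

    $previous = null; // the last row that was written
    $written = 0;     // number of rows written to the output

    // delimiter, enclosure and escape are passed explicitly
    while (($data = fgetcsv($in, 0, ",", '"', "\\")) !== false) {
        // fgetcsv() returns array(null) for a blank line; skip those
        if ($data === array(null)) continue;

        // same fields in the same order as the previous row: skip it
        if ($data === $previous) continue;

        fputcsv($out, $data, ",", '"', "\\");
        $previous = $data;
        $written++;
    }

    fclose($in);
    fclose($out);
    return $written;
}
```

    Calling dedupeAdjacentRows("test.csv", "test_unique.csv") would then write the filtered rows and return how many were kept. Only one row is held in memory at a time, whatever the file size.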

    UPDATE (duplicate lines via the SKU-column, not full-line)
    Based on sample data provided in a comment, the “duplicate lines” aren’t actually equal (though they are similar, they differ by a good number of columns). The similarity between them can be linked to a single column, the sku.

    The following is an expanded version of the above code. This block will parse the first line (column-list) of the CSV file to determine which column contains the sku code. From there, it will keep a unique list of SKU codes seen and if the current line has a “new” code, it will write that line to the new “unique” file using fputcsv():

    <?php
    // array to hold all unique SKU codes
    $skus = array();
    
    // index of the `sku` column
    $skuIndex = -1;
    
    // open the "save-file"
    if (($saveHandle = fopen("test_unique.csv", "w")) !== false) {
        // open the csv file
        if (($readHandle = fopen("test.csv", "r")) !== false) {
            // read each row into an array (length 0 = no limit on line length)
            while (($data = fgetcsv($readHandle, 0, ",")) !== false) {
                if ($skuIndex == -1) {
                    // we need to determine what column the "sku" is; this will identify
                    // the "unique" rows
                    foreach ($data as $index => $column) {
                        if ($column == 'sku') {
                            $skuIndex = $index;
                            break;
                        }
                    }
                    if ($skuIndex == -1) {
                        echo "Couldn't determine the SKU-column.";
                        die();
                    }
                    // write the header line once, then move on to the data rows
                    fputcsv($saveHandle, $data);
                    continue;
                }
    
                // skip blank or short rows that have no sku column
                if (!isset($data[$skuIndex])) continue;
    
                // if the sku has been seen, skip it
                if (isset($skus[$data[$skuIndex]])) continue;
                $skus[$data[$skuIndex]] = true;
    
                // write this line to the file
                fputcsv($saveHandle, $data);
            }
            fclose($readHandle);
        }
        fclose($saveHandle);
    }
    ?>
    

    Overall, this method is far more memory-friendly, as it doesn’t need to keep a copy of every line in memory (only the SKU codes).
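
    The same write-as-you-go idea might also be applied to full-row duplicates by remembering a fixed-size hash of each row instead of the row itself. This is a swapped-in technique rather than anything from the code above: dedupeRowsByHash() is a hypothetical helper name, md5 is an arbitrary choice of hash, and a hash collision (very unlikely, but possible) would cause a non-duplicate row to be dropped.

```php
<?php
// Sketch: stream unique rows to the output, keeping only an md5 hash per row.
// dedupeRowsByHash() is a hypothetical helper, not part of the original answer.
function dedupeRowsByHash(string $inPath, string $outPath): int
{
    $in = fopen($inPath, "r");
    if ($in === false) {
        throw new RuntimeException("Could not open the input file.");
    }
    $out = fopen($outPath, "w");
    if ($out === false) {
        fclose($in);
        throw new RuntimeException("Could not open the output file.");
    }

    $seen = array(); // md5 hash => true
    $written = 0;    // number of rows written to the output

    while (($data = fgetcsv($in, 0, ",", '"', "\\")) !== false) {
        // fgetcsv() returns array(null) for a blank line; skip those
        if ($data === array(null)) continue;

        // serialize() preserves field boundaries, so rows that merely
        // look alike once joined with commas still get different hashes
        $key = md5(serialize($data));
        if (isset($seen[$key])) continue;
        $seen[$key] = true;

        fputcsv($out, $data, ",", '"', "\\");
        $written++;
    }

    fclose($in);
    fclose($out);
    return $written;
}
```

    Each remembered row costs a 32-character key regardless of how long the row is, which only pays off when rows are long.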
