The Archive Base


The Archive Base Latest Questions

Editorial Team
Asked: May 11, 2026 (2026-05-11T06:12:47+00:00)

My web application parses data from an uploaded file and inserts it into a


My web application parses data from an uploaded file and inserts it into a database table. Due to the nature of the input data (bank transaction data), duplicate data can exist from one upload to another. At the moment I’m using hideously inefficient code to check for the existence of duplicates by loading all rows within the date range from the DB into memory, and iterating over them and comparing each with the uploaded file data.

Needless to say, this can become very slow as the data set size increases.

So, I’m looking to replace this with a SQL query (against a MySQL database) which checks for the existence of duplicate data, e.g.

SELECT COUNT(*) FROM transactions WHERE `desc` = ? AND dated_on = ? AND amount = ?

This works fine, but my real-world case is a little bit more complicated. The description of a transaction in the input data can sometimes contain erroneous punctuation (e.g. ‘BANK 12323 DESCRIPTION’ can often be represented as ‘BANK.12323.DESCRIPTION’) so our existing (in memory) matching logic performs a little cleaning on this description before we do a comparison.

Whilst this works in memory, my question is can this cleaning be done in a SQL statement so I can move this matching logic to the database, something like:

SELECT COUNT(*) FROM transactions WHERE CLEAN_ME(`desc`) = ? AND dated_on = ? AND amount = ?

Where CLEAN_ME is a proc which strips the field of the erroneous data.

Obviously the cleanest (no pun intended!) solution would be to store the already cleaned data in the database (either in the same column, or in a separate column), but before I resort to that I thought I’d try and find out whether there’s a cleverer way around this.

Thanks a lot


1 Answer

  1. Added an answer on May 11, 2026 at 6:12 am

    can this cleaning be done in a SQL statement

    Yes, you can write a stored function to do it in the database layer:

    mysql> CREATE FUNCTION clean_me (s VARCHAR(255))
        -> RETURNS VARCHAR(255) DETERMINISTIC
        -> RETURN REPLACE(s, '.', ' ');

    mysql> SELECT clean_me('BANK.12323.DESCRIPTION');
    BANK 12323 DESCRIPTION

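    This will perform very poorly across a large table though: MySQL cannot use an index on the description column when the column is wrapped in a function call, so the function has to be evaluated for every row that the other conditions do not rule out.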

    Obviously the cleanest (no pun intended!) solution would be to store the already cleaned data in the database (either in the same column, or in a separate column), but before I resort to that I thought I’d try and find out whether there’s a cleverer way around this.

    No, as far as databases are concerned the cleanest way is always the cleverest way (as long as performance isn’t awful).

    Do that, and add indexes to the columns you’re doing bulk compares on, to improve performance. If it’s actually intrinsic to the data that the combination of desc, dated_on and amount is always unique, then express that in the schema by putting a UNIQUE index on those columns.
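As a rough sketch of the stored-cleaned-column approach, the Python example below cleans each description once at upload time, keeps it in a separate column, and lets a unique index serve both the duplicate lookup and the uniqueness rule. Several things here are assumptions made for illustration: SQLite (via the standard sqlite3 module) stands in for MySQL so the example runs on its own, the desc_clean column, the index name and the helper functions are hypothetical, the cleaning rule (collapse any run of non-alphanumeric characters into one space) is only a guess at what the in-memory logic does, and amounts are stored as integers in minor units.

```python
import re
import sqlite3


def clean_description(raw: str) -> str:
    """Collapse every run of non-alphanumeric characters into one space.

    Hypothetical cleaning rule; substitute the application's real logic.
    """
    return re.sub(r"[^A-Za-z0-9]+", " ", raw).strip()


# In-memory SQLite database standing in for the MySQL server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE transactions (
        id         INTEGER PRIMARY KEY,
        "desc"     TEXT NOT NULL,     -- description exactly as uploaded
        desc_clean TEXT NOT NULL,     -- cleaned copy, used only for matching
        dated_on   TEXT NOT NULL,     -- ISO date string
        amount     INTEGER NOT NULL   -- minor units, e.g. pence (assumption)
    )
    """
)
# One index covers the duplicate lookup and also enforces uniqueness.
conn.execute(
    "CREATE UNIQUE INDEX idx_transactions_dedupe "
    "ON transactions (desc_clean, dated_on, amount)"
)


def is_duplicate(desc: str, dated_on: str, amount: int) -> bool:
    """Return True if a row with the same cleaned description, date and amount exists."""
    row = conn.execute(
        "SELECT COUNT(*) FROM transactions "
        "WHERE desc_clean = ? AND dated_on = ? AND amount = ?",
        (clean_description(desc), dated_on, amount),
    ).fetchone()
    return row[0] > 0


def insert_transaction(desc: str, dated_on: str, amount: int) -> None:
    """Store the raw description together with its cleaned form."""
    with conn:  # commits on success, rolls back if the insert fails
        conn.execute(
            'INSERT INTO transactions ("desc", desc_clean, dated_on, amount) '
            "VALUES (?, ?, ?, ?)",
            (desc, clean_description(desc), dated_on, amount),
        )


insert_transaction("BANK 12323 DESCRIPTION", "2026-05-01", 1250)
print(is_duplicate("BANK.12323.DESCRIPTION", "2026-05-01", 1250))  # True
print(is_duplicate("BANK.12323.DESCRIPTION", "2026-05-02", 1250))  # False
```

Because the comparison is against the stored desc_clean value rather than a function applied to the column, the lookup can be answered from the index. With the unique index in place, a second insert of the same cleaned description, date and amount is rejected by the database itself, so the uniqueness rule should only be declared if two genuinely distinct transactions can never share all three values.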

© 2021 The Archive Base. All Rights Reserved