The Archive Base


Editorial Team
Asked: June 8, 2026


Below is the data in TestingTable1

BUYER_ID   | ITEM_ID           | CREATED_TIME
-----------+-------------------+------------------------
1345653    | 110909316904      | 2012-07-09 21:29:06
1345653    | 151851771618      | 2012-07-09 19:57:33
1345653    | 221065796761      | 2012-07-09 19:31:48
1345653    | 400307563710      | 2012-07-09 18:57:33
1345653    | 310411560125      | 2012-07-09 16:09:49
1345653    | 120945302103      | 2012-07-09 13:40:23
1345653    | 261060982989      | 2012-07-09 09:02:21

Below is the data in TestingTable2

USER_ID    | PRODUCT_ID        | LAST_TIME
-----------+-------------------+------------------------
1345653    | 110909316904      | 2012-07-09 21:30:06
1345653    | 152851771618      | 2012-07-09 19:57:33
1345653    | 221065796761      | 2012-07-09 19:31:48
1345653    | 400307563710      | 2012-07-09 18:57:33

I need to compare TestingTable2 with TestingTable1 on BUYER_ID and USER_ID, and count every entry that is missing from or mismatched in TestingTable2 relative to TestingTable1. I created a SQL Fiddle for this:

http://sqlfiddle.com/#!3/d87b2/1

If you run my query in the SQL Fiddle, you will get this output:

BUYER_ID    ERROR
1345653       5

This is correct: the last three rows of TestingTable1 are missing from TestingTable2, and two of the remaining rows are mismatches (the first differs in time, the second in ITEM_ID versus PRODUCT_ID).

Now comes the complicated part.

Problem statement:

My current output reports an ERROR count of 5. In the first row of both tables, ITEM_ID and PRODUCT_ID are the same, but CREATED_TIME and LAST_TIME differ by just 1 minute. I currently report that as a mismatch, but if the difference between the two times is within 15 minutes, I don't want to report it as an error. With that rule in place, my query should return an error count of 4, because the first row falls within the 15-minute range.
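The rule described above can be sketched outside SQL to check the expected counts. This is only an illustration in Python, not part of the original query: the function name `count_errors` and its `tolerance` parameter are hypothetical, and the rows are copied from the two tables in the question.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"

# Rows copied from the question: (BUYER_ID, ITEM_ID, CREATED_TIME)
TABLE1 = [
    (1345653, 110909316904, "2012-07-09 21:29:06"),
    (1345653, 151851771618, "2012-07-09 19:57:33"),
    (1345653, 221065796761, "2012-07-09 19:31:48"),
    (1345653, 400307563710, "2012-07-09 18:57:33"),
    (1345653, 310411560125, "2012-07-09 16:09:49"),
    (1345653, 120945302103, "2012-07-09 13:40:23"),
    (1345653, 261060982989, "2012-07-09 09:02:21"),
]

# Rows copied from the question: (USER_ID, PRODUCT_ID, LAST_TIME)
TABLE2 = [
    (1345653, 110909316904, "2012-07-09 21:30:06"),
    (1345653, 152851771618, "2012-07-09 19:57:33"),
    (1345653, 221065796761, "2012-07-09 19:31:48"),
    (1345653, 400307563710, "2012-07-09 18:57:33"),
]


def count_errors(table1, table2, tolerance=timedelta(0)):
    """Count, per buyer, the table1 rows that have no table2 row with the
    same buyer and item whose timestamp lies within `tolerance`."""
    # Index table2 by (user_id, product_id) so the lookup is an equality match.
    lookup = {}
    for user_id, product_id, last_time in table2:
        key = (user_id, product_id)
        lookup.setdefault(key, []).append(datetime.strptime(last_time, FMT))

    errors = {}
    for buyer_id, item_id, created_time in table1:
        created = datetime.strptime(created_time, FMT)
        candidates = lookup.get((buyer_id, item_id), [])
        # A row is fine if any matching table2 timestamp is close enough.
        if not any(abs(created - t) <= tolerance for t in candidates):
            errors[buyer_id] = errors.get(buyer_id, 0) + 1
    return errors


strict = count_errors(TABLE1, TABLE2)
relaxed = count_errors(TABLE1, TABLE2, tolerance=timedelta(minutes=15))
print(strict, relaxed)
```

With an exact-match tolerance the sample data gives 5 errors for buyer 1345653, and with a 15-minute tolerance it gives 4, which matches the counts stated in the question.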

With help from Stack Overflow, I found the SQL query below. It works fine in SQL Server (returning an error count of 4) but not in Hive, because Hive supports only equality joins. So I need some other way of solving this problem. Is it possible to move the date-difference condition into the WHERE clause somehow? Basically, how can I rewrite the SQL query below so that it still fulfills all the requirements above?

SELECT  TT.BUYER_ID,
        COUNT(*)
FROM    (
          SELECT    testingtable1.buyer_id,
                    testingtable1.item_id,
                    testingtable1.created_time
          FROM      testingtable2
                    RIGHT JOIN testingtable1
                        ON (
                             testingtable1.item_id = testingtable2.product_id
                             AND testingtable1.BUYER_ID = testingtable2.USER_ID
                             AND ABS(DATEDIFF(mi, testingtable1.created_time, testingtable2.last_time)) <= 15
                           )
          WHERE     testingtable2.product_id IS NULL
        ) TT
GROUP BY TT.BUYER_ID;

Expected output after implementing the above feature:

BUYER_ID    ERROR
1345653       4

UPDATE:

As per WEST's comment below, the output shows an ERROR count of only 1, but it should be 4. Also, after removing the last row he added in his SQL Fiddle, the query stops working and I get zero errors, which is not right, as there is already one error in the time difference.

1 Answer

  1. Editorial Team
    Added an answer on June 8, 2026 at 7:43 am

    What if you do an equijoin, and put your time comparison logic inside of a CASE expression with a SUM, instead of a COUNT?

    SELECT  TT1.BUYER_ID,
            SUM(CASE WHEN ABS(DATEDIFF(mi, TT1.created_time, TT2.last_time)) <= 15 THEN 0
                     ELSE 1
                END) AS ERROR
    FROM    testingtable1 TT1
            LEFT JOIN testingtable2 TT2
                ON (
                     TT1.item_id = TT2.product_id
                     AND TT1.BUYER_ID = TT2.USER_ID
                   )
    GROUP BY TT1.BUYER_ID;
    

    You will need to convert the date arithmetic to whatever Hive uses…

    Here’s an MS SQL Server SQL Fiddle which returns 4 errors.
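To check the equijoin-plus-CASE idea without SQL Server, here is a minimal sketch that runs the same query shape against an in-memory SQLite database from Python. Two things are assumptions on my part, not from the answer: SQLite's `strftime('%s', ...)` stands in for `DATEDIFF` by converting each timestamp to epoch seconds, and the 15-minute limit is therefore written as `15 * 60` seconds. Hive has a `unix_timestamp()` function that should play a similar role, but that mapping is untested here and should be verified against your Hive version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Sample data copied from the question.
conn.executescript("""
CREATE TABLE testingtable1 (buyer_id INTEGER, item_id INTEGER, created_time TEXT);
CREATE TABLE testingtable2 (user_id INTEGER, product_id INTEGER, last_time TEXT);

INSERT INTO testingtable1 VALUES
  (1345653, 110909316904, '2012-07-09 21:29:06'),
  (1345653, 151851771618, '2012-07-09 19:57:33'),
  (1345653, 221065796761, '2012-07-09 19:31:48'),
  (1345653, 400307563710, '2012-07-09 18:57:33'),
  (1345653, 310411560125, '2012-07-09 16:09:49'),
  (1345653, 120945302103, '2012-07-09 13:40:23'),
  (1345653, 261060982989, '2012-07-09 09:02:21');

INSERT INTO testingtable2 VALUES
  (1345653, 110909316904, '2012-07-09 21:30:06'),
  (1345653, 152851771618, '2012-07-09 19:57:33'),
  (1345653, 221065796761, '2012-07-09 19:31:48'),
  (1345653, 400307563710, '2012-07-09 18:57:33');
""")

# The join uses equality conditions only; the time tolerance lives in the CASE.
# Unmatched rows have a NULL last_time, so the comparison is not true and
# they fall through to ELSE 1, which counts them as errors.
QUERY = """
SELECT  tt1.buyer_id,
        SUM(CASE WHEN ABS(CAST(strftime('%s', tt1.created_time) AS INTEGER)
                        - CAST(strftime('%s', tt2.last_time) AS INTEGER)) <= 15 * 60
                 THEN 0
                 ELSE 1
            END) AS error
FROM    testingtable1 tt1
        LEFT JOIN testingtable2 tt2
            ON tt1.item_id = tt2.product_id
           AND tt1.buyer_id = tt2.user_id
GROUP BY tt1.buyer_id
"""

rows = conn.execute(QUERY).fetchall()
print(rows)
```

For the sample rows this returns a single row with an error count of 4 for buyer 1345653. One caveat of this shape: if TestingTable2 held several rows for the same user and product, the join would produce one output row per match, so the sum could overcount.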
