The Archive Base

Editorial Team
Asked: June 15, 2026

The data is fairly large and the query takes a few minutes to run each time, so debugging this problem is taking a lot of time. When I run LIKE CONCAT('%',T.item,'%') on a smaller data set, it seems to identify the items properly. However, when I run it on the main DB (the code shown), it still shows many (maybe even all) of the exceptions.

EDIT:
It seems that when I add NOT, it stops identifying the items.

select distinct T.comment
from (select comment, source, item from data, non_informative where ticker != "O" and source != 7 and source != 6) as T
where T.comment not like concat('%',T.item,'%')
order by T.comment;

comment and source are in data, item is in non_informative

Some items from T.item:

'Stock Analysis -', '#InsideTrades', 'IIROC Trade'

Example comment which should be removed:

'#InsideTrades #4 | MACNAB CRAIG (Director,Officer,Chief Executive
Officer): Filed Form 4 for $NNN (NATIONAL RETA'

I can’t seem to figure out why it shows all the items.


1 Answer

  1. Editorial Team
    Added an answer on June 15, 2026 at 11:29 pm

    You’ve got a Cartesian product between the non_informative and data tables. (It’s not at all clear which table the ticker column is from.)

    Understand that for a “comment” to be returned, all that is required (to satisfy the predicates in your query) is for one row to be found in non_informative which does not “match” the comment. There may be rows in non_informative that do match, but your query doesn’t care about those. Your query is only looking for the existence of a row that does NOT match. The query is effectively saying that a “comment” will be excluded ONLY if it matches every single row in non_informative.
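
    To make that concrete, here is a minimal sketch on a two-item sample. The demo_data and demo_items tables and their rows are hypothetical stand-ins for data and non_informative, not your real data; only the shape of the query follows the original.

    ```sql
    -- Hypothetical sample tables standing in for data and non_informative.
    CREATE TEMPORARY TABLE demo_data  (comment VARCHAR(255));
    CREATE TEMPORARY TABLE demo_items (item    VARCHAR(255));

    INSERT INTO demo_data  VALUES ('#InsideTrades #4 | Filed Form 4');
    INSERT INTO demo_items VALUES ('#InsideTrades'), ('IIROC Trade');

    -- The cross join pairs the single comment with both items:
    --   (comment, '#InsideTrades') : LIKE matches, so NOT LIKE is false
    --   (comment, 'IIROC Trade')   : LIKE does not match, so NOT LIKE is true
    -- The second pairing passes the filter, so the comment is still returned.
    SELECT DISTINCT T.comment
      FROM ( SELECT comment, item FROM demo_data, demo_items ) AS T
     WHERE T.comment NOT LIKE CONCAT('%', T.item, '%');
    ```

    The comment disappears from the result only if every row in demo_items matches it, which is the opposite of the filter you want.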


    If what you want to return is the values of “comment” for which there is NO matching row in non_informative, you need a different query. (I’m going to assume that the ticker column is from the data table.)

    I’m also going to exclude the corner case of an empty string value for item, since that will essentially “match” every non-null value for comment.
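
    A quick way to see why the empty string needs special handling is to evaluate the pattern directly. The literal 'any comment' below is just an illustrative value; the NULL case is included as a related assumption about what item might contain.

    ```sql
    -- An empty item turns the pattern into '%%', which matches any non-null string.
    -- A NULL item makes CONCAT return NULL, so the LIKE evaluates to NULL (not a match).
    SELECT 'any comment' LIKE CONCAT('%', '',   '%') AS empty_item,  -- 1
           'any comment' LIKE CONCAT('%', NULL, '%') AS null_item;   -- NULL
    ```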


    -- using a NOT EXISTS predicate:

     SELECT d.comment
       FROM `data` d
      WHERE d.ticker != 'O'
        AND d.source != 7
        AND d.source != 6
        AND NOT EXISTS
            ( SELECT 1
                FROM `non_informative` n
               WHERE n.item <> ''
                 AND d.comment LIKE CONCAT('%',n.item,'%')
            )
      GROUP BY d.comment
      ORDER BY d.comment
    

    -- or, using an anti-join:

     SELECT d.comment
       FROM `data` d
       LEFT
       JOIN ( SELECT n.item
                FROM `non_informative` n
               WHERE n.item <> ''
               GROUP BY n.item
            ) m
         ON d.comment LIKE CONCAT('%',m.item,'%')
      WHERE d.ticker != 'O'
        AND d.source != 7
        AND d.source != 6
        AND m.item IS NULL
      GROUP BY d.comment
      ORDER BY d.comment
    

    These two statements should return equivalent result sets (though different from the result set of your original query). They will also likely exhibit different performance characteristics, depending on the version of MySQL and on whether the optimizer can transform the NOT EXISTS predicate into an anti-join operation. Performance is really going to depend on which indexes are available and on the generated execution plan.
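
    One way to compare the two forms on your own server is to prefix each statement with EXPLAIN and look at the plans side by side. This is only a sketch: the output columns and the chosen plan vary by MySQL version, so no expected output is shown. Note also that because the pattern begins with '%', an ordinary index on comment cannot be used for the LIKE match itself.

    ```sql
    -- Show the execution plan for the NOT EXISTS form; repeat with the
    -- LEFT JOIN form and compare the join order and access types.
    EXPLAIN
    SELECT d.comment
      FROM `data` d
     WHERE d.ticker != 'O'
       AND d.source != 7
       AND d.source != 6
       AND NOT EXISTS
           ( SELECT 1
               FROM `non_informative` n
              WHERE n.item <> ''
                AND d.comment LIKE CONCAT('%', n.item, '%')
           )
     GROUP BY d.comment
     ORDER BY d.comment;
    ```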

    If we don’t bother with the empty string corner-case, we can simplify the second statement a bit…

     SELECT d.comment
       FROM `data` d
       LEFT
       JOIN `non_informative` n
         ON d.comment LIKE CONCAT('%',n.item,'%')
      WHERE d.ticker != 'O'
        AND d.source != 7
        AND d.source != 6
        AND n.item IS NULL
      GROUP BY d.comment
      ORDER BY d.comment
    

    Basically, for every row in the data table, we’re checking for a “match” in the non_informative table. For any row where we find a “match”, that row will be excluded by the “n.item IS NULL” predicate. For any row from data where it doesn’t find a matching row in non_informative, the LEFT JOIN operation will generate a NULL value for the “item” column, so the row will be included in the resultset.
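
    To see the mechanism, it can help to run the LEFT JOIN without the IS NULL filter and look at what it produces. The sample_data and sample_items tables and their rows below are hypothetical, chosen only to show one matching and one non-matching comment.

    ```sql
    -- Hypothetical sample tables standing in for data and non_informative.
    CREATE TEMPORARY TABLE sample_data  (comment VARCHAR(255));
    CREATE TEMPORARY TABLE sample_items (item    VARCHAR(255));

    INSERT INTO sample_data  VALUES ('#InsideTrades #4 | Filed Form 4'),
                                    ('Earnings call moved to Friday');
    INSERT INTO sample_items VALUES ('#InsideTrades'), ('IIROC Trade');

    -- Without the filter, the join shows which item (if any) each comment matched:
    --   '#InsideTrades #4 | Filed Form 4'  paired with '#InsideTrades'
    --   'Earnings call moved to Friday'    paired with NULL (no item matched)
    SELECT d.comment, n.item
      FROM sample_data d
      LEFT
      JOIN sample_items n
        ON d.comment LIKE CONCAT('%', n.item, '%');

    -- Adding the IS NULL predicate keeps only the unmatched comment.
    SELECT d.comment
      FROM sample_data d
      LEFT
      JOIN sample_items n
        ON d.comment LIKE CONCAT('%', n.item, '%')
     WHERE n.item IS NULL;
    ```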


    PERFORMANCE:

    Your original query includes an inline view (aliased as T). Depending on the version, MySQL may materialize that as an intermediate temporary table (a MyISAM table in older versions) before the outer query runs. That kind of thing can be a real performance killer with large tables.

    But before we “tune” that statement, we really need a statement that returns a correct resultset. (There’s no sense in re-writing that statement if it doesn’t return the desired resultset, except as an exercise.)
