Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • Home
  • SEARCH
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 776853
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 14, 20262026-05-14T19:31:01+00:00 2026-05-14T19:31:01+00:00

Code The following code calculates the slope and intercept for a linear regression against

  • 0

Code

The following code calculates the slope and intercept for a linear regression against a slathering of data. It then applies the equation y = mx + b against the same result set to calculate the value of the regression line for each row.

How can the two queries be joined so that the data and its slope/intercept are calculated without executing the WHERE clause twice?

The general form of the problem is:

SELECT a.group, func(a.group, avg_avg)
FROM a
    (SELECT AVG(field1_avg) as avg_avg
     FROM (SELECT a.group, AVG(field1) as field1_avg
           FROM a
           WHERE (SOME_CONDITION)
           GROUP BY a.group) as several_lines -- potentially
    ) as one_line -- always
WHERE (SOME_CONDITION)
GROUP BY a.group -- again, potentially several lines

I have SOME_CONDITION executing twice. This is shown below (updated with a STRAIGHT_JOIN optimization):

SELECT STRAIGHT_JOIN
  AVG(D.AMOUNT) as AMOUNT,
  Y.YEAR * ymxb.SLOPE + ymxb.INTERCEPT as REGRESSION_LINE,
  Y.YEAR as YEAR,
  MAKEDATE(Y.YEAR,1) as AMOUNT_DATE,
  ymxb.SLOPE,
  ymxb.INTERCEPT,
  ymxb.CORRELATION,
  ymxb.MEASUREMENTS
FROM
  CITY C,
  STATION S,
  STATION_DISTRICT SD,
  YEAR_REF Y,
  MONTH_REF M,
  DAILY D,
  (SELECT
    SUM(MEASUREMENTS) as MEASUREMENTS,

    ((sum(t.YEAR) * sum(t.AMOUNT)) - (count(1) * sum(t.YEAR * t.AMOUNT))) /
    (power(sum(t.YEAR), 2) - count(1) * sum(power(t.YEAR, 2))) as SLOPE,

    ((sum( t.YEAR ) * sum( t.YEAR * t.AMOUNT )) -
    (sum( t.AMOUNT ) * sum(power(t.YEAR, 2)))) /
    (power(sum(t.YEAR), 2) - count(1) * sum(power(t.YEAR, 2))) as INTERCEPT,

    ((avg(t.AMOUNT * t.YEAR)) - avg(t.AMOUNT) * avg(t.YEAR)) /
    (stddev( t.AMOUNT ) * stddev( t.YEAR )) as CORRELATION
  FROM (
    SELECT STRAIGHT_JOIN
      COUNT(1) as MEASUREMENTS,
      AVG(D.AMOUNT) as AMOUNT,
      Y.YEAR as YEAR
    FROM
      CITY C,
      STATION S,
      STATION_DISTRICT SD,
      YEAR_REF Y,
      MONTH_REF M,
      DAILY D
    WHERE
      -- For a specific city ...
      --
      $X{ IN, C.ID, CityCode } AND

      -- Find all the stations within a specific unit radius ...
      --
      6371.009 *
      SQRT(
        POW(RADIANS(C.LATITUDE_DECIMAL - S.LATITUDE_DECIMAL), 2) +
        (COS(RADIANS(C.LATITUDE_DECIMAL + S.LATITUDE_DECIMAL) / 2) *
         POW(RADIANS(C.LONGITUDE_DECIMAL - S.LONGITUDE_DECIMAL), 2)) ) <= $P{Radius} AND

      SD.ID = S.STATION_DISTRICT_ID AND

      -- Gather all known years for that station ...
      --
      Y.STATION_DISTRICT_ID = SD.ID AND

      -- The data before 1900 is shaky; insufficient after 2009.
      --
      Y.YEAR BETWEEN 1900 AND 2009 AND

      -- Filtered by all known months ...
      --
      M.YEAR_REF_ID = Y.ID AND

      -- Whittled down by category ...
      --
      M.CATEGORY_ID = $P{CategoryCode} AND

      -- Into the valid daily climate data.
      --
      M.ID = D.MONTH_REF_ID AND
      D.DAILY_FLAG_ID <> 'M'
    GROUP BY
      Y.YEAR
  ) t
) ymxb
WHERE
  -- For a specific city ...
  --
  $X{ IN, C.ID, CityCode } AND

  -- Find all the stations within a specific unit radius ...
  --
  6371.009 *
  SQRT(
    POW(RADIANS(C.LATITUDE_DECIMAL - S.LATITUDE_DECIMAL), 2) +
    (COS(RADIANS(C.LATITUDE_DECIMAL + S.LATITUDE_DECIMAL) / 2) *
     POW(RADIANS(C.LONGITUDE_DECIMAL - S.LONGITUDE_DECIMAL), 2)) ) <= $P{Radius} AND

  SD.ID = S.STATION_DISTRICT_ID AND

  -- Gather all known years for that station ...
  --
  Y.STATION_DISTRICT_ID = SD.ID AND

  -- The data before 1900 is shaky; insufficient after 2009.
  --
  Y.YEAR BETWEEN 1900 AND 2009 AND

  -- Filtered by all known months ...
  --
  M.YEAR_REF_ID = Y.ID AND

  -- Whittled down by category ...
  --
  M.CATEGORY_ID = $P{CategoryCode} AND

  -- Into the valid daily climate data.
  --
  M.ID = D.MONTH_REF_ID AND
  D.DAILY_FLAG_ID <> 'M'
GROUP BY
  Y.YEAR

Question

How do I execute the duplicate bits only once per query, instead of twice? The duplicate code:

  $X{ IN, C.ID, CityCode } AND
  6371.009 *
  SQRT(
    POW(RADIANS(C.LATITUDE_DECIMAL - S.LATITUDE_DECIMAL), 2) +
    (COS(RADIANS(C.LATITUDE_DECIMAL + S.LATITUDE_DECIMAL) / 2) *
     POW(RADIANS(C.LONGITUDE_DECIMAL - S.LONGITUDE_DECIMAL), 2)) ) <= $P{Radius} AND
  SD.ID = S.STATION_DISTRICT_ID AND
  Y.STATION_DISTRICT_ID = SD.ID AND
  Y.YEAR BETWEEN 1900 AND 2009 AND
  M.YEAR_REF_ID = Y.ID AND
  M.CATEGORY_ID = $P{CategoryCode} AND
  M.ID = D.MONTH_REF_ID AND
  D.DAILY_FLAG_ID <> 'M'
GROUP BY
  Y.YEAR

Update 1

Using variables and splitting the query seems to allow the cache to kick in as this now runs in 3.5 seconds, whereas it used to run in 7. Still, if there is any way to remove the duplicate code, I’d be grateful for any help.


Update 2

The above code does not run in JasperReports, and a VIEW, while a possible fix, would probably be extremely inefficient (because the WHERE clauses are parameterized).

Update 3

Validating distance using Unreason’s suggestion of the Pythagorean formula with converging meridians:

  6371.009 *
  SQRT(
    POW(RADIANS(C.LATITUDE_DECIMAL - S.LATITUDE_DECIMAL), 2) +
    (COS(RADIANS(C.LATITUDE_DECIMAL + S.LATITUDE_DECIMAL) / 2) *
    POW(RADIANS(C.LONGITUDE_DECIMAL - S.LONGITUDE_DECIMAL), 2)) )

(This is unrelated to the question, but should someone else want to know …)

Update 4

The code, as shown, works in JasperReports, running against a MySQL database. JasperReports does not allow variables or multiple queries.

Update 5

Am looking for a solution that executes cleanly. 😉 I have written a number of partially working solutions, but MySQL, sadly, does not understand partially correct. See the discussions with Unreason for answers that almost work.

Update 6

I might be able to reuse variables from the first WHERE clause and compare them to the second (thereby eliminating some duplication — the checks against $P{} values), but I’d really like the duplication eliminated.

Update 7

Comparing the YEAR clause, as hypothesized in the previous update, to eliminate the duplicate BETWEEN, does not work.

Related

How to eliminate duplicate calculation in SQL?

Thank you!

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-14T19:31:02+00:00Added an answer on May 14, 2026 at 7:31 pm

    You should be able to get everything you need in one go:

     SELECT
        AVG(D.AMOUNT) as AMOUNT,
        Y.YEAR as YEAR,
        MAKEDATE(Y.YEAR,1) as AMOUNT_DATE,
        Y.YEAR * ymxb.SLOPE + ymxb.INTERCEPT as REGRESSION_LINE,             
        ((avg(AVG(D.AMOUNT) * Y.YEAR)) - avg(AVG(D.AMOUNT)) * avg(Y.YEAR)) /                  
        (stddev( AVG(D.AMOUNT) ) * stddev( Y.YEAR )) as CORRELATION,                     
        ((sum(Y.YEAR) * sum(AVG(D.AMOUNT))) - (count(1) * sum(Y.YEAR * AVG(D.AMOUNT)))) /
        (power(sum(Y.YEAR), 2) - count(1) * sum(power(Y.YEAR, 2))) as SLOPE,   
        ((sum( Y.YEAR ) * sum( Y.YEAR * AVG(D.AMOUNT) )) -
        (sum( AVG(D.AMOUNT) ) * sum(power(Y.YEAR, 2)))) / 
        (power(sum(Y.YEAR), 2) - count(1) * sum(power(Y.YEAR, 2))) as INTERCEPT
     FROM
        CITY C,
        STATION S,
        YEAR_REF Y,
        MONTH_REF M,
        DAILY D
     WHERE
        $X{ IN, C.ID, CityCode } AND
        SQRT(
            POW( C.LATITUDE - S.LATITUDE, 2 ) +
            POW( C.LONGITUDE - S.LONGITUDE, 2 ) ) < $P{Radius} AND
        S.STATION_DISTRICT_ID = Y.STATION_DISTRICT_ID AND
        Y.YEAR BETWEEN 1900 AND 2009 AND
        M.YEAR_REF_ID = Y.ID AND
        M.CATEGORY_ID = $P{CategoryCode} AND
        M.ID = D.MONTH_REF_ID AND
        D.DAILY_FLAG_ID <> 'M'
     GROUP BY
        Y.YEAR
    

    The things will not work straight from the query above (it has nonsensically combined aggregates and other errors); this can be a good time to check your formulas

    If you decide to do sub queries do simplify the formulas, then:

    • you can grab (you do grab) all the necessary data in the inner most query and you don’t have to repeat all the tables in the outer queries any more (just select the relevant columns from the t, they are already at your disposal)
    • you don’t have to repeat the where condition
    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Ask A Question

Stats

  • Questions 493k
  • Answers 493k
  • Best Answers 0
  • User 1
  • Popular
  • Answers
  • Editorial Team

    How to approach applying for a job at a company ...

    • 7 Answers
  • Editorial Team

    What is a programmer’s life like?

    • 5 Answers
  • Editorial Team

    How to handle personal stress caused by utterly incompetent and ...

    • 5 Answers
  • Editorial Team
    Editorial Team added an answer in my WPF app, I was able to solve it… May 16, 2026 at 10:51 am
  • Editorial Team
    Editorial Team added an answer In tableView:didSelectRowAtIndexPath: you are referencing a navigationController by self.navigationController, that… May 16, 2026 at 10:51 am
  • Editorial Team
    Editorial Team added an answer It would be something like that: PreparedStatement stmt = cnt.prepareStatement("...");… May 16, 2026 at 10:51 am

Trending Tags

analytics british company computer developers django employee employer english facebook french google interview javascript language life php programmer programs salary

Top Members

Related Questions

I have the following code in SQL (2005) which calculates the avarage user logins
the following code compiles fine under gcc: class vec3 { private: float data[3]; public:
The following code is in Haskell. How would I write similar function in C#?
The following code has been created from a different post, but I now have
Can anyone please explain what the following code checks for? I can make no
i have question how write program which calculates following procedures http://en.wikipedia.org/wiki/Tetration i have exponential
The following code puts a cube at (0, 0, 0) and another at (0,
The following code does not generate a compiler warning in D6. Can I make
The following code is from a NSTableViewDataSource where I'm trying to impliement drag and
In the following code, How can I check for null reference exception in a

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.