MySQL multi-dependent subqueries, painfully slow

I have a working query that fetches the data I need, but unfortunately it is very slow (takes 3 minutes). I have the indexes in place, but I think the problem is with multiple dependent subqueries. I've tried to rewrite the query using joins, but I can't seem to get it to work. Any help would be greatly appreciated.

Tables:

Basically, I have 2 tables. The former (prices) contain the prices of items in the store. Each line is the item's price per day, and new lines are added every day with an updated price.

The second table (watches_US) contains position information (name, description, etc.).

CREATE TABLE `prices` (
`prices_id` int(11) NOT NULL auto_increment,
`prices_locale` enum('CA','DE','FR','JP','UK','US') NOT NULL default 'US',
`prices_watches_ID` char(10) NOT NULL,
`prices_date` datetime NOT NULL,
`prices_am` varchar(10) default NULL,
`prices_new` varchar(10) default NULL,
`prices_used` varchar(10) default NULL,
PRIMARY KEY  (`prices_id`),
KEY `prices_am` (`prices_am`),
KEY `prices_locale` (`prices_locale`),
KEY `prices_watches_ID` (`prices_watches_ID`),
KEY `prices_date` (`prices_date`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8 AUTO_INCREMENT=61764 ;

CREATE TABLE `watches_US` (
`watches_ID` char(10) NOT NULL,
`watches_date_added` datetime NOT NULL,
`watches_last_update` datetime default NULL,
`watches_title` varchar(255) default NULL,
`watches_small_image_height` int(11) default NULL,
`watches_small_image_width` int(11) default NULL,
`watches_description` text,
PRIMARY KEY  (`watches_ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;   

      

The request returns the last 10 price changes within 30 hours, ordered by price change size. So I have sub queries to get the newest price, oldest price for 30 hours, and then calculate the price change.

Here's the request:

SELECT watches_US.*, prices.*, watches_US.watches_ID as current_ID,
    ( SELECT prices_am FROM prices WHERE prices_watches_ID = current_ID AND prices_locale = 'US' ORDER BY prices_date DESC LIMIT 1 ) as new_price, 
    ( SELECT prices_date FROM prices WHERE prices_watches_ID = current_ID AND prices_locale = 'US' ORDER BY prices_date DESC LIMIT 1 ) as new_price_date, 
    ( SELECT prices_am FROM prices WHERE ( prices_watches_ID = current_ID AND prices_locale = 'US') AND ( prices_date >= DATE_SUB(new_price_date,INTERVAL 30 HOUR) ) ORDER BY prices_date ASC LIMIT 1 ) as old_price,
    ( SELECT ROUND(((new_price - old_price)/old_price)*100,2) ) as percent_change,
    ( SELECT (new_price - old_price) ) as absolute_change
FROM watches_US 
LEFT OUTER JOIN prices ON prices.prices_watches_ID = watches_US.watches_ID 
WHERE ( prices_locale = 'US' )
AND ( prices_am IS NOT NULL )
AND ( prices_am != '' )
HAVING ( old_price IS NOT NULL )
AND ( old_price != 0 )
AND ( old_price != '' )
AND ( absolute_change < 0 )
AND ( prices.prices_date = new_price_date )
ORDER BY absolute_change ASC
LIMIT 10

      

How would I rewrite this to use joins instead, or else optimize this so it doesn't take more than 3 minutes to get the result? Any help would be greatly appreciated!

Thank you.

UPDATE

Using the answers below, I got a request before, which takes 2 seconds to run:

SELECT watches_US.*, prices.*,
    ( SELECT prices_am FROM prices prices2 WHERE ( prices2.prices_watches_ID = watches_US.watches_ID AND prices2.prices_locale = 'US') AND ( prices2.prices_date >= DATE_SUB(prices.prices_date,INTERVAL 30 HOUR) ) ORDER BY prices2.prices_date ASC LIMIT 1 ) as old_price,
    ( SELECT ROUND(((prices.prices_am - old_price)/old_price)*100,2) ) as percent_change,
    ( SELECT (prices.prices_am - old_price) ) as absolute_change
FROM watches_US 
LEFT OUTER JOIN prices ON prices.prices_watches_ID = watches_US.watches_ID AND prices.prices_locale = 'US'
WHERE ( prices.prices_am IS NOT NULL )
AND ( prices.prices_am != '' )
AND ( prices.prices_date IN (SELECT MAX(prices_date) FROM prices WHERE prices_watches_ID = watches_US.watches_ID AND prices_locale = 'US' ) )
HAVING ( old_price IS NOT NULL )
AND ( old_price != 0 )
AND ( old_price != '' )
AND ( absolute_change < 0 )
ORDER BY absolute_change ASC
LIMIT 10

      

It might still be able to work with some work, but it can be used as is. Thanks everyone for your help!

+2


a source to share


3 answers


Here's a partial idea:

SELECT watches_US.*, prices.*, watches_US.watches_ID as current_ID,
    prices2.prices_am as new_price, 
    prices2.prices_date as new_price_date, 
    ( SELECT prices_am FROM prices WHERE ( prices_watches_ID = current_ID AND prices_locale = 'US') AND ( prices_date >= DATE_SUB(new_price_date,INTERVAL 30 HOUR) ) ORDER BY prices_date ASC LIMIT 1 ) as old_price,
    ( SELECT ROUND(((new_price - old_price)/old_price)*100,2) ) as percent_change,
    ( SELECT (new_price - old_price) ) as absolute_change
FROM watches_US 
LEFT OUTER JOIN prices ON prices.prices_watches_ID = watches_US.watches_ID 
LEFT OUTER JOIN prices prices2 ON prices2.prices_watches_ID = watches_US.watches_ID 
WHERE ( prices_locale = 'US' )
AND ( prices_am IS NOT NULL )
AND ( prices_am != '' )
AND ( prices2.prices_date IN (SELECT MAX(price_date) FROM prices WHERE prices_watches_ID = watches_US.watches_ID AND prices_locale = 'US' )
HAVING ( old_price IS NOT NULL )
AND ( old_price != 0 )
AND ( old_price != '' )
AND ( absolute_change < 0 )
AND ( prices.prices_date = new_price_date )
ORDER BY absolute_change ASC
LIMIT 10

      



The changes are the second pricing join used to get new_price and new_price_date with a WHERE clause to select only the most recent record. Perhaps you could clean it up a bit, but I wanted to get it.

0


a source


There are several problems with this SQL:

  • You are running the same query multiple times:

    (SELECT prices_am FROM prices WHERE prices_watches_ID = current_ID AND prices_locale = 'US' ORDER BY prices_date DESC LIMIT 1) as new_price, (SELECT prices_date FROM prices WHERE prices_watches_ID = current_ID AND prices_locale = 'US' ORDER BY prices_date DESC LIMIT



You only have to run the query once, give it a name, and select multiple columns, eq SELECT ... sub1.prices_am, sub1.prices_date FROM ... SELECT () sub1

If I'm not mistaken.

  • Do not use for any reason HAVING

    . It kills your performance as it forces the database to fetch all the rows in your query and then filter some of them as described in the description HAVING

    .
0


a source


I would start by saying you have numeric values ​​where you do comparisons and expressions. Any index that includes type conversion will be non-functional. Your prices are barchers.

0


a source







All Articles