Quantcast
Channel: MySQL Forums - NDB clusters
Viewing all articles
Browse latest Browse all 1562

Inner Join significantly slower then subselect on ndbmtd (no replies)

$
0
0
I have two data nodes and found out a strange behaviour:

I have a table raw_data with a column user_id. raw_data has approx. 300.000.000 rows with 417.000 different user_id's.


Now when I run similar queries with very different results. There is a key on "date" which results in 11.900.000 rows for the selection below:

1. Only the raw_data table:

select user_id, count(*)
from raw_data
where date between '2014-03-01' and '2014-03-05'
group by user_id
order by count(*)
limit 100;

Execution time is approx. 15 seconds.

2. Now I want to join the user table in that query:

select user.name, user_id, count(*)
from raw_data rd inner join user u on rd.user_id = u.id
where date between '2014-03-01' and '2014-03-05'
group by user_id
order by count(*)
limit 100;

Execution time is more then 20 minutes.

This is the output of the explain statement:

+----+-------------+-------+--------+---------------------------+---------+---------+------------------------+----------+-------------------------------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------------------+---------+---------+------------------------+----------+-------------------------------------------------------------------------------+
| 1 | SIMPLE | i | range | search,date,user_id_id | date | 3 | NULL | 11905284 | Using where with pushed condition; Using MRR; Using temporary; Using filesort |
| 1 | SIMPLE | a | eq_ref | PRIMARY | PRIMARY | 4 | report.d.user_id | 1 | Using where |
+----+-------------+-------+--------+---------------------------+---------+---------+------------------------+----------+-------------------------------------------------------------------------------+

3. I used a subselect from try number 1 to join user table and this was as fast as try number 1:

select u.name, user_id, hits
from (select user_id, count(*) as hits
from raw_data
where date between '2014-03-01' and '2014-03-05'
group by user_id
order by count(*)
limit 100 ) t1
inner join user u on u.id = t1.user_id;

Execution time is the same as in query 1, approx. 15 seconds.

So why is the join that much slower?

Viewing all articles
Browse latest Browse all 1562

Trending Articles