Hi all,
I have a legacy app running on MySQL Cluster 7.4.13.
The cluster has 12 data nodes and 12 MySQL (SQL) nodes; each data node has 200 GB of RAM and 32 cores.
I have a problem with a query that takes a long time to execute against a very large table (>1B rows).
The query takes almost 30 seconds:
mysql> select sum(result) from my_table where search_field=x'0abcde';
+-------------+
| sum(result) |
+-------------+
|    20000000 |
+-------------+
1 row in set (29.53 sec)
This is the table:
mysql> show create table my_table\G
*************************** 1. row ***************************
Table: my_table
Create Table: CREATE TABLE `my_table` (
`search_field` binary(20) NOT NULL,
`field1` bigint(20) unsigned NOT NULL,
`result` bigint(20) NOT NULL,
`field2` varbinary(10240) NOT NULL DEFAULT '',
PRIMARY KEY (`search_field`,`field1`),
KEY `key1` (`field1`)
) ENGINE=ndbcluster DEFAULT CHARSET=latin1
/*!50100 PARTITION BY KEY (`search_field`) */
1 row in set (0.00 sec)
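If I understand KEY partitioning right, all rows sharing one search_field value should land in a single partition, so the read should not fan out across the whole cluster. I believe EXPLAIN PARTITIONS (available in the 5.6 base of Cluster 7.4) can confirm which partition is hit:
mysql> explain partitions select sum(result) from my_table
    ->  where search_field=x'0abcde'\G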
This is the size of the table:
mysql> select count(*) from my_table;
+------------+
| count(*) |
+------------+
| 1075174759 |
+------------+
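For context, here is how I believe the per-fragment row distribution can be checked via ndbinfo (memory_per_fragment should exist in 7.4, if I remember right; fq_name has the form dbname/def/tablename):
mysql> select node_id, fragment_num, fixed_elem_count
    ->   from ndbinfo.memory_per_fragment
    ->  where fq_name like '%my_table%';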
EXPLAIN shows that the query is using the primary key:
mysql> explain select sum(result) from my_table where search_field=x'0abcde'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: my_table
type: ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 20
ref: const
rows: 380
Extra: Using where with pushed condition
1 row in set (0.00 sec)
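As far as I know, "Using where with pushed condition" means the filter is evaluated on the data nodes (engine condition pushdown). Note also that the optimizer's rows estimate (380) is far below the real ~790k, though for a single-table ref lookup that probably doesn't change the plan. Out of curiosity, pushdown can be toggled to compare timings (assuming the default 5.6 optimizer_switch flag):
mysql> set optimizer_switch='engine_condition_pushdown=off';
mysql> select sum(result) from my_table where search_field=x'0abcde';
mysql> set optimizer_switch='engine_condition_pushdown=on';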
The set matched by the query is also quite large, so MySQL has to sum ~790k rows:
mysql> select count(result) from my_table where search_field=x'0abcde';
+---------------+
| count(result) |
+---------------+
|        786932 |
+---------------+
1 row in set (22.78 sec)
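One workaround I have been considering (just a sketch; the field1 split point below is made up) is to break the sum into ranges on the second primary-key column, run the pieces in parallel from separate app connections, and add the partial sums client-side:
mysql> select sum(result) from my_table
    ->  where search_field=x'0abcde' and field1 < 1000000;
mysql> select sum(result) from my_table
    ->  where search_field=x'0abcde' and field1 >= 1000000;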
Apart from rewriting the data structure, is there anything we can do?
This is our cluster configuration:
[NDBD default]
BackupDataDir=/var/ndb/cluster/backup
BackupReportFrequency=30
BuildIndexThreads=6
DataDir=/var/ndb/cluster/data
DataMemory=120GB
FragmentLogFileSize=256M
IndexMemory=20GB
LcpScanProgressTimeout=120
LockPagesInMainMemory=1
LongMessageBuffer=128M
MaxBufferedEpochs=300
MaxDiskWriteSpeed=180M
MaxDiskWriteSpeedOtherNodeRestart=200M
MaxNoOfAttributes=20000
MaxNoOfConcurrentOperations=1000000
MaxNoOfConcurrentScans=500
MaxNoOfConcurrentSubOperations=512
MaxNoOfConcurrentTransactions=32768
MaxNoOfExecutionThreads=26
MaxParallelScansPerFragment=512
MemReportFrequency=30
NoOfFragmentLogFiles=64
NoOfFragmentLogParts=12
NoOfReplicas=2
ODirect=1
RedoBuffer=32M
RedoOverCommitCounter=5
RedoOverCommitLimit=60
StartPartialTimeout=600000
StartupStatusReportFrequency=30
TimeBetweenGlobalCheckpoints=10000
TransactionDeadlockDetectionTimeout=30000
TwoPassInitialNodeRestartCopy=TRUE
[TCP default]
SendBufferMemory=64M
ReceiveBufferMemory=64M
[NDB_MGMD default]
LogDestination=SYSLOG:facility=daemon
[API default]
DefaultOperationRedoProblemAction=QUEUE
BatchSize=256
MaxScanBatchSize=192K
BatchByteSize=64K
ExtraSendBufferMemory=12M
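On the SQL-node side, I wonder whether larger batches would cut round trips for the ~790k-row read; the [API default] batch settings above apply to NDB API nodes, while mysqld has its own ndb-batch-size option. A guess we might try in each mysqld's my.cnf (the value is made up, not tested):
[mysqld]
ndb-batch-size=1M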
Thanks!