I have a four-machine MySQL Cluster (NDB) laid out as follows:
[ndbd(NDB)] 2 node(s)
id=1 @19.85.1.183 (mysql-5.6.23 ndb-7.4.5, Nodegroup: 0)
id=2 @19.85.1.165 (mysql-5.6.23 ndb-7.4.5, Nodegroup: 0, *)
[ndb_mgmd(MGM)] 2 node(s)
id=49 @19.85.1.167 (mysql-5.6.23 ndb-7.4.5)
id=52 @19.85.1.184 (mysql-5.6.23 ndb-7.4.5)
[mysqld(API)] 3 node(s)
id=50 (not connected, accepting connect from 129.85.128.167)
id=55 @19.85.1.167 (mysql-5.6.23 ndb-7.4.5)
id=56 @19.85.1.184 (mysql-5.6.23 ndb-7.4.5)
If I reboot either data node machine (id=1 or id=2), I cannot run any SQL queries until that machine comes back up. Here are some of the errors:
ERROR 1297 (HY000): Got temporary error 4028 'Node failure caused abort of transaction' from NDBCLUSTER
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
If I first stop the data node in ndb_mgm (1 STOP or 2 STOP) and then reboot the machine, there is no issue.
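To make the stop-then-reboot sequence concrete, here is a rough sketch of the graceful stop step. The helper names `build_stop_command` and `stop_data_node` are hypothetical, made up for illustration. It assumes `ndb_mgm` is on the PATH and can reach a management server with its default connect string. It defaults to a dry run, so it only runs the command when asked to.

```python
import subprocess
from typing import List


def build_stop_command(node_id: int) -> List[str]:
    """Build the ndb_mgm invocation that asks one data node to stop gracefully."""
    # Equivalent to typing "<node_id> STOP" at the ndb_mgm prompt.
    return ["ndb_mgm", "-e", f"{node_id} STOP"]


def stop_data_node(node_id: int, dry_run: bool = True) -> List[str]:
    """Return the stop command; execute it only when dry_run is False."""
    cmd = build_stop_command(node_id)
    if not dry_run:
        # Assumption: ndb_mgm is installed and the management server is reachable.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Dry run: show what would be executed before rebooting data node 1.
    print(" ".join(stop_data_node(1)))
```

Running it with `dry_run=False` on a host that can reach the management node would issue the same `1 STOP` I type by hand before the reboot.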
Is this the expected behavior? If a data node machine crashes or someone pulls the plug on it, will the cluster be inoperable until the machine comes back online? Or am I doing something wrong?
Thanks in advance!