Hi All,
I'm trying to sort out an issue when testing an "--initial" restart of a data node. It's a two-node cluster, where each node has both ndbmtd and mysqld, and read_backup is enabled. ndb_mgmd is running on two other machines. Here is a snippet from the log:
"""
2017-10-05 09:05:34 [ndbd] INFO -- NDB start phase 3 completed
2017-10-05 09:05:34 [ndbd] INFO -- Start phase 4 completed
2017-10-05 09:05:34 [ndbd] INFO -- Phase 4 continued preparations of the REDO log
2017-10-05 09:05:34 [ndbd] INFO -- Request copying of distribution and dictionary information from master Starting
2017-10-05 09:05:38 [ndbd] INFO -- Copying of dictionary information from master Starting
2017-10-05 09:06:23 [ndbd] INFO -- Copying of dictionary information from master Completed
2017-10-05 09:06:23 [ndbd] INFO -- Request copying of distribution and dictionary information from master Completed
2017-10-05 09:06:23 [ndbd] INFO -- NDB start phase 4 completed
2017-10-05 09:06:23 [ndbd] INFO -- Start NDB start phase 5 (only to DBDIH)
2017-10-05 09:06:23 [ndbd] INFO -- Restore Database Off-line Starting
job buffer full
Dumping non-empty job queues:
job buffer 0 --> 1, used 31 FULL!
2017-10-05 09:06:23 [ndbd] INFO -- Received signal 6. Running error handler.
2017-10-05 09:06:23 [ndbd] INFO -- Signal 6 received; Aborted
2017-10-05 09:06:23 [ndbd] INFO -- /export/home2/pb2/build/sb_1-23963488-1498206357.67/rpm/BUILD/mysql-cluster-gpl-7.5.7/mysql-cluster-gpl-7.5.7/storage/ndb/src/kernel/ndbd.cpp
2017-10-05 09:06:23 [ndbd] INFO -- Error handler signal shutting down system
2017-10-05 09:06:23 [ndbd] INFO -- Error handler shutdown completed - exiting
2017-10-05 09:06:24 [ndbd] ALERT -- Node 22: Forced node shutdown completed. Occured during startphase 5. Initiated by signal 6. Caused by error 6000: 'Error OS signal received(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.
"""
Hunting around, I found Bug #83785, in which the [3 May 2017] comment from Johan Andersson matches my issue. The report says the bug is fixed in 7.5.8; however, I installed from the yum repo, which only offers 7.5.7.
My questions are:
* Is there a workaround for 7.5.7?
* Do we know when 7.5.8 will be available via the yum repo?
* Is there more information as to why I would be hitting this bug?
Feel free to request additional information and troubleshooting; I've hit a brick wall and am unsure what to do next.