Hi,
We have been encountering the same problem with our data nodes for a month. We have changed the hardware, but without success.
The data nodes go down after a few hours of uptime with this message:
2012-04-13 21:06:49 [ndbd] INFO -- Received signal 8. Running error handler.
2012-04-13 21:06:49 [ndbd] INFO -- Signal 8 received; Floating point exception
2012-04-13 21:06:49 [ndbd] INFO -- /pb2/build/sb_0-5227860-1331719818.23/mysql-cluster-gpl-7.2.5/storage/ndb/src/kernel/ndbd.cpp
2012-04-13 21:06:49 [ndbd] INFO -- Error handler signal shutting down system
2012-04-13 21:06:49 [ndbd] INFO -- Error handler shutdown completed - exiting
2012-04-13 21:06:52 [ndbd] ALERT -- Node 3: Forced node shutdown completed. Initiated by signal 8. Caused by error 6000: 'Error OS signal received(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.
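To compare shutdown times across nodes, the ALERT lines can be pulled out of the node log with a short script. This is only a minimal sketch: it assumes every forced-shutdown line follows the exact layout of the ALERT line above, and the helper name `find_forced_shutdowns` is made up for illustration.

```python
import re

# Matches the forced-shutdown ALERT line as it appears in the log above.
# Assumption: other shutdown lines use the same layout.
SHUTDOWN_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[ndbd\] ALERT\s+-- "
    r"Node (?P<node>\d+): Forced node shutdown completed\. "
    r"Initiated by signal (?P<signal>\d+)\. "
    r"Caused by error (?P<error>\d+):"
)


def find_forced_shutdowns(lines):
    """Return one dict per forced-shutdown ALERT line (hypothetical helper)."""
    events = []
    for line in lines:
        match = SHUTDOWN_RE.match(line.strip())
        if match:
            events.append({
                "timestamp": match.group("ts"),
                "node": int(match.group("node")),
                "signal": int(match.group("signal")),
                "error": int(match.group("error")),
            })
    return events


# Two lines copied from the log above: one INFO line (ignored) and the ALERT line.
sample = [
    "2012-04-13 21:06:49 [ndbd] INFO -- Received signal 8. Running error handler.",
    "2012-04-13 21:06:52 [ndbd] ALERT -- Node 3: Forced node shutdown completed. "
    "Initiated by signal 8. Caused by error 6000: 'Error OS signal received"
    "(Internal error, programming error or missing error message, please report a bug). "
    "Temporary error, restart node'.",
]

for event in find_forced_shutdowns(sample):
    print(event)
```

To run it against a real log, pass an open file object instead of `sample`; the function only needs an iterable of lines.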
There are two data node servers and another server for the MySQL API and ndb_mgm. (The database size is above 9 GB.)
The config:
[NDBD DEFAULT]
NoOfReplicas=2
DataMemory=16000MB
IndexMemory=2500MB
#StopOnError=1
LockPagesInMainMemory=1
MaxNoOfTables=9096
MaxNoOfOrderedIndexes=512
MaxNoOfUniqueHashIndexes=256
MaxNoOfConcurrentOperations=2000000
MaxNoOfLocalOperations=2200000
MaxNoOfAttributes=10000
#MaxNoOfExecutionThreads=8
#TimeBetweenLocalCheckpoints=8
#TimeBetweenWatchDogCheck=10000
#TimeBetweenEpochs=200
#DiskCheckpointSpeed=25M
FragmentLogFileSize=256M
InitFragmentLogFiles=SPARSE
NoOfFragmentLogFiles=99
RedoBuffer=512M
RedoOverCommitLimit=0
RedoOverCommitCounter=0
ODirect=1
TransactionDeadlockDetectionTimeout=30000
RealtimeScheduler=1
SchedulerExecutionTimer=80
SchedulerSpinTimer=400
#DiskPageBufferMemory=1500M
#SharedGlobalMemory=2000M
[TCP DEFAULT]
SendBufferMemory=4M
ReceiveBufferMemory=4M
[NDB_MGMD]
NodeId=1 # the NDB Management Node (this one)
HostName=XX.XXX.XXX.X1
[NDBD]
NodeId=2 # the first NDB Data Node
HostName=XXX.XX.XXX.X2
DataDir= /srv/mysql-cluster/
[NDBD]
NodeId=3 # the second NDB Data Node
HostName=XXX.XX.XXX.X3
DataDir=/srv/mysql-cluster/
[MYSQLD]
DefaultOperationRedoProblemAction=QUEUE
[MYSQLD]
DefaultOperationRedoProblemAction=QUEUE
We observe this problem when we import a dump or when we use a PHP script to insert new rows.
Any ideas?