    BUG#13985 ndb_mgm "status" command can return incorrect data node status · 3cea3705
    unknown authored
    Second half of the fix for this bug.
    
    This patch forces a heartbeat to be sent and then waits (a little while)
    for replies. This way we can get
    
    > all status
    X starting
    Y started
    X started
    >
    
    which is okay, as the new status always comes after the old status.
    There remains the slimmest of chances of getting output like the above, where
    only half the cluster appears started.
    
    This is about the best we can do with a command line interactive program.
    
    
    ndb/src/mgmsrv/MgmtSrvr.cpp:
      Add updateStatus method to MgmtSrvr.
      
      Used to force an update of node status for the nodes.
    ndb/src/mgmsrv/MgmtSrvr.hpp:
      Add prototype for the updateStatus(NodeBitmask) method.
    ndb/src/mgmsrv/Services.cpp:
      When status is queried, force an update of the status in the mgm server (i.e. send heartbeats).
    ndb/src/ndbapi/ClusterMgr.cpp:
      New DEBUG_REG define for debugging registration and HB code.
      
      Add ClusterMgr::forceHB(NodeBitmask), which sends a HB signal to each node in
      the bitmask and then waits for a REGCONF from each of them.
      It waits for at most 1 second in total, so an end client is not blocked for too long.
      
      On receipt of HB, clear the nodeId in the waiting-for bitmask and signal any
      waiting threads.
    ndb/src/ndbapi/ClusterMgr.hpp:
      Add ::forceHB(NodeBitmask) and associated variables
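
The forceHB behaviour described above (send to every node in the bitmask, wait for replies, give up after a bounded total time) can be sketched roughly as follows. This is a minimal illustration and not the code in ClusterMgr.cpp: HeartbeatWaiter, NodeSet, kMaxNodes, onReply and the sendHeartbeat callback are hypothetical names, std::bitset stands in for NodeBitmask, and the standard library mutex and condition variable stand in for whatever primitives the NDB sources use.

```cpp
#include <bitset>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <mutex>

// Hypothetical stand-in for NodeBitmask; the size is an assumption.
constexpr std::size_t kMaxNodes = 64;
using NodeSet = std::bitset<kMaxNodes>;

// Hypothetical helper illustrating the "force a heartbeat, wait briefly" idea.
class HeartbeatWaiter {
public:
  // Sends a heartbeat to every node in `nodes` via `sendHeartbeat`, then
  // blocks until all of them have replied or `timeout` has elapsed in total.
  // Returns the nodes that had not replied when the wait ended.
  template <class SendFn>
  NodeSet forceHB(const NodeSet& nodes, SendFn sendHeartbeat,
                  std::chrono::milliseconds timeout = std::chrono::seconds(1)) {
    std::unique_lock<std::mutex> lock(mutex_);
    waitingFor_ = nodes;

    // Send without holding the lock so a reply may arrive (even
    // synchronously) while the remaining heartbeats are still going out.
    lock.unlock();
    for (std::size_t id = 0; id < kMaxNodes; ++id) {
      if (nodes.test(id)) sendHeartbeat(id);
    }
    lock.lock();

    // The predicate form bounds the total wait and ignores spurious wakeups.
    cond_.wait_for(lock, timeout, [this] { return waitingFor_.none(); });

    NodeSet missing = waitingFor_;
    waitingFor_.reset();
    return missing;
  }

  // Called when a reply from `nodeId` is received: clear its bit in the
  // waiting-for set and wake any waiting thread once nothing is pending.
  void onReply(std::size_t nodeId) {
    assert(nodeId < kMaxNodes);
    std::lock_guard<std::mutex> lock(mutex_);
    waitingFor_.reset(nodeId);
    if (waitingFor_.none()) cond_.notify_all();
  }

private:
  std::mutex mutex_;
  std::condition_variable cond_;
  NodeSet waitingFor_;
};
```

The sketch assumes only one caller uses forceHB at a time; overlapping calls would overwrite each other's waiting-for set. Returning the set of silent nodes is a choice made for this illustration so a caller could tell which statuses may still be stale.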
ClusterMgr.cpp 23.2 KB