One Server crashing often / different hardware setup

Hi,

 

We run a cluster with three Vertica nodes and the following hardware:

 

1. 2 x Intel Xeon 4-Core 2.40 GHz, 96 GB ECC DDR3 RAM

2. 2 x Intel Xeon 4-Core 2.40 GHz, 96 GB ECC DDR3 RAM

3. 2 x Intel Xeon 6-Core 2.40 GHz, 96 GB ECC DDR3 RAM

 

Is it possible that the 3rd box crashes a lot because of the different CPU?

 

I can't find anything helpful in vertica.log. The last line before the last crash was:

 

2015-11-14 07:50:06.759 DistCall Dispatch:0x7f0b5c248c40 [Txn] <INFO> Commit Complete: Txn: a00000014e4fb0 at epoch 0xc7fb76

Comments

  • Pelle,

       It is certainly possible that the different configuration in node 3 is causing issues.  When you say the box crashes a lot, are we talking about Vertica on the box crashing, or the entire box?

     

    -Chris

  • Hey Chris,

     

    Thanks for your reply, and sorry for being so unspecific. It's the Vertica instance crashing, not the entire server.

     

    Cheers

  • Pelle,

       Ok, in that case, what I would try first is limiting the number of cores in node 3 to match the number of cores in nodes 1 and 2.

     

    Here is an article I found online that walks you through that - http://www.absolutelytech.com/2011/08/01/how-to-disable-cpu-cores-in-linux/ 

     

    Typically we don't do a lot of testing on non-homogeneous configurations, so this will be interesting! :)

     

    -Chris
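
To make the suggestion above concrete: on Linux, a logical CPU can usually be taken offline by writing `0` to `/sys/devices/system/cpu/cpuN/online` (root required, and the kernel must support CPU hotplug). The following is only a rough sketch in Python, not something from the linked article or from Vertica. The function names are made up for illustration, and the target of 8 CPUs assumes 2 x 4 cores without hyper-threading on nodes 1 and 2; with hyper-threading enabled the logical CPU count would be higher. It simply keeps the lowest-numbered CPUs and does not try to balance the two sockets.

```python
from pathlib import Path

# Standard Linux sysfs location for per-CPU control files.
SYSFS_CPU = Path("/sys/devices/system/cpu")


def parse_cpu_list(text):
    """Parse the kernel's cpulist format, e.g. '0-3,8-11', into a list of ints."""
    ids = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            ids.extend(range(int(low), int(high) + 1))
        else:
            ids.append(int(part))
    return ids


def cpus_to_offline(online_ids, keep):
    """Return the CPU ids to take offline so that at most `keep` stay online.

    The lowest-numbered CPUs are kept, since cpu0 often cannot be offlined.
    """
    if keep < 1:
        raise ValueError("at least one CPU must stay online")
    return sorted(online_ids)[keep:]


def offline_cpus(cpu_ids, dry_run=True):
    """Write '0' to each CPU's 'online' file; with dry_run, only print the plan."""
    for cpu in cpu_ids:
        path = SYSFS_CPU / f"cpu{cpu}" / "online"
        if dry_run:
            print(f"would write 0 to {path}")
        else:
            path.write_text("0")


def plan_for_this_host(keep=8):
    """Read the currently online CPUs and return the ids that would be offlined."""
    online = parse_cpu_list((SYSFS_CPU / "online").read_text())
    return cpus_to_offline(online, keep)
```

On node 3, something like `offline_cpus(plan_for_this_host(keep=8), dry_run=True)` would print the planned changes without touching anything. The setting does not survive a reboot, so it is easy to undo if it makes no difference.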

  • Thanks, I'll try that right now :-)

  • I'm hoping that no news is good news.  How did restricting the number of CPUs in node 3 work out?

     

    -Chris

  • Hi Chris,

     

    Sorry for getting back so late... I found out that all three nodes had different CPUs, so we decided to replace the hardware with identical stuff. It seems to be stable since.

     

    Thanks for your help :-)
