Changing the IP and host name of a running Vertica cluster after moving it to a new site
Hi Team,
I am currently running a 3-node cluster with the restart policy set to k-safe. All three nodes are hosted on 3 different virtual machines.
Because the servers moved to a new site, I had to reconfigure the IPs to new IP addresses and also change the host names. I remember that some time back I used the re_ip option of admintools (example: admintools -t re_ip -f re_map.txt); it changed the required files automatically and the servers restarted.
I created a re_map.txt file
Observations:
$admintools -t list_all_nodes
,,,
node001,old ip,down,unavailable,
$admintools -t re_ip -f re_map.txt
Parsing Mapping file
Host : Unreachable .......
Note: I have no clue why I am getting this error. I believe this is a UDP or TCP connection, right? Can I check the port to make sure the connection reaches the endpoint?
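One way to answer the port question is a plain TCP connect test from the host where admintools runs. The sketch below is only an illustration: the helper name `tcp_reachable` is hypothetical, and the choice of port 22 assumes that admintools reaches the other hosts over SSH. It checks TCP only, so it says nothing about UDP traffic.

```python
import socket


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened.

    Hypothetical helper for illustration; it is not part of admintools.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False
```

For example, `tcp_reachable("198.51.100.11", 22)` (a placeholder address) would tell you whether the SSH port on a new node IP answers. Any other ports you test should be taken from your own installation rather than assumed.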
In fact, I tried to update the admintools.conf file with the new IP addresses, but that didn't work. When I started the DB on the server, it gave a synchronization error.
I am not sure which files to change to apply the network change manually.
Need some help. Please advise.
Regards,
SM
Answers
Mapping New IP Addresses: https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/AdministratorsGuide/ManageNodes/ReMapIPs/ReMapIPOverview.htm
To run "admintools -t re_ip -f re_map.txt" with a mapping of old to new IP addresses, do I need passwordless SSH as a prerequisite between the nodes, or from the machine where I am running the command?
I have one fundamental doubt: the old IPs are no longer valid on my current system, since they have already been changed to the new IPs in the network settings. Of course, the old IPs are still present in all the Vertica configuration files such as admintools.conf, config.cat and spread.conf, and this is the same on all 3 nodes.
I was under the assumption that "admintools -t re_ip -f re_map.txt" would go and change the configuration files automatically. In fact, this worked for me earlier in a different setup.
Any help in troubleshooting this issue will be greatly appreciated, as I am stuck debugging it. At the very least, I want to be sure that I have understood this tool properly.
Hi,
I am still waiting for a response, as re_ip looks like the right solution for my case. In fact, it worked for me on an old setup, but this time I am having issues on a new setup.
Does it have some internal communication dependencies? Do I need to check some internal communication settings, or the passwordless SSH setup, etc.?
This is my case,
We have 2 sites: 1) DC (active) and 2) DR (passive). My application talks to the active site only. However, the Vertica cluster nodes are copied over to the passive site using VMware storage-level copy at 15-minute intervals.
This means that if I want to switch my traffic to the DR environment, I need to change DR from passive to active mode. In that case, I need to bring up the Vertica cluster in DR, which is in a different network.
I changed the network interfaces of all 3 nodes and used "admintools -t re_ip -f <oldip-newip-mapped file>" to make all the configuration changes required for the new IPs.
I believe it should work.
Please advise, as I am stuck at this point and not able to move forward.
Regards,
SM
Hi Team,
Can someone find the issue? I am just trying to remap the IPs to a new IP set for a 3-node cluster.
admintools -t re_ip -f map.txt
Parsing mapfile ...
Error: Host '10.168.yyy.xxx' in the first column is unknown. Make sure this is a valid host IP address. See the doc or help message (/appdefender/vertica/bin/admintools -t re_ip -h).
Here is the file:
10.168.xxx.yyy 10.168.aaa.bbb
Note: I checked the link below about parsing errors. It says:
A warning occurs if:
Any of the IP addresses is incorrectly formatted
A duplicate old or new IP address exists in the file. For example, 192.0.2.256 appears twice in the old IP set.
https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/AdministratorsGuide/ManageNodes/ReMapIPs/CreateMapFile.htm?tocpath=Administrator's Guide|Managing the Database|Managing Nodes|Mapping New IP Addresses|_____1
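The two conditions quoted above (badly formatted addresses and duplicates) can be checked before running re_ip. This is a minimal sketch, assuming one whitespace-separated "old new" pair per line as in the file shown earlier; the function name `check_map_file` is hypothetical and is not part of admintools. It does not check whether the old address is known to the cluster, which is what the "Host ... is unknown" error is about.

```python
import ipaddress


def check_map_file(text):
    """Return a list of problems found in the text of a re_ip map file.

    Assumes each non-blank line starts with 'old_ip new_ip'; any further
    fields on the line are ignored by this sketch.
    """
    problems = []
    seen_old, seen_new = set(), set()
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split()
        if len(fields) < 2:
            problems.append(f"line {lineno}: expected 'old_ip new_ip', got {line!r}")
            continue
        columns = (("old", fields[0], seen_old), ("new", fields[1], seen_new))
        for label, value, seen in columns:
            try:
                ipaddress.ip_address(value)  # raises ValueError if malformed
            except ValueError:
                problems.append(f"line {lineno}: {label} address {value!r} is not a valid IP")
                continue
            if value in seen:
                problems.append(f"line {lineno}: duplicate {label} address {value}")
            seen.add(value)
    return problems
```

An empty list means neither condition was found. Anonymized placeholders such as 10.168.xxx.yyy will be reported as invalid, so run it against the real file.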
Cheers,
SM
Let's tackle this. Before we begin,
Assuming that you've found some way to address the above (or simply don't care), the next thing to know is that re_ip requires access to both the old IP addresses and the new IP addresses. As you've blown away the old IP addresses, this is probably why re_ip is failing.
So let's say my old IPs are:
And my new IPs are:
Now, since in your ideal world 10.168.xxx.yy* is exactly like 10.168.aaa.bb*, you can just use Linux iptables to temporarily redirect all requests from 10.168.xxx.yyy to 10.168.aaa.bbb. See this super user post for how. Do this on all hosts. So something like:
Then create a mapfile like so:
Then run re_ip:
Proceed to start your DB.
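The commands from this answer did not survive in the thread, so the following is not the original. It is a small sketch, under stated assumptions, of how the two artifacts described above could be generated from a list of old/new pairs: the mapfile lines, and one iptables DNAT rule per pair that redirects locally generated traffic from the old address to the new one. The function name `build_remap_plan` and the example addresses are hypothetical, and the rule form (`-t nat -A OUTPUT ... -j DNAT`) is one common way to do such a redirect rather than necessarily what the answer used. The code only builds strings; it runs nothing.

```python
def build_remap_plan(old_ips, new_ips):
    """Build mapfile lines and iptables DNAT commands for old -> new IP pairs.

    Hypothetical helper: returns text only and does not execute anything.
    """
    if len(old_ips) != len(new_ips):
        raise ValueError("old and new IP lists must have the same length")
    pairs = list(zip(old_ips, new_ips))
    # One 'old new' pair per line, matching the mapfile layout used in this thread.
    map_lines = [f"{old} {new}" for old, new in pairs]
    # Assumed rule form: rewrite the destination of locally generated packets.
    dnat_rules = [
        f"iptables -t nat -A OUTPUT -d {old} -j DNAT --to-destination {new}"
        for old, new in pairs
    ]
    return map_lines, dnat_rules
```

Review the generated rules before applying them on each host as root, and remove them again once re_ip has finished and the database is up.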
Note that you shouldn't have edited admintools.conf (or any other conf files). Let them all be the original files (i.e. from your 10.168.xxx.yy* cluster) and just let re_ip handle all that reconfiguration.
Thanks for the detailed response. I was completely wrong in my assumption about how IP remapping works. I haven't touched the admintools.conf file this time. Let me try my luck this time.
Cheers,
SM
Thanks for the help.
I tried it in my test setup, and it worked when I added a new adapter and redirected the traffic from the old to the new address on all the nodes.
I just wanted one more verification. I need to try another option as well, because my customer may not be able to provide a new adapter on the Vertica servers quickly.
Can I just add an additional IP, i.e. the old IP, on the same network adapter? For example:
ip a show ens38
4: ens38: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:50:56:34:96:b4 brd ff:ff:ff:ff:ff:ff
inet 10.168.aaa.bb1/24 brd 192.168.220.255 scope global noprefixroute ens38
valid_lft forever preferred_lft forever
inet 10.168.xxx.yy1/24 brd 192.176.46.255 scope global noprefixroute ens38
valid_lft forever preferred_lft forever
inet6 fe80::4cf0:36b9:30be:e8f5/64 scope link noprefixroute
valid_lft forever preferred_lft forever
Note: I will do this on all the nodes and then create the mapfile as you described above.
$ cat mapfile
192.168.xxx.yy1 10.168.aaa.bb1
192.168.xxx.yy2 10.168.aaa.bb2
192.168.xxx.yy3 10.168.aaa.bb3
I believe admintools -t re_ip -f mapfile should work then, right?
I just checked and it worked for me. The secondary IP added on the network adapter was in the same subnet. I am not sure how the admintools program works. Is it that both IP addresses must be accessible from the other nodes using some protocol?
In this case, I have not done any traffic redirection; of course, it's not required.
Regards,
SM
Hi Expert,
I was wondering how the remapping worked for me in my test lab when I didn't have access to the old IP set. It just worked, although, as you mentioned, it took a long time to recover the node with the changed address.
I believe the process is like this; I just need clarification.
1- It reads the map file.
2- It connects over ssh from the host where we run admintools to the other hosts to update them.
3- It takes a backup of the files and updates admintools.conf and spread.conf with the new IP address set.
4- Once that is done, it tries to recover the server or start using the new IPs. I just see these states: Preparing, Prepared, then OutOfDate and finally RollingBack.
Please suggest.
For me it's failing in one customer setup:
2020-08-09 13:37:03.618 admintools/18909:0x7f8b4b4ef740 [NodeState.transition] v_dummy_node0001=Preparing -> v_dummy_node0001=Prepared
2020-08-09 13:37:03.618 admintools/18909:0x7f8b4b4ef740 [NodeState.transition] v_dummy_node0002=Preparing -> v_dummy_node0002=Prepared
2020-08-09 13:37:03.618 admintools/18909:0x7f8b4b4ef740 [NodeState.transition] v_dummy_node0003=Preparing -> v_dummy_node0003=Prepared
2020-08-09 13:37:03.618 admintools/18909:0x7f8b4b4ef740 [NodeState.transition] v_dummy_node0002=Prepared -> v_dummy_node0002=OutOfDate
2020-08-09 13:37:03.618 admintools/18909:0x7f8b4b4ef740 [NodeState.transition] v_dummy_node0003=Prepared -> v_dummy_node0003=OutOfDate
2020-08-09 13:37:03.618 admintools/18909:0x7f8b4b4ef740 [NodeState.transition] v_dummy_node0001=Prepared -> v_dummy_node0001=RollingBack
2020-08-09 13:37:03.618 admintools/18909:0x7f8b4b4ef740 [NodeState.transition] v_dummy_node0002=OutOfDate -> v_dummy_node0002=RollingBack
2020-08-09 13:37:03.618 admintools/18909:0x7f8b4b4ef740 [NodeState.transition] v_dummy_node0003=OutOfDate -> v_dummy_node0003=RollingBack
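To read a longer log of this kind, it can help to reduce the transitions to the last state reached by each node. The sketch below assumes the line layout shown above (`[NodeState.transition] node=Old -> node=New`); the function name `final_states` is hypothetical and the parser has not been checked against other admintools versions.

```python
import re

# Matches e.g. "[NodeState.transition] v_dummy_node0001=Preparing -> v_dummy_node0001=Prepared"
TRANSITION = re.compile(
    r"\[NodeState\.transition\]\s+(\S+?)=(\w+)\s+->\s+(\S+?)=(\w+)"
)


def final_states(log_text):
    """Return {node_name: last_state_seen} from NodeState.transition log lines."""
    states = {}
    for line in log_text.splitlines():
        match = TRANSITION.search(line)
        if match:
            node, _old_state, _same_node, new_state = match.groups()
            states[node] = new_state  # later lines overwrite earlier ones
    return states
```

Applied to the excerpt above, all three nodes end in RollingBack, which only confirms that the remap was rolled back; the reason has to come from the lines around these transitions in the log.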
Thanks in advance,
Regards,
SM