Why is Kafka loading to the same node always?
I am looking for the setting that tells Kafka to load in a round-robin manner.
Vertica is already load balanced with the roundrobin policy.
I am using Vertica v8.1.1-12 with a 2-node cluster.
dbadmin=> select GET_LOAD_BALANCE_POLICY () ;
GET_LOAD_BALANCE_POLICY
roundrobin
What I see from the scheduler history is that only one host is ever selected for loading, which raises further questions about the wosdata configuration.
dbadmin=> SELECT elected_leader_time,host,scheduler_id FROM stream_config.stream_scheduler_history;
elected_leader_time | host | scheduler_id
-------------------------+------------+--------------
2018-03-12 08:04:32.31 | 10.3.0.235 | 0
2018-03-12 08:38:40.138 | 10.3.0.235 | 1
2018-03-12 08:46:35.648 | 10.3.0.235 | 2
2018-03-12 09:06:19.757 | 10.3.0.235 | 3
2018-03-12 09:12:06.398 | 10.3.0.235 | 4
2018-03-12 09:20:34.51 | 10.3.0.235 | 5
2018-03-12 09:57:51.851 | 10.3.0.235 | 6
2018-03-12 10:23:56.93 | 10.3.0.235 | 7
2018-03-12 10:26:11.212 | 10.3.0.235 | 8
2018-03-12 10:37:42.032 | 10.3.0.235 | 9
2018-03-12 10:43:12.279 | 10.3.0.235 | 10
2018-03-12 10:51:42.808 | 10.3.0.235 | 11
2018-03-12 10:54:59.273 | 10.3.0.235 | 12
2018-03-12 11:01:21.24 | 10.3.0.235 | 13
2018-03-28 18:15:09.809 | 10.3.0.235 | 14
(15 rows)
Is this something controlled from the Kafka cluster?
Any pointers would be helpful.
Thanks,
Minat
Comments
Are you using the Kafka scheduler? You must have specified the dbhost as 10.3.0.235 or localhost (if the scheduler is running on 10.3.0.235). So that's the initiator node.
Note that the Kafka scheduler connects to Vertica via JDBC, so you have to tell JDBC to load balance. There are 2 parameters of the Kafka scheduler that are of interest here:
You can set jdbc-url to a complete JDBC URL that includes the connectionloadbalance JDBC setting.
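As a rough sketch of the jdbc-url approach, the snippet below only assembles the URL and prints it. The port 5433, the database name mydb, and the vkconfig path in the trailing comment are assumptions rather than values from this thread, and the property is written as ConnectionLoadBalance, which is how I recall the JDBC documentation spelling it. Like the rest of this suggestion, it is untested.

```shell
#!/bin/sh
# Assumed connection details; replace with your own.
DB_HOST="10.3.0.235"   # node the scheduler currently connects to
DB_PORT="5433"         # assumption: default Vertica port
DB_NAME="mydb"         # hypothetical database name

# Complete JDBC URL that asks the driver for connection load balancing.
JDBC_URL="jdbc:vertica://${DB_HOST}:${DB_PORT}/${DB_NAME}?ConnectionLoadBalance=1"
echo "$JDBC_URL"

# The URL would then be handed to the scheduler, for example (not run here;
# the install path is an assumption):
# /opt/vertica/packages/kafka/bin/vkconfig launch --jdbc-url "$JDBC_URL"
```

Credentials and the scheduler's config schema would still need to be supplied the same way you supply them today.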
Or it's probably easier to use the jdbc-opt parameter to add the connectionloadbalance=1 setting to the Vertica-generated URL.
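A similar untested sketch for the jdbc-opt route: it builds the command line as a string and prints it instead of running it. The vkconfig install path and the choice of the launch tool are assumptions on my part; only the host address and the connectionloadbalance setting come from this thread.

```shell
#!/bin/sh
# Assumed default install location of the scheduler tool.
VKCONFIG="/opt/vertica/packages/kafka/bin/vkconfig"
DB_HOST="10.3.0.235"                 # host from the scheduler history
JDBC_OPT="ConnectionLoadBalance=1"   # extra option appended to the generated URL

# Assemble the command line; printed rather than executed.
CMD="${VKCONFIG} launch --dbhost ${DB_HOST} --jdbc-opt ${JDBC_OPT}"
echo "$CMD"
```

Your real command will also carry whatever username, password, and config options you already pass to the scheduler.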
Note that I haven't tested these options, but in theory they should work.