Doris 0.13.15 Scale-Out: Summary of Issues

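1. Environment

Doris version: 0.13.15 (a binary tgz can be downloaded directly and used right after extraction, with no need to compile it yourself)
Existing nodes: 3 nodes, node1, node2, node3
FE scale-out (Observer) nodes: node4, node5, node6
BE scale-out nodes: node4, node5, node6
Broker scale-out nodes: node4, node5, node6
supervisor manages the three services above so that they are restarted if they go down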

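2. Scale-out

2.1. Preparation before scaling out

Check connectivity to Kafka (so that Routine Load keeps working afterwards).

Check connectivity to HDFS (so that Broker Load keeps working afterwards).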
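The two connectivity checks above can be scripted. What follows is a minimal sketch, assuming bash is installed (for its /dev/tcp pseudo-device) along with the coreutils timeout command. The function names and the example addresses are hypothetical. A successful TCP connect only shows that the port is reachable, not that Routine Load or Broker Load will actually work.

```shell
# Sketch: TCP reachability check towards Kafka brokers and HDFS NameNodes.
# All names and endpoints here are illustrative placeholders.

# Print the host part of "host:port".
endpoint_host() { printf '%s\n' "${1%:*}"; }

# Print the port part of "host:port".
endpoint_port() { printf '%s\n' "${1##*:}"; }

# Return 0 if a TCP connection to host ($1) and port ($2) opens within 3 seconds.
check_tcp() {
    timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Check every "host:port" argument; return 1 if any of them is unreachable.
check_endpoints() {
    failed=0
    for endpoint in "$@"; do
        if check_tcp "$(endpoint_host "$endpoint")" "$(endpoint_port "$endpoint")"; then
            echo "OK   $endpoint"
        else
            echo "FAIL $endpoint"
            failed=1
        fi
    done
    return "$failed"
}

# Example (hypothetical addresses), run on each new node:
# check_endpoints kafka1:9092 namenode1:8020
```

Running it on node4, node5, and node6 before adding them catches firewall or DNS problems early, while the cluster is still untouched.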

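2.2. Scale-out steps

2.2.1. Distribute the tgz package and extract it

2.2.2. Distribute the configuration to the new nodes

Copy {DORIS_HOME}/fe/conf/fe.conf to the same directory on each new FE node.
Copy {DORIS_HOME}/fe/bin/start_fe.sh to the same directory on each new FE node (optional).
Copy {DORIS_HOME}/be/conf/be.conf to the same directory on each new BE node.
Copy {DORIS_HOME}/be/bin/start_be.sh to the same directory on each new BE node (optional).

2.2.3. Create the (meta)data directories

On each new FE node, create the meta_dir directory used by the FE (same configuration as on the existing nodes).
On each new BE node, create the storage_root_path directory used by the BE (same configuration as on the existing nodes).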
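Since the directories must match the distributed configuration, they can be derived from the conf files instead of typed by hand. The sketch below assumes the values in fe.conf and be.conf are written as absolute paths (a value such as ${DORIS_HOME}/doris-meta is not expanded here). It also assumes storage_root_path may list several paths separated by ";" with an optional ",capacity" suffix. The function names are hypothetical.

```shell
# Sketch: create meta_dir / storage_root_path directories from the conf files.

# Print the value of key $2 from config file $1 (last uncommented occurrence).
conf_value() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" \
        | sed 's/[[:space:]]*$//' | tail -n 1
}

# Split a raw path list on ";" and drop any ",capacity" suffix; one path per line.
storage_dirs() {
    printf '%s\n' "$1" | tr ';' '\n' | sed -e 's/,.*$//' -e '/^[[:space:]]*$/d'
}

# Create every directory named by key $2 in config file $1.
create_dirs_from_conf() {
    storage_dirs "$(conf_value "$1" "$2")" | while IFS= read -r dir; do
        mkdir -p "$dir" && echo "created $dir"
    done
}

# Example, run on a new node after the conf files have been copied:
# create_dirs_from_conf "$DORIS_HOME/fe/conf/fe.conf" meta_dir
# create_dirs_from_conf "$DORIS_HOME/be/conf/be.conf" storage_root_path
```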

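2.2.4. Check node status

mysql -h host -P port -uroot

Use the command above to connect to any running FE with mysql-client, where:

host is the IP of the node running the FE;

port is query_port in fe/conf/fe.conf (default 9030).

By default the root account is used and no password is required.

Run the following commands in mysql-client to check the status:

show proc '/frontends';  # FE node status
show proc '/backends';   # BE node status
show proc '/brokers';    # Broker node status

At this point each list shows 3 nodes: node1, node2, node3.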
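The same three queries will be run again after every scale-out step, so it can be convenient to wrap them in a non-interactive call. This is a small sketch, assuming the mysql client is on the PATH and root still has no password. The function names are hypothetical.

```shell
# Sketch: print FE, BE, and Broker status in one non-interactive call.

# Emit the three status statements, one per line.
status_sql() {
    printf '%s\n' \
        "show proc '/frontends';" \
        "show proc '/backends';" \
        "show proc '/brokers';"
}

# $1 = FE host, $2 = query_port (falls back to 9030 when omitted).
show_cluster_status() {
    status_sql | mysql -h "$1" -P "${2:-9030}" -uroot
}

# Example:
# show_cluster_status node1 9030
```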

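2.2.4. Start the new FE services

An FE has one of three roles: Leader, Follower, or Observer. By default a cluster has exactly one Leader and may have multiple Followers and Observers. The Leader and the Followers form a Paxos election group: if the Leader goes down, the remaining Followers automatically elect a new Leader, which keeps writes highly available. Observers replicate the Leader's data but do not take part in elections. If only one FE is deployed, that FE is the Leader by default.

The first FE to start automatically becomes the Leader. Followers and Observers can then be added on top of it.

The cluster already has three nodes, node1, node2, and node3, whose roles are Leader, Follower, and Follower respectively.

Now add node4, node5, and node6.

Connect to a running FE with mysql-client and run:

ALTER SYSTEM ADD OBSERVER "host:port";
## host is the IP of the node where the new Follower or Observer runs
## port is edit_log_port in its fe.conf (default 9010)

For example:

ALTER SYSTEM ADD OBSERVER "node4:9010";
ALTER SYSTEM ADD OBSERVER "node5:9010";
ALTER SYSTEM ADD OBSERVER "node6:9010";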
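When more than a handful of nodes are added, the statements can be generated rather than typed. The helper below is a hypothetical sketch: it only prints SQL text and leaves it to you to review the output and feed it to mysql-client.

```shell
# Sketch: generate ALTER SYSTEM ADD statements for a list of hosts.

# $1 = node type (for example OBSERVER, FOLLOWER, or BACKEND)
# $2 = port, remaining arguments = hosts
gen_add_sql() {
    kind=$1
    port=$2
    shift 2
    for host in "$@"; do
        printf 'ALTER SYSTEM ADD %s "%s:%s";\n' "$kind" "$host" "$port"
    done
}

# Prints the three statements used in this article.
gen_add_sql OBSERVER 9010 node4 node5 node6
```

The output can be piped into the client once it looks right, for example gen_add_sql OBSERVER 9010 node4 node5 node6 | mysql -h node1 -P 9030 -uroot. Brokers are not covered because their statement also carries a broker name.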

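On node4, node5, and node6, the first start of the FE must use the following command:

./bin/start_fe.sh --helper host:port --daemon

For example:

./bin/start_fe.sh --helper node1:9010 --daemon

Important:
The first start of a new FE must use --helper. Otherwise the FE does not start successfully and cannot join the cluster. A new node can only be added with the help of an FE that is already running (the Master or a Follower), and --helper must not point at the new FE node itself.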
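The two rules above (always pass --helper on the first start, never point it at the node itself) can be enforced by a small wrapper. This is a sketch under the assumption that it is run from the FE directory. The function names are hypothetical, and the self-check is a plain string comparison, so it cannot tell that a host name and an IP refer to the same machine.

```shell
# Sketch: guard the first start of a new FE.

# $1 = helper "host:port", $2 = this node's host name or IP.
# Succeeds only if a helper is given and its host differs from this node.
helper_is_valid() {
    [ -n "$1" ] && [ "${1%:*}" != "$2" ]
}

# Start the FE with --helper, or refuse if the helper is unusable.
start_new_fe() {
    if helper_is_valid "$1" "$2"; then
        ./bin/start_fe.sh --helper "$1" --daemon
    else
        echo "refusing to start: helper '$1' is empty or points at this node ($2)" >&2
        return 1
    fi
}

# Example, run on node4:
# start_new_fe node1:9010 node4
```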

2.2.5. Start the new Broker services

Register the instances in mysql-client (8000 is the default broker_ipc_port):

ALTER SYSTEM ADD BROKER hdfs_broker "node4:8000";
ALTER SYSTEM ADD BROKER hdfs_broker "node5:8000";
ALTER SYSTEM ADD BROKER hdfs_broker "node6:8000";

Start the Broker in the background on each new node:

sh bin/start_broker.sh --daemon

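2.2.6. Start the new BE services

Register the instances in mysql-client (9050 is the default heartbeat_service_port in be.conf):

ALTER SYSTEM ADD BACKEND "node4:9050";
ALTER SYSTEM ADD BACKEND "node5:9050";
ALTER SYSTEM ADD BACKEND "node6:9050";

Start the BE in the background on each new node:

sh bin/start_be.sh --daemon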

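2.2.7. Deploy supervisor on the new nodes and configure automatic service restart

root can run the commands directly; any other user needs sudo privileges.

yum install supervisor -y

The configuration file is /etc/supervisord.conf.
The include section shown by cat /etc/supervisord.conf is:

[include]
files = supervisord.d/*.conf

The files setting determines the suffix of the service configuration files (the default is ini), so any service configuration file added under /etc/supervisord.d/ must use a matching suffix. Here it is *.conf.

Start the Linux supervisor service:
supervisord

Load new configuration into supervisord:
supervisorctl update
Note: whenever a service configuration under /etc/supervisord.d/ changes, this update command must be run again.

Start a process (program_name is the program name written in your configuration):
supervisorctl start program_name

Show the status of all processes:
supervisorctl status

All of my Doris services must run under supervisor so that they restart automatically. The three services started by hand above therefore have to be located by PID, stopped with kill -9 {pid}, and then started through supervisor.

## Find the FE and BE process IDs
ps -ef | grep -i palo
## Find the Broker process ID
ps -ef | grep -i bootstrap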
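The article does not show the service configuration files themselves, so the following is only a sketch of what one could look like, written as a shell function that renders a [program:x] section. The program name, paths, and function name are hypothetical. supervisor expects the managed command to stay in the foreground, so the sketch assumes the start script is invoked without --daemon, and it assumes JAVA_HOME and similar variables are already available to supervisord.

```shell
# Sketch: render a supervisor program section for one Doris service.

# $1 = program name, $2 = working directory, $3 = foreground start command
render_program_conf() {
    cat <<EOF
[program:$1]
directory=$2
command=$3
autostart=true
autorestart=true
stopasgroup=true
killasgroup=true
EOF
}

# Example (placeholder paths), followed by "supervisorctl update":
# render_program_conf doris_fe /opt/doris/fe "sh bin/start_fe.sh" > /etc/supervisord.d/doris_fe.conf
```

stopasgroup and killasgroup are set because the start script launches a child process; without them, stopping the program may leave that child running.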

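3. Issue log

3.1. Using supervisor

root can run the commands directly; any other user needs sudo privileges.
1. Start the Linux supervisor service
supervisord
2. Load new configuration into supervisord
supervisorctl update
3. Restart supervisord together with all programs in the configuration
supervisorctl reload
4. Start a process (program_name is the program name written in your configuration)
supervisorctl start program_name
5. View the supervised processes
supervisorctl
6. Stop a process (program_name is the program name written in your configuration)
supervisorctl stop program_name
7. Restart a process (program_name is the program name written in your configuration)
supervisorctl restart program_name
8. Stop all processes
supervisorctl stop all
Note: a process that was stopped explicitly with stop is not restarted automatically by reload or update.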

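3.2. FE fails to reach Alive and Join

The newly added FE node shows the following in the frontends query result:
QueryPort is 0, RpcPort is 0, Join is false, and Alive is false.
There are several possible causes.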

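3.2.1. priority_networks is misconfigured

In this case both FE and BE are affected, because the parameter exists in both fe.conf and be.conf and must be set correctly in each.
Solution: see my other article,
"Doris: priority_networks misconfiguration, FE fails to reach Alive and Join".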

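3.2.2. The node was started for the first time without --helper

As mentioned above, a new FE node must be started for the first time like this:
./bin/start_fe.sh --helper node1:9010 --daemon
The new FE relies on an already running FE to start. Otherwise the frontends query in mysql-client shows QueryPort 0, RpcPort 0, Join false, and Alive false.

The fix depends on the environment you are working in.

<1> Existing nodes have no load jobs [test environment]

I tried this in virtual machines (with no jobs running). If a new FE node was started for the first time without --helper, do the following:

1. Stop the FE process.

2. Delete the bdb and image subdirectories under the FE's {meta_dir}.

3. Drop the faulty FE instance that appears in the frontends result in mysql-client (for an Observer: ALTER SYSTEM DROP OBSERVER "node4:9010";).

4. Start the FE again with --helper:
./bin/start_fe.sh --helper node1:9010 --daemon
fe.log then shows the following
(192.168.56.111 is the IP of the helper node):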

2021-03-01 02:21:12,618 WARN (main|1) [Catalog.getClusterIdAndRole():890] current node is not added to the group. please add it first. sleep 5 seconds and retry, current helper nodes: [192.168.56.111:9010]
2021-03-01 02:21:17,624 WARN (main|1) [Catalog.getFeNodeTypeAndNameFromHelpers():1020] failed to get fe node type from helper node: 192.168.56.111:9010.

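This warning is expected: the new node asks the helper node (node1) for its role, but the node has not been registered through mysql-client yet.

5. Add the node in mysql-client: ALTER SYSTEM ADD OBSERVER "node4:9010";

6. The log returns to normal: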

INFO (main|1) [Catalog.getFeNodeTypeAndNameFromHelpers():1043] get fe node type OBSERVER, name 192.168.56.114_9010_1614583297702 from 192.168.56.111:8030
2021-03-01 02:21:43,039 INFO (main|1) [Catalog.getClusterIdAndRole():987] finished to get cluster id: 102528807, role: OBSERVER and node name: 192.168.56.114_9010_1614583297702
2021-03-01 02:21:43,066 INFO (main|1) [Catalog.loadImage():1463] image does not exist: /disk1/keop/doris/doris-meta/image/image.0

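7. Check the instance information again in mysql-client (show proc '/frontends';).

<2> Existing nodes have load jobs [production environment]

This case is listed separately because of what happens on a production cluster with many load jobs running. After you add the node in step 5 above (ALTER SYSTEM ADD OBSERVER "node4:9010"; in mysql-client), fe.log reports an error at step 6,
as follows: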

2021-03-01 13:49:37,502 INFO (main|1) [PaloAuth.grantInternal():709] finished to grant resource privilege. is replay: true
2021-03-01 13:49:37,503 INFO (main|1) [PaloAuth.createUserInternal():575] finished to create user: 'admin'@'%', is replay: true
2021-03-01 13:49:37,525 INFO (main|1) [Catalog.getHelperNodes():1116] get helper nodes: [xx.xxx.xxx.127:9010]
2021-03-01 13:49:37,583 INFO (main|1) [Catalog.getFeNodeTypeAndNameFromHelpers():1043] get fe node type OBSERVER, name xx.xxx.xxx.xxx_9010_1614577535219 from xx.xxx.xxx.127:8030
2021-03-01 13:49:37,775 INFO (main|1) [Catalog.getClusterIdAndRole():987] finished to get cluster id: 1684078400, role: OBSERVER and node name: xx.xxx.xxx.xxx_9010_1614577535219
2021-03-01 13:49:37,788 INFO (main|1) [Catalog.loadImage():1463] image does not exist: /disk1/keop/doris/doris-meta/image/image.0
2021-03-01 13:49:38,175 INFO (UNKNOWN xx.xxx.xxx.xxx_9010_1614577535219(-1)|1) [BDBEnvironment.setup():157] add helper[xx.xxx.xxx.127:9010] as ReplicationGroupAdmin
2021-03-01 13:49:38,182 WARN (UNKNOWN xx.xxx.xxx.180_9010_1614577535219(-1)|1) [Catalog.notifyNewFETypeTransfer():2363] notify new FE type transfer: UNKNOWN
2021-03-01 13:49:38,214 WARN (RepNode xx.xxx.xxx.180_9010_1614577535219(-1)|60) [Catalog.notifyNewFETypeTransfer():2363] notify new FE type transfer: OBSERVER
2021-03-01 13:49:38,230 WARN (REPLICA xx.xxx.xxx.180_9010_1614577535219(2147483647)|60) [BDBStateChangeListener.stateChange():59] this node is DETACHED
2021-03-01 13:49:44,380 WARN (UNKNOWN xx.xxx.xxx.180_9010_1614577535219(-1)|1) [BDBJEJournal.open():356] catch insufficient log exception. will recover and try again.
com.sleepycat.je.rep.InsufficientLogException: (JE 7.3.7) Environment must be closed, caused by: com.sleepycat.je.rep.InsufficientLogException: Environment invalid because of previous exception: (JE 7.3.7) xx.xxx.xxx.180_9010_1614577535219(2147483647):/disk1/keop/doris/doris-meta/bdb INSUFFICIENT_LOG: Log files at this node are obsolete. Environment is invalid and must be closed. Originally thrown by HA thread: REPLICA xx.xxx.xxx.180_9010_1614577535219(2147483647) Originally thrown by HA thread: REPLICA xx.xxx.xxx.180_9010_1614577535219(2147483647)refreshVLSN=202,256,593 logProviders=[Node:xx.xxx.xxx.128_9010_1591157476825 xx.xxx.xxx.128:9010 (is member) changeVersion:3 LocalCBVLSN:202,407,966 at:Mon Mar 01 13:41:48 CST 2021 jeVersion:7.3.7
, Node:xx.xxx.xxx.127_9010_1591157474223 xx.xxx.xxx.127:9010 (is member) changeVersion:2 LocalCBVLSN:202,410,378 at:Mon Mar 01 13:45:46 CST 2021 jeVersion:7.3.7
, Node:xx.xxx.xxx.126_9010_1591157257798 xx.xxx.xxx.126:9010 (is member) changeVersion:1 LocalCBVLSN:202,407,772 at:Mon Mar 01 13:41:30 CST 2021 jeVersion:7.3.7
, Node:xx.xxx.xxx.180_9010_1614577535219 xx.xxx.xxx.180:9010 (is member) SECONDARY changeVersion:-1 LocalCBVLSN:-1 at:Mon Mar 01 13:49:38 CST 2021 jeVersion:7.3.7
] repImpl=com.sleepycat.je.rep.impl.RepImpl@134bcd77 props={GROUP_NAME=PALO_JOURNAL_GROUP, REFRESH_VLSN=202256593, NODE_NAME=xx.xxx.xxx.180_9010_1614577535219, HOSTNAME=xx.xxx.xxx.180, P_NODETYPE3=SECONDARY, P_NODETYPE2=ELECTABLE, P_NODETYPE1=ELECTABLE, P_NODENAME3=xx.xxx.xxx.180_9010_1614577535219, P_NODETYPE0=ELECTABLE, P_HOSTNAME3=xx.xxx.xxx.180, P_NODENAME2=xx.xxx.xxx.126_9010_1591157257798, P_HOSTNAME2=xx.xxx.xxx.126, P_NODENAME1=xx.xxx.xxx.127_9010_1591157474223, P_HOSTNAME1=xx.xxx.xxx.127, P_NODENAME0=xx.xxx.xxx.128_9010_1591157476825, PORT=9010, P_HOSTNAME0=xx.xxx.xxx.128, P_NUMPROVIDERS=4, P_PORT3=9010, ENV_DIR=/disk1/keop/doris/doris-meta/bdb, P_PORT2=9010, P_PORT1=9010, P_PORT0=9010}
        at com.sleepycat.je.rep.InsufficientLogException.wrapSelf(InsufficientLogException.java:315) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.dbi.EnvironmentImpl.checkIfInvalid(EnvironmentImpl.java:1766) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.dbi.EnvironmentImpl.checkOpen(EnvironmentImpl.java:1775) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.Environment.checkOpen(Environment.java:2473) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.Environment.getDatabaseNames(Environment.java:2245) ~[je-7.3.7.jar:7.3.7]
        at org.apache.doris.journal.bdbje.BDBEnvironment.getDatabaseNames(BDBEnvironment.java:318) ~[palo-fe.jar:3.4.0]
        at org.apache.doris.journal.bdbje.BDBJEJournal.open(BDBJEJournal.java:329) [palo-fe.jar:3.4.0]
        at org.apache.doris.persist.EditLog.open(EditLog.java:835) [palo-fe.jar:3.4.0]
        at org.apache.doris.catalog.Catalog.initialize(Catalog.java:766) [palo-fe.jar:3.4.0]
        at org.apache.doris.PaloFe.start(PaloFe.java:108) [palo-fe.jar:3.4.0]
        at org.apache.doris.PaloFe.main(PaloFe.java:60) [palo-fe.jar:3.4.0]
Caused by: com.sleepycat.je.rep.InsufficientLogException: Environment invalid because of previous exception: (JE 7.3.7) xx.xxx.xxx.180_9010_1614577535219(2147483647):/disk1/keop/doris/doris-meta/bdb INSUFFICIENT_LOG: Log files at this node are obsolete. Environment is invalid and must be closed. Originally thrown by HA thread: REPLICA xx.xxx.xxx.180_9010_1614577535219(2147483647) Originally thrown by HA thread: REPLICA xx.xxx.xxx.180_9010_1614577535219(2147483647)
        at com.sleepycat.je.rep.stream.ReplicaFeederSyncup.setupLogRefresh(ReplicaFeederSyncup.java:664) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.rep.stream.ReplicaFeederSyncup.getFeederRecord(ReplicaFeederSyncup.java:732) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.rep.stream.ReplicaFeederSyncup.findMatchpoint(ReplicaFeederSyncup.java:406) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.rep.stream.ReplicaFeederSyncup.execute(ReplicaFeederSyncup.java:151) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.rep.impl.node.Replica.initReplicaLoop(Replica.java:711) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.rep.impl.node.Replica.runReplicaLoopInternal(Replica.java:474) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.rep.impl.node.Replica.runReplicaLoop(Replica.java:409) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.rep.impl.node.RepNode.run(RepNode.java:1873) ~[je-7.3.7.jar:7.3.7]
2021-03-01 13:49:44,822 INFO (UNKNOWN xx.xxx.xxx.180_9010_1614577535219(-1)|1) [BDBEnvironment.setup():157] add helper[xx.xxx.xxx.127:9010] as ReplicationGroupAdmin
2021-03-01 13:49:44,823 WARN (UNKNOWN xx.xxx.xxx.180_9010_1614577535219(-1)|1) [Catalog.notifyNewFETypeTransfer():2363] notify new FE type transfer: UNKNOWN
2021-03-01 13:49:44,830 WARN (RepNode xx.xxx.xxx.180_9010_1614577535219(-1)|75) [Catalog.notifyNewFETypeTransfer():2363] notify new FE type transfer: OBSERVER
2021-03-01 13:49:44,853 INFO (stateListener|88) [Catalog$4.runOneCycle():2386] begin to transfer FE type from INIT to UNKNOWN
2021-03-01 13:49:44,855 INFO (stateListener|88) [Catalog$4.runOneCycle():2472] finished to transfer FE type to UNKNOWN
2021-03-01 13:49:44,855 INFO (stateListener|88) [Catalog$4.runOneCycle():2386] begin to transfer FE type from UNKNOWN to OBSERVER
2021-03-01 13:49:44,868 INFO (replayer|89) [Catalog.replayJournal():2489] replayed journal id is 0, replay to journal id is 101109709
2021-03-01 13:49:44,871 ERROR (replayer|89) [BDBJournalCursor.<init>():84] Can not find the key:1, fail to get journal cursor. will exit


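This error is a bug in the current Doris version.
The related community PR is:
https://github.com/apache/incubator-doris/pull/5418
There is, however, a somewhat simpler workaround on the current version:

1. Copy the file named like image.101305405 from the image subdirectory of the Master FE's {meta_dir} into the corresponding directory on the faulty FE node being added.
2. Drop the faulty FE instance in mysql-client.
3. Restart the FE with --helper.
4. Add the FE again in mysql-client.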
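Step 1 requires picking the right file out of the image directory, which can also contain other files. The sketch below assumes the file to copy is the one whose numeric suffix is largest, and the function name and example paths are hypothetical. It only prints the file name and leaves the copy itself to you.

```shell
# Sketch: find the image.<number> file with the largest number in a directory.

# $1 = image directory (for example {meta_dir}/image on the Master FE)
latest_image() {
    ls "$1" 2>/dev/null \
        | grep -E '^image\.[0-9]+$' \
        | sort -t. -k2,2n \
        | tail -n 1
}

# Example, run on the Master FE (placeholder paths):
# f=$(latest_image /path/to/doris-meta/image)
# scp "/path/to/doris-meta/image/$f" node4:/path/to/doris-meta/image/
```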

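3.3. BE nodes need a higher file descriptor limit

Starting the BE fails with an error asking for more file descriptors.
Raising the limit fixes it.
The trouble is that this setting gives many people headaches: it is very common for the change not to take effect.

With the three methods below, the open-file limit shown in /proc/{be_pid}/limits for the BE process meets the requirement, and the BE restarts without the error.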

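3.3.1. Environment variable method

Add the following line to /etc/profile:
ulimit -n 99999
Then run source /etc/profile,
log out of the terminal session, and log in again.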

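3.3.2. Per-user configuration under limits.d

/etc/security/limits.d
Files in this directory have the highest priority.
If any .conf file there contains an entry for the user that runs the process, its nproc and nofile settings take effect.
The files are read in lexical order.
vim /etc/security/limits.d/root.conf
root - nofile 99999
root - nproc 99999
Log out and log in again.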

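3.3.3. Modify the global limits.conf

vi /etc/security/limits.conf

* hard nofile 102400
* soft nofile 102400
The leading * is the domain field. It matches all users except root, which needs its own entry.
After saving, log out and log in again; the maximum number of file descriptors is then changed permanently.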

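3.3.4. System-wide limit

The system-wide limit caps the total number of file descriptors opened by all users. It can be changed through a kernel parameter:
sysctl -w fs.file-max=102400
A change made with sysctl -w is temporary. To make it permanent, add the following line to /etc/sysctl.conf:
fs.file-max=102400
After saving, run sysctl -p to apply it.
The temporary change can also be made with another value, for example:
sysctl -w fs.file-max=65536
or
echo "65536" > /proc/sys/fs/file-max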
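Because these settings often fail to take effect, it helps to check the limit of the running BE process itself rather than the limit of the login shell. The sketch below reads the "Max open files" row of a limits file such as /proc/{be_pid}/limits. The function names and the example threshold are hypothetical.

```shell
# Sketch: verify the open-file limit that a running process actually has.

# $1 = path to a limits file; prints the soft "Max open files" value.
soft_nofile() {
    awk '/^Max open files/ { print $4 }' "$1"
}

# $1 = limits file, $2 = required minimum; succeeds if the soft limit is enough.
nofile_ok() {
    soft=$(soft_nofile "$1")
    [ "$soft" = "unlimited" ] || [ "${soft:-0}" -ge "$2" ]
}

# Example, with be_pid taken from: ps -ef | grep -i palo
# nofile_ok "/proc/$be_pid/limits" 65536 && echo "limit is sufficient"
```

If the check fails after changing the configuration, the BE was most likely started from a session or supervisor instance that still carries the old limit, so restart that parent first.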
