Hadoop 3.x HA (High Availability): Installation and Deployment

By Ma Yumin • 2021-10-24 20:00

# Overview

This article configures **HDFS HA (high availability)** only; it does not cover YARN or other services.

# Log in to the hadoop1 virtual machine

Log in to the `hadoop1` virtual machine and perform the steps below.

# Upload

Omitted.

# Unpack

Create the directory:

```
mkdir hadoop-3.0.3-ha
```

Unpack the archive:

```
tar zxvf hadoop-3.0.3.tar.gz -C hadoop-3.0.3-ha --no-same-owner
```

Move the files up one level:

```
mv /program/hadoop-3.0.3-ha/hadoop-3.0.3/* /program/hadoop-3.0.3-ha/
```

Remove the now-empty directory:

```
rm -rf /program/hadoop-3.0.3-ha/hadoop-3.0.3
```

# Update the environment variables

```
vim /etc/profile.d/bigdata_env.sh
```

### If Hadoop was installed before (note)

Comment out the old `HADOOP_HOME` entry:

```
# export HADOOP_HOME=/program/hadoop-3.0.3
```

Add the new entry:

```
export HADOOP_HOME=/program/hadoop-3.0.3-ha
```

### The complete file:

```
# Configure HADOOP_HOME
# export HADOOP_HOME=/program/hadoop-3.0.3
export HADOOP_HOME=/program/hadoop-3.0.3-ha
export PATH=${HADOOP_HOME}/bin:$PATH
export PATH=${HADOOP_HOME}/sbin:$PATH
```

### Apply immediately

```
source /etc/profile
```

Check:

```
echo $HADOOP_HOME
```

If the output is as follows, the configuration has taken effect:

```
/program/hadoop-3.0.3-ha
```

### Sync to hadoop2

```
rsync -av /etc/profile.d/bigdata_env.sh root@hadoop2:/etc/profile.d/
```

### Sync to hadoop3

```
rsync -av /etc/profile.d/bigdata_env.sh root@hadoop3:/etc/profile.d/
```

### Apply immediately

Log in to `hadoop2` and `hadoop3` and run:

```
source /etc/profile
```

### Verify

Log in to `hadoop2` and `hadoop3` and run:

```
echo $HADOOP_HOME
```

If the output is as follows, the sync succeeded:

```
/program/hadoop-3.0.3-ha
```

# Edit hadoop-env.sh

`hadoop-env.sh` configures Hadoop's runtime environment.

Enter the Hadoop configuration directory:

```
cd /program/hadoop-3.0.3-ha/etc/hadoop/
```

### Set JAVA_HOME (optional)

Set the **absolute path of the Java installation** here; otherwise starting the NameNode and DataNode may fail.

Edit the file with vim:

```
vim hadoop-env.sh
```

Find the `export JAVA_HOME` line, remove the leading `#` if there is one, and change it to:

```
export JAVA_HOME=/program/jdk1.8.0_202
```

You can print the absolute Java path with:

```
echo $JAVA_HOME
```

### Running as root

If you deploy and start Hadoop as the `root` user, you must declare **root** as the daemon user; otherwise `start-dfs.sh` fails with:

```
Starting namenodes on [hadoop1]
ERROR: Attempting to operate on hdfs namenode as root
ERROR: but there is no HDFS_NAMENODE_USER defined. Aborting operation.
Starting datanodes
ERROR: Attempting to operate on hdfs datanode as root
ERROR: but there is no HDFS_DATANODE_USER defined. Aborting operation.
Starting secondary namenodes [hadoop3]
ERROR: Attempting to operate on hdfs secondarynamenode as root
ERROR: but there is no HDFS_SECONDARYNAMENODE_USER defined. Aborting operation.
```

Add the following at the **bottom** of `hadoop-env.sh`:

```
export HDFS_DATANODE_USER=root
export HDFS_NAMENODE_USER=root
export HDFS_JOURNALNODE_USER=root
export HDFS_ZKFC_USER=root
```

# Edit core-site.xml

Enter the Hadoop configuration directory:

```
cd /program/hadoop-3.0.3-ha/etc/hadoop
```

```
vim core-site.xml
```

Add the following properties inside `<configuration>`:

```
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://mycluster</value>
</property>
<property>
    <name>hadoop.tmp.dir</name>
    <value>/program/hadoop-3.0.3-ha/data/tmp</value>
</property>
<property>
    <name>ha.zookeeper.quorum</name>
    <value>hadoop1:2181,hadoop2:2181,hadoop3:2181</value>
</property>
<property>
    <name>hadoop.http.staticuser.user</name>
    <value>root</value>
</property>
```

**Explanation:**

`hadoop.http.staticuser.user`: without this property, uploading files fails with:

```
java.net.ConnectException: Call From hadoop1/192.168.58.101 to hadoop2:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
```

### Configuration explained: the HA cluster name

```
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://mycluster</value>
</property>
```

Unlike a plain cluster, you cannot hard-code a single `NameNode` IP and port here, because there are two `NameNode`s and you never know which one is `active`. Instead you specify one logical name. The name itself is arbitrary, but the configuration that follows must **match it exactly**.

### Configuration explained: where HDFS stores file data

After a file is uploaded to HDFS, its data is stored under:

```
/program/hadoop-3.0.3-ha/data/tmp/dfs/data/current/BP-<generated number>/current/finalized/subdir0/subdir0/
```

# Edit hdfs-site.xml

Enter the Hadoop configuration directory:

```
cd /program/hadoop-3.0.3-ha/etc/hadoop/
```

```
vim hdfs-site.xml
```

Add the following properties:

```
<property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
</property>
<property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2</value>
</property>
<property>
    <name>dfs.namenode.rpc-address.mycluster.nn1</name>
    <value>hadoop1:8020</value>
</property>
<property>
    <name>dfs.namenode.http-address.mycluster.nn1</name>
    <value>hadoop1:9870</value>
</property>
<property>
    <name>dfs.namenode.rpc-address.mycluster.nn2</name>
    <value>hadoop2:8020</value>
</property>
<property>
    <name>dfs.namenode.http-address.mycluster.nn2</name>
    <value>hadoop2:9870</value>
</property>
<property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://hadoop1:8485;hadoop2:8485;hadoop3:8485/mycluster</value>
</property>
<property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/program/hadoop-3.0.3-ha/data/journaldata</value>
</property>
<property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
</property>
<property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
</property>
<property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
</property>
<property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>30000</value>
</property>
```
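With `core-site.xml` and `hdfs-site.xml` in place, it is worth checking that Hadoop resolves the logical nameservice before moving on. Below is a minimal sketch of such a check, assuming the `mycluster` nameservice and hostnames used above; `hdfs getconf` only reads the configuration files, so no daemon needs to be running yet:

```
# List the NameNode IDs registered under the mycluster nameservice
hdfs getconf -confKey dfs.ha.namenodes.mycluster

# Resolve the RPC address behind each NameNode ID
hdfs getconf -confKey dfs.namenode.rpc-address.mycluster.nn1
hdfs getconf -confKey dfs.namenode.rpc-address.mycluster.nn2
```

If these print `nn1,nn2`, `hadoop1:8020`, and `hadoop2:8020`, the HA settings above were picked up correctly.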
# Configure MapReduce

Enter the Hadoop configuration directory:

```
cd /program/hadoop-3.0.3-ha/etc/hadoop/
```

### Edit mapred-site.xml

```
vim mapred-site.xml
```

### Add the content

Add the following property:

```
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
```

# Edit yarn-site.xml: YARN HA (optional)

```
<property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
</property>
<property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yrc</value>
</property>
<property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
</property>
<property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>hadoop1</value>
</property>
<property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>hadoop2</value>
</property>
<property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>hadoop1:2181,hadoop2:2181,hadoop3:2181</value>
</property>
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
```

# Configure workers

When the cluster starts, it reads the hostnames in the `workers` file and connects to those hosts.

Enter the Hadoop configuration directory:

```
cd /program/hadoop-3.0.3-ha/etc/hadoop/
```

### Edit the file

```
vim workers
```

### Set the content

List the hostnames as follows:

```
hadoop1
hadoop2
hadoop3
```

**Note:**

- no whitespace before or after a hostname
- no blank lines

# Sync the files

The Hadoop configuration directory has changed, so push it to the other nodes.

### Sync to hadoop2

```
rsync -av /program/hadoop-3.0.3-ha root@hadoop2:/program/
```

### Sync to hadoop3

```
rsync -av /program/hadoop-3.0.3-ha root@hadoop3:/program/
```

### Verify the files

Log in to `hadoop2` and check that `core-site.xml` is correct.

Log in to `hadoop3` and check that `core-site.xml` is correct.

A scripted version of this check is sketched after the source link below.

Original article: http://malaoshi.top/show_1IX26NUd2ZG9.html
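Rather than eyeballing each file, you can compare checksums from `hadoop1` in one pass. A minimal sketch, assuming passwordless SSH as root between the nodes (which the `rsync` steps above already rely on):

```
# Compare core-site.xml across all three nodes;
# the three hashes should be identical
for h in hadoop1 hadoop2 hadoop3; do
  echo "== $h =="
  ssh root@$h "md5sum /program/hadoop-3.0.3-ha/etc/hadoop/core-site.xml"
done
```

If any hash differs, re-run the corresponding `rsync` command.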