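Prerequisites
Not every component or role supports migration. In CDH, the only thing that can currently be migrated is a NameNode with HA enabled.
ZooKeeper, HBase, Spark, Hive and the like, and even DataNodes, do not support migration.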
Official documentation
docs.cloudera.com/documentati…
Moving NameNode Roles
This section describes two procedures for moving NameNode roles. Both procedures require cluster downtime. If high availability is enabled for the NameNode, you can use a Cloudera Manager wizard to automate the migration process. Otherwise you must manually delete and add the NameNode role to a new host.
After moving a NameNode, if you have a Hive or Impala service, perform the steps in NameNode Post-Migration Steps.
Moving Highly Available NameNode, Failover Controller, and JournalNode Roles Using the Migrate Roles Wizard
Minimum Required Role: Cluster Administrator (also provided by Full Administrator)
The Migrate Roles wizard allows you to move roles of a highly available HDFS service from one host to another. You can use it to move NameNode, JournalNode, and Failover Controller roles.
Requirements and Limitations
- Nameservice federation (multiple namespaces) is not supported.
- This procedure requires cluster downtime. The services discussed in this list must be running for the migration to complete.
- The configuration of HDFS and services that depend on it must be valid.
- The source and destination hosts must be commissioned and healthy.
- The NameNode must be highly available using quorum-based storage.
- HDFS automatic failover must be enabled, and the cluster must have a running ZooKeeper service.
- If a Hue service is present in the cluster, its HDFS Web Interface Role property must refer to an HttpFS role, not to a NameNode role.
- A majority of configured JournalNode roles must be running.
- The Failover Controller role that is not located on the source host must be running.
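As a small worked example of the "majority of configured JournalNode roles" requirement above, the check below is a minimal sketch. The function name and its arguments are made up for illustration; they are not part of Cloudera Manager.

```python
def has_journalnode_majority(running: int, configured: int) -> bool:
    """Return True when strictly more than half of the configured JournalNodes run.

    Hypothetical helper for illustration only; the counts would come from
    whatever inventory you keep of your JournalNode roles.
    """
    if configured <= 0:
        raise ValueError("configured must be a positive number of JournalNodes")
    if running < 0 or running > configured:
        raise ValueError("running must be between 0 and configured")
    # A majority means strictly more than half, e.g. 2 of 3 or 3 of 5.
    return running > configured // 2
```

With three configured JournalNodes, two running roles satisfy the requirement and one does not.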
Before You Begin
Do the following before you run the wizard:
- On hosts running active and standby NameNodes, back up the data directories.
- On hosts running JournalNodes, back up the JournalNode edits directory.
- If the source host is not functioning properly, or is not reliably reachable, decommission the host.
- If CDH and HDFS metadata was recently upgraded, and the metadata upgrade was not finalized, finalize the metadata upgrade.
Running the Migrate Roles Wizard
- If the host to which you want to move the NameNode is not in the cluster, follow the instructions in Adding a Host to the Cluster to add the host.
- Go to the HDFS service.
- Click the Instances tab.
- Click the Migrate Roles button.
- Click the Source Host text field and specify the host running the roles to migrate. In the Search field optionally enter hostnames to filter the list of hosts and click Search.
  The following shortcuts for specifying hostname patterns are supported:
  - Range of hostnames (without the domain portion)
    Range Definition          Matching Hosts
    10.1.1.[1-4]              10.1.1.1, 10.1.1.2, 10.1.1.3, 10.1.1.4
    host[1-3].company.com     host1.company.com, host2.company.com, host3.company.com
    host[07-10].company.com   host07.company.com, host08.company.com, host09.company.com, host10.company.com
  - IP addresses
  - Rack name
  Select the checkboxes next to the desired host. The list of available roles to migrate displays. Clear any roles you do not want to migrate. When migrating a NameNode, the co-located Failover Controller must be migrated as well.
- Click the Destination Host text field and specify the host to which the roles will be migrated. On destination hosts, indicate whether to delete data in the NameNode data directories and JournalNode edits directory. If you choose not to delete data and such role data exists, the Migrate Roles command will not complete successfully.
- Acknowledge that the migration process incurs service downtime by selecting the Yes, I am ready to restart the cluster now checkbox.
- Click Continue. The Command Progress screen displays listing each step in the migration process.
- When the migration completes, click Finish.
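To make the hostname range shortcut used in the Source Host search more concrete, here is a minimal sketch of how a pattern such as host[07-10].company.com could expand into individual hosts. It only mirrors the examples in the range table. It is not Cloudera Manager's own implementation, and the function name is hypothetical.

```python
import re
from typing import List

# Matches a numeric range such as [1-4] or [07-10].
_RANGE = re.compile(r"\[(\d+)-(\d+)\]")


def expand_host_pattern(pattern: str) -> List[str]:
    """Expand bracketed numeric ranges in a hostname or IP pattern."""
    match = _RANGE.search(pattern)
    if match is None:
        return [pattern]
    start_text, end_text = match.group(1), match.group(2)
    start, end = int(start_text), int(end_text)
    if start > end:
        raise ValueError(f"empty range in pattern: {pattern}")
    # Assumption: keep zero padding when the lower bound has a leading zero,
    # which is what the host[07-10] example suggests.
    width = len(start_text) if len(start_text) > 1 and start_text.startswith("0") else 0
    prefix, suffix = pattern[:match.start()], pattern[match.end():]
    hosts = []
    for number in range(start, end + 1):
        # Recurse so that any further range in the suffix is expanded too.
        for tail in expand_host_pattern(suffix):
            hosts.append(prefix + str(number).zfill(width) + tail)
    return hosts
```

Calling expand_host_pattern("host[1-3].company.com") yields the three hosts listed in the table, which is a quick way to confirm that a pattern selects only the intended source host before typing it into the wizard.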
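Backup
Migrating a NameNode is risky. If the environment matters, always take backups before you start:
- Back up the NameNode metadata
- Back up the JournalNode metadata
- Back up the important data stored on HDFS
- Back up the Cloudera Manager (CM) MySQL database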
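The metadata backups can be as simple as a timestamped archive of each directory. The following is a minimal sketch using only the Python standard library. The function name and the example paths /dfs/nn, /dfs/jn and /backup/hdfs are assumptions, so substitute the directories configured for your own NameNode and JournalNode roles.

```python
import tarfile
import time
from pathlib import Path


def backup_directory(source: str, backup_root: str) -> Path:
    """Archive one metadata directory into backup_root as a .tar.gz file."""
    src = Path(source)
    if not src.is_dir():
        raise FileNotFoundError(f"not a directory: {src}")
    dest_dir = Path(backup_root)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # A timestamp in the file name keeps repeated backups from overwriting each other.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = dest_dir / f"{src.name}-{stamp}.tar.gz"
    with tarfile.open(str(archive), "w:gz") as tar:
        tar.add(str(src), arcname=src.name)
    return archive


# Example usage (hypothetical paths, run on each NameNode / JournalNode host):
# backup_directory("/dfs/nn", "/backup/hdfs")
# backup_directory("/dfs/jn", "/backup/hdfs")
```

This covers only the local metadata directories. The data on HDFS and the CM MySQL database need their own tools, for example a mysqldump of the CM database.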
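Requirements and limitations before migrating
- Federation (multiple NameNodes serving multiple namespaces) is not supported.
- The migration stops all components: they are stopped automatically one after another, and restarted automatically once the migration completes.
- The network between the source and destination hosts must be healthy, and the destination host must already have been added to the CDH cluster and verified by adding other roles to it without problems.
- All components must be running normally before the migration (the Failover Controller and JournalNode roles in particular must be running normally).
- The destination host must not already have a Failover Controller before the migration, because the migration moves the original Failover Controller over to it.
- The dfs/nn and dfs/jn directories must not yet exist on the destination host; they are created automatically during the migration.
- On the two source HA hosts, dfs/nn and dfs/jn must not contain any other files or folders (that is, anything you put there yourself; the NameNode's and JournalNode's own files and folders are fine), otherwise the migration will fail.
- Stop your workloads before migrating: stop the jobs submitted to YARN, Flink, Spark and so on, as well as any other business processes that depend on the big data platform.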
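The two directory requirements above are easy to check before starting the wizard. The sketch below is one possible preflight check. The helper names are hypothetical, and the set of entries treated as the role's own files (current and in_use.lock) is an assumption about a typical NameNode directory; a JournalNode directory usually contains one folder per nameservice instead, so pass the appropriate names for your layout.

```python
from pathlib import Path
from typing import Iterable, List

# Assumption: entries a NameNode normally creates in its own data directory.
ASSUMED_OWN_ENTRIES = frozenset({"current", "in_use.lock"})


def unexpected_entries(directory: str,
                       own_entries: Iterable[str] = ASSUMED_OWN_ENTRIES) -> List[str]:
    """List top-level entries in a source directory that the role did not create."""
    allowed = set(own_entries)
    return sorted(p.name for p in Path(directory).iterdir() if p.name not in allowed)


def existing_paths(paths: Iterable[str]) -> List[str]:
    """Return the paths that already exist; on the destination host this should be empty."""
    return [p for p in paths if Path(p).exists()]


# Example usage (hypothetical paths):
# unexpected_entries("/dfs/nn")              # on each source NameNode host
# existing_paths(["/dfs/nn", "/dfs/jn"])     # on the destination host
```

Anything returned by either function is worth moving out of the way before running the migration.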
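Migrating the roles
The migration consists of 25 steps. If everything goes smoothly, the migration is complete once they finish.
After the migration completes, run hdfs fsck / to check the cluster and its blocks.
Log in to the HDFS UI to check the NameNode status, the data blocks, and the storage size.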
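The fsck check can be scripted so that it is harder to overlook a bad result. The sketch below runs hdfs fsck on a path and looks for the final status line. The helper names are hypothetical, and the assumption is that fsck ends its report with a line of the form "The filesystem under path '/' is HEALTHY"; confirm this against the output of your Hadoop version.

```python
import subprocess


def fsck_reports_healthy(fsck_output: str) -> bool:
    """Return True if the fsck report contains a HEALTHY status line."""
    for line in fsck_output.splitlines():
        text = line.strip()
        # Assumed status line format: The filesystem under path '/' is HEALTHY
        if text.startswith("The filesystem under path") and text.endswith("is HEALTHY"):
            return True
    return False


def run_fsck(path: str = "/") -> str:
    """Run `hdfs fsck <path>` and return its standard output as text."""
    result = subprocess.run(["hdfs", "fsck", path],
                            capture_output=True, text=True, check=False)
    return result.stdout


# Example usage on a host with the HDFS client configured:
# print("healthy" if fsck_reports_healthy(run_fsck("/")) else "needs attention")
```

Splitting the parsing from the command invocation keeps the check testable on a machine without an HDFS client.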
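Unexpected failures
If a failure occurs while the roles are being migrated, fix it according to the migration step at which it failed.
If the failure happens at the "Initialize Shared Edits Directory" step or at a later step, back up the edits and fsimage files once more.
Errors at this point are usually caused by extra files in the nn and jn directories.
On the active NameNode, run Initialize Shared Edits Directory and then start the NameNode.
Next, on the standby NameNode, run Initialize Shared Edits Directory and then bootstrap the standby NameNode.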