Steps for migrating Kafka data on Debian
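Install Java (Kafka 3.5 runs on Java 8, 11, or 17; recent Debian releases may not ship openjdk-8-jdk, in which case default-jdk can be substituted):
sudo apt update && sudo apt install -y openjdk-8-jdk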
Download Kafka and extract it with the tar command:
wget https://downloads.apache.org/kafka/3.5.2/kafka_2.12-3.5.2.tgz
tar -xzf kafka_2.12-3.5.2.tgz -C /opt/
Edit the /etc/profile file (as root) to add the Kafka path:
echo 'export KAFKA_HOME=/opt/kafka_2.12-3.5.2' >> /etc/profile
echo 'export PATH=$PATH:$KAFKA_HOME/bin' >> /etc/profile
source /etc/profile # apply the configuration
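Running the echo commands above more than once appends duplicate lines to /etc/profile. A minimal sketch of an idempotent alternative follows; the helper name append_once is hypothetical and not part of Kafka or Debian.

```shell
# append_once FILE LINE
# Append LINE to FILE unless an identical line is already present.
# (Hypothetical helper, shown only as a sketch.)
append_once() {
    file=$1
    line=$2
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex)
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Usage (as root), with the paths used in the steps above:
#   append_once /etc/profile 'export KAFKA_HOME=/opt/kafka_2.12-3.5.2'
#   append_once /etc/profile 'export PATH=$PATH:$KAFKA_HOME/bin'
```

Calling the helper repeatedly leaves a single copy of each line in the file.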
To migrate partitions within the same Kafka cluster (for example, after adding a new broker node), use the kafka-reassign-partitions.sh tool:
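Edit the new broker's server.properties file: assign it a unique ID (e.g. broker.id=3), configure listeners, log.dirs, and related parameters, then start the new broker.
Create a JSON file (reassign.json) that lists the topics whose partitions should be moved, then run the tool with --generate to produce a plan for the target brokers:
cat reassign.json
# Sample content: {"version":1,"topics":[{"topic":"test_topic"}]}
kafka-reassign-partitions.sh --bootstrap-server localhost:9092 --generate --topics-to-move-json-file reassign.json --broker-list "1,2,3" > reassign-plan.json
The --zookeeper option was removed in Kafka 3.0, so with Kafka 3.5.2 the tool connects through --bootstrap-server (localhost:9092 here assumes a broker listening on the default port). The --generate output contains both the current and the proposed assignment; keep only the proposed JSON in reassign-plan.json.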
Execute the migration using reassign-plan.json:
kafka-reassign-partitions.sh --bootstrap-server localhost:9092 --execute --reassignment-json-file reassign-plan.json
Verify that the reassignment has completed:
kafka-reassign-partitions.sh --bootstrap-server localhost:9092 --verify --reassignment-json-file reassign-plan.json
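The --generate mode prints the current assignment and the proposed one, each under a header line, so its raw output is not a valid reassignment file. The sketch below keeps only the proposed JSON; the function name extract_proposed_plan and the file name generate-output.txt are assumptions made for illustration.

```shell
# extract_proposed_plan
# Read the text printed by `kafka-reassign-partitions.sh --generate` on stdin
# and print only the first non-empty line that follows the
# "Proposed partition reassignment configuration" header.
extract_proposed_plan() {
    awk '
        /^Proposed partition reassignment configuration/ { found = 1; next }
        found && NF { print; exit }
    '
}

# Usage, assuming the --generate output was saved to generate-output.txt:
#   extract_proposed_plan < generate-output.txt > reassign-plan.json
```

The current assignment printed by --generate is worth keeping as well, since it can serve as the input for a rollback.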
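MirrorMaker mirrors data across clusters and is suited to large-scale data synchronization:
Create the mirror-maker.properties file and configure the addresses of the source cluster (bootstrap.servers=source-kafka:9092) and the target cluster (target.bootstrap.servers=target-kafka:9092):
# Source cluster configuration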
bootstrap.servers=source-kafka:9092
# Target cluster configuration
target.bootstrap.servers=target-kafka:9092
# Consumer group
group.id=mirror-maker-group
# Topic whitelist (optional)
topics=.*
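Start MirrorMaker. The legacy kafka-mirror-maker.sh script reads the source cluster settings (bootstrap.servers, group.id) from the file passed to --consumer.config and the target cluster's bootstrap.servers from the file passed to --producer.config, so the settings above need to be split across consumer.properties and producer.properties:
kafka-mirror-maker.sh --consumer.config consumer.properties --producer.config producer.properties --whitelist ".*"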
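One way to judge whether the mirror has caught up is to total the LAG column that kafka-consumer-groups.sh --describe prints for the MirrorMaker consumer group. The following is a sketch only: the function name total_lag is hypothetical, and it assumes the tabular output with a header row containing a LAG column.

```shell
# total_lag
# Read `kafka-consumer-groups.sh --describe` output on stdin and print the
# sum of the LAG column. The column is located through the header row, and
# rows whose lag is not a number (shown as "-") are skipped.
total_lag() {
    awk '
        lagcol == 0 {
            for (i = 1; i <= NF; i++) if ($i == "LAG") lagcol = i
            next
        }
        $lagcol ~ /^[0-9]+$/ { sum += $lagcol }
        END { print sum + 0 }
    '
}

# Usage, with the group name and source address from the configuration above:
#   kafka-consumer-groups.sh --bootstrap-server source-kafka:9092 \
#       --describe --group mirror-maker-group | total_lag
```

A total of 0 while producers on the source cluster are paused suggests that every record consumed so far has been committed by MirrorMaker.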
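Use the kafka-consumer-groups.sh tool to check consumption progress and confirm data consistency.
Kafka Connect suits scenarios that need real-time synchronization of database changes or complex data flows:
Deploy Kafka Connect (for example, from a docker-compose.yaml file).
Create a connector configuration file (mysql-source.json) that specifies the database connection details and the topics to sync:
{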
"name": "mysql-source",
"config": {
"connector.class": "io.debezium.connector.mysql.MySqlConnector",
"tasks.max": "1",
"database.hostname": "mysql-host",
"database.port": "3306",
"database.user": "user",
"database.password": "password",
"database.server.id": "184054",
"database.server.name": "mysql-server",
"table.include.list": "db.table1,db.table2",
"topic.prefix": "mysql-"
}
}
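A connector definition such as mysql-source.json is usually submitted to the Kafka Connect REST API. The sketch below assumes a Connect worker reachable at http://localhost:8083 (the default REST port); the function names register_connector and connector_status are hypothetical.

```shell
# register_connector FILE [CONNECT_URL]
# POST a connector definition to the Kafka Connect REST API.
# CONNECT_URL defaults to http://localhost:8083 (an assumption).
register_connector() {
    config_file=$1
    connect_url=${2:-http://localhost:8083}
    curl -sS -X POST \
        -H 'Content-Type: application/json' \
        --data @"$config_file" \
        "$connect_url/connectors"
}

# connector_status NAME [CONNECT_URL]
# Fetch the state of a connector and its tasks.
connector_status() {
    name=$1
    connect_url=${2:-http://localhost:8083}
    curl -sS "$connect_url/connectors/$name/status"
}

# Usage:
#   register_connector mysql-source.json
#   connector_status mysql-source
```

The status response reports a state for the connector and for each task, which makes a failed task visible before any data comparison is attempted.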
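Use kafka-console-consumer.sh to consume the same topic from the source and target clusters and compare the contents. The --timeout-ms option makes each consumer exit once no new message has arrived for 10 seconds; without it the command never returns:
# Consume from the source cluster
kafka-console-consumer.sh --bootstrap-server source:9092 --topic test_topic --from-beginning --timeout-ms 10000 | tee source-data.txt
# Consume from the target cluster
kafka-console-consumer.sh --bootstrap-server target:9092 --topic test_topic --from-beginning --timeout-ms 10000 | tee target-data.txt
diff source-data.txt target-data.txt # compare the differences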
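A topic with more than one partition has no global ordering, so the two dumps can contain the same records in a different sequence and a plain diff may report false differences. The sketch below compares the files after sorting them; the function name compare_unordered is hypothetical.

```shell
# compare_unordered FILE_A FILE_B
# Return 0 when both files contain the same lines regardless of order.
# Otherwise print the diff of the sorted copies and return diff's status.
compare_unordered() {
    a_sorted=$(mktemp) && b_sorted=$(mktemp) || return 2
    sort "$1" > "$a_sorted"
    sort "$2" > "$b_sorted"
    diff "$a_sorted" "$b_sorted"
    status=$?
    rm -f "$a_sorted" "$b_sorted"
    return $status
}

# Usage:
#   compare_unordered source-data.txt target-data.txt && echo "contents match"
```

Sorting keeps duplicate lines, so a record that was mirrored twice still shows up as a difference.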
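Update the bootstrap.servers setting of the client applications to point to the target cluster address, then restart the clients. Log segments can be exported with the kafka-dump-log.sh tool.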