Install a Hadoop cluster with Docker
Goal: also try backing up and restoring the data stored in the Docker containers.
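The backup → restore goal could look roughly like this, as a sketch using the container names from this walkthrough: `docker commit` snapshots a container's filesystem into an image, and `docker save` / `docker load` move that image between hosts as a tar archive. The image tag and tar path below are illustrative.

```shell
# Sketch only: snapshot hadoop-master and export it for later restore.
BACKUP_IMAGE="hadoop-master-backup:1.0"   # illustrative tag
BACKUP_TAR="hadoop-master-backup.tar"     # illustrative path

if command -v docker >/dev/null 2>&1 && docker inspect hadoop-master >/dev/null 2>&1; then
  # Backup: commit the running container to a local image, then export it.
  docker commit hadoop-master "$BACKUP_IMAGE"
  docker save -o "$BACKUP_TAR" "$BACKUP_IMAGE"
  # Restore (on this or another host): re-import the image, then run it.
  # docker load -i "$BACKUP_TAR"
else
  echo "docker/container not available; would run: docker commit hadoop-master $BACKUP_IMAGE"
fi
```

Note that `docker commit` captures the container filesystem only, not data in named volumes; volume data would need a separate tar-based copy.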
1. Prepare the environment
CentOS 7.6
Docker
2. Install Hadoop
[root@kylin ~]# docker pull kiwenlau/hadoop:1.0
Trying to pull repository docker.io/kiwenlau/hadoop ...
1.0: Pulling from docker.io/kiwenlau/hadoop
6c953ac5d795: Pull complete
3eed5ff20a90: Pull complete
f8419ea7c1b5: Pull complete
51900bc9e720: Pull complete
a3ed95caeb02: Pull complete
bd8785af34f8: Pull complete
bbc3db9806c0: Pull complete
10b317fed6db: Pull complete
ff167c65c3cc: Pull complete
b6f1a5a93aa5: Pull complete
21f0d52e6cae: Pull complete
35ebd7467cfb: Pull complete
Digest: sha256:e4fe1788c2845c857b98cec6bba0bbcd5ac9f97fd3d73088a17fd9a0c4017934
Status: Downloaded newer image for docker.io/kiwenlau/hadoop:1.0
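As a quick sanity check (not part of the original steps), you can confirm the image landed locally and that its digest matches the `Digest:` line printed by the pull:

```shell
# Sketch: list the pulled image with its digest and size.
IMAGE="kiwenlau/hadoop:1.0"

if command -v docker >/dev/null 2>&1; then
  docker images --digests kiwenlau/hadoop
  docker image inspect "$IMAGE" --format 'size: {{.Size}} bytes'
fi
echo "checked image: $IMAGE"
```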
3. Run Hadoop
[root@kylin ~]# git clone https://github.com/kiwenlau/hadoop-cluster-docker
Cloning into 'hadoop-cluster-docker'...
remote: Enumerating objects: 392, done.
remote: Total 392 (delta 0), reused 0 (delta 0), pack-reused 392
Receiving objects: 100% (392/392), 191.83 KiB | 217.00 KiB/s, done.
Resolving deltas: 100% (211/211), done.
[root@kylin ~]# docker network create --driver=bridge hadoop
91a4925c59e3f2b9493414bdbbd9ac89fa43fd9d17fc7cb8fa4356122f78bb68
[root@kylin ~]# cd hadoop-cluster-docker
[root@kylin hadoop-cluster-docker]# ./start-container.sh
start hadoop-master container...
start hadoop-slave1 container...
start hadoop-slave2 container...
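Before starting Hadoop it may be worth confirming, back on the host, that all three containers are up and attached to the `hadoop` bridge network (a sketch, not part of the script's output):

```shell
# Sketch: list the cluster containers and count attachments to the network.
NODES="hadoop-master hadoop-slave1 hadoop-slave2"

if command -v docker >/dev/null 2>&1; then
  docker ps --filter "name=hadoop" --format '{{.Names}}\t{{.Status}}'
  docker network inspect hadoop --format '{{len .Containers}} containers attached'
fi
echo "expected nodes: $NODES"
```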
root@hadoop-master:~#
root@hadoop-master:~# ./start-hadoop.sh
Starting namenodes on [hadoop-master]
hadoop-master: Warning: Permanently added 'hadoop-master,172.18.0.2' (ECDSA) to the list of known hosts.
hadoop-master: starting namenode, logging to /usr/local/hadoop/logs/hadoop-root-namenode-hadoop-master.out
hadoop-slave2: Warning: Permanently added 'hadoop-slave2,172.18.0.4' (ECDSA) to the list of known hosts.
hadoop-slave1: Warning: Permanently added 'hadoop-slave1,172.18.0.3' (ECDSA) to the list of known hosts.
hadoop-slave2: starting datanode, logging to /usr/local/hadoop/logs/hadoop-root-datanode-hadoop-slave2.out
hadoop-slave1: starting datanode, logging to /usr/local/hadoop/logs/hadoop-root-datanode-hadoop-slave1.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop/logs/hadoop-root-secondarynamenode-hadoop-master.out
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop/logs/yarn--resourcemanager-hadoop-master.out
hadoop-slave1: Warning: Permanently added 'hadoop-slave1,172.18.0.3' (ECDSA) to the list of known hosts.
hadoop-slave2: Warning: Permanently added 'hadoop-slave2,172.18.0.4' (ECDSA) to the list of known hosts.
hadoop-slave2: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-root-nodemanager-hadoop-slave2.out
hadoop-slave1: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-root-nodemanager-hadoop-slave1.out
root@hadoop-master:~# ls
hdfs run-wordcount.sh start-hadoop.sh
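Before running any jobs, a hedged health-check sketch inside the master container: `jps` lists the JVM daemons, and `hdfs dfsadmin -report` shows whether both datanodes registered with the namenode (command names follow this image's Hadoop layout; adjust if your build differs):

```shell
# Sketch: verify daemons and datanode registration on hadoop-master.
EXPECTED_DATANODES=2

if command -v jps >/dev/null 2>&1; then
  jps   # expect NameNode, SecondaryNameNode, ResourceManager on the master
fi
if command -v hdfs >/dev/null 2>&1; then
  hdfs dfsadmin -report | grep -E 'Live datanodes|Name:'
fi
echo "expecting $EXPECTED_DATANODES live datanodes"
```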
4. Test
root@hadoop-master:~# ./run-wordcount.sh
19/08/07 02:35:27 INFO client.RMProxy: Connecting to ResourceManager at hadoop-master/172.18.0.2:8032
19/08/07 02:35:29 INFO input.FileInputFormat: Total input paths to process : 2
19/08/07 02:35:29 INFO mapreduce.JobSubmitter: number of splits:2
19/08/07 02:35:30 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1565145208342_0001
19/08/07 02:35:31 INFO impl.YarnClientImpl: Submitted application application_1565145208342_0001
19/08/07 02:35:31 INFO mapreduce.Job: The url to track the job: http://hadoop-master:8088/proxy/application_1565145208342_0001/
19/08/07 02:35:31 INFO mapreduce.Job: Running job: job_1565145208342_0001
19/08/07 02:35:51 INFO mapreduce.Job: Job job_1565145208342_0001 running in uber mode : false
19/08/07 02:35:51 INFO mapreduce.Job: map 0% reduce 0%
19/08/07 02:36:05 INFO mapreduce.Job: map 100% reduce 0%
19/08/07 02:36:16 INFO mapreduce.Job: map 100% reduce 100%
19/08/07 02:36:16 INFO mapreduce.Job: Job job_1565145208342_0001 completed successfully
19/08/07 02:36:16 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=56
FILE: Number of bytes written=352398
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=258
HDFS: Number of bytes written=26
HDFS: Number of read operations=9
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=2
Launched reduce tasks=1
Data-local map tasks=2
Total time spent by all maps in occupied slots (ms)=22209
Total time spent by all reduces in occupied slots (ms)=8167
Total time spent by all map tasks (ms)=22209
Total time spent by all reduce tasks (ms)=8167
Total vcore-milliseconds taken by all map tasks=22209
Total vcore-milliseconds taken by all reduce tasks=8167
Total megabyte-milliseconds taken by all map tasks=22742016
Total megabyte-milliseconds taken by all reduce tasks=8363008
Map-Reduce Framework
Map input records=2
Map output records=4
Map output bytes=42
Map output materialized bytes=62
Input split bytes=232
Combine input records=4
Combine output records=4
Reduce input groups=3
Reduce shuffle bytes=62
Reduce input records=4
Reduce output records=3
Spilled Records=8
Shuffled Maps =2
Failed Shuffles=0
Merged Map outputs=2
GC time elapsed (ms)=176
CPU time spent (ms)=4190
Physical memory (bytes) snapshot=781901824
Virtual memory (bytes) snapshot=2653466624
Total committed heap usage (bytes)=500695040
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=26
File Output Format Counters
Bytes Written=26
input file1.txt:
Hello Hadoop
input file2.txt:
Hello Docker
wordcount output:
Docker 1
Hadoop 1
Hello 2
root@hadoop-master:~#
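For reference, `run-wordcount.sh` boils down to roughly the following (a sketch: the examples-jar path assumes this image's Hadoop 2.7 layout, and `output` must not already exist in HDFS):

```shell
# Sketch: the manual equivalent of run-wordcount.sh.
echo "Hello Hadoop" > file1.txt
echo "Hello Docker" > file2.txt

if command -v hdfs >/dev/null 2>&1 && command -v hadoop >/dev/null 2>&1; then
  hdfs dfs -mkdir -p input
  hdfs dfs -put file1.txt file2.txt input
  # Run the bundled wordcount example over the input directory.
  hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar \
    wordcount input output
  hdfs dfs -cat output/part-r-00000
else
  echo "hadoop tools not on PATH; run this inside the hadoop-master container"
fi
```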