Hadoop
Introduction
Components
HDFS: distributed file storage system, consisting of two components, the NameNode and the DataNode.
Config file: $HADOOP_HOME/etc/hadoop/hdfs-site.xml
Start/stop the daemons: $HADOOP_HOME/sbin/hadoop-daemon.sh start/stop namenode | datanode
YARN: resource management and scheduling.
Config file: $HADOOP_HOME/etc/hadoop/yarn-site.xml
Start/stop the daemons: $HADOOP_HOME/sbin/yarn-daemon.sh start/stop resourcemanager | nodemanager
MapReduce JobHistory Server: serves the records of completed MapReduce jobs, such as how many map and reduce tasks a job used, its submission time, start time, and completion time.
Config file: $HADOOP_HOME/etc/hadoop/mapred-site.xml
Start the daemon: $HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver
mapred-site.xml:
<configuration>
  <property>
    <name>yarn.app.mapreduce.am.staging-dir</name>
    <value>/tmp/hadoop-yarn/staging</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.done-dir</name>
    <value>${yarn.app.mapreduce.am.staging-dir}/history/done</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.intermediate-done-dir</name>
    <value>${yarn.app.mapreduce.am.staging-dir}/history/done_intermediate</value>
  </property>
</configuration>
Installation
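The original installation steps were lost here; a minimal single-node setup sketch, assuming a tarball install of Hadoop 2.7.6 (the version used in the wordcount example below) and an OpenJDK 8 path that may differ on your system:

```shell
# Unpack the Hadoop distribution and put its commands on the PATH.
tar -xzf hadoop-2.7.6.tar.gz -C /opt
export HADOOP_HOME=/opt/hadoop-2.7.6
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

# Hadoop needs JAVA_HOME, either exported in the shell or set in
# $HADOOP_HOME/etc/hadoop/hadoop-env.sh (path below is an assumption).
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64

# Format the NameNode once before the first start, then bring up HDFS and YARN.
hdfs namenode -format
$HADOOP_HOME/sbin/start-dfs.sh
$HADOOP_HOME/sbin/start-yarn.sh
```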
wordcount
$HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.6.jar wordcount /input /output
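A typical end-to-end run around the command above, as a sketch: the input directory must contain the text to count, and the job fails if the output directory already exists.

```shell
# Put some local text files into the HDFS input directory.
$HADOOP_HOME/bin/hdfs dfs -mkdir -p /input
$HADOOP_HOME/bin/hdfs dfs -put *.txt /input

# Remove any output from previous runs, then run the example job.
$HADOOP_HOME/bin/hdfs dfs -rm -r -f /output
$HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.6.jar wordcount /input /output

# Inspect the per-word counts produced by the single reducer.
$HADOOP_HOME/bin/hdfs dfs -cat /output/part-r-00000
```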
Hadoop HA configuration
The Hadoop NameNode is a single point of failure; to address this, configure an HA (high availability) scheme for it.
hdfs-site.xml
<!-- Logical name of the HA nameservice -->
<property>
  <name>dfs.nameservices</name>
  <value>ha-cluster</value>
</property>
<!-- NameNodes that belong to the nameservice -->
<property>
  <name>dfs.ha.namenodes.ha-cluster</name>
  <value>nn1,nn2</value>
</property>
<!-- NameNode RPC listen addresses -->
<property>
  <name>dfs.namenode.rpc-address.ha-cluster.nn1</name>
  <value>machine1.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.ha-cluster.nn2</name>
  <value>machine2.example.com:8020</value>
</property>
<!-- NameNode HTTP listen addresses -->
<property>
  <name>dfs.namenode.http-address.ha-cluster.nn1</name>
  <value>machine1.example.com:50070</value>
</property>
<property>
  <name>dfs.namenode.http-address.ha-cluster.nn2</name>
  <value>machine2.example.com:50070</value>
</property>
<!-- URI of the JournalNode group the NameNodes write and read edits through -->
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://jnode1.example.com:8485;jnode2.example.com:8485;jnode3.example.com:8485/ha-cluster</value>
</property>
<!-- The property suffix must match the nameservice name (ha-cluster) -->
<property>
  <name>dfs.client.failover.proxy.provider.ha-cluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/home/exampleuser/.ssh/id_rsa</value>
</property>
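The properties above only enable manual failover. To get automatic failover via ZKFC, hdfs-site.xml also needs the following property (an addition not in the original notes; it requires a ZooKeeper quorum configured through ha.zookeeper.quorum in core-site.xml):

```xml
<!-- Enable ZKFC-based automatic failover (assumed addition) -->
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>
```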
core-site.xml
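The core-site.xml content was lost here; a minimal sketch under the same assumptions as the hdfs-site.xml above (nameservice ha-cluster; the three ZooKeeper hostnames are hypothetical):

```xml
<configuration>
  <!-- The default filesystem points at the HA nameservice, not a single NameNode -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://ha-cluster</value>
  </property>
  <!-- ZooKeeper quorum used by the ZKFCs for automatic failover
       (hostnames are assumptions) -->
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value>
  </property>
</configuration>
```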
Startup commands
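The original command list was lost; a first-start sequence sketch for the HA cluster configured above (hostnames as in hdfs-site.xml; run each step on the host named in its comment):

```shell
# 1. On each JournalNode host (jnode1-3): start the JournalNode daemon.
$HADOOP_HOME/sbin/hadoop-daemon.sh start journalnode

# 2. On nn1: format HDFS (first start only) and start the first NameNode.
$HADOOP_HOME/bin/hdfs namenode -format
$HADOOP_HOME/sbin/hadoop-daemon.sh start namenode

# 3. On nn2: copy the metadata from nn1, then start the standby NameNode.
$HADOOP_HOME/bin/hdfs namenode -bootstrapStandby
$HADOOP_HOME/sbin/hadoop-daemon.sh start namenode

# 4a. With manual failover (no ZKFC): transition nn1 to active.
$HADOOP_HOME/bin/hdfs haadmin -transitionToActive nn1

# 4b. With automatic failover instead: format the ZooKeeper znode once,
#     then start a ZKFC next to each NameNode.
$HADOOP_HOME/bin/hdfs zkfc -formatZK
$HADOOP_HOME/sbin/hadoop-daemon.sh start zkfc
```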
Hadoop prometheus exporter
$HADOOP_HOME/etc/hadoop/hadoop-env.sh
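The hadoop-env.sh snippet was lost here; a common approach (an assumption, not necessarily what the original used) is to attach the Prometheus jmx_exporter Java agent to each daemon via the per-daemon OPTS variables in hadoop-env.sh:

```shell
# Attach the Prometheus jmx_exporter agent so each daemon exposes its JMX
# metrics over HTTP for Prometheus to scrape. The jar path, ports, and
# config-file paths below are assumptions; adjust for your deployment.
export HADOOP_NAMENODE_OPTS="$HADOOP_NAMENODE_OPTS \
  -javaagent:/opt/jmx_prometheus_javaagent.jar=7001:/opt/namenode.yaml"
export HADOOP_DATANODE_OPTS="$HADOOP_DATANODE_OPTS \
  -javaagent:/opt/jmx_prometheus_javaagent.jar=7002:/opt/datanode.yaml"
```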