Pseudo-distributed Hadoop installation on CentOS 7
Unless otherwise noted, all commands are run as root.
1. Install the matching JDK version
For example, on my machine:
[root@localhost ~]# whereis java
java: /usr/bin/java /usr/lib/java /etc/java /usr/share/java /usr/java/jdk1.8.0_45/bin/java /usr/share/man/man1/java.1 /usr/share/man/man1/java.1.gz
Configure the Java environment variables. Edit /etc/profile (vi /etc/profile) and add:
export JAVA_HOME=/usr/java/jdk1.8.0_45
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
#export HADOOP_HOME=/opt/hadoop/hadoop-1.2.1
export PATH=$PATH:$JAVA_HOME/bin
#:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Apply the changes: source /etc/profile
Verify:
[root@localhost ~]# java -version
java version "1.7.0_85"
OpenJDK Runtime Environment (rhel-2.6.1.2.el7_1-x86_64 u85-b01)
OpenJDK 64-Bit Server VM (build 24.85-b03, mixed mode)
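Note that the sample output above still reports the system OpenJDK 1.7, not the JDK that JAVA_HOME points to: because PATH appends $JAVA_HOME/bin after the system directories, /usr/bin/java wins. A minimal fix, assuming the stock CentOS OpenJDK package naming (either line alone is enough):
export PATH=$JAVA_HOME/bin:$PATH   # put the JDK first on the PATH
yum remove -y "java-1.7.0-openjdk*"   # or remove the bundled OpenJDK entirely (as root)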
2. Hadoop user and directories
Create the hadoop user:
useradd hadoop
echo hadoop | passwd --stdin hadoop
Create the working directories (mkdir -p creates the parents in one go) and hand them to the hadoop user:
mkdir -p /hadoop/hdfs/data /hadoop/hdfs/name /hadoop/mapred/local /hadoop/mapred/system /hadoop/tmp
chown -R hadoop /hadoop
Configure SSH for passwordless login:
su - hadoop
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub>> ~/.ssh/authorized_keys
cd /home/hadoop/.ssh
chmod 600 authorized_keys
Verify (as the hadoop user): su - hadoop ; ssh localhost
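The very first ssh localhost will ask you to accept the host key; answer yes. After that, logins should not prompt for a password. A quick check:
ssh localhost 'echo passwordless ok'   # should print without a password prompt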
3. Configure Hadoop
mkdir /opt/hadoop
cp hadoop-1.2.1.tar.gz /opt/hadoop
cd /opt/hadoop
tar -zxvf hadoop-1.2.1.tar.gz
chown -R hadoop /opt/hadoop
Switch to the hadoop user and edit the configuration files:
su - hadoop
cd /opt/hadoop/hadoop-1.2.1/conf
vim core-site.xml and set:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/hadoop/tmp</value>
  </property>
</configuration>
vim hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/hadoop/hdfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/hadoop/hdfs/data</value>
  </property>
</configuration>
vim mapred-site.xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
vim hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.8.0_45
export HADOOP_HOME_WARN_SUPPRESS="TRUE"
Then apply it: source hadoop-env.sh
Finally, update /etc/profile with the Hadoop entries (uncommenting the lines added earlier):
export JAVA_HOME=/usr/java/jdk1.8.0_45
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export HADOOP_HOME=/opt/hadoop/hadoop-1.2.1
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
source /etc/profile
4. Start Hadoop
Sanity check:
hadoop version
Format the NameNode first:
hadoop namenode -format
If asked to confirm a re-format, answer with an uppercase Y (the 1.x prompt is case-sensitive), then start everything with:
start-all.sh
Check with jps. A sample session:
[hadoop@localhost ~]$ start-all.sh
starting namenode, logging to /opt/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-namenode-localhost.localdomain.out
localhost: starting datanode, logging to /opt/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-datanode-localhost.localdomain.out
localhost: starting secondarynamenode, logging to /opt/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-secondarynamenode-localhost.localdomain.out
starting jobtracker, logging to /opt/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-jobtracker-localhost.localdomain.out
localhost: starting tasktracker, logging to /opt/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-tasktracker-localhost.localdomain.out
[hadoop@localhost ~]$ jps
13104 TaskTracker
12277 NameNode
12681 SecondaryNameNode
13337 Jps
12476 DataNode
12878 JobTracker
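With all five daemons up, a short end-to-end smoke test is worthwhile. A minimal sketch, assuming the examples jar and README.txt bundled in the 1.2.1 tarball; /input and /output are arbitrary HDFS paths chosen here:
hadoop fs -mkdir /input
hadoop fs -put /opt/hadoop/hadoop-1.2.1/README.txt /input
hadoop jar /opt/hadoop/hadoop-1.2.1/hadoop-examples-1.2.1.jar wordcount /input /output
hadoop fs -cat '/output/part-*'   # word counts from README.txt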
Log files are under:
/opt/hadoop/hadoop-1.2.1/logs
Web UIs:
localhost:50030/ for the JobTracker
localhost:50070/ for the NameNode
localhost:50060/ for the TaskTracker
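On a stock CentOS 7 machine, firewalld may block these ports for remote browsers (access from localhost itself is unaffected). If you need to reach the UIs from another host, open the ports, e.g. for the NameNode UI (as root):
firewall-cmd --permanent --add-port=50070/tcp
firewall-cmd --reload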
A fully distributed installation is similar to the pseudo-distributed one; note the following:
1. In the configuration files, use concrete IP addresses rather than localhost.
2. Configure the masters and slaves files by adding the relevant IP addresses (see the sketch below).
The configuration above must be kept identical on every node.
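For illustration, a hypothetical three-node layout (the IP addresses are placeholders): in Hadoop 1.x, conf/masters names the host that runs the SecondaryNameNode, and conf/slaves lists the DataNode/TaskTracker hosts.
# conf/masters
192.168.1.10
# conf/slaves
192.168.1.11
192.168.1.12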
Times have changed: the mainstream version is now 2.x, so here is the 2.x configuration process.
Make a copy of the configuration directory (assume it is etc/hadoop), e.g. to etc/hadoop1, then copy the original *-site.xml files, capacity-scheduler.xml, and slaves into the new directory. Edit the configuration as follows (each property snippet below belongs inside the file's <configuration> element):
hdfs-site.xml:
<configuration>
  <!-- Block replication factor; the default is 3 -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <!-- Whether HDFS permission checking is enabled; default: true -->
  <!--
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
  -->
</configuration>
core-site.xml
<!-- NameNode address (with no port given, the default RPC port 8020 is used) -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://localhost</value>
</property>
<!-- Directory where HDFS keeps its data; defaults to the Linux tmp directory -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/root/xxx/xxx</value>
</property>
mapred-site.xml (in a stock 2.x distribution, create it from mapred-site.xml.template if it does not already exist)
<!-- Run MapReduce on the YARN framework -->
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
yarn-site.xml
<!-- ResourceManager address -->
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>localhost or the host's IP address</value>
</property>
<!-- Auxiliary service the NodeManagers use to run MapReduce shuffles -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
After configuration:
Format the NameNode: hdfs namenode -format
Then start, in order:
start-dfs.sh [--config xxx]
start-yarn.sh [--config xxx]
mr-jobhistory-daemon.sh [--config xxx] start historyserver
Note: if HADOOP_CONF_DIR is set, the --config option above can be omitted.
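For example, with the configuration directory used in the environment variables below:
start-dfs.sh --config /root/hadoop-2.9.2/etc/hadoop1
start-yarn.sh --config /root/hadoop-2.9.2/etc/hadoop1
mr-jobhistory-daemon.sh --config /root/hadoop-2.9.2/etc/hadoop1 start historyserver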
Environment variables:
export JAVA_HOME=/usr/local/jdk1.8.0_191
export CLASSPATH=$CLASSPATH:$JAVA_HOME/lib/
export PATH=$PATH:$JAVA_HOME/bin
export HADOOP_CONF_DIR=/root/hadoop-2.9.2/etc/hadoop1/
export HADOOP_HOME=/root/hadoop-2.9.2
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Check the logs, or open a browser: localhost:50070 for the NameNode, localhost:8088 for the ResourceManager, localhost:19888 for the JobHistory Server.
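As a quick smoke test of the 2.x setup, the bundled examples jar can run a small job on YARN; a sketch, assuming the 2.9.2 layout implied by HADOOP_HOME above:
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.2.jar pi 2 10   # estimate pi with 2 maps, 10 samples each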
Stop the daemons:
stop-dfs.sh [--config xxx]
stop-yarn.sh [--config xxx]
mr-jobhistory-daemon.sh [--config xxx] stop historyserver