# 在Ubuntu上部署Hadoop單節點
### 安裝Java環境
Hadoop框架是依賴Java環境的,直接用最簡單的方式,直接安裝OpenJDK。
~~~
# 更新一下源列表
fish@hadoop:~$ sudo apt-get update
# 安裝OpenJDK
fish@hadoop:~$ sudo apt-get install default-jdk
# 測試Java環境
fish@hadoop:~$ java -version
java version "1.7.0_65"
OpenJDK Runtime Environment (IcedTea 2.5.3) (7u71-2.5.3-0ubuntu0.14.04.1)
OpenJDK 64-Bit Server VM (build 24.65-b04, mixed mode)
~~~
### 安裝SSH
Hadoop要依賴于SSH來管理節點,下面安裝SSH。
~~~
# 安裝SSH
fish@hadoop:~$ sudo apt-get install ssh
# 測試SSH
fish@hadoop:~$ which ssh
/usr/bin/ssh
fish@hadoop:~$ which sshd
/usr/sbin/sshd
~~~
### 添加Hadoop用戶
為Hadoop添加專門的用戶。
~~~
# 添加用戶
fish@hadoop:~$ sudo addgroup hadoop
Adding group `hadoop' (GID 1002) ...
Done.
fish@hadoop:~$ sudo adduser --ingroup hadoop hduser
Adding user `hduser' ...
Adding new user `hduser' (1001) with group `hadoop' ...
Creating home directory `/home/hduser' ...
Copying files from `/etc/skel' ...
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
Changing the user information for hduser
Enter the new value, or press ENTER for the default
Full Name []:
Room Number []:
Work Phone []:
Home Phone []:
Other []:
Is the information correct? [Y/n] Y
# 添加root權限
fish@hadoop:~$ sudo adduser hduser sudo
[sudo] password for fish:
Adding user `hduser' to group `sudo' ...
Adding user hduser to group sudo
Done.
~~~
### 設置SSH密鑰
~~~
# 切換用戶
fish@hadoop:/home$ su hduser
Password:
hduser@hadoop:/home$ cd ~
# 生成密鑰
hduser@hadoop:~$ ssh-keygen -t rsa -P ""
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hduser/.ssh/id_rsa):
Created directory '/home/hduser/.ssh'.
Your identification has been saved in /home/hduser/.ssh/id_rsa.
Your public key has been saved in /home/hduser/.ssh/id_rsa.pub.
The key fingerprint is:
93:94:8f:3b:8e:c8:2b:c0:52:96:6c:8c:61:54:46:03 hduser@hadoop-7260
The key's randomart image is:
+--[ RSA 2048]----+
|.E+= |
|... . . |
|.= . o |
|. B . + |
|.+ S . |
|o. o |
|.. o |
| .. . o . |
| .+.. . |
+-----------------+
'
# 添加到認證密鑰中,避免每次都要輸密碼
hduser@hadoop:~$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
~~~
### 安裝Hadoop
~~~
# 下載解壓Hadoop
hduser@hadoop:~$ wget http://mirrors.sonic.net/apache/hadoop/common/hadoop-2.7.1/hadoop-2.7.1.tar.gz
hduser@hadoop:~$ tar xvzf hadoop-2.7.1.tar.gz
# 創建存放目錄
hduser@hadoop:~$ sudo mkdir /usr/local/hadoop
hduser@hadoop:~$ cd hadoop-2.7.1/
hduser@hadoop:~/hadoop-2.7.1$ sudo mv * /usr/local/hadoop
hduser@hadoop:~/hadoop-2.7.1$ sudo chown -R hduser:hadoop /usr/local/hadoop
~~~
### 設置配置文件
Hadoop還有很多配置文件要設置。
##### 1. ~/.bashrc
在bashrc文件中添加如下內容:
~~~
#HADOOP VARIABLES START
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
#HADOOP VARIABLES END
~~~
~~~
# 查看Java路徑
hduser@hadoop:~$ update-alternatives --config java
There is only one alternative in link group java (providing /usr/bin/java): /usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java
Nothing to configure.
# 編輯~/.bashrc
hduser@hadoop:~$ vi ~/.bashrc
hduser@hadoop:~$ source ~/.bashrc
~~~
##### 2. /usr/local/hadoop/etc/hadoop/hadoop-env.sh
在hadoop-env.sh文件中添加如下內容:
~~~
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
~~~
~~~
hduser@hadoop:~$ vi /usr/local/hadoop/etc/hadoop/hadoop-env.sh
~~~
##### 3. /usr/local/hadoop/etc/hadoop/core-site.xml
將以下內容添加到core-site.xml文件中的`<configuration>`標簽中:
~~~
<property>
<name>hadoop.tmp.dir</name>
<value>/app/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>
~~~
~~~
hduser@hadoop:~$ sudo mkdir -p /app/hadoop/tmp
hduser@hadoop:~$ sudo chown hduser:hadoop /app/hadoop/tmp
hduser@hadoop:~$ vi /usr/local/hadoop/etc/hadoop/core-site.xml
~~~
##### 4. /usr/local/hadoop/etc/hadoop/mapred-site.xml
將以下內容添加到mapred-site.xml文件中的`<configuration>`標簽中:
~~~
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
<description>The host and port that the MapReduce job tracker runs
at. If "local", then jobs are run in-process as a single map
and reduce task.
</description>
</property>
~~~
~~~
hduser@hadoop:~$ cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml
hduser@hadoop:~$ vi /usr/local/hadoop/etc/hadoop/mapred-site.xml
~~~
##### 5. /usr/local/hadoop/etc/hadoop/hdfs-site.xml
將以下內容添加到hdfs-site.xml文件中的`<configuration>`標簽中:
~~~
<property>
<name>dfs.replication</name>
<value>1</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/datanode</value>
</property>
~~~
~~~
hduser@hadoop:~$ sudo mkdir -p /usr/local/hadoop_store/hdfs/namenode
hduser@hadoop:~$ sudo mkdir -p /usr/local/hadoop_store/hdfs/datanode
hduser@hadoop:~$ sudo chown -R hduser:hadoop /usr/local/hadoop_store
hduser@hadoop:~$ vi /usr/local/hadoop/etc/hadoop/hdfs-site.xml
~~~
### 格式化Hadoop文件系統
在開始使用之前,我們要格式化Hadoop文件系統。
~~~
hduser@hadoop:~$ hadoop namenode -format
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
15/11/29 19:49:27 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = hadoop/10.64.253.22
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 2.7.1
......
15/11/29 19:49:29 INFO util.ExitUtil: Exiting with status 0
15/11/29 19:49:29 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hadoop-7260.lvs01.dev.ebayc3.com/10.64.253.22
************************************************************/
~~~
### 啟動Hadoop
~~~
hduser@hadoop-7260:~$ cd /usr/local/hadoop/sbin/
// 查看所有的命令集
hduser@hadoop-7260:/usr/local/hadoop/sbin$ ls
distribute-exclude.sh start-all.cmd stop-balancer.sh
hadoop-daemon.sh start-all.sh stop-dfs.cmd
hadoop-daemons.sh start-balancer.sh stop-dfs.sh
hdfs-config.cmd start-dfs.cmd stop-secure-dns.sh
hdfs-config.sh start-dfs.sh stop-yarn.cmd
httpfs.sh start-secure-dns.sh stop-yarn.sh
kms.sh start-yarn.cmd yarn-daemon.sh
mr-jobhistory-daemon.sh start-yarn.sh yarn-daemons.sh
refresh-namenodes.sh stop-all.cmd
slaves.sh stop-all.sh
// 啟動Hadoop
hduser@hadoop:/usr/local/hadoop/sbin$ start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
15/11/29 19:56:23 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
......
# 檢查啟動情況
hduser@hadoop-7260:/usr/local/hadoop/sbin$ jps
5814 ResourceManager
5620 SecondaryNameNode
5401 DataNode
5951 NodeManager
5239 NameNode
6837 Jps
~~~
至此Hadoop在Ubuntu中的搭建工作完成。