Install Hortonworks HDP 3.1.0 on a Cluster of VMware Virtual Machines
This post describes how to install Hortonworks HDP 3.1.0 on a cluster of three VMware virtual machines. The process has four major steps: 1) set up the cluster environment; 2) set up a local repository for both the Ambari and HDP stacks; 3) install the Ambari server and agent; 4) install, configure and deploy the cluster.
This installation process might work for other versions too. Please check the product versions against the Hortonworks support matrix: https://supportmatrix.hortonworks.com/
Virtual Nodes Information
The cluster consists of three virtual machines in VMware. Red Hat Enterprise Linux 7.6 has been installed on each node.
1. Prepare Cluster Environment
1.1. Setting Proxy
If you are behind a proxy, you need to specify your proxy server information, because many repositories, including yum, are accessed through the proxy server. For each host, you can set the proxy server information by adding it to /root/.bashrc. Remember to run source .bashrc afterwards to refresh the environment.
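As a minimal sketch of the proxy setup, the helper below prints the export lines you could append to /root/.bashrc. The function name proxy_exports and the host proxy.example.com with port 8080 are placeholders for illustration, not values from this setup; http_proxy and https_proxy are the conventional variable names that most command-line tools read.

```shell
# Print the export lines for a proxy at host $1 and port $2.
# Both arguments are placeholders; substitute your own proxy details.
proxy_exports() {
  printf 'export http_proxy=http://%s:%s\n' "$1" "$2"
  printf 'export https_proxy=http://%s:%s\n' "$1" "$2"
}

# Example (run as root on each host):
#   proxy_exports proxy.example.com 8080 >> /root/.bashrc
#   source /root/.bashrc
```

Printing the lines instead of writing them directly lets you review the output before it is appended to the file.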
1.2. Make sure run level is multi-user text mode
Run the following command to check the run level:
systemctl get-default
The expected response is multi-user.target. If it is not, run the following command to set the run level to multi-user text mode:
systemctl set-default multi-user.target
1.3. Check and set hostnames
For each host in the cluster, confirm that the hostname is set to a Fully Qualified Domain Name (FQDN) by running the following command:
hostname -f
This should return a host name in a format like fully.qualified.domain.name. You can use the hostname command to set the hostname, for example:
hostname hadoop-master.qualified.domain.name
Edit Network Configuration File
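For each host in the cluster, open the network configuration file (/etc/sysconfig/network on RHEL 7) and set the HOSTNAME property to its FQDN, for example HOSTNAME=hadoop-master.qualified.domain.name on the master node.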
Edit Hosts File
For each host in the cluster, open the hosts file (/etc/hosts) and add one line per node. For example:
146.xxx.xxx.75 hadoop-master.qualified.domain.name hadoop-master
146.xxx.xxx.76 hadoop-node-1.qualified.domain.name hadoop-node-1
146.xxx.xxx.77 hadoop-node-2.qualified.domain.name hadoop-node-2
Reboot each host for these changes to take effect.
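The hostname conventions in this section can be sketched as two small helpers: one checks that a name looks like an FQDN, and the other prints a hosts-file line in the "IP, FQDN, short alias" layout used above. The function names is_fqdn and hosts_line are hypothetical, and the IP address in the example is a placeholder.

```shell
# Succeed when the name has at least two dot-separated labels.
is_fqdn() {
  printf '%s\n' "$1" | grep -Eq '^[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+$'
}

# Print one hosts-file line: IP, FQDN, then the short alias (first label).
hosts_line() {
  printf '%s %s %s\n' "$1" "$2" "${2%%.*}"
}

# Example:
#   is_fqdn "$(hostname -f)" || echo "hostname is not fully qualified"
#   hosts_line 10.0.0.5 hadoop-master.qualified.domain.name >> /etc/hosts
```

The FQDN check is purely syntactic; it does not verify that the name resolves.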
1.4. Set up password-less SSH
For the master node
- Log in as the root user and generate SSH keys using ssh-keygen -t rsa. Press Enter at every prompt to accept the default values.
- Run ssh-copy-id localhost to copy the SSH identity for localhost. Enter the password when prompted.
- Then run ssh hadoop-master to make sure no password is needed.
For each of the other hosts in the cluster (hadoop-node-1 is used as the example; repeat for every node):
- Copy the SSH files from hadoop-master to the host by running:
scp -pr /root/.ssh root@hadoop-node-1:/root/
- Upload the generated id_rsa.pub to root's .ssh directory as a file named authorized_keys by running:
cat .ssh/id_rsa.pub | ssh root@hadoop-node-1 'cat >> .ssh/authorized_keys'
- Set permissions for the .ssh directory and the authorized_keys file on the remote host:
ssh root@hadoop-node-1 'chmod 700 .ssh; chmod 640 .ssh/authorized_keys'
On the hadoop-master host, confirm that you can connect to every other node over SSH without a password, for example with ssh hadoop-node-1 and ssh hadoop-node-2.
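A quick way to verify password-less access to all nodes at once is sketched below. It relies on the OpenSSH option BatchMode=yes, which makes ssh fail instead of prompting for a password. The function name check_passwordless_ssh and the SSH_CMD override are hypothetical helpers introduced for this example.

```shell
# Command used to connect; override SSH_CMD to substitute a wrapper.
SSH_CMD="${SSH_CMD:-ssh}"

# Try a no-op command on each host without allowing password prompts.
# Prints "OK host" or "FAIL host" and returns non-zero if any host failed.
check_passwordless_ssh() {
  rc=0
  for host in "$@"; do
    # $SSH_CMD is left unquoted on purpose so it may carry extra arguments.
    if $SSH_CMD -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
      echo "OK $host"
    else
      echo "FAIL $host"
      rc=1
    fi
  done
  return $rc
}

# Example (on hadoop-master):
#   check_passwordless_ssh hadoop-node-1 hadoop-node-2
```

Any host reported as FAIL either still asks for a password or is unreachable within the five-second timeout.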
1.5. Enable NTP
Run the following commands on each host to install and enable the NTP service:
yum install -y ntp
systemctl enable ntpd
systemctl start ntpd
Run timedatectl status and look for the following lines to verify that NTP is running:
NTP enabled: yes
NTP synchronized: yes
Otherwise, you need to force NTP synchronization through the following steps:
- Stop the ntp service:
systemctl stop ntpd
- Add server your.ntp.server.address to the servers part of /etc/ntp.conf.
- Force time synchronization:
ntpdate your.ntp.server.address
- Restart ntp:
systemctl start ntpd
- Run systemctl enable ntpdate to make sure ntpdate runs at boot time.
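The NTP check can also be scripted. The sketch below reads timedatectl status output on standard input and succeeds only when it contains the line "NTP synchronized: yes", the label shown above for RHEL 7; newer systemd releases label this line differently, so treat the pattern as an assumption tied to this OS version. The function name ntp_is_synchronized is hypothetical.

```shell
# Read `timedatectl status` output from stdin; succeed if NTP is in sync.
ntp_is_synchronized() {
  grep -Eq '^[[:space:]]*NTP synchronized:[[:space:]]*yes[[:space:]]*$'
}

# Example:
#   timedatectl status | ntp_is_synchronized || echo "NTP is not synchronized"
```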
1.6. Configure firewall
Run the following commands to disable the firewall on each host in the cluster:
systemctl disable firewalld
service firewalld stop
Run systemctl status firewalld to make sure the firewall is disabled.
1.7. Disable SELinux
For each host in the cluster, change the SELINUX value from enforcing to disabled in /etc/selinux/config.
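If you prefer not to edit the file by hand, the change can be sketched with sed. The helper below rewrites the SELINUX= line of the file passed as its argument and leaves every other line (including SELINUXTYPE=) untouched. The function name disable_selinux_in is hypothetical; /etc/selinux/config is the standard location of the SELinux configuration on RHEL 7.

```shell
# Set SELINUX=disabled in the given config file, keeping all other lines.
disable_selinux_in() {
  cfg="$1"
  tmp="$(mktemp)" || return 1
  # Write to a temp file first, then copy back so the original file's
  # ownership and permissions are preserved.
  sed 's/^SELINUX=.*/SELINUX=disabled/' "$cfg" > "$tmp" && cat "$tmp" > "$cfg"
  status=$?
  rm -f "$tmp"
  return $status
}

# Example (as root, followed by a reboot for the change to take effect):
#   disable_selinux_in /etc/selinux/config
```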
2. Set up a Local Repository for Ambari and HDP Stack
2.1. Create and start an HTTP server on the master host
Install and start the Apache HTTP server with the following commands:
yum install -y httpd
service httpd restart
chkconfig httpd on
Make sure the directory
/var/www/html has been created on the host.
2.2. Set up the local repository
- Download the tarball files for the Ambari and HDP stacks by running the following commands:
wget http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.7.3.0/ambari-2.7.3.0-centos7.tar.gz
wget http://public-repo-1.hortonworks.com/HDP/centos7/3.x/updates/3.1.0.0/HDP-3.1.0.0-centos7-rpm.tar.gz
wget http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.22/repos/centos7/HDP-UTILS-1.1.0.22-centos7.tar.gz
wget http://public-repo-1.hortonworks.com/HDP-GPL/centos7/3.x/updates/3.1.0.0/HDP-GPL-3.1.0.0-centos7-gpl.tar.gz
- Untar and copy the files to /var/www/html. For example,
tar zxvf ambari-2.7.3.0-centos7.tar.gz -C /var/www/html/
- Then record the local base URLs, which are needed for installing the cluster. For each one, make sure you can open it in a web browser; the base URL is the path where you can see the repodata directory.
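One way to sanity-check a candidate base URL from the command line, sketched under the assumption that the repository follows the usual yum layout: a valid base URL serves the metadata file repodata/repomd.xml directly beneath it. The helper name repomd_url is hypothetical, and the path in the example is a placeholder for whatever directory the tarball created under /var/www/html.

```shell
# Print the URL of the yum metadata file expected under a base URL.
# A trailing slash on the base URL is tolerated.
repomd_url() {
  printf '%s/repodata/repomd.xml\n' "${1%/}"
}

# Example: an HTTP 200 response suggests the base URL is usable.
#   curl -fsI "$(repomd_url http://hadoop-master.qualified.domain.name/<path-to-repo>)"
```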
3. Install Ambari Server and Agent
3.1. Download Ambari repository
- Log in to the master host as root.
- Check the repository URL from Ambari Repository Links
- Download the Ambari repository file to the directory /etc/yum.repos.d/ with the following command:
wget -nv http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.7.3.0/ambari.repo -O /etc/yum.repos.d/ambari.repo
- Edit the ambari.repo file and change the baseurl and gpgkey values to point to the local repository obtained in step 2.2.
- Run yum repolist to confirm that the repository has been configured successfully. You should see ambari-2.7.3.0-xxx on the list.
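The edit to ambari.repo can be sketched as a small filter that rewrites the baseurl= and gpgkey= lines to point at a local base URL. The function name localize_repo is hypothetical, and the key path RPM-GPG-KEY/RPM-GPG-KEY-Jenkins below the base URL is an assumption based on the layout of the public repository; check that the file exists on your own mirror before relying on it.

```shell
# Read a yum .repo file on stdin and print it with baseurl= and gpgkey=
# pointing at the local base URL given as $1.
# Assumption: the GPG key lives at RPM-GPG-KEY/RPM-GPG-KEY-Jenkins under it.
# The base URL must not contain '|' or '&', which are special to sed here.
localize_repo() {
  base="${1%/}"
  sed -e "s|^baseurl=.*|baseurl=${base}|" \
      -e "s|^gpgkey=.*|gpgkey=${base}/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins|"
}

# Example:
#   localize_repo http://hadoop-master.qualified.domain.name/<path-to-repo> \
#     < /etc/yum.repos.d/ambari.repo > /tmp/ambari.repo
#   # review /tmp/ambari.repo, then copy it over /etc/yum.repos.d/ambari.repo
```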
See Download Ambari Repository for more information.
3.2. Install Ambari server
Install the Ambari server on the master node with the command:
yum install -y ambari-server
See Install Ambari Server for more information.
3.3. Set up Ambari server
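If you are behind a proxy, add the options -Dhttp.proxyHost=<yourProxyHost> -Dhttp.proxyPort=<yourProxyPort> -Dhttps.proxyHost=<yourProxyHost> -Dhttps.proxyPort=<yourProxyPort> to AMBARI_JVM_ARGS in the file /var/lib/ambari-server/ambari-env.sh.
Run the following command on the Ambari server host to start the setup process:
ambari-server setup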
See Set Up Ambari Server for more information.
4. Install, Configure and Deploy the Cluster
4.1. Start the Ambari server
You can start the Ambari server by running:
ambari-server start
After the server starts successfully, you can log in to the Ambari web UI (port 8080 by default) with the default username/password admin/admin.
4.2. Install HDP through installation wizard
Follow the steps of the wizard to install HDP:
- Step 0, Get Started: give a name to your cluster.
- Step 1, Select Version: select Use Local Repository. Delete all other OS entries and leave only redhat7. Copy the local base URLs recorded in step 2.2 into the corresponding fields.
- Step 2, Install Options settings:
- Step 3, Confirm Hosts: the wizard automatically registers the hosts with the settings from Step 2.
- Step 4, Choose Services: choose the basic ones; you can add more later.
- Step 5, Assign Masters: keep the defaults.
- Step 6: Select all for the
These are the 5 errors I encountered during the installation process.
Error 2: Requires libtirpc-devel: https://community.hortonworks.com/idea/107386/libtirpc-devel-required.html
Run the following commands on all hosts:
subscription-manager repos --enable=rhel-7-server-optional-rpms
yum install -y libtirpc-devel
Error 3: Hive install failed because of mysql-connector-java.jar, with the HTTP error "HTTP Error 404: Not Found": https://community.hortonworks.com/articles/170133/hive-start-failed-because-of-ambari-error-mysql-co.html
Run the following commands on the Ambari server:
yum install -y mysql-connector-java
ls -al /usr/share/java/mysql-connector-java.jar
ln -s /usr/share/java/mysql-connector-java.jar /var/lib/ambari-server/resources/mysql-connector-java.jar
Error 4: Empty Baseurl for Public Repository (no solution; it might be a proxy issue): https://community.hortonworks.com/questions/45147/ambari-setup-select-stack-404-error.html https://community.hortonworks.com/questions/35820/hdp-stack-repositories-not-found.html
Error 5: Ambari Files View, service hdfs check failed:
To fix it, try creating a new “File View” instance by clicking the “Create Instance” button on the File View. You can choose the default options to create the view instance (if the cluster is not kerberized): https://community.hortonworks.com/questions/128758/service-hdfs-check-failed-from-ambari.html?page=1&pageSize=10&sort=votes
Official HDP 3.1.0 installation documentation: https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.1.0/index.html
Apache Ambari Installation Document: https://docs.hortonworks.com/HDPDocuments/Ambari-2.7.3.0/bk_ambari-installation/content/ch_Getting_Ready.html
Check the Hortonworks Support Matrix to verify product versions: https://supportmatrix.hortonworks.com/
Using yum with a Proxy server: https://docs.fedoraproject.org/en-US/Fedora_Core/3/html/Software_Management_Guide/sn-yum-proxy-server.html