Install Hortonworks HDP 3.1.0 on a Cluster of VMware Virtual Machines

Hortonworks Logo from https://images.app.goo.gl/3YDkHkEiqUEUEA8a7

Virtual Nodes Information

1. Prepare Cluster Environment

1.1. Setting Proxy

export https_proxy=https://<your.https-proxy.address>:<port#>
export http_proxy=http://<your.http-proxy.address>:<port#>

1.2. Make sure run level is multi-user text mode

systemctl get-default
systemctl set-default multi-user.target

1.3. Check and set hostnames

hostname -f
hostname hadoop-master.qualified.doman.name
vi /etc/sysconfig/network
NETWORK=yes
HOSTNAME=fully.qualified.doman.name
vi /etc/hosts
146.xxx.xxx.75 hadoop-master.qualified.doman.name hadoop-master
146.xxx.xxx.76 hadoop-node-1.qualified.doman.name hadoop-node-1
146.xxx.xxx.77 hadoop-node-2.qualified.doman.name hadoop-node-2
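If you would rather script the /etc/hosts edit than type it on every node, a minimal sketch follows. The function name add_host_entry is made up for illustration, and the hosts file path is a parameter so you can try it on a copy before touching the real file.

```shell
# Hypothetical helper: append "IP FQDN SHORTNAME" to a hosts file,
# but only if the FQDN is not already listed there.
add_host_entry() (
    hosts_file=$1
    ip=$2
    fqdn=$3
    short=$4
    # -q quiet, -w whole-word match, -F fixed string (no regex)
    if grep -qwF -- "$fqdn" "$hosts_file" 2>/dev/null; then
        exit 0   # already present; leaves the subshell, i.e. returns 0
    fi
    printf '%s %s %s\n' "$ip" "$fqdn" "$short" >> "$hosts_file"
)
```

For example, add_host_entry /etc/hosts 146.xxx.xxx.76 hadoop-node-1.qualified.doman.name hadoop-node-1 adds the second line above, and running it again changes nothing.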

1.4. Set up password-less SSH

  • Login to the hadoop-master host with root user and generate SSH keys using ssh-keygen -t rsa. Press enter for all prompts and accept all default values.
  • Run ssh-copy-id localhost to copy the SSH identity to localhost. Enter the password when prompted.
  • Then run ssh hadoop-master to confirm that no password is needed.
  • Copy the .ssh directory from hadoop-master to every other host in the cluster by running:
scp -pr /root/.ssh root@hadoop-node-1.qualified.doman.name:/root/
  • Append the generated id_rsa.pub to the authorized_keys file in root's .ssh directory on each host by running:
cat .ssh/id_rsa.pub | ssh root@hadoop-node-1.qualified.doman.name 'cat >> .ssh/authorized_keys'
  • Set permissions for .ssh directory and authorized_keys file:
ssh root@hadoop-node-1.qualified.doman.name 'chmod 700 .ssh; chmod 640 .ssh/authorized_keys'
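  • Confirm password-less login between all hosts. From hadoop-master:
ssh hadoop-node-1
ssh hadoop-node-2
  • From hadoop-node-1:
ssh hadoop-master
ssh hadoop-node-2
  • From hadoop-node-2:
ssh hadoop-node-1
ssh hadoop-master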
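With more than a couple of nodes, the key distribution is easier as a loop. The sketch below uses ssh-copy-id for every host, which is a different route from the scp and cat commands above. The function name copy_key_to_hosts and the DRY_RUN switch are made up for illustration. By default it only prints the commands it would run.

```shell
# Hypothetical helper: run (or, by default, just print) ssh-copy-id for
# each host given as an argument. ssh-copy-id appends the public key to
# the remote user's authorized_keys file.
copy_key_to_hosts() (
    key=/root/.ssh/id_rsa.pub   # default location from ssh-keygen -t rsa as root
    for host in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "ssh-copy-id -i $key root@$host"
        else
            ssh-copy-id -i "$key" "root@$host"
        fi
    done
)
```

Run copy_key_to_hosts hadoop-node-1 hadoop-node-2 to preview the commands, then prefix the call with DRY_RUN=0 to actually copy the key. You will be asked for each host's root password once.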

1.5. Enable NTP

yum install -y ntp
systemctl enable ntpd
systemctl start ntpd
  • Check the status with timedatectl; the output should include:
NTP enabled: yes
NTP synchronized: yes
  • Stop the ntp service: systemctl stop ntpd
  • Add server your.ntp.server.address to the servers section of /etc/ntp.conf.
  • Force time synchronization: ntpdate your.ntp.server.address
  • Restart ntp: systemctl start ntpd
  • Run systemctl enable ntpdate so that ntpdate runs at boot time.
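The ntp.conf edit can also be scripted so that it is safe to repeat on each node. This is only a sketch: add_ntp_server is a made-up name, and it appends the server line at the end of the file rather than inside the existing servers section, which ntpd accepts but which you may want to tidy by hand.

```shell
# Hypothetical helper: add "server <address>" to an ntp.conf-style file
# unless a server line for that address already exists.
add_ntp_server() (
    conf=$1
    server=$2
    # awk compares the first two fields exactly, so dots in the address
    # are not treated as regex wildcards.
    if awk -v s="$server" '$1 == "server" && $2 == s { found = 1 } END { exit !found }' "$conf" 2>/dev/null; then
        exit 0
    fi
    printf 'server %s\n' "$server" >> "$conf"
)
```

Usage would be add_ntp_server /etc/ntp.conf your.ntp.server.address, run while ntpd is stopped as in the steps above.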

1.6. Configure firewall

systemctl disable firewalld
service firewalld stop

1.7. Disable SELinux
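No commands are listed for this step, so the following is an assumption about the usual approach on CentOS/RHEL 7 rather than the exact procedure used here: switch to permissive mode for the current boot with setenforce 0, and set SELINUX=disabled in /etc/selinux/config so the change survives a reboot. The function name disable_selinux_in_config is made up, and the config path is a parameter so it can be tried on a copy.

```shell
# Hypothetical helper: rewrite the SELINUX= line of a config file to
# "disabled". sed keeps a backup next to the file with a .bak suffix.
# The pattern requires "=" right after SELINUX, so the SELINUXTYPE=
# line is left untouched.
disable_selinux_in_config() (
    conf=$1
    sed -i.bak 's/^SELINUX=.*/SELINUX=disabled/' "$conf"
)
```

On a real node this would be setenforce 0 followed by disable_selinux_in_config /etc/selinux/config, repeated on every host in the cluster.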

2. Set up a Local Repository for Ambari and HDP Stack

2.1. Create and start an HTTP server on the master host

yum install -y httpd
service httpd restart
chkconfig httpd on

2.2. Set up the local repository

  • Download the tarball files for Ambari and HDP stacks through running the following commands:
wget http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.7.3.0/ambari-2.7.3.0-centos7.tar.gz
wget http://public-repo-1.hortonworks.com/HDP/centos7/3.x/updates/3.1.0.0/HDP-3.1.0.0-centos7-rpm.tar.gz
wget http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.22/repos/centos7/HDP-UTILS-1.1.0.22-centos7.tar.gz
wget http://public-repo-1.hortonworks.com/HDP-GPL/centos7/3.x/updates/3.1.0.0/HDP-GPL-3.1.0.0-centos7-gpl.tar.gz
  • Untar and copy the files to /var/www/html. For example,
tar zxvf ambari-2.7.3.0-centos7.tar.gz -C /var/www/html/
  • Then record the local base URLs, which are needed for installing the cluster:
Ambari: http://146.xxx.xxx.75/ambari/centos7/2.7.3.0-139/
HDP: http://146.xxx.xxx.75/HDP/centos7/3.1.0.0-78/
HDP-GPL: http://146.xxx.xxx.75/HDP-GPL/centos7/3.1.0.0-78/
HDP-UTILS: http://146.xxx.xxx.75/HDP-UTILS/centos7/1.1.0.22/
  • Make sure each URL opens in a web browser. The correct base URL is the path at which you can see the repodata directory.
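The browser check can also be done from the command line. The sketch below assumes curl is installed and relies on the standard yum layout, in which a repository's metadata index lives at repodata/repomd.xml under the base URL. The names repomd_url and check_repo are made up for illustration.

```shell
# Hypothetical helper: build the URL of the yum metadata index for a
# base URL, tolerating one trailing slash on the input.
repomd_url() {
    printf '%s/repodata/repomd.xml\n' "${1%/}"
}

# Hypothetical helper: succeed only if the metadata index is reachable.
# -f fail on HTTP errors, -s silent, -I request headers only
check_repo() {
    curl -fsI "$(repomd_url "$1")" > /dev/null
}
```

For example, check_repo http://146.xxx.xxx.75/ambari/centos7/2.7.3.0-139/ && echo OK should print OK once the Ambari tarball has been extracted under /var/www/html and httpd is running.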

3. Install Ambari Server and Agent

3.1. Download Ambari repository

  • Login to the hadoop-master host as root
  • Check the repository URL from Ambari Repository Links
  • Download the Ambari repository file to the directory /etc/yum.repos.d/ with the following command:
wget -nv http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.7.3.0/ambari.repo -O /etc/yum.repos.d/ambari.repo
  • Edit the ambari.repo file and change the baseurl and gpgkey to the local repository obtained in step 2.2.
  • Run yum repolist to confirm that the repository has been configured successfully. You should see ambari-2.7.3.0-xxx on the list.
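The edit of ambari.repo can be done with sed instead of an editor. This is a sketch under one assumption you should verify: that the GPG key sits at RPM-GPG-KEY/RPM-GPG-KEY-Jenkins under the local base URL, mirroring the layout of the public repository. The function name point_repo_to_local is made up for illustration.

```shell
# Hypothetical helper: point the baseurl and gpgkey lines of a .repo
# file at a local base URL. A .bak backup of the file is kept.
# Assumption: the key lives at <base>/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins.
point_repo_to_local() (
    repo_file=$1
    base=${2%/}   # drop one trailing slash so the result has exactly one
    sed -i.bak \
        -e "s|^baseurl=.*|baseurl=$base/|" \
        -e "s|^gpgkey=.*|gpgkey=$base/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins|" \
        "$repo_file"
)
```

Usage would be point_repo_to_local /etc/yum.repos.d/ambari.repo http://146.xxx.xxx.75/ambari/centos7/2.7.3.0-139/, followed by yum repolist to confirm.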

3.2. Install Ambari server

yum install -y ambari-server

3.3. Set up Ambari server

ambari-server setup

4. Install, Configure and Deploy the Cluster

4.1. Start the Ambari server

ambari-server start
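The server can take a little while to come up before the web wizard is reachable. A small sketch for waiting on it is shown here. It assumes curl is installed and that Ambari listens on its default port 8080, which is not true if you changed the port during setup. The names ambari_url and wait_for_ambari are made up for illustration.

```shell
# Hypothetical helper: build the Ambari web UI address.
# The port defaults to 8080 when no second argument is given.
ambari_url() {
    printf 'http://%s:%s\n' "$1" "${2:-8080}"
}

# Hypothetical helper: poll the UI up to 30 times, 2 seconds apart,
# and return non-zero if it never answers.
wait_for_ambari() (
    url=$(ambari_url "$1" "${2:-8080}")
    tries=0
    while [ "$tries" -lt 30 ]; do
        if curl -fs -o /dev/null "$url"; then
            exit 0
        fi
        tries=$((tries + 1))
        sleep 2
    done
    exit 1
)
```

For example, wait_for_ambari hadoop-master && echo "open $(ambari_url hadoop-master)" prints the address to open in the browser once the server responds.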

4.2. Install HDP through installation wizard

  • Step 0 - Get Started: give your cluster a name, for example, MyHadoop.
  • Step 1 - Select Version: select HDP-3.1 and Use Local Repository. Delete all other OS entries and keep only redhat7. Paste the local base URLs from step 2.2 into the corresponding fields.
  • Step 2 - Install Options settings:
  • Step 3 - Confirm Hosts: registration runs automatically using the settings from Step 2.
  • Step 4 - Choose Services: choose the basic ones; you can add more later.
  • Step 5 - Assign Masters: keep the defaults.
  • Step 6 - Select all for the Client option.

Known Errors

subscription-manager repos --enable=rhel-7-server-optional-rpms
yum install -y libtirpc-devel
yum install -y mysql-connector-java
ls -al /usr/share/java/mysql-connector-java.jar
cd /var/lib/ambari-server/resources/
ln -s /usr/share/java/mysql-connector-java.jar mysql-connector-java.jar
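The plain ln -s above fails if the link already exists, for instance when you re-run the fix. A repeatable variant might look like the sketch below. The function name link_jdbc_driver is made up for illustration, and it only wraps the same symlink step in a check.

```shell
# Hypothetical helper: link a JDBC driver jar into a target directory.
# Fails early if the jar is missing; -f replaces an existing link and
# -n avoids following an existing link that points at a directory.
link_jdbc_driver() (
    jar=$1
    dest_dir=$2
    if [ ! -f "$jar" ]; then
        echo "missing $jar" >&2
        exit 1
    fi
    ln -sfn "$jar" "$dest_dir/$(basename "$jar")"
)
```

Usage would be link_jdbc_driver /usr/share/java/mysql-connector-java.jar /var/lib/ambari-server/resources, run after the yum install of mysql-connector-java.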

Lei Feng


Big Data, Google Cloud Platform, Machine Learning, Operations Research