Hortonworks Cluster on Docker

MR. Sahputra
2 min read · Jan 9, 2018

As part of research related to Apache Metron (real-time big data security), I need to set up a big data cluster. I decided to use Hortonworks, since its HCP (Hortonworks Cybersecurity Platform) offers full support for Apache Metron.

I prefer to set up the cluster on top of Docker, because the goal is only to get hands-on with Metron. The sandbox is not my favorite option, especially because it requires a VM hypervisor like VirtualBox to run. I would rather set up one server and use Linux containers to run the cluster.

I’ve been a Hetzner customer for a couple of years now, and their auctioned servers have always been satisfying. A server with 48 GB of memory and around 5 TB of disk costs only 30 EUR/month, which is enough to run a cluster with 4 nodes (1 Ambari server, 3 Hadoop nodes). Since all nodes need to be able to reach each other by hostname, and to simplify the provisioning part, I use docker-compose to start the containers. Here’s my simple setup:
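A minimal docker-compose file for such a setup could look like the sketch below. The image and hostname values are assumptions for illustration; docker-compose puts all services on one network where they can resolve each other by hostname, which is what the cluster installation needs.

```yaml
# docker-compose.yml — sketch only; image names and hostnames are assumptions
version: "2"

services:
  ambari:
    image: local/ambari-server     # hypothetical image, built from the Ambari Dockerfile
    hostname: ambari.cluster
    ports:
      - "8080:8080"                # Ambari web UI

  node1:
    image: local/hadoop-node       # hypothetical image, built from the hadoop_node Dockerfile
    hostname: node1.cluster

  node2:
    image: local/hadoop-node
    hostname: node2.cluster

  node3:
    image: local/hadoop-node
    hostname: node3.cluster
```

With this layout, the Ambari server can reach the worker containers as node1.cluster, node2.cluster, and node3.cluster during cluster provisioning.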

The Ambari image is basically an Ubuntu 16.04 image with the Ambari server application installed. It is very simple; just follow the official installation procedure.
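A Dockerfile for that image might look roughly like this. The repository URL and version are assumptions here; use the exact values from the official Ambari installation guide for your HDP version.

```dockerfile
# Sketch only — repository URL and version are assumptions;
# follow the official Hortonworks/Ambari install guide for exact values.
FROM ubuntu:16.04

RUN apt-get update && apt-get install -y wget sudo openssh-client

# Add the Hortonworks Ambari apt repository (URL per the official docs)
RUN wget -O /etc/apt/sources.list.d/ambari.list \
      http://public-repo-1.hortonworks.com/ambari/ubuntu16/2.x/updates/2.6.1.0/ambari.list \
 && apt-get update && apt-get install -y ambari-server

# Non-interactive setup using the embedded database and defaults
RUN ambari-server setup --silent

# Start the server and keep the container in the foreground
CMD ["sh", "-c", "ambari-server start && tail -f /var/log/ambari-server/ambari-server.log"]
```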

Once the Ambari server image is created, I also added an image for a basic hadoop_node. It is basically Ubuntu 16.04 pre-installed with the additional applications, such as SSH and Python, that are required to install Hortonworks.
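The hadoop_node image can be sketched like this. The package list is an assumption based on what Ambari agents typically expect on a host; SSH access for root is enabled so Ambari can bootstrap the agents during the install wizard.

```dockerfile
# Sketch only — package list is an assumption of what an HDP host needs
FROM ubuntu:16.04

# Tools Ambari uses to bootstrap and manage each host
RUN apt-get update && apt-get install -y \
      openssh-server python curl sudo ntp

# Allow root SSH login so the Ambari install wizard can register the host
RUN mkdir -p /var/run/sshd \
 && sed -i 's/^PermitRootLogin .*/PermitRootLogin yes/' /etc/ssh/sshd_config

# Run sshd in the foreground as the container's main process
CMD ["/usr/sbin/sshd", "-D"]
```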

Start all containers through docker-compose, then start the Ambari server and proceed with the normal cluster installation. Once finished, we have an HDP cluster running on top of containers.
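The startup itself is only a couple of commands; the service name "ambari" below assumes the compose file defines one, so adjust it to match yours.

```shell
# Bring up all four containers defined in docker-compose.yml
docker-compose up -d

# Start the Ambari server inside its container (service name assumed "ambari")
docker-compose exec ambari ambari-server start

# Then open http://<your-server>:8080 and run the cluster install wizard,
# registering node1.cluster, node2.cluster, node3.cluster as hosts
```

From here the install wizard handles distributing and configuring the HDP services on the registered hosts.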

Hortonworks Data Platform running on docker containers

Now, with a big data cluster set up, we can proceed with applications or data pipelines that use the cluster. In my case, that will be an application related to a security operations center.
