ApacheCon NA 2013

Portland, Oregon

February 26th – 28th, 2013

Monday 9 a.m.–12:30 p.m.

Build and deploy your own Big Data distribution with Apache Bigtop

Bruno Mahé, Roman Shaposhnik

Apache Bigtop is a project for the integration of the Apache Hadoop ecosystem. It includes recipes to build, test, and deploy these components. In this tutorial we will go through each step, learning how you can build and customize the packages yourself, and how to deploy the components to create your own cluster, whether on physical machines or in the cloud.


The goal of this tutorial is to introduce people to Apache Bigtop and to show how it can help them build their own Big Data solution tailored to their own problems.

This tutorial will start by describing the life cycle of a component in Apache Bigtop: building packages, validating them, and deploying them. We will explain how these steps relate to one another and why each is needed.
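As a rough sketch of that life cycle, the commands below show what building a component's packages from a Bigtop checkout can look like. The repository URL and Make target are assumptions based on typical Bigtop usage at the time and may differ between releases:

```shell
# Hedged sketch of the Bigtop component life cycle; exact repository
# URL and Make targets are assumptions and vary by Bigtop release.

# 1. Build: produce native packages (RPMs here) for one component.
git clone https://git-wip-us.apache.org/repos/asf/bigtop.git
cd bigtop
make hadoop-rpm          # or hadoop-deb on Debian-based systems

# 2. Validate: run the smoke/integration tests shipped with Bigtop
#    (driven by Bigtop's iTest framework; invocation depends on release).

# 3. Deploy: install the resulting packages on cluster nodes, either
#    manually or through the Puppet recipes under bigtop-deploy/.
```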

Attendees will then go through the steps to build their own component. Along the way, this also serves as an introduction to packaging: what a package is, how to build one, and how to adapt it.
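When adapting a package, it helps to be able to inspect one first. The commands below are standard, non-Bigtop-specific package-inspection tools; the package file names are hypothetical:

```shell
# Inspecting a package before installing it (file names are examples).

# RPM: list the files the package would install, then show its
# metadata (version, dependencies, description).
rpm -qpl hadoop-2.0.2-1.el6.x86_64.rpm
rpm -qpi hadoop-2.0.2-1.el6.x86_64.rpm

# Debian equivalents: list contents, then show control information.
dpkg -c hadoop_2.0.2-1_amd64.deb
dpkg -I hadoop_2.0.2-1_amd64.deb
```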

After building their first component, attendees will deploy it: first manually, then through the Puppet recipes provided by Apache Bigtop. This also serves as an introduction to deployment practices, whether on a physical cluster or in the cloud.
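The Puppet-based deployment can be sketched as a masterless `puppet apply` run on each node. The paths below are assumptions based on the layout of Bigtop's `bigtop-deploy/` directory and may differ between releases:

```shell
# Hedged sketch of deploying via Bigtop's Puppet recipes on one node;
# manifest and module paths are assumptions and may vary by release.
cd bigtop-deploy/puppet
sudo puppet apply -d \
  --modulepath=modules \
  manifests/site.pp
```

Running the same manifest on every node, with per-node roles defined in the site configuration, is what turns a pile of packages into a working cluster.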

By the end of this tutorial, attendees will know how to customize, build, and deploy their own Apache Hadoop-based cluster.

To make the best use of the time, a virtual machine image will be provided to shorten setup.

If time permits, we may also cover some of the following:

  • Validate the packages

  • Add your own components

  • Build your own VM containing your packages, for development, deployment, or testing