Expert-led training for your team

Hadoop Administration training course

Install, configure, and manage the Apache Hadoop platform and its associated ecosystem, and build a Hadoop solution for Big Data

JBI training course London UK

"Our tailored course provided a well rounded introduction and also covered some intermediate level topics that we needed to know. Clive gave us some best practice ideas and tips to take away. Fast paced but the instructor never lost any of the delegates"

Brian Leek, Data Analyst, May 2022

Public Courses

* 24/06/24 - 3 days - £1600 +VAT
* 05/08/24 - 3 days - £1600 +VAT
* 16/09/24 - 3 days - £1600 +VAT

Customised Courses

* Train a team
* Tailor content
* Flex dates
From £1200 / day

  • Gain an introduction to data storage and processing 
  • Architect a Hadoop solution to meet your business requirements
  • Install and build a Hadoop cluster capable of processing large data sets
  • Configure and tune the Hadoop environment to ensure high throughput and availability
  • Allocate, distribute and manage resources
  • Monitor the file system, job progress and overall cluster performance

Introduction to Data Storage and Processing

  • Installing the Hadoop Distributed File System (HDFS)
  •     Defining key design assumptions and architecture
  •     Configuring and setting up the file system
  •     Issuing commands from the console
  •     Reading and writing files
  • Setting the stage for MapReduce
  •     Reviewing the MapReduce approach
  •     Introducing the computing daemons
  •     Dissecting a MapReduce job

Defining Hadoop Cluster Requirements

  • Planning the architecture
  •     Selecting appropriate hardware
  •     Designing a scalable cluster
  • Building the cluster
  •     Installing Hadoop daemons
  •     Optimising the network architecture

Configuring a Cluster

  • Preparing HDFS
  •     Setting basic configuration parameters
  •     Configuring block allocation, redundancy and replication
  • Deploying MapReduce
  •     Installing and setting up the MapReduce environment
  •     Delivering redundant load balancing via Rack Awareness

Maximising HDFS Robustness

  • Creating a fault-tolerant file system
  •     Isolating single points of failure
  •     Maintaining High Availability
  •     Triggering manual failover
  •     Automating failover with ZooKeeper
  • Leveraging NameNode Federation
  •     Extending HDFS resources
  •     Managing the namespace volumes
  • Introducing YARN
  •     Critiquing the YARN architecture
  •     Identifying the new daemons

Managing Resources and Cluster Health

  • Allocating resources
  •     Setting quotas to constrain HDFS utilisation
  •     Prioritising access to MapReduce using schedulers
  • Maintaining HDFS
  •     Starting and stopping Hadoop daemons
  •     Monitoring HDFS status
  •     Adding and removing data nodes
  • Administering MapReduce
  •     Managing MapReduce jobs
  •     Tracking progress with monitoring tools
  •     Commissioning and decommissioning compute nodes

Maintaining a Cluster

  • Employing the standard built-in tools
  •     Managing and debugging processes using JVM metrics
  •     Performing Hadoop status checks
  • Tuning with supplementary tools
  •     Assessing performance with Ganglia
  •     Benchmarking to ensure continued performance

Extending Hadoop

  • Simplifying information access
  •     Enabling SQL-like querying with Hive
  •     Installing Pig to create MapReduce jobs
  • Integrating additional elements of the ecosystem
  •     Imposing a tabular view on HDFS with HBase
  •     Configuring Oozie to schedule workflows

Implementing Data Ingress and Egress

  • Facilitating generic input/output
  •     Moving bulk data into and out of Hadoop
  •     Transmitting HDFS data over HTTP with WebHDFS
  • Acquiring application-specific data
  •     Collecting multi-sourced log files with Flume
  •     Importing and exporting relational information with Sqoop
  • Planning for Backup, Recovery and Security
  •     Coping with inevitable hardware failures
  •     Securing your Hadoop cluster

IT professionals who want to learn how to architect and administer Apache Hadoop clusters for Big Data

Average rating: 4.8 out of 5

“JBI did a great job of customizing their syllabus to suit our business needs and also bringing our team up to speed on the current best practices. Our teams varied widely in terms of experience and the Instructor handled this particularly well - very impressive”

Brian F, Team Lead, RBS, Data Analysis Course, 20 April 2022



In this Hadoop architecture and administration training course, you will gain the skills to install, configure and manage the Apache Hadoop platform and its associated ecosystem.

You will learn how to build a Hadoop solution that satisfies your business requirements.

+44 (0)20 8446 7555


Copyright © 2023 JBI Training. All Rights Reserved.
JB International Training Ltd  -  Company Registration Number: 08458005
Registered Address: Wohl Enterprise Hub, 2B Redbourne Avenue, London, N3 2BS
