Hadoop Administration training course

Install, configure, and manage the Apache Hadoop platform and its associated ecosystem, and build a Hadoop solution for Big Data

"Our tailored course provided a well-rounded introduction and also covered some intermediate-level topics that we needed to know. Clive gave us some best practice ideas and tips to take away. Fast paced but the instructor never lost any of the delegates."

Brian Leek, Data Analyst, May 2022

Public Courses

* 13/05/24 - 3 days - £1600 +VAT
* 24/06/24 - 3 days - £1600 +VAT
* 05/08/24 - 3 days - £1600 +VAT

Customised Courses

* Train a team
* Tailor content
* Flex dates

From £1200 / day

Highlights

Gain an introduction to data storage and processing
Architect a Hadoop solution to meet your business requirements
Install and build a Hadoop cluster capable of processing large data sets
Configure and tune the Hadoop environment to ensure high throughput and availability
Allocate, distribute and manage resources
Monitor the file system, job progress and overall cluster performance

Course Details

Introduction to Data Storage and Processing

Installing the Hadoop Distributed File System (HDFS)
* Defining key design assumptions and architecture
* Configuring and setting up the file system
* Issuing commands from the console
* Reading and writing files

Setting the stage for MapReduce
* Reviewing the MapReduce approach
* Introducing the computing daemons
* Dissecting a MapReduce job

Defining Hadoop Cluster Requirements

Planning the architecture
* Selecting appropriate hardware
* Designing a scalable cluster

Building the cluster
* Installing Hadoop daemons
* Optimising the network architecture

Configuring a Cluster

Preparing HDFS
* Setting basic configuration parameters
* Configuring block allocation, redundancy and replication

Deploying MapReduce
* Installing and setting up the MapReduce environment
* Delivering redundant load balancing via Rack Awareness

Maximising HDFS Robustness

Creating a fault-tolerant file system
* Isolating single points of failure
* Maintaining High Availability
* Triggering manual failover
* Automating failover with ZooKeeper

Leveraging NameNode Federation
* Extending HDFS resources
* Managing the namespace volumes

Introducing YARN
* Critiquing the YARN architecture
* Identifying the new daemons

Managing Resources and Cluster Health

Allocating resources
* Setting quotas to constrain HDFS utilisation
* Prioritising access to MapReduce using schedulers

Maintaining HDFS
* Starting and stopping Hadoop daemons
* Monitoring HDFS status
* Adding and removing data nodes

Administering MapReduce
* Managing MapReduce jobs
* Tracking progress with monitoring tools
* Commissioning and decommissioning compute nodes

Maintaining a Cluster

Employing the standard built-in tools
* Managing and debugging processes using JVM metrics
* Performing Hadoop status checks

Tuning with supplementary tools
* Assessing performance with Ganglia
* Benchmarking to ensure continued performance

Extending Hadoop

Simplifying information access
* Enabling SQL-like querying with Hive
* Installing Pig to create MapReduce jobs

Integrating additional elements of the ecosystem
* Imposing a tabular view on HDFS with HBase
* Configuring Oozie to schedule workflows

Implementing Data Ingress and Egress

Facilitating generic input/output
* Moving bulk data into and out of Hadoop
* Transmitting HDFS data over HTTP with WebHDFS

Acquiring application-specific data
* Collecting multi-sourced log files with Flume
* Importing and exporting relational information with Sqoop

Planning for Backup, Recovery and Security

* Coping with inevitable hardware failures
* Securing your Hadoop cluster

Who should attend

IT professionals who want to learn how to architect and administer Apache Hadoop clusters for Big Data

Feedback

4.8 out of 5 average

“JBI did a great job of customizing their syllabus to suit our business needs and also bringing our team up to speed on the current best practices. Our teams varied widely in terms of experience and the Instructor handled this particularly well - very impressive”

Brian F, Team Lead, RBS, Data Analysis Course, 20 April 2022

More about this course

In this Hadoop architecture and administration training course, you will gain the skills to install, configure and manage the Apache Hadoop platform and its associated ecosystem.

You will learn how to build a Hadoop solution that satisfies your business requirements.

CONTACT
+44 (0)20 8446 7555

Copyright © 2023 JBI Training. All Rights Reserved.
JB International Training Ltd - Company Registration Number: 08458005
Registered Address: Wohl Enterprise Hub, 2B Redbourne Avenue, London, N3 2BS
