In this Hadoop architecture and administration training course, you gain the skills to install, configure, and manage the Apache Hadoop platform and its associated ecosystem, and build a Hadoop solution that satisfies your business requirements.
IT professionals who want to learn how to architect and administer Apache Hadoop clusters for Big Data
Installing the Hadoop Distributed File System (HDFS)
Defining key design assumptions and architecture
Configuring and setting up the file system
Issuing commands from the console
Reading and writing files
Setting the stage for MapReduce
Reviewing the MapReduce approach
Introducing the computing daemons
Dissecting a MapReduce job
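As a rough illustration of the stages a MapReduce job passes through, the sketch below simulates map, shuffle and reduce for a word count in plain Python. It does not use the Hadoop API; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `run_job`) are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all counts for one word into a single total."""
    return key, sum(values)


def run_job(lines):
    """Run the three stages in sequence on an in-memory list of lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    # Keys are sorted before reduction, mirroring the sorted input a reducer sees.
    return dict(reduce_phase(key, values) for key, values in sorted(grouped.items()))


if __name__ == "__main__":
    print(run_job(["big data", "Big cluster"]))
```

On a real cluster the map and reduce steps run as separate tasks on different nodes; here they run in one process purely to show the data flow.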
Planning the architecture
Selecting appropriate hardware
Designing a scalable cluster
Building the cluster
Installing Hadoop daemons
Optimising the network architecture
Setting basic configuration parameters
Configuring block allocation, redundancy and replication
Installing and setting up the MapReduce environment
Delivering redundant load balancing via Rack Awareness
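To make the effect of block allocation and replication concrete, the following sketch estimates how many blocks a file occupies and how much raw disk it consumes. It assumes the stock defaults of a 128 MB block size (`dfs.blocksize`) and a replication factor of 3 (`dfs.replication`); the helper names are hypothetical, and a real cluster may be configured differently.

```python
import math

BLOCK_SIZE_MB = 128  # assumed default for dfs.blocksize (Hadoop 2.x and later)
REPLICATION = 3      # assumed default for dfs.replication


def block_count(file_size_mb, block_size_mb=BLOCK_SIZE_MB):
    """Number of HDFS blocks needed to hold a file of the given size."""
    return math.ceil(file_size_mb / block_size_mb)


def replica_count(file_size_mb, block_size_mb=BLOCK_SIZE_MB, replication=REPLICATION):
    """Total block replicas stored across the cluster for one file."""
    return block_count(file_size_mb, block_size_mb) * replication


def raw_storage_mb(file_size_mb, replication=REPLICATION):
    """Raw disk consumed; a partial final block only uses its actual size."""
    return file_size_mb * replication


if __name__ == "__main__":
    size_mb = 1000
    print(block_count(size_mb), replica_count(size_mb), raw_storage_mb(size_mb))
```

For a 1000 MB file under these assumptions, the file splits into 8 blocks, is stored as 24 block replicas, and uses 3000 MB of raw capacity, which is the kind of figure worth checking when sizing a cluster.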
Creating a fault-tolerant file system
Isolating single points of failure
Maintaining High Availability
Triggering manual failover
Automating failover with ZooKeeper
Leveraging NameNode Federation
Extending HDFS resources
Managing the namespace volumes
Critiquing the YARN architecture
Identifying the new daemons
Setting quotas to constrain HDFS utilisation
Prioritising access to MapReduce using schedulers
Starting and stopping Hadoop daemons
Monitoring HDFS status
Adding and removing data nodes
Managing MapReduce jobs
Tracking progress with monitoring tools
Commissioning and decommissioning compute nodes
Employing the standard built-in tools
Managing and debugging processes using JVM metrics
Performing Hadoop status checks
Tuning with supplementary tools
Assessing performance with Ganglia
Benchmarking to ensure continued performance
Simplifying information access
Enabling SQL-like querying with Hive
Installing Pig to create MapReduce jobs
Integrating additional elements of the ecosystem
Imposing a tabular view on HDFS with HBase
Configuring Oozie to schedule workflows
Facilitating generic input/output
Moving bulk data into and out of Hadoop
Transmitting HDFS data over HTTP with WebHDFS
Acquiring application-specific data
Collecting multi-sourced log files with Flume
Importing and exporting relational information with Sqoop
Planning for Backup, Recovery and Security
Coping with inevitable hardware failures
Securing your Hadoop cluster