
BigData Hadoop Online Training



Course Content:


HADOOP Admin Content:


The Case for Apache Hadoop
• A brief history of Hadoop
• Core Hadoop components
• Fundamental concepts

The Hadoop Distributed File System
• HDFS features
• HDFS design assumptions
• Overview of HDFS architecture
• Writing and reading files
• NameNode considerations
• An overview of HDFS security
• Hands-On Exercise

MapReduce
• What is MapReduce?
• Features of MapReduce
• Basic MapReduce concepts
• Architectural overview
• Failure recovery
• Hands-On Exercise

An Overview of the Hadoop Ecosystem
• What is the Hadoop ecosystem?
• Integration tools
• Analysis tools
• Data storage and retrieval tools

Planning your Hadoop Cluster
• General planning considerations
• Choosing the right hardware
• Network considerations
• Configuring nodes

Hadoop Installation
• Installing Hadoop
• Using Cloudera Manager for easy installation
• Basic configuration parameters
• Hands-On Exercise

Advanced Configuration
• Advanced parameters
• Configuring rack awareness
• Configuring Federation
• Configuring High Availability

Managing and Scheduling Jobs
• Managing running jobs
• Hands-On Exercise
• The FIFO Scheduler
• The FairScheduler
• Configuring the FairScheduler
• Hands-On Exercise

Cluster Maintenance
• Checking HDFS status
• Hands-On Exercise
• Copying data between clusters
• Adding and removing cluster nodes
• Rebalancing the cluster
• Hands-On Exercise
• NameNode Metadata backup
• Cluster upgrading

Cluster Monitoring and Troubleshooting
• General system monitoring
• Managing Hadoop's log files
• Using the NameNode and JobTracker Web UIs
• Hands-On Exercise
• Cluster monitoring with Ganglia
• Common troubleshooting issues
• Benchmarking your cluster

Populating HDFS from External Sources
• An overview of Flume
• Hands-On Exercise
• An overview of Sqoop
• Best practices for importing data

Installing and Managing Other Hadoop Projects
• Hive
• Pig
• HBase

HADOOP Developer Content:


The Motivation for Hadoop
• Problems with traditional large-scale systems
• Requirements for a new approach

Hadoop: Basic Concepts
• An Overview of Hadoop
• The Hadoop Distributed File System
• Hands-On Exercise
• How MapReduce Works
• Hands-On Exercise
• Anatomy of a Hadoop Cluster
• Other Hadoop Ecosystem Components

Writing a MapReduce Program
• The MapReduce Flow
• Examining a Sample MapReduce Program
• Basic MapReduce API Concepts
• The Driver Code
• The Mapper
• The Reducer
• Hadoop's Streaming API
• Using Eclipse for Rapid Development
• Hands-On Exercise
• The New MapReduce API

Integrating Hadoop into the Workflow
• Relational Database Management Systems
• Storage Systems
• Importing Data from RDBMSs With Sqoop
• Hands-On Exercise
• Importing Real-Time Data with Flume
• Accessing HDFS Using FuseDFS and Hoop

Delving Deeper Into The Hadoop API
• More about ToolRunner
• Testing with MRUnit
• Reducing Intermediate Data With Combiners
• The configure and close methods for Map/Reduce Setup and Teardown
• Writing Partitioners for Better Load Balancing
• Hands-On Exercise
• Directly Accessing HDFS
• Using the Distributed Cache
• Hands-On Exercise

Common MapReduce Algorithms
• Sorting and Searching
• Indexing
• Machine Learning With Mahout
• Term Frequency – Inverse Document Frequency
• Word Co-Occurrence
• Hands-On Exercise

Using Hive and Pig
• Hive Basics
• Pig Basics
• Hands-On Exercise

Practical Development Tips and Techniques
• Debugging MapReduce Code
• Using LocalJobRunner Mode For Easier Debugging
• Retrieving Job Information with Counters
• Logging
• Splittable File Formats
• Determining the Optimal Number of Reducers
• Map-Only MapReduce Jobs
• Hands-On Exercise

Joining Data Sets in MapReduce
• Map-Side Joins
• The Secondary Sort
• Reduce-Side Joins

Graph Manipulation in Hadoop
• Introduction to graph techniques
• Representing graphs in Hadoop
• Implementing a sample algorithm: Single Source Shortest Path

Creating Workflows With Oozie
• The Motivation for Oozie
• Oozie's Workflow Definition Format
• Hands-On Exercise


