Hadoop Development Course
Curriculum
Introduction to Big Data
What is Big Data
Reasons for Big Data Generation
Use cases of Big Data
Options for Analyzing Big Data
Introduction to Hadoop
What is Hadoop & Hadoop History
Problems with Traditional Large-Scale Systems and Need for Hadoop
Understanding Hadoop Architecture
Fundamentals of HDFS (Blocks, NameNode, DataNode, Secondary NameNode)
Block Placement & Rack Awareness
HDFS Read/Write/Delete
Under/Over/Missing Replication
Types of Scaling (Horizontal/Vertical)
Drawbacks of Hadoop 1.x
Introduction to Hadoop 2.x
HDFS Federation and High Availability
Starting Hadoop
Setting up a single-node Hadoop cluster (pseudo-distributed mode)
Understanding Hadoop configuration files
Hadoop Components: HDFS, MapReduce
Overview of Hadoop Processes
Overview of the HDFS File System
The building blocks of Hadoop
Hands-On Exercise: Using HDFS commands
MapReduce-1 (MR v1)
Understanding MapReduce
Job Tracker and Task Tracker
Architecture of MapReduce
Data Flow of MapReduce
Hadoop Writable, Comparable & comparison with Java data types
Creation of local files and directories with Hadoop API
Creation of HDFS files and directories with Hadoop API
Map Function & Reduce Function
How MapReduce Works
Anatomy of a MapReduce Job
Submission & Initialization of a MapReduce Job
Monitoring & Progress of a MapReduce Job
Understanding the Difference Between a Block and an Input Split
Role of Record Reader, Shuffler and Sorter
File Input Formats
Setting up Eclipse Development Environment
Creating MapReduce Projects
Identity Reducer
MapReduce program flow with a word count example
Speculative execution
Schedulers (FIFO Scheduler, Fair Scheduler, Capacity Scheduler)
MapReduce-2 (YARN)
Limitations of the MR v1 Architecture
YARN Architecture
Application Master, Node Manager & Resource Manager
Developing MapReduce Applications using YARN
HIVE
Introduction to Apache Hive
Architecture of Hive
Installing Hive
Hive data types
Exploring Hive metastore tables
Types of Tables in Hive
Partitions (Static & Dynamic)
Buckets & Sampling
Indexes & Views
Developing Hive scripts
PIG
Introduction to Apache Pig
Building Blocks (Bag, Tuple & Field)
Installing Pig
Pig Terminology & Data types
Different modes of execution of Pig
Working with Pig commands and built-in functions
Developing Pig scripts
Joins (Left Outer, Right Outer, Full Outer)
Nested queries
Specialized joins in Pig (Replicated, Skewed & Merge Join)
HCatalog (Getting data from Hive to Pig & vice versa)
Working with unstructured data
Working with semi-structured data like XML, JSON
SQOOP
Introduction to Sqoop & its Architecture
Import data from RDBMS to HDFS
Importing data from RDBMS to Hive
Exporting data from Hive to RDBMS
Handling incremental loads using Sqoop
HBASE
Introduction to HBase
Exploring HBase Master & RegionServer
Exploring ZooKeeper
Hive integration with HBase (HBase-managed Hive tables)
OOZIE
What is Oozie & Why Oozie
Features of Oozie
Job Types in Oozie
Control Nodes & Action Nodes
Oozie Workflow Process flow
Oozie Web Console
FAQs, Real-Time Scenarios and Interview Questions