Hadoop

HADOOP Admin & Development

Introduction to Big Data and Hadoop

  1. Understanding Big Data
  2. Challenges in processing Big Data
  3. 3V Characteristics (Volume, Variety and Velocity)
  4. Brief history of Hadoop
  5. How Hadoop addresses Big Data?
  6. Core Hadoop Daemons
  7. Hadoop ecosystem
  8. Hadoop Clusters

Linux Commands: Hands-on

HDFS (Hadoop Distributed File System)

  1. HDFS Overview and Architecture
  2. HDFS terminology: NameNode, DataNode, heartbeat, etc.
  3. Configuring HDFS
  4. Data Flows (Read and Write)
  5. HDFS Permissions and Security
  6. HDFS commands
  7. HDFS from an administrator's standpoint
  8. Rack Awareness

MapReduce

  1. Basics of MapReduce
  2. MapReduce Data Flow
  3. Solving the Word Count example
  4. Developing a MapReduce Application
  5. Configuring MapReduce
  6. Two ways of executing a MapReduce program
  7. Input and Output file formats
  8. Driver, Mapper and Reducer code walkthrough
  9. Hadoop Integration with Eclipse in Linux
  10. Partitioners
  11. MapReduce Web UI
  12. Joins and the distributed cache
  13. Compression techniques in MapReduce

How MapReduce Works

  1. Classic MapReduce (MapReduce 1)
  2. YARN (MapReduce 2)
  3. Shuffle and Sort
  4. Job Chaining
  5. Input formats – Input splits & custom file input formats
  6. Output formats – text output, custom file output formats
  7. Hands-on
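
As a rough illustration of the map, shuffle-and-sort and reduce stages listed above, the word count example can be sketched in plain Python. This is not Hadoop API code: it runs in memory on a single machine, and the function names (map_phase, shuffle_and_sort, reduce_phase, word_count) are hypothetical names chosen for this sketch.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_and_sort(pairs):
    """Shuffle and sort: group the values by key and order the keys."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return sorted(grouped.items())


def reduce_phase(key, values):
    """Reducer: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run the three stages in sequence over a list of text lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    return [reduce_phase(key, values) for key, values in shuffle_and_sort(mapped)]


if __name__ == "__main__":
    sample = ["big data big clusters", "data flows"]
    print(word_count(sample))
```

In a real cluster the mapper and reducer would run as separate tasks on different nodes, and the framework would perform the grouping step between them; the sketch only mirrors the order of the stages.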

Hadoop Ecosystem: PIG

  1. Overview of PIG
  2. PIG Latin
  3. Why PIG?
  4. Loading and storing data
  5. 21 Transformations of PIG
  6. Local and HDFS modes of PIG
  7. Grunt Shell
  8. Script and Embedded modes of processing using PIG
  9. Understanding Complex data types of PIG
  10. Word Count using PIG
  11. Hands-on

HIVE

  1. Overview of HIVE
  2. PIG vs HIVE
  3. HiveQL
  4. Managed and External Tables
  5. LOAD vs INSERT
  6. Views
  7. CTAS
  8. Partitioning
  9. Bucketing
  10. Dynamic partitioning vs Bucketing
  11. OVERWRITE keyword
  12. Collection Data types in HIVE
  13. Date type in HIVE
  14. ORC File Format and other File Formats
  15. Understanding SerDe
  16. Types of Hive JOINS
  17. Tuning Hive JOINS
  18. Vectorization
  19. Exploring HIVE User Defined Functions
  20. HIVE Unions
  21. Hands-on
  22. Temporary Tables
  23. Delete, Update Operations

HBASE

  1. Overview of HBASE
  2. NoSQL vs RDBMS
  3. HBASE vs HDFS
  4. HBASE Shell
  5. CRUD with the Java API
  6. Hands-on

SQOOP

  1. Overview
  2. Data Ingestion mechanisms
  3. Granting access privileges in MySQL
  4. Importing from MySQL with SQOOP
  5. Exporting to MySQL with SQOOP
  6. Incremental append
  7. Working with SQOOP jobs

Understanding OOZIE with use cases

Understanding FLUME with use cases

Hadoop Developer Admin

  1. Single Node Hadoop Cluster setup
    1. OS installation
    2. SSH Setup
    3. Java Setup
    4. Hadoop Installation
    5. Configuring Hadoop
  2. Multi Node Hadoop Cluster setup
  3. Installation of PIG, HIVE and SQOOP2 components

Assignments at the end of every component

5 POCs (proofs of concept)

Real-time project explanation

Training Details:

Course Duration: 40-50 days (by mutual agreement) + assignments

Note: We provide soft copies of the materials and recorded videos of all classes directly to students. If any issues arise, we will hold an internal meeting with the trainers.

Prerequisites to Learn Hadoop:

Basic knowledge of:

  • Core Java
  • SQL
  • Linux commands

Who Should Join this Course:

This course is designed for people who want to learn and work in the Big Data world using the Hadoop framework and become Hadoop Developers. IT freshers, graduates and postgraduates from other domains who know the prerequisites, software professionals, analytics professionals, and ETL developers are the key beneficiaries of this course.

Eligibility:

Graduate or postgraduate degree (B.Tech, MCA, M.Sc, M.S, B.Sc, BCA), MBA, or computer professional.

Duration: 40 Days

Fee: Rs. 25,000/-

Batches

26 Oct | Prof. Madhu | 40 Days | 06:15 AM - 08:15 AM | 25000 | 30000
