About The Course
This Big Data Hadoop Certification course is designed to give you an in-depth knowledge of the big data framework using Hadoop and Spark. In this hands-on big data course, you will execute real-life, industry-based projects using Simplilearn’s integrated labs.
The course will help you develop the right skills needed to be qualified as a Big Data Developer.
This course is brought to you by JobsForHer Foundation in association with Simplilearn.
Course Key Features:
74 hours of blended learning
22 hours of online self-paced learning
52 hours of instructor-led training
Four industry-based course-end projects
Interactive learning with integrated labs
Curriculum aligned to the Cloudera CCA175 certification exam
Training on essential big data and Hadoop ecosystem tools, and Apache Spark
Dedicated mentoring session from faculty of industry experts
Course Duration: 4-5 weeks
Course Start Date: Shortlisted candidates will be notified via email
Certificate: Upon completion of the course
Program Fees: This course is priced at Rs. 23,625, but candidates who are shortlisted for the scholarship can take it free of cost.
Online or Offline: Online bootcamp (online self-learning and live instructor-led classes)
Who Should Enroll
- Analytics professionals
- Senior IT professionals
- Testing and mainframe professionals
- Data management professionals
- Business intelligence professionals
- Project managers
- Graduates looking to begin a career in big data analytics
Prerequisites
It is recommended that you have knowledge of:
- Core Java
- SQL
Course Takeaways
This Big Data Hadoop and Spark Developer course will enable you to:
- Learn how to navigate the Hadoop ecosystem and understand how to optimize its use
- Ingest data using Sqoop, Flume, and Kafka
- Implement partitioning, bucketing, and indexing in Hive
- Work with RDD in Apache Spark
- Process real-time streaming data
- Perform DataFrame operations in Spark using SQL queries
- Implement User-Defined Functions (UDF) and User-Defined Aggregate Functions (UDAF) in Spark
Certification Alignment: Our curriculum is aligned to the Cloudera CCA175 certification exam.
Certification Details and Criteria:
1) Completion of at least 85 percent of online self-paced learning or attendance of one live virtual classroom
2) A score of at least 75 percent in course-end assessment
3) Successful evaluation in at least one project
Course End Projects:
The course includes four real-world, industry-based projects. The successful evaluation of one of the following projects is a part of the certification eligibility criteria:
Project 1: Analyzing Historical Insurance Claims
Use Hadoop features to predict patterns and share actionable insights for a car insurance company.
This project uses New York Stock Exchange data from 2010 to 2016, captured from 500+ listed companies. The data set consists of each listed company’s intraday prices and volume traded. The data is used in both machine learning and exploratory analysis projects for the purposes of automating the trading process and predicting the next trading-day winners or losers. The scope of this project is limited to exploratory data analysis.
Domain: BFSI
Project 2: Employee Review of Comment Analysis
Use Hive features for data analysis and share the actionable insights with the HR team for the purpose of taking corrective actions.
The HR team is surfing social media to gather current and ex-employee feedback and sentiments. This information will be used to derive actionable insights and take corrective actions to improve the employer-employee relationship. The data is web-scraped from Glassdoor and contains detailed reviews of 67K employees from Google, Amazon, Facebook, Apple, Microsoft, and Netflix.
Domain: Human Resources
Project 3: K-Means Clustering for Telecommunication Domain
LoudAcre Mobile is a mobile phone service provider which has introduced a new open network campaign. As a part of this campaign, the company has invited users to complain about mobile phone network towers in their area if they are experiencing connectivity issues with their present mobile network. LoudAcre has collected the dataset of users who have complained.
Domain: Telecommunication
Project 4: Market Analysis in Banking Domain
Our client, a Portuguese banking institution, ran a marketing campaign to convince potential customers to invest in a bank term deposit promotion. The marketing campaign pitches were delivered by phone calls. Often, however, the same customer was contacted more than once.
You have to perform the marketing analysis of the data generated by this campaign, keeping in mind the redundant calls.
Domain: Banking (Market Analysis)
Tools Covered:
1. Apache Hive
2. Apache Impala by Cloudera
3. Apache Spark
4. Apache Flume
5. Hadoop MapReduce
6. Apache Kafka
7. Apache Sqoop