AWS Big Data Course
Course Curriculum:
Lesson 01 – AWS in Big Data Introduction
Introduction to Cloud Computing
Cloud Computing Deployment Models
Amazon Web Services Cloud Platform
The Cloud Computing Difference
AWS Cloud Economics
AWS Virtuous Cycle
AWS Cloud Architecture Design Principles
Why AWS for Big Data – Reasons
Why AWS for Big Data – Challenges
Databases in AWS
Relational vs Non-Relational Databases
Data Warehousing in AWS
Services for Collecting, Processing, Storing, and Analyzing Big Data
1. Amazon Redshift
2. Amazon Kinesis
3. Amazon EMR
4. Amazon DynamoDB
5. Amazon Machine Learning
6. AWS Lambda
7. Amazon Elasticsearch Service
8. Amazon EC2 (big data analytics software on EC2 instances)
Amazon Redshift
Amazon Kinesis
Amazon EMR
Amazon DynamoDB
Amazon Machine Learning
AWS Lambda
Amazon Elasticsearch Service
Amazon EC2 (big data analytics software on EC2 instances)
Key Takeaway
Knowledge Checks
Lesson End Project
Lesson 02 – Collection
Objectives
Amazon Kinesis Fundamentals
Loading Data into Kinesis Stream
Kinesis Data Stream High-Level Architecture
Kinesis Stream Core Concepts
Kinesis Stream Emitting Data to AWS Services
Kinesis Connector Library
Kinesis Firehose
Transferring Data Using Lambda
Amazon SQS
IoT and Big Data
IoT Framework
AWS Data Pipeline
AWS Data Pipeline Components
Key Takeaway
Knowledge Checks
Lesson End Project
Lesson 03 – Storage
Objectives
Introduction to AWS Big Data Storage Services
Amazon Glacier
Glacier and Big Data
DynamoDB Introduction
The Architecture of the DynamoDB Table
DynamoDB in AWS Ecosystem
DynamoDB Partitions
Data Distribution
Local Secondary Index (LSI)
Global Secondary Index (GSI)
DynamoDB GSI vs LSI
DynamoDB Stream
Cross-Region Replication in DynamoDB
Partition Key Selection
Snowball & AWS Big Data
AWS DMS
Amazon Aurora in Big Data
Key Takeaway
Knowledge Checks
Lesson End Project
Lesson 04 – Processing I
Objectives
Introduction to AWS Big Data Processing Services
Amazon Elastic MapReduce (EMR)
Apache Hadoop
EMR Architecture
Storage Options
EMR File Storage and Compression
Supported File Format and File Size
Single-AZ Concept
EMR Operations
EMR Releases
AWS Cluster
Launching a Cluster
Advanced EMR Setting Option
Choosing Instance Type
Number of Instances
Monitoring EMR
Resizing of Cluster
Using Hue with EMR
Setup Hue for LDAP
Hive on EMR
Hive Use Cases
Key Takeaway
Knowledge Checks
Lesson End Project
Lesson 05 – Processing II
HBase with EMR
HBase Use Cases
Comparison of HBase with Redshift and DynamoDB
HBase Architecture
HBase on S3
HBase and EMRFS
HBase Integration
HCatalog
Presto with EMR
Advantages of Presto
Presto Architecture
Spark with EMR
Spark Use Cases
Spark Components
Spark Integration With EMR
AWS Lambda in AWS Big Data Ecosystem
Limitations of Lambda
Lambda and Kinesis Stream
Lambda and Redshift
Key Takeaway
Knowledge Checks
Lesson End Project
Lesson 06 – Analysis I
Objectives
Introduction to AWS Big Data Analysis Services
Redshift
Redshift Architecture
Redshift in the AWS Ecosystem
Columnar Databases
Redshift Table Design
Redshift Workload Management
Redshift Loading Data
Redshift Maintenance and Operations
Key Takeaway
Knowledge Checks
Lesson End Project
Lesson 07 – Analysis II
Machine Learning
Machine Learning – Use Cases
Algorithms
Amazon SageMaker
Elasticsearch
Amazon Elasticsearch Service
Loading of Data into Elasticsearch
Logstash
Kibana
RStudio
Characteristics
Athena
Presto and Hive
Integration with AWS Glue
Comparison of Athena with Other AWS Services
Lab: Run Query on S3 Using Serverless Athena
Key Takeaway
Knowledge Checks
Lesson End Project
Lesson 08 – Visualization
Objectives
Introduction to AWS Big Data Visualization Services
Amazon QuickSight
Amazon QuickSight – Use Cases
Lab: Create an Analysis with a Single Visual Using Sample Data
Working with Data
Assisted Practice: TBD
QuickSight Visualization
Big Data Visualization
Apache Zeppelin
Jupyter Notebook
Comparison Between Notebooks
D3.js (Data-Driven Documents)
MicroStrategy
Key Takeaway
Knowledge Checks
Lesson End Project
Lesson 09 – Security
Objectives
Introduction to AWS Big Data Security Services
EMR Security
Roles
Private Subnet
Encryption At Rest and In Transit
Redshift Security
KMS Overview
CloudHSM
Limit Data Access
STS and Cross Account Access
CloudTrail
Key Takeaway
Knowledge Checks
Lesson End Project