Gammatica

Big Data - Advanced

Learning Format

Online mode

Total training duration

80-90 hrs (2 months)

Syllabus

6 Weeks

Certification

Yes

Big Data Engineering – Advanced

The Advanced course focuses on building enterprise-level big data solutions using advanced tools and cloud platforms. Learners will master real-time data processing, data lake architecture, data governance, and performance optimization. It includes hands-on projects using Apache Spark, Kafka, Airflow, and cloud services (AWS/GCP/Azure) to design end-to-end scalable data systems.

Syllabus Summary

Kafka Basics

  • Kafka architecture (topics, partitions, brokers)
  • Producers & Consumers
  • Hands-on: Ingest sample data into Kafka topics
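
As a rough illustration of how keyed messages end up in partitions, the sketch below routes records by hashing their key. It is plain Python rather than a Kafka client, and the names pick_partition and route are hypothetical. zlib.crc32 is only a stand-in for a stable hash; Kafka's default partitioner uses a different hash function (murmur2).

```python
import zlib


def pick_partition(key: str, num_partitions: int) -> int:
    """Map a message key to a partition index (illustrative only)."""
    # crc32 is a stand-in for a stable hash, not the hash Kafka itself uses.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(messages, num_partitions):
    """Group (key, value) messages by partition, preserving arrival order."""
    partitions = {p: [] for p in range(num_partitions)}
    for key, value in messages:
        partitions[pick_partition(key, num_partitions)].append((key, value))
    return partitions


if __name__ == "__main__":
    sample = [("user-1", "login"), ("user-2", "login"), ("user-1", "purchase")]
    for partition, records in route(sample, 3).items():
        print(partition, records)
```

Because the same key always hashes to the same partition, all events for one key stay in order within that partition, which is the property the partitioning topic builds on.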

Kafka Integration with Spark

  • Kafka → Spark Structured Streaming ingestion
  • Running streaming ETL jobs in PySpark
  • Hands-on: Real-time pipeline from Kafka → Spark

Databricks for Big Data Engineering

  • Databricks architecture (clusters, notebooks, jobs)
  • Running PySpark jobs on Databricks
  • Using Delta Lake in Databricks (schema enforcement, time travel)
  • Hands-on: Batch ETL pipeline in Databricks

Advanced Databricks Use Cases

  • Integrating Kafka streams into Databricks
  • Optimizing Delta tables (merge, upserts, deletes)
  • Job scheduling and monitoring in Databricks
  • Mini Project: Near real-time ETL with Databricks
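
The merge, upsert, and delete behaviour listed above can be pictured with a small in-memory sketch. This is plain Python, not the Delta Lake API; the function merge_upsert and the "_deleted" flag are hypothetical names chosen for illustration. It assumes each row has a unique "id" key.

```python
def merge_upsert(target, updates, key="id"):
    """Apply updates to target rows: update matches, insert new keys, drop flagged rows."""
    # Copy each row so the caller's target list is left untouched.
    merged = {row[key]: dict(row) for row in target}
    for row in updates:
        if row.get("_deleted"):
            # Matched (or unmatched) rows flagged for deletion are removed.
            merged.pop(row[key], None)
        else:
            # Matched rows are updated field by field; unmatched rows are inserted.
            merged[row[key]] = {**merged.get(row[key], {}), **row}
    return sorted(merged.values(), key=lambda r: r[key])


if __name__ == "__main__":
    target = [{"id": 1, "qty": 5}, {"id": 2, "qty": 3}]
    updates = [{"id": 2, "qty": 9}, {"id": 3, "qty": 1}, {"id": 1, "_deleted": True}]
    print(merge_upsert(target, updates))
```

In Delta Lake itself the same idea is expressed declaratively with a MERGE statement and its matched / not matched clauses; the sketch only mirrors the row-level outcome.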

Airflow Fundamentals

  • Airflow architecture (scheduler, webserver, workers)
  • Writing first DAGs for Spark jobs
  • Operators, tasks, and dependencies
  • Hands-on: Simple batch workflow in Airflow
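
To make the idea of tasks and dependencies concrete, the sketch below derives a run order from a dependency map using Python's standard library. It does not use Airflow; the task names and the run_order helper are hypothetical. In an Airflow DAG the same relationships would typically be declared between operator tasks, for example with the >> operator.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks that must finish before it starts.
DEPENDENCIES = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
}


def run_order(dependencies):
    """Return one valid execution order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for task in run_order(DEPENDENCIES):
        print("running", task)
```

A scheduler works from the same kind of graph: a task becomes eligible only after all of its upstream tasks have completed.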

Hive Queries & Integrations

  • Hive DDL/DML commands
  • Partitioning & Bucketing basics
  • Performance considerations in Hive
  • Mini Project: Sales dataset analysis in Hive
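
As a minimal sketch of the difference between partitioning and bucketing, the Python below places a sales row into a partition directory and a bucket. The table name, column names, and helper functions are hypothetical, and the modulo on an integer ID is a simplified stand-in for the hash Hive applies to the bucketing column.

```python
def partition_path(table: str, column: str, value) -> str:
    """Build a partition directory in the column=value layout Hive uses."""
    return f"{table}/{column}={value}"


def bucket_for(customer_id: int, num_buckets: int) -> int:
    """Pick a bucket for a row (simplified stand-in for Hive's hashing)."""
    return customer_id % num_buckets


def place_row(row: dict, table: str = "sales", num_buckets: int = 4):
    """Return (partition directory, bucket index) for one row."""
    path = partition_path(table, "region", row["region"])
    return path, bucket_for(row["customer_id"], num_buckets)


if __name__ == "__main__":
    print(place_row({"region": "east", "customer_id": 7, "amount": 120.0}))
```

Partitioning lets a query that filters on region skip whole directories, while bucketing spreads rows within a partition into a fixed number of files, which helps with joins and sampling on the bucketed column.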

Course Summary

Eligibility

Working professionals (technical and non-technical), freshers, and graduates from any domain.

Live Doubt Solving

Get your queries resolved in dedicated daily doubt-solving sessions.

Instructor

Experts and trainers for top tech companies.

Certification

10+ ISO-certified, globally recognized certifications.

Mode of Learning

100% Live Learning with experienced instructors and hands-on sessions.

Real-Time Projects

Get practical experience with real-world projects for a career in analytics.

Gammatica is a company dedicated to providing high-quality coaching classes for students, designed to foster academic success and personal growth.

© 2025 Developed By OMX Technologies
