
Applied Data Engineering


Course Overview

Upon completion of this course, participants will be familiar with the major aspects of Big Data Analytics and its ecosystem. They will be able to develop, construct, test and maintain architectures such as databases and large-scale processing systems; perform batch and real-time streaming analytics on structured and unstructured data; apply professional data management practices; and create visualizations and dashboards. The course provides in-depth, step-by-step, hands-on experience.

Audience Profile

This course is suitable for candidates who want to learn more about the Big Data Analytics ecosystem, are looking to become full-fledged Data Engineers, or wish to acquire technical know-how in the area of Data Science. Participants should preferably have some knowledge of Python programming.

Course Outline

  • Introduction to Big Data
  • Features of Data Engineering
  • ETL/ELT Concepts and Best Practices
  • Metadata Management
  • Consolidating Multiple Data Sources
  • Data Ingestion, Cleansing & Transformation
  • Hadoop Architecture and Ecosystem
  • Flat Files Ingestion into Hadoop
  • RDBMS & Hadoop Integration
  • Hive Data Processing
  • Interactive Query using Impala
  • Log Files Handling and Processing
  • Data Web Scraping
  • Introduction to Spark
  • Processing Data using PySpark
  • Spark Data Query
  • Real-time Data Analytics in Spark Streaming
  • Troubleshooting ETL Jobs
  • Performance Optimization


Code: ADE
Duration: 5 Days
Full Price: RM 7,200
Early Bird Price: RM 6,500

Sign up and enjoy the early bird discount!