Data Engineering Course

Course Structure*

Module 1: Introduction to Data Engineering
	Sprint 1	Intermediate Python & Git - Python data model, Python sequences, Git basics
	Sprint 2	Introduction to Relational Databases & SQL Basics - Python mutability and object references, SQL queries
	Sprint 3	Intermediate SQL - SQL joins, subqueries, sets, and strings
Module 2: Fundamentals of Data Engineering
	Sprint 1	Advanced Python & Linux Shell Commands - Linux distributions and architecture, shell commands, Python interfaces and inheritance
	Sprint 2	Managing Relational Databases & Advanced SQL - database security and compliance, Python iterators and generators, SQL indices, transactions, and views
	Sprint 3	Working with Data Pipelines & Apache Airflow - constructing ETL pipelines, Airflow DAGs, and workflows
Module 3: Intermediate Data Engineering
	Sprint 1	Data Warehousing & dbt - enterprise data warehousing, defining data models with dbt
	Sprint 2	Data Mesh & ML Systems Design - architecture and principles of data mesh, feature engineering, model development and evaluation
	Sprint 3	Docker & Intro to MLOps - Docker basics, container concepts and containerization principles, ML model monitoring, and continual learning
Specialization modules (optional)
(learner must choose at least one)
Module 4A		Google Cloud Platform
Module 4B		Amazon Web Services
Specialization		Data analysis and visualisation with Python

*Turing College reserves the right to update and/or amend the course curriculum and its structure, and to release new course versions.