Part - the smallest piece of the curriculum, a notebook requiring ~10 hours of study time. Usually, 5 Parts make up a Sprint. A Part can contain a Project requiring corrections (usually the 5th part of a regular Sprint) or theoretical knowledge with some practical exercises and a quiz (usually the first 4 Parts of a Sprint). To progress further in the course, either a quiz or a correction needs to be completed.

Project - a Part dedicated entirely to practical work. A project aims to incorporate as many topics from the current and previous sprints as possible so that you can practice your skills. Most projects require 1 STL correction and 1 peer correction to be passed.

Sprint - a larger piece of the curriculum requiring ~50 hours to complete. It is either a collection of 5 Parts out of which one is a project, or one larger capstone project. A sprint always requires a correction to be passed.

Capstone project - a practical task at the end of a module that takes a whole sprint (~50 hours) to complete. It lets you practice all of the skills learned throughout the module.

Module - the largest piece of the curriculum, usually made up of 3 regular Sprints and 1 Capstone Project Sprint. It takes about 200 hours to complete. Some modules can be optional.

Specialisation module - a module that a learner chooses from a pool of options depending on the data roles and companies that they plan on applying to. The module covers the tools, skills and technologies needed for specific roles or companies. Most specialisation modules are prepared in cooperation with our Hiring Partners.
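
The time estimates in the definitions above add up in a simple way. The following is a minimal sketch of that arithmetic, assuming the approximate figures given (about 10 hours per Part, 5 Parts per Sprint, 4 Sprints per Module); the constant and function names are illustrative only and are not part of the platform.

```python
# Approximate figures taken from the definitions above.
# All names here are hypothetical, chosen for illustration.
HOURS_PER_PART = 10
PARTS_PER_SPRINT = 5
SPRINTS_PER_MODULE = 4  # usually 3 regular Sprints + 1 Capstone Project Sprint


def sprint_hours() -> int:
    """Approximate study time for one regular Sprint."""
    return HOURS_PER_PART * PARTS_PER_SPRINT


def module_hours() -> int:
    """Approximate study time for one Module."""
    return sprint_hours() * SPRINTS_PER_MODULE


print(sprint_hours())  # 50
print(module_hours())  # 200
```

These are estimates only; actual study time varies between learners and modules.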

Course Structure*

Module 1: Introduction to Data Engineering

Sprint 1

Intermediate Python & Git - Python data model, Python sequences, Git basics

Sprint 2

Introduction to Relational Databases & SQL Basics - Python mutability and object references, SQL queries

Sprint 3

Intermediate SQL - SQL joins, subqueries, sets, and strings

Sprint 4

Capstone project

Module 2: Fundamentals of Data Engineering

Sprint 1

Advanced Python & Linux Shell Commands - Linux distribution and architecture, shell commands, Python interfaces and inheritance

Sprint 2

Managing Relational Databases & Advanced SQL - database security and compliance, Python iterators and generators, SQL indices, transactions, and views

Sprint 3

Working with Data Pipelines & Apache Airflow - constructing ETL pipelines, Airflow DAGs, and workflows

Sprint 4

Capstone project

Module 3: Intermediate Data Engineering

Sprint 1

Data Warehousing & dbt - enterprise data warehousing, defining data models with dbt

Sprint 2

Data Mesh & ML systems design - architecture, principles of data mesh, feature engineering, model development, and evaluation

Sprint 3

Docker & Intro to MLOps - Docker basics, the container concept, containerization principles, ML model monitoring, and continual learning

Sprint 4

Capstone project

Module 4: Specialisation module

(Choose one)

Google Cloud Platform

Amazon Web Services

Microsoft Azure

Spark & Hadoop

*Turing College reserves the right to update and/or amend the course curriculum and its structure, as well as to release new course versions. Major changes are most likely in the first (pilot) batches of the course.


Choosing a specialisation module

You choose your specialisation module after you complete the first 3 modules of the course. Some things to consider when choosing:

  • Which areas do I want to get better in?

  • Which companies and positions am I most interested in, and are they looking for specific skills that these modules offer?

You will be able to make the choice in the platform. Most learners are expected to do either 4A or 4B; if you would like to do both, let us know via the support chat in the platform.
