Who is this course for?
This Nanodegree program offers an ideal path for experienced programmers to advance their data engineering career. If you enjoy solving important technical challenges and want to learn to work with massive datasets, this is a great way to get hands-on practice with a variety of data engineering principles and techniques.
Course Syllabus
Data Modeling
Learn to create relational and NoSQL data models to fit the diverse needs of data consumers. Use ETL to build databases in PostgreSQL and Apache Cassandra.
Data Modeling with Postgres
Data Modeling with Apache Cassandra
In this project, you’ll model user activity data for a music streaming app called Sparkify. You’ll create a NoSQL database and an ETL pipeline designed to optimize queries about which songs users are listening to. You’ll model your data in Apache Cassandra to support the specific queries provided by the analytics team at Sparkify.
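As a rough illustration of what modeling data around specific queries means in Cassandra, the sketch below designs one table for one query ("which songs were played in a given session?"). The table name, column names, and sample rows are hypothetical and are not taken from the project materials. The plain-Python grouping only mimics how a partition key and a clustering column organize rows; it does not connect to a Cassandra cluster.

```python
from collections import defaultdict

# Hypothetical event rows; column names are illustrative, not the project's schema.
events = [
    {"session_id": 338, "item_in_session": 4, "artist": "Artist A", "song": "Song X"},
    {"session_id": 338, "item_in_session": 2, "artist": "Artist B", "song": "Song Y"},
    {"session_id": 182, "item_in_session": 0, "artist": "Artist C", "song": "Song Z"},
]

# CQL that a matching Cassandra table might use (assumed schema).
# The partition key (session_id) matches the query's WHERE clause, and the
# clustering column (item_in_session) orders rows inside each partition.
CREATE_SONGS_BY_SESSION = """
CREATE TABLE IF NOT EXISTS songs_by_session (
    session_id int,
    item_in_session int,
    artist text,
    song text,
    PRIMARY KEY ((session_id), item_in_session)
)
"""


def build_songs_by_session(rows):
    """Group rows by partition key and sort each partition by clustering column."""
    table = defaultdict(list)
    for row in rows:
        table[row["session_id"]].append(
            (row["item_in_session"], row["artist"], row["song"])
        )
    for partition in table.values():
        partition.sort()  # tuples sort by item_in_session first
    return dict(table)


songs_by_session = build_songs_by_session(events)

# Answering the query is a single partition lookup, with rows already ordered.
for item, artist, song in songs_by_session[338]:
    print(item, artist, song)
```

A different query, such as "which users listened to a given song?", would typically get its own table with a different partition key, which is why the queries are fixed before the tables are designed.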
Cloud Data Warehouses
Sharpen your data warehousing skills and deepen your understanding of data infrastructure. Create cloud-based data warehouses on Amazon Web Services (AWS).
Build a Cloud Data Warehouse
Spark and Data Lakes
Understand the big data ecosystem and how to use Spark to work with massive datasets. Store big data in a data lake and query it with Spark.
Build a Data Lake
Data Pipelines with Airflow
Schedule, automate, and monitor data pipelines using Apache Airflow. Run data quality checks, track data lineage, and work with data pipelines in production.
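To make "data quality checks" concrete, here is a minimal sketch of the kind of check a pipeline task might run after loading a table. It uses Python's built-in sqlite3 module as a stand-in for a warehouse connection. The `users` table, the check queries, and the `run_quality_checks` helper are all hypothetical, and no Airflow API is used here; in a real pipeline a function like this would presumably be wrapped in a task.

```python
import sqlite3


def run_quality_checks(conn, checks):
    """Run each (sql, expected) pair and raise ValueError if any result differs."""
    failures = []
    for sql, expected in checks:
        actual = conn.execute(sql).fetchone()[0]
        if actual != expected:
            failures.append(f"{sql!r}: expected {expected}, got {actual}")
    if failures:
        raise ValueError("Data quality checks failed: " + "; ".join(failures))
    return len(checks)


# In-memory database standing in for a warehouse (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id INTEGER, level TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "free"), (2, "paid")])

# Hypothetical checks: no NULL keys, and the table is not empty.
checks = [
    ("SELECT COUNT(*) FROM users WHERE user_id IS NULL", 0),
    ("SELECT COUNT(*) > 0 FROM users", 1),
]

passed = run_quality_checks(conn, checks)
print(f"{passed} checks passed")
```

Raising an exception on failure is a deliberate choice: a scheduler can then mark the task as failed instead of letting bad data flow downstream.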
Data Pipelines with Airflow
Capstone Project
Combine what you've learned throughout the program to build your own data engineering portfolio project.
Data Engineering Capstone
Enrollment Inclusions
Real-world projects from industry experts
Technical mentor support
Career services
Flexible learning program
Additional information
Course Page | https://www.udacity.com/course/data-engineer-nanodegree--nd027 |
---|---|
Program Length | Estimated 5 months at 5-10 hrs/week |
Instructors | Amanda Moran, Ben Goldberg, Sameh El-Ansary, Olli Iivonen, David Drummond, Judit Lantos, Juno Lee |
Scheduled Class Batches? | Yes |
Program Format | Self-paced Online Classes |
Technical or Skill Pre-requisites | Intermediate Python programming knowledge, of the sort gained through the Programming for Data Science Nanodegree program, other introductory programming courses or programs, or additional real-world software development experience. Including: |
Pricing | Monthly Access – Pay as you go: $399 per month |
Financing Options | Available |
Scholarship Programs | Available |
Related Job Positions | Analytics Engineer, Big Data Engineer, Data Platform Engineer, Data Analyst, Data Scientist, Machine Learning Engineer, Software Engineer |
isleepbad –
They recently had the free 1-month offer again and I’m doing it now. Honestly, for me it’s an extremely mixed bag. I’m a beginner data engineer just transitioning into the field, and I’ve had my own personal project going for 4 months. However, a lot of the content filled in the gaps and helped me understand things like data modeling, different architectures, and how to properly use data lakes.
But their exercises are really garbage. A lot of the stuff I am doing on my own the “harder” way. Their IaC section? I did it using Terraform. Their free data sets? I downloaded them and messed around with them using my own copy of the same tools. I used Docker to spin up containers and interacted with them.
All in all for a beginner lacking some concepts it’s great. But you’re better off doing your own projects than using their examples and thinking you learned anything.
mltut –
“
Pros and Cons of Udacity Data Engineering Nanodegree
Pros-
Provides hands-on labs to practice throughout each lesson.
The content is well developed and intuitive.
Provides a good explanation of SQL vs. NoSQL.
Discusses Postgres and Apache Cassandra commands.
Provides perfect exposure to skills required in the data engineering industry.
Focuses on hands-on practice and on how to actually do things like ETL and data warehousing.
Good explanation of distributed file systems and cluster computing.
Clarifies the difference between PySpark DataFrames and PySpark SQL.
Provides Technical mentor support.
Great community to help.
Cons-
Expensive
Some of the lectures are not very polished.
Data Modeling exercises have bugs.
The demonstration code sample is not available to students.
After completing the Nanodegree program, you will not get lifetime access to the course material.”