Spark Developer

An in-depth course on the most powerful big data tools.

Who this course is for:
This course is designed for Data Engineers who want to deepen their knowledge of Spark, as well as Hadoop and Hive.

In this course, you will cover the following core topics:

  • Hadoop (basic components, vendor distributions)
  • HDFS architecture
  • YARN architecture
  • Data formats
  • Spark
  • Spark Streaming and Flink
  • Hive
  • Orchestration, monitoring, and CI/CD
  • and more

You will put it all into practice and consolidate what you have learned through challenging homework assignments and a final project.

After taking this course, you will be able to:

  • Use Hadoop to process data
  • Interact with its components via console clients and APIs
  • Work with semi-structured data in Hive
  • Write and optimize applications on Spark
  • Write tests for Spark applications
  • Use Spark to process tabular, streaming, geospatial, and graph data
  • Configure CI and monitoring of Spark applications

Required Knowledge

  • Experience writing code in at least one of the following languages: Python, Java, or Scala
  • Basic knowledge of SQL and experience with any relational database
  • A computer or a Linux-based virtual machine with at least 8 GB of RAM

As a course participant, you will:

  • Receive a full set of training materials: video recordings of all webinars, class presentations, solutions to problems and projects as code on GitHub, and other supplementary materials
  • Receive a certificate of completion
  • Receive an invitation to interview at partner companies (offered to the most successful students)