This course provides the conceptual and practical knowledge of Spark Streaming required to develop real-time and event-driven (or event-oriented) processing applications using Apache Spark. The course covers using Spark with NoSQL systems as well as popular messaging platforms such as Apache Kafka and Amazon Kinesis.

The Spark Streaming architecture is covered in depth using both the DStream API and Structured Streaming. Practical hands-on exercises reinforce the use of transformations and output operations in Spark Streaming, as well as more advanced stream processing patterns such as stateful stream processing and sliding window operations.

Topics covered include:

  • Introduction to NoSQL systems
  • Using Spark with HBase
  • Using Spark with DynamoDB
  • Introducing Spark Streaming and the DStream API
  • DStream sources
  • DStream transformations and output operations
  • Stateful stream processing
  • Sliding window operations
  • Using Spark with Apache Kafka
  • Using Spark with Amazon Kinesis
  • Event sourcing using Spark
  • Structured Streaming with Spark

Developed by Jeffrey Aven, author of SAMS Teach Yourself Apache Spark in 24 Hours and Data Analytics with Spark using Python, this course provides the core knowledge and skills required to develop applications using Apache Spark.

The “Stream and Event Processing Using Apache Spark” module is the second of three modules in the “Big Data Development Using Apache Spark” series, following the “Data Transformation and Analysis Using Apache Spark” module and preceding the “Advanced Analytics Using Apache Spark” module.

Additional Information

Audience Expert
This course is suitable for developers and analysts who will be developing real-time or event-oriented applications using Spark, such as IoT applications, real-time fraud detection applications and more. The course is designed for developers and analysts who have a basic level of competence with Spark programming using the RDD and DataFrame APIs.
Prerequisites
  • General programming skills
  • Basic Python programming skills
  • Basic Spark skills and knowledge (ability to program basic RDD and DataFrame applications in Spark)
  • OR knowledge equivalent to that gained by the course Data Transformation and Analysis Using Apache Spark
Objective / outcomes
By the end of the course, attendees should:

  • Understand the Spark Streaming framework and runtime architecture
  • Understand the fundamentals of stream programming in Spark using both DStreams and Structured Streaming
  • Understand stream processing patterns including stateful and sliding window operations
  • Be able to use messaging (such as Kafka) and NoSQL platforms (such as HBase) as a source or target for event/stream processing applications
Format Class
Duration 2 days
Trainer Courses are taught by Jeffrey Aven.

Jeffrey Aven is a big data, open source software, and cloud computing consultant, author and instructor based in Melbourne, Australia.

Jeffrey has extensive experience as a technical instructor, having taught courses on Hadoop and HBase for Cloudera (awarded Cloudera Hadoop Instructor of the Year for APAC in 2013) and courses on Apache Kafka for Confluent in addition to delivering his own courses.

Jeffrey is also the author of several big data-related books, including SAMS Teach Yourself Hadoop in 24 Hours, SAMS Teach Yourself Apache Spark in 24 Hours and Data Analytics with Spark using Python.

In addition to his credentials as an instructor and author, Jeffrey has over thirty years of industry experience and has held key roles in several major big data and cloud implementations in recent years.

Delivery Method In-person at AlphaZetta Academy locations or on-premises for corporate groups

Meals and refreshments

Catered morning tea and lunch are provided on both days of the course. Please notify us at least a week ahead if you have any special dietary requirements.

Feedback

Email academy@alphazetta.ai with any questions about the course, including requests for more detail or for specific content you would like to see covered, or queries about prerequisites and suitability.
If you would like to attend but for any reason cannot, please also let us know.

Variation

Course material may vary from what is advertised, depending on the needs and learning pace of attendees. Additional material may be presented alongside, or in place of, the advertised content.

Cancellations and refunds

You can get a full refund if you cancel 14 days or more before the course starts. No refunds will be issued for cancellations made less than 14 days before the course starts.

Frequently asked questions (FAQ)

Do I need to bring my own computer?
There’s no need to bring your own laptop or PC. Our courses take place in modern, professional training facilities that have all the computing equipment you’ll need.

Private and Corporate Training

In addition to our public seminars, workshops and courses, AlphaZetta Academy can provide this training for your organisation in a private setting at your location or ours. Please enquire to discuss your needs.

Scheduled Public Courses

Stream and Event Processing Using Apache Spark – Melbourne

December 10 @ 9:30 am - December 11 @ 5:00 pm

Testimonials

Eugene’s courses are not your standard technical courses where you learn how to put data into a model and get a result. The real life experiences – warts and all – he brings to the instruction mean that attendees walk away with a better understanding of the real life challenges of analytics as well as the technical know-how. We routinely send our team members on these courses to help them get the capabilities that really help our clients get better insights from their data.

James Beresford, Director, Agile BI

Eugene’s introductory course to data science was outstanding. I found the subject matter and delivery fascinating, accessible and informative. I found Eugene approachable, interesting to listen to and excellent at simplifying complex concepts. I highly recommend this course for anyone who wants to know what data science—and all the buzz surrounding it!—are all about.

C.T. Johnson, Director, Statecraft