This course gives attendees the practical knowledge required to perform statistical, machine learning and graph analysis operations at scale using Apache Spark.
The Apache Spark family includes APIs and libraries designed to implement machine learning and statistical analysis operations in a distributed processing environment, offering horizontal scalability and parallel computing power. The “Advanced Analytics using Apache Spark” module is designed to enable data scientists and statisticians who have experience in other statistical or machine learning frameworks to extend their knowledge and experience to the Spark runtime environment.
The course introduces R on Spark through the SparkR package, showing how common R functions can be applied using the Spark framework, and includes hands-on examples of using the Spark runtime with RStudio. It then introduces the Spark MLlib and Spark ML APIs, with practical exercises implementing regression, classification and clustering algorithms as well as feature extraction operations using Spark. Collaborative filtering applications such as recommendation engines are also covered.
Additionally, the course provides an introduction to graph processing and analysis using Spark.
- Using the SparkR API
- Using Spark with RStudio
- Machine learning using the Spark MLlib API
- Machine learning using the Spark ML API
- Feature extraction using Spark
- Linear algebra using Spark
- Classification using Spark
- Clustering using Spark
- Regression using Spark
- Building a recommender using Spark
- Using Spark with Jupyter
- Graph processing and analysis using Spark
Developed by Jeffrey Aven, author of Sams Teach Yourself Apache Spark and Data Analytics with Spark Using Python, this course provides the core knowledge and skills needed to develop applications using Apache Spark.
The “Advanced Analytics using Apache Spark” module is the third of three modules in the “Big Data Development using Apache Spark” series, following the “Data Transformation and Analysis using Apache Spark” and “Stream and Event Processing using Apache Spark” modules.
Private and Corporate Training
In addition to our public seminars, workshops and courses, AlphaZetta Academy can provide this training for your organisation in a private setting at your location or ours, or online. Please enquire to discuss your needs.