All Academy Courses

Data Transformation and Analysis Using Apache Spark

2019-11-25T06:49:44+00:00 | Categories: Jeffrey Aven, Level 1, Data Science Curriculum Electives, Data Governance Curriculum Electives, Apache Spark, Data Engineering Curriculum, All Academy Courses, Apache Spark Training with Jeffrey Aven, Experienced Analytics Instructor + Big Data Author

With big data expert and author Jeffrey Aven. The first module in the “Big Data Development Using Apache Spark” series, this course provides a detailed overview of the Spark runtime and application architecture, processing patterns, functional programming using Python, fundamental API concepts and basic programming skills, with deep dives into additional constructs including broadcast variables, accumulators, and storage and lineage options. Attendees will learn the Apache Spark framework and runtime architecture and the fundamentals of programming for Spark, gain mastery of basic transformations, actions and operations, and be prepared for advanced topics in Spark including streaming and machine learning.

Stream and Event Processing using Apache Spark

2019-11-25T06:49:44+00:00 | Categories: Jeffrey Aven, Level 2, Data Science Curriculum Electives, Apache Spark, Data Engineering Curriculum, All Academy Courses, Apache Spark Training with Jeffrey Aven, Experienced Analytics Instructor + Big Data Author

The second module in the “Big Data Development Using Apache Spark” series, this course provides the Spark Streaming knowledge needed to develop real-time, event-driven or event-oriented processing applications using Apache Spark. It covers using Spark with NoSQL systems and popular messaging platforms such as Apache Kafka and Amazon Kinesis, examines the Spark Streaming architecture in depth, and uses practical hands-on exercises to reinforce the use of transformations and output operations, as well as more advanced stream-processing patterns. With big data expert and author Jeffrey Aven.

Advanced Analytics Using Apache Spark

2020-01-22T01:55:48+00:00 | Categories: Jeffrey Aven, Data Science Curriculum Electives, Apache Spark, Level 3, R Electives, AI Engineering Curriculum, All Academy Courses, Apache Spark Training with Jeffrey Aven, Experienced Analytics Instructor + Big Data Author

With big data expert and author Jeffrey Aven. The third module in the “Big Data Development Using Apache Spark” series, this course provides the practical knowledge needed to perform statistical, machine learning and graph analysis operations at scale using Apache Spark. It enables data scientists and statisticians with experience in other frameworks to extend their knowledge to the Spark runtime environment with its specific APIs and libraries designed to implement machine learning and statistical analysis in a distributed and scalable processing environment.

Fraud and Anomaly Detection

2020-02-04T02:24:55+00:00 | Categories: Level 2, Data Science Curriculum Electives, Fraud and Security, R, Dr Eugene Dubossarsky, Financial Risk, All Academy Courses

This course presents statistical, computational and machine-learning techniques for predictive detection of fraud and security breaches. These methods are shown in the context of use cases for their application, and include the extraction of business rules and a framework for the interoperation of human, rule-based, predictive and outlier-detection methods. Methods presented include predictive tools that do not rely on explicit fraud labels, as well as a range of outlier-detection techniques: unsupervised learning methods, notably the powerful random-forest algorithm, which can be used in both supervised and unsupervised applications; cluster analysis; visualisation; and fraud detection based on Benford’s law. The course will also cover the analysis and visualisation of social-network data. A basic knowledge of R and predictive analytics is advantageous.

Stars, Flakes, Vaults and the Sins of Denormalisation

2019-10-18T03:01:05+00:00 | Categories: Data Governance Level 2, Innovation & Tech (CTO) Curriculum Electives, Data Governance Curriculum Electives, Executive Curriculum Electives, Innovation & Tech (CTO) Level 2, Stephen Brobst, Data Engineering Curriculum, Data Management, AI Engineering Curriculum, Executive Level 2, Data Engineering Level 1, AI Engineering Level 1, All Academy Courses

Performance and flexibility are often seen as contradictory goals in designing large-scale data implementations. In this talk we will discuss techniques for denormalisation and provide a framework for understanding the performance and flexibility implications of various design options. We will examine a variety of logical and physical design approaches and evaluate the trade-offs between them. Specific recommendations are made for guiding the translation from a normalised logical data model to an engineered-for-performance physical data model. The roles of dimensional modelling and various physical design approaches are discussed in detail. Best practices in the use of surrogate keys are also discussed. The focus is on understanding the benefit (or not) of various denormalisation approaches commonly taken in analytic database designs.

Best Practices in Enterprise Information Management

2019-10-24T04:45:22+00:00 | Categories: Data Culture Level 1, Data Culture Curriculum, Innovation & Tech (CTO) Curriculum Electives, Data Governance Curriculum, Stephen Brobst, Fraud and Security, Executive Curriculum, Data Engineering Curriculum, Data Governance Level 1, Data Management, Executive Level 2, Big Data, Data Engineering Level 1, All Academy Courses, Innovation & Tech (CTO) Level 3

The effective management of enterprise information for analytics deployment requires best practices in the areas of people, processes, and technology. In this talk we will share both successful and unsuccessful practices in these areas. The scope of this workshop will involve five key areas of enterprise information management: (1) metadata management, (2) data quality management, (3) data security and privacy, (4) master data management, and (5) data integration.

Agile Insights

2019-10-25T10:26:46+00:00 | Categories: AI Engineering Curriculum Electives, Data Culture Electives, Data Governance Curriculum, Introductory, Executive Curriculum, Innovation & Tech (CTO) Curriculum, Alexander Heidl, All Academy Courses

This course presents a process and methods for agile analytics delivery. Agile Insights reflects the capabilities required by any organization to develop insights from data and validate potential business value. The content describes the process, how it is executed and how it can be deployed as a standard process inside an organization. The course will also share best practices, highlight potential tripwires to watch out for, and outline the roles and resources required.

Data Driven Management

2019-12-01T06:42:57+00:00 | Categories: AI Engineering Curriculum Electives, Data Engineering Curriculum Electives, Government, Data Science Curriculum, Data Governance Curriculum, Data Science Level 1, Executive Curriculum, Data Engineering Level 2, Dr Eugene Dubossarsky, Innovation & Tech (CTO) Curriculum, Data Governance Level 1, AI Engineering Level 2, Executive Level 1, All Academy Courses, Innovation & Tech (CTO) Level 1

This course is for executives and managers who want to use analytics to support their most vital decisions. It gives senior executives the skills to make more effective use of data analytics in contexts including strategic decision-making. Attendees will learn how to receive, understand and make decisions from a range of analytics methods, including visualisation and dashboards. They will also be taught to work with analysts as effective customers.

Deep Learning and AI

2019-10-17T05:12:36+00:00 | Categories: Keras, Tensorflow, Level 2, Data Science Curriculum, Python, Data Engineering Curriculum, Dr Eugene Dubossarsky, All Academy Courses

This course is an introduction to the highly celebrated area of neural networks, popularised as “deep learning” and “AI”. The course will cover the key concepts underlying neural network technology, as well as the unique capabilities of a number of advanced deep learning technologies, including convolutional neural networks for image recognition, recurrent neural networks for time series and text modelling, and new artificial intelligence techniques including generative adversarial networks and reinforcement learning. Practical exercises will present these methods in some of the most popular deep learning packages available in Python, including Keras and TensorFlow. Trainees are expected to be familiar with the basics of machine learning from the Fundamentals course, as well as the Python language.

Text and Language Analytics

2019-10-18T03:37:35+00:00 | Categories: AI Engineering Curriculum Electives, Level 2, Data Science Curriculum Electives, R, R Electives, Dr Eugene Dubossarsky, All Academy Courses

Text analytics is a crucial skill set in nearly all contexts where data science has an impact, whether that be customer analytics, fraud detection, automation or fintech. In this course, you will learn a toolbox of skills and techniques, starting from effective data preparation and stretching right through to advanced modelling with deep-learning and neural-network approaches such as word2vec.