Data Science and Big Data Analytics: Leveraging Best Practices and Avoiding Pitfalls

2019-10-17T00:15:45+00:00 | Categories: Data Governance Level 2, Data Engineering Curriculum Electives, Data Science Curriculum, Data Science Level 2, Data Governance Curriculum Electives, Stephen Brobst, Executive Curriculum, Data Visualisation, Data Engineering Level 2, Data Management, Executive Level 2, Big Data, All Academy Courses

Data science is the key to business success in the information economy. This workshop will teach you best practices for deploying a data science capability in your organisation. Technology is the easy part; the hard part is creating the right organisational and delivery framework in which data science can succeed. We will discuss the skill sets a successful data scientist needs and the environment that will allow them to thrive. We will draw a strong distinction between “Data R&D” and “Data Product” capabilities within an enterprise and speak to the different skill sets, governance, and technologies needed across these areas. We will also explore the use of open data sets and open source software tools to get the best results from data science in large organisations. Advanced data visualisation will be described as a critical component of a big data analytics deployment strategy. We will also talk about the many pitfalls and how to avoid them.

Stars, Flakes, Vaults and the Sins of Denormalisation

2019-10-18T03:01:05+00:00 | Categories: Data Governance Level 2, Innovation & Tech (CTO) Curriculum Electives, Data Governance Curriculum Electives, Executive Curriculum Electives, Innovation & Tech (CTO) Level 2, Stephen Brobst, Data Engineering Curriculum, Data Management, AI Engineering Curriculum, Executive Level 2, Data Engineering Level 1, AI Engineering Level 1, All Academy Courses

Performance and flexibility are often seen as contradictory goals in designing large-scale data implementations. In this talk we will discuss techniques for denormalisation and provide a framework for understanding the performance and flexibility implications of various design options. We will examine a variety of logical and physical design approaches and evaluate the trade-offs between them. Specific recommendations are made for guiding the translation from a normalised logical data model to an engineered-for-performance physical data model. The roles of dimensional modelling and various physical design approaches are discussed in detail. Best practices in the use of surrogate keys are also discussed. The focus is on understanding the benefit (or not) of various denormalisation approaches commonly taken in analytic database designs.

Modernising Your Data Warehouse and Analytic Ecosystem

2019-10-24T04:49:10+00:00 | Categories: Data Governance Level 2, Executive Curriculum Electives, Data Governance Curriculum, Innovation & Tech (CTO) Level 2, Stephen Brobst, Data Engineering Curriculum, Innovation & Tech (CTO) Curriculum, Data Management, AI Engineering Curriculum, Infrastructure & Technologies, Big Data, Data Engineering Level 1, AI Engineering Level 1, All Academy Courses

This full-day workshop examines the emergence of new trends in data warehouse implementation and the deployment of analytic ecosystems. We will discuss new platform technologies such as columnar databases, in-memory computing, and cloud-based infrastructure deployment. We will also examine the concept of a “logical” data warehouse, including an ecosystem of both commercial and open source technologies. Real-time analytics and in-database analytics will also be covered. The implications of these developments for deployment of analytic capabilities will be discussed with examples of future architecture and implementation. This workshop also presents best practices for deployment of next-generation analytics using AI and machine learning.

Cost-Based Optimisation: Obtaining the Best Execution Plan for Complex Queries

2019-10-24T04:52:00+00:00 | Categories: Data Governance Level 2, Predictive Analytics & AI, Innovation & Tech (CTO) Curriculum Electives, Data Science Level 2, Data Science Curriculum Electives, Data Governance Curriculum, Innovation & Tech (CTO) Level 2, Stephen Brobst, Data Engineering Curriculum, Data Engineering Level 2, AI Engineering Curriculum, Big Data, AI Engineering Level 2, All Academy Courses

The optimiser's choices in determining the execution plan for complex queries are a dominant factor in the performance delivered by a data foundation environment. The goal of this workshop is to demystify the inner workings of cost-based optimisation for complex query workloads. We will discuss the differences between rule-based optimisation and cost-based optimisation, with a focus on how a cost-based optimiser enumerates and selects among possible execution plans for a complex query. The influences of parallelism and hardware configuration on plan selection will be discussed along with the importance of data demographics. Advanced statistics collection is discussed as the foundational input for decision-making within the cost-based optimiser. Performance characteristics and optimiser selection among different join and indexing opportunities will also be discussed with examples. The inner workings of the query rewrite engine will be described along with the performance implications of various rewrite strategies.

Social Network Analysis: Practical Use Cases and Implementation

2019-10-24T04:51:00+00:00 | Categories: Data Governance Level 2, Predictive Analytics & AI, Data Culture Electives, Innovation & Tech (CTO) Curriculum Electives, Data Science Curriculum, Data Governance Curriculum Electives, Executive Curriculum Electives, Marketing, Data Science Level 1, Data Culture Level 2, Innovation & Tech (CTO) Level 2, Stephen Brobst, Fraud and Security, Data Management, AI Engineering Curriculum, Executive Level 2, Big Data, AI Engineering Level 2, All Academy Courses

Social networking via Web 2.0 applications such as LinkedIn and Facebook has created huge interest in understanding the connections between individuals in order to predict patterns of churn, identify influencers who drive early adoption of new products and services, determine successful pricing strategies for certain kinds of services, and inform customer segmentation. We will explain how to use social network analysis techniques, with mini case studies across a wide range of industries including telecommunications, financial services, health care, retailing, and government agencies.