Data science is the key to business success in the information economy. This workshop will teach you about best practices in deploying a data science capability for your organisation. Technology is the easy part; the hard part is creating the right organisational and delivery framework in which data science can be successful in your organisation. We will discuss the necessary skill sets for a successful data scientist and the environment that will allow them to thrive. We will draw a strong distinction between “Data R&D” and “Data Product” capabilities within an enterprise and speak to the different skill sets, governance, and technologies needed across these areas. We will also explore the use of open data sets and open source software tools to enable best results from data science in large organisations. Advanced data visualisation will be described as a critical component of a big data analytics deployment strategy. We will also talk about the many pitfalls and how to avoid them.
With big data expert and author Jeffrey Aven. The first module in the “Big Data Development Using Apache Spark” series, this course provides a detailed overview of the spark runtime and application architecture, processing patterns, functional programming using Python, fundamental API concepts, basic programming skills and deep dives into additional constructs including broadcast variables, accumulators, and storage and lineage options. Attendees will learn to understand the Apache Spark framework and runtime architecture, fundamentals of programming for Spark, gain mastery of basic transformations, actions, and operations, and be prepared for advanced topics in Spark including streaming and machine learning.
Providing both performance and flexibility are often seen as contradictory goals in designing large scale data implementations. In this talk we will discuss techniques for denormalisation and provide a framework for understanding the performance and flexibility implications of various design options. We will examine a variety of logical and physical design approaches and evaluate the trade offs between them. Specific recommendations are made for guiding the translation from a normalised logical data model to an engineered-for-performance physical data model. The role of dimensional modeling and various physical design approaches are discussed in detail. Best practices in the use of surrogate keys is also discussed. The focus is on understanding the benefit (or not) of various denormalisation approaches commonly taken in analytic database designs.
This course describes the cultural and organisational aspects required for an organisation on the digital transformation path. A healthy corporate culture around data awareness is imperative to leverage the potential and value of data to the benefit of a company's business model. The organisation needs to reflect the culture and reward those who add value to a corporation by using data and analytics. Content presented explains personality and skill identification, how to prototype an agile analytics organisation and describe how to validate change capabilities, close gaps and execute a transition strategy.
Blockchain is one of the most disruptive and least understood technologies to emerge over the previous decade. This course gives participants an intuitive understanding of blockchain in both public and private contexts, allowing them to distinguish genuine use cases from hype. We explore public crypto-currencies, smart contracts and consortium chains, interspersing theory with case studies from areas such as financial markets, health care, trade finance, and supply chain. The course does not require a technical background.
This full day workshop examines the trends in analytics deployment and developments in advanced technology. The implications of these technology developments for data foundation implementations will be discussed with examples in future architecture and deployment. This workshop presents best practices for deployment of a next generation data management implementation as the realization of analytic capability for mobile devices and consumer intelligence. We will also explore emerging trends related to big data analytics using content from Web 3.0 applications and other non-traditional data sources such as sensors and rich media.
Social networking via Web 2.0 applications such as LinkedIn and Facebook has created huge interest in understanding the connections between individuals to predict patterns of churn, influencers related to early adoption of new products and services, successful pricing strategies for certain kinds of services, and customer segmentation. We will explain how to use these advanced analytic techniques with mini case studies across a wide range of industries including telecommunications, financial services, health care, retailing, and government agencies.
This workshop describes a framework for capacity planning in an enterprise data environment. We will propose a model for defining service level agreements (SLAs) and then using these SLAs to drive the capacity planning and configuration for enterprise data solutions. Guidelines will be provided for capacity planning in a mixed workload environment involving both strategic and tactical decision support. Performance implications related to technology trends in multi-core CPU deployment, large memory deployment, and high density disk drives will be described. In addition, the capacity planning implications for different approaches for data acquisition will be considered.