Microsoft Cortana Intelligence Suite Workshop Video Tutorial Series (3/5): Azure Data Factory

Machine learning, predictive analytics, web services and everything else that makes them work are really about one thing: acquiring, processing and acting on data. For the workshop, this is done with a Data Factory pipeline configured to automatically upload a dataset to the storage account of a Spark cluster, where Azure Machine Learning is integrated to score the dataset. Importantly, this addresses a fundamental requirement of data-centric applications involving cloud computing: moving data securely, automatically and on demand between an on-premises location and a designated one in the cloud. For IT today, the cloud can be a source, a destination and a broker of data, and the ability to securely move data between an on-premises facility and a cloud destination is imperative for hybrid cloud settings and backup-and-restore scenarios. Azure Data Factory is a vehicle for achieving that ability.

image

The workshop video tutorial series is as listed below:

Specifically, Exercises 2-4 accomplish three things:

  • Creating an Azure Data Factory service and pairing it with a designated
    on-premises (file) server
  • Constructing an Azure Data Factory pipeline to automatically and securely
    move data from the designated on-premises server to a target Azure blob storage
    account
  • Enabling the developed Azure Machine Learning model to score the data
    provided by the Azure Data Factory pipeline

Notice that the lab VM is also employed as an on-premises file server hosting a dataset to be uploaded to Azure. At one moment, you may be using the lab VM as a workstation to access Azure remotely, and the next as an on-premises file server on which you install a gateway. When following the instructions, be mindful of where a task is carried out, as the context switching is not always apparent.
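To make the second item above more concrete, here is a minimal Python sketch of what a copy pipeline definition might look like when expressed as data. It is loosely modeled on the JSON shape Data Factory uses for a copy activity (a file system source and a blob sink). The pipeline, activity and dataset names and the dates are hypothetical placeholders, not values from the workshop, and the definition the workshop actually produces may differ.

```python
import json


def build_copy_pipeline(name, input_dataset, output_dataset, start, end):
    """Return a dict shaped like a Data Factory copy-pipeline definition.

    Every name passed in is a hypothetical placeholder, not a workshop value.
    """
    copy_activity = {
        "name": "CopyOnPremFilesToBlob",  # hypothetical activity name
        "type": "Copy",
        "inputs": [{"name": input_dataset}],    # dataset on the on-premises file server
        "outputs": [{"name": output_dataset}],  # dataset in the Azure blob storage account
        "typeProperties": {
            "source": {"type": "FileSystemSource"},
            "sink": {"type": "BlobSink"},
        },
    }
    return {
        "name": name,
        "properties": {
            "activities": [copy_activity],
            "start": start,  # active period of the pipeline
            "end": end,
        },
    }


pipeline = build_copy_pipeline(
    name="ExamplePipeline",            # hypothetical
    input_dataset="OnPremFileInput",   # hypothetical
    output_dataset="BlobFileOutput",   # hypothetical
    start="2017-01-01T00:00:00Z",      # hypothetical
    end="2017-12-31T00:00:00Z",        # hypothetical
)
print(json.dumps(pipeline, indent=2))
```

The sketch only builds and prints the definition; it does not contact Azure. In the workshop, the equivalent definition is created for you as you follow the exercise instructions.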

Microsoft Cortana Intelligence Suite Workshop Video Tutorial Series (2/5): Azure Machine Learning

This video tutorial series walks you through the development of a predictive analytics solution using Microsoft Cortana Intelligence Suite. The solution is realized as a web application that predicts, with a probability, the delay of a flight segment between two cities with a particular airline at a particular date and time. The content of this series is based on what is published at http://aka.ms/CortanaManual by Todd Kitta, to whom thanks are due.

The first video is an introduction to how the workshop is structured and highlights a few important items to get you prepared. There are four additional videos to cover all eight exercises as

Here, the video walks through the process and operations in Exercise 1 to build an Azure Machine Learning model, as below,

image

and package it as a web service for consumption. If you have not tried Microsoft Azure Machine Learning Studio before, I hope you will enjoy and appreciate the built-in canvas and native drag-and-drop capability for creating and composing a model. It lets you explore and realize your creativity in multiple dimensions.
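For a sense of what consuming such a web service can look like, here is a minimal Python sketch that prepares a request-response scoring call without sending it. It assumes the JSON layout commonly used by Azure Machine Learning Studio web services (an "Inputs" object with column names and rows of values, plus a bearer API key). The URL, the key, the input name "input1" and the column names are all hypothetical placeholders; the real ones come from the service you publish in the exercise.

```python
import json
import urllib.request


def build_scoring_request(url, api_key, column_names, row):
    """Prepare (but do not send) a POST request for a scoring web service.

    The input name "input1" and the payload layout are assumptions.
    """
    payload = {
        "Inputs": {
            "input1": {
                "ColumnNames": list(column_names),
                "Values": [[str(value) for value in row]],  # one row to score
            }
        },
        "GlobalParameters": {},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,
        },
        method="POST",
    )


request = build_scoring_request(
    url="https://example.invalid/score",  # hypothetical endpoint
    api_key="PLACEHOLDER_KEY",            # hypothetical key
    column_names=["OriginAirportCode", "DestAirportCode", "Carrier"],  # hypothetical columns
    row=["SEA", "JFK", "DL"],
)
print(request.get_method(), request.full_url)
```

Sending the request is deliberately left out: passing `request` to `urllib.request.urlopen` would perform the call once a real endpoint and key are in place.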

There are nine tasks total. Here we go.

So, what qualifies the machine as being able to learn? What is learning anyway? Look for my upcoming blog posts to examine the concept of “learning” and more.

Microsoft Cortana Intelligence Suite Workshop Video Tutorial Series (1/5): Introduction

This series, based on the content developed by Microsoft, offers a learning path with minimal time and effort to acquire the essential operation-level knowledge of Microsoft Cortana Intelligence Suite. The workshop steps through a process to construct and deploy a web application with predictive analytics, introducing key functional components along the way. Given origin and destination airports, a future date and time, and an airline carrier, this application predicts a flight delay, with a probability, based on the weather forecast. The video tutorial series runs about 75 minutes and captures exactly what you will see on the screen and when, and where and how to respond, based on the instructions of each exercise in the workshop.

I believe this series will most benefit those who function in a technical leadership capacity, such as enterprise architects, solution architects, cloud architects, application architects and DevOps leads, and who are interested in the solution architecture of a predictive analytics application. Going through the recordings will give you an end-to-end view of, and clarity on, how to construct and deploy a predictive analytics solution, and hence a better understanding of the processes and technologies, integration points, packaging and publishing, resource skill profiles, critical path, cost model, etc.

Cortana Intelligence Suite is a set of processes and tools. This workshop outlines an approach where analytic models, data, analysis, visualization, packaging, publishing and deployment are delivered in an integrated fashion. In my view, this is a productive, and the right, way to start learning how to architect a predictive analytics solution. The above video is the first of five to accelerate your learning of Cortana Intelligence Suite, and it highlights a few important items before you start the workshop.

Content Repo

The content of this workshop, made available by Todd Kitta, is at http://aka.ms/CortanaManual on GitHub. The readme file of the workshop details the scenario, architecture and prerequisites, and lists links to the instructions of all eight exercises.

image

The above architecture diagram of the workshop depicts the functional components of a web application with predictive analytics. Here the lab VM is also employed as an on-premises file server, the source of a data pipeline that is securely connected to an Azure Data Factory service and automatically uploads data to be scored by the Azure Machine Learning model. At the center is a Spark HDInsight cluster for data analysis, while the data are visualized by Power BI. The predictive analytics model is integrated and packaged as a web service consumed by a web application.

Introduction

Let’s first pay attention to a few important items before doing the workshop. There are eight exercises in this workshop and I have grouped them into five videos: an introduction and four learning units.

I recommend reading the instructions of an exercise in their entirety before doing the exercise; this will help set the context and clarify the objectives of each exercise. To do the workshop, you will need an active Azure subscription. Note that a free trial account does provide sufficient credit for doing the entire workshop.

image

The workshop environment is a collection of resources deployed to Azure, as shown above, including:

  • A VM with Internet connectivity for a student to log in and work on all the exercises, such that there is no need to download or install anything locally for this workshop
  • A Machine Learning workspace accessed via Microsoft Azure Machine Learning Studio to develop a predictive analytics experiment
  • A Spark cluster for hosting and analyzing data including a scored dataset and a summary table
  • A number of storage accounts for storing workshop data

These resources do incur a cost. To minimize it, deploy the workshop environment only when you are ready to work on the exercises, and delete it once you have completed the workshop. The deployment will take about fifteen minutes, if not more. Do deploy all resources and create all services in the same resource group, so that everything can later be removed by simply deleting the resource group. Personally, when doing the workshop, I set aside at least a four-hour block, find a quiet room and get a great cup of coffee. It is indeed a lot to consume.
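If you prefer to script the cleanup rather than use the portal, the following Python sketch shows one possible way to do it, assuming the Azure CLI (`az`) is installed and you are already signed in. The resource group name is a hypothetical placeholder; use the one you chose at deployment. Nothing is deleted unless you call `delete_resource_group` yourself.

```python
import subprocess


def build_delete_command(resource_group):
    """Build the Azure CLI command that deletes a whole resource group."""
    return [
        "az", "group", "delete",
        "--name", resource_group,
        "--yes",      # skip the interactive confirmation prompt
        "--no-wait",  # return immediately; deletion continues in Azure
    ]


def delete_resource_group(resource_group):
    """Run the delete command; raises CalledProcessError if the CLI fails."""
    subprocess.run(build_delete_command(resource_group), check=True)


# Hypothetical resource group name, shown only to illustrate the command.
print(" ".join(build_delete_command("example-workshop-rg")))
```

Because every workshop resource lives in the one resource group, this single command removes the VM, the cluster, the storage accounts and the services together.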

Enjoy the workshop. Let’s get started.