Microsoft Cortana Intelligence Suite Workshop Video Tutorial Series (3/5): Azure Data Factory

Machine learning, predictive analytics, web services and everything else that makes it happen really come down to one thing: acquiring, processing and acting on data. In the workshop, this is done with a Data Factory pipeline configured to automatically upload a dataset to the storage account of a Spark cluster, where an integrated Azure Machine Learning model scores the dataset. Importantly, this addresses a fundamental requirement of data-centric applications in cloud computing: moving data securely, automatically and on demand between an on-premises location and a designated destination in the cloud. For IT today, the cloud can be a source, a destination and a broker of data, and the ability to move data securely between an on-premises facility and a cloud destination is imperative for hybrid cloud settings and backup-and-restore scenarios. Azure Data Factory is a vehicle for achieving that.
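Conceptually, the pipeline does nothing more exotic than copy a file from an on-premises folder into a blob container. The Python sketch below shows that single copy step done by hand with the azure-storage-blob package; the connection string, container, and file paths are placeholders, not values from the workshop. Data Factory's value is in doing this automatically, securely through the on-premises gateway, and on a schedule, rather than through an ad hoc script like this one.

```python
# A minimal, hand-rolled version of the copy step the Data Factory pipeline automates:
# upload a local dataset file to an Azure blob storage container.
# Requires: pip install azure-storage-blob
# All names below (connection string, container, paths) are placeholders.
from azure.storage.blob import BlobServiceClient

connection_string = "DefaultEndpointsProtocol=https;AccountName=<account>;AccountKey=<key>;EndpointSuffix=core.windows.net"
container_name = "workshop-data"           # target container on the Spark cluster's storage account
local_path = r"C:\Data\scoring-input.csv"  # dataset hosted on the on-premises file server (the lab VM)
blob_name = "input/scoring-input.csv"      # destination path inside the container

service = BlobServiceClient.from_connection_string(connection_string)
container = service.get_container_client(container_name)

# Create the container on first run; ignore the error if it already exists.
try:
    container.create_container()
except Exception:
    pass

with open(local_path, "rb") as data:
    container.upload_blob(name=blob_name, data=data, overwrite=True)

print(f"Uploaded {local_path} to {container_name}/{blob_name}")
```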


The workshop video tutorial series is listed below:

Specifically, Exercises 2-4 accomplish three things:

  • Creating an Azure Data Factory service and pairing it with a designated
    on-premises (file) server
  • Constructing an Azure Data Factory pipeline to automatically and securely
    move data from the designated on-premises server to a target Azure blob
    storage account (see the sketch after this list)
  • Enabling the developed Azure Machine Learning model to score the data
    provided by the Azure Data Factory pipeline
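The exercises carry out these steps in the Azure portal. Purely as an illustration of the resources involved, here is a hedged sketch using the azure-mgmt-datafactory Python SDK: it creates a factory, a storage linked service, an input/output dataset pair, and a copy pipeline, then triggers a run. Every subscription ID, name, credential, and path is a placeholder, and the workshop's actual source is the on-premises file server reached through the gateway, not the blob source used here for brevity.

```python
# Hedged sketch of the kinds of resources Exercises 2-4 create, via the Python management SDK.
# Requires: pip install azure-identity azure-mgmt-datafactory
# Subscription, resource group, names, and connection strings are all placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    Factory, SecureString, LinkedServiceResource, AzureStorageLinkedService,
    LinkedServiceReference, DatasetResource, AzureBlobDataset, DatasetReference,
    CopyActivity, BlobSource, BlobSink, PipelineResource,
)

sub_id, rg, df_name = "<subscription-id>", "<resource-group>", "workshop-adf"
adf = DataFactoryManagementClient(DefaultAzureCredential(), sub_id)

# 1. The Data Factory service itself.
adf.factories.create_or_update(rg, df_name, Factory(location="eastus"))

# 2. A linked service pointing at the Spark cluster's storage account.
conn = SecureString(value="DefaultEndpointsProtocol=https;AccountName=<account>;AccountKey=<key>")
adf.linked_services.create_or_update(
    rg, df_name, "WorkshopStorage",
    LinkedServiceResource(properties=AzureStorageLinkedService(connection_string=conn)))

# 3. Input and output datasets (blob-to-blob here for brevity; in the workshop the
#    input lives on the on-premises file server behind the gateway).
ls_ref = LinkedServiceReference(type="LinkedServiceReference", reference_name="WorkshopStorage")
adf.datasets.create_or_update(
    rg, df_name, "InputDataset",
    DatasetResource(properties=AzureBlobDataset(
        linked_service_name=ls_ref, folder_path="workshop-data/input",
        file_name="scoring-input.csv")))
adf.datasets.create_or_update(
    rg, df_name, "OutputDataset",
    DatasetResource(properties=AzureBlobDataset(
        linked_service_name=ls_ref, folder_path="workshop-data/output")))

# 4. A pipeline with a single copy activity, then kick off a run.
copy = CopyActivity(
    name="CopyToBlob",
    inputs=[DatasetReference(type="DatasetReference", reference_name="InputDataset")],
    outputs=[DatasetReference(type="DatasetReference", reference_name="OutputDataset")],
    source=BlobSource(), sink=BlobSink())
adf.pipelines.create_or_update(rg, df_name, "WorkshopPipeline",
                               PipelineResource(activities=[copy]))
run = adf.pipelines.create_run(rg, df_name, "WorkshopPipeline", parameters={})
print("Started pipeline run", run.run_id)
```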

Notice that the lab VM also serves as an on-premises file server hosting a dataset to be uploaded to Azure. At one moment you may be using the lab VM as a workstation to access Azure remotely, and the next as an on-premises file server on which you are installing a gateway. When following the instructions, be mindful of where each task is carried out, as the context switching is not always apparent.
