A Secure Software Supply Chain with Containers

The concept of a software supply chain is not new. What may be new is that CI/CD (Continuous Integration/Continuous Delivery) with containers makes it conceptually easy to understand and technically practical to implement. Here’s a process diagram that illustrates this approach in five steps.

image

CI/CD Process

Here, a software supply chain is the “master” branch of a release; development activities on other branches are not considered. One end of the master branch is where and when code or a change is introduced, while the other end is a production runtime environment where applications run. The process, as shown above, is highlighted in 5 steps.

  1. A CI/CD process starts by committing and pushing code to a centralized repository. Code here encompasses all programmatic assets relevant to defining, configuring, deploying and managing an application workload.
  2. Changes made on the master branch trigger the CI process to automatically (re)build and (re)test all assets relevant to the workload which the master branch delivers.
  3. A successful outcome generates container images which are automatically versioned, tagged, registered and published in a designated trusted registry. In Docker Enterprise Edition, this is implemented as Docker Trusted Registry, or DTR. The function of a trusted registry is to secure and host container images. Important tasks carried out here are to scan at a binary level for known malware, check for vulnerabilities and digitally sign a container image once it has been successfully processed. The generation of a container image signifies that application assets are successfully integrated and packaged, which marks the end of the CI part.
  4. CD kicks off, executes and validates the steps for deploying the workload to a target production environment.
  5. Upon instantiating containers, the referenced container images are pulled to a local host as needed and the container instances are started, hence an application or service.

Notice that Continuous Delivery refers to a capability and not a state. Continuous Delivery signifies the ability to keep the payload in a production-ready and deployable condition at all times. It does not necessarily suggest that a payload, once validated, is deployed to production immediately.

One Version of Truth

A software supply chain starts with developers committing and pushing code into the master branch of a release’s centralized repository. To have one version of truth, centralized management is essential. Nevertheless, if preferred, we can operate a centralized repository in a distributed fashion, as with GitHub. Further, a centralized source code hosting solution must properly address priorities including role-based access control, naming and branching, high availability, network latency, single points of failure, etc. With source control, promoted code can be versioned and tagged for asset and release management.

Triggering upon Pushing Changes

When changes are introduced into the master branch, which is the supply chain, CI must and will automatically kick off a validation process against a set of defined criteria, namely test cases.

Test-Driven, a Must

Once the development criteria (or requirements) are set, test cases can be developed accordingly prior to writing the code. This essentially establishes the acceptance criteria and forces a developer to focus on designing code guided by what must later be validated, which in essence designs in the quality.

When changes are made, there is no point in “manually” executing all test cases against all the code. Let me be clear. Regression tests are necessary, but repeating them with manual labor is counterproductive and error-prone. A better approach is to programmatically automate all structured tests (i.e. those whose criteria are structured and stable, like canned test cases) and let a tester do exploratory testing, which may not follow scripted, expected or even logical steps, but is carried out with the intent to break the system. Automation makes regression tests consistent and efficient, while exploratory testing adds extra value by expanding the test coverage.
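As a minimal sketch of what automated, structured tests can look like, the following Python example writes the acceptance criteria as test cases and runs them programmatically, the way a CI job would. The function next_patch_version and its test cases are hypothetical and exist only for illustration; they are not part of any pipeline described here.

```python
import unittest


def next_patch_version(tag: str) -> str:
    """Hypothetical code under test: bump the patch part of 'major.minor.patch'."""
    major, minor, patch = (int(part) for part in tag.split("."))
    return f"{major}.{minor}.{patch + 1}"


class NextPatchVersionTests(unittest.TestCase):
    """Acceptance criteria, written as canned test cases before the code."""

    def test_increments_patch(self):
        self.assertEqual(next_patch_version("1.4.2"), "1.4.3")

    def test_rejects_malformed_tag(self):
        # A tag with too few parts cannot be unpacked and raises ValueError.
        with self.assertRaises(ValueError):
            next_patch_version("1.4")


def run_regression_suite() -> bool:
    """Run every structured test and report one pass/fail result to the pipeline."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(NextPatchVersionTests)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()


if __name__ == "__main__":
    print("regression suite passed:", run_regression_suite())
```

Because the suite returns a single boolean, a CI job can run it on every push without anyone repeating the same steps by hand, leaving testers free for exploratory work.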

Master Branch Has No Downtime

Test-driven development in CI/CD holds a key objective: the master branch is always functional and ready to deliver. This may at first appear idealistic and over-committed to some of us. Consider, however, an automobile production line: as material and parts are put together while moving from one station to another, an automobile is being built. If at any time a station breaks down, it must be fixed at the scene since the whole production line is on hold. Fixing on the spot whatever stops a subject from moving from one station to the next is necessary to keep the production line producing.

A software supply chain, or a CI/CD pipeline, with containers is a digital version of the above-mentioned model where artifacts are definitions, configurations and scripts. As these artifacts are integrated, built and tested throughout the pipeline, the process to construct a service based on containers is validated. If a step fails validation, the pipeline stops and the issue must be immediately addressed and resolved, so the process can continue to the next step and material keeps flowing through the pipeline. To CI/CD, a master branch is the pipeline and must always be kept functional and ready to deliver.
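The production-line behavior can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stage names and the lambdas standing in for real build, test and deploy work are hypothetical, and a real pipeline would be defined in a CI/CD tool rather than in a script like this.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a callable that returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first failure.

    Returns the names of the stages that completed successfully.
    """
    completed: List[str] = []
    for name, step in stages:
        if not step():
            # Like a broken station, a failed stage puts the whole line on hold.
            print(f"stage '{name}' failed; pipeline stopped")
            break
        completed.append(name)
    return completed


# Hypothetical stages; each lambda stands in for real work.
stages: List[Stage] = [
    ("build", lambda: True),
    ("test", lambda: False),  # simulate a failing test run
    ("register-image", lambda: True),
    ("deploy", lambda: True),
]

if __name__ == "__main__":
    print(run_pipeline(stages))
```

Nothing after the failing stage runs, which is the point: the failure has to be fixed where it occurred before anything flows further down the line.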

Containers Are Not the Deliverables

It should be noted here that artifacts passing through the CI/CD pipeline are neither container images nor container instances. What the pipeline validates is a set of developed definitions, configurations and processes based on application requirements and presented in a format like JSON or YAML. In Docker, they are, for example, a Dockerfile, a Compose YAML file and template-based scripts, which define the application architecture and the configured Docker runtime environment for a target application delivered as containers upon instantiation.

Container images and instances are in a way by-products; they are not intended to be deliverables. A container image generated by a CI/CD pipeline should always be first programmatically created by the initial CI and later referenced or updated by CD. The key is that images must be pulled or generated by executing CI. With Docker, thanks to its native support for configuration management as code, a release may employ a particular version of a container image. Upgrading or rolling back a software supply chain may then be as easy as changing the referenced version and redeploying the associated payload.
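To illustrate the idea of upgrading or rolling back by changing a referenced version, here is a minimal Python sketch. The Release class, the registry host dtr.example.com, the repository shop/web and the tags are all hypothetical names chosen for the example; only the general registry/repository:tag shape of an image reference is taken from Docker.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Release:
    """Hypothetical release definition kept under source control."""

    registry: str
    repository: str
    tag: str

    def image_reference(self) -> str:
        # Image references take the form registry/repository:tag.
        return f"{self.registry}/{self.repository}:{self.tag}"


# The release currently pinned to version 1.4.3 (made-up values).
current = Release(registry="dtr.example.com", repository="shop/web", tag="1.4.3")

# Rolling back changes only the referenced version; a redeploy then follows.
rolled_back = replace(current, tag="1.4.2")

if __name__ == "__main__":
    print(current.image_reference())
    print(rolled_back.image_reference())
```

The release definition, not the image, is what changes hands here, which matches the view that images are by-products of the pipeline.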

Trusted Registry, the Connective Tissue

Once CI has successfully generated a container image, it should register and upload the image to a trusted registry for security scanning and digital signing, before CD takes over and later pulls or updates the image as needed to complete the CI/CD pipeline. Technically, CI starts upon receiving code changes and ends upon successfully registering a container image.

Failing to register a resulting container image will prevent CD from progressing when it references the image. In other words, a trusted registry is like connective tissue, keeping CI and CD fully synchronized and functional with the associated container images. A generated container image does not flow through every step of a CI/CD pipeline; the image is however the focal point for the validity of the produced results. In the above diagram, I used a dashed line between CI and CD to indicate the dependency on the trusted registry. A failed registration will eventually fail the overall process.
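The dependency of CD on the trusted registry can be sketched as a simple gate. The Python below is an assumption-laden illustration, not the DTR API: the in-memory dictionary stands in for a registry, and ImageRecord, ImageNotTrusted and resolve_for_deploy are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ImageRecord:
    """Hypothetical registry entry recording what CI achieved for an image."""

    scanned_clean: bool
    signed: bool


class ImageNotTrusted(Exception):
    """Raised when CD references an image the registry cannot vouch for."""


def resolve_for_deploy(registry: Dict[str, ImageRecord], reference: str) -> str:
    """Return the reference only if the image is registered, scanned and signed."""
    record = registry.get(reference)
    if record is None:
        raise ImageNotTrusted(f"{reference} is not registered")
    if not (record.scanned_clean and record.signed):
        raise ImageNotTrusted(f"{reference} is not scanned and signed")
    return reference


if __name__ == "__main__":
    # Made-up registry content for the example.
    registry = {"dtr.example.com/shop/web:1.4.3": ImageRecord(True, True)}
    print(resolve_for_deploy(registry, "dtr.example.com/shop/web:1.4.3"))
```

If CI never registered the image, or registered it without a clean scan and a signature, the lookup raises and CD cannot proceed, which mirrors how a failed registration eventually fails the overall process.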

Closing Thoughts

The essence of CI is automatic testing against defined criteria at unit, function and integration levels, as configured. These criteria are basically test cases which should be developed prior to code development as acceptance criteria. This is a key. Development must fully address these test criteria at coding time to build in quality.

A software supply chain is a better way. Wait, make that a much better way than just developing applications. I remember the days when every release to production was a nightmare, and code promotion was an event full of anxiety, numerous crashes and many long hours. Good riddance to those days. CI/CD, particularly with containers, presents a very interesting and powerful combination of quality and speed, and it is unusual to be able to achieve both at the same time. With Docker, Jenkins, GitHub and Azure, a software supply chain is in my view more realistic than many of us believe, which I will detail in an upcoming post.

Connecting Raspberry Pi with Sense HAT to Azure IoT Hub Using Node-RED

Following up with A Simulated IoT Device with Node-RED, I replaced the simulated device with a Raspberry Pi with Sense HAT.

Node-RED Flow

The flow is very similar to that of a simulated device, as shown below,

image

where

  • I initialize temperature, humidity and pressure as global variables and each is set to 0.
  • In the sensehat function, I load the ambient data from Sense HAT, use mathjs to round the output to 2 decimal places and update the global variables with the rounded values before sending the data to Azure IoT Hub. Here I include the device connection string in the payload.
  • For the dashboard, the gauges and the charts read the data from the global variables. I noticed the Sense HAT temperature reading is generally about 10 degrees Celsius higher than my room temperature.
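The logic of the sensehat function can be sketched as follows. Node-RED function nodes are written in JavaScript, so this Python version is only an illustration of the steps, not the code in the flow. The dictionary standing in for the global context, the payload field names (including connectionString) and the placeholder connection string are assumptions.

```python
from typing import Dict

# Stand-in for the Node-RED global context; each value is initialized to 0.
global_context: Dict[str, float] = {"temperature": 0.0, "humidity": 0.0, "pressure": 0.0}


def sensehat_step(reading: Dict[str, float], connection_string: str) -> Dict[str, dict]:
    """Round a raw reading, update the globals, and build the outgoing message."""
    rounded = {
        name: round(float(reading[name]), 2)
        for name in ("temperature", "humidity", "pressure")
    }
    # The dashboard gauges and charts read these values.
    global_context.update(rounded)
    # Hypothetical payload shape: the connection string travels with the data.
    return {"payload": {"connectionString": connection_string, **rounded}}


if __name__ == "__main__":
    raw = {"temperature": 33.456789, "humidity": 41.2, "pressure": 1013.254}
    print(sensehat_step(raw, "<device connection string>"))
```

Keeping the rounded values in one shared place means the message sent to the hub and the values shown on the dashboard always agree.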

Dashboard

Other than the ranges of data and minor cosmetic changes, the settings are the same as what I used with a simulated device. Here’s a snapshot.

image

Next Step: Azure IoT Edge

Introduce Azure IoT Edge in opaque mode and connect the Raspberry Pi as an isolated device from Azure IoT Hub. Should be interesting. Stay tuned.

A Simulated IoT device with Node-RED

In the last few months, I have gradually shifted to using Node-RED as the tool for demonstrating and prototyping Azure IoT solutions. In particular, I configure a dashboard to display the ambient information sent from the device and verify the data received by an Azure IoT Hub and stored in an Azure storage account using Azure Storage Explorer from my desktop. Ideally, I would configure everything on an Arduino or a Raspberry Pi. To make it more portable, I also do it with a local Ubuntu VM, so there is no need to plug in anything and I can demo a simple IoT setup anytime and anywhere on demand with Internet connectivity. Briefly, here’s an outline of what I did.

1. Installing & Starting Node-RED

On my Ubuntu (16.04 LTS) VM, I updated and upgraded everything, then installed Node-RED.

If you need to make a required Node module globally available in Node-RED, edit the file ~/.node-red/settings.js accordingly. Here, I made the module, math.js, globally available and used it to round the ambient data to two decimal places.

imageimage

Now, start Node-RED, as shown below.

image

As activities are carried out in Node-RED, this session displays the log with diagnostics in real time.

In Ubuntu, closing the terminal session running Node-RED somehow also stops the Node-RED service. This is different from how it behaves on Raspberry Pi, where closing a Node-RED terminal session will not stop the service.

2. Accessing Node-RED IDE

The default port for the Node-RED IDE is 1880, as shown below accessing the service from localhost. If preferred, authentication can be enabled and the port changed by following the documentation and editing the above-mentioned settings.js file.

image

By default, there are a number of nodes installed, as shown on the left panel. You may install additional nodes to better fit your needs.

3. Install additional Nodes

There are ample nodes and flows on the Node-RED web site which you may install from and contribute to. In addition to installing these node modules with a command line interface, doing it interactively is also an option. In the IDE, click the upper-right waffle within the Node-RED session and click ‘Manage palette’. If you do not see the option, updating npm to the latest version should make it appear. As shown below, the Nodes tab presents the nodes that are directly installable or already installed.

image

On the Install tab, you may keyword-search the Node-RED repository for relevant modules. Below, I search for modules relevant to Azure.

image

A few modules I frequently install include:

4. Develop & Deploy a Node-RED Process Flow

To create a flow, start dragging selected nodes from the left panel to the canvas and construct flows by connecting the nodes. There is already a copious amount of content with how-to instructions on Node-RED on the Internet. Or, if you like, do it the old-fashioned way, like me, by reading the documentation. The following is a simulated device with a few nodes to send ambient data to an Azure IoT Hub called thisiothub and display it.

image

Global Variables

Here, I added a config node to set the global variables holding the baseline temperature, humidity and pressure for a simulation run. Node-RED will always initialize a config node prior to executing all flows presented on the canvas.

Timestamp

The timestamp node sets the time interval for sending data. When developing and troubleshooting, I set a long period between messages to minimize the noise. When demoing, I then set it based on a customer’s requirements. Each time the timestamp node triggers, the connected nodes execute their programmed logic in turn.

The Functions

In this setting, each emission by the timestamp node has the following effects.

  • This IoT Device function prepares the message payload and updates current ambient data which are global variables.
  • The temperature, humidity and pressure functions pipe the data stored in the global variables to a configured Node-RED dashboard.
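As a rough illustration of what a simulated device does on each trigger, the Python sketch below varies each baseline value by a small random amount and rounds the result to two decimal places. The actual flow implements this in a Node-RED function node in JavaScript; the function name simulate_reading, the spread parameter and the baseline numbers are assumptions made for the example.

```python
import random
from typing import Dict


def simulate_reading(baseline: Dict[str, float], spread: float, rng: random.Random) -> Dict[str, float]:
    """Return one simulated reading: each baseline value plus or minus up to `spread`."""
    return {
        name: round(value + rng.uniform(-spread, spread), 2)
        for name, value in baseline.items()
    }


if __name__ == "__main__":
    # Hypothetical baseline, as a config node might set it for a simulation run.
    baseline = {"temperature": 22.0, "humidity": 45.0, "pressure": 1013.0}
    rng = random.Random()  # pass a seeded Random for a repeatable demo
    print(simulate_reading(baseline, spread=0.5, rng=rng))
```

Passing the random generator in as a parameter keeps the function easy to test, since a seeded generator produces the same sequence on every run.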

This IoT Hub

This node has the host name of a target Azure IoT hub, here thisiothub, and the device connection information is provided in the function, This IoT Device.

msg.payload

This is a debug node. Once dragged to the canvas, it will automatically rename itself to msg.payload. Once connected, this node becomes the standard output of Node-RED, and you can examine the output in the debug tab in the right panel. In the screen capture above, you will find that I rounded the data to two decimal places and sent it with MQTT.

image

Gauges and Charts

A main reason motivating me to use Node-RED is the simplicity of configuring and deploying a dashboard directly on an IoT device. Data visualization is essential for an IoT solution, which is all about data. The ability to deploy a dashboard right there, on demand, is a significant time saver and a noticeable advantage. It did however take me some practice to correctly place those gauges and charts the way I wanted. Once configured, the dashboard is published automatically at http://node-red-instance/ui and here is what I got.

image

Verifying the Data Sent to Azure IoT Hub

There are two tools I use for managing and examining the activities between an IoT device and Azure IoT Hub, namely iothub-explorer for Linux and Device Explorer for Windows. The latter is a sharp tool for Windows users to examine data.

image

And a convenient way to get the connection strings.

image

I also deployed a sample web app which plots the temperature and the humidity data received from thisiotdevice in real time, as shown.

image

So either from Azure IoT Hub or directly on the device, we may present the data visually.

Some Gotchas

Ubuntu frequently stopped Node-RED when a deployment had failed to connect to Azure IoT Hub, and the node would also lose the host name data. I had to frequently restart the service and re-enter the Azure IoT Hub host name in the node configuration. It is better to leave the terminal session where you started the Node-RED service visible at all times, so you can restart the service as needed. I once spent hours troubleshooting a flow and researching material without being able to figure out why it failed, only to find out later that the Node-RED service had exited its session behind the scenes upon a failed deployment.

Closing Thoughts

Node-RED is a great learning and prototyping tool. Once learned, you can create process logic based on data flows relatively easily. It is visual, and a picture is always worth a thousand words.

Azure IoT Hub is the Swiss army knife for formulating an IoT solution. It does the heavy lifting of registering, securing and managing devices, with interfaces to integrate other Azure or 3rd-party services. The recent announcement of Azure IoT Edge opens up many scenarios and opportunities to increase ROI by processing data right where it is collected, which is what I plan to include in the next version of my Node-RED flow. Stay tuned.

NIST Guidance on Container Security

Here are a selected few NIST documents which I’ve found very informative and which may help those who seek formal criteria, guidelines and recommendations for evaluating containerization and security.

NIST SP 800-190

Application Container Security Guide

Published in September of 2017, this document (800-190) reminds us of the potential security concerns when employing containers and how to address them. 800-190 details the major risks and the countermeasures for the core components of container technologies, including image, registry, orchestrator, containers and host OS.

It is worth pointing out that Section 6 of 800-190 recommends that organizations apply all of the NIST SP 800-125 Section 5 recommendations in a container technology context, while listing exceptions and additions for planning and implementation.

NISTIR 8176

Security Assurance Requirements for Linux Application Container Deployments

Published in October of 2017, this document (8176) explains the execution model of Linux containers and assumes an attack model in which a vulnerability in the application code of the container, or its faulty configuration, has been exploited by an attacker. 8176 also examines securing containers based on hardware and configurations including namespaces, cgroups and capabilities. Addressing the functionality and assurance requirements for the two types of container security solutions, 8176 complements NIST 800-190, which provides the security guidelines and countermeasures for application containers.

NIST SP 800-180

NIST Definition of Microservices, Application Containers and System Virtual Machines

As of January of 2018, this document (800-180) is not yet finalized; the draft was published in February of 2017 and the call for comments ended the following month.

The overwhelming interest in container technologies and their applications has energized organizations to seek new and improved ways to add value for their customers and increase ROI. At the same time, as containers, containerization and microservices have become highly popular terms that are abused over and over again in our daily business conversations, the lack of rigorous and recognized criteria to clearly define what containers and microservices are has been, in my view, a main factor that has confused and perhaps misguided many. For those who seek definitions and clarity before examining a solution, the agony of being in a state of confusion, suffocated by the ambiguity of technical jargon indiscreetly applied to statements, can be, or for me personally is, a very stressful experience. And some apparently have had enough and urged us that “There is no such thing as a microservice!”

With 800-180 serving a role similar to what NIST 800-145 is to cloud computing, we now have a set of criteria to reference as a baseline for carrying out a productive conversation on containers, microservices and related solutions. And that’s a good thing.

NIST SP 800-125

Guide to Security for Full Virtualization Technologies

Like many NIST documents, this document (800-125) first gives the background information by explaining what full virtualization is, the motivations for employing it and how it works, before depicting the use cases, requirements and security recommendations for planning and deployment. Although today most business and technical professionals in the IT industry are to some degree versed in virtualization technologies, 800-125 remains an interesting read and provides insight into virtualization and security. The two associated documents below point out important topics on virtualization to form a core knowledge domain of the subject.

  • NIST SP 800-125A Security Recommendations for Hypervisor Deployment on Servers
  • NIST SP 800-125B Secure Virtual Network Configuration for Virtual Machine (VM) Protection

Microsoft Nano Server with Docker Enterprise Edition (EE)

This article details the process to install the latest Docker EE Version 17.06.2-ee-3 to a Microsoft Nano Server. I am sure there are different ways to do this. After a few iterations, here is one verified approach. A sample script is available.

Background

As shown below, adding the server feature, containers, in Windows Server 2016 installs Docker EE Version 17.06.1-ee-2. In contrast, in Windows 10, adding containers in ‘Turn Windows features on or off’ under Programs and Features of Control Panel installs Docker CE Version 17.03.1-ce, i.e. Community Edition. Information about the two versions is available. The latest version of Docker EE is 17.06.2-ee-3. To keep all Docker EE instances on the same and latest version, one may need to manually install Docker EE instead of employing the default version that comes with Windows Server. To manually install Docker EE on a Microsoft Nano Server, follow the steps provided below.

Windows Server 2016 patched on 10/05/2017

image

Windows 10 patched on 10/05/2017

image

Step 1 – Create a Nano Server vhdx file with the container package

First, use Nano Server Image Builder to create a vhdx file with the intended packages including containers. Notice that if Windows ADK (Assessment and Deployment Kit) is not in place, the builder will prompt for installing it, which is about a 6.7 GB download. Once ADK is in place, start the image builder, which is wizard-driven and straightforward. I picked the vhdx format for building a Gen2 VM. As shown below, I also added the containers, Hyper-V and Anti-Virus packages. The Windows Server 2016 media used to create the Nano Server vhdx file is en_windows_server_2016_x64_dvd_9718492.iso, downloaded from my MSDN subscription.

image

Step 2 – Update the Nano Server OS

In Hyper-V Manager, I created and started a Gen2 VM using the vhdx created in Step 1, then logged in to the VM to find out the IP address, as shown below.

image

I did the following to connect to the host. Once connected, and not shown in the following, test the Internet connectivity and update the DNS setting as needed by following the instructions in the sample script. The first thing to do is to carry out a Windows update, which I did.

image

For this particular VM, I had already updated the OS before taking this screen capture, thus there were no, i.e. zero, applicable updates. Originally there were two updates, KB4035631 and KB4038782, listed as applicable. The update took 20 minutes with about a 2 GB download, followed by a reboot of the system. If you are interested in examining the list of applicable updates beforehand, you can run the following in the PSSession before the Invoke-CimMethod in line 8,

$updateList = ($ci | Invoke-CimMethod -MethodName ScanForUpdates -Arguments @{SearchCriteria="IsInstalled=0";OnlineScan=$true}).Updates

Step 3 – Install Docker EE

To simply use the Docker default of the current Windows Server 2016 installation, which is Docker EE Version 17.06.1-ee-2 as stated earlier, installing the provider and the package will do. In this case, after updating and rebooting the OS, execute line 1 and line 4 to start a PSSession, then run the following PowerShell commands to install Docker.

Install-Module -Name DockerMsftProvider -Repository PSGallery -Force
Install-Package -Name docker -ProviderName DockerMsftProvider
Restart-Computer -Force; exit

To manually install Docker EE Version 17.06.2-ee-3, I did the following:

image

After the source file was downloaded and extracted in lines 27 and 28, the PATH in the registry was updated with the Docker installation path to persist the reference across sessions. Rather than rebooting the system, the current PATH was also updated to start and verify the installation of the intended version of Docker in lines 39 to 41. The following is the user experience of executing lines 1 to 41 and successfully installing Docker EE Version 17.06.2-ee-3.

image

In a swarm, keeping all Docker instances on the same version is essential. In case there is a need to have Docker EE Version 17.06.2-ee-3 for Windows workloads, the presented steps achieve that.
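A version check across nodes can be sketched in a few lines. The Python below assumes the version of each node has already been collected by some other means into a dictionary; the node names and the function nodes_off_version are hypothetical and do not correspond to any Docker command or API.

```python
from typing import Dict, List


def nodes_off_version(node_versions: Dict[str, str], expected: str) -> List[str]:
    """Return the names of nodes whose Docker version differs from `expected`."""
    return sorted(
        name for name, version in node_versions.items() if version != expected
    )


if __name__ == "__main__":
    # Made-up inventory: one Linux manager and two Windows workers.
    inventory = {
        "manager-1": "17.06.2-ee-3",
        "nano-1": "17.06.2-ee-3",
        "win2016-1": "17.06.1-ee-2",
    }
    print(nodes_off_version(inventory, "17.06.2-ee-3"))
```

An empty list means every node reports the expected version; any name returned points at a host that still needs the manual installation described above.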

What’s Next

Having installed Docker EE, start pulling down images and building containers. Deploying a swarm in a hybrid setting is what I plan to share next.

Microsoft Cortana Intelligence Suite Workshop Video Tutorial Series (5/5): Predictive Web Service

The last part of this video tutorial series includes three exercises. First, Exercise 6 uses Power BI Desktop to import the summary data from the Spark cluster and create a report with drag-and-drop to visualize the data. Exercise 7, the exciting part, deploys a sample web app and configures it to consume the predictive web service published in Exercise 1, followed by a few simple tests. Finally, Exercise 8 shows how to clean up the deployed resources of the workshop.

Here you start.

Microsoft Cortana Intelligence Workshop encompasses a set of processes and supporting tools to architect, construct, package and deploy a predictive analytics solution. It is a friendly platform with no hardware to purchase and no software to configure. The workshop ultimately deploys a web application with a predictive analytics service. The app predicts the total number and the probability of flight delays between two cities based on date, time, carrier and real-time forecast weather information. It is a relatively simple project, yet it includes all the essential components to formulate a modern and intelligent application.

Microsoft Cortana Intelligence Suite Workshop Video Tutorial Series by Yung Chou

The workshop is intended to be delivered as a whole-day event with presentation sessions and lab time. On the other hand, within 75 minutes the above video tutorial series can also offer you an experience and guide you through all the screens and interactions to successfully deploy the web service.

The next step is to apply what you learned from this series to your work. Good luck.

Microsoft Cortana Intelligence Suite Workshop Video Tutorial Series (4/5): Azure Spark Cluster

The objective of Exercise 5 is to create a table, then store and prepare summary data for later visualization. You will find out it is simple and straightforward using a Spark notebook to interactively work on an Azure Spark cluster.

This video tutorial series presents live demonstrations of all the exercises to facilitate the learning of Microsoft Cortana Intelligence Suite. There are 5 parts: