A Secure Software Supply Chain with Containers

The concept of a software supply chain is not a new one. What may be new is that CI/CD (Continuous Integration/Continuous Delivery) with containers makes it conceptually easy to understand and technically practical to implement. Here’s a process diagram that illustrates this approach in five steps.

[Image: CI/CD Process, a five-step process diagram]

Here, a software supply chain is the “master” branch of a release; development activities on other branches are not considered. The master branch begins where and when code or a change is introduced, and at its other end is a production runtime environment where applications run. The process, as shown above, has five steps.

  1. A CI/CD process starts by committing and pushing code to a centralized repository. Code here encompasses all programmatic assets relevant to defining, configuring, deploying and managing the application workload.
  2. A change made on the master branch triggers the CI process to automatically (re)build and (re)test all assets relevant to the workload the branch delivers.
  3. A successful outcome generates a container image which is automatically versioned, tagged, registered and published in a designated trusted registry. In Docker Enterprise Edition, this is implemented as Docker Trusted Registry (DTR). The function of a trusted registry is to secure and host container images. The important tasks carried out here are scanning a container image at a binary level for known malware, checking for vulnerabilities and digitally signing the image once it is successfully processed. The generation of a container image signifies that the application assets have been successfully integrated and packaged, which marks the end of the CI part.
  4. CD kicks off, then executes and validates the steps for deploying the workload to a target production environment.
  5. Upon instantiating containers, the referenced container images are pulled to a local host as needed and the container instances are started, hence an application or service. (A command-line sketch of these steps follows the list.)
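
A minimal sketch of steps 2 through 5 with the Docker CLI might look like the following; the registry host, image name, version and test entry point are all hypothetical placeholders:

```sh
# 2. CI: (re)build and (re)test the workload from the pushed code
docker build -t dtr.example.com/myorg/myapp:1.4.2 .
docker run --rm dtr.example.com/myorg/myapp:1.4.2 ./run-tests.sh

# 3. Publish to the trusted registry; with content trust enabled,
#    the push also signs the tag (DTR scans the image server-side).
export DOCKER_CONTENT_TRUST=1
docker push dtr.example.com/myorg/myapp:1.4.2

# 4-5. CD: the target environment pulls the signed image and
#      starts container instances from it.
docker pull dtr.example.com/myorg/myapp:1.4.2
docker run -d --name myapp dtr.example.com/myorg/myapp:1.4.2
```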

Notice that Continuous Delivery refers to a capability, not a state. It signifies the ability to keep the payload in a production-ready, deployable condition at all times. It does not necessarily mean that a payload, once validated, is deployed to production immediately.

One Version of Truth

A software supply chain starts when developers commit and push code into the master branch of a release’s centralized repository. To have one version of truth, centralized management is essential. Nevertheless, a centralized repository can be operated in a distributed fashion, as GitHub does. Further, a centralized source-code hosting solution must properly address priorities such as role-based access control, naming and branching, high availability, network latency and single points of failure. With source control, promoted code can be versioned and tagged for asset and release management.
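
For instance, promoting and labeling a release from master can be as simple as the following git commands (the version label is a placeholder):

```sh
# Tag the tip of master as a release and share the tag
git checkout master
git pull origin master
git tag -a v1.4.2 -m "Release 1.4.2"
git push origin v1.4.2
```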

Triggering upon Pushing Changes

When changes are introduced into the master branch, which is the supply chain, CI must, and will, automatically kick off a validation process against a set of defined criteria, namely test cases.
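
As one concrete, and hypothetical, way to wire this up, a post-receive hook on the central repository can fire the CI job only when master is updated; the endpoint below follows Jenkins’ remote build-trigger pattern, with the URL and token as placeholders:

```sh
#!/bin/sh
# post-receive hook: git feeds "<oldrev> <newrev> <refname>" lines on stdin
while read oldrev newrev ref; do
  if [ "$ref" = "refs/heads/master" ]; then
    # Kick off the CI validation pipeline for the supply chain
    curl -fsS -X POST "https://ci.example.com/job/supply-chain/build?token=SECRET"
  fi
done
```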

Test-Driven, a Must

Once the development criteria (or requirements) are set, test cases can be developed accordingly, prior to writing the code. This essentially establishes the acceptance criteria and forces a developer to design code guided by what must later be validated, which in essence designs in the quality.

When changes are made, there is no point in “manually” executing all test cases against all the code. Let me be clear: regression tests are necessary, but repeating them with manual labor is counterproductive and error-prone. A better approach is to programmatically automate all structured tests (i.e., those whose criteria are structured and stable, like canned test cases) and let a tester do exploratory testing, which may not follow scripted, expected or even logical steps, but proceeds with the intent to break the system. Automation makes regression tests consistent and efficient, while exploratory testing adds extra value by expanding the test coverage.
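
As a minimal sketch of such automation (the image name and test script are hypothetical), the structured suite can gate the pipeline through exit codes, so a failing regression run stops the supply chain at that station:

```sh
# Build the candidate image, then run the canned regression suite inside it;
# a non-zero exit halts the pipeline.
docker build -t myapp:candidate . &&
docker run --rm myapp:candidate ./run-regression-tests.sh ||
{ echo "regression suite failed; pipeline halted"; exit 1; }
```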

Master Branch Has No Downtime

Test-driven development in CI/CD holds a key objective: the master branch is always functional and ready to deliver. This may at first appear idealistic and over-committed. In fact, consider an automobile production line: as material and parts are put together while moving from one station to another, an automobile is being built. If at any time a station breaks down, it must be fixed on the spot, since the whole production line is on hold. Fixing whatever stops a subject from moving from one station to the next is necessary to keep the production line producing.

A software supply chain, or a CI/CD pipeline, with containers is a digital version of that model, where the artifacts are definitions, configurations and scripts. As these artifacts are integrated, built and tested throughout the pipeline, the process of constructing a container-based service is validated. If a step fails validation, the pipeline stops and the issue must be immediately addressed and resolved so the process can continue to the next step and material keeps flowing through the pipeline. To CI/CD, the master branch is the pipeline, and it must always be kept functional and ready to deliver.

Containers Are Not the Deliverables

It should be noted that the artifacts passing through the CI/CD pipeline are neither container images nor container instances. What the pipeline validates is a set of developed definitions, configurations and processes, based on application requirements and expressed in a markup language like JSON or YAML. In Docker, these are, for example, a dockerfile, a compose yaml file and template-based scripts that define the application architecture and the configured Docker runtime environment for a target application, delivered as containers upon instantiation.
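
As a minimal illustration of such an artifact (the service name, image and port are hypothetical), a compose file declares the application architecture and its configured runtime rather than any particular container instance:

```yaml
# docker-compose.yml: the definition the pipeline validates;
# containers are only instantiated from it later.
version: "3"
services:
  web:
    image: dtr.example.com/myorg/myapp:1.4.2   # pinned, signed image in the trusted registry
    ports:
      - "8080:80"
    environment:
      - APP_ENV=production
    deploy:
      replicas: 3
```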

Container images and instances are, in a way, by-products; they are not intended to be deliverables. A container image generated by a CI/CD pipeline should always be created programmatically by the initial CI run and later referenced or updated by CD. The key is that images must be pulled or generated by executing CI. With Docker, thanks to configuration management natively expressed as code, a release may employ a particular version of a container image, and upgrading or rolling back a software supply chain may be as easy as changing the referenced version and redeploying the associated payload, as sketched below.
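
On a Docker swarm, for example, such an upgrade or fallback might reduce to a single command against the running service (image versions and service name are placeholders):

```sh
# Roll forward to a new image version...
docker service update --image dtr.example.com/myorg/myapp:1.4.3 myapp_web
# ...or fall back by pointing the service at the previous version
docker service update --image dtr.example.com/myorg/myapp:1.4.2 myapp_web
```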

Trusted Registry, the Connective Tissue

Once CI has successfully generated a container image, it should register and upload the image to a trusted registry for security scanning and digital signing before CD takes over and later pulls or updates the image as needed to complete the CI/CD pipeline. Technically, CI starts with receiving code changes and ends with successfully registering a container image.

Failing to register a resulting container image will prevent CD from progressing when it references the image. In other words, a trusted registry is like connective tissue, holding CI and CD together and keeping them fully synchronized and functional with the associated container images. A generated container image does not flow through every step of a CI/CD pipeline; the image is, however, the focal point for the validity of the produced results. In the diagram above, I used a dashed line between CI and CD to indicate the dependency on the trusted registry. A failed registration will eventually fail the overall process.
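
The dependency can also be enforced mechanically. Assuming Docker Content Trust is enabled on the consuming host (image name is a placeholder), an image that was never successfully registered and signed simply cannot be pulled, so CD stops exactly where CI failed:

```sh
# With content trust on, the engine refuses tags that lack signed trust data
export DOCKER_CONTENT_TRUST=1
docker pull dtr.example.com/myorg/myapp:1.4.2   # succeeds only for a signed tag
```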

Closing Thoughts

The essence of CI is automated testing against defined criteria at the unit, function and integration levels. These criteria are essentially test cases, which should be developed before the code, as acceptance criteria. This is key: development must fully address these test criteria at coding time to build in quality.

A software supply chain is a better way, make that a much better way, than just “developing” applications. I remember the days when every release to production was a nightmare, and code promotion was an event full of anxiety, numerous crashes and many long hours. Good riddance to those days. CI/CD with containers presents a very interesting and powerful combination of quality and speed, and it is unusual to be able to achieve both at the same time. With Docker, Jenkins, GitHub and Azure, a software supply chain is realistic and approachable, which I will detail in an upcoming post. Stay tuned.

Containerization Use Cases

The term containerization here denotes not only packaging an application as a container, but also practicing DevOps and employing container artifacts to generate container images and instantiate instances throughout an application’s lifecycle. The following list captures some pertinent information on containerization.

[Image: pertinent information on containerization]

There are a number of use cases in which containerization may play a significant role, as shown below. I use a five-step process flow to depict the concept of a software supply chain with containerization.

[Image: five-step software supply chain process flow]

Use Cases

Here, for each use case, I highlight the issues and concerns, what a business may pursue, and how containers may help address the problems. It is crucial to first get a clear vision of what the problems are and what goals are achievable. Frankly, there is always a solution for a problem; depending on the available resources, and on how badly and how soon it needs to be done, the path to Rome may vary. Containerization is not a one-size-fits-all solution.

Equally important is to determine what to measure and to benchmark the current state before implementing changes. Containerization can and will change how you deliver services. There will be a noticeable and fundamental impact on organizational structure, company culture and team dynamics. Further, container-based deployment is about speed and scale. On the surface, the changes made may initially appear disorganized, chaotic and without governance. Many will question the manageability and feel discomfort, since containerization, like DevOps, microservices and other emerging technologies, is indeed different and disruptive. Therefore, realizing the ROI becomes even more critical. A project team must determine KPIs, get agreement on them from stakeholders, track them closely and report often to prove the business benefits early. Doing so will not only improve the quality delivered, but also drive consensus, strengthen confidence and help validate and align with business goals.

[Images: one panel per use case, covering the issues and concerns, the business goals, and how containers help]

Microservices Architecture

Regarding containerization as it relates to microservices architecture: not only are most of the listed use cases contributing factors, but microservices architecture is a topic in itself, with much material already published and available online. It is included here for completeness, and I will leave microservices architecture for the reader to investigate.

Closing Thoughts

Containerization is a technology backed by Google’s years of implementation experience and strong credibility. Adoption in the IT industry has been aggressive, and Docker container skills are becoming a commodity, like TCP/IP, virtualization and cloud. Identifying use cases and defining KPIs are, in my view, essential before implementation. I hope this article provides some value for achieving these tasks.

In my next blog, I will say more about a secure software supply chain with containerization, which also has much relevance to microservices architecture. Stay tuned.

Connecting Raspberry Pi with Sense HAT to Azure IoT Hub Using Node-RED

Following up on A Simulated IoT Device with Node-RED, I replaced the simulated device with a Raspberry Pi with a Sense HAT.

Node-RED Flow

The flow is very similar to that of the simulated device, as shown below,

[Image: Node-RED flow]

where

  • I initialize temperature, humidity and pressure as global variables, each set to 0.
  • The sensehat function loads the ambient data from the Sense HAT, uses mathjs to round the readings to 2 decimal places, and updates the global variables with the rounded values before sending the data to Azure IoT Hub. Here I include the device connection string in the payload. (A sketch of this function node follows the list.)
  • For the dashboard, the gauges and the charts read the data from the global variables. I noticed the Sense HAT temperature reading is generally about 10 degrees Celsius higher than my room temperature.
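
For reference, the sensehat function node might look like the sketch below. It assumes the Sense HAT input node delivers readings on msg.payload, that mathjs has been exposed to function nodes through functionGlobalContext in settings.js, and that the device id and connection string are placeholders:

```javascript
// Hypothetical "sensehat" function node body (Node-RED)
var math = global.get('mathjs');  // assumes functionGlobalContext: { mathjs: require('mathjs') }

// Round each ambient reading to 2 decimal places
var temperature = math.round(msg.payload.temperature, 2);
var humidity    = math.round(msg.payload.humidity, 2);
var pressure    = math.round(msg.payload.pressure, 2);

// Update the global variables the dashboard gauges and charts read
global.set('temperature', temperature);
global.set('humidity', humidity);
global.set('pressure', pressure);

// Shape the payload for the Azure IoT Hub node, including the device
// connection string as described above (placeholder, not a real credential)
msg.payload = {
    deviceId: 'raspberrypi-sensehat',
    connectionString: '<your-device-connection-string>',
    data: { temperature: temperature, humidity: humidity, pressure: pressure }
};
return msg;
```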

Dashboard

Other than the ranges of the data and minor cosmetic changes, the settings are the same as those I used with the simulated device. Here’s a snapshot.

[Image: dashboard snapshot]

Next Step: Azure IoT Edge

The next step is to introduce Azure IoT Edge in opaque mode and connect the Raspberry Pi as a device isolated from Azure IoT Hub. That should be interesting. Stay tuned.