The future of source code security is consensus-based
The security landscape is ever-changing; it may be the least static industry on the planet. New threats appear and new solutions are built to squash them. Rinse, repeat. It’s a cycle with no end in sight. What’s the promised land? Can we ever reach an end-state where all software running across the world is secure and 100 percent free of breaches?
Let me be brief: No.
Nothing will ever be 100 percent breach free. But that should not be our measure of success. Rather, our goal should be to ensure that new code, as it is created, is scrutinized by as many people and systems as possible without slowing down innovation.
As I wrote in an earlier article, shifting this process as far left as possible ensures the highest efficiency with the least energy. Once code is in the wild, an increasingly large amount of effort, time and capital is needed to detect, mitigate and address its underlying security problems. And because CIOs are spending nine-tenths of their budgets on post-deployment defenses (endpoint, firewalls, etc.), it is no surprise that we now see Equifax-sized meltdowns pretty regularly.
The vast amount of noise and data you have to sift through at that phase in the security life cycle almost guarantees you will miss threats. The key to success is hitting it early, when the noise is low. Shifting “left” is this philosophy and it is now gaining steam in the minds of DevOps leaders.
Your code is worth millions — why not treat it that way?
Take a journey with me, and by the end of this article I think you will be convinced that the promised land is possible. So, is there a future where the world’s developers unite in a global coordinated effort to ensure that code is reviewed en masse? I believe strongly there is. What better way to reach the end-state than working together for the greater good? We are all businesses run on software, by software and for software, so it’s extremely valuable to pool our resources for shared gains, because the world is trusting us with its information!
A worldwide mechanism to reach consensus (via “votes”) on new code is what we need; and we need that code to be carefully vetted from a vulnerability perspective toward a central goal: every line of code that is added to open source is secured at the moment of birth.
After all, if 90 percent of our global software products are built on top of open-source software (OSS) then it seems to me that everyone stands to benefit from a coordinated way to vet new components that make their way into our products through DevOps.
That brings me back to consensus, or rather a topic some of you may have heard buzzing around in the news — blockchain and its less popular yet equal alter-ego hashgraph. Both are technologies adept at solving the aforementioned peer review problem at scale. But what does it all really have to do with code security and shifting your DevOps left?
The devil is in the details
First, let’s understand a little bit about what consensus technology is — because only a handful of the people I talk to really understand what consensus or blockchain is.
Consensus technology like blockchain is diverse and can be described in varying ways. A blockchain can be a private or public decentralized database that keeps records in an append-only fashion. A ledger of sorts. Once a record is added to the ledger, it cannot be modified, and it is very difficult to falsify entries. This capability is called persistence. When an entry in the ledger needs to be changed, a new record must be appended to the existing history rather than overwriting the old one.
Finally, each of the records can be viewed by any member, allowing for any person to individually verify the authenticity of each transaction recorded for any single entry in the ledger. This transparency means that blockchains are auditable. Auditability brings a slew of value in ensuring software is secure, especially for a supply chain.
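To make persistence and auditability a little more concrete, here is a minimal sketch of an append-only ledger in Python. It is an illustration only, not how any particular blockchain is built: the Ledger class, its field names and the sample commit messages are all hypothetical. Each record stores the hash of the record before it, so anyone holding a copy can recompute the chain and notice if an old entry was quietly edited.

```python
import hashlib
import json


def entry_hash(index, payload, prev_hash):
    """Hash an entry's contents together with the previous entry's hash."""
    body = json.dumps(
        {"index": index, "payload": payload, "prev": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


class Ledger:
    """Toy append-only ledger (hypothetical names, illustration only)."""

    GENESIS = "0" * 64  # placeholder "previous hash" for the first record

    def __init__(self):
        self.entries = []

    def append(self, payload):
        """Add a new record; existing records are never rewritten."""
        prev_hash = self.entries[-1]["hash"] if self.entries else self.GENESIS
        index = len(self.entries)
        entry = {
            "index": index,
            "payload": payload,
            "prev": prev_hash,
            "hash": entry_hash(index, payload, prev_hash),
        }
        self.entries.append(entry)
        return entry

    def verify(self):
        """Recompute every hash; an edited record breaks the chain."""
        prev_hash = self.GENESIS
        for i, entry in enumerate(self.entries):
            if entry["index"] != i or entry["prev"] != prev_hash:
                return False
            if entry["hash"] != entry_hash(i, entry["payload"], prev_hash):
                return False
            prev_hash = entry["hash"]
        return True


ledger = Ledger()
ledger.append("commit a1: add parser")
# A change is recorded as a new entry, not as an edit of the old one.
ledger.append("commit a2: fix overflow in parser")
print(ledger.verify())
```

Any member who re-runs verify() on their own copy is performing the audit described above; tampering with an earlier payload makes the check fail.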
But why bother with ledgers over standard databases? Ledgers become appealing as soon as a database needs to be decentralized. An organization looking to avoid a single point of failure, or to build a more resilient system for its data and code, might prefer a ledger-based database to a central one. Because it is designed around the absence of trust, a distributed ledger is far harder to hack, manipulate or otherwise disrupt than a central database: an attacker has to compromise many independent copies rather than one.
Further, a centralized database requires access control systems; it requires a system directly operated by trustworthy parties. A public ledger, by contrast, can be operated by unknown and untrusted parties.
This lack of trust inherent in the system is in fact the key framework behind supporting secure consensus.
Because any party can submit information to the ledger, the distributed operators of the ledger must evaluate and agree on all additions before they are permanently incorporated into it. And because we cannot be sure of any author’s trustworthiness, it is vital that all new information be reviewed and confirmed before it is accepted.
This sounds a lot like the massive “code review” I described earlier, doesn’t it?
Hold on a second, what is consensus?
How can a distributed network of people who have never met come to a common conclusion that something is good or bad — wouldn’t it just be random chaos?
We must dig one level deeper and understand what it really means to reach consensus. There are many methods of finding consensus in a distributed system, but two stand out as the most compelling: the practical Byzantine fault tolerance (PBFT) algorithm and the proof-of-work (PoW) algorithm.
What does this mean for me?
The practical Byzantine fault tolerance (PBFT) algorithm was designed as a solution to a problem usually presented in the form of a fun parable, the Byzantine generals problem.
Imagine several divisions of an army are camped outside an enemy city, each division commanded by its own general. The generals can talk with one another only by messenger. After observing the enemy, they must choose a common plan of action. However, some of the generals may be traitors, trying to prevent the loyal generals from reaching agreement. The generals must decide when to attack the city, but they need a strong majority of their army to attack at the same time.
The generals must have an algorithm to guarantee that (1) all loyal generals decide upon the same plan of action, and (2) a small number of traitors cannot cause the loyal generals to adopt a bad plan. The loyal generals will all do what the algorithm says they should, but the traitors may do anything they wish. The algorithm must guarantee condition (1) regardless of what the traitors do. The loyal generals should not only reach agreement, but should agree upon a reasonable plan.
Let’s imagine the generals in the story are the developers participating in a distributed code review backed by a ledger. The messengers they send back and forth are the means of communication across the cloud on which the ledger is running; maybe via @mentions. The collective goal of the “loyal developers” is to decide whether to accept a piece of code submitted to the ledger as valid. Valid code is the equivalent of the right moment to attack: the point at which the loyal developers should decide in favor of a new build.
Loyal coders are faithful ledger participants who are interested in ensuring the integrity of the ledger and therefore ensuring that only correct and secure code is accepted. The treacherous coders, on the other hand, would be any party seeking to falsify or sabotage code on the ledger. Their potential motives are myriad: it could be an individual seeking to add bad code or a backdoor to an open-source library, or to degrade the performance of a library in a way that could cripple every system running cloud software.
In the PBFT solution, each coder maintains an internal state. When a coder receives a message, they use the message in concert with their internal state to run a computation. This computation in turn tells that individual coder what to think about the message in question. Then, after reaching their individual decision about the new message, that coder shares that decision with all the other coders in the system. A consensus decision is determined based on the total decisions submitted by all coders.
The main drawback of this system is that every coder must vote; in large-scale projects this may be neither feasible nor desirable.
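The voting step can be sketched in a few lines of Python. This is a deliberately simplified, single-round tally and not the full PBFT protocol, which exchanges messages in several phases (pre-prepare, prepare and commit). The function names and reviewer names are hypothetical. The one fact borrowed from the Byzantine fault tolerance literature is the threshold: n voters can tolerate at most f traitors when n is at least 3f + 1, and a decision needs a quorum large enough that any two quorums share at least one loyal voter.

```python
def max_faulty(n):
    """Largest number of traitors f that n voters can tolerate (n >= 3f + 1)."""
    return max(0, (n - 1) // 3)


def quorum_size(n):
    """Smallest vote count such that any two quorums overlap in more than f voters."""
    return (n + max_faulty(n)) // 2 + 1


def decide(votes):
    """Tally one round of votes on a submitted piece of code.

    votes maps a reviewer name to True (accept) or False (reject).
    Returns "accept", "reject", or None when neither side reaches quorum.
    """
    n = len(votes)
    if n == 0:
        return None
    quorum = quorum_size(n)
    accepts = sum(1 for vote in votes.values() if vote)
    rejects = n - accepts
    if accepts >= quorum:
        return "accept"
    if rejects >= quorum:
        return "reject"
    return None


# Four reviewers can tolerate one traitor; three matching votes decide.
votes = {"alice": True, "bob": True, "carol": True, "mallory": False}
print(decide(votes))
```

With four reviewers, a single dissenting or dishonest vote cannot block or flip the outcome, but a two-two split leaves the code undecided.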
So, another method of reaching consensus on a ledger is the proof-of-work (PoW) scheme, which is used by the Bitcoin network. In contrast to the solution above, PoW does not require all parties on the network to submit their individual conclusions in order for a consensus to be reached. Rather, PoW is a system that uses a function to create conditions under which a single coder is permitted to announce their conclusions about the submitted code, and those conclusions can then be independently verified by all other system participants.
False conclusions are prevented by the inputs to the function, which ensure that false information will fail to compute in an acceptable way. In the Bitcoin system specifically, the participant who publicly verified the information on behalf of the network is in turn rewarded for its participation with newly mined Bitcoins. This process of searching for valid “answers” is known as mining.
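For readers who want to see the shape of that function, here is a toy Bitcoin-style hash puzzle in Python. It is a sketch under simplifying assumptions, not Bitcoin's actual block format: the function names, the "data:nonce" encoding and the sample review message are hypothetical, and the difficulty is kept tiny so it runs instantly. The point it illustrates is the asymmetry: finding a valid nonce takes many guesses, while checking someone else's answer takes a single hash.

```python
import hashlib


def pow_hash(data, nonce):
    """SHA-256 of the data combined with a candidate nonce."""
    return hashlib.sha256(f"{data}:{nonce}".encode("utf-8")).hexdigest()


def mine(data, difficulty):
    """Search for the first nonce whose hash starts with `difficulty` zero hex digits."""
    target = "0" * difficulty
    nonce = 0
    while not pow_hash(data, nonce).startswith(target):
        nonce += 1
    return nonce


def verify(data, nonce, difficulty):
    """Check a claimed answer with one hash computation."""
    return pow_hash(data, nonce).startswith("0" * difficulty)


# Hypothetical conclusion that one participant wants to announce.
conclusion = "patch 1234 reviewed: no known vulnerabilities"
nonce = mine(conclusion, 3)
print(verify(conclusion, nonce, 3))
```

Because the nonce is tied to the exact text that was hashed, a participant who alters the announced conclusion would, in all but rare lucky cases, have to redo the search before anyone else would accept it.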
Yes folks, that is what mining is all about. Incentivizing participation in the network ensures broad participation, which in turn ensures a more robust network and a safer ledger.
This PoW strategy allows for easy, broad participation, which in turn ensures greater network stability with minimal requirements on each participant, allowing participants to remain, for example, anonymous. I can see PoW being preferred for solving the source-code review problem because it allows a smaller set of “maintainers” to verify the security of code while still keeping the benefits of a ledger-backed system.
The world is addicted to open-source software, which gives businesses the speed and agility to create and launch new products faster than ever before. At the same time, software vulnerabilities are causing catastrophic breaches and the regulatory environment is only tightening. Against that backdrop, the consensus model of the generals or the miners is well-suited for ensuring that open-source software is secure, reviewed and auditable by a global community!
We are not there yet, but this is where we are heading.
Wouldn’t it be valuable if a vendor could pinpoint which company and which developer introduced a potential problem into a supply chain? That may scare developers away from ever writing code again, but it may also entice them to double down on making secure, bug-free code a top priority. After all, once everything is out in the open, it’s amazing what changes are seen in human behavior. Fair transparency is the key.
I see the next five years as an exciting time for consensus-based technology intersecting with the world of software development. We are all riding the same train — waiting for applications from the hashgraphs and blockchains to pop up that will change the world. I for one believe that the security of the supply chain and the security of all of our code is a great mission in the age of digital transformation.