Regulating source code

As more areas of our economy become computerized and move online, more and more of what regulators need to understand will be in the source code.

For example, take the VW emissions scandal:

These days, cars are an order of magnitude more complex, making it easier for manufacturers to hide cheats among the 100 million lines of code that make up a modern, premium-class vehicle.

In 2015, regulators realized that diesel Volkswagens and Audis were emitting several times the legal limit of nitrogen oxides (NOx) during real-world driving tests. But one problem regulators confronted was that they couldn’t point to specific code that allowed the cars to do this. They could prove the symptom (high emissions on the road), but they didn’t have concrete evidence of the cause (code that circumvented US and EU standards).

Part of the challenge here is not just the volume of code but the way it’s delivered: for most consumer devices, code ships only as compiled binaries, for competitive and copyright reasons. So, in the case of the VW scandal, researchers had to reverse-engineer the cheating by observing the cars’ outputs and studying firmware images.

By contrast, with public cryptocurrencies and blockchains, the code is open source, essentially by definition. If you’re curious about how the Bitcoin, Ethereum, or Tezos networks work, you can not only read the white papers but also examine the source code.

Because the value of cryptocurrency networks is embedded in the token, there is no longer a commercial incentive to obscure the source code — indeed, doing so would be detrimental to the value of the network, as no one would trust a system they can’t introspect.

This may seem like a minor detail now, but I suspect it will become an important differentiator over time, and we’ll begin to see widespread commercial and regulatory expectations for open source code.


  • Twain Twain

    Wrt open source vs proprietary, Amazon recently released CodeStar, which makes it even easier for developers and, in a few cycles’ time, there may be no need for GitHub because the CI/CD (Continuous Integration/Continuous Delivery) all happens in the Amazon Developer Console.

    I’ve now got 2 Alexa Skills published on Amazon apps store. The Lambda console is super-easy to use AND it bakes in Unit Testing.

    • yep, I can see how Lambda et al will make it easier to compose apps out of micro-apis, rather than out of open source code libraries / fragments

  • Twain Twain

    I agree with the need for being able to introspect source code and for explicability.

    This is especially important given “black box” Deep Learning.


    Now, on GitHub and Stack Overflow (great open source resources), we can see people’s code and inspect it.

    It’s much, much harder to know WHY a Deep Learning algorithm chose to adjust weights within a layer. We can program algorithms to detect the where and what but not the why.

    By comparison, if we ask a human developer, “Why did you do that?” there might be some textual documentation or verbal interview with them that includes their working assumptions, the glitches / bugs / bottlenecks that arose which made them adjust the source code, how the source code reflects a client need, design and architecture philosophy etc. That’s a lot of context, comprehensiveness and reasoning.

    We can’t interrogate the machine’s DL reasoning in that way. They CAN’T reason. They can logically, mechanistically, probabilistically and statistically process to reduce risks and standard errors. That’s not the same as or equivalent to human reasoning at all.

    It’s important we make those distinctions.

    Even in how we think about regulating source code and how AI might help to do that.

  • Rob Underwood

    The Tezos white paper is a very deep read, and also a great primer in OCaml.

  • Perhaps the government can start by dog-fooding… It’s amazing how arcane a procurement system govt has for digital goods, one that was born out of the well-reasoned procurement reform brought on by wide-scale corruption in the purchase of physical goods in the mid-20th century.

    True, hard lessons have brought us USDS, 18F and TechFAR, but as a govtech company, we still see far too many RFPs that hew to the old model.

    And perhaps, govt can start by leveraging the same procurement system to require all critical infrastructure software be open source – which is not only required for regulating source code, but also ensures a solid foundation for innovation to help us collaboratively build out our digital future.

    What if the open internet had never come to be, and we were still using the walled gardens that were AOL, Compuserve, Prodigy, etc.? Would Google, Amazon, and all the modern services we have now come to expect even be around?

    The same way 20th century “open source” physical infrastructure like the standard gauge, sewage, and electricity have allowed us to build out modern cities, and the “open” digital infrastructure of the internet has enabled all these modern 21st century conveniences, we need to rebuild the 21st century government we’ve been waiting for by using “open source” scaffolding.

  • Yep

    This is a really hard one, especially because of the long and expensive procurement process, long contracts, and cultural issues around technology in government.

    It’s possible to bring an open source mindset, at least in pockets, but there are very strong headwinds there.