Free As In Beer, Public As In Bathroom

Another day, another npm security incident.

In summary: Someone created an open source library called event-stream as a fun learning project. Unfortunately for him, event-stream became very popular; he now had a responsibility to maintain it. A stranger volunteered to take over maintenance duties for him, so he transferred ownership of the codebase to the volunteer and washed his hands of it. The “volunteer” turned out to be a hacker who added malicious code to steal Bitcoin, specifically targeting a Bitcoin wallet app called Copay that depended on event-stream. Although the malicious code was discovered late in November, multiple affected versions of Copay had already been released.

As tempting (and fun!) as it would be to use this as an excuse to criticize npm and/or the creator of event-stream, my heart just isn’t in it. It’s hard for me not to feel sympathetic for someone who wanted to play around with a side project and suddenly found himself – quite against his will – responsible for the security of some company’s cryptocurrency app. I’m also not sure what, from a technical standpoint, npm could’ve done to prevent this kind of social engineering attack; someone who’d be willing to hand over his codebase to a stranger would probably also feel comfortable handing over things like security keys.

No, I don’t think this incident really comes down to bad behavior on the part of a few individuals. I think it’s a symptom of a broader problem with the way the JavaScript community in particular has come to engage with “open source”. It’s a structural problem.

Let’s talk about how we might fix it.

JavaScript Packages Are Too Small To Generate Communities

People who are critical of npm’s track record on security usually focus in on its technical shortcomings. I think a pretty obvious counter-argument against any technical criticism of npm on those grounds is to point out that other package managers, like Python’s pip, are very similar from a technical perspective and yet have experienced far fewer security incidents. How do we explain this?

Security in open source software is supposed to come from many volunteers reviewing the code. In the Python world, this is a pretty realistic security model. Your Python projects will typically depend on a few very large, very popular libraries that have large communities of volunteers supporting them, and perhaps a handful of smaller dependencies that you can audit yourself. To a large extent, you’re able to freeload off of the work of the people volunteering for the bigger projects.

The people volunteering for those projects aren’t doing it for nothing. It’s fun to work on big, interesting projects with lots of other developers, and being able to say that you’re a contributor to an important open source project can look good on a resume. These big projects can even wind up having formal, quasi-political organizations on top of them that are responsible for maintaining the project – things like the Django Software Foundation – that present even more opportunities for community, recognition, and professional development.

Contrast a project like Django with something like npm’s is-odd. The is-odd library has under thirty lines of code, contributed by just four developers. It also has more than a million weekly downloads on npm. It’s hard to see how anyone could get involved with is-odd even if they wanted to – there’s just so little code that there isn’t very much work to do, and so there’s very little room for outside volunteers to get invested in the project. The narrow scope and small size of JavaScript libraries actively discourage the formation of communities around them.

In a vacuum that isn’t a problem, because anyone who depends on is-odd can just read its thirty-ish lines of code themselves, or can look into the backgrounds of its four developers and decide to trust them. Here’s the thing, though: JavaScript has a very limited standard library and a culture that favors very small, single-purpose libraries.

To better illustrate why this is a problem, let’s create a basic React app and see how many dependencies we might have:

$ mkdir -p /tmp/react && cd /tmp/react
$ npm install create-react-app
$ ls node_modules | wc -l
$ npx create-react-app foo
$ ls foo/node_modules | wc -l

If we install Facebook’s official tool for bootstrapping a React app, we wind up installing 59 packages in total. When we use that tool to create the React app itself, the number of dependencies balloons to over 1,000!

If your project depends on 1,000 packages just like is-odd, you either need to review 30,000 lines of code yourself or you need to research and decide to trust (or not trust) 4,000 mostly independent, unaffiliated developers. That’s a lot of work! If you split that same amount of code into a few packages that were large enough to interest and attract actual communities of volunteers, you could safely shove off most of that work onto other people.
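To make that arithmetic concrete, here is a back-of-the-envelope sketch. The function name auditBurden is made up for this post, and the per-package figures are just the is-odd numbers from above applied to every dependency, which is a simplification; real dependency trees are not this uniform.

```javascript
// Rough audit burden for a dependency tree of uniformly small packages.
// Assumption: every package looks like is-odd (about 30 lines, 4 maintainers).
function auditBurden(packageCount, linesPerPackage, maintainersPerPackage) {
  return {
    // Option 1: read all of the code yourself.
    linesToReview: packageCount * linesPerPackage,
    // Option 2: vet the people who wrote it instead.
    maintainersToVet: packageCount * maintainersPerPackage,
  };
}

const burden = auditBurden(1000, 30, 4);
console.log(burden); // { linesToReview: 30000, maintainersToVet: 4000 }
```

Either number is more than one developer can realistically handle, and both grow with every package you add.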

It wasn’t always like this. JavaScript used to have larger projects like jQuery and lodash that attracted significant community investment. Because these libraries are used in a browser context, the argument was made that they were too “bloated” to be worth including in web apps. There’s some merit to that. And yet…

Almost all modern JavaScript projects now use build tools like Webpack that give you the power to selectively import specific modules within a library, and to eliminate dead code. You can use a big library like lodash with Webpack and end up with a bundle file that only includes the parts of lodash you’re actually using. There’s no real technical reason to prefer smaller libraries in JavaScript anymore; our build tools have allowed us to overcome the performance problems that used to be associated with larger ones.
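As a rough sketch of what that looks like in practice, here is a minimal Webpack configuration (assuming Webpack 4 or later) along with the import styles that let the bundler leave out what you don’t use. The entry path is a placeholder, and how much code actually gets dropped depends on how a given library is packaged.

```javascript
// webpack.config.js: hypothetical minimal config; './src/index.js' is a placeholder.
module.exports = {
  // Production mode enables minification and unused-export elimination by default.
  mode: 'production',
  entry: './src/index.js',
  optimization: {
    // Flag exports that nothing imports so the minifier can remove them.
    // (Already on in production mode; spelled out here for clarity.)
    usedExports: true,
  },
};

// In application code, import only what you use, for example:
//   import debounce from 'lodash/debounce';  // per-method path in the lodash package
//   import { debounce } from 'lodash-es';    // named import from the ES-module build
```

With imports written this way, depending on a large library no longer means shipping all of it to the browser.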

Moving the JavaScript community back in the direction of fewer, larger dependencies would do a lot to improve JavaScript security by fostering community growth.

The JavaScript Ecosystem Encourages Corporate Freeloading

I’ve talked a lot about how you, as a developer, can freeload off of the work of volunteers who are auditing open source code for security. I’m using “freeload” a little bit facetiously here; I think most people who write open source software do it with the full awareness that they are, in a sense, donating their time to other people, and that they can’t expect to be compensated for it in any way. It’s that whole “free as in beer” thing that open source advocates talk about. We all benefit from the free labor of volunteers, but it’s with the understanding (or at least hope) that we’ll give back to the community by volunteering ourselves.

Where it gets a little tricky, ethically, is when it’s rich companies – not hobbyist developers or even freelancers – who’re benefiting from the free labor. I would not be happy to work part-time for Facebook or Microsoft for free, and yet that’s what a lot of open source developers are, in effect, doing when they work on their projects.

It’s not a problem that’s inherent to open source software, I don’t think. When you have very big, very popular open source projects, they usually have some kind of mechanism for sponsorship, patronage, or donations. The Django Software Foundation is a non-profit. Django REST Framework, another very popular Python library, offers paid support plans. People who benefit financially from the software can donate back to the people who sustain it. This is another shortcoming of JavaScript’s propensity for small, single-purpose libraries: a thirty-line package is too small to sustain a foundation, a sponsorship program, or a support plan.

Of course, financial support isn’t the only way to support an open source project. A company could empower its developers to spend some of their time on the open source projects they depend on. At a minimum, the company should be doing its due diligence to ensure that the open source software it uses meets minimum quality standards, and alerting the community when a popular library falls short.

Thinking back to the event-stream fiasco, has anyone really questioned Copay’s decision to base its Bitcoin wallet app on open source code they (apparently) weren’t making any effort to review? When the creator of event-stream published his library on npm, did he have an obligation to make sure that his project was secure enough to be suitable for a commercial cryptocurrency app?

I would argue that no, he didn’t, and that if anything the obligation runs in the other direction. BitPay (the company behind Copay) has raised over $100 million and has an absurd valuation. They could have – and should have – spent some of those resources ensuring that the software their business depends on is secure. It should not be expected that a developer must now donate his labor to BitPay for free because he published a package to npm that someone else decided to use.

I almost titled this section “Richard Stallman Was Right”, even though almost every post I write could have that somewhere in its title. But… Richard Stallman was right. Licenses like the MIT license, which are popular in the JavaScript community, don’t actually help open source software as a community or as a movement. They enable rich companies to use open source code without having any obligation to give back to the community, and so most companies don’t. They don’t contribute code improvements and they don’t contribute money or resources. They just use the free labor of volunteers to make money (or at least, to attract VC funding).

Companies will typically take the path of least resistance. If you make it easy to contribute to open source software, they probably will. If the JavaScript community either had (1) bigger libraries that offered corporate sponsorships, paid support plans, etc. or (2) a community norm around using GPL (or equivalent) software licenses, then there would be avenues for companies to contribute back to the community instead of just exploiting the free labor. As it is, there are npm packages with tens of millions of weekly downloads that depend on the free labor of one or two people; it’s just not a tenable situation.

Many JavaScript Projects Are “Public”, Not “Open”

It may strike you as a bit odd that I’m complaining about JavaScript’s open source community mostly consisting of tiny projects with no corporate sponsorship when there are huge, famous corporate projects like React (Facebook), TypeScript (Microsoft), and the Apollo GraphQL libraries (er, Apollo GraphQL) that are released under open source licenses. That’s not an oversight on my part. I think there’s an important distinction to be made between code that is “open source” and code that is merely “public”.

“Public” code is viewable by anyone, can be forked by anyone, and can be used by anyone. In that respect, it is very similar to “open source” code. But whereas anyone can (in principle) contribute to an open source project, public projects are tightly controlled by a particular company that has veto power over changes to the codebase. More importantly, the design goals of the project are entirely oriented around the needs and desires of the controlling company.

React is a great example of this, because Facebook is constantly changing React’s API to add or remove lifecycle methods based on how they use the library internally and what optimizations will best help their own app’s performance. Outside contributors to React don’t really have a say in any of these design decisions. If React stops being a good fit for your use case, well, too bad for you. You can fork it or move on to something else, but you certainly can’t expect Facebook to sacrifice their own needs for yours or anyone else’s.

Facebook benefits by getting free labor for things like bug fixes, and also by creating a pool of potential Facebook employees who are already trained in their internal tools. Ordinary developers get a library that may or may not be useful to them depending on how closely their needs align with Facebook’s. I think this is a little exploitative, but I also can see why other people might disagree and think this is a fair exchange. That’s a bit beside the point.

The bigger issue, I think, is that public repositories like these create a false sense of security. People assume that a company like Facebook will have a very talented team of engineers who have the time and resources needed to make sure that any open source library they publish is secure, performant, and so on.

The reality is that for-profit companies have an absolutely horrendous track record with security, in part because they have to balance the need for more and better features (which actually makes them money) with tech debt tasks like preventing potential security issues. Open source projects have the freedom to dedicate as much time as they want to code quality because they don’t have this financial pressure to create user- or consumer-facing features. This is especially bad for JavaScript projects because, as I mentioned earlier, it’s so incredibly expensive (in time and resources) to audit the many hundreds or thousands of dependencies your project will inevitably have.

Think back to the React app I created earlier using Facebook’s tooling. Do you think that Facebook has people auditing all 1,000+ dependencies they bundle with that app? Who are they? Do they do a good job? How frequently do they do these audits? Do you actually know the answers to any of these questions, or have you just been assuming that they must have good answers?

It’s entirely possible that Facebook is, in fact, dedicating adequate developer time to addressing these concerns. My point is just that most people choose to believe that’s true without actually verifying it in any way.

This is another example of the “many eyes” principle of open source security falling apart in JavaScript-land. I think people place far too much trust in code created by and for for-profit corporations. Even projects like React are a house of cards built on top of hundreds of tiny libraries made by no-name developers that no one is really keeping an eye on. We have to change our attitude toward these corporate projects so that we aren’t giving them blind trust they haven’t earned.

An Analogy, and a Pledge

People often describe open source software as “free as in beer” or “free as in freedom”. I think we should look at it a different way. I think that most open source JavaScript projects in 2018 are “public as in bathroom”. A gas station bathroom, more specifically.

Gas station bathrooms are usually free to use, and pretty much nobody involved does a good job of taking care of them. People make messes on the toilet seats or the floors. They forget to flush. They don’t always wash their hands. Their owners don’t send people in to clean them often enough, and sometimes let them run out of important things like soap or toilet paper. Sometimes the plumbing doesn’t even work right.

Nobody takes good care of the bathroom because nobody feels any real responsibility towards it. The people who use it are just passing through – perhaps on a road trip – and may never visit that gas station ever again. The people who own it are either unable or unwilling to devote the resources needed to maintain it. There aren’t really any incentives (beyond a sense of basic human decency) for anyone involved to do better.

I think the real, long-term solution to this problem in the JavaScript community would be to build bigger, more active community projects that people are invested in, instead of basing the ecosystem on thousands upon thousands of tiny micro-libraries that no one cares about individually. That kind of structure has been proven to work well for other programming languages.

In the short term, what can an individual developer do? For my part, I’m making this pledge to do better:

  • If I publish a project I’m not willing to maintain for production use, it will come with big, bold disclaimers, including an npm warning on installation.
  • I won’t include microdependencies in my projects. If some problem is small enough that it takes a few dozen lines of code to solve, I’ll just copy it directly into my project with proper attribution so that I can maintain it myself.
  • I’ll take responsibility for any dependencies I do include in my projects. If something is listed in my package.json file, it’s because I know how it works, who created it, who’s responsible for it, and so on.
  • If there’s a library that I use and depend on, I’ll do my best to contribute back to it in some way.
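
To illustrate the second point in that list, here is the sort of helper I mean. This is a hypothetical isOdd written from scratch for this post, not the code from any published package; if you copy real code, the header comment should name the actual source and license.

```javascript
// Inlined helper, written for this example (not copied from any package).
// If this were vendored from a real library, this header would record:
//   Source:  <package name, version, and repository>
//   License: <license name, with the license text kept alongside>

/**
 * Returns true when `value` is an odd integer.
 * Throws on anything that is not a safe integer, so bad input fails loudly.
 */
function isOdd(value) {
  if (!Number.isSafeInteger(value)) {
    throw new TypeError(`expected a safe integer, got ${String(value)}`);
  }
  // In JavaScript, % keeps the sign of the dividend (-3 % 2 === -1),
  // so compare against 0 rather than 1.
  return value % 2 !== 0;
}

console.log(isOdd(3));  // true
console.log(isOdd(-3)); // true
console.log(isOdd(4));  // false
```

A dozen lines like this are easy to read, easy to test, and are now my responsibility rather than a stranger’s.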

It’s like washing your hands. It’s just basic hygiene.