BUYING INTO THE BIAS: WHY VULNERABILITY STATISTICS SUCK

Academic researchers, journalists, security vendors, software vendors, and other enterprising... uh... enterprises often analyze vulnerability statistics using large repositories of vulnerability data, such as CVE and OSVDB. These statistics are claimed to demonstrate trends in vulnerability disclosure, such as the number or type of vulnerabilities, or their relative severity. Worse, they are often (mis)used to compare competing products and assess which one offers the best security.

Most of these statistical analyses are faulty or just pure hogwash. They use easily available but drastically misunderstood data to craft irrelevant questions based on wild assumptions, while never figuring out (or even asking us about) the limitations of the data. This leads to a wide variety of bias that typically goes unchallenged and ultimately produces statistics that make headlines and, far worse, drive budget and spending decisions.

As maintainers of two well-known vulnerability information repositories, we're sick of hearing about sloppy research after it's been released, and we're not going to take it anymore.

We will give concrete examples of the misuses and abuses of vulnerability statistics over the years, reveal which studies do it right (or rather, least wrong), and show how to judge future claims so that you can make better decisions based on these "studies." We will cover the kinds of documented and undocumented bias that can exist in a vulnerability data source, how variations in counting hurt comparative analyses, and the ways that vulnerability information is observed, cataloged, and annotated.

Steve will provide vendor-neutral, friendly, supportive suggestions to the industry. Jericho will do no such thing.
