CVE editor Steven Christey has begun to post commentary related to CVE and VDBs.
[20013-07-07 Update: This effort didn't last long. The last update was 2006-02-16, 4 days after this blog post. =(]
Jason Bergen posted to Full-Disclosure trying to sell a “Security Vulnerability Database Company“. From that mail:
The company maintains a database of all security vulnerabilities, and the database is updated on a daily basis. The company maybe of interest to organisations who are currently licensing a vulnerability database. In addition the company has developed some software applications built upon the vulnerability database.
This is interesting on many levels, especially the approach in selling it. Why post to that mail list and not others? When asked for more details, Mr Bergen tells you “In order to provide further information a signed NDA would be required.” You must sign a non-disclosure agreement just to find out the name of the company being sold. He also makes the following claim:
The database contains all vulnerabilities since 1988. Each entry has Bugtraq, CVE, and Nessus ids. It has developed its own vulnerability alerting system, but recently changed focus to providing OEM database licensing.
Sadly, he is not the first to make this claim. Throughout the years, many people have referred to CVE as having “all vulnerabilities since 1988” which simply is not the case. If you ask Steve Christey or anyone involved with CVE, they will be the first to tell you that isn’t the case. So why do people think that? CERT started releasing advisories in 1988, but only released them for serious/critical vulnerabilities. Between 1988 and 1999 (CVE inception), many vulnerabilities were never added or given a formal advisory for. In short, claims that their database has “all vulnerabilities since 1988″ is extremely suspect. Had it been any year other than 1988, perhaps they took the time to go back and add them making the claim true. His wording also begs the question, what if a vulnerability doesn’t have a BID, CVE or Nessus ID to match? As much as databases try to maintain a perfect cross reference mapping, it just doesn’t happen all the time.
Steve Christey of CVE has posted to several lists asking What is the state of vulnerability research? Before you dismiss the question, give it serious thought for a few minutes. Have any ideas, opinions or concerns about where vuln research is heading? Where it should be? Drop him a line and let him know.
One person challenged him stating that if MITRE were the experts they proclaim, he wouldn’t have to ask. After a few years of being heavily involved with vulnerability databases and monitoring such research, I of course had to reply.
Brian Krebs has a fantastic post on his blog covering the time it takes for Microsoft to release a patch, and if they are getting any better at it. Here are a few relevant paragraphs from it, but I encourage you to read the entire article. It appears to be a well developed article that is heavily researched and quite balanced. Makes me wonder if his editors shot it down for some reason. If they did, shame on them.
A few months back while researching a Microsoft patch from way back in 2003, I began to wonder whether anyone had ever conducted a longitudinal study of Redmond’s patch process to see whether the company was indeed getting more nimble at fixing security problems.
Finding no such comprehensive research, Security Fix set about digging through the publicly available data for each patch that Microsoft issued over the past three years that earned a “critical” rating. Microsoft considers a patch “critical” if it fixes a security hole that attackers could use to break into and take control over vulnerable Windows computers.
Here’s what we found: Over the past three years, Microsoft has actually taken longer to issue critical fixes when researchers waited to disclose their research until after the company issued a patch. In 2003, Microsoft took an average of three months to issue patches for problems reported to them. In 2004, that time frame shot up to 134.5 days, a number that remained virtually unchanged in 2005.
First off, these are the kind of statistics and research that I mean when I talk about the lack of evolution of vulnerability databases. This type of information is interesting, useful, and needed in our industry. This begins to give customers a solid idea on just how responsive our vendors are, and just how long we stay at risk with unpatched vulnerabilities. This is also the type of data that any solid vulnerability database should be able to produce with a few clicks of the mouse.
This type of article can be written due to the right data being available. Specifically, a well documented and detailed time line of the life of a vulnerability. Discovery, disclosure to the vendor, vendor acknowledgement, public disclosure, and patch date are required to generate this type of information. People like Steven Christey (CVE) and Chris Wysopal (VulnWatch) have been pushing for this information to be made public, often behind the scenes in extensive mail to vendors. In the future if we finally get these types of statistics for all vendors over a longer period of time, you will need to thank them for seeing it early on and helping to make it happen.
This type of data is of particular interest to OSVDB and has been worked into our database (to a degree) from the beginning. We currently track the disclosure date, discovery date and exploit publish date for each vulnerability, as best we can. Sometimes this data is not available but we include it when it is. One of our outstanding development/bugzilla entries involving adding a couple more date fields, specifically vendor acknowledge date and vendor solution date. With these five fields, we can begin to trend this type of vendor response time with accuracy, and with a better historical perspective.
While Krebs used Microsoft as an example, are you aware that other vendors are worse than Microsoft? Some of the large Unix vendors have been slow to patch for the last twenty years! Take the recent disclosure of a bug in uustat on Sun Microsystems Solaris Operating System. iDefense recently reported the problem and included a time line of the disclosure process.
08/11/2004 Initial vendor contact
08/11/2004 Initial vendor response
01/10/2006 Coordinated public disclosure
Yes, one year and five months for Sun Microsystems to fix a standard buffer overflow in a SUID binary. The same thing that has plagued them as far back as January 1997 (maybe as far back as December 6, 1994, but details aren’t clear). It would be nice to see this type of data available for all vendors on demand, and it will be in due time. Move beyond the basic stats and consider if we apply this based on the severity of the vulnerability. Does it change the vendor’s response time (consistently)? Compare the time lines along with who discovered the vulnerability, and how it was disclosed (responsibly or no). Do those factors change the vendor’s response time?
The answers to those questions have been on our minds for a long time and are just a few of the many goals of OSVDB.
Steve Christey (CVE Editor) wrote an open letter to several mailing lists regarding the nature of vulnerability statistics. What he said is spot on, and most of what I would have pointed out had my previous rant been more broad, and not a direct attack on a specific group. I am posting his entire letter here, because it needs to be said, read, understood, and drilled into the heads of so many people. I am reformatting this for the blog, you can read an original copy via a mail list.
Open Letter on the Interpretation of “Vulnerability Statistics”
Author: Steve Christey, CVE Editor
Date: January 4, 2006
As the new year begins, there will be many temptations to generate, comment, or report on vulnerability statistics based on totals from 2005. The original reports will likely come from publicly available Refined Vulnerability Information (RVI) sources – that is, vulnerability databases (including CVE/NVD), notification services, and periodic summary producers.
RVI sources collect unstructured vulnerability information from Raw Sources. Then, they refine, correlate, and redistribute the information to others. Raw sources include mailing lists like Bugtraq, Vulnwatch, and Full-Disclosure, web sites like PacketStorm and Securiteam, blogs, conferences, newsgroups, direct emails, etc.
In my opinion, RVI sources are still a year or two away from being able to produce reliable, repeatable, and COMPARABLE statistics. In general, consumers should treat current statistics as suggestive, not conclusive.
Vulnerability statistics are difficult to interpret due to several factors:
- - VARIATIONS IN EDITORIAL POLICY. An RVI source’s editorial policy dictates HOW MANY vulnerabilities are reported, and WHICH vulnerabilities are reported. RVIs have widely varying policies. You can’t even compare an RVI against itself, unless you can be sure that its editorial policy has not changed within the relevant data set. The editorial policies of RVIs seem to take a few years before they stabilize, and there is evidence that they can change periodically.
- – FRACTURED VULNERABILITY INFORMATION. Each RVI source collects its information from its own list of raw sources – web sites, mailing lists, blogs, etc. RVIs can also use other RVIs as sources. Apparently for competitive reasons, some RVIs might not identify the raw source that was used for a vulnerability item, which is one aspect of what I refer to as the provenance problem. Long gone are the days when a couple mailing lists or newsgroups were the raw source for 90% of widely available vulnerability information. Based on what I have seen, the provenance problem is only going to get worse.
- – LACK OF COMPLETE CROSS-REFERENCING BETWEEN RVI SOURCES. No RVI has an exhaustive set of cross-references, so no RVI can be sure that it is 100% comprehensive, even with respect to its own editorial policy. Some RVIs compete with each other directly, so they don’t cross-reference each other. Some sources could theoretically support all public cross-references – most notably OSVDB and CVE – but they do not, due to resource limitations or other priorities.
- – UNMEASURABLE RESEARCH COMMUNITY BIAS. Vulnerability researchers vary widely in skill sets, thoroughness, preference for certain vulnerability types or product classes, and so on. This collectively produces a bias that is not currently measurable against the number of latent vulnerabilities that actually exist. Example: web browser vulnerabilities were once thought to belong to Internet Explorer only, until people actually started researching other browsers; many elite researchers concentrate on a small number of operating systems or product classes; basic SQL injection and XSS are very easy to find manually; etc.
- – UNMEASURABLE DISCLOSURE BIAS. Vendors and researchers vary widely in their disclosure models, which creates an unmeasurable bias. For example, one vendor might hire an independent auditor and patch all reported vulnerabilities without publicly announcing any of them, or a different vendor might publish advisories even for very low-risk issues. One researcher might disclose without coordinating with the vendor at all, whereas another researcher might never disclose an issue until a patch is provided, even if the vendor takes an inordinate amount of time to respond. Note that many large-scale comparisons, such as “Linux vs. Windows,” can not be verified due to unmeasurable bias, and/or editorial policy of the core RVI that was used to conduct the comparison.
EDITORIAL POLICY VARIATIONS
This is just a sample of variations in editorial policy. There are legitimate reasons for each variation, usually due to audience needs or availability of analytical resources.
COMPLETENESS (what is included):
- SEVERITY. Some RVIs do not include very low-risk items such as a bug that causes path disclosure in an error message in certain non-operational configurations. Secunia and SecurityFocus do not do this, although they might note this when other issues are identified. Others include low-risk issues, such as CVE, ISS X-Force, US-CERT Security Bulletins, and OSVDB.
- VERACITY. Some RVIs will only publish vulnerabilities when they are confident that the original, raw report is legitimate – or if they’re verified it themselves. Others will publish reports when they are first detected from the raw sources. Still others will only publish reports when they are included in other RVIs, which makes them subject to the editorial policies of those RVIs unless care is taken. For example, US-CERT’s Vulnerability Notes have a high veracity requirement before they are published; OSVDB and CVE have a lower requirement for veracity, although they have correction mechanisms in place if veracity is questioned, and CVE has a two-stage approach (candidates and entries).
- PRODUCT SPACE. Some RVIs might omit certain products that have very limited distribution, are in the beta development stage, or are not applicable to the intended audience. For example, version 0.0.1 of a low-distribution package might be omitted, or if the RVI is intended for a business audience, video game vulnerabilities might be excluded. On the other hand, some “beta” products have extremely wide distribution.
- OTHER VARIATIONS. Other variations exist but have not been studied or categorized at this time. One example, though, is historical completeness. Most RVIs do not cover vulnerabilities before the RVI was first launched, whereas others – such as CVE and OSVDB – can include issues that are older than the RVI itself. As another example: a few years ago, Neohapsis made an editorial decision to omit most PHP application vulnerabilities from their summaries, if they were obscure products, or if the
vulnerability was not exploitable in a typical operational configuration.
ABSTRACTION (how vulnerabilities are “counted”):
- VULNERABILITY TYPE. Some RVIs distinguish between types of vulnerabilities (e.g. buffer overflow, format string, symlink, XSS, SQL injection). CVE, OSVDB, ISS X-Force, and US-CERT Vulnerability Notes perform this distinction; Secunia, FrSIRT, and US-CERT Cyber Security Bulletins do not. Bugtraq IDs vary. As vulnerability classification becomes more detailed, there is more room for variation (e.g. integer overflows and off-by-ones might be separated from “classic” overflows).
- REPLICATION. Some RVIs will produce multiple records for the same core vulnerability, even based on the RVI’s own definition. Usually this is done when the same vulnerability affects multiple vendors, or if important information is released at a later date. Secunia and US-CERT Security Bulletins use replication; so might vendor advisories (for each supported distribution). OSVDB, Bugtraq ID, CVE, US-CERT Vulnerability Notes, and ISS X-Force do not – or, they use different replication than others. Replication’s impact on statistics is not well understood.
- OTHER VARIATIONS. Other abstraction variations exist but have not been studied or categorized at this time. As one example, if an SQL injection vulnerability affects multiple executables in the same product, OSVDB will create one record for each affected program, whereas CVE will combine them.
- RVIs differ in how quickly they must release vulnerability information. While this used to vary significantly in the past, these days most public RVIs have very short timelines, from the hour of release to within a few days. Vulnerability information can be volatile in the early stages, so an RVI’s requirements for timeliness directly affects its veracity and completeness.
- All RVIs deal with limited resources or time, which significantly affects completeness, especially with respect to veracity, or timeliness (which is strongly associated with the ability to achieve completeness). Abstraction might also be affected, although usually to a lesser degree, except in the case of large-scale disclosures.
In my opinion:
You should not interpret any RVI’s statistics without considering its editorial policy. For example, the US-CERT Cyber Security Bulletin Summary for 2005 uses statistics that include replication. (As a side note, a causal glance at the bulletin’s contents makes it clear that it cannot be used to compare Windows to Linux as operating systems.)
In addition, you should not compare statistics from different RVIs until (a) the RVIs are clear about their editorial policy and (b) the differences in editorial policy can be normalized. Example: based on my PRELIMINARY investigations of a few hours’ work, OSVDB would have about 50% more records than CVE, even though it has the same underlying number of vulnerabilities and the same completeness policy for recent issues.
Third, for the sake of more knowledgeable analysis, RVIs should consider developing and publishing their own editorial policies.
(Note that based on CVE’s experience, this can be difficult to do.) Consumers should be aware that some RVIs might not be open about their raw sources, veracity analysis, and/or completeness.
Finally: while RVIs are not yet ready to provide usable, conclusive statistics, there is a solid chance that they will be able to do so in the near future. Then, the only problem will be whether the statistics are properly interpreted. But that is beyond the scope of this letter.
P.S. This post was written for the purpose of timely technical exchange. Members of the press are politely requested to consult me before directly attributing quotes from this article, especially with respect to stated opinion.