Despite the talk given at BlackHat 2013 by Steve Christey and myself, companies continue to produce pedestrian and inaccurate statistics. This batch comes from Cristian Florian at GFI Software and offers little more than confusing and misleading statistics. Florian falls into many of the traps and pitfalls outlined previously.
These are compiled from data from the National Vulnerability Database (NVD).
There’s your first problem, using a drastically inferior data set than is available. The next bit really invalidates the rest of the article:
On average, 13 new vulnerabilities per day were reported in 2013, for a total of 4,794 security vulnerabilities: the highest number in the last five years.
This is laughable. OSVDB cataloged 10,472 disclosed vulnerabilities for 2013 (average of 28 a day), meaning these statistics were generated with less than half of known vulnerabilities. 2013 was our third year of breaking 10,000 vulnerabilities, where the rest have a single year (2006) if any at all. Seriously; what is the point of generating statistics when you knowingly use a data set lacking so much? Given that 2012 was another ’10k’ year, the statement about it being the highest number in the last five years is also wrong.
Around one-third of these vulnerabilities were classified ‘high severity’, meaning that an exploit for these vulnerabilities would have a high impact on the attacked systems.
By who? Who generated these CVSS scores exactly, and why isn’t that disclaimed in the article? Why no mention of the ‘CVSS 10′ scoring problem as VDBs must default to that for a completely unspecified issue? With a serious number of vulnerabilities either scored by vendors with a history of incorrect scoring, or VDBs forced to use ’10′ for unspecified issues, these numbers are completely meaningless and skewed.
The vulnerabilities were discovered in software provided by 760 different vendors, but the top 10 vendors were found to have 50% of the vulnerabilities:
I would imagine Oracle is accurate on this table, as we have cataloged 570 vulnerabilites in 2013 from them. However, the rest of the table is inaccurate because #2 is wrong. You say Cisco with 373, I say ffmpeg with 490. You say #10 is HP with 112 and I counter that WebKit had 139 (which in turn adds to Apple and Google among others). You do factor in that whole “software library” thing, right? For example, what products incorporate ffmpeg that have their own vulnerabilities? These are contenders for taking the #1 and #2 spot on the table.
Most Targeted Operating Systems in 2013
As we frequently see, no mention of severity here. Of the 363 Microsoft vulnerabilities in 2013, compared to the 161 Linux Kernel issues, impact and severity is important to look at. Privilege escalation and code execution is typical in Microsoft, while authenticated local denial of service accounts for 22% of the Linux issues (and only 1% for Microsoft).
In 2013 web browsers continued to justle – as in previous years – for first place on the list of third-party applications with the most security vulnerabilities. If Mozilla Firefox had the most security vulnerabilities reported last year and in 2009, Google Chrome had the “honor” in 2010 and 2011, it is now the turn of Microsoft Internet Explorer to lead with 128 vulnerabilities, 117 of them ‘critical’.
We already know your numbers are horribly wrong, as you don’t factor in WebKit vulnerabilities that affect multiple browsers. Further, what is with the sorting of this table putting MSIE up top despite it not being reported with the most vulnerabilities?
Sticking to just the browsers, Google Chrome had 297 reported vulnerabilities in 2013 and that does not count additional WebKit issues that very likely affect it. Next is Mozilla and then Microsoft IE with Safari at the lowest (again, ignoring the WebKit issue).
[Sent to Ashley directly via email. Posting for the rest of the world as yet another example of how vulnerability statistics are typically done poorly. In this case, a company that does not aggregate vulnerabilities themselves, and has no particular expertise in vulnerability metrics weighs in on 2013 "statistics". They obviously did not attend Steve Christey and my talk at BlackHat last year titled "Buying Into the Bias: Why Vulnerability Statistics Suck". If we do this talk again, we have a fresh example to use courtesy of Skybox.]
[Update: SkyboxSecurity has quickly written a second blog in response to this one, clarifying a lot of their methodology. No word from Carman or SC Magazine. Not surprised; they have a dismal history as far as printing corrections, retractions, or even addressing criticism.]
In your recent article “Microsoft leads vendors with most critical vulnerabilities“, you cite research that is factually incorrect, and I fully expect a retraction to be printed. In fact, the list of errata in this article is considerably longer than the article itself. Some of this may seem to be semantics to you, but I assure you that in our industry they are anything but. Read down, where I show you how their research is *entirely wrong* and Microsoft is not ‘number one’ here.
1. If Skybox is only comparing vendors based on their database, as maps to CVE identifiers, then their database for this purpose is nothing but a copy of CVE. It is important to note this because aggregating vulnerability information is considerably more demanding than aggregating a few databases that do that work for you.
2. You say “More than half of the company’s 414 vulnerabilities were critical.” First, you do not disclaim that this number is limited to 2013 until your last paragraph. Second, Microsoft had 490 disclosed vulnerabilities in 2013 according to OSVDB.org, apparently not one of the “20″ sources Skybox checked. And we don’t claim to have all of the disclosed vulnerabilities.
3. You cite “critical vulnerability” and refer to Microsoft’s definition of that as “one that allows code execution without user interaction.” Yet Skybox did not define ‘critical’. This is amateur hour in the world of vulnerabilities. For example, if Microsoft’s definition were taken at face value, then code execution in a sandbox would still qualify, while being considerably less severe than without. If you go for what I believe is the ‘spirit’ of the research, then you are talking about vulnerabilities with a CVSS score of 10.0 (network, no user interaction, no authentication, full code execution to impact confidentiality / integrity / availability completely), then Microsoft had 10 vulnerabilities. Yes, only 10. If you add the ‘user interaction’ component, giving it a CVSS score of 9.3, they had 176. That is closer to the ’216′ Skybox is claiming. So again, how can you cite their research when they don’t define what ‘critical’ is exactly? As we frequently see, companies like to throw around vulnerability statistics but give no way to reproduce their findings.
4. You say, “The lab’s findings weren’t particularly surprising, considering the vendors’ market shares. Microsoft, for instance, is the largest company and its products are the most widely used.” This is completely subjective and arbitrary. While Microsoft captures the desktop OS market share, they do not capture the browser share for example. Further, like all of the vendors in this study, they use third-party code from other people. I point this line out because when you consider that another vendor/software is really ‘number one’, it makes this line seem to
be the basis of an anecdotal fallacy.
5. You finish by largely parroting Skybox, “Skybox analyzed more than 20 sources of data to determine the number of vulnerabilities that occurred in 2013. The lab found that about 700 critical vulnerabilities occurred in 2013, and more than 500 of them were from four vendors.” We’ve covered the ‘critical’ fallacy already, as they never define what that means. I mentioned the “CVE” angle above. Now, I question why you didn’t challenge them further on this. As a security writer, the notion that “20″ sources has any meaning in that context should be suspect. Did they simply look to 20 other vulnerability databases (that do all the initial data aggregation) and then aggregate them? Did they look at 20 unique sources of vulnerability information themselves (e.g. the MS / Adobe / Oracle advisory pages)? This matters greatly. Why? OSVDB.org monitors over 1,500 sources for vulnerability information. Monitoring CVE, BID, Secunia, and X-Force (other large vulnerability databases) is considered to be 4 of those sources. So what does 20 mean exactly? To me, it means they are amateurs at best.
6. Jumping to the Skybox blog, “Oracle had the highest total number of vulnerabilities at 568, but only 18 percent of their total vulnerabilities were deemed critical.” This is nothing short of a big red warning flag to anyone familiar with vulnerabilities. This line alone should have made you steer clear from their ‘research’ and demanded you challenge them. It is well known that Oracle does not follow the CVSS standards when scoring a majority of their vulnerabilities. It has been shown time and time again that what they scored is not grounded in reality, when compared to the
researcher report that is eventually released. Every aspect of a CVSS score is frequently botched. Microsoft and Adobe do not have that reputation; they are known for generally providing accurate scoring. Since that scoring is the quickest way to determine criticality, it is important to note here.
7. Now for what you are likely waiting for. If not Microsoft, who? Before I answer that, let me qualify my statements since no one else at this table did. Based on vulnerabilities initially disclosed in 2013, that have a CVSS score of 10.0 (meaning full remote code execution without user interaction), we get this:
Two vendors place higher than Microsoft based on this. Now, if we consider “context-dependent code execution”, meaning that user interaction is required but it leads to full code execution (e.g. click this malicious PDF/DOC/GIF and we base that on a 9.3 CVSS score (CVSS2#AV:N/AC:M/Au:N/C:C/I:C/A:C”)) or full remote code execution (CVSS2#AV:N/AC:L/Au:N/C:C/I:C/A:C) we get the following:
I know, Microsoft is back on top. But wait…
OSF / OSVDB.org
Last week, Steve Christey and I gave a presentation at Black Hat Briefings 2013 in Las Vegas about vulnerability statistics. We submitted a brief whitepaper on the topic, reproduced below, to accompany the slides that are now available.
Buying Into the Bias: Why Vulnerability Statistics Suck
By Steve Christey (MITRE) and Brian Martin (Open Security Foundation)
July 11, 2013
Academic researchers, journalists, security vendors, software vendors, and professional analysts often analyze vulnerability statistics using large repositories of vulnerability data, such as “Common Vulnerabilities and Exposures” (CVE), the Open Sourced Vulnerability Database (OSVDB), and other sources of aggregated vulnerability information. These statistics are claimed to demonstrate trends in vulnerability disclosure, such as the number or type of vulnerabilities, or their relative severity. Worse, they are typically misused to compare competing products to assess which one offers the best security.
Most of these statistical analyses demonstrate a serious fault in methodology, or are pure speculation in the long run. They use the easily-available, but drastically misunderstood data to craft irrelevant questions based on wild assumptions, while never figuring out (or even asking the sources about) the limitations of the data. This leads to a wide variety of bias that typically goes unchallenged, that ultimately forms statistics that make headlines and, far worse, are used to justify security budget and spending.
As maintainers of two well-known vulnerability information repositories, we’re sick of hearing about research that is quickly determined to be sloppy after it’s been released and gained public attention. In almost every case, the research casts aside any logical approach to generating the statistics. They frequently do not release their methodology, and they rarely disclaim the serious pitfalls in their conclusions. This stems from their serious lack of understanding about the data source they use, and how it operates. In short, vulnerability databases (VDBs) are very different and very fickle creatures. They are constantly evolving and see the world of vulnerabilities through very different glasses.
This paper and its associated presentation introduce a framework in which vulnerability statistics can be judged and improved. The better we get about talking about the issues, the better the chances of truly improving how vulnerability statistics are generated and interpreted.
Bias, We All Have It
Bias is inherent in everything humans do. Even the most rigorous and well-documented process can be affected by levels of bias that we simply do not understand are working against us. This is part of human nature. As with all things, bias is present in the creation of the VDBs, how the databases are populated with vulnerability data, and the subsequent analysis of that data. Not all bias is bad; for example, VDBs have a bias to avoid providing inaccurate information whenever possible, and each VDB effectively has a customer base whose needs directly drive what content is published.
Bias comes in many forms that we see as strongly influencing vulnerability statistics, via a number of actors involved in the process. It is important to remember that VDBs catalog the public disclosure of security vulnerabilities by a wide variety of people with vastly different skills and motivations. The disclosure process varies from person to person and introduces bias for sure, but even before the disclosure occurs, bias has already entered the picture.
Consider the general sequence of events that lead to a vulnerability being cataloged in a VDB.
- A researcher chooses a piece of software to examine.
- Each researcher operates with a different skill set and focus, using tools or techniques with varying strengths and weaknesses; these differences can impact which vulnerabilities are capable of being discovered.
- During the process, the researcher will find at least one vulnerability, often more.
- The researcher may or may not opt for vendor involvement in verifying or fixing the issue.
- At some point, the researcher may choose to disclose the vulnerability. That disclosure will not be in a common format, may suffer from language barriers, may not be technically accurate, may leave out critical details that impact the severity of the vulnerability (e.g. administrator authentication required), may be a duplicate of prior research, or introduce a number of other problems.
- Many VDBs attempt to catalog all public disclosures of information. This is a “best effort” activity, as there are simply too many sources for any one VDB to monitor, and accuracy problems can increase the expense of analyzing a single disclosure.
- If the VDB maintainers see the disclosure mentioned above, they will add it to the database if it meets their criteria, which is not always public. If the VDB does not see it, they will not add it. If the VDB disagrees with the disclosure (i.e. believes it to be inaccurate), they may not add it.
By this point, there are a number of criteria that may prevent the disclosure from ever making it into a VDB. Without using the word, the above steps have introduced several types of bias that impact the process. These biases carry forward into any subsequent examination of the database in any manner.
Types of Bias
Specific to the vulnerability disclosure aggregation process that VDBs go through every day, there are four primary types of bias that enter the picture. Note that while each of these can be seen in researchers, vendors, and VDBs, some are more common to one than the others. There are other types of bias that could also apply, but they are beyond the scope of this paper.
Selection bias covers what gets selected for study. In the case of disclosure, this refers to the researcher’s bias in selecting software and the methodology used to test the software for vulnerabilities; for example, a researcher might only investigate software written in a specific language and only look for a handful of the most common vulnerability types. In the case of VDBs, this involves how the VDB discovers and handles vulnerability disclosures from researchers and vendors. Perhaps the largest influence on selection bias is that many VDBs monitor a limited source of disclosures. It is not necessary to argue what “limited” means. Suffice it to say, no VDB is remotely complete on monitoring every source of vulnerability data that is public on the net. Lack of resources – primarily the time of those working on the database – causes a VDB to prioritize sources of information. With an increasing number of regional or country-based CERT groups disclosing vulnerabilities in their native tongue, VDBs have a harder time processing the information. Each vulnerability that is disclosed but does not end up in the VDB, ultimately factors into statistics such as “there were X vulnerabilities disclosed last year”.
Publication bias governs what portion of the research gets published. This ranges from “none”, to sparse information, to incredible technical detail about every finding. Somewhere between selection and publication bias, the researcher will determine how much time they are spending on this particular product, what vulnerabilities they are interested in, and more. All of this folds into what gets published. VDBs may discover a researcher’s disclosure, but then decide not to publish the vulnerability due to other criteria.
Abstraction bias is a term that we crafted to explain the process that VDBs use to assign identifiers to vulnerabilities. Depending on the purpose and stated goal of the VDB, the same 10 vulnerabilities may be given a single identifier by one database, and 10 identifiers by a different one. This level of abstraction is an absolutely critical factor when analyzing the data to generate vulnerability statistics. This is also the most prevalent source of problems for analysis, as researchers rarely understand the concept of abstraction, why it varies, and how to overcome it as an obstacle in generating meaningful statistics. Researchers will use whichever abstraction is most appropriate or convenient for them; after all, there are many different consumers for a researcher advisory, not just VDBs. Abstraction bias is also frequently seen in vendors, and occasionally researchers in the way they disclose one vulnerability multiple times, as it affects different software that bundles additional vendor’s software in it.
Measurement bias refers to potential errors in how a vulnerability is analyzed, verified, and catalogued. For example, with researchers, this bias might be in the form of failing to verify that a potential issue is actually a vulnerability, or in over-estimating the severity of the issue compared to how consumers might prioritize the issue. With vendors, measurement bias may affect how the vendor prioritizes an issue to be fixed, or in under-estimating the severity of the issue. With VDBs, measurement bias may also occur if analysts do not appropriately reflect the severity of the issue, or if inaccuracies are introduced while studying incomplete vulnerability disclosures, such as missing a version of the product that is affected by the vulnerability. It could be argued that abstraction bias is a certain type of measurement bias (since it involves using inconsistent “units of measurement”), but for the purposes of understanding vulnerability statistics, abstraction bias deserves special attention.
Measurement bias, as it affects statistics, is arguably the domain of VDBs, since most statistics are calculated using an underlying VDB instead of the original disclosures. As the primary sources of vulnerability data aggregation, several factors come into play when performing database updates.
Why Bias Matters, in Detail
These forms of bias can work together to create interesting spikes in vulnerability disclosure trends. To the VDB worker, they are typically apparent and sometimes amusing. To an outsider just using a data set to generate statistics, they can be a serious pitfall.
In August, 2008, a single researcher using rudimentary, yet effective methods for finding symlink vulnerabilities single handedly caused a significant spike in symlink vulnerability disclosures over the past 10 years. Starting in 2012 and continuing up to the publication of this paper, a pair of researchers have significantly impacted the number of disclosures in a single product. Not only has this caused a huge spike for the vulnerability count related to the product, it has led to them being ranked as two of the top vulnerability disclosers since January, 2012. Later this year, we expect there to be articles written regarding the number of supervisory control and data acquisition (SCADA) vulnerabilities disclosed from 2012 to 2013. Those articles will be based purely on vulnerability counts as determined from VDBs, likely with no mention of why the numbers are skewed. One prominent researcher who published many SCADA flaws has changed his personal disclosure policy. Instead of publicly disclosing details, he now keeps them private as part of a competitive advantage of his new business.
Another popular place for vulnerability statistics to break down is related to vulnerability severity. Researchers and journalists like to mention the raw number of vulnerabilities in two products and try to compare their relative security. They frequently overlook the severity of the vulnerabilities and may not note that while one product had twice as many disclosures, a significant percentage of them were low severity. Further, they do not understand how the industry-standard CVSSv2 scoring system works, or the bias that can creep in when using it to score vulnerabilities. Considering that a vague disclosure that has little actionable details will frequently be scored for the worst possible impact, that also drastically skews the severity ratings.
The forms of bias and how they may impact vulnerability statistics outlined in this paper are just the beginning. For each party involved, for each type of bias, there are many considerations that must be made. Accurate and meaningful vulnerability statistics are not impossible; they are just very difficult to accurately generate and disclaim.
Our 2013 BlackHat Briefings USA talk hopes to explore many of these points, outline the types of bias, and show concrete examples of misleading statistics. In addition, we will show how you can easily spot questionable statistics, and give some tips on generating and disclaiming good statistics.
About two weeks ago, another round of vulnerability stats got passed around. Like others before, it claims to use CVE to compare Apple iOS versus Android in an attempt to establish which is more secure based on “vulnerability counts”. The statistics put forth are basically meaningless, because like most people using a VDB to generate stats, they don’t fully understand their data source. This is one type of bias that enters the picture when generating statistics, and one of many points Steve Christey (MITRE/CVE) and I will be making next week at BlackHat (Wednesday afternoon).
As with other vulnerability statistics, I will debunk the latest by showing why the conclusions are not based on a solid understanding of vulnerabilities, or vulnerability data sources. The post is published on The Verge, written by ‘Mechanicix’. The results match last year’s Symantec Internet Security Threat Report (as mentioned in the comments), as well as the results published this year by Sourcefire in their paper titled “25 Years of Security Vulns“. In all three cases, they use the same data set (CVE), and do the same rudimentary counting to reach their results.
The gist of the finding is that Apple iOS is considerably less secure than Android, as iOS had 238 reported vulnerabilities versus the 27 reported in Android, based on CVE and illustrated through CVEdetails.com.
Total iOS Vulnerabilities 2007-2013: 238
Total Android Vulnerabilities 2009-2013: 27
Keeping in mind those numbers, if you look at the CVE entries that are included, a number of problems are obvious:
- We see that the comparison timeframes differ by two years. There are at least 3 vulnerabilities in Android SDK reported before 2009, two of which have CVEs (CVE-2008-0985 and CVE-2008-0986).
- These totals are based on CVE identifiers, which does not necessarily reflect a 1-to-1 vulnerability mapping, as they document. You absolutely cannot count CVE as a substitute for vulnerabilities, they are not the same.
- The vulnerability totals are incorrect due to using CVE, a data source that has serious gaps in coverage. For example, OSVDB has 71 documented vulnerabilities for Android, and we do not make any claims that our coverage is complete.
- The iOS results include vulnerabilities in WebKit, the framework iOS Safari uses. This is problematic for several reasons.
- First, that means Mechanicix is now comparing the Android OS to the iOS operating system and applications.
- Second, WebKit vulnerabilities account for 109 of the CVE results, almost half of the total reported.
- Third, if they did count WebKit intentionally then the numbers are way off as there were around 700 WebKit vulnerabilities reported in that time frame.
- Fourth, the default browser in Android uses WebKit, yet they weren’t counted against that platform.
- The results include 16 vulnerabilities in Safari itself (or in WebKit and just not diagnosed as such), the default browser.
- At least 4 of the 238 are vulnerabilities in Google Chrome (as opposed to WebKit) with no mention of iOS in the CVE.
- A wide variety of iOS applications are included in the list including Office Viewer, iMessage, Mail, Broadcom BCM4325 and BCM4329 Wi-Fi chips, Calendar, FreeType, libxslt, and more.
When you factor in all of the above, Android likely comes out on top for the number of vulnerabilities when comparing the operating systems. Once again, vulnerability statistics seem simple on the surface. When you consider the above, and further consider that there are likely more points that influence vulnerability counts, we see that it is anything other than simple.
Readers may recall that I blogged about a similar topic just over a month ago, in an article titled Advisories != Vulnerabilities, and How It Affects Statistics. In this installment, instead of “advisories”, we have “CVEs” and the inherent problems when using CVE identifiers in the place of “vulnerabilities”. Doing so is technically inaccurate, and it negatively influences statistics, ultimately leading to bad conclusions.
NSS Labs just released an extensive report titled “Vulnerability Threat Trends; A Decade in Review, Transition on the Way“, by Stefan Frei. While the report is interesting, and the fundamental methodology is sound, Frei uses a dataset that is not designed for true vulnerability statistics. Additionally, I believe that some factors that Frei attributes to trends are incorrect. I offer this blog as open feedback to bring additional perspective to the realm of vulnerability stats, which is a long ways from approaching maturity.
Vulnerabilities versus CVE
In the NSS Labs paper, they define a vulnerability as “a weakness in software that enables an attacker to compromise the integrity, availability, or confidentiality of the software or the data that it processes.” This is as good a definition as any. The key point here is a weakness, singular. What Frei fails to point out, is that the CVE dictionary is not a vulnerability database in the same sense as many others. It is a specialty database designed primarily to assign a unique identifier to a vulnerability, or a group of vulnerabilities, to coordinate tracking and discussion. While CVE says “CVE Identifiers are unique, common identifiers for publicly known information security vulnerabilities” , it is more important to note the way CVE abstracts, which is covered in great detail. From the CVE page on abstraction:
CVE Abstraction Content Decisions (CDs) provide guidelines about when to combine multiple reports, bugs, and/or attack vectors into a single CVE name (“MERGE”), and when to create separate CVE names (“SPLIT”).
This clearly denotes that a single CVE may represent multiple vulnerabilities. With that in mind, every statistic generated by NSS Labs for this report is not accurate, and their numbers are not reproduceable using any other vulnerability dataset (unless it too is only based on CVE data and does not abstract differently, e.g. NVD). This distinction puts the report’s statements and conclusions in a different light:
As of January 2013 the NVD listed 53,489 vulnerabilities ..
In the last ten years on average 4,660 vulnerabilities were disclosed per year ..
.. with an all-‐time high of 6,462 vulnerabilities counted in 2006 ..
The abstraction distinction means that these numbers aren’t just technically inaccurate (i.e. terminology), they are factually inaccurate (i.e. actual stats when abstracting on a per-vulnerability basis). In each case where Frei uses the term “vulnerability”, he really means “CVE”. When you consider that a single CVE may cover as many as 66 or more distinct vulnerabilities, it really invalidates any statistic generated using this dataset as he did. For example:
However, in 2012 alone the number of vulnerabilities increased again to a considerable 5,225 (80% of the all-‐time high), which is 12% above the ten-‐year average. This is the largest increase observed in the past six years and ends the trend of moderate declines since 2006.
Based on my explanation, what does 5,225 really mean? If we agree for the sake of argument, that CVE averages two distinct vulnerabilities per CVE assignment, that is now over 10,000 vulnerabilities. How does that in turn change any observations on trending?
The report’s key findings offer 7 high-level conclusions based on the CVE data. To put all of the above in more perspective, I will examine a few of them and use an alternate dataset, OSVDB, that abstracts entries on a per-vulnerability basis. With those numbers, we can see how the findings stand. NSS Labs report text is quoted below.
The five year long trend in decreasing vulnerability disclosures ended abruptly in 2012 with a +12% increase
Based on OSVDB data, this is incorrect. Both 2009 (7,879) -> 2010 (8,835) as well as 2011 (7,565) -> 2012 (8,919) showed an upward trend.
More than 90 percent of the vulnerabilities disclosed are moderately or highly critical – and therefore relevant
If we assume “moderately” is “Medium” criticality, as later defined in the report, is 4.0 -‐ 6.9 then OSVDB shows 57,373 entries that are CVSSv2 4.0 – 10.0, out of 82,123 total. That means 90% is considerably higher than we show. Note: we do not have complete CVSSv2 data for 100% of our entries, but we do have them for all entries affiliated with the ones Frei examined and more. If “moderately critical” and “highly critical” refer to different ranges, then they should be more clearly defined.
It is also important to note that this finding is a red herring, due to the way CVSS scoring works. A remote path disclosure in a web application scores a 5.0 base score (CVSS2#AV:N/AC:L/Au:N/C:P/I:N/A:N). This skews the scoring data considerably higher than many in the industry would agree with, as 5.0 is the same score you get for many XSS vulnerabilities that can have more serious impact.
9 percent of vulnerabilities disclosed in 2012 are extremely critical (with CVSS score>9.9) paired with low attack/exploitation complexity
This is another red herring, because any CVSS 10.0 score means that “low complexity” was factored in. The wording in the report implies that a > 9.9 score could be paired with higher complexity, which isn’t possible. Further, CVSS is scored for the worst case scenario when details are not available (e.g. CVE-2012-5895). Given the number of “unspecified” issues, this may seriously skew the number of CVSSv2 10.0 scores.
Finally, there was one other element to this report that was used in the overview, and later in the document, that is used to attribute a shift in disclosure trends. From the overview:
The parallel and massive drop of vulnerability disclosures by the two long established purchase programs iDefense VCP and TippingPoint ZDI indicate a transition in the way vulnerability and exploit information is handled in the industry.
I believe this is a case of “correlation does not mean causation“. While these are the two most recognized third-party bug bounty programs around, there are many variables at play here. In the bigger picture, shifts in these programs do not necessarily mean anything. Some of the factors that may have influenced disclosure numbers for those two programs include:
- There are more bug bounty programs available. Some may offer better price or incentive for disclosing through them, stealing business from iDefense/ZDI.
- Both companies have enjoyed their share of internal politics that affected at least one program. In 2012, several people involved in the ZDI program left the company to form their startup. It has been theorized that since their departure, ZDI has not built the team back up and that disclosures were affected as a result.
- ZDI had a small bout of external politics, in which one of their most prevalent bounty collectors (Luigi Auriemma) had a serious disagreement about ZDI’s handling of a vulnerability, as relates to Portnoy and Exodus. Auriemma’s shift to disclose via his own company would dramatically affect ZDI disclosure totals alone.
- Both of these companies have a moving list of software that they offer a bounty on. As it changes, it may result in spikes of disclosures via their programs.
Regardless, iDefense and ZDI represent a small percentage of overall disclosures, it is curious that Frei opted to focus on this so prominently as a reason for vulnerability trends changing without considering some influencing factors. Even during a good year, 2011 for example, iDefense (42) and ZDI (297) together accounted for 339 out of 7,565 vulnerabilities, only ~ 4.5% of the overall disclosures. There are many other trends that could just as easily explain relatively small shifts in disclosure totals. When making statements about trends in vulnerability disclosure and how it affects statistics, it isn’t something that should be done by casual observers. They simply miss a lot of the low-level details you glean on the day-to-day vulnerability handling and cataloging.
To be clear, I am not against using CVE/NVD data to generate statistics. However, when doing so, it is important that the dataset be explained and qualified before going into analysis. The perception and definition of what “a vulnerability” is changes based on the person or VDB. In vulnerability statistics, not all vulnerabilities are created equal.
I’ve written about the various problems with generating vulnerability statistics in the past. There are countless factors that contribute to, or skew vulnerability stats. This is an ongoing problem for many reasons. First, important numbers are thrown around in the media and taken as gospel, creating varying degrees of bias in administrators and owners. Second, these stats are rarely explained to show how they were derived. In short, no one shows their work, shows potential bias, caveats, or other issues that should be included as a responsible security professional. A recent article has highlighted this problem again. To better show why vulnerability stats are messy, but important, I will show you how it is trivial to skew numbers simply by using different criteria, along with several pitfalls that must be factored into any set of stats you generate. The fun part is that the word used to describe the differences can be equally nebulous and they are all valid, if properly disclaimed!
I noticed a Tweet from @SCMagazine about an article titled “The ghosts of Microsoft: Patch, present and future”. The article is by Alex Horan, security strategist, CORE Security and discusses Microsoft’s vulnerabilities this year. Reading down, the first line of the second paragraph immediately struck me as being incorrect.
Based on my count, there were 83 vulnerabilities announced by Microsoft over the past year. This averages out to a little more than six per month, a reasonable number of patches (and reboots) to apply to your systems over the course of a year.
It is difficult to tell if Horan means “vulnerabilities” or “patches”, as he appears to use the same word to mean both, when they are quite different. The use of ’83′ makes it very clear, Horan is referencing Microsoft advisories, not vulnerabilities. This is an important distinction as a single advisory can contain multiple vulnerabilities.
A cursory look at the data in OSVDB showed there were closer to 170 vulnerabilities verified by Microsoft in 2012. Doing a search to include references for “MS12″ (used in their advisory designation), 160 results. This is how it was easy to determine the number Horan used was inaccurate, or his wording was. If you generate statistics based on advisories versus independent vulnerabilities, results will vary greatly. To add a third perspective, we must also consider the total number of disclosed vulnerabilities in Microsoft products. This means ones that did not correspond to a Microsoft advisory (e.g. perhaps a KB only), did not receive a CVE designation, or were missed completely by the company. On Twitter, Space Rogue (@spacerog) asked about severity breakdowns over the last few years. Since that would take considerable time to generate, I am going to stay focused on 2012 as it demonstrates the issues. Hopefully this will give him a few numbers though!
If we look at the 2012 Microsoft advisories versus 2012 Microsoft CVE versus 2012 Microsoft total vulnerabilities, and do a percentage breakdown by severity, you can see heavy bias. We will use the following breakdown of CVSS scores to determine severity: 9 – 10 = critical, 7 – 8.9 = important, 4 – 6.9 = moderate, 0 – 3.9 = low.
|2012 Advisories (83)||35 (42.2%)||46 (55.4%)||2 (2.4%)||–|
|2012 CVE (160)||100 (62.5%)||18 (11.3%)||39 (24.4%)||3 (1.8%)|
|2012 Total (176)||101 (57.4%)||19 (10.8%)||41 (23.3%)||15 (8.5%)|
It isn’t easy to see the big shifts in totals in a chart, but it is important to establish the numbers involved when displaying any type of chart or visual representation. If we look at those three breakdowns using simple pie charts, the shifts become much more apparent:
The visual jump in critical vulnerabilities from the first to the second two charts is distinct. In addition, notice the jump from the first two charts to the third in regards to the low severity vulnerabilities and that they didn’t even make an appearance on the first chart. This is a simple example of how the “same” vulnerabilities can be represented, based on terminology and the source of data. If you want to get pedantic, there are additional considerations that must be factored into these vulnerabilities.
In no particular order, these are other points that should not only be considered, but disclaimed in any presentation of the data above. While it may seem minor, at least one of these points could further skew vulnerability counts and severity distribution.
- MS12-080 Only contains 1 CVE if you look at immediate identifiers, but also contains 2 more CVE in the fine print related to Oracle Outside In, which is used by the products listed in the advisory.
- MS12-058 actually has no immediate CVEs! If you read the fine print, it actually covers 13 vulnerabilities. Again, these are vulnerabilities in Oracle Outside In, which is used in some Microsoft products.
- Of the 176 Microsoft vulnerabilities in 2012, as tracked by OSVDB, 10 do not have CVE identifiers assigned.
- OSVDB 83750 may or may not be a vulnerability, as it is based on a Microsoft KB with uncertain wording. Vague vulnerability disclosures can skew statistics.
- Most of these CVSS scores are taken from the National Vulnerability Database (NVD). NVD outsources CVSS score generation to junior analysts from a large consulting firm. Just as we occasionally have mistakes in our CVSS scores, so does NVD. Overall, the number of scores that have serious errors are low, but they can still introduce a level of error into statistics.
- One of the vulnerabilities (OSVDB 88774 / CVE-2012-4792) has no formal Microsoft advisory, because it is a 0-day that was just discovered two days ago. There will almost certainly be a formal Microsoft advisory in January 2013 that covers it. This highlights a big problem with using vendor advisories for any statistic generation. Vendors generally release advisories when their investigation of the issue has completed, and a formal solution is made available. Generating statistics or graphics off the same vulnerabilities, but using disclosure versus solution date will give two different results.
These are just a few ways that statistics can be manipulated, often by accident, and why presenting as much data and explanation is beneficial to everyone. I certainly hope that SCMagazine and/or CORE will issue a small correction or explanation as to the what the “83″ number really represents.
Back in early January, I issued a challenge to donate to OSF’s Winter Fundraiser for every new vulnerability pushed into OSVDB. Two of the three months have come and gone, and even though January was a little more productive than February in terms of new vulnerabilities, the moderation team is still making good progress:
2010-02-01: 13 vulns pushed, 133 vulns updated
2010-02-02: 31 vulns pushed, 79 vulns updated
2010-02-03: 25 vulns pushed, 145 vulns updated
2010-02-04: 21 vulns pushed, 31 vulns updated
2010-02-05: 25 vulns pushed, 153 vulns updated
2010-02-06: 8 vulns pushed, 76 vulns updated
2010-02-07: 3 vulns pushed, 278 vulns updated
2010-02-08: 27 vulns pushed, 64 vulns updated
2010-02-09: 47 vulns pushed, 159 vulns updated
2010-02-10: 37 vulns pushed, 160 vulns updated
2010-02-11: 16 vulns pushed, 59 vulns updated
2010-02-12: 27 vulns pushed, 128 vulns updated
2010-02-13: 10 vulns pushed, 51 vulns updated
2010-02-14: 4 vulns pushed, 112 vulns updated
2010-02-15: 12 vulns pushed, 81 vulns updated
2010-02-16: 23 vulns pushed, 181 vulns updated
2010-02-17: 28 vulns pushed, 235 vulns updated
2010-02-18: 25 vulns pushed, 119 vulns updated
2010-02-19: 43 vulns pushed, 261 vulns updated
2010-02-20: 11 vulns pushed, 126 vulns updated
2010-02-21: 2 vulns pushed, 34 vulns updated
2010-02-22: 3 vulns pushed, 64 vulns updated
2010-02-23: 41 vulns pushed, 221 vulns updated
2010-02-24: 37 vulns pushed, 112 vulns updated
2010-02-25: 15 vulns pushed, 138 vulns updated
2010-02-26: 17 vulns pushed, 146 vulns updated
2010-02-27: 9 vulns pushed, 17 vulns updated
2010-02-28: 8 vulns pushed, 24 vulns updated
With 568 new vulnerabilities pushed in February, we’re now up to 1,223 new entries for 2010; personally, I’d like to see that number hit at least 2,000 by the end of March (3,000 may be out of reach, but never say never), but that will depend on the time and efforts of our moderation team and the amount of vulnerabilities uncovered by our multiple reference sources. Please remember that I will donate $0.50 to OSF for every new vulnerability pushed into the database through April 1 (and no, there will not be an April Fools announcement saying that the challenge has been called off), and we’re hoping to obtain some matching offers to help offset the costs of maintaining the database. A special “thank you” goes to all parties who have offered to match the challenge so far, and we hope others who find OSVDB to be a valuable resource can jump in and help us out as well.
31 more days for the challenge… and away… we… go.
Elinor Mills wrote an article titled Firefox, Adobe top buggiest-software list. In it, she quotes Qualys as providing vulnerability statistics for Mozilla, Adobe and others. Qualys states:
The number of vulnerabilities in Adobe programs rose from 14 last year to 45 this year, while those in Microsoft software dropped from 44 to 41, according to Qualys. Internet Explorer, Windows Media Player and Microsoft Office together had 30 vulnerabilities.
This caught my attention immediately, as I know I have mangled more than 45 Adobe entries this year.
First, the “number of vulnerabilities” game will always have wiggle room, which has been discussed before. A big factor for statistic discrepancy when using public databases is the level of abstraction. CVE tends to bunch up vulnerabilities in a single CVE, where OSVDB tends to break them out. Over the past year, X-Force and BID have started abstracting more and more as well.
Either way, Qualys cited their source, NVD, which is entirely based on CVE. How they got 45 vulns in “Adobe programs” baffles me. My count says 97 Adobe vulns, 95 of them have CVEs assigned to them (covered by a total of 93 CVEs). OSVDB abstracted the entries like CVE did for the most part, but split out CVE-2009-1872 as distinct XSS vulnerabilities. OSVDB also has two entries that do not have CVE, 55820 and 56281.
Where did Qualys get 45 if they are using the same CVE data set OSVDB does? This discrepancy has nothing to do with abstraction, so something else appears to be going on. Doing a few more searches, I believe I figured it out. Searching OSVDB for “Adobe Reader” in 2009 yields 44 entries, one off from their cited 45. That could be easily explained as OSVDB also has 9 “Adobe Multiple Products” entries that could cover Reader as well. This may in turn be a breakdown where Qualys or Mills did not specify “Adobe Software” (cumulative, all software they release) versus “Adobe Reader” or some other specific software they release.
Qualys tallied 102 vulnerabilities that were found in Firefox this year, up from 90 last year.
What is certainly a discrepancy due to abstraction, OSVDB has 74 vulnerabilities specific to Mozilla Firefox (two without CVE), 11 for “Mozilla Multiple Browsers” (Firefox, Seamonkey, etc) and 81 for “Mozilla Multiple Products” (Firefox, Thunderbird, etc). While my numbers are somewhat anecdotal, because I cannot remember every single entry, I can say that most of the ‘multiple’ vulnerabilities include Firefox. That means OSVDB tracked as many as, but possibly less than, 166 vulnerabilities in Firefox.
Microsoft software dropped from 44 to 41, according to Qualys. Internet Explorer, Windows Media Player and Microsoft Office together had 30 vulnerabilities.
According to my searches on OSVDB, we get the following numbers:
- 234 vulnerabilities in Microsoft, only 4 without CVE
- 50 vulnerabilities in MSIE, all with CVE
- 4 vulnerabilities in Windows Media Player, 1 without CVE
- 52 vulnerabilities in Office, all with CVE. (based on “Office” being Excel, Powerpoint, Word and Outlook.
- 92 vulnerabilities in Windows, only 2 without CVE
When dealing with vulnerability numbers and statistics, like anything else, it’s all about qualifying your numbers. Saying “Adobe Software” is different than “Adobe Acrobat” or “Adobe Reader” as the software installation base is drastically different. Given the different levels of abstraction in VDBs, it is also equally important to qualify what “a vulnerability” (singular) is. Where CVE/NVD will group several vulnerabilities in one identifier, other databases may abstract and assign unique identifiers to each distinct vulnerability.
Qualys, since you provided the stats to CNet, could you clarify?
Last week, OSVDB enhanced the search results capability by adding a considerable amount of filter capability, a simple “results by year” graph and export capability. Rather than draft a huge walkthrough, open a search in a new tab and title search for “microsoft windows”.
As always, the results will display showing the OSVDB ID, disclosure date and OSVDB title. On the left however, are several new options. First, a summary graph will be displayed showing the number of vulnerabilities by year, based on your search results. Next, you can toggle the displayed fields to add CVE, CVSSv2 score and/or the percent complete. The percent complete refers to the status of the OSVDB entry, and how many fields have been completed. Below that are one click filters that let you further refine your search results by the following criteria:
- Reference Type – only show results that contain a given type of reference
- Category – show results based on the vulnerability category
- Disclosure Year – refine results by limiting to a specific year
- CVSS Score – only show entries that are scored in a given range
- Percent Complete – filter results based on how complete the OSVDB entry is
Once you have your ideal search results, you can then export them to XML, custom RSS feed or CSV. The export will only work for the first 100 results. If you need a bigger data set to work with,
we encourage you to download the database instead.
With the new search capability, you should be able to perform very detailed searches, easily manipulate the results and even import them into another application or presentation. If you have other ideas of how a VDB search can be refined to provide more flexibility and power, contact us!
This is a question OSVDB moderators, CVE staff and countless other VDB maintainers have asked. Today, Gunter Ollmann with IBM X-Force released his research trying to answer this question. Before you read on, I think this research is excellent. The relatively few criticisms I bring up are not the fault of Ollmann’s research and methodology, but the fault of his VDB of choice (and *every* other VDB) not having a complete data set.
Skimming his list, my first thought was that he was missing someone. Doing a quick search of OSVDB, I see that Lostmon Lords (aka ‘lostmon’) has close to 350 vulnerabilities published. How could the top ten list miss someone like this when his #10 only had 147? Read down to Ollmann’s caveat and there is a valid point, but sketchy wording. The data he is using relies on this information being public. As the caveat says though, “because they were disclosed on non-public lists” implies that the only source he or X-Force are using are mail lists such as Bugtraq and Full-disclosure. Back in the day, that was a pretty reliable source for a very high percentage of vulnerability information. In recent years though, a VDB must look at other sources of information to get a better picture. Web sites such as milw0rm get a steady stream of vulnerability information that is frequently not cross-posted to mail lists. In addition, many researchers (including lostmon) mail their discoveries directly to the VDBs and bypass the public mail lists. If researchers mail a few VDBs and not the rest, it creates a situation where the VDBs must start watching each other. This in turn leads to “VDB inbreeding” that Jake and I mentioned at CanSecWest 2005, which is a necessary evil if you want more data on vulnerabilities.
In May of 2008, OSVDB did the same research Ollmann did and we came up with different results. This was based on data we had available, which is still admittedly very incomplete (always need data manglers.) So who is right? Neither of us. Well, perhaps he is, perhaps we are, but unfortunately we’re both working with incomplete databases. As a matter of my opinion, I believe OSVDB has better coverage of vulnerabilities, while X-Force clearly has better consistency in their data and a fraction of the gaps we do.
Last, this data is interesting as is, but would be really fascinating if it was mixed with ‘researcher confidence’ (a big thing of Steve Christey/CVE and myself), in which we track a researcher’s track record for accuracy in disclosure. Someone that disclosed 500 vulnerabilities last year with a 10% error rate should not be above someone who found 475 with a 0% error rate. In addition, as Ollmann’s caveat says, these are pure numbers and do not factor in hundreds of XSS versus remote code execution in operating system default install services. Having a weight system that can be applied to a vulnerability (e.g., XSS = 3, SQLi = 7, remote code exec = 9) that is then factored into researcher could move beyond “who discovered the most” and perhaps start to answer “who found the most respectable vulnerabilities”.