[Sent to Ashley directly via email. Posting for the rest of the world as yet another example of how vulnerability statistics are typically done poorly. In this case, a company that does not aggregate vulnerabilities themselves, and has no particular expertise in vulnerability metrics weighs in on 2013 “statistics”. They obviously did not attend Steve Christey and my talk at BlackHat last year titled “Buying Into the Bias: Why Vulnerability Statistics Suck“. If we do this talk again, we have a fresh example to use courtesy of Skybox.]
[Update: SkyboxSecurity has quickly written a second blog in response to this one, clarifying a lot of their methodology. No word from Carman or SC Magazine. Not surprised; they have a dismal history as far as printing corrections, retractions, or even addressing criticism.]
In your recent article “Microsoft leads vendors with most critical vulnerabilities“, you cite research that is factually incorrect, and I fully expect a retraction to be printed. In fact, the list of errata in this article is considerably longer than the article itself. Some of this may seem to be semantics to you, but I assure you that in our industry they are anything but. Read down, where I show you how their research is *entirely wrong* and Microsoft is not ‘number one’ here.
1. If Skybox is only comparing vendors based on their database, as maps to CVE identifiers, then their database for this purpose is nothing but a copy of CVE. It is important to note this because aggregating vulnerability information is considerably more demanding than aggregating a few databases that do that work for you.
2. You say “More than half of the company’s 414 vulnerabilities were critical.” First, you do not disclaim that this number is limited to 2013 until your last paragraph. Second, Microsoft had 490 disclosed vulnerabilities in 2013 according to OSVDB.org, apparently not one of the “20” sources Skybox checked. And we don’t claim to have all of the disclosed vulnerabilities.
3. You cite “critical vulnerability” and refer to Microsoft’s definition of that as “one that allows code execution without user interaction.” Yet Skybox did not define ‘critical’. This is amateur hour in the world of vulnerabilities. For example, if Microsoft’s definition were taken at face value, then code execution in a sandbox would still qualify, while being considerably less severe than without. If you go for what I believe is the ‘spirit’ of the research, then you are talking about vulnerabilities with a CVSS score of 10.0 (network, no user interaction, no authentication, full code execution to impact confidentiality / integrity / availability completely), then Microsoft had 10 vulnerabilities. Yes, only 10. If you add the ‘user interaction’ component, giving it a CVSS score of 9.3, they had 176. That is closer to the ‘216’ Skybox is claiming. So again, how can you cite their research when they don’t define what ‘critical’ is exactly? As we frequently see, companies like to throw around vulnerability statistics but give no way to reproduce their findings.
4. You say, “The lab’s findings weren’t particularly surprising, considering the vendors’ market shares. Microsoft, for instance, is the largest company and its products are the most widely used.” This is completely subjective and arbitrary. While Microsoft captures the desktop OS market share, they do not capture the browser share for example. Further, like all of the vendors in this study, they use third-party code from other people. I point this line out because when you consider that another vendor/software is really ‘number one’, it makes this line seem to
be the basis of an anecdotal fallacy.
5. You finish by largely parroting Skybox, “Skybox analyzed more than 20 sources of data to determine the number of vulnerabilities that occurred in 2013. The lab found that about 700 critical vulnerabilities occurred in 2013, and more than 500 of them were from four vendors.” We’ve covered the ‘critical’ fallacy already, as they never define what that means. I mentioned the “CVE” angle above. Now, I question why you didn’t challenge them further on this. As a security writer, the notion that “20” sources has any meaning in that context should be suspect. Did they simply look to 20 other vulnerability databases (that do all the initial data aggregation) and then aggregate them? Did they look at 20 unique sources of vulnerability information themselves (e.g. the MS / Adobe / Oracle advisory pages)? This matters greatly. Why? OSVDB.org monitors over 1,500 sources for vulnerability information. Monitoring CVE, BID, Secunia, and X-Force (other large vulnerability databases) is considered to be 4 of those sources. So what does 20 mean exactly? To me, it means they are amateurs at best.
6. Jumping to the Skybox blog, “Oracle had the highest total number of vulnerabilities at 568, but only 18 percent of their total vulnerabilities were deemed critical.” This is nothing short of a big red warning flag to anyone familiar with vulnerabilities. This line alone should have made you steer clear from their ‘research’ and demanded you challenge them. It is well known that Oracle does not follow the CVSS standards when scoring a majority of their vulnerabilities. It has been shown time and time again that what they scored is not grounded in reality, when compared to the
researcher report that is eventually released. Every aspect of a CVSS score is frequently botched. Microsoft and Adobe do not have that reputation; they are known for generally providing accurate scoring. Since that scoring is the quickest way to determine criticality, it is important to note here.
7. Now for what you are likely waiting for. If not Microsoft, who? Before I answer that, let me qualify my statements since no one else at this table did. Based on vulnerabilities initially disclosed in 2013, that have a CVSS score of 10.0 (meaning full remote code execution without user interaction), we get this:
Two vendors place higher than Microsoft based on this. Now, if we consider “context-dependent code execution”, meaning that user interaction is required but it leads to full code execution (e.g. click this malicious PDF/DOC/GIF and we base that on a 9.3 CVSS score (CVSS2#AV:N/AC:M/Au:N/C:C/I:C/A:C”)) or full remote code execution (CVSS2#AV:N/AC:L/Au:N/C:C/I:C/A:C) we get the following:
I know, Microsoft is back on top. But wait…
OSF / OSVDB.org
If you didn’t catch the tweet, OSVDB pushed its 100,000th vulnerability on December 25, 2013.
This goal was on our minds the last quarter of 2013, with the entire team working to push an average of 36 vulnerabilities a day to reach it. That is quite the difference from when I started on the project ten years ago, where one day might bring 10 new vulnerabilities. Factor in the years where only one or two people worked on the project, and 100k in 10 years is substantial. In addition to the numbers, we track a considerable amount more data about each vulnerability than we did at the start, and every entry that goes out is 100% complete.
While this is a landmark number of sorts, as no other vulnerability database has that many entries, it is still a bit arbitrary to us. That’s because we know there are tens of thousands more vulnerabilities out there, already disclosed in some manner, that are not in the database yet. As time permits when doing our daily scrapes for new vulnerabilities, we work on backfilling the previous years. It is a bit scary to know that there are so many vulnerabilities out there that are not cataloged by any vulnerability database. It doesn’t matter if they were disclosed weeks ago, or decades ago. Thousands of pieces of software are used as libraries in bigger packages these days, and what may seem like a harmless crash to one could lead to code execution when bundled with additional software. It is critical that companies have vulnerability information available to them, even if it is older. Better late than never may sound rough, but it certainly is the truth.
In 2014, the only goal we have right now is to continue pushing out high-quality data that comes from a comprehensive list of sources. Over 1,500 sources with more being added every day actually. Now that the project is funded by Risk Based Security, we have an entire team that ensures this coverage. Now more than ever before, we’re in the position to slowly make the goal of cataloging every public vulnerability a reality.
We are occasionally asked how many people work on OSVDB. This question comes from those familiar with the project, and potential customers of our vulnerability intelligence feed. Back in the day, I had no problem answering it quickly and honestly. For years we limped along with one “full time” (unpaid volunteer) and a couple part-timers (unpaid volunteers), where those terms were strictly based on the hours worked. Since the start of 2012 though, we have had actual full-timers doing daily work on the project. This comes through the sponsorship provided by Risk Based Security (RBS), who also provided us with a good amount of developer time and hosting resources. Note that we are also frequently asked how much data comes from the community, to which we giggle and answer “virtually none” (less than 0.01%).
These days however, I don’t like to answer that question because it frequently seems to be a recipe for critique. For example, on one potential client call we were asked how many employees RBS had working on the offering. I answered honestly, that it was only three at time, because that was technically true. That didn’t represent the number of bodies as one was full-time but not RBS, and two were not full time. Before that could be qualified the potential client scoffed loudly, “there is no way you do that much with so few people”. Despite explaining that we had more than three people, I simply offered for them to enjoy a 30 day free trial of our data feed. Let the data answer his question.
To this day, if we say we have #lownumber, we get the response above. If we say we have #highnumber that includes part timers and drive-by employees (that are not tasked with this work but can dabble if they like), then we face criticism that we don’t output enough. Yes, despite us aggregating and producing over twice as much content as any of our competitors, we face that silly opinion. The number of warm bodies also doesn’t speak to the skill level of everyone involved. Two of our full time workers (one paid, one unpaid) have extensive history managing vulnerability databases and have continually evolved the offerings over the years. While most VDBs look the same as they did 10 years ago, OSVDB has done a lot to aggregate more data and more meta-data about each vulnerability than anyone else. We have been ahead of the curve at almost every turn, understanding and adapting to the challenges and pitfalls of VDBs.
So to officially answer the question, how many people work on this project? We have just enough. We make sure that we have the appropriate resources to provide the services offered. When we get more customers, we’ll hire more people to take on the myriad of additional projects and data aggregation we have wanted to do for years. Data that we feel is interesting and relevant, but no one is asking for yet. Likely because they haven’t thought of it, or haven’t realized the value of it. We have a lot more in store, and it is coming sooner than later now that we have the full support of RBS. If you are using any other vulnerability intelligence feed, it is time to consider the alternative.
In our pursuit of a more complete historical record of vulnerabilities, we’re offering a bounty! We don’t want your 0-day really. OK sure we do, but we know you are stingy with that, so we’ll settle on your ~ 12,775 day exploits!
First, the bounty. This is coming out my pocket since it is legacy and doesn’t immediately benefit people using us as a vulnerability feed. As such, this isn’t going to be a profit center for you. In addition to the personal satisfaction of helping preserve history, shout outs on this blog and multiple Twitter feeds, I will send you something. Want a gift card for Amazon? Something else I have that you want? I’ll make my best effort to make it reasonably worth your while. I know it isn’t a cool $1,337 Google style unfortunately, but I will try!
Now, what am I after. Not “a” vulnerability, but any of several lists of vulnerabilities from decades ago. These were maintained in the 1980’s most likely, one of which was internal at the time. I am hoping that given the time that has passed, and that the vulnerabilities have long since been patched and most products EOL’d, they can be disclosed. If you don’t have a copy but know someone might, send me a virtual introduction please! Any lead that results in me getting my hands on a list will be rewarded in some fashion as well. If you have a copy but it is buried in a box in the garage, let me know. I will see about traveling to help you dig through junk to find it. Seriously, that is how bad I want these historic lists!
- The Unix Known Problem List (this was not one of the vendor-specific lists, but those may be groovy)
- UC Santa Cruz hack method list
- Mt. Xinu bug list (later than 4.2 or with more details than this copy)
- Matt Bishop’s UNIX Hole List
- Sun Microsystems Bug-List (internal at the time no doubt)
- ISIS mail list archive (one run by Andrew Burt in 80’s)
- Bjorn Satedevas’ systems administration mailing list archive
- The “inner” Zardoz mail list archive (split from the main one, less members)
Any public-referenced vulnerability before 1980 that we do not have in the database. I know there has to be more out there, help us find them!
Bonus bonus bounty (for SCADA types):
Any SCADA or ICS vulnerability before 1985-06-01!
That’s it! Pretty simple, but may require some digging mentally or physically.
When referencing vulnerabilities in your products, you have a habit of only using an internal tracking number that is kept confidential between the reporter (e.g. ICS-CERT, ZDI) and you. For example, from your HotFix page (that requires registration):
WI2815: Directory Traversal Buffer overflow. Provided and/or discovered by: ICS-CERT ticket number ICS-VU-579709 created by Anthony …
The ICS-CERT ticket number is assigned as an internal tracking ID while the relevant parties figure out how to resolve the issue. Ultimately, that ticket number is not published by ICS-CERT. I have already sent a mail to them suggesting they include it in advisories moving forward, to help third parties match up vulnerabilities to fixes to initial reports. Instead of using that, you should use the public ICS-CERT advisory ID. The details you provide there are not specific enough to know which issue this corresponds to.
In another example:
WI2146: Improved the Remote Agent utility (CEServer.exe) to implement authentication between the development application and the target system, to ensure secure downloading, running, and stopping of projects. Also addressed problems with buffer overrun when downloading large files. Credits: ZDI reports ZDI-CAN-1181 and ZDI-CAN-1183 created by Luigi Auriemma
In this case, these likely correspond to OSVDB 77178 and 77179, but it would be nice to know that for sure. Further, we’d like to associate those internal tracking numbers to the entries but vendors do not reliably put them in order, so we don’t know if ZDI-CAN-1181 corresponds to the first or second.
WI1944: ISSymbol Virtual Machine buffer overflow Provided and/or discovered by: ZDI report ZDI-CAN-1341 and ZDI-CAN-1342
In this case, you give two ZDI tracking identifiers, but only mention a single vulnerability. ZDI has a history of abstracting issues very well. The presence of two identifiers, to us, means there are two distinct vulnerabilities.
This is one of the primary reasons CVE exists, and why ZDI, ICS-CERT, and most vendors now use it. In most cases, these larger reporting bodies will have a CVE number to share with you during the process, or if not, will have one at the time of disclosure.
Like your customers do, we appreciate clear information regarding vulnerabilities. Many large organizations will use a central clearing house like ours for vulnerability alerting, rather than trying to monitor hundreds of vendor pages. Helping us to understand your security patches in turn helps your customers.
Finally, adding a date the patch was made available will help to clarify these issues and give another piece of information that is helpful to organizations.
Thank you for your consideration in improving your process!
Anyone who knows me in the context of vulnerability databases will find this post a tad shocking, even if they have endured my rants about it before.
For the first time ever, I am making it policy that we will no longer put any priority on Vulnerability Labs advisories. For those unfamiliar with the site, it is run by Benjamin Kunz Mejri who now has a new company Evolution Security.
If you read that web site, and even a history of his/VL disclosures, it looks impressive on the surface. Yes, they have found some legitimate vulnerabilities, even in high-profile vendors. Most, if not all, are pedestrian web application vulnerabilities such as cross-site scripting, traversals, or file upload issues. More complex vulnerabilities like overflows typically end up being what we call “self hacks”, and do not result in the crossing of privilege boundaries. Many of their published vulnerabilities require excessive conditions and offer no real exploit scenario.
During the past 10 months, I know of three other vulnerability databases that officially gave up on adding their advisories. Nothing public, but the internal memo was “don’t bother”. OSVDB was the holdout. We did our best to keep up with their stream of horrible advisories. I personally offered to help them re-write and refine their advisory process several times. I started out nicely, giving a sincere offer of my time and experience, and it went unanswered. I slowly escalated, primarily on Twitter, giving them grief over their disclosures. Eventually, their advisories became nothing but an annoyance and incredible time sink. Then I got ugly, and I have been to this day. No, not my proudest moment, but I stand by it 100%.
As of tonight, we are giving in as well. Vulnerability Lab advisories represent too much of a time sink, trying to decipher their meaning, that they simply aren’t worth adding. For cases where the software is more notable, we will continue to slam our head against the wall and figure them out. For the rest, they are getting deprioritized in the form of a “to do when we run out of other import sources”. Since we monitor over 1,100 sources including blogs, web sites, changelogs, and bug trackers, this is not happening for a long time.
I truly regret having to do this. One of my biggest joys of running a vulnerability database is in cataloging all the vulnerabilities. ALL OF THEM.
So this also serves as my final offer Benjamin. Search the VDBs out there and notice how few of your advisories end up in them. Think about why that is. If you are as smart as you think you are, you will choke down your pride and accept my offer of help. I am willing to sink a lot of time into helping you improve your advisories. This will in turn help the rest of the community, and what I believe are your fictitious customers. As I have told you several times before, there is no downside to this for you, just me. I care about helping improve security. Do you?
Tonight, shortly before retiring from a long day of vulnerability import, I caught a tweet mentioning a web site about reporting vulnerabilities. Created on 15-aug-2013 per whois, the footer shows it was written by Fraser Scott, aka @zeroXten on Twitter.
This time, the web site is directly related to what we do. I want to be very clear here; I like the goal of this site. I like the simplistic approach to helping the reader decide which path is best for them. I want to see this site become the top result when searching for “how do I disclose a vulnerability?” This commentary is only meant to help the author improve the site. Please, take this advice to heart, and don’t hesitate if you would like additional feedback. [Update: After starting this blog last night, before publishing this morning, he already reached out. Awesome.]
Under the ‘What’ category, there are three general disclosure options:
NON DISCLOSURE, RESPONSIBLE DISCLOSURE, and FULL DISCLOSURE
First, you are missing a fourth option of ‘limited disclosure’. Researchers can announce they have found a vulnerability in given software, state the implications, and be done with it. Public reports of code execution in some software will encourage the vendor to prioritize the fix, as customers may begin putting pressure on them. Adding a video showing the code execution reinforces the severity. It often doesn’t help a VDB like ours, because such a disclosure typically doesn’t have enough actionable information. However, it is one way a researcher can disclose, and still protect themselves.
Second, “responsible”? No. The term was possibly coined by Steve Christey, further used by Russ Cooper, that was polarized by Cooper as well as Scott Culp at Microsoft (“Information Anarchy”, really?), in a (successful) effort to brand researchers as “irresponsible” if they don’t conform to vendor disclosure demands. The appropriate term more widely recognized, and fair to both sides, is that of “coordinated” disclosure. Culp’s term forgets that vendors can be irresponsible if they don’t prioritize critical vulnerabilities while customers are known to be vulnerable with public exploit code floating about. Since then, Microsoft and many other companies have adopted “coordinated” to refer to the disclosure process.
Under the ‘Who’ category, there are more things to consider:
SEND AN EMAIL
These days, it is rare to see domains following RFC-compliant addresses. That is a practice mostly lost to the old days. Telling readers to try to “Contact us” tab/link that invariably shows on web pages is better. Oh wait, you do that. However, that comes well after the big header reading TECHNICAL SUPPORT which may throw people off.
As a quick side note: “how to notifying them of security issues”. This is one of many spelling or grammar errors. Please run the text through a basic grammar checker.
Under the ‘How’ category:
This is excellent advice, except that using Tor bit since there are serious questions about the security/anonymity of it. If researchers are worried, they should look at a variety of options including using a coffee shop’s wireless, hotel wireless, etc.
This is also a great point, but more to the point, make sure your mail is polite and NOT THREATENING. Don’t threaten to disclose on your own timeline. See how the vendor handles the vulnerability report without any indication of disclosing it. Give them benefit of the doubt. If you get hints they are stalling at some point, then gently suggest it may be in the best interest of their customers to disclose. Remind them that vulnerabilities are rarely discovered by a single person and that they can’t assume you are the only one who has found it. You are just the only one who apparently decided to help the vendor.
Post to Full-Disclosure sure, or other options that may be more beneficial to you. Bugtraq has a history of stronger moderation, they tend to weed out crap. Send it directly to vulnerability databases and let them publish it anonymously. VDBs like Secunia generally validate all vulnerabilities before posting to their database. That may help you down the road if your intentions are called into question. Post to the OSS-security mail list if the vulnerability is in open-source software, so you get the community involved. For that list, getting a CVE identifier and having others on the list verifying or sanity checking your findings, it gives more positive attention to the technical issues instead of the politics of disclosure.
Using a bug bounty system is a great idea as it keeps the new researcher from dealing with disclosure politics generally. Let people experienced with the process, who have an established relationship and history with the vendor handle it. However, don’t steer newcomers to ZDI immediately. In fact, don’t name them specifically unless you have a vested interest in helping them, and if so, state it. Instead, break it down into vendor bug bounty programs and third-party programs. Provide a link to Bugcrowd’s excellent resource on a list of current bounty programs.
The fine print of course. Under CITATIONS, I love that you reference the Errata legal threats page, but this should go much higher on the page. Make sure new disclosers know the potential mess they are getting into. We know people don’t read the fine print. This could also be a good lead-in to using a third-party bounty or vulnerability handling service.
It’s great that you make this easy to share with everyone and their dog, but please consider getting a bit more feedback before publishing a site like this. It appears you did this in less than a day, when an extra 24 hours shows you could have made a stronger offering. You are clearly eager to make it better. You have already reached out to me, and likely Steve Christey if not others. As I said, with some edits and fix-ups, this will be a great resource.
Last week, Steve Christey and I gave a presentation at Black Hat Briefings 2013 in Las Vegas about vulnerability statistics. We submitted a brief whitepaper on the topic, reproduced below, to accompany the slides that are now available.
Buying Into the Bias: Why Vulnerability Statistics Suck
By Steve Christey (MITRE) and Brian Martin (Open Security Foundation)
July 11, 2013
Academic researchers, journalists, security vendors, software vendors, and professional analysts often analyze vulnerability statistics using large repositories of vulnerability data, such as “Common Vulnerabilities and Exposures” (CVE), the Open Sourced Vulnerability Database (OSVDB), and other sources of aggregated vulnerability information. These statistics are claimed to demonstrate trends in vulnerability disclosure, such as the number or type of vulnerabilities, or their relative severity. Worse, they are typically misused to compare competing products to assess which one offers the best security.
Most of these statistical analyses demonstrate a serious fault in methodology, or are pure speculation in the long run. They use the easily-available, but drastically misunderstood data to craft irrelevant questions based on wild assumptions, while never figuring out (or even asking the sources about) the limitations of the data. This leads to a wide variety of bias that typically goes unchallenged, that ultimately forms statistics that make headlines and, far worse, are used to justify security budget and spending.
As maintainers of two well-known vulnerability information repositories, we’re sick of hearing about research that is quickly determined to be sloppy after it’s been released and gained public attention. In almost every case, the research casts aside any logical approach to generating the statistics. They frequently do not release their methodology, and they rarely disclaim the serious pitfalls in their conclusions. This stems from their serious lack of understanding about the data source they use, and how it operates. In short, vulnerability databases (VDBs) are very different and very fickle creatures. They are constantly evolving and see the world of vulnerabilities through very different glasses.
This paper and its associated presentation introduce a framework in which vulnerability statistics can be judged and improved. The better we get about talking about the issues, the better the chances of truly improving how vulnerability statistics are generated and interpreted.
Bias, We All Have It
Bias is inherent in everything humans do. Even the most rigorous and well-documented process can be affected by levels of bias that we simply do not understand are working against us. This is part of human nature. As with all things, bias is present in the creation of the VDBs, how the databases are populated with vulnerability data, and the subsequent analysis of that data. Not all bias is bad; for example, VDBs have a bias to avoid providing inaccurate information whenever possible, and each VDB effectively has a customer base whose needs directly drive what content is published.
Bias comes in many forms that we see as strongly influencing vulnerability statistics, via a number of actors involved in the process. It is important to remember that VDBs catalog the public disclosure of security vulnerabilities by a wide variety of people with vastly different skills and motivations. The disclosure process varies from person to person and introduces bias for sure, but even before the disclosure occurs, bias has already entered the picture.
Consider the general sequence of events that lead to a vulnerability being cataloged in a VDB.
- A researcher chooses a piece of software to examine.
- Each researcher operates with a different skill set and focus, using tools or techniques with varying strengths and weaknesses; these differences can impact which vulnerabilities are capable of being discovered.
- During the process, the researcher will find at least one vulnerability, often more.
- The researcher may or may not opt for vendor involvement in verifying or fixing the issue.
- At some point, the researcher may choose to disclose the vulnerability. That disclosure will not be in a common format, may suffer from language barriers, may not be technically accurate, may leave out critical details that impact the severity of the vulnerability (e.g. administrator authentication required), may be a duplicate of prior research, or introduce a number of other problems.
- Many VDBs attempt to catalog all public disclosures of information. This is a “best effort” activity, as there are simply too many sources for any one VDB to monitor, and accuracy problems can increase the expense of analyzing a single disclosure.
- If the VDB maintainers see the disclosure mentioned above, they will add it to the database if it meets their criteria, which is not always public. If the VDB does not see it, they will not add it. If the VDB disagrees with the disclosure (i.e. believes it to be inaccurate), they may not add it.
By this point, there are a number of criteria that may prevent the disclosure from ever making it into a VDB. Without using the word, the above steps have introduced several types of bias that impact the process. These biases carry forward into any subsequent examination of the database in any manner.
Types of Bias
Specific to the vulnerability disclosure aggregation process that VDBs go through every day, there are four primary types of bias that enter the picture. Note that while each of these can be seen in researchers, vendors, and VDBs, some are more common to one than the others. There are other types of bias that could also apply, but they are beyond the scope of this paper.
Selection bias covers what gets selected for study. In the case of disclosure, this refers to the researcher’s bias in selecting software and the methodology used to test the software for vulnerabilities; for example, a researcher might only investigate software written in a specific language and only look for a handful of the most common vulnerability types. In the case of VDBs, this involves how the VDB discovers and handles vulnerability disclosures from researchers and vendors. Perhaps the largest influence on selection bias is that many VDBs monitor a limited source of disclosures. It is not necessary to argue what “limited” means. Suffice it to say, no VDB is remotely complete on monitoring every source of vulnerability data that is public on the net. Lack of resources – primarily the time of those working on the database – causes a VDB to prioritize sources of information. With an increasing number of regional or country-based CERT groups disclosing vulnerabilities in their native tongue, VDBs have a harder time processing the information. Each vulnerability that is disclosed but does not end up in the VDB, ultimately factors into statistics such as “there were X vulnerabilities disclosed last year”.
Publication bias governs what portion of the research gets published. This ranges from “none”, to sparse information, to incredible technical detail about every finding. Somewhere between selection and publication bias, the researcher will determine how much time they are spending on this particular product, what vulnerabilities they are interested in, and more. All of this folds into what gets published. VDBs may discover a researcher’s disclosure, but then decide not to publish the vulnerability due to other criteria.
Abstraction bias is a term that we crafted to explain the process that VDBs use to assign identifiers to vulnerabilities. Depending on the purpose and stated goal of the VDB, the same 10 vulnerabilities may be given a single identifier by one database, and 10 identifiers by a different one. This level of abstraction is an absolutely critical factor when analyzing the data to generate vulnerability statistics. This is also the most prevalent source of problems for analysis, as researchers rarely understand the concept of abstraction, why it varies, and how to overcome it as an obstacle in generating meaningful statistics. Researchers will use whichever abstraction is most appropriate or convenient for them; after all, there are many different consumers for a researcher advisory, not just VDBs. Abstraction bias is also frequently seen in vendors, and occasionally researchers in the way they disclose one vulnerability multiple times, as it affects different software that bundles additional vendor’s software in it.
Measurement bias refers to potential errors in how a vulnerability is analyzed, verified, and catalogued. For example, with researchers, this bias might be in the form of failing to verify that a potential issue is actually a vulnerability, or in over-estimating the severity of the issue compared to how consumers might prioritize the issue. With vendors, measurement bias may affect how the vendor prioritizes an issue to be fixed, or in under-estimating the severity of the issue. With VDBs, measurement bias may also occur if analysts do not appropriately reflect the severity of the issue, or if inaccuracies are introduced while studying incomplete vulnerability disclosures, such as missing a version of the product that is affected by the vulnerability. It could be argued that abstraction bias is a certain type of measurement bias (since it involves using inconsistent “units of measurement”), but for the purposes of understanding vulnerability statistics, abstraction bias deserves special attention.
Measurement bias, as it affects statistics, is arguably the domain of VDBs, since most statistics are calculated using an underlying VDB instead of the original disclosures. As the primary sources of vulnerability data aggregation, several factors come into play when performing database updates.
Why Bias Matters, in Detail
These forms of bias can work together to create interesting spikes in vulnerability disclosure trends. To the VDB worker, they are typically apparent and sometimes amusing. To an outsider just using a data set to generate statistics, they can be a serious pitfall.
In August, 2008, a single researcher using rudimentary, yet effective methods for finding symlink vulnerabilities single handedly caused a significant spike in symlink vulnerability disclosures over the past 10 years. Starting in 2012 and continuing up to the publication of this paper, a pair of researchers have significantly impacted the number of disclosures in a single product. Not only has this caused a huge spike for the vulnerability count related to the product, it has led to them being ranked as two of the top vulnerability disclosers since January, 2012. Later this year, we expect there to be articles written regarding the number of supervisory control and data acquisition (SCADA) vulnerabilities disclosed from 2012 to 2013. Those articles will be based purely on vulnerability counts as determined from VDBs, likely with no mention of why the numbers are skewed. One prominent researcher who published many SCADA flaws has changed his personal disclosure policy. Instead of publicly disclosing details, he now keeps them private as part of a competitive advantage of his new business.
Another popular place for vulnerability statistics to break down is related to vulnerability severity. Researchers and journalists like to mention the raw number of vulnerabilities in two products and try to compare their relative security. They frequently overlook the severity of the vulnerabilities and may not note that while one product had twice as many disclosures, a significant percentage of them were low severity. Further, they do not understand how the industry-standard CVSSv2 scoring system works, or the bias that can creep in when using it to score vulnerabilities. Considering that a vague disclosure that has little actionable details will frequently be scored for the worst possible impact, that also drastically skews the severity ratings.
The forms of bias and how they may impact vulnerability statistics outlined in this paper are just the beginning. For each party involved, for each type of bias, there are many considerations that must be made. Accurate and meaningful vulnerability statistics are not impossible; they are just very difficult to accurately generate and disclaim.
Our 2013 BlackHat Briefings USA talk hopes to explore many of these points, outline the types of bias, and show concrete examples of misleading statistics. In addition, we will show how you can easily spot questionable statistics, and give some tips on generating and disclaiming good statistics.
Our sponsor Risk Based Security (RBS) posted an interesting blog this morning about Research In Motion (RIM), creator of the BlackBerry device. The behavior outlined in the blog, and from the original blog by Frank Rieger is shocking to say the least. In addition to the vulnerability outlined, potentially sending credentials in cleartext, this begs the question of legality. Quickly skimming the BlackBerry enterprise end-user license agreement (EULA), there doesn’t appear to be any warning that the credentials are transmitted back to RIM, or that they will authenticate to your mail server.
If the EULA does not contain explicit wording that outlines this behavior, it begs the question of the legality of RIM’s actions. Regardless of their intention, wether trying to claim that it is covered in the EULA or making it easier to use their device, this activity is inexcusable. Without permission, unauthorized possession of authentication credentials is a violation of Title 18 USC § 1030 law, section (a)(2)(C) and potentially others depending on the purpose of the computer. Since the server doing this resides in Canada, RIM may be subject to Canadian law and their activity appears to violate Section 342.1 (d). Given the U.S. government’s adoption of BlackBerry devices, if RIM is authenticating to U.S. government servers during this process, this could get really messy.
Any time a user performs an action that would result in sharing that type of information, with any third party, the device or application should give explicit warning and require the user to not only opt-in, but confirm their choice. No exceptions.