On September 8, 2016, Jason Levy of WhiteSource Software published a blog titled “Open Source Vulnerability Database”. Almost two years later it came across my radar and I asked via Twitter if WhiteSource was interested in getting feedback on the blog, since it contained errata. They never replied and it fell off my radar. Last week, someone asked me about WhiteSource’s claims about maintaining the largest vulnerability database. While I have not seen it, I expressed doubt and dug into my notes. Until then, I had forgotten that I created notes to write a blog listing the errata. Since their name came up again, I figured it is worth revisiting this, especially since they never replied to me.
First, I was a long-term contributor to OSVDB, an officer in OSF (the 501(c)(3) that managed the project), a founder at Risk Based Security, and one of the maintainers of VulnDB since day one. Don’t worry, all of that is explained in more detail below. Here are quotes from their blog and my thoughts on the errata:
The open source community has been living up to this statement recently, with the accelerated rate of discoveries of open source vulnerabilities reported by such sources as the NVD open source vulnerability database.
CVE/NVD have dismal coverage of open source libraries in the big picture. Their coverage is primarily through a couple dozen projects that seek out CVE IDs, researchers that request CVE IDs, and the OSS-Sec mail list. To date, there are still many developers, projects, and even security researchers that don’t know what CVE is.
The NVD is by far the main source for researching vulnerabilities.
This sentence does not make sense. As you note below, NVD does not research vulnerabilities; they don’t even aggregate vulnerabilities themselves. While many consume NVD’s data over CVE, due to the format they make the data available, more “analysis” is done via CVEDetails.com than NVD in my experience.
Their analysis contains information like how the vulnerability operates, its impact rating, CVSS score, and links to any available patches/fixes.
The main purpose of NVD, aside from a burden on taxpayer’s dollars, is to provide CPE and CVSS scores for vulnerabilities via CVE. They rarely, if ever, do additional analysis of the vulnerability unless required to determine a CVSS score. If they do, they do not “show their work” so to speak, instead using the same CVE description. As best I am aware, they don’t dig up additional references either, so the only solution links provided are via CVE. That said, I haven’t audited NVD in several years to see if that still holds true, but I would be surprised if anything has changed.
The OSVDB (open source vulnerability database) was launched in 2004 by Jake Kouhns, the founder and current CISO of Risk Based Security – the company which now operates OSVDB’s commercial version, the VulnDB.
Jake Kouns, one of the founders of Risk Based Security (RBS) did not launch OSVDB. And that is the Open Sourced Vulnerability Database. OSVDB was created and launched by H.D. Moore. More history is available via Wikipedia. While OSVDB was a basis for the historical data in VulnDB, Risk Based Security funded OSVDB entirely for over two years and it was their data that was shared publicly via OSVDB before the project shut down. When OSVDB shut down, RBS continued to maintain VulnDB which contains over eight years of vulnerability information not found in OSVDB. With considerable resources and a team maintaining VulnDB, it has become something OSVDB wished it could, but never came close.
This commercial version continues to be maintained by Risk Based Security, but without the strong community that maintained the OSVDB.
There was no “strong community” that maintained OSVDB. For years, it limped along on the backs of a handful of volunteers, with very little support from the community; financial or labor. At times, there were only two of us maintaining the project while most volunteers signed up, worked on the database for 20 minutes, and never returned to help again.
The OSVDB/VulnDB is perceived by many in the community as redundant, for the majority of vulnerabilities are also reported in the NVD.
This is WhiteSource’s ignorance or bias at play. OSVDB was known for having tens of thousands of vulnerabilities not found in CVE/NVD. One of VulnDBs strong points is that it contains almost 69,000 vulnerabilities not found in CVE/NVD. Since WhiteSource claims to have a comprehensive database, you should have disclaimed this while spreading false information about VulnDB. Since VulnDB is not public, the notion that is perceived by “many” in such a light is just your marketing. Anyone reading VulnDBs quarterly or year-end reports can see the incredible gap between them and CVE/NVD. Also curious is that the link off “monitoring of multiple vulnerability databases including the CVE/NVD” just links to your blog on the topic. Other than CVE/NVD, which vulnerability databases do you monitor exactly?
I’d also happily sign a non-disclosure agreement to do a quick audit of your database, to verify or refute your claim that you have the “the largest coverage of vulnerability listings“. I’d only ask that in return, I can publish a simple ‘yes’ or ‘no’ answer to that claim.
Security advisories are usually the first place that security professionals and developers look when they have security issues within a specific scope.
If this is WhiteSource’s methodology for maintaining a vulnerability database, it does not bode well for your customers. Formal security advisories are quickly becoming, or have already become, a minority in the disclosure game. OSVDB and VulnDB didn’t stay far ahead of the curve using this antiquated mindset. We knew to look elsewhere all along.
WhiteSource tracks almost a dozen security advisories (like RubyOnRails, RubySec and Node Security), to ensure our customers are alerted on new vulnerabilities the minute they’re reported.
A whole dozen? The VulnDB team currently monitors 3,100 sources every week. Some are monitored ~ hourly, some a few times a day, some once a week depending on various criteria such as publication frequency. If WhiteSource is only looking at those sources, and that type of source, it really doesn’t bode well for your claims of the largest vulnerability listing.
The majority of vulnerabilities are published with a remediation suggestion, like a new patch, new version, system configuration suggestion etc.
While technically true, per VulnDB data, it is better to qualify that. For example, in Q1 2019, 60.8% of disclosed vulnerabilities they aggregated had a documented solution. But, a couple caveats. First, that was down 13.5% over Q1 2018! Second, not all solutions are created equal. Just because there is a patch committed against the ‘master’ branch doesn’t mean an organization can actually use it. Their policy or available resources may not allow for one-off code patching like that.
And here at WhiteSource, this is exactly what we do.
Unfortunately, based on the way you describe what you do, you are not much better than CVE/NVD I’m afraid.
A few days ago, while writing a draft of a different blog, I made reference to and said “we’re well aware of the pitfalls around calling a vulnerability ‘theoretical’“! I wanted to link off to what I was referencing, a case where security researchers found a vulnerability in a big vendor’s products and were told it was “theoretical”. Of course, they in turn provided a working exploit for the vulnerability proving it wasn’t just theory. Thus, around 1995, the researchers took on the slogan “L0pht, Making the theoretical practical since 1992.” After digging, I couldn’t find a concise story of the details around that event, so I took to Twitter. Over the course of a couple hours, with input from many people, including some involved with the story, I collected the details. I told those in the conversation that if I had the information I would blog about it to better preserve that slice of history.
My bad memory had me believing the vulnerability was in Sendmail, and that Eric Allman said it to Mudge. Royce Williams dug up the Sendmail exploit i was thinking of, and the header text from Mudge suggested I was on the right track.
*NIX Sendmail (8.7.5) – Buffer Overflow – Newest sendmail exploit
# Hrm… and Eric Allman told me to my face that there were *no* buffer
# overflows in 8.7.5 — .mudge
# This works on systems that have the chpass program runable by
# users. Tested on FreeBSD, though the vulnerability exists in all
# Sendmail8.7.5. Granted you need to be able to change your gecos field 😉
# The problem is in buildfnam() which lives in util.c – it treats
# the static allocated array nbuf[MAXSIZE+1], from recipient.c, in
# an unbounded fashion.
Next, Royce reminded me that I had actually referenced that vulnerability in a prior blog post from 2006… oops! Mark Dowd was the first to challenge me on that, saying he believed it was related to a vulnerability in Microsoft’s products, related to RAS or CHAP, and he was right about the vendor (which vuln it was specifically is still not confirmed). Next, Space Rogue, an original L0pht member chimed in saying he thought it referred to Microsoft and the NT/Lanman vulnerability, and that the 1992 part of the slogan simply referred to when L0pht was formed. DKP further confirmed this by digging into the wonderful Internet Archive, finding the slogan and quote on the L0pht’s page.
“That vulnerability is completely theoretical.” — Microsoft
L0pht, Making the theoretical practical since 1992.
From here, Weld Pond and Mudge, original members of the L0pht joined the conversation. First, Weld said it might be a RAS or CHAP vulnerability but he wasn’t certain, but that the slogan came from a response from a Microsoft spokesperson who was quoted saying “that vulnerability is theoretical“, and that resulted in the exploit being written to prove otherwise. Weld further confirmed what Space Rogue had said, that the “slogan was coined in 95 or so but made retrospective to the founding of the L0pht.”
DKP continued digging and found the quoted in Bruce Schneier’s “Secrets and Lies”, and pointed out Schneier worked with Mudge on a MS-CHAPv2 vulnerability. The paper on that vuln is from 1999 though, suggesting it wasn’t the “theoretical” vuln.
With that, the “theoretical” vulnerability is mostly uncovered. It would be great if anyone could confirm exactly which vulnerability it was that prompted the response from Microsoft. If anyone else recalls details about this, please share!
In the mean time, we also get to wonder about the Sendmail story, where this saga started, that also apparently is interesting. Space Rogue mentioned there was a separate story around that, but couldn’t remember details. Mudge jumped in confirming it was a Sendmail 8.6.9 exploit, “in response to in-person ‘discussion’ w/ [Eric] Allman at a Usenix Security in Texas. Witnesses, but no writeup.” Mudge added that “a very similar ‘quote’ happened in person with Allman quite some time prior to the MS issue. It wasn’t a throw away quote. I/we lived it 🙂”
August 14, 2017 Update:
Mudge provided more insight into another issue, also a ‘theoretical’ risk. During the Twitter thread, there were questions about L0phtcrack. Mudge says “In a nutshell – a MSFT PR article in NT magazine said l0phtcrack was a theoretical risk but not an actual one. I responded with LC rant. Soon after we coined the phrase to describe the PoCs I felt it was crucial to write and release“. DKP dug up a Wired article that better illustrates how vendors can dismiss security researchers:
This has been a long-recognized and proven thing, but every year we run into more glaring examples. SecurityFocus, who runs the BID database, which is part of Symantec’s DeepSight offering, routinely uses submissions to the Bugtraq mail list to seed their commercial database, sometimes days before approving the post. This means subscribers who use Bugtraq as one of many sources of ‘real-time’ vulnerability intelligence routinely get the short end of the stick. Full-Disclosure, managed by Fyodor and team, do not have that commercial interest in the content of the posts to the FD. Their average turnaround time seems to be considerably better in approving posts. So please, for the industry’s sake, post to Full-Disclosure and stop supporting Bugtraq.
Today’s example: A new CVE popped up in various places. Google showed the first hit to be the BID Database:
EMC only posts their advisories to the Bugtraq list, so we checked there first, since that would be the provenance:
There are EMC advisories visible, but not the one with CVE-2017-4985. Checking again today:
SecurityFocus delayed the post by three days while it was in their database.
The notion of expertise in any field is fascinating. It crosses so many aspects of humans and our perception. For example, two people in the same discipline, each with the highest honors academic can grant, can still have very different expertise within that field. Society and science have advanced so we don’t have just have “science” experts and medical doctors can specialize to extreme degrees. Within Information Security, we see the same where there are experts in penetration testing, malware analysis, reverse engineering, security administration, and more.
In the context of a software company, especially one that does not specifically specialize in security (and is trivial to argue was late to the security game), you cannot shoehorn them into any specific discipline or expertise. We can all absolutely agree there is an absolute incredible level of expertise across a variety of disciplines within Microsoft. So when Microsoft releases yet another report that speaks to vulnerability disclosures, the only thing I can think of is duality. Especially in the context of a report that puts forth some expertise that they are uniquely qualified to speak on, while mixed with a topic that pre-dates Microsoft and they certainly aren’t qualified to speak on to some degree.
A Tweet from Carsten Eiram pointed me to the latest report, and brought up the obvious fact that it seemed to be way off when it comes to vulnerability disclosures.
It’s always amusing to me that you get legal disclaimers in such analysis papers before you even get a summary of the paper:
Basically, the take away is that they don’t stand behind their data. Honestly, the fact I am blogging about this suggests that is a good move and that they should not. The next thing that is fascinating is that it was written by 33 authors and 14 contributors. Since you don’t know which of them worked on the vulnerability section, it means we get to hold them all accountable. Either they can break it down by author and section, or they all signed off on the entire paper. Sorry, the joys of academic papers!
After the legal disclaimers, then you start to get the analysis disclaimers, which are more telling to me. Companies love to blindly throw legal disclaimers on anything and everything (e.g. I bet you still get legal disclaimers in the footer of emails you receive, that have no merit). When they start to explain their results via disclaimers while not actually including their methodology, anyone reading the report should be concerned. From their “About this report” section:
This volume of the Microsoft Security Intelligence Report focuses on the first and second quarters of 2016, with trend data for the last several quarters presented on a quarterly basis. Because vulnerability disclosures can be highly inconsistent from quarter to quarter and often occur disproportionately at certain times of the year, statistics about vulnerability disclosures are presented on a half-yearly basis.
This is a fairly specific statement that speaks as if it is fact that vulnerability trends vary by quarter (they do!), but potentially ignores the fact that they can also vary by half-year or year. We have seen that impact not only a year, but the comparison to every year prior (e.g. Will Dormann in 2014 and his Tapioca project). Arbitrarily saying that it is a ‘quarter’ or ‘half-year’ does not demonstrate experience in aggregating vulnerabilities, instead it is a rather arbitrary and short time-frame. Focusing on a quarter can easily ignore some of the biases that impact vulnerability aggregation as outlined by Steve Christey and my talk titled “Buying Into the Bias: Why Vulnerability Statistics Suck” (PPT).
Jumping down to the “Ten years of exploits: A long-term study of exploitation of vulnerabilities in Microsoft software” section, Microsoft states:
However, despite the increasing number of disclosures, the number of remote code execution (RCE) and elevation of privilege (EOP) vulnerabilities in Microsoft software has declined
Doing a title search of Risk Based Security’s VulnDB for “microsoft windows local privilege escalation” tells a potentially different story. While 2015 is technically lower than 2011 and 2013, it is significantly higher than 2012 and 2014. I can’t say for certain why these dips occur, but they are very interesting.
Thousands of vulnerabilities are publicly disclosed across the industry every year. The 4,512 vulnerabilities disclosed during the second half of 2014 (2H14) is the largest
number of vulnerabilities disclosed in any half-year period since the Common Vulnerabilities and Exposures system was launched in 1999.
This quote from the report explicitly shows serious bias in their source data, and further shows that they do not consider their wording. This would be a bit more accurate saying “The 4,512 vulnerabilities aggregated by MITRE during the second half of 2014…” The simple fact is, a lot more than 4,512 vulnerabilities were disclosed during that time. VulnDB shows that they aggregated 8,836 vulnerabilities in that same period, but less than the 9,016 vulnerabilities aggregated in the second half of 2015. Microsoft also doesn’t disclaim that the second half of 2014 is when the aforementioned Will Dormann released the results of his ‘Tapioca’ project totaling over 20,500 vulnerabilities, only 1,384 of which received CVE IDs. Why? Because CVE basically said “it isn’t worth it”, and they weren’t the only vulnerability database to do so. With all of this in mind, Microsoft’s comment about the second half of 2014 becomes a lot more complicated.
The information in this section is compiled from vulnerability disclosure data that is published in the National Vulnerability Database (NVD), the US government’s repository of standards-based vulnerability management data at nvd.nist.gov. The NVD represents all disclosures that have a published CVE (Common Vulnerabilities and Exposures) identifier.
This is a curious statement, since CVE is run by MITRE under a contract from the Department of Homeland Security (DHS), making it a “US government repository” too. More importantly, NVD is essentially a clone of CVE that just wraps additional meta-data around each entry (e.g. CPE, CWE, and CVSS scoring). This also reminds us that they opted to use a limited data set, one that is well known in the Information Security field as being woefully incomplete. So even a company as large as Microsoft, with expected expertise in vulnerabilities, opts to use a sub-par data set which drastically influences statistics.
Figure 23. Remote code executable (RCE) and elevation of privilege (EOP) vulnerability disclosures in Microsoft software known to be exploited before the corresponding security update release or within 30 days afterward, 2006–2015
The explanation for Figure 23 is problematic in several ways. Does it cherry pick RCE and EOP while ignoring context-dependent (aka user-assisted) issues? Or does this represent all Microsoft vulnerabilities? This is important to ask as most web browser exploits are considered to be context-dependent and coveted by the bad guys. This could be Microsoft conveniently leaving out a subset of vulnerabilities that would make the stats look worse. Next, looking at 2015 as an example from their chart, they say 18 vulnerabilities were exploited and 397 were not. Of the 560 Microsoft vulnerabilities aggregated by VulnDB in 2015, 48 have a known public exploit. Rather than check each one to determine the time from disclosure to exploit publication, I’ll ask a more important question. What is the provenance of Microsoft’s exploitation data? That isn’t something CVE or NVD track.
Figure 25 illustrates the number of vulnerability disclosures across the software industry for each half-year period since 2H13
Once again, Microsoft fails to use the correct wording. This is not the number of vulnerability disclosures, this is the number of disclosures aggregated by MITRE/CVE. Here is their chart from the report:
Under the chart they claim:
Vulnerability disclosures across the industry decreased 9.8 percent between 2H15 and 1H16, to just above 3,000.
As mentioned earlier, since Microsoft is using a sub-par data set, I feel it is important to see what this chart would look like using more complete data. More importantly, watch how it invalidates their claim about an industry decrease of 9.8 percent between 2H15 and 1H16, since RBS shows the drop is closer to 18%.
I have blogged about vulnerability statistics, focusing on these types of reports, for quite a while now. And yet, every year we see the exact same mistakes made by just about every company publishing statistics on vulnerabilities. Remember, unless they are aggregating vulnerabilities every day, they are losing a serious understanding of the data they work with.
March 19, 2017 Update – Carsten Eiram (@carsteneiram) pointed out that the pattern of local privilege escalation numbers actually follow an expected pattern with knowledge of researcher activity and trends:
In 2011, Tarjei Mandt while he was at Norman found a metric ton of LPEs in win32k.sys as part of a project.
In 2013, it was Mateusz Jurczyk’s turn to also hit win32k.sys by focusing on a bug-class he dubbed “double-fetch” (he’s currently starting that project up again to see if he with tweaks can find more vulns).
And in 2015, Nils Sommer (reported via Google P0) was hitting win32k.sys again along with a few other drivers and churned out a respectable amount of LPE vulnerabilities.
So 2012 and 2014 represent “standard” years while 2011, 2013, and 2015 had specific high-profile researchers focus on Windows LPE flaws via various fuzzing projects.
So the explanation is the same explanation we almost always see: vulnerability disclosures and statistics are so incredibly researcher driven based on which product / component / vulnerability type a researcher or group of researchers decides to focus on.
Hopefully a really quick blog, but a section of a news article titled “Hackers are having a field day with stolen credentials” by Amol Sarwate, Qualys’ Director of Vulnerability Labs, published in SC Magazine caught my attention. The section:
Let’s X-ray the attack methods
Typically, hackers “fingerprint” websites’ underlying software, such as their blog content management system or discussion forum application, and exploit either known vulnerabilities the website owner left unpatched or zero-day flaws.
In one case, an attacker used misplaced install files to gain admin privileges. In another case, hackers stole one moderator’s credentials and used the account to post a malicious message in the forum. After viewing the message, the forum’s administrator had his account compromised, leading to a massive breach. Notable vulnerabilities exploited in recent years include CVE-2016-6483, CVE-2016-6195, CVE-2016-6635, CVE-2015-1431, CVE-2015-7808, CVE-2014-9574 and CVE-2013-6129.
Specifically, that list of CVE identifiers. First, a random list of CVE IDs and they don’t even link to the entries on the CVE or NVD site. It’s not like anyone will instantly recognize those and equate them to specific vulnerabilities, and the odds of someone cutting and pasting them into Google are slim. Second, in the context mentioned, talking about exploiting web sites, then mentioning stealing credentials and account compromise, the list is peculiar. Looking them up in VulnDB:
142673 2016-08-01 2016-6483 vBulletin /link/getlinkdata Server-side Request Forgery (SSRF)
141687 2016-07-11 2016-6195 vBulletin forumrunner/includes/moderation.php Multiple Parameter SQL Injection
137861 2016-03-30 2016-6635 WordPress wp-admin/includes/ajax-actions.php Script Compression Option CSRF
129847 2015-11-02 2015-7808 vBulletin /vbforum/ajax/api/hook/decodeArguments arguments Parameter Remote Code Execution
117888 2015-01-26 2015-1431 phpBB includes/startup.php Trailing Path Handling CSS Injection
116744 2014-12-31 2014-9574 FluxBB /install.php require() Function install_lang Parameter Path Traversal Local File Inclusion
98370 2013-10-09 2013-6129 vBulletin install/upgrade.php Configuration Mechanism Admin Account Creation
A few observations:
- While SSRF can be an underlooked vulnerability, it often leads to remote information disclosure about other hosts, not specifically the web site in Sarwate’s context.
- SQL Injection we’d certainly expect to see and can have devastating consequences.
- CSRF is another as it can be used to conduct privileged operations via phishing attacks.
- Any Remote Code Execution issue is obviously a big one, and on widely deployed software like vBulletin it is expected on such a list.
- CSS injection is an odd one too, as it is frequently considered less severe than XSS which doesn’t appear on the list at all, despite Sarwate describing an XSS in the example above.
- While LFI can be a serious vuln, curious to see it here instead of a RFI which gives the attacker far more control and more reliably results in a compromise.
- 2013-6129 was discovered in the wild and being actively exploited, so no surprise to see it in such a list. Curious it is from 2013 and there are no more recent examples he thought to use.
- Seven issues, with varying degrees of severity, some that may not pose a serious risk to a web site, while leaving out XSS completely, an RCE in Joomla! that was being exploited in the wild, various WordPress plugins that allowed for RCE, etc.
Anyway, just found this list odd and figured it was worth the mention.
Sometime in the past day or so, CVE-2016-10001 was publicly disclosed, and possibly a duplicate. Regardless, CVE-2016-10002 is also now public and legitimate. Tonight, I Tweeted that the presence of those IDs doesn’t mean what many will think it means. I say that based on the past experience, both historical and more recent. Even 17 years later, many people believe that CVE assignments are sequential and that a given ID means that is the number of vulnerabilities aggregated by MITRE that year. That isn’t how it works and it never has.
As of the December 18 dump available from MITRE, there are 10,137 identifiers in the dump. However, 44 of them are REJECTED and 4,760 are in RESERVED status. That means there are 5,333 live CVE identifiers at this time that correspond to vulnerabilities. Since a single CVE ID can include multiple similar vulnerabilities, that number is also misleading. If you take their data and abstract it on a per-vulnerability basis, they cover 8,058 issues as aggregated by Risk Based Security’s VulnDB. So, to be very clear:
CVE has not cataloged 10,000 vulnerabilities in 2016 based on CVE IDs.
Additionally, to be very clear again:
CVE has not cataloged 10,000 vulnerabilities in 2016 based on their actual aggregated vulnerability data.
Meanwhile, VulnDB has currently cataloged 14,485 vulnerabilities, compared to the CVE 8,058 actual number. Hopefully your organization uses more than just CVE data. That means within your security products that scan for vulnerabilities, your tools that collect the data, and ultimately the reporting that guides your security team in making decisions.
All of that said… taking bets if we see Tweets, blogs, or news articles claiming the “10,000 vulns in 2016” notion.
[Note: This blog had been sitting as a 99% completed draft since early September. I lost track of time and forgot to finish it off then. Since this is still a relevant topic, I am publishing now despite it not being quite as timely in the context of the articles cited in it.]
An article by Kim Zetter in Wired, titled “When Security Experts Gather to Talk Consensus, Chaos Ensues“, talks about a recent meeting that tried to address the decades-old problem of vulnerability disclosure.
This article covers a recent meetings organized by the National Telecommunications and Information Administration (NTIA), a division of the US Commerce Department (DoC). This is not the main reason I am blogging, when most would assume I would speak up on this topic. I’ll give you an odd insight into this from my perspective, then get to the real point. The person who organized this, was virtually introduced to me on July 31, 2015. He replied the same day, and had a great introduction that showed intent to make this a worthwhile effort:
The US Dept of Commerce is trying to help in the old dance between security researchers and system/SW vendors. We think that folks on most sides want a better relationship, with more trust, more efficiency, and better results. Our goal is to bring the voices together in a “multistakeholder process” to find common ground in an open participant-led forum.
I’ve been trying to reach out to many people in this area, on all sides. I know that you’re both veterans of many discussions like this, and I’d like to chat with you about lessons learned, how things may (or may not) have changed, and generally get your insight into how to make this not suck.
That level of understanding from a U.S. government employee is rewarding and encouraging. But, in the interest of fairness and pointing out the obvious, the first thing I asked:
In the past couple of weeks, I have heard mention of the Dept. of Commerce becoming involved in vulnerability disclosure. One thing that is not clear to myself, and many others, is what your intended role or “jurisdiction” (for lack of better words) is?
That is an important question that the DoC must consider, and understand how to respond to.
That question should still be on everyone’s minds, as it has not been answered in a factual manner. Good intentions only get you so far. Having another government player in this mess, who has no power, no jurisdiction, and only “the best of intentions” can only muddy the waters at best. The notes of the ~ 6 hour meeting are now online. I haven’t read them, don’t plan to. This is pure academic masturbation that has escaped from academia, and missed the age old point that so many refuse to admit. “When the $government outlaws chinchillas, only the outlaws will have chinchillas.” Seriously, stupidly simple concept that basically as much the basis for our society as anything else you hold dear.
I’m jumping past that, because a vulnerability tourist said something else that needs addressing. If you haven’t seen me use that term, ‘vulnerability tourist’ refers to someone who dabbles in the bigger world of vulnerability aggregation, disclosure, and the higher-level ideas surrounding it. This doesn’t speak to finding vulns (although it helps), disclosing them (although it helps), or aggregating them for long-term analysis (although it helps). It speaks to someone who thinks they are doing good speaking as some form of expert, but are actually continuing to harm it due to a lack of real knowledge on the topic. When someone who has next to no experience in those realms speaks on behalf of the industry, it becomes difficult to take them seriously… but unfortunately, journalists do. Especially when it comes to hot topics or the dreaded “0day” vulnerabilities, that happen every day… while the media cherry picks the occasional one. As usual in our industry, give such a person a bit of rope and you are likely to find them hanging a few days or weeks later.
Focusing on a single aspect of the Wired article:
Members of the audience snickered, for example, when a representative from the auto industry pleaded that researchers should consider “safety” when testing for vulnerabilities. At a bar after the event, some attendees said automakers are the ones who don’t seem concerned about consumer safety when they sell cars that haven’t been pen-tested for vulnerabilities or when it takes them five years to fix a known vulnerability.
And when Corman sought community support for new companies entering the bug bounty arena, some attendees responded with derision. He noted that after United Airlines launched its bug bounty program this year—the first for the airline industry—it suffered backlash from the security community instead of support.
Just Google “volkswagen emissions software” and you will see why the public cannot trust auto manufacturers. Nothing to do with vulnerability research, and at least one is manipulating their software to defraud the public which may hurt the world environment in ways we can’t comprehend. If that isn’t enough for you, consider the same company spent two years doing everything in their power to hide a vulnerability in their software, that may have allowed criminals to more easily steal consumer’s cars.
So first, ask yourself why anyone from the security arena is so gung-ho in supporting the auto industry. Sure, it would benefit our industry, and more importantly benefit the average consumer. Effecting that change would be incredible! But given the almost three decades of disclosure debate centered on questionable dealings between researchers and vendors, it is not a fight you can win fast. And more importantly, it is not one you capitulate to for your own gain.
Moving past that, the quote from Corman speaks to me personally. When United announced their bug bounty program, it received a lot of media attention. And this is critical to note. A company not known for a bug bounty program, in an industry not known for it, offering perks that were new to the industry (i.e. United, airlines, and frequent flier miles). Sure, that is interesting! Unfortunately, none of the journalists that covered that new bounty program read their terms, or if they did, couldn’t comprehend why it was a horrible program that put researchers at great risk. To anyone mildly familiar with bounty programs, it screamed “run, not walk, away…”. I was one of the more vocal in criticizing the program on social media. When a new bounty opens up, there are hundreds of (largely) neophyte researchers that flock to it, trying to find the lowest hanging fruit, to get the quick and easy bounties. If you have run a vulnerability reporting program, even one without bounties, you likely have witnessed this (I have). I was also very quick to start reaching out to my security contacts, trying to find back-channel contacts to United to give them feedback on their offering.
United’s original bounty program was not just a little misleading, but entirely irresponsible and misleading. It basically said, in laymen’s terms, “if you find a vuln in our site, report it and you may get up to 1 million airline miles!” It also said, you cannot test ANY united.com site, you cannot test our airplanes (on the back of the Chris Roberts / United / FBI drama), our mobile application, or anything else remotely worth testing. The program had a long list of what you could NOT test, that excluded every single target a malicious hacker would target. Worse? It did not offer a test / Dev network, a test airplane, or any other ‘safe’ target to test. In fact, it even excluded the “beta” United site! Uh… what were bounty-seekers supposed to do here? If United thinks that bug bounty seekers read past the “mail bugs to” and “this is the potential bounty”, they need to reconsider their bounty program. The original bounty excluded:
“Code injection on live systems”
Bugs on customer-facing websites such as:
Yet, despite that exclusionary list, they did NOT include what was ALLOWED to be tested. That, in modern security terms, is called a honeypot. Don’t even begin to talk about “intent”, because in a court of law with some 17 year old facing CFAA charges, intent doesn’t enter the picture until way too late. The original United program was set up so that they could trivially file a case with the FBI and go after anyone attacking any of their Internet-addressable systems, their mobile apps, or their airplanes. Shortly after the messy public drama of a white hat hacker telling the media he tested the airplane systems, and could have done “bad things” in so many words.
My efforts to reach out to United via back-channels worked. One of their security engineers that is part of the bounty program was happy to open a dialogue with me. See, this is where we get to the bits that are the most important. We traded a few mails, where I outlined my issues with the bounty program and gave them extensive feedback on how to better word it, so that researchers could not only trust the program, but help them with their efforts. The engineer replied quickly saying they would review my feedback and I never heard back. That is the norm for me, when I reach out and try to help a company. So now, when I go to write this blog, of course I look to see if United revised their bounty program! Let’s look at the current United bug bounty offering:
Bugs that are eligible for submission:
Bugs on United-operated, customer-facing websites such as:
Bugs on the United app
Bugs in third-party programs loaded by united.com or its other online properties
Wow… simply WOW. That is a full 180 from the original program! Not only do they allow testing of the sites they excluded before, but they opened up more sites to the program. They better defined what was allowed (considerably more lenient for testing) as far as technical attacks, and targets. I still disagree a bit on a few things that are “not allowed”, but completely understand their reasons for doing so. They have to balance the safety of their systems and customers with the possibility that a vulnerability exists. And they are wise to consider that a vulnerability may exist.
So after all that, let’s jump back to the quote which drew my ire.
And when [Josh] Corman sought community support for new companies entering the bug bounty arena, some attendees responded with derision. He noted that after United Airlines launched its bug bounty program this year—the first for the airline industry—it suffered backlash from the security community instead of support.
This is why a soundbyte in a media article doesn’t work. It doesn’t help vendors, and it doesn’t help researchers. It only helps someone who has largely operated outside the vulnerability world for a long time. Remember, “security researcher” is an overly vague term that has many meanings. Nothing about that press release suggests Corman has experience in any of the disciplines that matter in this debate. As a result, his lack of experience shows clearly here. Perhaps it was his transition from being a “DevOps” expert for a couple years, to being a “Rugged Software” expert, to becoming a “vulnerability disclosure” expert shortly after?
First, United’s original bug bounty offering was horrible in every way. There was basically zero merit to it, and only served to put researchers at risk. Corman’s notion that scaring companies away from this example is ‘bad’ is irresponsible and contradictory to his stated purpose with the I Am The Cavalry initiative. While Corman’s intentions are good, the delivery simply wasn’t helpful to our industry, the automobile industry, or the airline industry. Remember, “the road to hell is paved with good intentions“.
Keeping up with the latest vulnerabilities — especially in the context of the latest threats — can be a real challenge.
One would hope this article would help with that challenge, and it most certainly is one. First, a disclaimer; I was involved with OSVDB for roughly 10 years and the primary curator over that time. Further, I am now involved with Risk Based Security’s VulnDB commercial vulnerability database offering. Both of these are mentioned in the article, so my comments below will most certainly have some level of bias.
To help readers, Sean Martin writes “In no particular order, here are nine key vulnerability data sources for your consideration.” With that, flip to the next page of the article.
It’s important to understand the source — and backing for your source — to avoid getting left without a solid vulnerability database. A good example is the case where many had to say goodbye to their vulnerability feed when minority-player Open Source Vulnerability Database (OSVDB) was shut down.
“Not having OSVDB any longer, while sad for those that relied on it, may actually reduce the complexity in making sure there is integration across all products, MSSPs, services, and SIEMs,” says Fred Wilmot, chief technology officer at PacketSled
I am not sure how OSVDB constituted a “minority-player” in any sense of the term given the broad coverage for a decade. While historical entries were often incomplete, the database was commercially maintained from just before January, 2012, and the information was still given away for free, despite competing with the company providing the support and updates. Since the quote specifically mentions that OSVDB shut down, and it did on April 5, 2016, it’s nice to hear people give belated appreciation to the project. OSVDB shutting down, I would argue, does not reduce the complexity of anything for those knowledgeable about vulnerability disclosure. On the surface, sure! One less set of IDs to integrate across products sounds like a good thing. However, you have to also remember that OSVDB was cataloging thousands of vulnerabilities a year that were not found in the other sources listed in this article. That means there is a level of complexity here that is horrible for companies trying to keep up with vulnerabilities.
Page 3 tells readers about NIST’s National Vulnerability Database (NVD):
NVD is the US government repository of standards-based vulnerability management data. This data enables automation of vulnerability management, security measurement, and compliance. NVD is based on and synchronized with the CVE List (see next slide).
First, since NVD is synchronized with CVE, it is curious that they are listed as separate sources. For those not aware, NVD is a sort of ‘value add’ to CVE in that they generate CVSS and CPE data for the vulnerabilities cataloged by MITRE for the CVE project. Monitoring NVD means you are already monitoring all of CVE and getting the additional meta-data. It is also important to note that the meta-data is outsourced to a contractor who employs ‘junior analysts’ to do the work. This becomes apparent if you consume there data and actually look at their CVSS scores over the last ~ 8 years. Personally, I stopped emailing them corrections many years back due to the volume involved. To this day, you can still often see them scoring Adobe Flash vulnerabilities as CVSSv2 10.0, meaning they miss the ‘context-dependent’ (a.k.a. ‘user-assisted’) aspect which means the access complexity moves from ‘L’ow to ‘M’edium per the CVSSv2 scoring guide on FIRST, resulting in a 9.3 score. Seems minor, but that reclassifies a vulnerability from ‘Critical’ to ‘High’ for many organizations, and should make you question their scoring on more complex issues.
Page 4 tells us more about CVE and offers some “insight” into it that is horribly wrong:
CVE is a dictionary of publicly known information security vulnerabilities and exposures. CVE’s common identifiers enable data exchange between security products and provide a baseline index point for evaluating coverage of tools and services.
Morey Haber, VP of technology at BeyondTrust, offers these examples:
Scanning tools most commonly use CVEs for classification
SIEM technologies understand their applicability in reporting
Risk frameworks use them as a calculation vehicle for applied risk to the business
First, I cannot over-share Steve Ragan’s recent article titled “Over 6,000 vulnerabilities went unassigned by MITRE’s CVE project in 2015“. Consider just the headline, and then think about the fact that CVE does not catalog at least 47,267 vulnerabilities historically. Now, re-read Haber’s examples of how CVE is used and what kind of Achilles’ heel that is for any organization using security software based on CVE.
Fred Wilmot’s quote about CVE is what prompted me to write this entire blog. This is so incredibly wrong and misleading:
“Now that you have a common calculator for interoperability among vendors, the fact that CVE is maintained completely transparently to the community is a HUGE pro,” says Fred Wilmot, chief technology officer at PacketSled. “There is no holdout of exploits for vulnerabilities based on financial gain or intent. It’s altruism at its best. The weakness in the CVE comes in the weaponization of that information and the lack of disclosure for profit and activism, as two examples.”
Where to start…
- There is no common calculator for “interoperability among vendors” in the context of CVE. That isn’t what CVE is or does.
- CVE is most certainly not maintained transparently to the community. It is not maintained transparently to the volunteer Editorial Board (now known simply as the ‘CVE Board’) either. The backroom workings and decisions MITRE makes on behalf of CVE without Board or public input have been documented before. The last decision that lacked any transparency was their recent catastrophic decision to change the CVE format to a new ‘federated’ scheme. If you have any doubt about this being a backroom decision, look at the first reply from CVE Board member Kurt Seifried.
- Wilmot’s characterization that CVE is “altruism at its best” also speaks to a lack of knowledge of CVE. While MITRE, the organization that maintains CVE, is technically a not-for-profit organization, they only take non-compete contracts at incredible expense to the U.S. taxpayer. CVE, and a handful of other ‘C’ projects related to information security, bring in considerable money to the company. In 2015, they enjoyed over $1.4 billion in revenue and maintained $788 million in assets. The fact that the contract to maintain CVE is non-compete, and cannot be bid on by companies more qualified to run the project, speaks to where the real interest lies and it isn’t altruistic.
- The weakness in CVE is certainly not the “weaponization” of that information. A significant majority of weaponized exploits that lead to the thousands of data breaches and organizations being compromised are typically done with functional exploits that enjoy little technical information being made public. For example, phishing attacks that rely on Adobe Reader or Adobe Flash are usually patched by Adobe eventually, and the subsequent disclosure has no technical details. Even if researchers post more details down the road, the entries in CVE are rarely updated to include the additional details.
- The last bit of Wilmot’s quote, I will need someone to explain to me. “The weakness in the CVE [..] comes in the lack of disclosure for profit and activism.” I don’t know what that means.
Page 5 tells readers about the CERT Vulnerability Notes Database:
The Vulnerability Notes Database provides information about software vulnerabilities. Vulnerability notes include summaries, technical details, remediation information, and lists of affected vendors. Most Vulnerability notes are the result of private coordination and disclosure efforts. For more comprehensive coverage of public vulnerability reports, consider the National Vulnerability Database (NVD).
“This is nice to have, but it still uses CVEs as reference,” says Fred Wilmot, chief technology officer at PacketSled. “NVD is not nearly as practical to consume directly as CVE — the disclosure form is fine, but why would I go there and not directly to MITRE for CVE establishment first? However, it’s probably a good place to spend time during an investigation.”
The CERT VNDB is not a comprehensive vulnerability database, and does not aim to be one. As mentioned, their information is primarily via their assisting researchers in coordinating a disclosure with a vendor. Since CERT is a CNA, meaning they can assign CVE IDs to vulnerabilities they coordinate, it means that over 99% of their entries are covered by CVE and thus NVD. Monitoring NVD will get you all of CVE and almost all of CERT VNDB. The very few CERT VU that do not get CVE IDs assigned before disclosure are rare, and I believe they get assignments shortly after from MITRE.
Once again, Wilmot speaks about these sources and doesn’t appear to have real working knowledge which personifies my term ‘vulnerability tourist’. CERT VNDB disclosures appear on their site before they appear in CVE or NVD. It may be 24 – 72 hours before they appear in fact, meaning that while it still uses CVEs as a reference, for timely monitoring of vulnerabilities it may be important to keep an eye out on CERT directly. Next, Wilmot goes on to say “NVD is not nearly as practical to consume directly as CVE”, apparently not realizing that NVD makes its data available in XML. While MITRE makes the CVE data in several formats, it doesn’t mean NVD is not easy to consume. The most important distinction here is that NVD comes with CPE data where CVE does not. For any medium to large organization, this is basically mandatory meta-data for actually putting the information to use.
Page 6 tells readers about Risk Based Security’s VulnDB offering. The curious bit to me is the quote from Morey Haber:
“VulnDB does not contain audit information, but it is a good source for solutions that need to reference vulnerability information in their products such as firewalls or IDS/IPS and do not want to rely on open source or to build/maintain a library,” said Morey Haber, VP of technology at BeyondTrust.
First, BeyondTrust is not a user of VulnDB, which is a commercial offering unlike CVE/NVD/CERT. They did a short-term trial in 2012, during the beginning of the offering and opted not pursue it as a source of vulnerability intelligence. Second, what does “audit information” even mean in the context of a VDB? Audit information about your own environment maybe? Something that a vulnerability intelligence provider can’t possibly deliver. An audit trail is maintained for each vulnerability entry and is available to customers, but I doubt that is what he means since calling VulnDB out on this doesn’t make sense and the other sources of vulnerabilities listed in this article don’t maintain such a trail.
While VulnDB can certainly be used to reference vulnerabilities in security products as he says, that is the tip of the iceberg. With over 47,000 vulnerabilities not found in CVE or NVD, the breadth of information is incredible. Further, VulnDB has made a concerted effort for years to track vulnerabilities in third-party libraries, and builds on top of the robust meta-data that has been generated for over a decade. Haber’s comments do not reflect actual knowledge of the VulnDB offering.
Page 7 tells readers about the DISA IAVA Database And STIGS. Haber gives commentary on this as well:
“IAVA, the DISA-based vulnerability mapping database, is based on existing SCAP sources, and once in a while it contains details for government systems that are not a part of the commercial world,” says Morey Haber, VP of technology at BeyondTrust. “For any vendor doing .gov or .mil work, this reference is a must.”
While some of the IAVA advisories may contain additional detail, it is important to note that these will not provide any vulnerabilities above and beyond CVE/NVD, and their advisories lag well behind the issues being published in CVE/NVD. Haber is right, that this is a vital resource for .gov and .mil contractors, for several reasons.
Page 8 tells readers about SecurityTracker.com, which is a long-running site that aggregates vulnerabilities to a degree. Haber once again provides commentary:
“The website tends to focus on non-OS vulnerabilities, but they are certainly included in the feed,” says Morey Haber, VP of technology at BeyondTrust. “Infrastructure and IoT tend to make the front page the most, and this site is a good third-party reference for new flaws.”
Actually, they do focus on OS vulnerabilities and that can routinely be seen on their site. As I write this, 2 of the 5 vulnerabilities listed on the front page are in operating systems. The biggest thing to note about this offering is that publicly, they just don’t aggregate a significant volume of vulnerabilities for public consumption. Their most recent ID is 1037091, meaning they have ~ 37,000 entries in their database. Note that they operate like CVE though, where multiple vulnerabilities can be associated with a single ID. Regardless, reading their Weekly Vulnerability Summary emails for the past five weeks shows their volume: Sep 26 2016 (32 alerts), Oct 3 2016 (17 alerts), Oct 10 2016 (26), Oct 17 2016 (31), and Oct 24 2016 (32 alerts). To put that into perspective, VulnDB averages 46 new entries a day in 2016, with the most being 224 in a single day in 2016. For the most part, SecurityTracker may beat CVE on adding to the database, but they are almost entirely covering entries that have CVE IDs.
Page 9 tells readers about the Open Vulnerability And Assessment Language (OVAL) Interpreter And Repository. This is a curious addition to the article because it is not a source of vulnerability intelligence. Instead, it is a standard for reporting about systems. From the OVAL page:
OVAL® International in scope and free for public use, OVAL is an information security community effort to standardize how to assess and report upon the machine state of computer systems. OVAL includes a language to encode system details, and an assortment of content repositories held throughout the community.
While OVAL is certainly useful to some organizations, it does not belong in a list of vulnerability sources.
Page 10 tells readers about Information Sharing And Analysis Centers (ISACs). This is another curious addition as ISACs typically trade information on active attacks, threat actors, and which vulnerabilities may be targeted more heavily. They are generally not a source of vulnerabilities in the same context as most of the resources above.
In summary, Sean Martin’s article says it will share “9 Sources For Tracking New Vulnerabilities”. In reality, based on that quote and context, the article only tells readers about CVE/NVD, CERT VNDB, RBS VulnDB, and SecurityTracker. Several of the sources listed are not really for tracking new vulnerabilities, rather augment the vulnerability threat intelligence in various ways.
Earlier this week, Michael Roytman of Kenna Security wrote a blog with more details about the vulnerability section of the Verizon DBIR report, partially in response to my last blog here questioning how some of the data was generated and the conclusions put forth. The one real criticism I will note, is that Roytman’s blog does not acknowledge or warn that the list of CVE IDs included in the DBIR had typos, causing the wrong IDs to be included. In the world of vulnerability databases, that unique ID is designed specifically to avoid such confusion. Carrying the wrong IDs undermines the integrity of the data being presented.
In addition to my comments, Roytman had a long call with Adrian Sanabria which led to generating a new set of data with a different scope. From the Kenna blog:
We had an excellent offline discussion in which he dove deeply into the assumptions of my work, asked thoughtful, deep questions in private, and together, we came up with a better metric for generating a top 10 vulnerabilities list. To address these issues, I scaled the total successful exploitation count for every vulnerability in 2015 by the number of observed occurrences of that vulnerability in Kenna’s aggregate dataset. Sifting through 265 million vulnerabilities gives us a top 10 list perhaps more in line with what was expected – but equally unexpected! The takeaway here is that datasets like the one explored in the DBIR might be noisy, might have false positives and the like, but carefully applied to your enterprise the additional context successful exploitation data lends to vulnerability management is priceless.
I won’t go into much detail in this blog, but will say that I disagree with the statement that severely flawed data can produce takeaways that are “priceless”. Organizations acting on these top 10 lists may be spending time and resources chasing vulnerabilities that do not impact them, or pose very little risk compared to other threats they face. That action most certainly has a price; one that can be enumerated to some degree due to the cost of the employee time required. With that in mind, let’s look at the methodology which is spelled out in more detail than the DBIR, before we consider the new top 10 list Roytman generated. First, his notes on the data examined:
The first is a convenience sample that includes 2,442,792 assets (defined as: workstations, servers, databases, ips, mobile devices, etc) and 264,912,235 vulnerabilities associated to those assets. The vulnerabilities are generated by 8 different scanners, they are: Beyond Security, Tripwire, McAfee VM, Qualys, Tenable, BeyondTrust, Rapid7, and OpenVAS . This dataset is used in determining remediation rates and the normalized open rate of vulnerabilities.
I am curious if it is normal practice to consider an IP address an asset, when the system that addresses the IP is largely considered the asset. Moving past that, one point that sticks in my mind is the tools that generate the data. From the list above, consider that Beyond Security claims to have “what is arguably the world’s most complete database of security vulnerabilities.” Click around their site and you see that “the AVDS database includes over 10,000 known vulnerabilities and the updates include discoveries by our own team and those discovered by corporate and private security teams around the world.” That is less than 25% of what CVE has, and less than 10% of what VulnDB has. They even show you that they only cover 200 CVE IDs for 2016, as compared to the 1,474 open 2016 CVEs.
They don’t specify which Tripwire product, include McAfee Vulnerability Manager which was declared End of Life in October last year, and don’t specify which Qualys product. So it is a start as far as explaining what tools generate the data, but still leaves a lot of guess-work.
Roytman describes the second data set used as:
The second is a convenience sample that includes 3,615,706,022 successful exploitation events which all take place in 2015 which come from Dell Secureworks’ Counter-Threat Unit and Alienvault’s Open Threat Exchange.
The third qualification, describing the methodology is perhaps the most important, and was lacking in the DBIR:
Please note the methodology of data collection: Successful Exploitation is defined as one successful technical exploitation of a vulnerability on one machine at a particular timestamp. The event is defined as: 1. An asset has a known CVE open. 2. An attack come in that matches the signature for that CVE on that asset and 3. One or more IOCs are detected/correlated post attack. It is not necessarily a loss of a data, or even root on a machine. It is just the successful use of whatever condition is outlined in a CVE.
I have reached out to Michael and requested a sampling of data for two of the CVE IDs on his new list, and he is going to do so. In the meantime, I had a discussion with several people more familiar with IDS than myself and asked about how they would detect attacks for CVE-2013-0229 and CVE-2001-0540 as an example. Sure, detecting a specific type of packet meant to trigger this is one thing, but what is the threshold for the IDS to say it is an attack, when exploitation requires “a large number of malformed Remote Desktop Protocol (RDP) requests“. Is there a specific number of packets for it to flag an attack in progress? If too low, it may be prone to a high number of false positives. If too high, it may not detect a successful exploitation of the issue. Which leads into the second part of the methodology, comparing it with “one or more IOCs [that] are detected/correlated post attack”. In this case, it would presumably be the targeted service not responding, which could be detected a number of ways (e.g. probing the port, seeing specific errors in the logs, noticing a given process not running). Once the service is down, is the IDS that isn’t aware of the host state still cataloging a single attack? Or is it generating alerts every X minutes it detects an attack ongoing? Hopefully the data Michael sends will help me better understand how that correlation is being made, as it represents a source of incredible bias for the resulting data analysis.
Moving on to Roytman’s new list using the above data and methodology, here is the top 10 list he sees in the data:
- 2015-03-05 – 2015-1637 – Microsoft Windows Secure Channel (Schannel) RSA Temporary Key Handling EXPORT_RSA Ciphers Downgrade MitM (FREAK)
- 2015-01-06 – 2015-0204 – OpenSSL RSA Temporary Key Handling EXPORT_RSA Ciphers Downgrade MitM (FREAK)
- 2014-04-07 – 2014-0160 – OpenSSL TLS Heartbeat Extension Packets Handling Out-of-bounds Read Remote Memory Disclosure (Heartbleed)
- 2012-03-13 – 2012-0152 – Microsoft Windows Remote Desktop Protocol Terminal Server RDP Packet Parsing Remote DoS
- 2009-05-16 – 2013-0229 – MiniUPnPd SSDP Handler minissdp.c ProcessSSDPRequest Function Malformed Input Handling Remote DoS
- 2002-02-12 – 2002-0012 – Multiple Vendor Malformed SNMP Trap Handling DoS
- 2002-02-12 – 2002-0013 – Multiple Vendor Malformed SNMP Message-Handling Remote DoS
- 2001-12-20 – 2001-0876 – Microsoft Windows Universal Plug and Play NOTIFY Directive URL Handling Remote Overflow
- 2001-12-20 – 2001-0877 – Microsoft Windows Universal Plug and Play NOTIFY Request Remote DoS
- 2001-07-25 – 2001-0540 – Microsoft Windows Terminal Server RDP Request Handling Memory Exhaustion Remote DoS
Roytman prefaces that list with a comment that the “top 10 list perhaps more in line with what was expected – but equally unexpected!” Indeed, that is certainly true. Expected? A high-profile vulnerability disclosed in 2014 that was seen being widely exploited then, and subsequent years, something sorely missing from the DBIR list even after the corrected CVE IDs were factored in. This is the type of vulnerability almost everyone expected to top the list due to how easy it was to exploit, and knowing that it had been used heavily. This speaks to the original point and reason for the first blog; all the data ‘science’ in the world that produces highly questionable results should not be taken as gospel. Even if the methodology was sound, it doesn’t mean the data being used was.
Unfortunately, Roytman’s revised list deviates quickly into the “equally unexpected“. The first DBIR report list had a single denial of service (DoS) attack on the list, which stood out as odd. The revised list bumped that number to two which was a bit odd. The most recent list expands that to six DoS attacks, which is highly questionable for one reason or another. First, you can question the data set and methodology leading to this conclusion, but let’s say that passes muster solely for the sake of argument. Second, you can question why so many denial of service attacks are on a top 10 most exploited vulnerabilities list, framed in a context that uses the word ‘compromise’, riding on the back of a report centered around data breaches. These attacks are not causing attackers to gain privileges or steal data. They are a nuisance most of the time, or potentially used in serious DoS attacks other times. It makes one question why DoS attacks weren’t dropped from the results completely, and disclaimed as such! Or, generate two lists; one with results based on raw data, DoS and everything, and a second list that focuses on vulnerabilities that may allow for privileges to be gained and a real compromise to happen.
I cannot stress this enough. Using a term like ‘Indicator of Compromise’ (IOC) in the context of DoS attacks is disingenuous and misleading. Going back to Roytman’s introduction into this section where he makes a comment about seeing the trees (referring to the classic metaphor), I find it ironic as that sums up the purpose of my blog. The DBIR was written as if they could only see the trees (data points), and not the forest (bigger picture), which is what many people took issue with.
One point that I overlooked on the original list, and still appears on the new list, is the presence of FREAK (twice even). Fortunately for me, Thomas Ptacek does a great job explaining why FREAK is likely on the list, but absolutely should not be. Using Roytman’s blog and data, he calculates that attackers would have spent $332,183,325 using Amazon EC2 to exploit FREAK. He continues by citing one of the researchers who discovered FREAK explaining one way that a high number of false positives are generated on that particular vulnerability. He goes on to drive the point home, quoting the researcher and commenting that it likely has not been exploited in the wild by an average attacker.
Roytman essentially dismisses all of this in his blog post while saying that I am correct, that “IDS alerts generate a ton of false positives, vulnerability scanners often don’t revisit signatures, CVE is not a complete list of vulnerability definitions. But those are just the trees, and we’ll get to them later.” Unfortunately, he doesn’t get to it later in a way that provides meaningful insight into the questions about the data and conclusions. Dan Guido wrote an excellent summary of why the DBIR vulnerability section has issues, and factors in Roytman’s latest blog, breaking it all down in a manner that highlights the flaws. Even with the revised list, it is still missing the US-CERT top 30 previously cited, the Microsoft data, and the recently disclosed ‘top PoC exploits distributed on social media‘. At some point, one would logically conclude these lists should have more overlap. One thing I would love to see from Verizon and Kenna is a detailed explanation of their methodology as it relates to detecting client-side exploits, that appear to be the defacto standard for infecting tens, maybe hundreds of thousands of hosts, every year to create botnets.
I want to look at this from one more perspective, because I think it beautifully highlights how vulnerability analysis is a moving target, but in this case for all the wrong reasons. While most vulnerability aggregators and analysts are constantly adapting to new variations of vulnerabilities, new sources of vulnerability information, and new players in the game with wildly different styles of disclosure, others that come along after the data is generated and do analysis frequently seem to lose perspective in my experience. I believe this is such a case and is best illustrated in what the DBIR top 10 looks like over three revisions in less than two weeks. Yes, I know Roytman’s list isn’t officially the DBIR list, but he generated the initial data and then opted to do a different form of analysis putting it forth as something that is more representative due to applying his analysis to a more limited dataset that he presumably trusts more (i.e. the Kenna aggregate dataset). One has to wonder if this was brought up to Verizon as a better way to approach the list, and if so, why was it rejected.
The following lists show the evolution of the CVEs that appear on the top 10, with strikeout denoting the typos between the original DBIR and Gabe’s clarification which is reflected in current downloads, underlining to show any denial of service attacks, and bold to show the new CVE IDs that appear with Roytman’s reworking.
DBIR - DBIR Revised - Roytman Blog 2015-1637 – 2015-1637 - 2015-1637 2015-0204 – 2015-0204 - 2015-0204 20
12-1054 – 2002-1054 - 2014-0160 20 11-0877 – 2001-0877 - 2012-0152 2003-0818 – 2003-0818 - 2013-0229 2002-0126 – 2002-0126 - 2002-0012 2002-0953 – 2002-0953 - 2002-0013 2001-0876 – 2001-0876 - 2001-0876 2001-0680 – 2001-0680 - 2001-0877 1999-1058 – 1999-1058 - 2001-0540
In my mail to Roytman asking for a sample data set, I suggested that it would be interesting to see him generate a list using his methodology, but remove any DoS attack (six of his ten), so the top list only included exploits that could achieve remote privileges of some sort. He replied to me with:
… and again, awesome idea. One of those all-too-simple in retrospect, damnit why didn’t I think of it earlier things.
I found this interesting thinking back to his use of the forest and the trees metaphor.
Verizon released their yearly Data Breach Investigations Report (DBIR) and it wasn’t too long before I started getting asked about their “Vulnerabilities” section (page 13). After bringing up some highly questionable points about last year’s report regarding vulnerabilities, several people felt that the report did not stand up to scrutiny. With a few questions leveled at me, I was curious if Verizon and partners learned from last year.
This year’s vulnerability data was provided by Kenna Security (formerly Risk I/O), and Verizon “also utilized vulnerability scan data provided by Beyond Trust, Qualys and Tripwire in support of this section.” So the data isn’t from a single vendor, but at least four vendors, giving the impression that the data should be well-rounded, and have less questions than last year.
From the report:
Secondly, attackers automate certain weaponized vulnerabilities and spray and pray them across the internet, sometimes yielding incredible success. The distribution is very similar to last year, with the top 10 vulnerabilities accounting for 85% of successful exploit traffic. While being aware of and fixing these mega-vulns is a solid first step, don’t forget that the other 15% consists of over 900 CVEs, which are also being actively exploited in the wild.
This is not encouraging, as they have 10 vulnerabilities that account for an incredible amount of traffic, and the footnoted list of CVE IDs suggests the same problems as last year. And just like last year, the report does not explain the methodology for detecting the vulnerabilities, does not include details about the generation of the statistics, and provides a loose definition of what “successfully exploited” means. Without more detail it is impossible for others to reproduce their results, and extremely difficult to explain or disclaim them as a third party reading the report. Going to the Kenna Security page about this report doesn’t really yield much clarity, but does highlight another potential flaw in the methodology:
Kenna’s Chief Data Scientist Michael Roytman was the primary author of this year’s “Vulnerabilities” chapter, analyzing a correlated threat data set that spans 200M+ successful exploitations across 500+ common vulnerabilities and exposures from over 20,000 enterprises in more than 150 countries.
It’s subtle, but notice they went through a data set that spans exploitations across “500+ common vulnerabilities and exposures”, also known as CVE. If the data is only looking for CVEs, then there is an incredibly large bias at play from the start, since they are missing at least half of the disclosed vulnerabilities. More importantly, this becomes a game of fractions that the industry is keen to overlook at every opportunity:
- CVE represents approximately half of the disclosed vulnerabilities.
- Vulnerability scanners and IPS/IDS don’t have signatures for all CVE IDs, so they look for some fraction of CVE.
- Detection signatures are often flawed, leading to false positives and false negatives, meaning they are actually detecting a fraction of the CVE IDs they intend to.
Another crucial factor in how this data is generated is in the detection of the exploits. Of the four companies contributing data, one was founded in 2009 (Risk I/O / Kenna) and another in 2006 (Beyond Trust, in the context of this discussion). That leaves Qualys (founded 1999) and Tripwire (founded 1997), who are likely the sources of the signatures that detected the vulnerabilities. For those around in the late 90s, the vulnerability landscape was very different than today, and security products based on signatures back then are in some ways considered rudimentary compared to today. Over time, most security products do not revisit older signatures to improve them unless they have to, often due to customer demand. Newly formed companies basically never go back and write signatures for vulnerabilities from 1999. So it stands to reason that the detection of these issues are based on Qualys and/or Tripwire’s detection capabilities, and the signatures detecting these vulnerabilities are likely outdated and not as well-constructed as compared to their more recent signatures.
That leads us to ask, how many vulnerabilities are these companies really looking for? Where did the detection signatures originate and how accurate are they? While the DBIR does disclaim that the data used is a sample, they also admit “bias undoubtedly exists”. However, they don’t warn the reader of these extremely limiting caveats that put the entire data set into a perspective clearly showing strong bias. This, combined with the lack of detailed methodology for how these vulnerabilities are detected and correlated to measure ‘success’, ultimately means this data has little value other than for inclusion in pedestrian reports on vulnerabilities.
With that in mind, I can only go by what information is available. We’ll start with the concise list of the top 10 CVE IDs these four vulnerability intelligence providers say are being exploited the most, and Verizon labels as “successfully exploited”:
- 2015-03-05 – CVE-2015-1637 – Microsoft Windows Secure Channel (Schannel) RSA Temporary Key Handling EXPORT_RSA Ciphers Downgrade MitM (FREAK)
- 2015-01-06 – CVE-2015-0204 – OpenSSL RSA Temporary Key Handling EXPORT_RSA Ciphers Downgrade MitM (FREAK)
- 2012-02-25 – CVE-2012-1054 – Puppet k5login File Symlink File Overwrite Local Privilege Escalation
- 2011-07-19 – CVE-2011-0877 – Oracle Enterprise Manager Grid Control Instance Management Unspecified Remote Issue (2011-0877)
- 2004-02-10 – CVE-2003-0818 – Microsoft Windows ASN.1 Library (MSASN1.DLL) BER Encoding Handling Remote Integer Overflows
- 2002-01-15 – CVE-2002-0126 – BlackMoon FTP Server Multiple Command Remote Overflow
- 2001-12-26 – CVE-2002-0953 – PHPAddress globals.php LangCookie Parameter Remote File Inclusion
- 2001-12-20 – CVE-2001-0876 – Microsoft Windows Universal Plug and Play NOTIFY Directive URL Handling Remote Overflow
- 2001-04-13 – CVE-2001-0680 – QVT/Net / Term FTP Server LIST Command Traversal Remote File Access
- 1999-11-22 – CVE-1999-1058 – Vermillion FTPD Long CWD Command Handling Remote Overflow DoS
This list should raise serious red flags for anyone passingly familiar with vulnerabilities. Not only do we have very odd ‘top 10’ lists from last year and this year, but there is little overlap between them. How does 2015 show a top 10 list exploiting eight vulnerabilities with CVE identifiers between 1999 and 2002, meaning they had been exploited so much as many as thirteen years later, only to see them all drop off the list this year, to be replaced by new 15+ year old vulnerabilities? In addition to this oddity, there are more considerations leading to my top 10 list of questions about their list:
- How does a local vulnerability based on a symlink overwrite flaw (CVE-2012-1054) make it into a top 10 list of “85% of successful exploit traffic“?
- How does a local vulnerability in Puppet rank #3 on this list, given the install base of Puppet as compared to Adobe or Java?
- If they are detecting exploits on the wire, shouldn’t we see Java, Adobe Reader, and Adobe Flash somewhere on the list? The “Slow and steady—but how slow?” section even talks about time-to-exploit for Adobe.
- Why doesn’t this list remotely match US-CERT’s “Top 30 Targeted High Risk Vulnerabilities” that includes vulnerabilities back to 2006, but not a single one listed above?
- How does a vulnerability that by all accounts is so vague, that it has to be distinguished by the vendor issued CVE ID (CVE-2011-0877), have a signature and get exploited so much?
- How does a vulnerability in Oracle Enterprise Manager Grid Control show up as #4, when no Oracle Database vulnerabilities appear?
- How do you distinguish an FTP LIST command exploit from one vendor to another? (e.g. CVE-2005-2726, CVE-2002-0558, CVE-2001-0933, CVE-2001-0680) According to the one-liner methodology, this is done via pairing SIEM data, suggesting that BlackMoon and Vermillion are that popular today.
- Yet, how does a remote DoS in an Windows-based FTP program that doesn’t appear to have been distributed for a decade make it on this list? Are people really conducting targeted DoS attacks against this software?
- Is BlackMoon FTP Server really that prevalent to be exploited so often?
- Or is there a problem in generating this data, which would be more easily attributed to loose signatures detecting FTP attacks regardless of vendor?
Figure 12 in the report, which is described as “Count of CVEs exploited in 2015 by CVE publication date” is a curious thing to include, as the CVE publication date is very distinct from the vulnerability disclosure date. While a large percentage of CVE publication dates are within seven days of disclosure, many are not (e.g. CVE-2015-8852 disclosed 2015-03-23 and CVE publication on 2016-04-26). Enough to make this chart questionable as far as the insight it provides. Taking the data as presented, are they really saying that only ~ 73 vulnerabilities with a 2015 ID were successfully exploited in 2015 across “millions of successful real-world exploitations“? Given that 40 vulnerabilities were discovered in the wild, 33 of which have 2015 CVE IDs, that means that only ~40 other 2015 vulnerabilities were successfully exploited? If that is the takeaway, how is the security industry unable to stop the increasing wave of data breaches, the same kinds that led to this report? Something doesn’t add up here.
While people are cheering about the DBIR disclaiming there is sample bias (and not really enumerating it), they ignore the measurement bias, don’t speak to publication bias, don’t explain the attrition bias between 2015 and 2016, or potential chaining bias. As usual, the media is happy to glom onto such reports without asking any of these questions or providing critical analysis. As an industry, we need to keep challenging metrics and statistics to ensure they are not only accurate, but provide meaning that benefits us.
4/28/2016 – According to Gabe (@gdbassett), the list of CVEs in the DBIR is incorrect. He posted a new list of CVEs (mostly the same) via Twitter in a reply to Andreas Lindh who was surprised at the top ten list of vulnerabilities as well. Gabe also confirmed that afterwards they “compared the figure CVEs (listed above) against the raw data. After removing non-confirmed breaches, they match.” He went on to link to another source showing “data” about one of the CVEs, that really doesn’t mean anything without more context. Meanwhile, Michael Roytman, who did the vulnerability section of the report confirmed that he/Kenna would be responding to this blog with one of their own.
I hate to harp on simple transposition style mistakes in a report, but given the severity behind using numeric identifiers for vulnerabilities, it seems like that should have been double and triple-checked. Even then, I don’t understand how someone familiar with vulnerabilities could see either list and not ask many of the same questions I did, or provide more information in the report to back the claims. That said, let’s look at Gabe’s new list of CVEs. Bold and links are used to highlight the new ones:
- 2015-03-05 – 2015-1637 – Microsoft Windows Secure Channel (Schannel) RSA Temporary Key Handling EXPORT_RSA Ciphers Downgrade MitM (FREAK)
- 2015-01-06 – 2015-0204 – OpenSSL RSA Temporary Key Handling EXPORT_RSA Ciphers Downgrade MitM (FREAK)
- 2004-02-10 – 2003-0818 – Microsoft Windows ASN.1 Library (MSASN1.DLL) BER Encoding Handling Remote Integer Overflows
- 2002-07-22 – 2002-1054 – Pablo FTP Server LIST Command Arbitrary Directory Listing Remote Information Disclosure
- 2002-01-15 – 2002-0126 – BlackMoon FTP Server Multiple Command Remote Overflow
- 2001-12-26 – 2002-0953 – PHPAddress globals.php LangCookie Parameter Remote File Inclusion
- 2001-12-20 – 2001-0877 – Microsoft Windows Universal Plug and Play NOTIFY Request Remote DoS
- 2001-12-20 – 2001-0876 – Microsoft Windows Universal Plug and Play NOTIFY Directive URL Handling Remote Overflow
- 2001-04-13 – 2001-0680 – QVT/Net / Term FTP Server LIST Command Traversal Remote File Access
- 1999-11-22 – 1999-1058 – Vermillion FTPD Long CWD Command Handling Remote Overflow DoS
The list gained another FTP server issue that doesn’t necessarily lead to privileges, and another remote denial of service attack, while losing the Puppet symlink (CVE-2012-1054) and the vague Oracle Enterprise Manager issue (CVE-2011-0877). All said and done, the list is just as confusing as before, perhaps more so. That gives us four FTP vulnerabilities, only one of which leads to remote code execution, and two denial of service attacks, that gain no real privileges for an attacker. As Andreas Lindh points out, that I failed to highlight, having a man-in-the-middle vulnerability occupy two spots on this list is also baffling in the context of the volume of attacks stated. Also note that with the addition of CVE-2002-1054 (Pablo FTP), there are now two vulnerabilities that appear on the DBIR 2015 and DBIR 2016 top ten CVE list.
Hopefully the forthcoming blog from Michael Roytman will shed some light on these issues.