Vulnerability Research In Numbers
Posted by jericho
I’m so far behind in my daily routine that I missed Thomas Ptacek’s post on Vuln Research In Numbers. Fortunately, Dave Aitel referenced the blog entry, which prompted me to check it out. I so desperately want Ptacek to run his numbers against a complete OSVDB data set, but alas, I know that we do not have a complete data set for 2004. Symantec’s BID database has some problems with consistency, citing sources (aka provenance) and missing vulnerabilities (which plague most VDBs to one degree or another). In my mind, OSVDB tends to be more complete than most VDBs and maintains a creditee field, but due to a lack of volunteers we just don’t have enough entries public for him to generate the same statistics. Some day maybe! Until then, this is a very neat post with a different slant.

I wonder whether we should prioritize our efforts to have a complete set of classified vulnerabilities for high-priority apps, like Internet Explorer?
Also, have you guys thought about expanding the responsibilities of the new data mangler to include information that allows for data mining? I think adding version information as part of the new data mangler’s job would allow people to play with the data.
Christian
One more thing regarding data mining. It seems like the downloadable version of OSVDB only contains stable vulnerabilities. What about including new vulnerabilities as well? Potentially someone can already run some stats on the date and summary field…
Christian
Adding the product/version information is usually the most time-consuming part of the mangling process. For almost two years I handled the bulk of the daily NDM stuff and found time to work on other NDM-related projects (old vulns, other missing timeframes, etc). For the last few months, I’ve fallen behind more and more. It’s gotten so bad that we had to recruit Lyger to help with maintaining cross-references to other databases, and swtornio to handle some of the NDM work as well. As time permit(ted|s) I will add the product/version and other information, but it usually just takes too much time.
As for the daily export, we can certainly discuss changing the information that is included in it. There are pros and cons and these decisions were made a few years back. Revisiting those choices may be a good idea.