Jeremiah Grossman: Black Box vs White Box. You are doing it wrong.

Wednesday, October 28, 2009

A longstanding debate in Web application security, heck, in all of application security, is which software testing methodology is best, that is, best at finding the most vulnerabilities. Is it black box (aka vulnerability assessment, dynamic testing, run-time analysis) or white box (aka source code review, static analysis)? Some argue that a combination of the two yields the most comprehensive results, and they could well be right. Closely tied to the discussion is the investment of resources (time, money, skill) required, because getting the most security bang for the buck is obviously very important.

In my opinion, choosing between application security testing methodologies based upon a vulnerabilities-per-dollar metric is a mistake. They are not substitutes for each other, especially in website security. The reasons for choosing one particular testing methodology over the other are very different. Black and white box testing measure very different things. Identifying vulnerabilities should be considered a byproduct of the exercise, not the goal. When testing is properly conducted, the lack or reduction of discovered vulnerabilities demonstrates improvement of the organization, not the diminished value of the prescribed testing process.

If you reached zero vulnerabilities (unlikely), would it be a good idea to stop testing? Of course not.

Black box vulnerability assessments measure the hackability of a website given an attacker with a certain amount of resources, skill, and scope. We know that bad guys will attack essentially all publicly facing websites at some point in time, so it makes sense for us to learn about the defects before they do. As such, black box vulnerability assessments are best defined as an outcome based metric for measuring the security of a system with all security safeguards in place.

White box source code reviews, on the other hand, measure and/or help reduce the number of security defects in an application resulting from the current software development life-cycle. In the immortal words of Michael Howard regarding Microsoft’s SDL mantra, “Reduce the number of vulnerabilities and reduce the severity of the bugs you miss.” Software has bugs, and that will continue to be the case. Therefore it is best to minimize them to the extent we can in an effort to increase software assurance.

Taking a step back, you might reasonably select a particular product/service using vulns-per-dollar as one of the criteria, but again, not the testing methodology itself, just as you wouldn’t compare the value of network pen-testing against patch management, or firewalls against IPS. Understanding first what you want to measure should guide the selection of a testing methodology.

3 comments:

chriscla said...: "Black box vulnerability assessments measure the hackability of a website given an attacker with a certain amount of resources, skill, and scope."

Unfortunately, the skill, resources, and scope that attackers have always exceed the skill, resources, and scope that companies are willing to pay for during a blackbox pen-test.

A good compromise is to gauge how likely a given vulnerability is to be discovered and exploited. This method is accurate enough, even if it is a bit subjective.; October 30, 2009 at 6:53 AM
James Landis said...: Black box is not synonymous with runtime testing, and white box is not synonymous with static analysis. Can we please stop perpetuating this?

I'd agree that all static analysis is "white box", but not all "white box" testing is static. All "black box" is runtime, but not all runtime is "black box".

Does anyone else care about this?; October 30, 2009 at 1:41 PM
Anonymous said...: Agreeing with James. White-box testing can be dynamic too. It is all about the level of access one has for testing the SUT. So, the classification presented in this article is erroneous. In fact, "smart fuzzing" is all about learning from the internals of the SUT, which implies that (in some cases) you have its code (source or binary) to generate intelligent inputs.

-Sanjay; July 9, 2011 at 5:51 AM