Jeremiah Grossman: Mythbusting: Static Analysis Software Testing

Many Web security professionals believe that because Static Analysis Software Testing (SAST) has access to the source code and / or the binary of an application, it can deliver “100% code coverage.” Proponents of this assertion also claim that SAST therefore offers a more comprehensive vulnerability analysis than Dynamic Analysis Software Testing (DAST). This belief is a myth.

Arbitrarily declaring that one form of testing is superior to another is like saying that a household thermostat is better at measuring heat than a meat thermometer. Sure, both devices do measure heat, but that’s where the similarity ends. Source code access absolutely has its benefits, but just like comparing the functions of a room temperature gauge and a meat temperature thermometer, there are many other important distinctions between SAST and DAST that must be considered.

SAST and DAST should be used for different purposes, because they are adept at identifying different classes of vulnerabilities, and at different stages of the Software Development Life-Cycle (SDLC). That is precisely why SAST and DAST should be considered complementary, and NOT competitive with one another. As when performing most tasks, to achieve the best results you need to match the right tool for the job at hand.

SAST Is the Better Choice During Code Writing & QA

SAST, for instance, is ideal for helping to reduce the number of security defects introduced into an application during the code writing and unit testing phases of the SDLC. However, SAST products lack the ability to discover defects that are introduced during the requirements stage or just prior to deployment.

Nevertheless, many organizations have found value in SAST, because any amount of accurate vulnerability data that developers have as they write or commit code can mean the difference between fixing a vulnerability within days as opposed to weeks, months − or maybe never.

DAST Is the Better Choice When the Application Is Functional

DAST is typically deployed in late-stage QA or production when the application is functional, and is therefore ideally suited for testing how truly secure a system is against an attacker who has a given amount of skill, time, and access to the website. As such, the comprehensiveness of DAST should meet or exceed the “bandwidth” of the adversary that the organization would like to repel. That’s why DAST is often considered to be superior for measuring outcomes − after SAST, SDLC processes, and security controls have been implemented.

Busting the Myth About SAST and Its “100% Coverage”

Based on the information about DAST and SAST presented to this point, let’s see if we can bust the myths about SAST and the “100% code coverage” that some claim it can provide.

1. SAST Fails to Find Vulnerabilities Located Outside the Code

SAST searches are confined to checking either the source code or binary code, while many very serious website vulnerabilities are located elsewhere. In a recent example of an issue that lies beyond the reach of SAST, “Bloomberg News” was able to obtain financial earnings data for Disney, NetApp, and other publicly traded companies hours ahead of any other media organization. “Bloomberg News” then funneled the information to its customer subscriber terminals. Obviously, if you’re a stock trader, having access to earnings data before anyone else does is like discovering nuggets of pure gold. Surprisingly, Bloomberg was able to break through the extensive security on the top-level websites by using a very common Web hacking technique called “Predictable Resource Location (PRL).” PRL is so common, in fact, that at WhiteHat we have found during testing of customer websites that it is an issue for 14% of those sites.

In this example of “Bloomberg News” accessing earnings announcements before their public release, the companies had uploaded their financial reports to a secret location on their websites just before the stock market closed on the day prior to the “official” public announcements. The companies then hyperlinked the content of their reports immediately after the New York Stock Exchange’s closing bell marked the end of trading for the day.

Unfortunately, the URLs to the “secret & secure” financial reports that Bloomberg published prior to their official release were easily guessed. For example, Q2 2010 earnings for Disney were found at www.disney.com/earnings/Q22010release.html and Q3 2010 earnings were at www.disney.com/earnings/Q32010release.html.

It certainly didn’t take a genius at “Bloomberg News” to figure out where the Q4 earnings report might be found. Basically, someone on the Bloomberg staff periodically checked the “guessed” URLs, waiting for the exact moment when the documents associated with them became available, i.e., were uploaded. While the files were technically hidden, “security by obscurity” was the only safeguard to prevent a breach of security.

These simple hacks, which were possible due to minor security mistakes, nevertheless caused extremely serious breaches of Web security for several world-famous companies. And recently, both the national and international press have reported many similar examples of security breaches based on simple methods for hacking “highly secure” websites.

Clearly, SAST – with its “100% code coverage” – fails to find these types of issues. DAST on the other hand can and does find them, because it can perform an educated brute force search to discover these types of files.
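To make the “educated” part of that brute force search concrete, here is a minimal sketch of how a tester might derive candidate URLs from one known earnings URL. The function name next_quarter_urls and the assumption that the pattern is always “Q”, a quarter digit, then a four-digit year are illustrative only. A real scanner would go on to request each candidate and inspect the response, and only against sites it is authorized to test.

```python
import re
from typing import List


def next_quarter_urls(known_url: str, count: int = 2) -> List[str]:
    """Guess URLs for upcoming quarters from one known 'Q<n><yyyy>' URL.

    Hypothetical helper: assumes the site keeps the same naming pattern.
    """
    match = re.search(r"Q([1-4])(\d{4})", known_url)
    if match is None:
        return []
    quarter, year = int(match.group(1)), int(match.group(2))
    prefix, suffix = known_url[:match.start()], known_url[match.end():]

    guesses = []
    for _ in range(count):
        # Advance one quarter, rolling over into the next year after Q4.
        quarter += 1
        if quarter > 4:
            quarter, year = 1, year + 1
        guesses.append(f"{prefix}Q{quarter}{year}{suffix}")
    return guesses


if __name__ == "__main__":
    for url in next_quarter_urls("www.disney.com/earnings/Q32010release.html"):
        print(url)
```

Nothing in the site’s source code needs to be wrong for this to work, which is why a scan of the code alone would not flag it.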

With PRL affecting 14% of the websites that we test at WhiteHat, it is on our Top Ten List of Most Pervasive Vulnerabilities. Two other types of vulnerabilities NOT typically found in source code are “Information Leakage,” found in 64% of websites, and “HTTP Response Splitting,” discovered in 9% of websites.

2. SAST Is Unable to Find Vulnerabilities in Third-Party Code Unless It Has Access to That Code

One of the most often-mentioned shortcomings of a SAST deployment is the difficulty it has in capturing the entire code base of an application. That’s because the code is often spread across the enterprise in separate repositories, resides in different business units that may share the same code, and/or is located in compiled libraries supplied by third-party vendors. While “100% code coverage” is possible to achieve in theory, the reality is much different.

If you are using third-party ISVs that are unwilling to provide source code, which is often the case, then the only alternative is to rely on binary analysis. However, what if the third-party software your application relies upon is also hosted at a different physical location, such as a remote website or XML Web service? In that case, you must request authorization to test the software, and even if the vendor grants your request, only DAST can perform a test under those circumstances.

3. SAST Is Unable to Prioritize Vulnerability Resolution Based on Exploitability

Let’s say a security professional wants to prioritize remediation efforts in the context of a particular vulnerability’s: (1) likelihood of discovery, (2) difficulty of discovery, (3) potential negative business and/or technical impact, (4) Web Application Firewall (WAF) virtual-patching potential, and (5) the relative skill required for the exploitation to be successful. These are the most critical metrics for prioritizing “risk.” Given these factors to consider, business stakeholders are left with the difficult decision of whether to: (A) transfer product development resources to fix a vulnerability that may or may not be exploited, risking potential financial loss IF it is, or (B) ignore the issue in the short-term in order to publish the new feature(s) that WILL definitely cost the company money if they are not delivered on time.

Experienced Web security professionals know this situation of “making a choice that’s really a guess” occurs quite often, and that an IT executive must have the insight to make the most well-informed security risk decisions. SAST alone cannot provide this kind of insight, because it is limited to simply analyzing the source code. Even if SAST could provide 100% code coverage − whether early in the software development lifecycle or at any other time − it can never deliver the degree of intelligence that DAST can.

4. SAST Is Unable to Find Vulnerabilities Caused by Intermediary Components

Websites can be an incredibly complex collection of Web servers, Web applications, application servers, databases, load balancers, caching proxies, Web application firewalls, CDNs, and more. The interaction between all of these discrete components as they process user-supplied input often causes unexpected security issues that are only discoverable at run-time or in production. One component in the flow might encode or decode a piece of data before passing it on to the next layer, and then that layer encodes yet another piece of data and passes it on to the next layer, and so on. Security issues like these cannot be replicated in QA. Furthermore, SAST can analyze only code that is at rest, and it has no contextual knowledge of any intermediary components, let alone the ability to test them under these conditions.

For example, in one of the checks used at WhiteHat Sentinel, we HTML entity encode a standard Cross-Site Scripting (XSS) test, then URL hex encode that value, and finally base64 encode the result (the order that the sample value below decodes to). The result is an almost nonsensical, backwards, triple-encoded, filter-evasion / canonicalization attack that would seem to have absolutely no chance of succeeding. Only it does.

app.cgi?foo=JTI2JTZjJTc0JTNiJTczJTYzJTcyJTY5JTcwJTc0JTI2JTY3JTc0JTNiJTYxJTZjJTY1JTcyJTc0JTI4JTMxJTI5JTI2JTZjJTc0JTNiJTJmJTczJTYzJTcyJTY5JTcwJTc0JTI2JTY3JTc0JTNi

Astonishingly, a check such as the one above has indeed found a non-trivial number of vulnerabilities. On the way in, the string has no “special” characters, so it is allowed to cross the first filter gate. However, as the data passes through the system, each hardware component or application layer may recognize an encoding, decode it, and then pass it along to the next component − until the fully decoded string eventually finds its way back to the Web browser in a vulnerable XSS state! We’ve also found similar vulnerabilities in Content Spoofing, HTTP Response Splitting, and SQL Injection.
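The layering can be reproduced with a short Python sketch. Decoding the sample value shows base64 on the outside, URL hex encoding beneath it, and HTML entities at the core, so the sketch applies the encodings in that order. The first-gate filter and the three decoding “layers” are hypothetical stand-ins for real components, and the underlying test string, a plain script-tag alert, is simply what the sample value decodes to.

```python
import base64
import html
from urllib.parse import unquote

# Value copied from the foo parameter in the example request above.
ARTICLE_PAYLOAD = (
    "JTI2JTZjJTc0JTNiJTczJTYzJTcyJTY5JTcwJTc0JTI2JTY3JTc0JTNi"
    "JTYxJTZjJTY1JTcyJTc0JTI4JTMxJTI5JTI2JTZjJTc0JTNiJTJmJTcz"
    "JTYzJTcyJTY5JTcwJTc0JTI2JTY3JTc0JTNi"
)


def build_payload(test_string: str) -> str:
    """HTML entity encode, then URL hex encode every byte, then base64."""
    entity_encoded = html.escape(test_string)
    hex_encoded = "".join(f"%{b:02x}" for b in entity_encoded.encode("ascii"))
    return base64.b64encode(hex_encoded.encode("ascii")).decode("ascii")


def naive_filter_allows(value: str) -> bool:
    """Hypothetical first gate: reject anything with markup-like characters."""
    return not any(ch in value for ch in "<>&%\"'")


def decode_through_layers(value: str) -> str:
    """Simulate three hypothetical layers, each undoing one encoding."""
    after_layer_1 = base64.b64decode(value).decode("ascii")  # base64 removed
    after_layer_2 = unquote(after_layer_1)                   # %xx removed
    return html.unescape(after_layer_2)                      # entities removed


if __name__ == "__main__":
    print(naive_filter_allows(ARTICLE_PAYLOAD))
    print(decode_through_layers(ARTICLE_PAYLOAD))
```

No single layer in this sketch does anything unreasonable on its own; the problem only appears when the layers are chained, which is why it takes a test against the running system to find it.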

While it may be easy to classify these vulnerabilities as edge-cases, in our experience of assessing over 3,000 websites − from start-ups to Fortune 500 companies − and performing thousands of filter-bypass experiments, there are hundreds of edge-cases like the one described here, and websites are susceptible to a wide variety of such techniques. Most double- and triple-layered attacks succeed against a single-digit or sub-single-digit percentage of total websites, but a surprising handful reach low double-digit percentages (e.g., 10% or 12%).

5. SAST Is Unable to Find Business Logic Flaws

To be fair, DAST is unable to find them, either. But that’s the point: “100% code coverage” is basically meaningless when it comes to Business Logic Flaws, which make up an extremely important segment of website security issues. For example, let’s say that a website allows users to reset their password if they’ve forgotten it. In many cases users must enter their email address and correctly answer a previously defined secret question. The secret question is personal, making it easy for a person to remember, but also difficult − hopefully − for others, i.e., attackers, to guess. Often the level of difficulty is less than assumed, such as “What is your favorite color?” or “What is your date of birth?” That’s because many systems using this type of password security method contain pools of likely answers that are too small to prevent brute force attacks designed to access user accounts.
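A rough back-of-the-envelope sketch shows why small answer pools matter. The pool sizes below are illustrative assumptions, not measurements, and the arithmetic assumes every answer is equally likely; real answers are unevenly distributed, which only helps the attacker.

```python
def expected_guesses(pool_size: int) -> float:
    """Average number of tries to hit one answer drawn uniformly from the pool."""
    return (pool_size + 1) / 2


def success_probability(pool_size: int, attempts: int) -> float:
    """Chance of a correct guess within `attempts` distinct tries."""
    return min(attempts, pool_size) / pool_size


# Illustrative pool sizes (assumptions for this sketch only).
POOLS = {
    "favorite color": 12,
    "date of birth (any day in an 80-year span)": 365 * 80,
}

if __name__ == "__main__":
    for question, size in POOLS.items():
        print(
            f"{question}: pool={size}, "
            f"avg guesses={expected_guesses(size):.1f}, "
            f"P(success in 1000 tries)={success_probability(size, 1000):.3f}"
        )
```

Whether a given pool is “too small” depends on how many attempts the recovery form permits, which is a design decision that no scanner can judge on its own.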

This class of issue / attack, which is called Weak Password Recovery Validation, is just one of many types of Business Logic Flaws that SAST or DAST scanners are unable to identify without human intervention. The fact is that you can use the most advanced automation – even if it has access to the application source code and binary – and still fail to find certain types of vulnerabilities. Furthermore, if your SAST or DAST scanner shows clean results, you could have problems of the worst kind – the ones you don’t know you have. Many companies have discovered this only after the damage to their sites and customers had been done.

Essentially, Business Logic Flaws must be discovered manually, by analyzing source code or testing for them at run-time. If possible, do both procedures; but “manually” must be the method.

6. SAST Won’t Find Vulnerabilities in Code That It’s Been Told Is Safe

One of the most vital elements of writing secure software is input validation. This is the act of testing incoming data to make sure it conforms to what is expected before allowing it to be used. Rock solid input validation, which includes data normalization and checks for length, character-set, and format, is peerless in its ability to wipe out vulnerabilities such as Cross-Site Scripting, SQL Injection, Command Injection, etc. To check if a particular data path is being protected by an input validator, a SAST scanner must guess at the right subroutine or be told the location by its human operator. In either case, the problem is that input validation libraries / subroutines are often not perfectly implemented. That is, input validators will often contain security gaps that DAST can find ways to bypass. Clearly there is a difference between an input validator that’s applied and one that actually works well. SAST can only reliably verify the former case, potentially missing many vulnerabilities, while DAST excels at the latter and finds those missed issues.
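The difference between a validator that is applied and one that works can be sketched in a few lines of Python. Both functions below are hypothetical examples. The first is present on the data path but checks only for one literal string, while the second normalizes the input and then enforces length, character set, and format. The allowed pattern is an assumption chosen for a username-like field.

```python
import re
import unicodedata


def weak_validator(value: str) -> bool:
    """Blocklist check: present on the data path, but easy to bypass."""
    return "<script" not in value


def strict_validator(value: str, max_length: int = 32) -> bool:
    """Normalize first, then allowlist by length, character set, and format."""
    normalized = unicodedata.normalize("NFKC", value)
    if not 1 <= len(normalized) <= max_length:
        return False
    return re.fullmatch(r"[A-Za-z0-9_-]+", normalized) is not None


if __name__ == "__main__":
    probes = [
        "alice_01",
        "<SCRIPT>alert(1)</SCRIPT>",      # case variation
        "<img src=x onerror=alert(1)>",   # different tag entirely
    ]
    for probe in probes:
        print(probe, weak_validator(probe), strict_validator(probe))
```

A tool that only confirms a validator is called on the data path would treat both functions as equivalent; sending the probes at run-time is what reveals the gap in the first one.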

7. SAST Has A Difficult Time Finding Vulnerabilities in Client-Side Code

Much of the code in a current Web 2.0 application is executed on the client rather than on the server. Because of this, and for several years now, an increasing number of vulnerabilities have been found in Web applications that use large amounts of client-side code, such as JavaScript and Adobe Flash. This client-side code also resides on websites that use third-party Web Widgets, such as ads, traffic counters, user polls, security badges, social buttons, etc. The growing amount of interaction based on user-supplied data, in code that may or may not be your own, has provided many new locations for Cross-Site Scripting vulnerabilities to exist. This type of vulnerability is generally referred to as “DOM-based” Cross-Site Scripting.

While the source code to client-side applications is technically available to a Web browser’s “view-source” functionality, SAST scanners designed with Java, PHP, ASP.NET, and other types of support are rarely helpful with JavaScript or Adobe Flash. Just as with Business Logic Flaws, most client-side code vulnerabilities must be found manually, aided by purpose-built penetration-testing tools.

Conclusion

Again, when used very early in the development process SAST is a great resource for discovering software security defects, and for locating improper code constructs and policy violations. SAST also enables organizations to identify code flaws while the code is being written, and to isolate bugs all the way down to the line of code and to the developer who checked the code in. However, the idea that SAST can find all issues, at all times – or even that SAST can find all the high severity vulnerabilities – is a myth.

The best advice is to determine precisely what your organization needs to measure; then to select the most appropriate software security testing methodology for making the measurements. Because doing the process the other way around would be like using a tape measure to figure out if a house has termites.

So, after reading in the seven examples above about what SAST fails to do, or is unable to do, or finds difficult to do, what do you think? Is the SAST “100% code coverage” myth busted?

Jeremiah Grossman

Tuesday, March 22, 2011

Mythbusting: Static Analysis Software Testing – 100% Code Coverage
