Tuesday, October 22, 2013

What’s the Difference between Aviator and Chromium / Google Chrome?


Context:

It’s a fundamental rule of Web security: a Web browser must be able to defend itself against a hostile website. Presently, in our opinion, the market-share-leading browsers cannot do this adequately. This is an everyday threat to the personal security and privacy of the more than one billion people online, ourselves included. We’ve long held and shared this point of view at WhiteHat Security. Like any sufficiently large company, we have many internal staff members who aren’t as tech savvy as WhiteHat’s Threat Research Center, so we had the same kind of security problem as the rest of the industry: we had to rely on educating our users, because no browser on the market was suitable for our security needs. But education is a flawed approach – there are always new users and new security guidelines. So instead of engaging in a lengthy educational campaign, we began designing an internal browser that would be secure and privacy-protecting enough for our own users — by default. Over the years a great many people — friends, family members, and colleagues alike — have asked us what browser we recommend, even what browser their children should use. Aviator became our answer.

Why Aviator:

The attacks a website can generate against a visiting browser are diverse and complex, but can be broadly categorized into two types. The first type of attack is designed to escape the confines of the browser walls and infect the desktop with malware. Today’s top-tier browser defenses include software security in the browser core, an accompanying sandbox, URL blacklists, silent updates, and plug-in click-to-play. Well-known browser vendors have done a great job in this regard and should be commended. No one wins when users’ desktops become part of a botnet.

Unfortunately, the second type of browser attack has been left largely undefended. These attacks are pernicious and carry out their exploits within the browser walls. They typically don’t implant malware, but they are indeed hazardous to online security and privacy. I’ve previously written up a lengthy 8-part blog post series on the subject documenting the problems. For a variety of reasons, these issues have not been addressed by the leading browser vendors. Rather than continue asking for updates that would likely never come, we decided we could do it ourselves.

To create Aviator we leveraged open source Chromium, the same browser core used by Google Chrome. Then, because the BSD license of Chromium allows us, we made many very particular changes to the code and configuration to enhance security and privacy. We named our product Aviator. Many people are eager to learn what exactly the differences are, so let’s go over them.

Differences:

  1. Protected Mode (Incognito Mode) / Not Protected Mode:
    TL;DR All Web history, cache, cookies, auto-complete, and local storage data is deleted after restart.
    Most people are unaware that there are 12 or more locations where websites may store cookie and cookie-like data in a browser. Cookies are typically used to track your surfing habits from one website to the next, but they also expose your online activity to nosy people with access to your computer. Protected Mode purges these storage areas automatically with each browser restart. While other browsers have this feature or something similar, it is not enabled by default, which can make it a chore to use. Aviator launches directly into Protected Mode by default and clearly indicates the mode of the current window. Protected Mode also helps protect against browser auto-complete hacking, login detection, and deanonymization via clickjacking by reducing the number of session states you have open, a side effect of the browser’s intentional lack of persistence across sessions.
  2. Connection Control: 
    TL;DR Rules for controlling the connections made by Aviator. By default, Aviator blocks Intranet IP-addresses (RFC1918).
    When you visit a website, it can instruct your browser to make potentially dangerous connections to internal IP addresses on your network — IP addresses that could not otherwise be reached from the outside (NAT). Exploitation may lead to simple reconnaissance of internal networks, or it may permanently compromise your network by overwriting the firmware on the router. In other browsers, it’s impossible to block this kind of browser-based intranet hacking without installing special third-party software. If Aviator happens to be blocking something you want to be able to get to, Connection Control allows you to create custom rules — or temporarily use another browser. (A small sketch of this idea appears after this list.)
  3. Disconnect bundled (Disconnect.me): 
    TL;DR Blocks ads and 3rd-party trackers.

    Essentially every ad on every website your browser encounters is tracking you, storing bits of information about where you go and what you do. These ads, along with invisible 3rd-party trackers, also often carry malware designed to exploit your browser when you load a page, or to try to trick you into installing something should you choose to click on it. Since ads can be authored by anyone, including attackers, both ads and trackers may also harness your browser to hack other systems, hack your intranet, incriminate you, etc. Then of course the visuals in the ads themselves are often distasteful, offensive, and inappropriate, especially for children. To help protect against tracking, login detection and deanonymization, auto cross-site scripting, drive-by-downloads, and evil cross-site request forgery delivered through malicious ads, we bundled in the Disconnect extension, which is specifically designed to block ads and trackers. According to the Chrome web store, over 400,000 people are already using Disconnect to protect their privacy. Whether you use Aviator or not, we recommend that you use Disconnect too (Chrome / Firefox supported). We understand many publishers depend on advertising to fund their content. But they must also understand that many who use ad-blocking software aren’t necessarily anti-advertising, but rather pro security and privacy. Ads are dangerous. Publishers should simply ask visitors to enable ads on the website to support the content they want to see, which Disconnect’s icon makes easy to do with a couple of mouse-clicks. This puts the power and the choice into the hands of the user, which is where we believe it should be.
  4. Block 3rd-party Cookies: 
    TL;DR Default configuration update. 

    While it’s very nice that cookies, including 3rd-party cookies, are deleted when the browser is closed, it’s even better when 3rd-party cookies are not allowed in the first place. Blocking 3rd-party cookies helps protect against tracking, login detection, and deanonymization during the current browser session.
  5. DuckDuckGo replaces Google search: 
    TL;DR Privacy enhanced replacement for the default search engine. 

    It is well-known that Google search makes the company billions of dollars annually via user advertising and user tracking / profiling. DuckDuckGo promises exactly the opposite: “Search anonymously. Find instantly.” We felt that was a much better default option. Of course, if you prefer another search engine (including Google), you are free to change the setting.
  6. Limit Referer Leaks: 
    TL;DR Referers no longer leak cross-domain, but are only sent same-domain by default. 

    When clicking from one link to the next, browsers tell the destination website where the click came from via the Referer header (intentionally misspelled). Doing so can leak sensitive information such as the search keywords used, internal IPs/hostnames, session tokens, etc., because that data often appears in the referring URL itself, and it offers little, if any, benefit to the user. Aviator therefore only sends these headers within the same domain. (A sketch of this rule also appears after this list.)
  7. Plug-Ins Click-to-Play: 
    TL;DR Click-to-play for all plug-ins, enabled by default. 

    Plug-ins (e.g. Flash and Java) are a source of tracking, malware exploitation, and general annoyance. Plug-ins often keep their own storage for cookie-like data, which isn’t easy to delete, especially from within the browser. Plug-ins are also a huge attack vector for malware infection. Your browser might be secure, but plug-ins are not, and they must be updated constantly. Then of course there are all those annoying sounds and visuals made by plug-ins, which are difficult to identify and block once they load. So, we blocked them all by default. When you want to run a plug-in, say on YouTube, just click once on the puzzle piece. If you want a website to always load its plug-ins, that’s a configuration change as well: “Always allow plug-ins on…”
  8. Limit data leakage to Google: 
    TL;DR Default configuration update.

    In Aviator we’ve disabled “Use a web service to help resolve navigation errors” and “Use a prediction service to help complete searches and URLs typed in the address bar” by default. We also removed all options to sync / login to Google, and the tracking traffic sent to Google upon Chromium installation. For many of the same reasons that we defaulted to DuckDuckGo as the search engine, we have limited what the browser sends to Google to protect your privacy. If you choose to use Google services, that is your choice. If you choose not to, though, it can be difficult in some browsers. Again, our mantra is choice – and this gives you the choice.
  9. Do Not Track: 
    TL;DR Default configuration update.

    Enabled by default. While we would prefer “Can-Not-Track” to “Do-Not-Track,” we figured it was safe enough to enable the “Do Not Track” signal by default in the event it gains traction.
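To make Connection Control (difference #2) concrete, here is a minimal sketch of what an RFC1918 block boils down to. This is an illustration only, not Aviator’s actual implementation; the allow-list rule and the hostnames are hypothetical.

import ipaddress
import socket

# Hypothetical user-defined exceptions, in the spirit of Connection Control's
# custom rules.
USER_ALLOW_RULES = {"printer.local"}

def is_blocked(hostname):
    if hostname in USER_ALLOW_RULES:
        return False
    addr = ipaddress.ip_address(socket.gethostbyname(hostname))
    # is_private covers the RFC1918 blocks (10/8, 172.16/12, 192.168/16),
    # plus loopback and link-local ranges.
    return addr.is_private

print(is_blocked("93.184.216.34"))  # False: a routable Internet address
print(is_blocked("192.168.1.1"))    # True: an RFC1918 intranet address

A request that resolves to a private address simply never leaves the browser, which is the whole point: the hostile page gets no foothold on the intranet.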
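Similarly, here is a toy model of the same-domain Referer rule (difference #6), assuming a naive comparison of the last two host labels; Aviator’s real rule may differ, and the URLs are made up.

from urllib.parse import urlparse

def referer_to_send(referring_url, destination_url):
    ref_host = urlparse(referring_url).hostname or ""
    dst_host = urlparse(destination_url).hostname or ""
    # Naive same-domain test: compare the last two host labels.
    if ref_host.split(".")[-2:] == dst_host.split(".")[-2:]:
        return referring_url  # same domain: send the header
    return None               # cross-domain: suppress it entirely

# The search keywords stay within example.com...
print(referer_to_send("https://example.com/search?q=secret",
                      "https://shop.example.com/cart"))
# ...but never reach a third-party site.
print(referer_to_send("https://example.com/search?q=secret",
                      "https://other-site.net/page"))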

So far we have appreciated the response to WhiteHat Aviator, and we welcome additional questions and feedback. Our goal is to continue to make this a better and more secure browser option for consumers. Please continue to spread the word and share your thoughts with us. Please download it and give it a test run. Let us know what you think! Click here to learn more about the Aviator browser.

Tuesday, September 17, 2013

20,000

 

20,000. That’s the number of websites we’ve assessed for vulnerabilities with WhiteHat Sentinel. Just saying that number alone doesn’t really do it justice, though. The milestone doesn’t capture the gravity and importance of the accomplishment, nor does it fully articulate everything that goes into that number and what it took to get here. As I reflect on 20,000 websites, I think back to the very early days when so many people told us our model could never work, that we’d never see 1,000 sites, let alone 20x that number. (By the way, I remember their names distinctly ;).) In fairness, what they couldn’t fully appreciate then was what it really takes to scale Web security, which means they didn’t truly understand Web security.

When WhiteHat Security first started back in late 2001, consultants dominated the vulnerability assessment space. If a website was [legally] tested for vulnerabilities, it was done by an independent third party. A consultant would spend roughly a week per website, scanning, prodding around, modifying cookies, URLs and hidden form fields, and then finally deliver a stylized PDF report documenting their findings (aka “the annual assessment”). A fully billed consultant might be able to comprehensively test 40 individual websites per year, and the largest firms had maybe as many as 50 consultants. So collectively, an entire company could only get to about 2,000 websites annually. This is FAR shy of just the 1.8 million SSL-serving sites on the Web. This exposed an unacceptable limitation of their business model.

WhiteHat, at the time of this writing, handles 10x the workload of any consulting firm we’re aware of, and we’re nowhere near capacity. Not only that, WhiteHat Sentinel is assessing these 20,000 websites on a roughly weekly basis, not just once a year! That’s orders of magnitude more security value delivered than one-time assessments can possibly provide. Remember, the Web is a REALLY big place, like 700 million websites big in total. And that right there is what Web security is all about: scale. If a solution is unable to scale, it’s not a Web security solution. It’s a one-off. It might be a perfectly acceptable one-off, but a one-off nonetheless.

Achieving scalability in Web security must take into account the holy trinity, a symbiotic combination of People, Process, and Technology – in that order. No [scalable] Web security solution I’m aware of can exist without all three. Not developer training, not threat modeling, not security in QA, not Web application firewalls, not centralized security controls, and certainly not vulnerability assessment. Nothing. No technological innovation can replace the need for the other two factors. The best we can expect of technology is to increase the efficiency of people and processes. We’ve understood this at WhiteHat Security since day one, and it’s one of the biggest reasons WhiteHat Security continues to grow and be successful where many others have gone by the wayside.

Over the years, while the vulnerabilities themselves have not really changed much, Web security culture definitely has. As the industry matures and grows, and awareness builds, we see the average level of Web security competency decrease! This is something to be expected. The industry is no longer dominated by a small circle of “elites.” Today, most in this field are beginners, with 0 – 3 years of work experience, and this is a very good sign.

That said, there is still a huge skill and talent debt everyone must be mindful of. So the question is: in the labor force ecosystem, who is in the best position to hire, train, and retain Web security talent – particularly the Breaker (vulnerability finders) variety – security vendors or enterprises? Since vulnerability assessment is not and should not be in most enterprises’ core competency, AND the market is highly competitive for talent, we believe the clear answer is the former. This is why we’ve invested so greatly in our Threat Research Center (TRC) – our very own professional Web hacker army.

We started building our TRC more than a decade ago, recruiting and training some of the best and brightest minds, many of whom have now joined the ranks of the Web security elite. We pride ourselves on offering our customers not only a very powerful and scalable solution, but also an “army of hackers” – more than 100 strong and growing – that is at the ready, 24×7, to hack them first. “Hack Yourself First” is a motto that we share proudly, so our customers can be made aware of the vulnerabilities that exist on their sites and can fix them before the bad guys exploit them.

That is why crossing the threshold of 20,000 websites under management is so impressive. We have the opportunity to assess all these websites in production – as they are constantly updating and changing – on a continuous basis. This arms our team of security researchers with the latest vulnerability data for testing and measuring and ultimately protecting our customers.

Other vendors could spend millions of dollars building the next great technology over the next 18 months, but they cannot build an army of hackers in 18 months; it just cannot be done. Our research and development department is constantly working on ways to improve our methods of finding vulnerabilities, whether with our scanner or by understanding business logic vulnerabilities. They’re also constantly updated with new 0-days and other vulnerabilities that we try to incorporate into our testing. These are skills that take time to cultivate and strengthen and we have taken years to do just that.

So, I have to wonder: what will 200,000 websites under management look like? It’s hard to know, really. We had no idea 10+ years ago what getting to 20,000 would look like, and we certainly never would have guessed that it would mean processing more than 7TB of data per week across millions of dollars of infrastructure. That said, given the speed at which the Internet is growing and the speed at which we are growing with it, we could reach 200,000 sites in the next 18 months. And that is a very exciting possibility.

Thursday, September 12, 2013

Upcoming SANS Webcast: Convincing Management to Fund Application Security

 

Many security departments struggle tirelessly to obtain adequate budget for security, especially application security. It’s also no secret that security spending priorities are often grossly misaligned with respect to how businesses invest in IT. This is something I’ve discussed on my blog many times in the past.

The sheer lack of resources is a key reason why Web applications have been wide open to exploitation for as long as they’ve existed, and why companies are constantly getting hacked. While many in the industry understand the problem, they struggle justifying the level of funding necessary to protect the software their organizations build or license.

In December 2012, the SANS Institute conducted a survey of 700 organizations on app security programs and practices. That survey revealed that the primary barriers to implementing secure app management programs were “lack of management funding/buy-in,” followed by lack of resources and skills. Those two are pretty closely aligned, don’t you think?

A 2013 Microsoft survey obtained similar results. In it, more than 2,200 IT professionals and 490 developers worldwide were asked about secure development life cycle processes. The top barriers they cited were lack of management approval, lack of training and support, and cost. It’s time we start developing tools and strategies to begin solving this problem.

In a recent CSO article, SANS’ John Pescatore made some excellent points about how security people need to start approaching their relationships with management. Instead of sounding the alarm, they need to focus more on providing solutions. Let’s say that again: bring management BUSINESS SOLUTIONS and not just the problems. John correctly states that a CEO thinks in terms of opportunity costs, so security people need to use a similar mindset when strategizing a budget conversation with a CEO. Doing so does wonders.

Obviously, that’s not nearly enough of an answer to get a productive conversation started. We security people need more examples, business models, cost models, ROI models, real-world examples, and so on. This will be the topic of a webcast I’m co-presenting with John Pescatore, hosted by SANS, on September 19 at 1pm EDT (10am PDT). If you’d like to come and hear us go over the material, we’d love to have you there! Or, skip the webcast and just read this whitepaper on the topic.

Monday, September 09, 2013

Government Surveillance: Why it doesn’t matter if you delete your email


Even as a security pro, I’m not so arrogant as to think that I can’t be hacked, and my online accounts are especially vulnerable since I am not in total control of them. I figure that getting hacked is only a matter of time, either through a social engineering trick or exploitation of a website vulnerability. We’ve seen a number of celebrities and security pros alike suffer this already. For me, when the day comes, I want to limit the data loss exposure to no more than three months, so I routinely delete anything older than that. It’s not that any of my data kept in “the cloud” is super sensitive, but I still don’t want it dumped on Pastebin.

While this ritual has served me well, there’s one glaring problem: the National Security Agency (NSA). Well, specifically PRISM and whatever other surveillance programs it and other governments have. According to published reports, government agencies have what can only be described as wholesale access to end-user data located at Google, Facebook, and many other companies storing email and other interpersonal communication. Judging from the various “transparency” reports released, we’re talking tens of thousands of requests, without much, if any, governmental oversight, and without people having the power to legally object. Protecting my data against this sort of compromise is very different, and it renders my aforementioned data deletion useless. I’ll explain why.

Let’s say you use Gmail, or any Webmail provider for that matter. Using a browser, you craft an email, send it to another Gmail user, then subsequently delete that message from your Sent folder. Let’s say that recipient then responds to your email. You read it, and then promptly delete it. From your perspective, in your account the data is gone and anyone directly hijacking your account can’t see that anything was ever sent or received. This is exactly the outcome we were looking for. BUT, this is not necessarily true from the service provider’s perspective, or for government surveillance.

You see, the Gmail user you’ve been emailing still has a perfect transactional record of all of your sent/received email, which is sitting somewhere in their account, probably in their Inbox or Sent folders. Now, scale this out to all the email you send to any Webmail provider, and you start to get the idea. You might have deleted your email in your account, but no one else has deleted your email in their account. When Google (et al) receives a governmental order to hand over all email to/from “@gmail.com”, they can do the search system-wide. To be fair, I have no idea if they actually perform the search this way, but the fact is that they technically can.

At this point it’s also important to appreciate that when you delete email on a Webmail service, there is zero guarantee that your email has in fact been deleted. At least, nothing like the assurance you get with your own system(s).

When explaining this situation, a common reaction is suggesting Google should simply encrypt your email/data, so that not even they can read it. Before getting to that, let’s understand why Google, Yahoo, Facebook, and hundreds of companies offer you free online services. They do so because in exchange they get access to your data – however sensitive – and personal interests, no matter how private. They sell aggregated access to this data to advertisers who wish to promote their brand or influence your buying habits. That’s essentially how they make their tens of billions of dollars annually.

This relationship is not necessarily a bad deal and, so far, it isn’t even controversial. What’s controversial is that Google, or any other “free” Webmail provider that needs to read your data to make money, obviously can’t encrypt that data from itself to protect against government surveillance. It would be contrary to their business model. On this point even Vint Cerf, one of the fathers of the Internet and Google’s Chief Internet Evangelist, agrees. In the wake of the PRISM headlines, a main concern of these companies is that users will freak out, withdraw their data, and decrease use of the service out of fear, and that they will then lose money. I think their concerns are well-founded.

That’s why, in response, companies like Google, Yahoo, Twitter, Facebook and others are eager to reassure their users and consumers that they are going to resist surveillance to the extent they legally can, and to continue being “transparent” with them by disclosing the number of times the government has made data requests. They’ll even go so far as to challenge a government gag order to make sure they can disclose to users as much detail as possible. The truth of the matter is, “transparency” is probably the best these companies can do, but it’s just not good enough – nor will it ever be unless these companies change their business models, which they can’t, so they won’t, so we’re stuck.

What’s the answer then? On any individual user level, my quick advice has always been: if it’s something that you can’t afford to lose, or something that is truly personal to you, don’t put it on the Internet. In the same vein, if you’re going to be browsing NSFW sites while at work, then do so using a search engine that does not track your data. DuckDuckGo and a few other sites like it can be a good option for this. And then, of course, you could use PGP or other tools to encrypt your email content before pasting it into Gmail. Unfortunately, personal email encryption software hasn’t proven itself easy or attractive enough for mainstream use. And admittedly, PGP itself does not completely safeguard your email from the government or other prying eyes: the email envelope, which includes valuable info such as sender, recipient, subject, time sent, mail servers, etc., is still visible.
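For the curious, that PGP workflow can be scripted against the standard gpg command line. A minimal sketch, assuming GnuPG is installed and the recipient’s public key is already in your keyring; the address here is hypothetical:

import subprocess

body = "Meet at the usual place at noon."

proc = subprocess.run(
    ["gpg", "--armor", "--encrypt", "--recipient", "alice@example.com"],
    input=body.encode(),
    capture_output=True,
    check=True,
)
# ASCII-armored ciphertext, safe to paste into any webmail compose window.
print(proc.stdout.decode())

Remember, only the body is protected; as noted above, the envelope still travels in the clear.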

For companies like Google, Yahoo, and Facebook, the only real solution they can offer to their users is to redesign their business models so that they are not reliant on the ability to store and read user data to succeed. Yes, far easier said than done, but let’s just consider that for a moment.

If, for example, Facebook charged its more than one billion users just $5.00 USD for an entire year’s worth of the service, it would more than make up for the 2012 revenue it received from advertisers. Without complete reliance upon advertisers for revenue, Facebook would no longer have any real reason to keep your data, or any reason not to encrypt it. A similar model could be applied to Webmail as well. In fact, Google already offers paid corporate email hosting via Google Apps. So why isn’t that email encrypted such that Google itself can’t read it? (Or maybe it is?)

For Google, Yahoo and Microsoft, advertising based on search terms simply does not need to be targeted at the individual, which eliminates the need to retain search analytics information. It doesn’t eliminate advertising completely; it just means advertising tied to individual search queries is no longer tied to your personal information. That means they don’t have to store the data, or at least not in a form they can read.

Perhaps in all of this I’m just being naive, even a little bit idealistic. The more likely reality is that there are simply too many business conflicts of interest for these companies right now, under their current models, to charge for their services directly and encrypt the data, so the only thing they can do is offer “transparency.” For me, that’s just not good enough.

Thursday, May 30, 2013

HackerKombat II: Capturing Flags, Pursuing the Trophy


Years ago, a small group of 5-6 of us at WhiteHat held impromptu hacking contests – usually over lunch or during breaks in the day – in which we would race each other to be the quickest to discover vulnerabilities in real live customer websites (banks, retailers, social networks, whatever). No website survived longer than maybe 20 minutes. These contests were a nice break in the day and they allowed us to share (or perhaps show off) our ability to break into things quickly. The activity usually provided comic relief and moments of humility :) and, most importantly, opened opportunities to learn from each other.

We have scores of extremely talented and creative minds working at WhiteHat and these activities were some of the earliest testaments to that. Our corporate culture is eager to break what was previously thought of as “secure,” often just for the fun and challenge. Today, WhiteHat has more than 100 application security specialists in our Threat Research Center (TRC) alone – essentially our own Web hacker army. With so many people now, our contests were forced to evolve, to grow and to mature. We now organize a formal internal activity called HackerKombat.

HackerKombat is a WhiteHat employee only event, a game we hold every couple of months, a late-night battle between some of the best “breakers” in the business. HackerKombat is our version of a “Hackathon,” which companies like Facebook and others host as a means to challenge their engineers to build cool new apps, new features, etc.


HackerKombat challenges our team to break things — to break websites and web applications, to test our hacker skills in a pizza and alcohol infused environment. The goals are to have some fun in a way that only hackers could appreciate, but also to encourage teamwork and thinking outside the box, and to expose areas of knowledge where we are weak.

Unlike years past, the websites and applications we target are staged – no more hacking live customer sites! We have learned that while the average business-driving website might withstand the malicious traffic of a few hackers targeting it, a dozen or more could easily cause downtime. We certainly can’t have that and you’ll see how easy that can be later in this post.

The HackerKombat challenges are designed by Kyle Osborn (@theKos), a WhiteHat TRC alumnus, accomplished security researcher, and frequent conference speaker, who is currently employed by Tesla Motors. Challenges are also developed by current TRC members, but doing so disqualifies them from actually playing — gotta keep things as fair as we can. There isn’t much in the way of rules for HackerKombat. I mean, are hackers expected to follow them anyway? 😉

Today, finding a single vulnerability is nowhere near enough to claim victory. HackerKombat is a series of challenges that are very difficult and require a wide variety of technical ability. Defeating every challenge requires a great team, and great teamwork. No way can a single person, even the best and brightest among us, get through every challenge and expect to have any chance of winning. Past events have shown there is strength in numbers – so we also had to cap the team size at 5-6 to keep things even.

A few weeks ago we hosted the second formal event – HackerKombat II. Teams were decided by draft, for a total of six teams with five combatants each spanning our Santa Clara headquarters as well as in our TRC location in Houston. In the hours leading up to HK II the trash talking was constant and searing. There was even an office pool posted and people were placing bets on the winning team! The biggest prize of all: our custom trophy.


The exact moment the game began the trash-talking ceased, poker faces were set – chatter became eerily quiet. If you wanted to win, and everyone did, every second and key press mattered. If someone was active on Jabber (chat client), you knew they were stuck. 😉

Each team’s approach to the 10 challenges was probably different. My team – “Zerg” – triaged each challenge first: determining what skill sets it would take and assigning those tasks to the right team member to tackle. The first 4 challenges or so were completed fairly easily within the first hour. On the next 2-3 challenges we had to pair up to defeat them, and writing some code was necessary. Another hour gone. Then things got hard, really hard, and every team’s progress slowed way down.

Some of the challenges posed interesting hurdles that the designers did not anticipate. For instance, one challenge required teams to run DirBuster, which brute-forces web requests looking for a hidden web directory. The problem, however, is that a single Apache web server is not used to handling a dozen people all doing the same thing and sending thousands of requests per second. The challenge server died. Remember how I mentioned downtime? Apparently, speed in capturing that particular flag was the winning skill because no other team could get in to tackle it! Argh!

For the most difficult challenges, 9 and 10, Zerg had to gel together as a team to figure out the best approach and make incremental gains. I’m clearly very weak in my steganography skills. Terribly frustrating at a time when we were so close to victory but couldn’t seal the deal. An hour of study beforehand would have been enough.

In the end, the winning team –  “Terrans” from Santa Clara – prevailed by completing all 10 challenges and capturing all 12 flags in a time of 4h and 46min, barely edging out the team in Houston – “PurpleStuff” – which came in second at 4h and 49min. Yes, when it was all said and done, 3 minutes separated the leaders. Imagine that!

In another moment of humility, Robert Hansen (@RSnake), another “great” in the industry, can at least claim he beat me and came in second. :) I’m not exactly certain even now where my team placed, probably around 4th, as every team managed to capture at least 10 flags before the Terrans claimed ultimate victory.  I congratulate Rob, Nick, Dustin, Jon Paul and Ron for their win.

All in all, HK II was fun for all involved and everyone learned a great deal. We learned new techniques that the bad guys can use in the wild, and we learned where each of us individually needs to brush up on our studies. HK II’s success makes a founder very proud. I’m sure there are few, if any, companies that can pull off such an event.

 I look forward to HK III. I want that trophy!

[Check out photos from JerCon and HK II here.]

Thursday, May 02, 2013

The State of Web Security

 

This year, the 6th for our annual report, we’ve done things differently. We wanted to try something truly ambitious, something that advances our collective understanding of application security, and something that to our knowledge has never been done before!

So, in addition to releasing the detailed website vulnerability metrics that the community has come to rely upon, we sought to measure the impact of today’s so-called “best practices.” Do activities such as software security training for developers, pre-production testing, static code analysis, and web application firewalls really lead to better security metrics and fewer breaches? Which aspects of an SDLC program actually make a difference, and how much? Of course, every “expert” has an opinion on the matter, but the best most anyone has is personal anecdote. That is, until now.

To get there we asked all Sentinel customers to privately share details about their SDLC and application security program in a survey format – we received 76 total responses. We then aggregated and correlated their answers to their website vulnerability outcomes and reported breaches. The results of this data combination are nothing less than stunning, enlightening, and often confusing.

To give you a taste for the full report, let’s start with the high-level basics:

The average number of serious* vulnerabilities per website continues to decline, going from 79 in 2011 down to 56 in 2012. This was not wholly unexpected. Despite this, 86% of all websites tested were found to have at least one serious vulnerability during 2012. Of the serious vulnerabilities found, an average of 61% were resolved, and resolution took an average of 193 days from the date of notification.


As far as the Top Ten most prevalent vulnerability classes in 2012, the list is relatively close to last year’s – though Information Leakage surpassed Cross-Site Scripting yet again:

  1. Information Leakage – 55% of websites
  2. Cross-Site Scripting – 53% of websites
  3. Content Spoofing – 33% of websites
  4. Cross-Site Request Forgery – 26% of websites
  5. Brute Force – 26% of websites
  6. Fingerprinting – 23% of websites
  7. Insufficient Transport Layer Protection – 22% of websites
  8. Session Fixation – 14% of websites
  9. URL Redirector Abuse – 13% of websites
  10. Insufficient Authorization – 11% of websites

Conspicuously absent is SQL Injection, which fell from #8 to #14 between 2011 and 2012, and is now identified in only 7% of websites. Obviously, vulnerability prevalence alone does not equate to exploitation.

When we took a closer look at some of the correlations between the vulnerability and survey data, we found some counter-intuitive statistics – implying that software security controls, or “best practices,” do not necessarily lead to better security – at least not at all times in all cases:

  • 57% of organizations surveyed provide some amount of instructor-led or computer-based software security training for their programmers. These organizations experienced 40% fewer vulnerabilities, resolved them 59% faster, but exhibited a 12% lower remediation rate.
  • 39% of organizations said they perform some amount of Static Code Analysis on their website(s) underlying applications. These organizations experienced 15% more vulnerabilities, resolved them 26% slower, and had a 4% lower remediation rate.
  • 55% of organizations said they have a Web Application Firewall (WAF) in some state of deployment. These organizations experienced 11% more vulnerabilities, resolved them 8% slower, and had a 7% lower remediation rate.

Two questions we posed in our survey illustrated that compliance is the number one driver for fixing web vulnerabilities… while it was also the number one driver for NOT fixing web vulnerabilities. Proponents of compliance often suggest that mandatory regulatory controls be treated as a “security baseline,” a platform to raise the floor, and not as the ceiling. While this is a nice concept in casual conversation, it is typically not the real-world reality we see.

The last point I want to bring up for now focuses on accountability in the event of a data breach. Should an organization experience a website or system breach, WhiteHat Security found that 27% said the Board of Directors would be accountable, while 24% said Software Development, 19% the Security Department, and 18% Executive Management. Here’s where things get really interesting, though: analyzing the data in this report, we see evidence of a direct correlation between increased accountability, decreased breaches, and the efficacy of “best practices” and security controls.


We stopped short of coming to any strong conclusions based upon this data alone. However, we now have something solid to work from in establishing new theories and avenues of research to explore. Please, have a look at the report and let us know what stands out to you. What are your theories for why things are the way they are? If you’d like different slices of the data, we’re all ears.

Tweet your thoughts using #WebsiteVulnStats to @jeremiahg and @whitehatsec.

Personal side note: I would like to thank all of our customers who responded to our survey earlier this year as well as to a select group of respected individuals in the security space (they know who they are) that got a sneak peek of our findings last week and whose feedback was invaluable. Also thank you to my colleagues Gabriel Gumbs, Sevak Tsaturyan, Siri De Licori, Bill Coffman, Matt Johansen, Johannes Hoech, Kylie Heintz, and Michele Cox, whose teamwork helped bring everything together.

Thursday, February 14, 2013

WhiteHat Sentinel Infrastructure, by the Numbers


WhiteHat Sentinel has assessed well over 12,000 websites for vulnerabilities across 500 companies. For context, getting to our first 1,000 websites took four years. Today, we’re onboarding at least 1,000 per month.

The infrastructure’s concurrent scan average is roughly 2,100 with peaks reaching 3,374. Currently, these vulnerability scans generate 256 million HTTP requests per month. This traffic crosses over redundant 1GB Internet connections and has uncovered nearly 100,000 separate website vulnerabilities between 2006 and 2012. Collectively we index billions of URLs annually. Think Googlebot, but logged-in. To top it off, we log each and every HTTP request / response combo, with full headers, for every scan, on every website.

As you can see, mass scanning websites for vulnerabilities is highly disk intensive. That’s why Sentinel’s infrastructure has 220TB worth of clustered storage arrays, plus an additional 32TB of virtual shared storage. This storage space is split among 12 master databases and 12 standby databases (one per master, for fault tolerance), each consuming about 20GB per week. 2TB of new data is written to the NFS cluster every week.

We also have heavy server requirements. While we recommend Sentinel customers scan their websites continuously to minimize coverage gaps, current schedules are weighted towards commencing Thursday and Friday, extending over the weekend, and pausing or completing by the Monday e-commerce rush. This of course is local time for the customer, and we provide services for the entire planet! Monday morning is typically when customers analyze their most recent Sentinel vulnerability findings, integrate our results into their bug tracking systems, and generate customized reports for the week’s meetings.


For efficiency, Sentinel’s infrastructure must be smart enough to automatically provision Scan Servers and Reporting Servers. To accomplish this we leverage virtualization on top of several clusters of blade chassis, which allows us to control resource allocation between multiple scanning instances and load-balanced front-end and back-end reporting Web servers. As new scans kick off, as defined by their schedules, Scan Servers dynamically appear to handle the load. We’ve had as many as 64 Scan Servers running at once. As scans taper off, unnecessary Scan Servers vanish, freeing up their CPU / memory resources for the Reporting Servers. When we need additional server capacity, we add more blades or an entire new blade chassis.

Next we could describe all the various networking gear, routers, switches, and firewalls, which bind everything together. The reality is we’re not comfortable sharing out that information publicly. What we can say is the entirety of the system passed a BITS/ISO27002 Shared Assessment compliance audit. Beyond that, you’ll need to sign a non-disclosure agreement.

It’s safe to say the Sentinel infrastructure is rather sophisticated and contains a lot of moving parts. All told, our IT team monitors 162 hosts and over 1,300 services in production. They keep a close eye on utilization of network, CPU, memory, uptime, latency, etc., ensuring everything runs smoothly 24 hours a day, 7 days a week, 365 days a year. With rare exception, Sentinel’s entire infrastructure is redundant. Pull any network cable, push any power button, and the system keeps hacking away — so to speak.


All of this heavy metal is connected together via dual 10GB backplane ethernet and housed in 5 fully utilized 42U racks (expanding into racks 6 and 7 shortly). Since the data we’re responsible for is highly sensitive, to say the least, the racks are physically located in an SSAE16 SOC 1 (and soon to be FedRAMP certified) state-of-the-art colocation facility. At the colo, security guards are always onsite. Then there are digital video recorders, false entrances, vehicle blockades, bulletproof glass/walls, unmarked buildings, and person-traps authenticating only one person at a time. Access to our cage requires an appointment, government-issued ID, and a biometric scan, and only then do they hand over the key.

Building the Sentinel infrastructure has taken us years, millions and millions of dollars, countless all-nighters, and precious hair follicles. It is something we’re extremely proud of and confident in. Nothing else like it, or even close to it, exists. And it’s always getting better, always being improved upon. When your mission is scanning every website on the Internet for vulnerabilities, making them measurably more secure, such a physical infrastructure is just one of the things you need. When we say “scalable,” this is what we mean.


 

 


Tuesday, February 12, 2013

Order of Injection Matters: Smart Scanning

 

Each year in Web security, new vulnerability classes are published, new variations of existing ones are documented, and each website must be tested for them. Likewise, each year the already mountainous pile of HTTP requests necessary to test for these issues grows, which significantly increases scan times. This problem can’t be solved by threading scans (making simultaneous HTTP requests) alone. A smarter way of going about the scanning process is required. We need solutions that drastically reduce the number of HTTP requests per scan while maintaining vulnerability identification performance.

To begin the discussion, let’s take a look at SiteX, an everyday e-commerce website. The attack surface of SiteX can be encompassed by 100 distinct URLs that have a total of 400 unique name / value pairs, 20 Web forms with a total of 60 fields, and 3 cookies that add 12 more input points. This gives a grand total of 472 injection points, all of which must be checked for vulnerabilities in a dynamic scan (aka run-time testing, fuzz testing, fault-testing, black-box testing, etc.).

Let’s begin testing SiteX for Cross-Site Scripting (XSS), starting with a simple payload like “<XSSTEST>”. If “<XSSTEST>” returns un-HTML-encoded in the response page, that is a good indication a vulnerability exists. 472 HTTP requests later, having fully exercised the aforementioned attack surface, we might have some interesting bug-bounty-worthy results.

Of course, “<XSSTEST>” alone is not enough for thorough XSS testing. Fortunately, we can easily try another XSS payload, such as attribute injection: “" onmouseover=alert(document.domain)”. Doing so costs another 472 HTTP requests. What about testing whether the injection lands directly in JavaScript space? Simple: submit “'; alert(document.domain); ”. You get the idea now. Each payload tested must be run across SiteX’s entire attack surface, and the price of 472 requests must be paid.

Scaling this out further, consider what happens if we need to test 5 XSS payloads, 10 payloads, 20, or maybe even 50! Racking up such a lengthy list of injections is trivial when attempting all the myriad filter-bypass tricks documented over the years. For example, full URL hex encoding converts “<XSSTEST>” into “%3C%58%53%53%54%45%53%54%3E”, which might even work! We can try Base64 encoding as well if we’d like: “PFhTU1RFU1Q+”.

In our scenario, such exhaustive XSS testing may require up to 23,600 HTTP requests (50 payloads x 472 injection points), which could take a long while to complete. Next, think about similar testing for SQL Injection, Content-Spoofing, Command Execution, Path Traversal, HTTP Response Splitting, and other injection-style attack classes. All of a sudden the number of HTTP requests for a full website vulnerability scan gets up into the 6 figures rather fast. This is why scans routinely take hours, even multiple days, to finish.

At WhiteHat Security, one way we’ve been counteracting this problem is by analyzing our historical vulnerability scan data. We’ve scanned tens of thousands of real-world websites of all shapes, sizes, and types, over and over again for years, and identified countless vulnerabilities. The data shows that certain payloads are far more likely to succeed than others. Obviously, then, it makes sense to attempt those most likely to succeed first. If one payload works, injecting subsequent payloads in that class becomes unnecessary.
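As a sketch of the idea, with invented payload hit rates and a stub standing in for the scanner’s HTTP machinery:

# Historical hit rates per payload (numbers invented for illustration).
PAYLOADS = [
    ("<XSSTEST>", 0.31),
    ('" onmouseover=alert(document.domain)', 0.12),
    ("'; alert(document.domain); ", 0.08),
    ("%3C%58%53%53%54%45%53%54%3E", 0.03),
]

def send_payload(point, payload):
    # Stand-in for the scanner's HTTP request/response check; here we
    # simulate a site that reflects only the simplest payload unencoded.
    return payload == "<XSSTEST>"

def test_xss(injection_points):
    findings = []
    # Attempt the historically most successful payloads first.
    ordered = sorted(PAYLOADS, key=lambda p: p[1], reverse=True)
    for point in injection_points:
        for payload, _rate in ordered:
            if send_payload(point, payload):
                findings.append((point, payload))
                break  # One hit: the remaining payloads are unnecessary here.
    return findings

# Three injection points, one request each instead of four apiece.
print(test_xss(["q", "search", "item_id"]))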

Figure 1 illustrates payload-by-payload performance using a simple graph. On the horizontal axis, each tick mark represents one of our payloads. The vertical axis is their relative effectiveness by website percentage. Clearly some payloads succeed on a large number of websites, while others do not. Figure 2 is subtly different. Instead of measuring website percentage, the vertical axis shows the relative total quantity of vulnerabilities each payload is credited with identifying. Some payloads are definitely more productive than others.


Figure 1.


 Figure 2.

Back to our previous example: if “<XSSTEST>” works on the first shot, the remaining 49 payloads, whatever they are, don’t have to be sent for that injection point. We save 49 HTTP requests and the time it takes to send them. By smartly ordering our injections, we drastically increase scan efficiency without sacrificing overall vulnerability identification performance. We know this for a fact through regression testing. This is one of the areas our Research team, the “R” in “TRC,” focuses on.

From here, scan efficiency in our technology gets even cooler, well, more sophisticated at least. A while back we introduced data-backed conditional logic: how SiteX responds to one test determines what the next test will be. For example, if SiteX does not echo “>” and “<”, there is no need to inject any more payloads of that type. Doing this drastically cuts down the number of requests a scan might otherwise require.
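A minimal sketch of that pruning, again with a stub in place of the real probe:

def echo_test(point, chars):
    # Stand-in for a benign probe: inject marker text containing `chars` and
    # report whether the characters come back unencoded in the response.
    # Here we simulate a site that HTML-encodes angle brackets.
    return False

def plan_payloads(point):
    # Attribute- and JavaScript-context payloads are always worth trying.
    plan = ['" onmouseover=alert(document.domain)', "'; alert(document.domain); "]
    # Tag-injection payloads are scheduled only if "<" and ">" echo unencoded.
    if echo_test(point, "<>"):
        plan = ["<XSSTEST>"] + plan
    return plan

# With brackets encoded, the whole tag-injection branch is pruned up front.
print(plan_payloads("q"))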

Figure 3 is a dynamically generated graphical diagram of the logic flow of our payloads, illustrating a little of how this looks. At the risk of revealing some intellectual property, Figure 4 zooms in on a particular area of the decision tree and provides a clearer picture of what’s happening behind the scenes.


Figure 3.


Figure 4.

As a Software-as-a-Service vendor, we have the ability to see and measure payload performance and use it to our advantage — to our customers’ advantage — a luxury the desktop scanner guys do not have. Every day, with each new scan, with each new website we test, with each new payload to test, we get just a little bit better. Every day, a little bit smarter.


Thursday, February 07, 2013

Password Cracking AES-256 DMGs and Epic Self-Pwnage


Two weeks ago I was in the midst of a nightmare. I’d forgotten a password. Not just any password. THE password. Without this one password I was cryptographically locked out of thousands of files, gigabytes worth, that I care about. Highly sensitive and valuable files that include work documents, personal projects, photos, code snippets, notes, family stuff, etc. The password in question unlocks these files from the protection of a locally stored AES-256 encrypted disk image. A location where an “email me a password reset link” is not an option. File backups? Of course! Encrypted the same way with the same password. Password paper backup? Nope. I’ll get to that. I somehow needed to “crack” this password. If not, the amount of epic self-pwnage would be too horrible to imagine.

Before sharing how I got myself into this predicament, it’s necessary to reveal some details about my personal computer security habits. More specifics than I’m normally comfortable sharing.

As my badge wall shows, I travel a lot, all around the world, and often with the same laptop: a MacBook Pro. My computer becoming lost, stolen, or imaged by border guards and other law enforcement officers is a constant concern. To protect against these potential physical attacks, OS X dutifully offers FileVault.

FileVault is a full disk encryption feature utilizing XTS-AES 128 crypto. Enabling FileVault means that even if someone has physical possession of my computer, or obtains a full copy of the hard drive, they’d be the proud new owner of a cutting-edge machine, but unable to get any useful data off of it. That is unless my admin password, which unlocks FileVault, is ridiculously simple, and it isn’t. By all practical means, “cracking” this password is impossible.

What is possible is law enforcement, or a robber, forcibly stopping me and “asking” for my admin password, a method capable of defeating FileVault’s full disk encryption. Realistically, while my Brazilian jiu-jitsu black belt certainly helps in many situations, it can be utterly useless in other real-world encounters. I’ll of course resist giving up my admin password to the extent I’m able, but I must assume I may have to “comply” at some point. If that should happen, ideally my data, other than email, should remain safe even after the adversary lands on my desktop.

Setting up this type of layered security fall-back plan is where we return to the conversation of encrypted disk images. On OS X, Disk Utility can be used to create encrypted disk images called DMGs. DMGs are self-contained portable files, of customizable size, that when mounted (i.e. double-clicked) display on the desktop like any other disk drive where files can be stored.

Upon creation of a DMG, the level of encryption strength can be set, the highest being AES-256. If FileVault’s AES-128 crypto is already “impossible” to crack, AES-256 DMGs are exponentially more impossible. To ensure this, all you have to do is set a reasonable password. We’re talking even 6 characters or longer, some upper and lower case, and maybe toss in a digit and a special character. DON’T SAVE THE PASSWORD IN YOUR KEYCHAIN. Doing so defeats the entire purpose of what we’re trying to accomplish, because the admin password unlocks the keychain.
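For the command-line inclined, the same kind of image can be created without Disk Utility. A hedged sketch; the size, volume name, and deliberately silly passphrase are placeholders:

import subprocess

# -size / -fs / -volname / -encryption / -stdinpass are standard hdiutil
# options; -stdinpass keeps the passphrase out of the process list.
subprocess.run(
    ["hdiutil", "create",
     "-size", "1g",
     "-fs", "HFS+",
     "-volname", "Vault",
     "-encryption", "AES-256",
     "-stdinpass",
     "Vault.dmg"],
    input=b"correct horse battery staple",
    check=True,
)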

A great thing about DMGs is that they can be stored anywhere: hidden in some obscure directory on the local machine, a network storage device, a USB drive, whatever. All my confidential files are typically stored this way, in a series of encrypted DMGs with separate passwords. Also very important: DMGs containing sensitive files are only mounted on an as-needed basis. This is for two reasons:

  1. If I must hand over my admin password, the person now on the desktop should still have a difficult time learning these disk images exist and a password is required to open them. As they begin to snoop around, image the drive, run forensics, etc., they should feel they have the keys to the kingdom. If they do manage to find the DMGs, hopefully by then I’m on my way and seeking legal help.
  2. Should my computer get “hacked,” a remote attacker will find it extremely difficult to transfer out many many gigabytes worth of data as a single DMG file before being noticed, the computer loses its connection to the Internet, or the image is unmounted.


Credit: http://xkcd.com/

What’s also cool is that a DMG can be used to store additional account passwords, flat-file style. Passwords which can be made super strong and don’t have to be committed to memory. Simply copy-paste as necessary. This FileVault / DMG setup makes it very convenient to only have to remember a small handful of passwords, including the admin password, to access everything important, without sacrificing security. Well, convenient up until the point where you forget a DMG password. In my case, caused by my scheduled ritual of “change all my passwords.” Ugh!

I wake up once upon a recent morning and begin my daily routine. Check calendar. Check email. Check RSS. Check Twitter. Start working, start reading. As is common, I mount a DMG and am greeted by the familiar password dialog. First password attempt, fail. Second attempt, fail. Third attempt, fail. Warning dialog appears. That’s weird, I thought. Normally I’m a proficient touch typist. Am I fat-fingering the password? Three strikes and I’m out again.

Annoyed, but not concerned. Check the caps lock key. Nope. Try the password again. Fail, fail, fail. Fail, fail, fail. Rinse, repeat several more times. WTF! Am I at least trying to type the correct password for this DMG? I believe so. Let me try a few “shouldn’t work” passwords just in case Morning Brain is causing problems. A few dozen password fails later, annoyance begins constricting into panic. It’s OK, I console myself, I’ll come back to this in a little while. It’ll be fine. I have some non-DMG-required work to complete anyway.

An hour later, I repeated the same password attempt cycle. No dice. The password fails are now mounting up into the hundreds. I start to mouth some obscenities and my keyboard is really not liking the pounding. My wife is beginning to eyeball me with concern. I’m running out of ideas of what the problem could be. That’s about when I recalled recently changing all my passwords. A few moments later, that’s when it hit me, like really hit me. For whatever reason, I’d forgotten what I changed the password to. *Gulp*. Oh, no!


Credit: http://xkcd.com/

Think positive, think optimistic. Keep calm. Carry on. It’ll come to me. I’ve never forgotten these passwords before. I even remember most of it. At least, I think I do.

I’m periodically trying different passwords throughout the day, throughout the evening. One day turns into two, two into three. All like the first. Only now I’m losing sleep. I’m waking up in the middle of the night and have to try a few more passwords just so I can get back to sleep. For those who don’t know, dreaming of password combinations sucks. What also sucks is that without access to this DMG, more specifically the work documents within it, my daily productivity plummets.

Finally, after nearly a week, I have to admit to myself that I forgot it. That I’m in trouble. Time for Plan B: Google.

I begin searching around for DMG password cracking tools. My thought is that since I have a partial password, I should be fine. Most of the results pages are littered with people responding with jokes when asked about cracking DMG AES crypto. That’s not very encouraging. Then I come across something called crowbarDMG, which is basically a GUI for the command:

$ hdiutil attach -passphrase <passphrase> DiskImage.dmg 

hdiutil locks a DMG file when attempting to mount it, so crowbarDMG runs single-threaded, which essentially means a cracking speed of 1 password candidate per second (c/s). Yeah, slow. For my particular circumstance, this seemed fine. I figured I was only missing between 1 and 3 characters of the password anyway. A day of cracking, maybe two, and I’d be back in business. It was not to be. Then my fuzzy memory suggested I might be missing as many as 6 characters. If that were the case, by sheer math, at least multiple decades worth of cracking would be necessary at the current speed. Time for Plan C: Twitter.
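In other words, crowbarDMG automates something very close to the following sketch. The remembered prefix, suffix alphabet, and lengths are stand-ins for my situation, not the real values:

import itertools
import string
import subprocess

DMG = "DiskImage.dmg"           # the locked disk image
KNOWN_PREFIX = "MyOldPassword"  # the part of the passphrase still remembered

def try_passphrase(candidate):
    # hdiutil returns 0 only if the image actually mounts; -stdinpass keeps
    # the candidate out of the process list.
    proc = subprocess.run(
        ["hdiutil", "attach", "-stdinpass", DMG],
        input=candidate.encode(),
        capture_output=True,
    )
    return proc.returncode == 0

alphabet = string.ascii_letters + string.digits + "!@#$"
for length in range(1, 4):      # exhaust all 1-3 character suffixes
    for suffix in itertools.product(alphabet, repeat=length):
        candidate = KNOWN_PREFIX + "".join(suffix)
        if try_passphrase(candidate):
            print("Found it:", candidate)
            raise SystemExit
print("Key space exhausted, no match.")

At 1 c/s, even a reduced key space becomes hopeless once the unknown stretch grows past a few characters, which is exactly the wall I hit.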

Having ~15,000 followers interested in computer security has its perks. Through the years I’ve come to expect a good percentage of them have a stinging sense of humor. Similar to the Google search, 99% of the responses received were sarcastic. This included one such retort from a friend who works in law enforcement computer forensics. I’m sure some tweets were funny, but I was in no laughing mood. I was freaked. A sense of futility and finality was setting in.

That was until Solar Designer, gat3way, Dhiru Kholia, and Magnum, the guys behind the infamous John the Ripper (JtR) password cracker, answered my plea. Then Jeremi Gosney of Stricture Consulting Group graciously offered up the use of his mega hash-cracking computing resources as well. You may remember Stricture from the Ars article about their insane “25-GPU cluster [that] cracks every standard Windows password in < 6 hours.” Collectively, these guys are amongst the world’s foremost experts in password cracking. If they can’t help, no one can. No joking around, they immediately dove right in.

Now, I couldn’t just share out my DMG for others to attempt to crack; its enormous size basically precluded that. But even if I could, I wouldn’t. Given the sensitive nature of the data, I actually preferred the data lost to suffering any risk of a leak. Fortunately, JtR has something called dmg2john. dmg2john scrapes the DMG and provides output which others can attack with JtR without putting the data itself at risk. Nice! Unfortunately, when I got there, dmg2john and JtR were broken when it came to DMGs. I provided the bug details to the john-dev and john-users mailing lists to replicate, and the JtR developers had the issues fixed in a couple of days. These guys are awesome.

Next step: send the dmg2john output of my DMG over to Jeremi at Stricture, along with everything I think I remember about what my password might have been. Jeremi informs me of the next challenge: he’s only able to crack my DMG at a speed of ~100 c/s! At that rate it’s going to take a little over a decade’s worth of cracking to exhaust the password key space. I’m thinking this is very odd; it’s only maybe 6 extra characters, tops. Jeremi explains why…

The reason it’s so slow is because your AES256-encrypted DMG uses 250,000 rounds of PBKDF2-HMAC-SHA-1 to generate the encryption key. The ludicrous round count makes it extremely computationally expensive, slowing down the HMAC-SHA1 process by a factor of 250,000.

My Xeon X7350 can crack a single round of HMAC-SHA1 at a rate of 9.3 million hashes per second. But since we are using 250,000 rounds, it means I was reduced to doing ~ 37 hashes per second. Using all four processors I was only able to pull about 104 hashes per second total (doesn’t scale perfectly.)

With that understood, Jeremi begins asking for more information about what the extra six or so characters in my password might have been. Were they all upper and lower case characters? What about digits? Any special characters? Which characters were most likely used, or not used? Every bit of intel helped a lot. We managed to whittle down an initial 41,106,759,720 possible password combinations to 22,472. This meant the total amount of time required to crack the DMG was reduced to 3.5 minutes on his rig.
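Jeremi’s numbers check out with some quick back-of-the-envelope math; the salt and candidate below are purely illustrative:

import hashlib

single_round_rate = 9_300_000   # HMAC-SHA1/sec on one Xeon X7350, per above
rounds = 250_000                # PBKDF2-HMAC-SHA1 rounds in the DMG header
print(single_round_rate / rounds)        # ~37 c/s per processor

measured_rate = 104             # c/s measured across all four processors
candidates = 22_472             # the whittled-down key space
print(candidates / measured_rate / 60)   # ~3.6 minutes

# The expensive operation itself, for a single candidate:
key = hashlib.pbkdf2_hmac("sha1", b"candidate-passphrase", b"salt", rounds)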

Subsequently, Jeremi sent me what had to be one of the most relieving and frightening emails I’ve ever received in my life. Relieving because I recognized the password immediately upon sight. I knew it was right, but my anxiety level remained at 10 until I typed it in and saw it work. I hadn’t touched my precious data in weeks! It was a tender moment. But also frightening because, well, no security professional is ever comfortable seeing such a prized password emailed to them by someone else. When that happens, it typically means you are hacked and another pain awaits.

Interestingly, in living out this nightmare, I learned A LOT I didn’t know about password cracking, storage, and complexity. I’ve come to appreciate why password storage is so much more important than password complexity: if you don’t know how your password is stored, then all you can really depend upon is complexity. This might be common knowledge to password and crypto pros, but for the average InfoSec or Web security expert, I highly doubt it.

Now, after telling everyone a few of my best tricks and enduring an awful deficiency in one of them, I’ll obviously have to change things up a bit. Clearly I need a paper backup, and I’m thinking about maybe giving it to my attorney for safekeeping, where it’ll enjoy legal privilege protection. We’ll see.

In the meantime, I can’t thank the John the Ripper guys and Jeremi from Stricture Consulting enough. If you need a password cracked, for personal or professional reasons, this is where to look.