Monday, June 04, 2007

How to find your websites (Road to Website Security part 1)

I spend a lot of time with companies, mostly large and medium sized, who are interested in finding the vulnerabilities in their websites. Obviously the first step in the vulnerability assessment (VA) process is to FIND the websites. This may come as a surprise to many, but companies with more than 5 or 6 websites tend not to know what they are, what they do, or who’s responsible for them. And if they don’t know what websites they own, there is no hope of securing them.

Finding all of a company’s websites isn’t exactly a trivial process and doesn’t end with scanning IP ranges for ports 80 and 443. Virtual hosts, redirects, vanity hostnames/domains, partnerships, and legacy systems are hurdles that must be overcome. Here is a process that should help:

1) Network Discovery
Find a starting point IP address. Most of the time the main website works fine. Look up its IP address using dig or some other utility.
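If you would rather script the lookup than run dig by hand, something like the following rough sketch might do. It only uses Python's standard resolver, so it reports whatever your local DNS returns. The function name `resolve_ipv4` and the injectable `resolver` argument are my own illustrative choices, not part of any tool mentioned here.

```python
import socket

def resolve_ipv4(hostname, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IPv4 addresses for a hostname.

    `resolver` defaults to the system resolver; it is a parameter only so
    the function can be exercised without network access.
    """
    infos = resolver(hostname, 80, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, (address, port)).
    return sorted({info[4][0] for info in infos})

# Usage (placeholder hostname, substitute your main website):
#   resolve_ipv4("www.example.com")
```

Large sites often return several addresses, so log all of them as starting points rather than just the first.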


Next, plug the IP address into the ARIN whois database to search for the registered netblock(s). Then have a chat with one of the network systems administrators, asking whether this is indeed your netblock and whether they know of any others that might have been missed.
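The ARIN lookup can also be scripted. The sketch below speaks the plain WHOIS protocol (TCP port 43) to whois.arin.net and pulls netblocks out of the "CIDR:" lines that ARIN-style replies normally contain. Treat the parsing as an assumption: other registries format their output differently, and the function names are hypothetical.

```python
import ipaddress
import re
import socket

def whois_query(query, server="whois.arin.net", port=43, timeout=10):
    """Send one WHOIS query and return the raw text reply."""
    with socket.create_connection((server, port), timeout=timeout) as conn:
        conn.sendall(query.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")

def extract_netblocks(whois_text):
    """Collect the networks listed on ARIN-style 'CIDR:' lines."""
    blocks = []
    for match in re.finditer(r"^CIDR:\s*(.+)$", whois_text, re.MULTILINE):
        # One line may list several comma-separated blocks.
        for cidr in match.group(1).split(","):
            blocks.append(ipaddress.ip_network(cidr.strip()))
    return blocks

# Usage (192.0.2.10 is a documentation address, use your own):
#   extract_netblocks(whois_query("192.0.2.10"))
```

Whatever comes back still needs the sanity check with the network administrators, since a registered block may belong to a hosting provider rather than to you.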

Last, scan the netblock ranges with nmap on ports 80 and 443, looking for web servers. Sure, other web servers could be listening on non-standard ports, but those are likely out of “web application security” scope and can be addressed later.

> nmap -sT -p 80,443 x.x.x.0-255
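If nmap isn't handy, a plain TCP connect check gets you roughly the same answer for these two ports. This is a minimal sketch and not a replacement for nmap: it is sequential and slow, and `hosts_in`, `port_is_open`, and `scan` are names I made up for illustration. Note that `hosts()` skips the network and broadcast addresses, so a /24 yields .1 through .254 rather than the full 0-255 range in the nmap command above.

```python
import ipaddress
import socket

WEB_PORTS = (80, 443)

def hosts_in(netblock):
    """List the usable host addresses in a CIDR netblock as strings."""
    return [str(ip) for ip in ipaddress.ip_network(netblock).hosts()]

def port_is_open(ip, port, timeout=1.0):
    """TCP connect check for a single port (similar in spirit to nmap -sT)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((ip, port)) == 0

def scan(netblock, ports=WEB_PORTS):
    """Return the (ip, port) pairs that accepted a connection."""
    return [(ip, port)
            for ip in hosts_in(netblock)
            for port in ports
            if port_is_open(ip, port)]

# Usage (only scan netblocks you are authorized to test):
#   scan("192.0.2.0/24")
```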

Save all your results in a spreadsheet.

2) DNS and Zone Transfer
Search for web servers based upon domain names by interrogating the name servers. whois works great on the command line, but if that isn't available, any website that provides the service (godaddy, etc.) will do.

> whois

Name Server:
Name Server:

Next we’ll attempt a DNS zone transfer on the off chance that a name server is misconfigured to allow it. Digital Point Solutions provides a great online utility that loops through each name server attempting the zone transfer. dig on the command line works fine as well, but I still prefer the Web in this instance.

> dig @ axfr
> dig @ axfr
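Looping dig over every name server is easy to script. The following is a sketch that assumes dig is installed and on your PATH; the domain and name servers in the usage comment are placeholders, and the helper names are hypothetical. A properly configured server will simply refuse the transfer.

```python
import shutil
import subprocess

def axfr_command(domain, name_server):
    """Build the argv for: dig @<name_server> <domain> axfr"""
    return ["dig", "@" + name_server, domain, "axfr"]

def try_zone_transfers(domain, name_servers, timeout=30):
    """Run dig against each name server and return {server: raw output}."""
    if shutil.which("dig") is None:
        raise RuntimeError("dig is not installed or not on PATH")
    results = {}
    for ns in name_servers:
        proc = subprocess.run(axfr_command(domain, ns),
                              capture_output=True, text=True, timeout=timeout)
        results[ns] = proc.stdout
    return results

# Usage (placeholder names):
#   try_zone_transfers("example.com", ["ns1.example.com", "ns2.example.com"])
```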

Additionally, it doesn’t hurt to have a chat with the person who is in charge of, or has access to, the domain registrar’s account to see what other domain names are owned by the company. If you are lucky they might even save you a lot of work by providing the hostname list from the DNS name servers directly. If you have access to the web server’s configuration, or to the person who does, you could also dump the virtual hostnames and get lists that way as well.
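For Apache, dumping the virtual hostnames mostly means collecting the ServerName and ServerAlias directives. Here is a small sketch that does this over the text of a config file. It assumes a conventional Apache layout and ignores Include files, and `virtual_hostnames` is a name I picked for the example; IIS keeps its site bindings elsewhere and would need a different approach.

```python
import re

# Matches uncommented "ServerName ..." and "ServerAlias ..." lines.
DIRECTIVE = re.compile(r"^\s*(ServerName|ServerAlias)\s+(.+?)\s*$",
                       re.IGNORECASE | re.MULTILINE)

def virtual_hostnames(config_text):
    """Return the sorted hostnames named in an Apache config's vhosts."""
    names = set()
    for _, value in DIRECTIVE.findall(config_text):
        for name in value.split():  # ServerAlias may list several names
            # Drop an optional scheme prefix and :port suffix.
            host = name.split("://")[-1].split(":")[0]
            names.add(host.lower())
    return sorted(names)

# Usage (the path is an assumption and varies by platform):
#   virtual_hostnames(open("/etc/apache2/apache2.conf").read())
```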

Match up the hostnames to the IP addresses in your spreadsheet and log the domain names.
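If the spreadsheet is getting unwieldy, the matching step can be sketched in a few lines. The column names below (`ip`, `hostnames`, `seen_in_scan`) are my own assumption about a useful layout, and the function names are hypothetical; adapt them to whatever you are already tracking.

```python
import csv
import io
from collections import defaultdict

FIELDS = ["ip", "hostnames", "seen_in_scan"]

def build_inventory(host_to_ip, scanned_ips):
    """Group hostnames by IP and flag which IPs showed up in the port scan."""
    by_ip = defaultdict(list)
    for host, ip in host_to_ip.items():
        by_ip[ip].append(host)
    scanned = set(scanned_ips)
    rows = []
    for ip in sorted(scanned | set(by_ip)):
        rows.append({
            "ip": ip,
            "hostnames": " ".join(sorted(by_ip.get(ip, []))),
            "seen_in_scan": ip in scanned,
        })
    return rows

def to_csv(rows):
    """Render the inventory rows as CSV text for import into a spreadsheet."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Rows with an IP but no hostname, or a hostname on an IP the scan never saw, are the ones worth asking about.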

3) Google and Netcraft
Google is a great resource to locate websites, especially if you know the right search options to use. First restrict search results by domain name:

This should provide a list of results, but also a lot of pages on the same hostnames that need to be whittled down. Once you find a hostname, log it, then exclude it from the search results and try again.

Rinse and repeat until no more results come up. Log all hostnames found.
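The search-and-exclude loop can be pictured as building one query string that grows with each hostname you log. This sketch assumes Google's `site:` operator and the `-` exclusion prefix; the exact queries from the original walkthrough are not reproduced here, and example.com is a placeholder.

```python
def build_query(domain, found_hosts=()):
    """Build a search query for `domain` that excludes hosts already logged."""
    terms = ["site:" + domain]
    # Each logged hostname is excluded so new ones float to the top.
    terms += ["-site:" + host for host in sorted(found_hosts)]
    return " ".join(terms)

# Usage: start broad, then add each hostname you log and search again.
#   build_query("example.com")
#   build_query("example.com", {"www.example.com"})
```

Search engines cap query length, so on domains with many hostnames you may need to split the exclusions across several searches.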

Netcraft SearchDNS is also an excellent resource for locating hostnames. Perform a wildcard domain name search for each domain name you have logged:


Log each hostname listed. You’ll probably get a lot of overlap between Google and Netcraft, but that’s OK; better not to miss anything. You also might want to give Fierce (by Rsnake) a try. It locates both internal and external targets, not just websites though.

4) The grunt work
Visit each website on the list with a web browser and start taking notes. Record whether the website is up, active, and functional, what its purpose is, where it redirects to, and anything else informative. Click around the website, having a look at the links and the sitemap to see if they reference any hostnames or domain names that are not on your list. Doing this with a logging HTTP proxy helps as well.
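Part of the link checking can be automated. As a rough sketch, the snippet below takes the HTML of a page you have already saved (from the browser or the proxy log) and lists any hostnames it references that are not yet on your list. The class and function names are hypothetical, and it only looks at `href`, `src`, and `action` attributes, so links built by JavaScript will be missed.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

class LinkHostCollector(HTMLParser):
    """Collect the hostnames referenced by link-like attributes in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src", "action") and value:
                # Resolve relative links against the page's own URL.
                host = urlparse(urljoin(self.base_url, value)).hostname
                if host:
                    self.hosts.add(host)

def unknown_hosts(html, base_url, known_hosts):
    """Return hostnames referenced by the page that are not yet logged."""
    collector = LinkHostCollector(base_url)
    collector.feed(html)
    return sorted(collector.hosts - set(known_hosts))
```

It does not replace clicking around by hand, but it can catch hostnames buried in footers and image tags.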

Depending on how many websites there are, this can be a painstaking process, but it’s also vital.


How to rate the value of your websites (Road to Website Security part 2)


Anonymous said...

Cool post,

Another simple way to whois is via a batch file:

@echo on

save it as whois.cmd in /system32/
call it in CMD: whois


Ronald van den Heetkamp.

jw said...

Great Post and really informative. What Logging HTTP Proxy do you use and recommend?

Jeremiah Grossman said...

Ronald, nice trick!

jw, I prefer Paros, but I hear Burp is good as well. Basically you just need one that lists the request lines so they can be viewed quickly.

Andy Steingruebl said...

So, I'm guessing there isn't a lot of value in finding one site and then asking for the IIS and/or Apache configs for it so you can find all of the virtual servers that might be hosted there?

Jeremiah Grossman said...

Security Retentive: Actually, you make an excellent point. Provided you have access to the web server(s) in question or can ask the person who does, absolutely this adds value. I'll update the post.

Anonymous said...

well, there's which supplies domain information for GTLD's:
# whois -h
# whois -h
using this, you can query for that domain's registrar and forward dns servers.

then there's ARIN/RIPE/APNIC/LACNIC/AfriNIC which supply IP/ASN information. or possibly RWhois, RADB, or the BGP table itself:
# telnet 4321
Connected to (
Escape character is '^]'.
%rwhois V-1.5:001ab7:00 (Exodus Communications)
-holdconnect on
network:Organization;I:Exodus IDC - SV/SC4
network:Name;I:Exodus IP Address Administrator
network:Street;I:2401 Walsh Avenue
network:City;I:Santa Clara

Connection closed by foreign host.
some of which can supply very detailed information as you can see. ARIN and most RIR's will provide reverse DNS server information, which is incredibly useful for mapping out networks.

then there's more interesting information to be had with fierce, googlegath, etc - although the way the Google API works now you might be better off using something like Aura:

in regards to nmap and similar scanning tools - make sure not to forget IP protocol scanning - there are plenty of IPv6 and multicast networks that may be visible as well (and who knows what else?).

at the web search layer: make sure not to forget MSN search, Yahoo, etc, the deep [deep] [deep] web, and tons of other meta search engines.

and, as jeremiah and rsnake have pointed out in the past... alexa and google (and probably others) provide nice point tools such as Google Trends:

Jeremiah Grossman said...

Thanks ntp.

Considering the feedback I've received, including from Sensepost, I'm going to have to spend some time rewriting it once it's all in. Attempting to keep it short and simple if possible.

Anonymous said...

Apologies if I missed it, but another good trick (if they won't give you the dns zone information) is to do a reverse lookup on every ip that you have for your target organization.
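A reverse lookup sweep along those lines might look like this sketch. It relies on the standard `socket.gethostbyaddr` call, so it only finds names that have PTR records, and the `reverse_lookup_all` name and injectable `resolver` argument are illustrative assumptions.

```python
import socket

def reverse_lookup_all(ips, resolver=socket.gethostbyaddr):
    """Map each IP to the sorted hostnames its reverse DNS entry reports."""
    names = {}
    for ip in ips:
        try:
            hostname, aliases, _ = resolver(ip)
        except OSError:  # covers socket.herror and socket.gaierror
            names[ip] = []
            continue
        names[ip] = sorted({hostname, *aliases})
    return names

# Usage (documentation addresses, substitute your own netblock):
#   reverse_lookup_all(["192.0.2.10", "192.0.2.11"])
```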

Anonymous said...

heh.. no rest for the wicked :>
(till then, people can catch more on the vhosting at

Jeremiah Grossman said...

Thanks for the feedback everyone. Going to wait a week or so, then I'll post an update.

Anonymous said...

digg posted this interesting site yesterday:

Jordan said...

Also very useful in the search for virtual hosts are passive dns replication databases. I use them when analyzing botnet behavior and I need to find the domain name being used to herd the machines (shutting down a specific IP does no good if you don't know what domain name is being used as well)

Here's a list of URLs in my bookmarks for doing this (some overlap with what's already been mentioned): (full results cost money at

They all have varying advantages and coverage, so it's useful to compare as many as possible.

Hmm, I wonder if it would be possible to add these to fierce so that it could integrate these results into its scanning...

Also, it should be noted that many of these tools do a poor job with many tlds (.edu being a good example) since they apparently rely on doing lookups based on the public zone files for some of the tlds.

Anonymous said...

You could always try fierce:

It does a lot of this discovery for you, without a lot of thinking.

Anonymous said...

Another useful way of finding what virtual hosts are on a given machine, is to use MSN search's 'ip:' prefix. This would sure be a handy one to add to google, but till then it's ""