Open Source Intelligence Automation: SpiderFoot

SpiderFoot is an open source footprinting tool, available for Windows and Linux. It is written in Python and provides an easy-to-use GUI. SpiderFoot obtains a wide range of information about a target, such as web servers, netblocks, e-mail addresses and more. SpiderFoot’s simple web-based interface enables you to kick off a scan immediately after install – just give your scan a name, the domain name of your target and select which modules to enable.

The main objective of SpiderFoot is to automate the footprinting process to the greatest extent possible, freeing up penetration testers’ time to focus on the security testing itself.

  • Start with a target of more than just domains (Hostnames, IPs, Netblocks, etc.)
  • Clean-up back-end data model to be more flexible
  • Simultaneous scans
  • More threading for faster performance
  • Search/Filtering
  • Bunch of bug fixes


There are three main areas where SpiderFoot can be useful:

  • If you are a pen-tester, SpiderFoot will automate the reconnaissance stage of the test, giving you a rich set of data to help you pinpoint areas of focus for the test.
  • Understand what your network/organisation is openly exposing to the outside world. Such information in the wrong hands could be a significant risk.
  • SpiderFoot can also be used to gather threat intelligence about suspected malicious IPs you might be seeing in your logs or have obtained via threat intelligence data feeds.
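For the threat intelligence use case, the first step is usually pulling the suspicious addresses out of your logs so they can be used as scan targets. The sketch below is one way to do that in plain Python; it is not part of SpiderFoot, the function name external_ips is made up for illustration, and it only handles IPv4.

```python
import ipaddress
import re

# Loose pattern for dotted-quad candidates; real validation happens below.
CANDIDATE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def external_ips(log_lines):
    """Return unique, publicly routable IPv4 addresses in first-seen order.

    Hypothetical helper: not a SpiderFoot function.
    """
    seen = []
    for line in log_lines:
        for match in CANDIDATE.findall(line):
            try:
                addr = ipaddress.ip_address(match)
            except ValueError:
                continue  # not a valid address, e.g. 999.1.1.1
            # Skip private, loopback and other non-global ranges.
            if addr.is_global and str(addr) not in seen:
                seen.append(str(addr))
    return seen
```

Each address returned could then be entered as the target of its own scan.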



  • Utilises a shedload of data sources; over 50 so far and counting, including SHODAN, RIPE, Whois, PasteBin, Google, SANS and more.
  • Designed for maximum data extraction; every piece of data is passed on to modules that may be interested, so that they can extract valuable information. No piece of discovered data escapes analysis.
  • Runs on Linux and Windows, and is fully open source, so you can fork it on GitHub and do whatever you want with it.
  • Visualisations. Built-in JavaScript-based visualisations or export to GEXF/CSV for use in other tools, like Gephi for instance.
  • Web-based UI. No cumbersome CLI or Java to mess with. Easy to use, easy to navigate. Take a look through the gallery for screenshots.
  • Highly configurable. Almost every module is configurable so you can define the level of intrusiveness and functionality.
  • Modular. Each major piece of functionality is a module, written in Python. Feel free to write your own and submit them to be incorporated!
  • SQLite back-end. All scan results are stored in a local SQLite database, so you can play with your data to your heart’s content.
  • Simultaneous scans. Each footprint scan runs as its own thread, so you can perform footprinting of many different targets simultaneously.
  • So much more. Check out the documentation for more information.
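Because the results live in an ordinary SQLite file, you can explore them with Python's standard sqlite3 module. The sketch below makes no assumptions about SpiderFoot's schema: it simply lists whatever tables the database contains and counts their rows. The function name table_row_counts and the file name in the usage note are illustrative assumptions.

```python
import sqlite3


def table_row_counts(conn):
    """Map each user table in an SQLite database to its row count.

    Hypothetical helper: it discovers tables at runtime instead of
    assuming any particular SpiderFoot schema.
    """
    cur = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    counts = {}
    for (name,) in cur.fetchall():
        if name.startswith("sqlite_"):
            continue  # skip SQLite's internal bookkeeping tables
        # Names come from sqlite_master; quote them as identifiers.
        quoted = '"' + name.replace('"', '""') + '"'
        counts[name] = conn.execute("SELECT COUNT(*) FROM " + quoted).fetchone()[0]
    return counts
```

To try it, open your database with sqlite3.connect("spiderfoot.db") (the file name is an assumption; use the path of your own install) and pass the connection to table_row_counts.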


Data Sources

This is an ever-growing list of data sources SpiderFoot uses to gather intelligence about your target. A few require API keys but they are freely available.

Source                   Notes
(source name missing)    Various malware trackers.
AdBlock                  AdBlock pattern matches.
AlienVault               AlienVault’s IP reputation database.
(source name missing)    Blacklists.
AVG Site Safety Report   Site safety checker.
Bing                     Scraping but future version to also use API.
(source name missing)    Blacklists.
(source name missing)    Look up username availability on popular sites.
DNS                      Your configured DNS server. Defaults to your local DNS but can be configured to whatever IP address you supply SpiderFoot.
Facebook                 Scraping but future version to also use API.
Google                   Scraping but future version to also use API.
Google+                  Scraping but future version to also use API.
Google Safe Browsing     Site safety checker.
LinkedIn                 Scraping but future version to also use API.
(source name missing)    Blacklists.
(source name missing)    Blacklists.
(source name missing)    Blacklists.
McAfee SiteAdvisor       Site safety checker.
NameDroppers             Blacklists.
OpenBL                   Blacklists.
PasteBin                 Achieved through Google scraping.
PGP Servers              PGP public keys.
PhishTank                Identified phishing sites.
Project Honeypot         Blacklists. API key needed.
SANS ISC                 Internet Storm Center IP reputation database.
SHODAN                   API key needed.
SORBS                    Blacklists.
SpamHaus                 Blacklists.
ThreatExpert             Blacklists.
TOR Node List            Domains/IPs used by malware.
UCEPROTECT               Blacklists.
VirusTotal               Domains/IPs used by malware. API key needed.
Whois                    Various Whois servers for different TLDs.
Yahoo                    Scraping but future version to also use API.
Zone-H                   Easy to get black-listed. Log onto the site in a browser from the IP you’re scanning from first and enter the CAPTCHA, then it should be fine.
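Many of the blacklists above are DNS-based, and the usual convention for querying such a list is to reverse the octets of the IP address, append the list's zone, and see whether the resulting name resolves. The sketch below shows that general convention only; it is not taken from SpiderFoot's code, the function names are made up, and dnsbl.example.org is a placeholder zone.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name used to look up an IPv4 address in a DNS blacklist."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return reversed_octets + "." + zone.strip(".")


def is_listed(ip, zone):
    """Return True if the blacklist zone has a record for this address.

    Assumption: a failed lookup is treated as "not listed", so a broken
    resolver will also report False.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False
    return True
```

For example, checking 192.0.2.99 against the placeholder zone means resolving 99.2.0.192.dnsbl.example.org.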

SpiderFoot is designed from the ground up to be modular. This means you can easily add your own modules that consume data from other modules to perform whatever task you desire. As a simple example, you could create a module that automatically attempts to brute-force usernames and passwords any time a password-handling webpage is identified by the spidering module.
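To make the idea of modules consuming each other's data concrete, here is a toy version of the pattern. It does not use SpiderFoot's real module API: the Event, Dispatcher and EmailExtractor classes and the event type names are all invented for this sketch.

```python
import re


class Event:
    """A piece of discovered data, tagged with a type (hypothetical class)."""

    def __init__(self, event_type, data, source=None):
        self.event_type = event_type
        self.data = data
        self.source = source  # the event this one was derived from


class EmailExtractor:
    """Hypothetical module: consumes page content, produces e-mail addresses."""

    consumes = ["WEBPAGE_CONTENT"]  # made-up event type names
    PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

    def handle(self, event):
        for address in sorted(set(self.PATTERN.findall(event.data))):
            yield Event("EMAIL_ADDRESS", address, source=event)


class Dispatcher:
    """Passes every event to each module that declared an interest in its type."""

    def __init__(self, modules):
        self.modules = modules
        self.seen = []

    def publish(self, event):
        self.seen.append(event)
        for module in self.modules:
            if event.event_type in module.consumes:
                # Anything a module produces is published in turn, so
                # other modules get a chance to consume it.
                for produced in module.handle(event):
                    self.publish(produced)
```

Publishing a WEBPAGE_CONTENT event to a Dispatcher holding an EmailExtractor leaves the original event plus one EMAIL_ADDRESS event per address in dispatcher.seen. A real module would follow whatever interface the SpiderFoot documentation describes.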


SpiderFoot is written in Python (2.7), so to run on Linux/Solaris/FreeBSD/etc. you need Python 2.7 installed, in addition to the lxml, netaddr, M2Crypto, CherryPy, bs4, requests and Mako modules.

To install the dependencies using pip, run the following:

~$ pip install lxml netaddr M2Crypto cherrypy mako requests bs4

On some distros, M2Crypto must be installed using APT instead of pip:

~$ apt-get install python-m2crypto

Other modules such as PyPDF2, SOCKS and more are included in the SpiderFoot package, so you don’t need to install them separately.
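Before the first run it can be handy to confirm that the dependencies listed above are importable. The sketch below is a small check you could run with the same Python interpreter you use for SpiderFoot; the function name missing_modules is made up for illustration, and the names in REQUIRED are the import names of the packages mentioned above.

```python
import importlib

# Import names of the dependencies listed above.
REQUIRED = ["lxml", "netaddr", "M2Crypto", "cherrypy", "mako", "requests", "bs4"]


def missing_modules(names):
    """Return the subset of names that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing
```

Calling missing_modules(REQUIRED) returns an empty list when everything is installed; otherwise it names the packages still to be installed.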


