Install and Configure SpiderFoot 4.0 on Ubuntu 22.04

SpiderFoot is a powerful open-source intelligence (OSINT) automation tool that helps security professionals gather information about targets during reconnaissance phases. Whether you're conducting penetration testing, threat intelligence, or security assessments, SpiderFoot can significantly enhance your capabilities by scanning IP addresses, domain names, and other assets to uncover vulnerabilities and gather critical insights.

Introduction

SpiderFoot stands out among OSINT tools for its comprehensive scanning capabilities and modular architecture. With over 200 modules that can query everything from DNS records to dark web mentions, it automates what would otherwise be countless manual searches across various data sources.

This tutorial will walk you through:

  • Installing SpiderFoot 4.0 on Ubuntu 22.04
  • Configuring the environment and dependencies
  • Setting up the web interface
  • Running your first OSINT scan
  • Best practices for secure deployment

Prerequisites

Before beginning, ensure you have:

  • A clean Ubuntu 22.04 server with sudo privileges
  • At least 4GB RAM and 10GB free disk space
  • A stable internet connection
  • Basic familiarity with Linux command line
  • Python 3.8+ (Ubuntu 22.04 ships Python 3.10 by default)
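
The prerequisites above can be checked with a short preflight script. This is a minimal sketch rather than part of SpiderFoot: the function names are made up for illustration, the thresholds are passed in as arguments, and it assumes GNU coreutils (sort -V, df) as found on Ubuntu.

```shell
#!/usr/bin/env bash
# Preflight sketch: compare this host against the prerequisites listed above.
# All function names here are illustrative, not part of SpiderFoot.

# version_ge A B: succeed when dotted version A >= B (relies on GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# check_python MIN: report whether python3 exists and is at least version MIN.
check_python() {
    local min="$1" found
    command -v python3 >/dev/null 2>&1 || { echo "python3: not found"; return 1; }
    found="$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')"
    if version_ge "$found" "$min"; then
        echo "python3 $found: ok"
    else
        echo "python3 $found: older than $min"
        return 1
    fi
}

# check_disk_gb PATH MIN: report whether PATH has at least MIN GB available.
check_disk_gb() {
    local path="$1" min="$2" free_kb
    free_kb="$(df -Pk "$path" | awk 'NR==2 {print $4}')"
    if [ "$free_kb" -ge $((min * 1024 * 1024)) ]; then
        echo "disk at $path: ok"
    else
        echo "disk at $path: less than ${min} GB free"
        return 1
    fi
}

# Usage:
#   check_python 3.8
#   check_disk_gb /opt 10
```

Each function prints one status line and returns non-zero on failure, so the checks can be chained with && in a provisioning script.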

Step 1: Update Your System

First, make sure your system is up-to-date:

sudo apt update
sudo apt upgrade -y

Step 2: Install Required Dependencies

SpiderFoot requires several packages to function properly:

sudo apt install -y python3-pip python3-dev libxml2-dev libxslt1-dev zlib1g-dev libffi-dev libssl-dev git

Now, upgrade pip to ensure you have the latest version:

python3 -m pip install --upgrade pip

Step 3: Clone the SpiderFoot Repository

Download SpiderFoot from the official GitHub repository. Cloning the default branch gives you the latest code, which may be newer than the 4.0 release:

cd /opt
sudo git clone https://github.com/smicallef/spiderfoot.git
sudo chown -R $USER:$USER /opt/spiderfoot
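
If you want the checkout to match a specific release rather than whatever the default branch currently holds, you can pin it to a tag. The sketch below assumes the 4.0 release is tagged v4.0 upstream, which you should confirm with git tag --list; pin_release is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# pin_release REPO TAG: check out TAG in REPO if the tag exists; otherwise list
# the available tags and leave the working tree untouched.
pin_release() {
    local repo="$1" tag="$2"
    if git -C "$repo" rev-parse --verify --quiet "refs/tags/$tag" >/dev/null; then
        git -C "$repo" checkout --quiet "$tag"
        echo "checked out $tag"
    else
        echo "tag $tag not found; available tags:" >&2
        git -C "$repo" tag --list >&2
        return 1
    fi
}

# Usage (the tag name v4.0 is an assumption, verify it first):
#   pin_release /opt/spiderfoot v4.0
```

Checking out a tag leaves the repository in a detached HEAD state, which is fine for a deployment that is only ever updated by moving to a newer tag.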

Step 4: Install Python Dependencies

Navigate to the SpiderFoot directory and install the required Python packages:

cd /opt/spiderfoot
pip3 install -r requirements.txt

If the bulk install fails on a particular package, try installing the core dependencies individually:

pip3 install dnspython netaddr requests cherrypy mako beautifulsoup4 lxml publicsuffixlist
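
To confirm that the packages actually import, you can probe each one from the shell. Note that some import names differ from the pip package names (dnspython is imported as dns, beautifulsoup4 as bs4). The helper names below are illustrative only.

```shell
#!/usr/bin/env bash
# has_module NAME: succeed when python3 can import NAME.
has_module() {
    python3 -c "import $1" >/dev/null 2>&1
}

# report_missing NAME...: print one line per module that fails to import and
# return non-zero if any module is missing.
report_missing() {
    local missing=0 mod
    for mod in "$@"; do
        if ! has_module "$mod"; then
            echo "missing: $mod"
            missing=1
        fi
    done
    return "$missing"
}

# Usage (import names, not pip names):
#   report_missing dns netaddr requests cherrypy mako bs4 lxml publicsuffixlist
```

No output and a zero exit status mean every listed module is importable by the same python3 that will run sf.py.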

Step 5: Configure SpiderFoot (Optional)

SpiderFoot works out of the box, and the 4.0 source tree does not include a sample configuration file to copy. Settings are instead stored in SpiderFoot's SQLite database and edited on the Settings page of the web interface, while the listening address and port are given on the command line.

Some important settings to consider:

  • Database location: spiderfoot.db in the data directory (~/.spiderfoot by default)
  • Web server address and port: passed with -l (for example, -l 127.0.0.1:5001)
  • User-agent string for HTTP requests: a global option on the Settings page

Step 6: Start SpiderFoot

You can run SpiderFoot in two ways: with the web interface or via the command line.

Option A: Start the Web Interface

The web interface provides a user-friendly way to manage scans:

cd /opt/spiderfoot
python3 ./sf.py -l 0.0.0.0:5001

This command starts SpiderFoot's web server on all interfaces (0.0.0.0) and port 5001. You can now access the web interface by navigating to http://your_server_ip:5001 in your web browser.

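Note: Binding to 0.0.0.0 makes the interface reachable on every network interface of the server, and SpiderFoot does not ask for a login unless authentication has been configured. For production use, bind to 127.0.0.1 and put a reverse proxy with HTTPS and authentication in front of it.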
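
SpiderFoot can also enforce a login itself. The sketch below assumes that SpiderFoot 4.0 enables HTTP digest authentication when it finds a file named passwd in its data directory (~/.spiderfoot by default) containing username:password lines in plain text; write_passwd is a hypothetical helper, and you should verify the behavior against your installed version.

```shell
#!/usr/bin/env bash
# write_passwd DIR USER PASSWORD: create DIR/passwd holding one "user:password"
# line, readable and writable only by the owner.
write_passwd() {
    local dir="$1" user="$2" pass="$3"
    case "$user" in
        ''|*:*) echo "username must be non-empty and contain no ':'" >&2; return 1 ;;
    esac
    mkdir -p "$dir"
    # Create the file under a restrictive umask so it is never world-readable.
    ( umask 077; printf '%s:%s\n' "$user" "$pass" > "$dir/passwd" )
    chmod 600 "$dir/passwd"
}

# Usage (data directory assumed to be ~/.spiderfoot; restart SpiderFoot afterwards):
#   write_passwd "$HOME/.spiderfoot" admin 'choose-a-long-passphrase'
```

Because the password is stored in plain text, the file permissions matter; this complements a reverse proxy with HTTPS rather than replacing it.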

Option B: Command-Line Interface

SpiderFoot can also be used entirely from the command line:

cd /opt/spiderfoot
python3 ./sf.py -h

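To run a scan from the command line:

python3 ./sf.py -s example.com -u all

The -u all option selects the "all" use case, which runs every module against the target "example.com". To run specific modules instead, pass -m with a comma-separated list of module names (for example, -m sfp_dnsresolve,sfp_whois).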

Step 7: Create a Systemd Service (Optional)

For a more permanent setup, create a systemd service file to automatically start SpiderFoot:

sudo nano /etc/systemd/system/spiderfoot.service

Add the following content:

[Unit]
Description=SpiderFoot OSINT Automation Tool
After=network.target

[Service]
User=YOUR_USERNAME
WorkingDirectory=/opt/spiderfoot
ExecStart=/usr/bin/python3 /opt/spiderfoot/sf.py -l 127.0.0.1:5001
Restart=on-failure

[Install]
WantedBy=multi-user.target

Replace YOUR_USERNAME with your actual username. Then, enable and start the service:

sudo systemctl daemon-reload
sudo systemctl enable spiderfoot
sudo systemctl start spiderfoot
sudo systemctl status spiderfoot
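
systemctl status tells you the process started, not that the web server is accepting connections yet. A small polling helper can close that gap. This is a sketch with hypothetical function names; it assumes bash and the coreutils timeout command are available, as they are on Ubuntu.

```shell
#!/usr/bin/env bash
# port_open HOST PORT: succeed if a TCP connection opens within one second.
# Uses bash's /dev/tcp pseudo-device, so bash is invoked explicitly.
port_open() {
    timeout 1 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# wait_for_port HOST PORT TRIES: poll once per second, up to TRIES attempts.
wait_for_port() {
    local host="$1" port="$2" tries="$3" i=0
    while [ "$i" -lt "$tries" ]; do
        if port_open "$host" "$port"; then
            echo "listening on $host:$port"
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    echo "nothing listening on $host:$port after $tries tries" >&2
    return 1
}

# Usage:
#   wait_for_port 127.0.0.1 5001 15
```

If the helper times out, sudo journalctl -u spiderfoot shows the service's log output, which usually explains why the server did not come up.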

Step 8: Set Up a Reverse Proxy (Recommended for Production)

For better security, set up Nginx as a reverse proxy with SSL:

sudo apt install -y nginx certbot python3-certbot-nginx

Create an Nginx configuration file:

sudo nano /etc/nginx/sites-available/spiderfoot

Add the following configuration:

server {
    listen 80;
    server_name spiderfoot.yourdomain.com;

    location / {
        proxy_pass http://127.0.0.1:5001;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}

Enable the site and obtain SSL certificates:

sudo ln -s /etc/nginx/sites-available/spiderfoot /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl reload nginx
sudo certbot --nginx -d spiderfoot.yourdomain.com

Step 9: Running Your First Scan

Now that SpiderFoot is installed and running, let's perform a basic scan:

  1. Access the SpiderFoot web interface at http://your_server_ip:5001 (or your configured domain)
  2. Click on "New Scan" in the top navigation
  3. Enter a scan name and a target (domain, IP, etc.) in the "Scan Target" field
  4. Choose modules to run (or use the default selection)
  5. Click "Run Scan Now" to begin the reconnaissance process

SpiderFoot will now collect information about your target. Depending on the number of modules selected and the complexity of the target, scans can take anywhere from minutes to hours.

Troubleshooting

Common Issues and Solutions

1. Module Dependency Errors

Problem: You see errors related to missing Python modules.

Solution: Install the specific missing module:

pip3 install module_name

2. Web Interface Not Accessible

Problem: Cannot connect to the SpiderFoot web interface.

Solution: Check the following:

  • Verify SpiderFoot is running: pgrep -af sf.py
  • Check firewall settings: sudo ufw status
  • Ensure correct IP/port binding in your command or config file

3. API Rate Limiting

Problem: Certain modules fail with rate limiting errors.

Solution: Some modules use external APIs with rate limits. Consider:

  • Running fewer modules simultaneously
  • Adding API keys in the SpiderFoot configuration
  • Adding delays between scans

4. Database Errors

Problem: SQLite database errors occur during operation.

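Solution: Stop SpiderFoot, move the database aside, and start the web server again so that a fresh database is created. SpiderFoot 4.0 keeps its database in ~/.spiderfoot by default, and moving it aside also sets aside all existing scan results:

cd /opt/spiderfoot
mv ~/.spiderfoot/spiderfoot.db ~/.spiderfoot/spiderfoot.db.old
python3 ./sf.py -l 127.0.0.1:5001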
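
Resetting more than once would overwrite the previous .old copy. A timestamped backup avoids that. The helper below is a sketch with a hypothetical name; the database path is passed as an argument, so it works wherever your installation keeps the file.

```shell
#!/usr/bin/env bash
# backup_db FILE: move FILE to FILE.<timestamp>.bak and print the new name.
# Fails without touching anything if FILE does not exist.
backup_db() {
    local db="$1" dest
    [ -f "$db" ] || { echo "no database at $db" >&2; return 1; }
    dest="$db.$(date +%Y%m%d-%H%M%S).bak"
    mv -- "$db" "$dest" && echo "$dest"
}

# Usage (stop SpiderFoot first so the file is not in use):
#   backup_db "$HOME/.spiderfoot/spiderfoot.db"
```

Keeping the old file lets you inspect earlier scan results later with the sqlite3 command-line tool if needed.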

Best Practices & Optimization Tips

Security Considerations

  • Access Control: Always restrict access to your SpiderFoot instance, preferably using a reverse proxy with authentication
  • API Key Protection: Store API keys securely and never share your configuration file
  • Network Isolation: Consider running SpiderFoot in a dedicated VLAN or container
  • Regular Updates: Keep SpiderFoot and its dependencies up-to-date

Performance Optimization

  • Resource Allocation: Provide adequate RAM and CPU for complex scans
  • Module Selection: Only enable modules relevant to your objectives
  • Scan Scope: Clearly define scan boundaries to prevent excessive data collection
  • Database Maintenance: Periodically clean up old scan data to maintain performance

Automation & Monitoring

For recurring intelligence gathering, consider automating SpiderFoot scans:

Scheduled Scans with Cron

Create a script to run automated scans:

nano /opt/spiderfoot/automated_scan.sh

Add the following content:

#!/bin/bash
cd /opt/spiderfoot || exit 1
mkdir -p reports
# -u takes a single use case; -o sets the format and results go to stdout.
python3 ./sf.py -s example.com -u footprint -o csv -q > "reports/$(date +%Y%m%d)_example_scan.csv"

Make the script executable:

chmod +x /opt/spiderfoot/automated_scan.sh

Add a cron job to run weekly scans:

crontab -e

Add the following line to run every Sunday at 1 AM:

0 1 * * 0 /opt/spiderfoot/automated_scan.sh >> /opt/spiderfoot/scan.log 2>&1
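
A weekly job that writes a new dated report on every run will slowly fill the reports directory. One way to keep it bounded is to delete old reports at the end of the script. This is a sketch assuming GNU find; prune_reports is a hypothetical helper and the 90-day retention in the usage line is only an example.

```shell
#!/usr/bin/env bash
# prune_reports DIR DAYS: delete *.csv files directly inside DIR that were last
# modified more than DAYS days ago, printing each file that is removed.
prune_reports() {
    local dir="$1" days="$2"
    [ -d "$dir" ] || return 0   # nothing to do if the directory is absent
    find "$dir" -maxdepth 1 -type f -name '*.csv' -mtime +"$days" -print -delete
}

# Usage (for example, as the last line of automated_scan.sh):
#   prune_reports /opt/spiderfoot/reports 90
```

Only files matching *.csv are touched, so scan.log and anything else kept alongside the reports is left alone.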

Integration with Security Tools

SpiderFoot can be integrated with other security tools using its API. The API allows you to:

  • Trigger scans programmatically
  • Fetch scan results for further processing
  • Incorporate findings into security dashboards

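Example of starting a scan with curl. The /startscan endpoint is the one the web form posts to, so it expects form-encoded fields rather than a JSON body; modulelist is a comma-separated string, and typelist and usecase can be left empty when modules are listed explicitly:

curl -H "Accept: application/json" -X POST --data-urlencode "scanname=API Scan" --data-urlencode "scantarget=example.com" -d "modulelist=sfp_dnsresolve,sfp_whois" -d "typelist=" -d "usecase=" http://127.0.0.1:5001/startscan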

Conclusion

You've successfully installed and configured SpiderFoot 4.0 on Ubuntu 22.04. This powerful OSINT tool can now help you gather comprehensive intelligence on targets during security assessments, penetration tests, or threat hunting exercises.

Remember that OSINT tools like SpiderFoot should be used responsibly and ethically. Always ensure you have proper authorization before scanning any targets, and be mindful of terms of service for the various data sources SpiderFoot utilizes.

By following the best practices outlined in this guide, you can maintain a secure, efficient SpiderFoot installation that provides valuable intelligence without compromising your security posture.

As open-source intelligence continues to grow in importance for security professionals, mastering tools like SpiderFoot will become increasingly valuable for identifying potential attack vectors and strengthening your defensive capabilities.