How to Make a Web Vulnerability Scanner Using Python (Step-by-Step Guide)


If you’re serious about bug bounties, penetration testing, or ethical hacking, learning how to build your own web vulnerability scanner is one of the most powerful skills you can develop.

Most beginners rely entirely on tools like:

  • Burp Suite
  • OWASP ZAP
  • Nikto
  • Nmap

These are excellent tools. But here’s the difference:

Professionals understand how scanners work internally.
Advanced hackers build their own.

In this step-by-step guide, you’ll learn how to make a web vulnerability scanner using Python — designed for learning, lab environments, and authorized testing only.

What Is a Web Vulnerability Scanner?

A web vulnerability scanner is a tool that automatically:

  • Crawls a website
  • Identifies input fields and parameters
  • Tests for common vulnerabilities
  • Reports potential security issues

Common vulnerabilities include:

  • Cross-Site Scripting (XSS)
  • SQL Injection (basic detection)
  • Open Redirects
  • Directory Exposure
  • Missing Security Headers
  • Misconfigurations

Instead of manually testing every page, a scanner automates reconnaissance and testing.


Why Build Your Own Vulnerability Scanner?

Let’s be real — you could just download a scanner.

But building one gives you:

1️⃣ Deep Understanding of Web Attacks

You’ll understand exactly how payloads work.

2️⃣ Bug Bounty Advantage

Custom scanners find edge cases others miss.

3️⃣ Automation Power

You can tailor scans to specific programs.

4️⃣ Portfolio Credibility

A GitHub scanner tool shows skill — not just usage.


Important: Legal & Ethical Use Only

Before we go further:

⚠️ Only scan:

  • Your own websites
  • Local lab environments
  • Targets explicitly authorized in bug bounty scope

Unauthorized scanning can be illegal.

This tutorial is for educational and defensive security purposes only.


What We’re Going to Build

By the end of this guide, your Python-based web vulnerability scanner will:

  • ✔️ Crawl internal links
  • ✔️ Detect forms and input fields
  • ✔️ Test for basic XSS
  • ✔️ Test for simple SQL Injection patterns
  • ✔️ Detect open redirects
  • ✔️ Check for missing security headers
  • ✔️ Generate a structured report

We’ll keep it modular so you can expand it later.


Step 1: Setting Up the Environment

Install required libraries:

pip install requests beautifulsoup4 colorama

We’ll use:

  • requests – HTTP requests
  • BeautifulSoup – HTML parsing
  • colorama – terminal color output
  • urllib.parse – URL handling (part of the Python standard library, nothing to install)

Step 2: Project Structure

Create:

web_scanner/
├── main.py
├── crawler.py
├── scanner.py
├── payloads.py
├── reporter.py
└── output/

Clean structure = scalable tool.
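One way to create this layout from a terminal (assuming a bash-style shell):

```shell
mkdir -p web_scanner/output
touch web_scanner/main.py web_scanner/crawler.py web_scanner/scanner.py \
      web_scanner/payloads.py web_scanner/reporter.py
```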


Step 3: Building the Web Crawler

A scanner needs to discover pages.

crawler.py

import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin, urlparse

visited = set()

def crawl(url, base_domain):
    """Fetch a page and return new, same-domain links found on it."""
    links = []
    try:
        response = requests.get(url, timeout=5)
        soup = BeautifulSoup(response.text, "html.parser")
        for tag in soup.find_all("a", href=True):
            full_url = urljoin(url, tag["href"])
            # Stay in scope: only follow links on the target's domain
            if urlparse(full_url).netloc == base_domain and full_url not in visited:
                visited.add(full_url)
                links.append(full_url)
    except requests.RequestException:
        pass  # unreachable pages are skipped silently
    return links

This keeps crawling limited to the same domain.
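The scoping check relies on urljoin and urlparse from the standard library. A quick standalone illustration (testsite.local is a made-up hostname used only for this demo):

```python
from urllib.parse import urljoin, urlparse

base = "http://testsite.local/index.html"
hrefs = ["about.html", "/contact", "https://external.com/page"]

# Resolve relative links against the page URL, then keep only the
# ones whose host matches the target's domain
in_scope = [
    urljoin(base, h) for h in hrefs
    if urlparse(urljoin(base, h)).netloc == urlparse(base).netloc
]
print(in_scope)
# ['http://testsite.local/about.html', 'http://testsite.local/contact']
```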


Step 4: Detecting Forms and Inputs

Forms are the primary attack surface.

Add inside scanner.py:

import requests
from bs4 import BeautifulSoup

def get_forms(url):
    """Return all <form> elements found on a page."""
    try:
        response = requests.get(url, timeout=5)
        soup = BeautifulSoup(response.text, "html.parser")
        return soup.find_all("form")
    except requests.RequestException:
        return []

Step 5: Adding XSS Detection

Basic XSS testing works by injecting a script payload and checking if it reflects.

Add in payloads.py:

xss_payloads = [
    "<script>alert('XSS')</script>",
    "\"'><img src=x onerror=alert(1)>"
]

In scanner.py:

from urllib.parse import urljoin  # add this import at the top of scanner.py

def test_xss(url, forms, payloads):
    """Submit each payload through every form and look for raw reflection."""
    vulnerabilities = []
    for form in forms:
        action = form.get("action")
        method = form.get("method", "get").lower()
        # Submit to the form's action URL, not just the page it was found on
        target = urljoin(url, action) if action else url
        inputs = form.find_all("input")
        for payload in payloads:
            data = {}
            for input_tag in inputs:
                name = input_tag.get("name")
                if name:
                    data[name] = payload
            try:
                if method == "post":
                    response = requests.post(target, data=data, timeout=5)
                else:
                    response = requests.get(target, params=data, timeout=5)
                # If the payload comes back unescaped, it may execute in a browser
                if payload in response.text:
                    vulnerabilities.append((target, "Potential XSS"))
            except requests.RequestException:
                pass
    return vulnerabilities

This is basic reflected XSS detection.
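The reflection check only fires when the payload comes back unescaped. A self-contained sketch of that distinction, using Python's html module to simulate a site that encodes its output versus one that doesn't:

```python
import html

payload = "<script>alert('XSS')</script>"

escaped_page = f"<p>You searched for: {html.escape(payload)}</p>"
reflected_page = f"<p>You searched for: {payload}</p>"

print(payload in escaped_page)    # False - output encoding defused it
print(payload in reflected_page)  # True  - raw reflection, worth flagging
```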


Step 6: Basic SQL Injection Detection

In payloads.py:

sqli_payloads = [
    "' OR '1'='1",
    "'; DROP TABLE users; --"
]

Add in scanner.py:

from urllib.parse import urljoin  # add this import at the top of scanner.py

def test_sqli(url, forms, payloads):
    """Submit SQL metacharacters and look for database error messages."""
    findings = []
    error_signatures = [
        "sql syntax",
        "mysql_fetch",
        "ORA-",
        "SQLite"
    ]
    for form in forms:
        action = form.get("action")
        method = form.get("method", "get").lower()
        target = urljoin(url, action) if action else url
        inputs = form.find_all("input")
        for payload in payloads:
            data = {}
            for input_tag in inputs:
                name = input_tag.get("name")
                if name:
                    data[name] = payload
            try:
                if method == "post":
                    response = requests.post(target, data=data, timeout=5)
                else:
                    response = requests.get(target, params=data, timeout=5)
                for error in error_signatures:
                    if error.lower() in response.text.lower():
                        findings.append((target, "Possible SQL Injection"))
                        break  # one finding per form/payload pair is enough
            except requests.RequestException:
                pass
    return findings

This checks for error-based SQL injection indicators.
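The matching itself is just a case-insensitive substring scan over the response body. A minimal standalone version (the body string is a typical MySQL error message, shown here only for illustration):

```python
error_signatures = ["sql syntax", "mysql_fetch", "ORA-", "SQLite"]

body = "You have an error in your SQL syntax; check the manual for MySQL"

# Lowercase both sides so "SQL syntax" still matches "sql syntax"
hit = any(sig.lower() in body.lower() for sig in error_signatures)
print(hit)  # True
```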


Step 7: Detecting Open Redirect

Open redirects allow attackers to redirect users to malicious websites.

Add:

def test_open_redirect(url):
    """Check whether a common redirect parameter flows into Location."""
    test_param = "https://evil.com"
    try:
        response = requests.get(url, params={"redirect": test_param},
                                timeout=5, allow_redirects=False)
        # A redirect whose Location contains our value means the
        # destination is user-controlled
        if "evil.com" in response.headers.get("Location", ""):
            return [(url, "Open Redirect")]
    except requests.RequestException:
        pass
    return []

Step 8: Checking Security Headers

Security headers protect websites.

Common ones:

  • Content-Security-Policy
  • X-Frame-Options
  • X-Content-Type-Options
  • Strict-Transport-Security

Add:

def check_headers(url):
    """Flag responses that lack common defensive headers."""
    findings = []
    try:
        response = requests.get(url, timeout=5)
        headers = response.headers  # case-insensitive mapping
        security_headers = [
            "Content-Security-Policy",
            "X-Frame-Options",
            "Strict-Transport-Security"
        ]
        for header in security_headers:
            if header not in headers:
                findings.append((url, f"Missing {header}"))
    except requests.RequestException:
        pass
    return findings

Step 9: Putting It All Together

In main.py:

from crawler import crawl
from scanner import get_forms, test_xss, test_sqli, test_open_redirect, check_headers
from payloads import xss_payloads, sqli_payloads
from urllib.parse import urlparse

def main():
    target = input("Enter target URL: ")
    base_domain = urlparse(target).netloc
    print("[*] Crawling...")
    # Scan the start page too, not just the links discovered on it
    links = [target] + crawl(target, base_domain)
    all_findings = []
    for link in links:
        forms = get_forms(link)
        all_findings += test_xss(link, forms, xss_payloads)
        all_findings += test_sqli(link, forms, sqli_payloads)
        all_findings += test_open_redirect(link)
        all_findings += check_headers(link)
    print("\n=== Scan Results ===")
    if not all_findings:
        print("[*] No issues detected.")
    for finding in all_findings:
        print(f"[!] {finding[1]} found at {finding[0]}")

if __name__ == "__main__":
    main()

You now have a working educational web vulnerability scanner.
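Step 2 created a reporter.py and an output/ directory that the main script never touches, even though the feature list promised a structured report. One minimal way to fill that gap, writing findings as JSON (save_report is a name chosen here for illustration, not from the original):

```python
import json
import os

def save_report(findings, path="output/report.json"):
    """Write (url, issue) findings to a JSON file and return the path."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    report = [{"url": url, "issue": issue} for url, issue in findings]
    with open(path, "w") as f:
        json.dump(report, f, indent=2)
    return path

# Example:
# save_report([("http://testsite.local/", "Missing X-Frame-Options")])
```

Calling save_report(all_findings) at the end of main() would persist each run alongside the printed summary.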


How Real Hackers Abuse Vulnerability Scanners

Attackers use scanners to:

  • Identify outdated CMS
  • Detect exposed admin panels
  • Map attack surfaces
  • Automate mass exploitation

That’s why defenders must understand scanning too.

Understanding the attacker mindset helps secure your own systems.


How to Upgrade This Scanner (Advanced Features)

If you want to turn this into a serious tool:

Add:

  • Multithreading for faster scanning
  • Subdomain enumeration
  • JavaScript endpoint extraction
  • Parameter discovery
  • JSON report output
  • HTML report generation
  • Authentication support
  • Rate limiting
  • Proxy support

You can even turn it into a lightweight alternative to OWASP ZAP for specific tasks.
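Of these, multithreading is the quickest win: the per-link checks in main.py are independent of each other, so they parallelize cleanly with the standard library's ThreadPoolExecutor. A sketch, where scan_link stands in for the real per-page checks:

```python
from concurrent.futures import ThreadPoolExecutor

def scan_link(link):
    # Placeholder: in the real tool this would run get_forms, test_xss,
    # test_sqli, test_open_redirect and check_headers for one page
    return (link, "scanned")

links = [f"http://testsite.local/page{i}" for i in range(5)]

# Each worker handles one page at a time; map() preserves input order
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(scan_link, links))

print(len(results))  # 5
```

Keep max_workers modest: hammering a target with dozens of concurrent requests is both noisy and unfriendly, which is also why rate limiting appears in the list above.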


Final Thoughts

Building your own web vulnerability scanner teaches you:

  • How forms work
  • How input injection works
  • How crawling works
  • How scanners think
  • How defenders must think

This knowledge alone makes you better than someone who just presses “Scan” in a GUI tool.

You now understand the foundation.

And once you understand the foundation…

You can build something much more powerful.
