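There are several ways to tell whether a request was sent by Playwright. As a browser automation and testing tool, Playwright can leave recognizable traces in the HTTP requests it sends, although none of them is conclusive on its own.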
1. Check the User-Agent header
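By default, Playwright sends the User-Agent string of the browser it drives. For Chromium it usually looks like this:
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/X.X.X.X Safari/537.36
The Chrome version number depends on the Chromium build bundled with Playwright, and the platform part depends on the operating system. In headless mode the token may appear as HeadlessChrome/X.X.X.X instead of Chrome/X.X.X.X. The User-Agent alone is not enough to decide, because it can be changed easily.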
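The User-Agent check can be sketched as a small helper. This is only an illustration: the function name `headless_chrome_version` is hypothetical, and it assumes that a headless Chromium build reports a `HeadlessChrome/<version>` token, which a script can override.

```python
import re

# Assumption: headless Chromium reports "HeadlessChrome/<version>"
# in place of the usual "Chrome/<version>" token.
_HEADLESS_RE = re.compile(r"HeadlessChrome/(\d+(?:\.\d+)*)")


def headless_chrome_version(user_agent):
    """Return the HeadlessChrome version string, or None if the token is absent."""
    match = _HEADLESS_RE.search(user_agent or "")
    return match.group(1) if match else None
```

A return value of None does not mean the request came from a person. It only means this particular token was not present.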
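2. Check specific HTTP headers
Playwright-driven browsers send a recognizable set of HTTP headers. Ordinary Chromium-based browsers send the same headers, so this check can only support other signals:

    def is_playwright_request(request_headers):
        # Headers that a Playwright-driven Chromium browser usually sends
        playwright_indicators = [
            'accept-language',
            'upgrade-insecure-requests',
            'sec-fetch-site',
            'sec-fetch-mode',
            'sec-fetch-user',
            'sec-fetch-dest',
        ]

        # Header names are case-insensitive, so normalize them before comparing
        present = {name.lower() for name in request_headers}

        # Count how many of the indicator headers are present
        matches = sum(1 for indicator in playwright_indicators if indicator in present)

        # A high match count means the request may come from Playwright,
        # but a regular browser produces the same result
        return matches >= 4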
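3. Check request patterns and behavior
Requests from Playwright scripts often show these characteristics:
- Many requests are sent within a short time
- The intervals between requests are very regular
- Cookies from earlier sessions are missing, because each new browser context starts empty unless the script restores them
- Some resources (such as ads) are not loaded when the script blocks them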
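The "regular intervals" signal can be approximated by measuring how much the gaps between a client's requests vary. The sketch below is a minimal illustration, not a tuned detector: the function name `looks_scripted` and the thresholds `min_requests` and `max_cv` are assumptions chosen for the example.

```python
from statistics import mean, pstdev


def looks_scripted(timestamps, min_requests=5, max_cv=0.1):
    """Return True when request gaps are suspiciously uniform.

    timestamps: request arrival times in seconds for one client.
    max_cv: hypothetical threshold on the coefficient of variation
            (standard deviation of the gaps divided by their mean).
    """
    if len(timestamps) < min_requests:
        return False  # too little data to judge

    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    average_gap = mean(gaps)
    if average_gap == 0:
        return True  # every request arrived at the same instant

    return pstdev(gaps) / average_gap <= max_cv
```

Human browsing tends to produce uneven gaps, so a low coefficient of variation is a hint of automation. A script that adds random delays will not be caught this way.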
4. Implement detection on the server
Here is a simple example using the Python Flask framework to detect Playwright requests. The checks are heuristics and can also flag ordinary browsers or other HTTP clients:

    from flask import Flask, request, jsonify

    app = Flask(__name__)


    @app.route('/')
    def index():
        # Get the request headers
        headers = request.headers

        # Assume a regular client until a check says otherwise
        is_playwright = False

        # Check the User-Agent (headless Chromium reports HeadlessChrome)
        user_agent = headers.get('User-Agent', '')
        if 'HeadlessChrome' in user_agent:
            is_playwright = True

        # Collect the header combination to inspect
        playwright_headers = {
            'accept-language': headers.get('Accept-Language'),
            'sec-fetch-site': headers.get('Sec-Fetch-Site'),
            'sec-fetch-mode': headers.get('Sec-Fetch-Mode'),
            'sec-fetch-dest': headers.get('Sec-Fetch-Dest'),
        }

        # All fetch metadata headers present but no Sec-Fetch-User:
        # the navigation was probably not triggered by a user action
        if all(playwright_headers.values()) and 'sec-fetch-user' not in headers:
            is_playwright = True

        # No cookies and a generic Accept header: typical of a non-browser client
        if not headers.get('Cookie') and headers.get('Accept') == '*/*':
            is_playwright = True

        return jsonify({
            'is_playwright': is_playwright,
            'headers': dict(headers),
        })


    if __name__ == '__main__':
        app.run(debug=True)
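5. Detect with JavaScript
On the front end, JavaScript can look for signs of automation:

    function detectPlaywright() {
      // Properties exposed by automation tools
      const automationDetected = Boolean(
        navigator.webdriver ||
        window.callPhantom ||
        window._phantom ||
        window.__nightmare
      );

      // Playwright-specific property.
      // window.chrome.csi is not checked here: it also exists in regular
      // Chrome, so testing for it would flag every Chrome user.
      const playwrightSpecific = Boolean(window.playwright);

      // Navigator properties that are often empty in headless browsers
      // (some regular browsers also report no plugins, so treat this as a weak hint)
      const navigatorCheck =
        navigator.plugins.length === 0 ||
        navigator.languages.length === 0;

      return automationDetected || playwrightSpecific || navigatorCheck;
    }

    // Usage
    if (detectPlaywright()) {
      console.log("Playwright detected!");
      // Take the appropriate action here
    }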
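6. Add a custom identifier in Playwright
If you control the Playwright script, you can add an identifier yourself:

    // In the Playwright script
    const { chromium } = require('playwright');

    (async () => {
      const browser = await chromium.launch();
      const context = await browser.newContext();

      // Every request from this context carries the custom header
      await context.setExtraHTTPHeaders({
        'X-Automated-Tool': 'Playwright'
      });

      // ... open pages and run the script here ...

      await browser.close();
    })();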
Then check for this header on the server:

    @app.route('/api/data')
    def get_data():
        if request.headers.get('X-Automated-Tool') == 'Playwright':
            # The request was sent by Playwright
            return jsonify({'message': 'Playwright detected'})
        else:
            # Regular request
            return jsonify({'data': 'regular data'})
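Notes
- Playwright can be configured to bypass most detection methods
- No method detects Playwright with 100% reliability
- The best practice is to combine several detection methods
- Detecting automation tools may interfere with legitimate automated testing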
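One way to combine several detection methods is to treat each check as a weighted signal and add them up. The sketch below only illustrates that idea: the signal names, the weights, and the function `automation_score` are all hypothetical and would need tuning against real traffic.

```python
# Hypothetical weights: stronger signals contribute more to the score.
DEFAULT_WEIGHTS = {
    "headless_ua": 0.5,      # User-Agent contains HeadlessChrome
    "webdriver_flag": 0.4,   # navigator.webdriver reported by a front-end check
    "regular_timing": 0.3,   # request intervals are very uniform
    "no_cookies": 0.1,       # request arrived without cookies
}


def automation_score(signals, weights=None):
    """Sum the weights of the signals that fired, capped at 1.0.

    signals: dict mapping a signal name to True or False.
    Unknown signal names contribute nothing.
    """
    weights = DEFAULT_WEIGHTS if weights is None else weights
    score = sum(weights.get(name, 0.0) for name, fired in signals.items() if fired)
    return min(score, 1.0)
```

A caller might act only above a chosen threshold, for example requiring an extra verification step rather than blocking outright, so that legitimate automated tests are not rejected on a single weak signal.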