How can Selenium open a web page and download all of its static resource files (JS, CSS, images, etc.)?



Replies

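1. Create an HTTP proxy.

2. Configure the browser to use that proxy server.

3. Collect the URLs of the JS, CSS, image, and other resources that pass through the proxy, and download them to local disk.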
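Step 2 could look roughly like the sketch below. It assumes Chrome, the Selenium Python bindings, and a proxy already listening on 127.0.0.1:8081; the helper names `proxy_argument` and `make_driver` are made up for illustration and are not part of Selenium.

```python
# Assumed proxy address; adjust to wherever your proxy actually listens.
PROXY_HOST = "127.0.0.1"
PROXY_PORT = 8081


def proxy_argument(host, port):
    """Build Chrome's --proxy-server command-line switch (hypothetical helper)."""
    return f"--proxy-server=http://{host}:{port}"


def make_driver(host=PROXY_HOST, port=PROXY_PORT):
    """Start Chrome with all traffic routed through the proxy (hypothetical helper)."""
    # Imported lazily so the helper above can be used without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument(proxy_argument(host, port))
    return webdriver.Chrome(options=options)


if __name__ == "__main__":
    driver = make_driver()
    try:
        # Every request the page makes now goes through the proxy.
        driver.get("http://example.com/")
    finally:
        driver.quit()
```

Note that for HTTPS pages a plain HTTP proxy only sees CONNECT tunnels, not the individual resource URLs, so this approach is simplest on HTTP sites unless the proxy also terminates TLS.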

Sample code:

import socket
import select
import threading
import requests
import os
from urllib.parse import urlparse, urljoin

# IP address and port the proxy server listens on
host = "127.0.0.1"
port = 8081

# Maximum number of connections
max_connections = 100

# Buffer size
buffer_size = 4096

# Initialize the proxy server
proxy_server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
proxy_server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
proxy_server.bind((host, port))
proxy_server.listen(max_connections)
print(f"Proxy server started, listening on {host}:{port}")


def download_file(url, folder, headers):
    # Split the URL path into its directory parts and file name
    parsed_url = urlparse(url)
    path_parts = parsed_url.path.split("/")
    if path_parts[-1] == "":
        filename = "index.html"
    else:
        filename = path_parts[-1]

    # Build the local directory path
    local_path = os.path.join(folder, *path_parts[1:-1])
    if not os.path.exists(local_path):
        os...
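The listing stops partway through `download_file`. Separately from it, and not as the original author's code, here is a minimal sketch of how step 3 might map a resource URL to a local file and write it out. The helpers `local_path_for` and `save_resource` are hypothetical names, and the sketch assumes the response body has already been fetched as bytes.

```python
import os
from urllib.parse import urlparse


def local_path_for(url, folder):
    """Map a resource URL to a file path under `folder` (hypothetical helper)."""
    url_path = urlparse(url).path
    # Drop empty segments and "."/".." so the result cannot escape `folder`.
    parts = [p for p in url_path.split("/") if p not in ("", ".", "..")]
    # A path that is empty or ends in "/" names no file; fall back to index.html.
    if not parts or url_path.endswith("/"):
        parts.append("index.html")
    return os.path.join(folder, *parts)


def save_resource(url, folder, content):
    """Write already-downloaded bytes to the mirrored local path and return it."""
    path = local_path_for(url, folder)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(content)
    return path
```

Mirroring the URL's directory structure keeps relative links between the saved CSS, JS, and image files intact; query strings are ignored here, which is an assumption that may not suit sites that version assets through them.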
