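I wrote a scraper with FastAPI, with a Vue 3 frontend. The flow is:
The user enters a search keyword, which is sent to the FastAPI backend. After some processing, the backend fetches data from an API endpoint. That endpoint requires a login to obtain a cookie first; the data is then fetched with the cookie attached and returned to the frontend. The API class looks like this: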
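import httpx
from playwright.async_api import async_playwright


class FS131VIAPI:
    def __init__(self) -> None:
        self.cookie = None

    async def fetch(self, params, xlsx_file=None):
        # Log in first if no cookie has been obtained yet
        if not self.cookie:
            await self.login_for_cookie()
        async with httpx.AsyncClient() as client:
            # URL is the target API endpoint, defined elsewhere
            resp = await client.post(
                URL, cookies=self.cookie, params=params, files=xlsx_file
            )
        if resp.status_code != 200:
            raise Exception("API connection failed")
        # httpx responses are not subscriptable; parse the JSON body first
        return resp.json()["data"]

    async def login_for_cookie(self):
        async with async_playwright() as p:
            browser = await p.firefox.launch(headless=False)
            context = await browser.new_context()
            page = await context.new_page()
            # Placeholder URL standing in for the real login page
            resp = await page.goto("https://cn.bing.com/")
            if resp.status != 200:
                raise Exception("Connection error")
            # Assumes the real login response has a JSON body with a "code" field
            elif (await resp.json())["code"] != "0000":
                raise Exception("Login failed")
            else:
                print(await page.title())
                # context.cookies() returns a list of dicts; httpx expects name -> value
                self.cookie = {c["name"]: c["value"] for c in await context.cookies()}
            await context.close()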
Below is the FastAPI route that calls the API class above:
api = FS131VIAPI()


@router.post("/")
async def index(qs: SearchPayload):
    for k in SERVICE_IDS:
        try:
            resp = await api.fetch({"SERVICE_IDS": SERVICE_IDS[k]})
        except Exception as e:
            raise HTTPException(status_code=400, detail=str(e))
        else:
            # other business logic
            pass
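The problem: on first startup, or when the login has expired, if two users send requests at the same time, both requests call login_for_cookie, so the login runs twice and two cookies are obtained (the simulated login takes about 10 seconds).
How can I make sure that, no matter how many requests arrive at the same time, the login happens only once and all of them share a single cookie?
I considered making login_for_cookie synchronous, but Playwright's sync API raises an error when it is run inside the async fetch.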
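
One possible direction, as a minimal sketch rather than a tested fix: guard the login with an asyncio.Lock and check the cookie again after acquiring the lock, so that requests which were waiting on the lock reuse the cookie obtained by the first one. In the sketch below, CookieHolder, get_cookie and login_count are hypothetical names used only for illustration, and the Playwright login is replaced by a short asyncio.sleep so the example runs on its own.

```python
import asyncio


class CookieHolder:
    """Hypothetical stand-in for FS131VIAPI; the real login is simulated."""

    def __init__(self) -> None:
        self.cookie = None
        self.login_count = 0  # illustration only: counts how many logins ran
        self._lock = asyncio.Lock()

    async def login_for_cookie(self) -> None:
        # Placeholder for the slow (about 10 s) Playwright login.
        await asyncio.sleep(0.1)
        self.login_count += 1
        self.cookie = {"session": f"fake-{self.login_count}"}

    async def get_cookie(self):
        # Fast path: once a cookie exists, no locking is needed.
        if self.cookie is None:
            async with self._lock:
                # Check again: another request may have finished the
                # login while this one was waiting for the lock.
                if self.cookie is None:
                    await self.login_for_cookie()
        return self.cookie


async def main():
    api = CookieHolder()
    # Five "requests" arrive at the same time.
    cookies = await asyncio.gather(*(api.get_cookie() for _ in range(5)))
    return api.login_count, cookies


login_count, cookies = asyncio.run(main())
print(login_count)  # 1: only the first request actually logs in
```

In the original class, the same check-lock-check pattern would go at the top of fetch in place of the bare `if not self.cookie` test. Handling an expired login would presumably mean setting self.cookie back to None when the API reports it, so the next request logs in again. Note that on Python versions before 3.10 an asyncio.Lock binds to an event loop when it is constructed, so there it may be safer to create the lock inside the running loop (for example in a startup hook) instead of at import time.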