How can a crawler shared by multiple users make sure it only logs in once?


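I wrote a crawler with FastAPI, with a Vue 3 frontend. The flow is as follows:
The user enters a search keyword, which is sent to the FastAPI backend. After some processing, the backend scrapes data from an API endpoint. That endpoint requires logging in first to obtain a cookie; the cookie is then sent along with the scraping request, and the data is returned to the frontend. The API class looks like this: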

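import httpx
from playwright.async_api import async_playwright

# URL (the target API endpoint) is defined elsewhere.


class FS131VIAPI: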
    def __init__(self) -> None:
        self.cookie = None

    async def fetch(self, params, xlsx_file=None):
        if not self.cookie:
            await self.login_for_cookie()

        async with httpx.AsyncClient() as client:
            resp = await client.post(
                URL, cookies=self.cookie, params=params, files=xlsx_file
            )

        if resp.status_code != 200:
            raise Exception("API request failed")
        # httpx responses are not subscriptable; parse the JSON body first.
        return resp.json()["data"]

    async def login_for_cookie(self):
        async with async_playwright() as p:
            browser = await p.firefox.launch(headless=False)
            context = await browser.new_context()
            page = await context.new_page()
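            resp = await page.goto("https://cn.bing.com/")
            if resp is None or resp.status != 200:
                raise Exception("Connection error")
            # Assumes the real login URL responds with JSON that has a "code" field.
            if (await resp.json())["code"] != "0000":
                raise Exception("Login error")
            print(await page.title())
            # httpx expects a name -> value mapping, not Playwright's list of cookie dicts.
            self.cookie = {c["name"]: c["value"] for c in await context.cookies()}
            await context.close()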

The FastAPI route below calls the API class above:

api = FS131VIAPI()


@router.post("/")
async def index(qs: SearchPayload):
    for k in SERVICE_IDS:
        try:
            resp = await api.fetch({"SERVICE_IDS": SERVICE_IDS[k]})
        except Exception as e:
            raise HTTPException(status_code=400, detail=str(e))
        else:
            # other business logic
            pass

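The problem: on first startup, or when the login has expired, two users who send requests at the same time both end up calling login_for_cookie, so the crawler logs in twice and obtains two cookies (the simulated login takes about 10 seconds).
How can I make sure that, no matter how many requests arrive at once, only one login happens and all requests share a single cookie?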
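
One common way to get this behaviour is an asyncio.Lock with a second check inside the lock. The sketch below is only an illustration under stated assumptions: CookieProvider, _login, get_cookie, invalidate and login_count are hypothetical names, and _login is a short sleep standing in for the roughly 10-second Playwright login.

```python
import asyncio


class CookieProvider:
    """Caches one cookie and lets only one coroutine log in at a time."""

    def __init__(self) -> None:
        self.cookie = None
        self._lock = asyncio.Lock()
        self.login_count = 0  # only here to make the demo observable

    async def _login(self) -> dict:
        # Hypothetical stand-in for the slow Playwright login.
        self.login_count += 1
        await asyncio.sleep(0.1)
        return {"session": f"cookie-{self.login_count}"}

    async def get_cookie(self) -> dict:
        if self.cookie is None:  # fast path: no lock once a cookie exists
            async with self._lock:
                # Re-check: another request may have finished logging in
                # while this one was waiting for the lock.
                if self.cookie is None:
                    self.cookie = await self._login()
        return self.cookie

    def invalidate(self) -> None:
        # Call this when the remote API reports that the login has expired.
        self.cookie = None


async def demo():
    provider = CookieProvider()
    # Five "simultaneous" requests all ask for the cookie at once.
    cookies = await asyncio.gather(*(provider.get_cookie() for _ in range(5)))
    return provider.login_count, {c["session"] for c in cookies}


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Requests that arrive during the login wait on the lock, then find the cookie already set and skip the login. Note that this only coordinates coroutines inside one process; with several worker processes each would log in once. On Python versions before 3.10 an asyncio.Lock binds to an event loop when it is created, so there it is probably safer to create the lock lazily rather than at import time.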

I considered making login_for_cookie synchronous, but Playwright's sync API raises an error when it runs inside the async fetch.
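
If a synchronous login function is still wanted, one possible workaround is to run it in a worker thread, where no event loop is running, using asyncio.to_thread (Python 3.9+). The sketch below only shows that mechanism: blocking_login is a hypothetical stand-in that sleeps instead of driving a browser, and whether a given sync library behaves well in a worker thread is an assumption to verify.

```python
import asyncio
import time


def blocking_login() -> dict:
    # Hypothetical stand-in for a login written against a synchronous API.
    time.sleep(0.1)
    return {"session": "abc"}


async def login_without_blocking_loop() -> dict:
    # Runs blocking_login in a worker thread; the event loop keeps serving
    # other requests while the thread is busy.
    return await asyncio.to_thread(blocking_login)


async def demo():
    ticks = 0

    async def ticker():
        # Counts how often the event loop gets to run during the login.
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    task = asyncio.create_task(ticker())
    cookie = await login_without_blocking_loop()
    task.cancel()
    return cookie, ticks


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Moving the login to a thread does not by itself stop two requests from logging in at the same time; some form of mutual exclusion around the call is still needed.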