Crawling JS-generated web pages with selenium + phantomjs

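1. The page is generated by JS. How do you crawl it?

While writing a crawler I ran into pages whose content is generated by JS. An ordinary HTTP package only downloads the markup and never executes the JS, so the data is missing from the response. That is why I used selenium + phantomjs to crawl these pages.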
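To make the problem concrete, here is a minimal sketch of what a plain HTTP client receives from such a page. The markup, the `coupons` id, and the `innerMarkupOf` helper are all made up for illustration; a real page will look different.

```javascript
// The markup as served, before any script has run (illustrative only).
const rawHtml = `
<html>
  <body>
    <div id="coupons"></div>
    <script>
      document.getElementById('coupons').innerHTML =
        '<a href="//example.com/coupon/1">Coupon</a>';
    </script>
  </body>
</html>`;

// Hypothetical helper: return the markup inside <div id="...">...</div>,
// or null if no such div exists in the text.
function innerMarkupOf(html, id) {
  const match = html.match(new RegExp(`<div id="${id}">([\\s\\S]*?)</div>`));
  return match ? match[1] : null;
}

// The container is empty because the script that fills it was never executed.
console.log(JSON.stringify(innerMarkupOf(rawHtml, 'coupons'))); // prints ""
```

The link only exists after a browser engine runs the script, which is the part that phantomjs takes care of.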

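2. Why phantomjs?

I use phantomjs rather than Chrome or Firefox because phantomjs is a headless browser: it can load and render the page without launching a desktop browser, so requests are faster.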

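3. Once you have the page text, how do you parse it conveniently?

Use cheerio, which parses HTML and exposes a jQuery-like selector API: https://www.npmjs.com/package/cheerio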

4. How is it implemented?


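 const cheerio = require('cheerio');
 const webdriver = require('selenium-webdriver');

 const url = 'https://xxx';

 // await is only valid inside an async function, so wrap the steps in one
 async function getCouponsLink() {
   const driver = new webdriver.Builder().forBrowser('phantomjs').build();
   try {
     // Wait for navigation to finish before reading the page source
     await driver.get(url);
     const source = await driver.getPageSource();
     // Parse the rendered HTML and take the href of the first link
     const $ = cheerio.load(source);
     const href = $('a').first().attr('href');
     // Assumes the href is protocol-relative (//host/path), as on the original page
     return `http:${href}`;
   } finally {
     // Shut the browser down even if one of the steps above fails
     await driver.quit();
   }
 }

 getCouponsLink().then(console.log).catch(console.error);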
