References
Official GitHub repository: github.com/puppeteer/p…
iT邦幫忙 tutorial series: ithelp.ithome.com.tw/articles/10…
Blog summary: www.cnblogs.com/paris-test/…
Juejin post on installation problems: juejin.cn/post/684490…
CSDN post on switching pages: blog.csdn.net/w20101310/a…
CSDN post on switching iframes: blog.csdn.net/qupan1993/a…
Chromium download mirror: npm.taobao.org/mirrors/chr…
Getting started with Puppeteer: aaron-bird.github.io/2019/04/22/…
I. Installation
1. With the bundled browser (about 450 MB once packaged)
mkdir my-app
cd my-app
npm init -y
npm i puppeteer
# or "yarn add puppeteer"
2. For production, install the lightweight core package (no bundled browser, about 145 MB once packaged)
npm i puppeteer-core
# or "yarn add puppeteer-core"
II. Configuration
// import puppeteer from "puppeteer-core"; // core package
import puppeteer from "puppeteer";
const isDevelopment = process.env.NODE_ENV !== 'production'
this.browser = await puppeteer.launch({
headless: !isDevelopment,
executablePath: "C:/Program Files (x86)/Google/Chrome/Application/chrome.exe",
defaultViewport: {
width: 1349,
height: 600
}
})
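headless: defaults to true. Set it to false in development to see the browser window.
executablePath: defaults to the Chromium downloaded under node_modules. It can point to a locally installed Chrome instead.
defaultViewport: the viewport size. With the default of 800×600 the page may not be fully displayed.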
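The three options above can be collected in one small function so that development and production share the same launch code. This is only a sketch: buildLaunchOptions and its chromePath parameter are hypothetical names, and the viewport numbers are the ones used in this post.

```javascript
// Hypothetical helper: builds the options object passed to puppeteer.launch().
// chromePath is an assumption; point it at a browser that exists on your machine.
function buildLaunchOptions(nodeEnv, chromePath) {
  const isDevelopment = nodeEnv !== 'production';
  const options = {
    headless: !isDevelopment, // show the browser window only during development
    defaultViewport: { width: 1349, height: 600 },
  };
  // puppeteer-core ships without a browser, so executablePath is needed there.
  if (chromePath) {
    options.executablePath = chromePath;
  }
  return options;
}

// Usage (inside an async function, with puppeteer already imported):
// const browser = await puppeteer.launch(buildLaunchOptions(process.env.NODE_ENV));
```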
Usage
Take a screenshot
const puppeteer = require('puppeteer');
(async () => {
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto('https://example.com');
await page.screenshot({path: 'example.png'});
await browser.close();
})();
Generate a PDF
const puppeteer = require('puppeteer');
(async () => {
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto('https://news.ycombinator.com', {waitUntil: 'networkidle2'});
await page.pdf({path: 'hn.pdf', format: 'A4'});
await browser.close();
})();
Get the viewport size
const puppeteer = require('puppeteer');
(async () => {
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto('https://example.com');
// Get the "viewport" of the page, as reported by the page.
const dimensions = await page.evaluate(() => {
return {
width: document.documentElement.clientWidth,
height: document.documentElement.clientHeight,
deviceScaleFactor: window.devicePixelRatio
};
});
console.log('Dimensions:', dimensions);
await browser.close();
})();
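All three examples share the same launch, newPage, close skeleton, and none of them closes the browser if a step throws. One possible way to factor that out is sketched below. withBrowser and launchFn are hypothetical names, not part of Puppeteer.

```javascript
// Hypothetical wrapper: launchFn is any function that returns a browser,
// for example () => puppeteer.launch().
async function withBrowser(launchFn, task) {
  const browser = await launchFn();
  try {
    const page = await browser.newPage();
    return await task(page);
  } finally {
    // Runs on success and on error, so no browser process is left behind.
    await browser.close();
  }
}

// Usage (inside an async function):
// await withBrowser(() => puppeteer.launch(), async (page) => {
//   await page.goto('https://example.com');
//   await page.screenshot({ path: 'example.png' });
// });
```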
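III. Pitfalls
1. hover does not reveal the element
await page.waitFor(2000); // without this delay before hover, the call times out (cause unclear; page.waitFor is deprecated in newer Puppeteer versions)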
await page.hover(moreBtnSelector)
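A fixed delay may be hiding the real cause, which could be that the element is not yet rendered when hover is called. A sketch of an alternative is to wait for the element to become visible first. hoverWhenVisible is a hypothetical helper name, and whether it removes the timeout on the page described here is an assumption.

```javascript
// Hypothetical helper: waits until the element is visible, then hovers it.
async function hoverWhenVisible(page, selector, timeout = 5000) {
  // Resolves with an ElementHandle once the element exists and is visible;
  // rejects if that does not happen within `timeout` milliseconds.
  const handle = await page.waitForSelector(selector, { visible: true, timeout });
  await handle.hover();
  return handle;
}

// Usage (inside an async function):
// await hoverWhenVisible(page, moreBtnSelector);
```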
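2. Switching pages
(1) browser.pages(), which returns a Promise and must be awaited before indexing
(await browser.pages())[0]
(2) Filter by page.url() (Page has no name() method; only frames do)
const page2 = (await browser.pages()).find(p => p.url().indexOf('beijing') > -1)
(3) Use the targetcreated event
const newPagePromise = new Promise(x => this.browser.once('targetcreated', target => x(target.page()))); // resolves with the new page as soon as targetcreated fires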
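With the targetcreated approach, the listener has to be registered before the click that opens the new tab, otherwise the event can be missed. The sketch below wraps that ordering in one function and adds a timeout. waitForNewPage and trigger are hypothetical names.

```javascript
// Hypothetical helper: registers the listener first, then runs `trigger`
// (for example a click that opens a new tab) and resolves with the new page.
function waitForNewPage(browser, trigger, timeout = 30000) {
  return new Promise((resolve, reject) => {
    const cleanup = () => {
      clearTimeout(timer);
      browser.off('targetcreated', onTarget);
    };
    const timer = setTimeout(() => {
      cleanup();
      reject(new Error('Timed out waiting for a new page'));
    }, timeout);

    function onTarget(target) {
      if (target.type() !== 'page') return; // ignore workers and other targets
      cleanup();
      Promise.resolve(target.page()).then(resolve, reject);
    }

    browser.on('targetcreated', onTarget);
    Promise.resolve()
      .then(trigger)
      .catch((err) => {
        cleanup();
        reject(err);
      });
  });
}

// Usage (inside an async function):
// const page2 = await waitForNewPage(browser, () => page.click('a[target="_blank"]'));
```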
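3. Nested iframes: getting an iframe
(1) page.frames()
const frame1 = page.frames()[0]
(2) frame.url() and frame.name(). The address returned by frame.url() sometimes differs from the src you see in the markup, and frame.name() values may be duplicated
const frame1 = page.frames().find(f => f.name() === 'allbox');
4. On a slow network, operating on an iframe fails because the iframe is still undefined
Solution
/*
Poll until the iframe is found
f: function that returns the frame, or undefined if it is not there yet
s: timeout in milliseconds, default 60 s
t: polling interval in milliseconds
*/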
async waitForIframe(f = () => { }, s = 60000, t = 1000) {
return await new Promise((resolve, reject) => {
let timer, beginDate = new Date().getTime();
let fn = async () => {
let endDate = new Date().getTime()
if (timer) clearTimeout(timer)
if (endDate - beginDate > s) {
return reject('TimeoutError')
}
let frame = await f()
if (frame) {
return resolve(frame)
} else {
timer = setTimeout(fn, t)
}
}
fn()
})
}
// waitForIframe returns a Promise, so it must be awaited before calling childFrames()
const mainIframe = await this.waitForIframe(() => page.mainFrame().childFrames().find(f => f.url().indexOf('/iframe') > -1))
const childIframe = await this.waitForIframe(() => mainIframe.childFrames().find(f => f.name() === 'childIframe'))
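The same polling idea can also be written as a plain async loop, which avoids the nested timer callback and rejects with a real Error. This is a sketch rather than a drop-in replacement. pollUntil and sleep are hypothetical names, and the probe function plays the role of f above.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: calls `probe` repeatedly until it returns a truthy value.
async function pollUntil(probe, { timeout = 60000, interval = 1000 } = {}) {
  const deadline = Date.now() + timeout;
  while (true) {
    const result = await probe();
    if (result) return result;
    // Give up if the next attempt would start after the deadline.
    if (Date.now() + interval > deadline) {
      throw new Error(`pollUntil: nothing found within ${timeout} ms`);
    }
    await sleep(interval);
  }
}

// Usage (inside an async function):
// const mainIframe = await pollUntil(() =>
//   page.mainFrame().childFrames().find((f) => f.url().includes('/iframe')));
```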
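5. Hide a popup by changing its style
await frame1.$eval('div.layer', el => { el.style.display = 'none' }) // $eval matches the first element only
6. Filling inputs: use XPath to get all editable inputs, then fill them in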
await this.typeInput('//input[@type="text" and @jpath and not(@readonly)]', page, frame2)
async typeInput(xPath, page, frame) {
if (!frame) frame = page;
const elementHandle = await frame.$x(xPath)
for (let i = 0; i < elementHandle.length; i++) {
let jpath = await frame.evaluate(el => el.getAttribute('jpath'), elementHandle[i])
let value = this.gf(jpath)
if (typeof value !== 'string') continue
await elementHandle[i].focus()
await this.inputClear(page)
await elementHandle[i].type(value)
}
await this.inputBlur(page);
}
7. page.type() appends to the existing value instead of replacing it
// Clear the focused input
async inputClear(page) {
await page.keyboard.down('Control');
await page.keyboard.press('KeyA'); // select all
await page.keyboard.up('Control');
await page.keyboard.press('Backspace'); // Delete works as well
}
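The keyboard shortcut depends on the platform (Control on Windows and Linux). A platform-independent sketch is to reset the value inside the page and then type. clearAndType is a hypothetical helper, and it assumes the element is an input or textarea with a value property.

```javascript
// Hypothetical helper: empties the field in the page, then types the new value.
async function clearAndType(handle, value) {
  // Runs in the browser context; el is the DOM element behind the handle.
  await handle.evaluate((el) => {
    el.value = '';
  });
  // type() focuses the element and sends key events, so input listeners still fire.
  await handle.type(value);
}

// Usage (inside an async function):
// const input = await page.$('input[name="q"]');
// await clearAndType(input, 'new text');
```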
8. There is a focus method but I could not find a blur method, so click somewhere else to simulate blur
// Make the input lose focus
async inputBlur(page) {
await page.click('body')
}
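Clicking body can hit another control if one happens to sit at the click point. A sketch of an alternative is to call the DOM blur() method on the element itself through evaluate. blurElement is a hypothetical helper name.

```javascript
// Hypothetical helper: calls the DOM blur() method on the element behind the handle.
async function blurElement(handle) {
  await handle.evaluate((el) => el.blur());
}

// Usage (inside an async function):
// const input = await page.$('input[name="q"]');
// await blurElement(input);
```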
9. For input type="radio", use page.click()
10. To fill a select, use page.select()
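A minimal sketch of both calls together follows. chooseOptions and the selectors in the usage comment are hypothetical. page.select() takes the option's value attribute, not its visible label, and resolves with the values that ended up selected.

```javascript
// Hypothetical helper: picks a radio button, then chooses an option in a select.
async function chooseOptions(page, radioSelector, selectSelector, optionValue) {
  await page.click(radioSelector);
  // Resolves with an array of the option values that are now selected.
  return page.select(selectSelector, optionValue);
}

// Usage (inside an async function):
// await chooseOptions(page, 'input[type="radio"][value="1"]', 'select#city', 'beijing');
```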
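11. If you like jQuery-style selectors, you can read the page with cheerio. cheerio parses an HTML snapshot in Node, so it cannot focus elements or fire events on the live page; do that with DOM calls inside page.evaluate()
const cheerio = require('cheerio');
const body = await page.content();
const $ = cheerio.load(body);
const oldValue = $('input[jpath="zzsybsbSbbdxxVO.qtkspzmxb.txffs"]').attr('value'); // value attribute in the snapshot
await page.evaluate(() => {
const input = document.querySelector('input[jpath="zzsybsbSbbdxxVO.qtkspzmxb.txffs"]');
input.focus();
input.value = '7777';
// Angular uses two-way model binding, so the input event has to be fired manually
input.dispatchEvent(new Event('input', { bubbles: true }));
input.dispatchEvent(new Event('change', { bubbles: true }));
input.blur();
});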