Getting Started with Puppeteer and Packaging with Electron (Part 1)


References

Official GitHub repository: github.com/puppeteer/p…

iT邦幫忙 tutorial series: ithelp.ithome.com.tw/articles/10…

Blog summary: www.cnblogs.com/paris-test/…

Juejin post on installation problems: juejin.cn/post/684490…

CSDN post on switching pages: blog.csdn.net/w20101310/a…

CSDN post on switching iframes: blog.csdn.net/qupan1993/a…

Chromium download mirror: npm.taobao.org/mirrors/chr…

Puppeteer introduction: aaron-bird.github.io/2019/04/22/…

I. Installation

1. Full version with a bundled browser (about 450 MB once packaged)

    mkdir my-app
    cd my-app
    npm init -y
    npm i puppeteer
    # or "yarn add puppeteer"

2. Lightweight core version for production (no bundled browser; about 145 MB once packaged)

    npm i puppeteer-core
    # or "yarn add puppeteer-core"

II. Configuration

// import puppeteer from "puppeteer-core"; // core version (no bundled browser)
import puppeteer from "puppeteer";

// Show the browser window in development; run headless in production
const isDevelopment = process.env.NODE_ENV !== 'production'
this.browser = await puppeteer.launch({
    headless: !isDevelopment,
    executablePath: "C:/Program Files (x86)/Google/Chrome/Application/chrome.exe",
    defaultViewport: {
        width: 1349,
        height: 600
    }
})

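headless: defaults to true. Set it to false in development so that the browser window is visible.

executablePath: defaults to the Chromium that puppeteer downloads into node_modules; it can be pointed at a locally installed Chrome instead.

defaultViewport: the viewport size. With the default of 800×600, pages are often not fully displayed.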
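
The three options above can be gathered in a small helper. The following is a minimal sketch rather than the author's code: `buildLaunchOptions` and its `chromePath` parameter are hypothetical names, and only the `headless`, `executablePath`, and `defaultViewport` options already discussed are used.

```javascript
// Hypothetical helper: builds the options object passed to puppeteer.launch().
// nodeEnv is the value of process.env.NODE_ENV; chromePath is an optional path
// to a locally installed Chrome (needed with puppeteer-core, which ships
// without a browser).
function buildLaunchOptions(nodeEnv, chromePath) {
  const isDevelopment = nodeEnv !== 'production';
  const options = {
    headless: !isDevelopment, // visible window in development only
    defaultViewport: { width: 1349, height: 600 },
  };
  if (chromePath) {
    options.executablePath = chromePath; // leave unset to use the bundled Chromium
  }
  return options;
}

// Usage (inside an async function):
//   const browser = await puppeteer.launch(buildLaunchOptions(process.env.NODE_ENV));
```

Keeping the options in a plain function makes it easy to switch between puppeteer and puppeteer-core without touching the launch call.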

Usage

Taking a screenshot

const puppeteer = require('puppeteer');
(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://example.com');
  await page.screenshot({path: 'example.png'});
  await browser.close();
})();

Generating a PDF

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://news.ycombinator.com', {waitUntil: 'networkidle2'});
  await page.pdf({path: 'hn.pdf', format: 'A4'});
  await browser.close();
})();

Getting the viewport size

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://example.com');

  // Get the "viewport" of the page, as reported by the page.
  const dimensions = await page.evaluate(() => {
    return {
      width: document.documentElement.clientWidth,
      height: document.documentElement.clientHeight,
      deviceScaleFactor: window.devicePixelRatio
    };
  });
  console.log('Dimensions:', dimensions);
  await browser.close();
})();

III. Pitfalls

1. hover has no visible effect

 await page.waitFor(2000); // without a delay before hover, the call times out (reason unclear)
 await page.hover(moreBtnSelector)
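
An alternative to a fixed delay is to wait until the element is actually visible before hovering; `page.waitFor` is also deprecated in later Puppeteer releases. The sketch below assumes the timeout occurs because the element has not rendered yet, which the text above does not confirm. `hoverWhenVisible` is a hypothetical helper name, while `page.waitForSelector` (with its `visible` and `timeout` options) and `page.hover` are Puppeteer methods.

```javascript
// Hypothetical helper: wait for the element to become visible, then hover over it.
async function hoverWhenVisible(page, selector, timeout = 30000) {
  // Resolves once the selector matches a visible element;
  // rejects if that does not happen within `timeout` milliseconds.
  await page.waitForSelector(selector, { visible: true, timeout });
  await page.hover(selector);
}

// Usage (inside an async function):
//   await hoverWhenVisible(page, moreBtnSelector);
```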

2. Switching pages

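(1) browser.pages()

  (await browser.pages())[0] // browser.pages() returns a promise, so await it first

(2) Match on page.url() (Page has no name() method; page.title() is the nearest equivalent)

   const page2 = (await browser.pages()).find(p => p.url().indexOf('beijing') > -1)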

(3) Use the targetcreated event

 const newPagePromise = new Promise(x => this.browser.once('targetcreated', target => x(target.page()))); // resolves with the new page as soon as targetcreated fires
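
The listener has to be registered before the action that opens the new tab, otherwise the event can be missed. A minimal sketch of that ordering follows; `waitForNewPage` and its `trigger` callback are hypothetical names, and the selector in the usage comment is a placeholder.

```javascript
// Hypothetical helper: run `trigger` (for example a click that opens a new tab)
// and return the page that it created.
async function waitForNewPage(browser, trigger) {
  // Register the listener first so the event cannot be missed.
  const newPagePromise = new Promise((resolve) =>
    browser.once('targetcreated', (target) => resolve(target.page()))
  );
  await trigger();
  // target.page() returns a promise; resolving with it yields the page itself.
  return newPagePromise;
}

// Usage (inside an async function):
//   const page2 = await waitForNewPage(browser, () => page.click('a[target="_blank"]'));
```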

3. Nested iframes: getting an iframe

(1) page.frames()

   const frame1 = page.frames()[0] // page.frames() is synchronous, so no await is needed

(2) frame.url() and frame.name(). The address returned by frame.url() sometimes differs from the src attribute seen in the markup, and frame.name() may not be unique.

const frame1 = page.frames().find(f => f.name() === 'allbox');

4. On a slow network, operating on an iframe fails because the iframe is still undefined

Solution

    /*
        Poll until the iframe is found.
        f: function that returns the frame, or undefined if it is not there yet
        s: timeout in milliseconds (default 60 s)
        t: polling interval in milliseconds
    */
    async waitForIframe(f = () => { }, s = 60000, t = 1000) {
        const beginDate = Date.now()
        while (Date.now() - beginDate <= s) {
            const frame = await f()
            if (frame) return frame
            // Not found yet: wait one interval, then try again
            await new Promise(resolve => setTimeout(resolve, t))
        }
        throw new Error('TimeoutError')
    }

// waitForIframe returns a promise, so await it before using the frame
const mainIframe = await this.waitForIframe(() => page.mainFrame().childFrames().find(f => f.url().indexOf('/iframe') > -1))

const childIframe = await this.waitForIframe(() => mainIframe.childFrames().find(f => f.name() === 'childIframe'))

5. Hiding a popup through its style

await frame1.$eval('div.layer', f => f.style.display = 'none') // matches the first element only
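
`$eval` acts only on the first match. When several popups share the selector, `$$eval` passes the callback an array of all matching elements. A minimal sketch; `hideAll` is a hypothetical helper name.

```javascript
// Hypothetical helper: hide every element matching `selector` in a page or frame.
// Resolves to the number of elements that were hidden.
async function hideAll(frame, selector) {
  return frame.$$eval(selector, (elements) => {
    elements.forEach((el) => { el.style.display = 'none'; });
    return elements.length;
  });
}

// Usage (inside an async function):
//   await hideAll(frame1, 'div.layer');
```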

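6. Filling inputs: use XPath to get all editable inputs, then fill them in

// Fill every editable text input that carries a jpath attribute
await this.typeInput('//input[@type="text" and @jpath and not(@readonly)]', page, frame2)

async typeInput(xPath, page, frame) {
    if (!frame) frame = page;
    const elementHandles = await frame.$x(xPath)
    for (let i = 0; i < elementHandles.length; i++) {
        // Read the jpath attribute of the input
        const jpath = await frame.evaluate(el => el.getAttribute('jpath'), elementHandles[i])
        // this.gf is a helper not shown in this article; it is used here to look up the value for a jpath
        const value = this.gf(jpath)
        if (typeof value !== 'string') continue
        await elementHandles[i].focus()
        await this.inputClear(page)   // see item 7
        await elementHandles[i].type(value)
    }
    await this.inputBlur(page);       // see item 8
}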

7. page.type() appends to the existing value instead of replacing it

// Clear the input: select everything with Ctrl+A, then delete it
async inputClear(page) {
    await page.keyboard.down('Control');
    await page.keyboard.down('KeyA');
    await page.keyboard.press('Backspace'); // Backspace or Delete
    await page.keyboard.up('Control');
    await page.keyboard.up('KeyA');
}
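
Ctrl+A is platform-specific (macOS uses the Command key for select-all). Another common approach, sketched here as an assumption rather than the author's method, is to triple-click the input to select its text and then delete the selection. `clearAndType` is a hypothetical helper; `elementHandle.click` with the `clickCount` option, `page.keyboard.press`, and `elementHandle.type` are Puppeteer methods.

```javascript
// Hypothetical helper: replace the current value of a single-line input.
// elementHandle is the ElementHandle of the input; page supplies the keyboard.
async function clearAndType(page, elementHandle, value) {
  await elementHandle.click({ clickCount: 3 }); // triple-click selects the existing text
  await page.keyboard.press('Backspace');       // delete the selection
  await elementHandle.type(value);              // type the new value
}
```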

8. There is a focus method, but I could not find a blur method, so click somewhere else to simulate blur

// Make the input lose focus
async inputBlur(page) {
    await page.click('body')
}
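
Clicking `body` may land on an interactive element and trigger something unintended. Since Puppeteer can run code inside the page, another option is to call the DOM's own `blur()` on the element. A minimal sketch; `blurElement` is a hypothetical helper, and it passes the element handle to `evaluate` in the same way `typeInput` does in item 6.

```javascript
// Hypothetical helper: remove focus from an element by calling the DOM blur()
// method inside the page. `frame` can be a Page or a Frame.
async function blurElement(frame, elementHandle) {
  await frame.evaluate((el) => el.blur(), elementHandle);
}
```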

9. For input type="radio", use page.click

10. To fill a select, use page.select()
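
Items 9 and 10 can be sketched together. The selectors and values are placeholders and `chooseOptions` is a hypothetical helper; `page.click` and `page.select` are Puppeteer methods, and `page.select` resolves to the array of option values that ended up selected.

```javascript
// Hypothetical helper: tick a radio button and pick an option in a <select>.
async function chooseOptions(page, radioSelector, selectSelector, optionValue) {
  await page.click(radioSelector); // checks the radio button
  // page.select matches on the <option> value attribute, not on its label
  const selected = await page.select(selectSelector, optionValue);
  if (!selected.includes(optionValue)) {
    throw new Error(`Option "${optionValue}" was not selected in ${selectSelector}`);
  }
  return selected;
}

// Usage (inside an async function; selectors are placeholders):
//   await chooseOptions(page, 'input[type="radio"][value="1"]', 'select#city', 'beijing');
```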

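11. If you prefer jQuery-style syntax, cheerio can be used

const cheerio = require('cheerio');

const body = await page.content();
// cheerio parses the HTML string returned by page.content(); it can read the
// markup but cannot change the live page
const $ = cheerio.load(body);
// Inside page.evaluate the code runs in the browser, so $ there is the page's own
// jQuery (this assumes the page loads jQuery); the cheerio instance is not
// serializable and cannot be passed in
await page.evaluate(() => {
    const selector = $('input[jpath="zzsybsbSbbdxxVO.qtkspzmxb.txffs"]')
    selector.focus();
    selector.val(7777);
    // Angular uses ng-model two-way binding, so the input event has to be triggered manually
    selector.trigger("input").trigger('propertychange').trigger("change")
    selector.blur();
});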
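
The same update can also be written with plain DOM calls, without jQuery-style syntax: set the value, then dispatch the events the framework listens for. A minimal sketch; `setInputValue` is a hypothetical helper, and which events a given page needs (`input`, `change`, or others) is an assumption to check against that page.

```javascript
// Hypothetical helper: set an input's value inside the page and notify the
// framework by dispatching input and change events. `frame` can be a Page or a Frame.
async function setInputValue(frame, selector, value) {
  await frame.$eval(selector, (el, newValue) => {
    el.value = newValue;
    el.dispatchEvent(new Event('input', { bubbles: true }));
    el.dispatchEvent(new Event('change', { bubbles: true }));
  }, value);
}

// Usage (inside an async function):
//   await setInputValue(page, 'input[jpath="zzsybsbSbbdxxVO.qtkspzmxb.txffs"]', '7777');
```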