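No preamble, straight to the how-to.
tesseract.js is a library that recognizes text in images (OCR).
GitHub: github.com/naptha/tess…
First, install tesseract.js with npm (note that the package name is tesseract.js, not tesseract):
npm i tesseract.js
Then use it as follows: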
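import { createWorker } from 'tesseract.js';

(async () => {
  // Create a worker with the English language data loaded
  const worker = await createWorker('eng');
  // Run OCR on the sample image
  const ret = await worker.recognize('https://tesseract.projectnaptha.com/img/eng_bw.png');
  console.log(ret.data.text);
  // Shut the worker down to release its resources
  await worker.terminate();
})();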
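The code above only recognizes English, which is what the eng argument passed to createWorker selects. To recognize Simplified Chinese instead, change it to chi_sim.
For images that mix English and Chinese, change it to eng+chi_sim.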
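To make the language switch concrete, here is a minimal sketch that wraps the same createWorker / recognize / terminate calls in a reusable function. The names toLangSpec and recognizeText are hypothetical helpers made up for this illustration, not part of tesseract.js. The library is loaded lazily with a dynamic import, and the fallback to mod.default is an assumption to cover different module setups.

```javascript
// Hypothetical helper (not part of tesseract.js): joins language codes
// into the "eng+chi_sim" form described above.
function toLangSpec(langs) {
  return langs.join('+');
}

// Hypothetical wrapper: runs OCR on `image` with the given language codes.
async function recognizeText(image, langs) {
  // Load tesseract.js lazily; fall back to the default export if the
  // named export is not exposed directly (assumption about module interop).
  const mod = await import('tesseract.js');
  const createWorker = mod.createWorker ?? mod.default.createWorker;

  const worker = await createWorker(toLangSpec(langs));
  try {
    const { data } = await worker.recognize(image);
    return data.text;
  } finally {
    // Always release the worker, even if recognition throws.
    await worker.terminate();
  }
}
```

For a mixed English and Chinese image, a call would look like recognizeText('./mixed.png', ['eng', 'chi_sim']).then(console.log), where ./mixed.png is a placeholder path. The try/finally is a design choice so the worker is terminated even when recognition fails.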
The sample image is the one at the URL passed to recognize above.
Output of the code:
Mild Splendour of the various-vested Night! Mother of wildly-working visions! hail I watch thy gliding, while with watery light Thy weak eye glimmers through a fleecy veil; And when thou lovest thy pale orb to shroud Behind the gather’d blackness lost on high; And when thou dartest from the wind-rent cloud Thy placid lightning o’er the awaken’d sky.