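Two years ago I wrote an introduction to transformers.js on the front end, which could run things like chat, image recognition, speech-to-text, and text-to-speech: Transformers.js: State-of-the-Art Machine Learning on the Web (1)
Back then the models were still pretty weak, so I only played with it casually. But it's the year "6202" now (2026), and Chinese open-source models have been evolving at rocket speed; natively multimodal models like Qwen3.5 are already out. This morning I had a sudden idea: could one of them run in the browser? So I went to hf.co to check whether there was a 3.5 model I could use, and a quick search showed there really was: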
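Clicking through, I found there was even a demo, and a decent-looking one, which suggested it should run out of the box.
So I had an AI write an interface to try it out; here I used 哈基米 (Gemini) 3.1 Pro:
And here it is in use: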
For OCR, the 0.8B model isn't very accurate; if you're interested, try the 4B.
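One thing worth checking before blaming the model size: the generated page resizes every uploaded image with `rImg.resize(448, 448)`, which stretches non-square images. A minimal sketch of an aspect-ratio-preserving size calculation follows. `fitWithin` and its `maxSide` default are hypothetical names introduced here, and whether this actually improves OCR accuracy is untested.

```javascript
// Hypothetical helper (not part of the generated page): scale a
// (width, height) pair so it fits inside a maxSide x maxSide square while
// keeping the aspect ratio. It never upscales small images.
function fitWithin(width, height, maxSide = 448) {
  const scale = Math.min(1, maxSide / Math.max(width, height))
  return {
    // Clamp to at least 1 pixel so extreme aspect ratios stay valid
    width: Math.max(1, Math.round(width * scale)),
    height: Math.max(1, Math.round(height * scale)),
  }
}
```

Assuming `RawImage` exposes `width` and `height`, the call site could become `const { width, height } = fitWithin(rImg.width, rImg.height)` followed by `rImg.resize(width, height)`.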
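(While writing this post I had the idea of adding a speed readout. It doesn't seem to start timing from the first returned token, though, so it may not be accurate; feel free to improve it yourself.)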
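A possible refinement, sketched under the assumption that the streamer's first `put()` carries the prompt and every later `put()` carries newly generated tokens: start the clock at the first generated token so that prompt processing time is excluded. `DecodeSpeedMeter` is a hypothetical helper, not part of the generated page, and its clock is injectable so it can be checked without real delays.

```javascript
// Hypothetical helper: measures decode-only throughput by starting the
// clock at the first *generated* token instead of at the prompt.
class DecodeSpeedMeter {
  // `now` returns a timestamp in milliseconds; injectable for testing.
  constructor(now = () => performance.now()) {
    this.now = now
    this.sawPrompt = false
    this.startTime = null
    this.tokenCount = 0
  }

  // Call once per streamer put(). Assumption: the first call is the prompt.
  onPut(tokens = 1) {
    if (!this.sawPrompt) {
      this.sawPrompt = true // prompt ids: nothing generated yet, ignore
    } else if (this.startTime === null) {
      this.startTime = this.now() // first generated token: start timing
    } else {
      this.tokenCount += tokens // tokens decoded after the first one
    }
  }

  // Tokens per second since the first generated token (0 if unknown).
  tokensPerSecond() {
    if (this.startTime === null || this.tokenCount === 0) return 0
    const elapsedSec = (this.now() - this.startTime) / 1000
    return elapsedSec > 0 ? this.tokenCount / elapsedSec : 0
  }
}
```

Inside a `TextStreamer` subclass, `put(value)` could call `meter.onPut()` before `super.put(value)`, and `on_finalized_text` could read `meter.tokensPerSecond()`.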
Here is the AI-generated HTML:
<!doctype html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>Qwen3.5-ONNX WebGPU Chat</title>
<style>
:root {
--primary: #2563eb;
--bg-sidebar: #f3f4f6;
--bg-main: #ffffff;
--border: #e5e7eb;
--msg-user: #2563eb;
--msg-ai: #f3f4f6;
}
* {
box-sizing: border-box;
margin: 0;
padding: 0;
}
body {
font-family:
-apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Helvetica,
Arial, sans-serif;
display: flex;
height: 100vh;
overflow: hidden;
background: var(--bg-main);
}
/* Sidebar Styles */
#sidebar {
width: 260px;
background: var(--bg-sidebar);
border-right: 1px solid var(--border);
display: flex;
flex-direction: column;
}
#sidebar-header {
padding: 13px 15px 12.5px 15px;
border-bottom: 1px solid var(--border);
}
#new-chat-btn {
width: 100%;
padding: 10px;
background: var(--primary);
color: white;
border: none;
border-radius: 6px;
cursor: pointer;
font-size: 14px;
}
#new-chat-btn:hover {
background: #1d4ed8;
}
#chat-list {
flex: 1;
overflow-y: auto;
padding: 10px;
}
.chat-item {
padding: 10px;
margin-bottom: 5px;
border-radius: 6px;
cursor: pointer;
display: flex;
justify-content: space-between;
align-items: center;
}
.chat-item:hover {
background: #e5e7eb;
}
.chat-item.active {
background: #dbeafe;
color: #1e40af;
}
.chat-title {
font-size: 14px;
white-space: nowrap;
overflow: hidden;
text-overflow: ellipsis;
flex: 1;
}
.delete-btn {
color: #ef4444;
font-size: 12px;
border: none;
background: none;
cursor: pointer;
padding: 0 5px;
display: none;
}
.chat-item:hover .delete-btn {
display: block;
}
/* Main Chat Area */
#main {
flex: 1;
display: flex;
flex-direction: column;
position: relative;
}
#header {
padding: 15px 20px;
border-bottom: 1px solid var(--border);
display: flex;
justify-content: space-between;
align-items: center;
background: white;
z-index: 10;
}
#model-controls {
display: flex;
gap: 10px;
align-items: center;
}
select {
padding: 8px;
border-radius: 4px;
border: 1px solid var(--border);
font-size: 14px;
min-width: 250px;
}
#load-model-btn {
padding: 8px 15px;
background: var(--primary);
color: white;
border: none;
border-radius: 4px;
cursor: pointer;
}
#load-model-btn:disabled {
background: #9ca3af;
cursor: not-allowed;
}
#chat-container {
flex: 1;
overflow-y: auto;
padding: 20px;
display: flex;
flex-direction: column;
gap: 15px;
background: white;
}
.message {
max-width: 80%;
padding: 12px 16px;
border-radius: 8px;
line-height: 1.5;
font-size: 15px;
word-wrap: break-word;
}
.message img {
max-width: 300px;
border-radius: 6px;
margin-bottom: 8px;
display: block;
}
.user-msg {
background: var(--msg-user);
color: white;
align-self: flex-end;
border-bottom-right-radius: 0;
}
.ai-msg {
background: var(--msg-ai);
color: black;
align-self: flex-start;
border-bottom-left-radius: 0;
}
/* Token Speed Stats */
.msg-stats {
font-size: 12px;
color: #6b7280;
margin-top: 8px;
font-family:
ui-monospace, SFMono-Regular, Consolas, 'Courier New', monospace;
}
/* Input Area */
#input-wrapper {
padding: 15px 20px;
border-top: 1px solid var(--border);
background: white;
}
#preview-container {
display: none;
margin-bottom: 10px;
position: relative;
width: max-content;
}
#preview-img {
max-height: 100px;
border-radius: 6px;
border: 1px solid var(--border);
}
#remove-img-btn {
position: absolute;
top: -8px;
right: -8px;
background: #ef4444;
color: white;
border: none;
border-radius: 50%;
width: 20px;
height: 20px;
cursor: pointer;
display: flex;
align-items: center;
justify-content: center;
font-size: 12px;
}
#input-form {
display: flex;
gap: 10px;
align-items: flex-end;
}
#text-input {
flex: 1;
padding: 12px;
border: 1px solid var(--border);
border-radius: 6px;
resize: none;
min-height: 45px;
max-height: 150px;
font-family: inherit;
font-size: 15px;
}
.icon-btn {
background: none;
border: 1px solid var(--border);
border-radius: 6px;
padding: 10px;
cursor: pointer;
color: #4b5563;
display: flex;
align-items: center;
justify-content: center;
}
.icon-btn:hover {
background: #f3f4f6;
}
#send-btn {
background: var(--primary);
color: white;
border: none;
padding: 0 20px;
height: 45px;
border-radius: 6px;
cursor: pointer;
font-weight: 500;
}
#send-btn:disabled {
background: #9ca3af;
cursor: not-allowed;
}
/* Loading Modal */
#loading-modal {
position: absolute;
top: 0;
left: 0;
right: 0;
bottom: 0;
background: rgba(255, 255, 255, 0.9);
z-index: 100;
display: none;
flex-direction: column;
justify-content: center;
align-items: center;
}
.modal-box {
background: white;
padding: 30px;
border-radius: 12px;
box-shadow: 0 4px 20px rgba(0, 0, 0, 0.1);
width: 400px;
text-align: center;
}
#progress-text {
margin-bottom: 15px;
font-weight: 500;
color: #374151;
}
progress {
width: 100%;
height: 10px;
border-radius: 5px;
appearance: none;
}
progress::-webkit-progress-bar {
background-color: #e5e7eb;
border-radius: 5px;
}
progress::-webkit-progress-value {
background-color: var(--primary);
border-radius: 5px;
}
</style>
</head>
<body>
<!-- Left: conversation history -->
<div id="sidebar">
<div id="sidebar-header">
<button id="new-chat-btn">+ New chat</button>
</div>
<div id="chat-list"></div>
</div>
<!-- Right: main interface -->
<div id="main">
<div id="header">
<h3>Qwen WebGPU Chat</h3>
<div id="model-controls">
<select id="model-select">
<option value="huggingworld/Qwen3.5-0.8B-ONNX">
Qwen3.5-0.8B-ONNX
</option>
<option value="huggingworld/Qwen3.5-2B-ONNX">
Qwen3.5-2B-ONNX
</option>
<option value="huggingworld/Qwen3.5-4B-ONNX">
Qwen3.5-4B-ONNX
</option>
</select>
<button id="load-model-btn">Load model</button>
</div>
</div>
<div id="chat-container">
<!-- Chat messages are inserted here -->
</div>
<div id="input-wrapper">
<div id="preview-container">
<img id="preview-img" src="" alt="preview" />
<button id="remove-img-btn">✕</button>
</div>
<form id="input-form">
<input
type="file"
id="file-input"
accept="image/*"
style="display: none"
/>
<button
type="button"
class="icon-btn"
id="attach-btn"
title="Upload image"
>
📷
</button>
<textarea
id="text-input"
placeholder="Type a message (Shift + Enter for a new line)..."
rows="1"
></textarea>
<button type="submit" id="send-btn" disabled>Send</button>
</form>
</div>
<!-- Model loading progress modal -->
<div id="loading-modal">
<div class="modal-box">
<h3 style="margin-bottom: 15px">Loading model</h3>
<div id="progress-text">Initializing...</div>
<progress id="progress-bar" value="0" max="100"></progress>
<p style="margin-top: 15px; font-size: 12px; color: #6b7280">
The first load downloads the model files automatically; please be patient.
</p>
</div>
</div>
</div>
<script type="module">
import {
AutoProcessor,
Qwen3_5ForConditionalGeneration,
RawImage,
TextStreamer,
} from 'https://unpkg.com/@huggingface/transformers@4.0.0-next.8/dist/transformers.min.js'
// ============== Custom streamer: live UI updates and speed measurement ==============
class DOMTextStreamer extends TextStreamer {
constructor(tokenizer, callback) {
super(tokenizer, { skip_prompt: true, skip_special_tokens: true })
this.callback = callback
this.generatedText = ''
// State for the speed statistics
this.tokenCount = 0
this.startTime = null
this.first_put_done = false
this.finalTps = 0
}
put(value) {
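// The first put() is assumed to carry the prompt token ids
if (!this.first_put_done) {
this.first_put_done = true
// Start the clock here. If put() fires before the prompt has been
// processed, the result includes prefill time and is not pure decode speed.
this.startTime = performance.now()
} else {
// Each later put() adds the newly generated tokens
const count =
value.size !== undefined ? value.size : value.length || 1
this.tokenCount += count
}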
super.put(value)
}
on_finalized_text(text, streamEnd) {
this.generatedText += text
// Compute the current speed (tokens/s)
let tps = 0
if (this.tokenCount > 0 && this.startTime) {
const elapsed = (performance.now() - this.startTime) / 1000
if (elapsed > 0) tps = (this.tokenCount / elapsed).toFixed(2)
}
this.finalTps = tps
this.callback(this.generatedText, streamEnd, tps)
}
}
// ============== IndexedDB wrapper ==============
const DB_NAME = 'QwenChatDB'
const STORE_NAME = 'chats'
let db
function initDB() {
return new Promise((resolve, reject) => {
const req = indexedDB.open(DB_NAME, 1)
req.onupgradeneeded = (e) => {
let db = e.target.result
if (!db.objectStoreNames.contains(STORE_NAME)) {
db.createObjectStore(STORE_NAME, { keyPath: 'id' })
}
}
req.onsuccess = (e) => {
db = e.target.result
resolve()
}
req.onerror = (e) => reject(e)
})
}
function saveChat(chat) {
return new Promise((resolve, reject) => {
const tx = db.transaction(STORE_NAME, 'readwrite')
tx.objectStore(STORE_NAME).put(chat)
tx.oncomplete = () => resolve()
// Reject on failure so callers are not left waiting forever
tx.onerror = () => reject(tx.error)
})
}
function getAllChats() {
return new Promise((resolve, reject) => {
const tx = db.transaction(STORE_NAME, 'readonly')
const req = tx.objectStore(STORE_NAME).getAll()
req.onsuccess = () => resolve(req.result || [])
req.onerror = () => reject(req.error)
})
}
function deleteChat(id) {
return new Promise((resolve, reject) => {
const tx = db.transaction(STORE_NAME, 'readwrite')
tx.objectStore(STORE_NAME).delete(id)
tx.oncomplete = () => resolve()
tx.onerror = () => reject(tx.error)
})
}
// ============== Global state ==============
let currentChatId = null
let chatHistory = []
let processor = null
let model = null
let isLoading = false
let currentImageBase64 = null
// ============== UI elements ==============
const chatListEl = document.getElementById('chat-list')
const chatContainerEl = document.getElementById('chat-container')
const textInput = document.getElementById('text-input')
const sendBtn = document.getElementById('send-btn')
const loadModelBtn = document.getElementById('load-model-btn')
const modelSelect = document.getElementById('model-select')
const attachBtn = document.getElementById('attach-btn')
const fileInput = document.getElementById('file-input')
const previewContainer = document.getElementById('preview-container')
const previewImg = document.getElementById('preview-img')
const removeImgBtn = document.getElementById('remove-img-btn')
const loadingModal = document.getElementById('loading-modal')
const progressBar = document.getElementById('progress-bar')
const progressText = document.getElementById('progress-text')
// ============== Initialization ==============
async function init() {
await initDB()
await loadSidebar()
createNewChat()
}
// ============== Sidebar and history management ==============
async function loadSidebar() {
const chats = await getAllChats()
chats.sort((a, b) => b.id - a.id)
chatListEl.innerHTML = ''
chats.forEach((chat) => {
const div = document.createElement('div')
div.className = `chat-item ${chat.id === currentChatId ? 'active' : ''}`
div.onclick = () => selectChat(chat)
const title = document.createElement('div')
title.className = 'chat-title'
title.innerText = chat.title || 'New chat'
const delBtn = document.createElement('button')
delBtn.className = 'delete-btn'
delBtn.innerText = 'Delete'
delBtn.onclick = async (e) => {
e.stopPropagation()
await deleteChat(chat.id)
if (currentChatId === chat.id) createNewChat()
loadSidebar()
}
div.appendChild(title)
div.appendChild(delBtn)
chatListEl.appendChild(div)
})
}
function createNewChat() {
currentChatId = Date.now()
chatHistory = []
chatContainerEl.innerHTML = ''
loadSidebar()
}
async function selectChat(chat) {
currentChatId = chat.id
chatHistory = chat.messages || []
chatContainerEl.innerHTML = ''
// Also show the previously recorded speed when reloading history
chatHistory.forEach((msg) => {
const { statsDiv } = appendMessageUI(msg.role, msg.text, msg.image)
if (msg.tps && statsDiv) {
statsDiv.innerText = `⚡ Speed: ${msg.tps} tokens/s`
}
})
loadSidebar()
}
document
.getElementById('new-chat-btn')
.addEventListener('click', createNewChat)
// ============== Image upload preview ==============
attachBtn.onclick = () => fileInput.click()
fileInput.onchange = (e) => {
const file = e.target.files[0]
if (!file) return
const reader = new FileReader()
reader.onload = (event) => {
currentImageBase64 = event.target.result
previewImg.src = currentImageBase64
previewContainer.style.display = 'block'
}
reader.readAsDataURL(file)
fileInput.value = ''
}
removeImgBtn.onclick = () => {
currentImageBase64 = null
previewContainer.style.display = 'none'
previewImg.src = ''
}
// ============== Model loading ==============
loadModelBtn.onclick = async () => {
if (isLoading) return
const model_id = modelSelect.value
isLoading = true
loadModelBtn.disabled = true
modelSelect.disabled = true
loadingModal.style.display = 'flex'
const progressMap = {}
try {
const progressCallback = (info) => {
if (info.status === 'progress') {
progressMap[info.file] = {
loaded: info.loaded,
total: info.total,
}
let totalLoaded = 0,
totalSize = 0
for (let file in progressMap) {
totalLoaded += progressMap[file].loaded
totalSize += progressMap[file].total
}
const percent =
totalSize > 0 ? (totalLoaded / totalSize) * 100 : 0
progressBar.value = percent
progressText.innerText = `Downloading... ${Math.round(percent)}%`
} else if (info.status === 'ready') {
progressText.innerText = `Loading onto WebGPU... this may take a while`
}
}
processor = await AutoProcessor.from_pretrained(model_id, {
progress_callback: progressCallback,
})
model = await Qwen3_5ForConditionalGeneration.from_pretrained(
model_id,
{
dtype: {
embed_tokens: 'q4',
vision_encoder: 'fp16',
decoder_model_merged: 'q4',
},
device: 'webgpu',
progress_callback: progressCallback,
},
)
sendBtn.disabled = false
loadModelBtn.innerText = 'Model loaded'
} catch (error) {
console.error(error)
alert(
'Failed to load the model. Make sure your browser supports WebGPU (Chrome 113+), or check the console for the specific error.\n' +
error.message,
)
loadModelBtn.disabled = false
modelSelect.disabled = false
} finally {
isLoading = false
loadingModal.style.display = 'none'
}
}
// ============== Message UI ==============
// Returns an object with textSpan and statsDiv so callers can update them live
function appendMessageUI(role, text, imageBase64 = null, msgId = null) {
const msgDiv = document.createElement('div')
msgDiv.className = `message ${role === 'user' ? 'user-msg' : 'ai-msg'}`
if (msgId) msgDiv.id = msgId
if (imageBase64) {
const img = document.createElement('img')
img.src = imageBase64
msgDiv.appendChild(img)
}
const textSpan = document.createElement('span')
textSpan.innerText = text
msgDiv.appendChild(textSpan)
// For AI replies, add a dedicated container for the speed readout
let statsDiv = null
if (role === 'assistant') {
statsDiv = document.createElement('div')
statsDiv.className = 'msg-stats'
msgDiv.appendChild(statsDiv)
}
chatContainerEl.appendChild(msgDiv)
chatContainerEl.scrollTop = chatContainerEl.scrollHeight
return { textSpan, statsDiv }
}
// ============== Sending messages and model inference ==============
document
.getElementById('input-form')
.addEventListener('submit', async (e) => {
e.preventDefault()
const text = textInput.value.trim()
if (!text && !currentImageBase64) return
if (!model || !processor) {
alert('Please load the model first!')
return
}
const userText = text
const userImg = currentImageBase64
// Clear the input box
textInput.value = ''
removeImgBtn.click()
// Show the user message and save it
appendMessageUI('user', userText, userImg)
chatHistory.push({ role: 'user', text: userText, image: userImg })
updateDB()
// Disable input while generating
sendBtn.disabled = true
textInput.disabled = true
// Prepare the AI message bubble
const aiMsgId = `ai-msg-${Date.now()}`
const { textSpan: aiTextSpan, statsDiv: aiStatsDiv } =
appendMessageUI('assistant', 'Thinking...', null, aiMsgId)
try {
// 1. Build the conversation in the format apply_chat_template expects
const conversation = []
const rawImages = []
for (let msg of chatHistory) {
if (msg.role === 'user') {
const content = []
if (msg.image) {
content.push({ type: 'image' })
const rImg = await RawImage.read(msg.image)
const resized = await rImg.resize(448, 448)
rawImages.push(resized)
}
content.push({ type: 'text', text: msg.text || '' })
conversation.push({ role: 'user', content: content })
} else {
conversation.push({
role: 'assistant',
content: [{ type: 'text', text: msg.text }],
})
}
}
// 2. Call the processor
const promptText = processor.apply_chat_template(conversation, {
add_generation_prompt: true,
})
let inputs
if (rawImages.length > 0) {
inputs = await processor(
promptText,
rawImages.length === 1 ? rawImages[0] : rawImages,
)
} else {
inputs = await processor(promptText)
}
// 3. Build a streamer that updates the page live and shows the speed
aiTextSpan.innerText = ''
const streamer = new DOMTextStreamer(
processor.tokenizer,
(newText, isEnd, tps) => {
aiTextSpan.innerText = newText
if (tps > 0) {
aiStatsDiv.innerText = `⚡ Speed: ${tps} tokens/s`
}
chatContainerEl.scrollTop = chatContainerEl.scrollHeight
},
)
// 4. Start generating
await model.generate({
...inputs,
max_new_tokens: 512,
streamer: streamer,
})
// 5. Save the result (including the speed, so it shows when the chat is reloaded)
chatHistory.push({
role: 'assistant',
text: aiTextSpan.innerText,
tps: streamer.finalTps,
})
updateDB()
} catch (err) {
console.error('Generation error: ', err)
aiTextSpan.innerText = 'Generation failed: ' + err.message
} finally {
sendBtn.disabled = false
textInput.disabled = false
textInput.focus()
}
})
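// Enter sends; Shift + Enter inserts a newline.
// Skip Enter presses that only confirm an IME composition.
textInput.addEventListener('keydown', (e) => {
if (e.key === 'Enter' && !e.shiftKey && !e.isComposing) {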
e.preventDefault()
document.getElementById('send-btn').click()
}
})
async function updateDB() {
// Use the first message as the title automatically
const title =
chatHistory.length > 0
? chatHistory[0].text.substring(0, 15) || 'Image chat'
: 'New chat'
await saveChat({
id: currentChatId,
title: title,
messages: chatHistory,
})
loadSidebar()
}
// Start the app
init()
</script>
</body>
</html>
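The load handler in the page above only surfaces a missing WebGPU implementation once `from_pretrained` throws. A small pre-flight check could fail faster. This is a sketch: `hasWebGPU` is a hypothetical name, and it relies only on the standard `navigator.gpu.requestAdapter()` call, which resolves to `null` when no suitable adapter is available.

```javascript
// Hypothetical pre-flight check. `nav` defaults to the browser's navigator
// and is a parameter only so the function can be exercised outside a browser.
async function hasWebGPU(nav = globalThis.navigator) {
  // The WebGPU API is not exposed at all
  if (!nav || !nav.gpu) return false
  try {
    // Resolves to null when no suitable GPU adapter is available
    const adapter = await nav.gpu.requestAdapter()
    return adapter !== null && adapter !== undefined
  } catch {
    // Treat a rejected adapter request as "not available"
    return false
  }
}
```

It could be awaited at the top of the `loadModelBtn.onclick` handler, showing the alert and returning early when it resolves to `false`.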
This is the proxied copy I put on Vercel; you can try it via this link: my-ai-gen-demo.vercel.app/qwen-webgpu…