The previous article, "Edge AI Model Deployment in Practice, Part 5 (Loading Large Models on Android)", implemented loading and running an LLM text model on an Android phone. This article continues with loading a multimodal model.
AI prompt: port Qwen2.5-Omni-3B-Instruct with Windows + llama.cpp + Android Studio
Why Qwen2.5-Omni-3B-Instruct? One reason is that my phone is low-end and this model is small enough to run; the deeper reason is that it lets me validate text, image, audio, and video in a single pass, which saves time.
I. Overall Goal
Build llama.cpp + Qwen2.5-Omni-3B into .so libraries callable from Android, and turn them into an offline omni-modal AI app in Android Studio.
II. Prerequisites
- A Windows PC
- Android Studio Hedgehog or newer
- An installed NDK (25c or 26b)
- Model files (must be the 3B variant):
qwen2.5-omni-3b.Q4_K_M.gguf
mmproj-Qwen2.5-Omni-3B-f16.gguf
III. Core Steps
1. Download the llama.cpp source
Download from the official site: run git clone github.com/ggml-org/ll… and copy (scp) the source to a local directory.
If the download is slow, use the domestic mirror git clone gitee.com/mirrors/lla… — it is much faster.
Note: the llama.cpp codebase has been changing rapidly. Snapshots from different periods differ widely in build procedure and JNI-facing APIs, and model support also shifts between versions, so the source must be matched to the model.
Confirm your llama.cpp version with git describe --tags; mine, for example, is b8648. If you code with an AI assistant, be sure to preset this version in the prompt, otherwise the AI defaults to the latest APIs and produces piles of interface compile errors.
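Since everything downstream (build flags, JNI calls, supported models) depends on that tag, it helps to pin the checkout explicitly. A self-contained sketch (a throwaway repo stands in for the real llama.cpp clone; the tag name is the one from this article):

```shell
# Demonstrate pinning a checkout to a known tag, as done with llama.cpp b8648
set -e
repo="$(mktemp -d)/llama.cpp"          # stand-in for the real clone
git init -q "$repo" && cd "$repo"
git -c user.name=demo -c user.email=demo@local \
    commit -q --allow-empty -m "snapshot"
git tag b8648                          # the tag this article builds against
git describe --tags                    # prints the pinned version: b8648
git checkout -q b8648                  # detach at that exact tag before building
```

With the tree pinned, every rebuild and every AI-assisted edit targets one fixed API surface.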
2. Model download and conversion
Pick a multimodal model supported by llama.cpp b8648 above. Before downloading, run the command below to confirm which models the current llama.cpp supports, so you don't end up with a model that cannot be converted and quantized:
python convert_hf_to_gguf.py --print-supported-models
Model download locations:
The official complete repository on Hugging Face: huggingface.co/Qwen/Qwen2.5-Omni-3B
The sites above offer both the original unquantized model and pre-quantized builds.
Quantized model download:
modelscope download --model ggml-org/Qwen2.5-Omni-3B-GGUF --local_dir D:\workspace\AI\lm_studio_models\Qwen2.5-Omni-3B-GGUF
Original model download:
modelscope download --model Qwen/Qwen2.5-Omni-3B --local_dir D:\workspace\AI\lm_studio_models\Qwen2.5-Omni-3B
After downloading, you can also quantize the model yourself; the method was covered in an earlier article.
3. Building and integrating the llama.cpp .so (the key step!)
The goal here is to build llama.cpp's inference engine into .so libraries for the phone and expose JNI interfaces for the APK to call, so the focus is on generating the .so files and implementing the JNI layer.
There are two main ways to integrate the .so libraries. Approach 1: prebuild llama.cpp and place the generated .so files into the project; Part 5 of this series did exactly that, putting them under app/src/main/jniLibs/arm64-v8a/*.so. Approach 2: put llama.cpp directly into the project tree and build it from source.
Approach 1 can use the build script below:
@echo off
:: ==============================================
:: llama.cpp b8648 Android arm64-v8a build script
:: MTMD (multimodal) enabled
:: ==============================================
:: 1. Enter the source root
cd /d D:\workspace\AI\tools\llama.cpp-android\llama.cpp
:: 2. Clean the old build
rmdir /s /q build-android
mkdir build-android
cd build-android
:: 3. NDK path
set ANDROID_NDK=D:\workspace\AI\SDK\ndk\28.2.13676358
chcp 65001 > nul
:: 4. CMake configuration (MTMD enabled)
cmake .. ^
-DCMAKE_TOOLCHAIN_FILE=%ANDROID_NDK%/build/cmake/android.toolchain.cmake ^
-DANDROID_ABI=arm64-v8a ^
-DANDROID_PLATFORM=android-28 ^
-DCMAKE_BUILD_TYPE=Release ^
-DCMAKE_C_FLAGS="-march=armv8-a" ^
-DCMAKE_CXX_FLAGS="-march=armv8-a" ^
-DGGML_OPENMP=OFF ^
-DGGML_NATIVE=OFF ^
-DBUILD_SHARED_LIBS=ON ^
-DLLAMA_ANDROID=ON ^
-DGGML_CUDA=OFF ^
-DGGML_METAL=OFF ^
-DLLAMA_MTMD=ON ^
-DCMAKE_MAKE_PROGRAM=%ANDROID_NDK%/prebuilt/windows-x86_64/bin/make.exe ^
-G "Unix Makefiles"
:: 5. Build
cmake --build . -j4
echo.
echo ==============================
echo Build finished ✅ MTMD enabled
echo Output path: build-android/lib/libllama.so
echo ==============================
pause
This article uses approach 2: the entire llama.cpp tree is copied into LlamaTest\app\src\main\cpp\llama, with the structure below.
Approach 2 is used because this article targets multimodality, whose code lives under llama.cpp\tools\mtmd. The default build configuration does not compile the mtmd code, which leaves the multimodal JNI layer with a pile of undefined functions. To avoid touching llama.cpp's own CMakeLists.txt, the fix goes into the project's CMakeLists.txt, modified as follows:
LlamaTest\app\src\main\cpp\CMakeLists.txt
cmake_minimum_required(VERSION 3.22.1)
project(llamatest)
set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
# Compiler switches
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -O0 -fno-openmp -w")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -O0 -fno-openmp -w")
# Disable unneeded modules
set(LLAMA_BUILD_EXAMPLES OFF CACHE BOOL "" FORCE)
set(LLAMA_BUILD_TESTS OFF CACHE BOOL "" FORCE)
add_subdirectory(llama)
# ==============================================
# 🔥 Build all mtmd sources (including the helper)
# ==============================================
add_library(
mtmd
STATIC
llama/tools/mtmd/mtmd.cpp
llama/tools/mtmd/mtmd-image.cpp
llama/tools/mtmd/mtmd-helper.cpp
llama/tools/mtmd/clip.cpp
llama/tools/mtmd/mtmd-audio.cpp
llama/tools/mtmd/models/siglip.cpp
llama/tools/mtmd/models/cogvlm.cpp
llama/tools/mtmd/models/conformer.cpp
llama/tools/mtmd/models/gemma4v.cpp
llama/tools/mtmd/models/glm4v.cpp
llama/tools/mtmd/models/internvl.cpp
llama/tools/mtmd/models/kimivl.cpp
llama/tools/mtmd/models/kimik25.cpp
llama/tools/mtmd/models/nemotron-v2-vl.cpp
llama/tools/mtmd/models/llama4.cpp
llama/tools/mtmd/models/llava.cpp
llama/tools/mtmd/models/minicpmv.cpp
llama/tools/mtmd/models/paddleocr.cpp
llama/tools/mtmd/models/pixtral.cpp
llama/tools/mtmd/models/qwen2vl.cpp
llama/tools/mtmd/models/qwen3vl.cpp
llama/tools/mtmd/models/whisper-enc.cpp
llama/tools/mtmd/models/deepseekocr.cpp
llama/tools/mtmd/models/mobilenetv5.cpp
llama/tools/mtmd/models/youtuvl.cpp
)
# ==============================================
# 🎯 Key: specify the miniaudio path directly!!!
# ==============================================
target_include_directories(
mtmd PRIVATE
${CMAKE_CURRENT_SOURCE_DIR}/llama/include
${CMAKE_CURRENT_SOURCE_DIR}/llama/tools/mtmd
${CMAKE_CURRENT_SOURCE_DIR}/llama/ggml/src # ✅ here!
${CMAKE_CURRENT_SOURCE_DIR}/llama/ggml/include # ✅ here!
${CMAKE_CURRENT_SOURCE_DIR}/llama/vendor # ✅ here!
)
target_link_libraries(mtmd PRIVATE llama)
# JNI
add_library(llama_jni SHARED llama_wrapper.cpp)
target_include_directories(
llama_jni PRIVATE
${CMAKE_CURRENT_SOURCE_DIR}/llama/include
${CMAKE_CURRENT_SOURCE_DIR}/llama/tools/mtmd
)
target_link_libraries(
llama_jni
PRIVATE
llama
mtmd
log
android
jnigraphics
)
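With approach 2, the app module's Gradle config must also point at this CMakeLists.txt so Android Studio builds it. A minimal sketch in the Kotlin DSL (the module layout and version numbers here are assumptions, adjust to your project):

```kotlin
// app/build.gradle.kts — only the native-build wiring is shown (sketch)
android {
    defaultConfig {
        // Build only the ABI this article targets
        ndk { abiFilters += "arm64-v8a" }
    }
    externalNativeBuild {
        cmake {
            // Path to the CMakeLists.txt shown above
            path = file("src/main/cpp/CMakeLists.txt")
            version = "3.22.1"
        }
    }
}
```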
If you code with an AI assistant, you must hand it the files under mtmd so it knows which interfaces exist and uses only those; otherwise it will flip-flop between old and new APIs. I paid dearly for this and lost two full days going back and forth.
4. Omni-modal interface (text + image + audio)
Still using the earlier LlamaTest project; the main code changes are as follows:
LlamaTest\app\src\main\java\com\example\llamatest\MainActivity.kt
package com.example.llamatest
import android.content.Intent
import android.graphics.Bitmap
import android.graphics.BitmapFactory
import android.os.Bundle
import android.widget.EditText
import android.widget.TextView
import androidx.appcompat.app.AppCompatActivity
import java.io.File
import java.io.FileOutputStream
class MainActivity : AppCompatActivity() {
private lateinit var tvResult: TextView
private var modelPath: String? = null
private var mmprojPath: String? = null
private var isModelLoaded = false
override fun onCreate(savedInstanceState: Bundle?) {
super.onCreate(savedInstanceState)
setContentView(R.layout.activity_main)
tvResult = findViewById(R.id.tv_result)
val btnSelectModel = findViewById<android.widget.Button>(R.id.btn_select_model)
val btnSelectMmproj = findViewById<android.widget.Button>(R.id.btn_select_mmproj)
val btnLoadModel = findViewById<android.widget.Button>(R.id.btn_load_model)
val etPrompt = findViewById<EditText>(R.id.et_prompt)
val btnSendText = findViewById<android.widget.Button>(R.id.btn_send_text)
val btnSendImage = findViewById<android.widget.Button>(R.id.btn_send_image)
logMsg("✅ Started — select a model + mmproj projection file")
// Pick the model
btnSelectModel.setOnClickListener {
val intent = Intent(Intent.ACTION_OPEN_DOCUMENT).apply {
addCategory(Intent.CATEGORY_OPENABLE)
type = "*/*"
}
startActivityForResult(intent, 100)
}
// Pick the mmproj
btnSelectMmproj.setOnClickListener {
val intent = Intent(Intent.ACTION_OPEN_DOCUMENT).apply {
addCategory(Intent.CATEGORY_OPENABLE)
type = "*/*"
}
startActivityForResult(intent, 101)
}
// Load the multimodal model
btnLoadModel.setOnClickListener {
if (modelPath == null || mmprojPath == null) {
logMsg("❌ You must select both the model and the mmproj")
return@setOnClickListener
}
logMsg("⏳ Loading Qwen2.5-Omni...")
Thread {
val success = llamaBridge.loadMultimodal(modelPath!!, mmprojPath!!)
runOnUiThread {
isModelLoaded = success
logMsg(if (success) "🎉 Model + mmproj loaded" else "❌ Load failed (check mmproj / build with LLAMA_MTMD=ON)")
}
}.start()
}
// Plain text
btnSendText.setOnClickListener {
val input = etPrompt.text.toString().trim()
if (input.isEmpty() || !isModelLoaded) {
logMsg(if (input.isEmpty()) "❌ Input must not be empty" else "❌ Load the model first")
return@setOnClickListener
}
logMsg("\n🧑💻:$input")
logMsg("🤖 Thinking...")
Thread {
try {
val reply = llamaBridge.streamGenerate(input)
runOnUiThread { logMsg("🤖:$reply") }
} catch (e: Exception) {
runOnUiThread { logMsg("❌ Text failed: ${e.message}") }
}
}.start()
}
// Image + text (actually passes a Bitmap)
btnSendImage.setOnClickListener {
val input = etPrompt.text.toString().trim()
if (input.isEmpty() || !isModelLoaded) {
logMsg(if (input.isEmpty()) "❌ Please enter a question" else "❌ Load the model first")
return@setOnClickListener
}
val intent = Intent(Intent.ACTION_OPEN_DOCUMENT).apply { type = "image/*" }
startActivityForResult(intent, 200)
}
}
override fun onActivityResult(requestCode: Int, resultCode: Int, data: Intent?) {
super.onActivityResult(requestCode, resultCode, data)
if (resultCode != RESULT_OK || data == null) return
val uri = data.data ?: return
// Copy the model / mmproj
if (requestCode == 100 || requestCode == 101) {
val isModel = requestCode == 100
val target = if (isModel) File(filesDir, "model.gguf") else File(filesDir, "mmproj.gguf")
Thread {
contentResolver.openInputStream(uri)?.use { input ->
FileOutputStream(target).use { output -> input.copyTo(output) }
}
runOnUiThread {
if (isModel) modelPath = target.absolutePath
else mmprojPath = target.absolutePath
logMsg("✅ ${if (isModel) "model" else "mmproj"} copied: ${target.name}")
}
}.start()
}
// Handle the image
if (requestCode == 200) {
val prompt = findViewById<EditText>(R.id.et_prompt).text.toString().trim()
// Scale to 448x448 (to suit Qwen)
val bitmap = contentResolver.openInputStream(uri)?.use {
BitmapFactory.decodeStream(it)
}?.run {
Bitmap.createScaledBitmap(this, 448, 448, true)
}
if (bitmap == null) {
logMsg("❌ Failed to load the image")
return
}
logMsg("\n🖼️ Understanding image + text: $prompt")
Thread {
try {
val reply = llamaBridge.chatImage(prompt, bitmap)
runOnUiThread {
logMsg("🤖:$reply")
bitmap.recycle()
}
} catch (e: Exception) {
runOnUiThread {
logMsg("❌ Image inference failed: ${e.message}")
bitmap.recycle()
}
}
}.start()
}
}
// Live log to the UI
private fun logMsg(msg: String) {
runOnUiThread { tvResult.append("\n$msg") }
}
override fun onDestroy() {
super.onDestroy()
llamaBridge.releaseModel()
}
}
LlamaTest\app\src\main\java\com\example\llamatest\llamaBridge.kt
package com.example.llamatest
object llamaBridge {
init {
System.loadLibrary("llama_jni")
}
external fun loadMultimodal(modelPath: String, mmprojPath: String): Boolean
external fun chatImage(prompt: String, bitmap: android.graphics.Bitmap?): String
external fun streamGenerate(prompt: String): String
external fun releaseModel()
}
LlamaTest\app\src\main\cpp\llama_wrapper.cpp
#include <jni.h>
#include <string>
#include <vector>
#include <cstring>
#include <android/bitmap.h>
#include <android/log.h>
#define LOG_TAG "LlamaMM"
#define LOGI(...) __android_log_print(ANDROID_LOG_INFO, LOG_TAG, __VA_ARGS__)
#define LOGE(...) __android_log_print(ANDROID_LOG_ERROR, LOG_TAG, __VA_ARGS__)
#ifdef __cplusplus
extern "C" {
#endif
#include "llama.h"
#include "mtmd.h"
#ifdef __cplusplus
}
#endif
static llama_model* g_model = nullptr;
static llama_context* g_ctx = nullptr;
static const llama_vocab* g_vocab = nullptr;
static mtmd_context* g_mtmd = nullptr;
static llama_token sample_token(llama_context* ctx) {
LOGI("sample_token: sampling started");
const float* logits = llama_get_logits(ctx);
int n = llama_vocab_n_tokens(g_vocab);
int best = 0;
float max_val = -1e9;
for (int i = 0; i < n; ++i) {
if (logits[i] > max_val) {
max_val = logits[i];
best = i;
}
}
LOGI("sample_token: result = %d", best);
return (llama_token)best;
}
extern "C" JNIEXPORT jboolean JNICALL
Java_com_example_llamatest_llamaBridge_loadMultimodal(
JNIEnv* env, jobject /* thiz */,
jstring modelPath, jstring mmprojPath)
{
LOGI("==================================================");
LOGI("loadMultimodal: [Qwen2.5-Omni-3B] load started");
LOGI("==================================================");
const char* model_path = env->GetStringUTFChars(modelPath, nullptr);
const char* mmproj_path = env->GetStringUTFChars(mmprojPath, nullptr);
llama_model_params mparams = llama_model_default_params();
g_model = llama_model_load_from_file(model_path, mparams);
env->ReleaseStringUTFChars(modelPath, model_path);
if (!g_model) {
LOGE("model load failed ❌");
return JNI_FALSE;
}
LOGI("model loaded ✅");
g_vocab = llama_model_get_vocab(g_model);
if (!g_vocab) {
LOGE("failed to get vocab ❌");
return JNI_FALSE;
}
LOGI("vocab obtained ✅");
auto mtmd_params = mtmd_context_params_default();
g_mtmd = mtmd_init_from_file(mmproj_path, g_model, mtmd_params);
env->ReleaseStringUTFChars(mmprojPath, mmproj_path);
if (!g_mtmd) {
LOGE("mmproj load failed ❌");
return JNI_FALSE;
}
LOGI("mmproj loaded ✅");
bool support_vision = mtmd_support_vision(g_mtmd);
LOGI("==================================================");
LOGI("✅ mtmd vision support = %s", support_vision ? "yes" : "no");
LOGI("==================================================");
llama_context_params cparams = llama_context_default_params();
cparams.n_ctx = 32768;
cparams.n_batch = 1024;
cparams.n_ubatch = 1024;
cparams.n_threads = 4;
cparams.no_perf = true;
g_ctx = llama_init_from_model(g_model, cparams);
if (!g_ctx) {
LOGE("context init failed ❌");
return JNI_FALSE;
}
LOGI("==================================================");
LOGI("loadMultimodal: everything loaded ✅");
LOGI("==================================================");
return JNI_TRUE;
}
extern "C" JNIEXPORT jstring JNICALL
Java_com_example_llamatest_llamaBridge_chatImage(
JNIEnv* env, jobject /* thiz */,
jstring prompt, jobject bitmap)
{
LOGI("==================================================");
LOGI("chatImage: [image chat started]");
LOGI("==================================================");
if (!g_ctx || !g_mtmd || !g_vocab || !bitmap) {
LOGE("error: model not initialized ❌");
return env->NewStringUTF("model not initialized");
}
LOGI("all instances OK ✅");
const char* prompt_c = env->GetStringUTFChars(prompt, nullptr);
std::string input_text = "<|im_start|>user\n<__media__>\n";
input_text += prompt_c;
input_text += "<|im_end|>\n<|im_start|>assistant\n";
env->ReleaseStringUTFChars(prompt, prompt_c);
LOGI("final prompt:\n%s", input_text.c_str());
LOGI("reading Bitmap");
AndroidBitmapInfo info{};
void* pixels = nullptr;
AndroidBitmap_getInfo(env, bitmap, &info);
AndroidBitmap_lockPixels(env, bitmap, &pixels);
int w = info.width;
int h = info.height;
LOGI("image size: %d x %d", w, h);
const int TARGET_SIZE = 224;
std::vector<uint8_t> rgb(TARGET_SIZE * TARGET_SIZE * 3);
uint32_t* src = (uint32_t*)pixels;
for (int y = 0; y < TARGET_SIZE; y++) {
for (int x = 0; x < TARGET_SIZE; x++) {
int sx = x * w / TARGET_SIZE;
int sy = y * h / TARGET_SIZE;
uint32_t c = src[sy * info.width + sx];
rgb[(y*TARGET_SIZE+x)*3+0] = (c >> 16) & 0xFF;
rgb[(y*TARGET_SIZE+x)*3+1] = (c >> 8) & 0xFF;
rgb[(y*TARGET_SIZE+x)*3+2] = c & 0xFF;
}
}
AndroidBitmap_unlockPixels(env, bitmap);
LOGI("Bitmap → RGB done, scaled to 224x224 ✅");
mtmd_bitmap* img = mtmd_bitmap_init(TARGET_SIZE, TARGET_SIZE, rgb.data());
if (!img) {
LOGE("mtmd_bitmap_init failed ❌");
return env->NewStringUTF("image processing failed");
}
LOGI("mtmd_bitmap_init OK ✅");
mtmd_input_chunks* chunks = mtmd_input_chunks_init();
LOGI("mtmd_input_chunks_init OK ✅");
mtmd_input_text text_cfg{};
text_cfg.text = input_text.c_str();
text_cfg.add_special = true;
text_cfg.parse_special = true;
const mtmd_bitmap* bitmaps[] = { img };
LOGI("==== mtmd_tokenize start ====");
mtmd_tokenize(g_mtmd, chunks, &text_cfg, bitmaps, 1);
size_t chunk_cnt = mtmd_input_chunks_size(chunks);
LOGI("==== mtmd_tokenize done ====");
LOGI("chunk count = %zu", chunk_cnt);
llama_pos pos = 0;
int embd_dim = llama_model_n_embd_inp(g_model);
LOGI("embd_dim = %d", embd_dim);
for (size_t i = 0; i < chunk_cnt; i++) {
const mtmd_input_chunk* chunk = mtmd_input_chunks_get(chunks, i);
int type = mtmd_input_chunk_get_type(chunk);
LOGI("--------------------------------");
LOGI("processing chunk %zu | type = %d", i, type);
if (type == MTMD_INPUT_CHUNK_TYPE_TEXT) {
size_t ntok = 0;
const llama_token* tok = mtmd_input_chunk_get_tokens_text(chunk, &ntok);
LOGI("text token count = %zu", ntok);
for (size_t j = 0; j < ntok; j++) {
llama_token t = tok[j];
LOGI("token [%zu] = %d", j, (int)t);
llama_batch b = llama_batch_get_one(&t, 1); // decode exactly one token per step
int ret = llama_decode(g_ctx, b);
pos++;
LOGI("text llama_decode ret = %d | pos = %d", ret, (int)pos);
}
}
else if (type == MTMD_INPUT_CHUNK_TYPE_IMAGE) {
LOGI("image chunk → encoding (224x224)");
mtmd_encode_chunk(g_mtmd, chunk);
LOGI("image encoding done ✅");
LOGI("decoding image embedding ...");
float* embd = mtmd_get_output_embd(g_mtmd);
if (embd == nullptr) {
LOGE("❌ fatal: mtmd_get_output_embd returned nullptr, skipping image decode");
continue;
}
LOGI("✅ mtmd_get_output_embd succeeded");
size_t n_tokens = mtmd_input_chunk_get_n_tokens(chunk);
LOGI("image token count = %zu", n_tokens);
// ==========================
// ✅ Fix: initialize the batch properly
// ==========================
for (size_t j = 0; j < n_tokens; j++) {
// Official-API initialization; n_seq_max = 1 so the seq_id arrays are allocated
llama_batch batch = llama_batch_init(1, embd_dim, 1);
batch.n_tokens = 1;
// Copy one embedding vector, dimension by dimension
for (int d = 0; d < embd_dim; d++) {
batch.embd[d] = embd[j * embd_dim + d];
}
batch.pos[0] = pos;
batch.n_seq_id[0] = 1;
batch.seq_id[0][0] = 0;
batch.logits[0] = (j == n_tokens - 1); // only the last token needs logits
int ret = llama_decode(g_ctx, batch);
LOGI("image decode ret = %d | idx=%zu pos=%d", ret, j, (int)pos);
// Free the batch memory
llama_batch_free(batch);
pos++;
}
LOGI("✅ all image tokens decoded");
}
else {
LOGI("unknown chunk type");
}
}
LOGI("========================================");
LOGI("all chunks decoded, generating the answer 🚀");
LOGI("========================================");
std::string output;
const int MAX_GEN = 1024;
llama_token eos = llama_vocab_eos(g_vocab);
LOGI("EOS token = %d", (int)eos);
for (int i = 0; i < MAX_GEN; i++) {
LOGI("generating token %d", i);
llama_token t = sample_token(g_ctx);
if (t == eos || t == 0) {
LOGI("generation finished: reached EOS");
break;
}
char buf[256] = {0};
llama_token_to_piece(g_vocab, t, buf, sizeof(buf)-1, 0, false);
output += buf;
LOGI("generated piece: %s", buf);
llama_batch b = llama_batch_get_one(&t, 1); // feed back exactly one token
llama_decode(g_ctx, b);
}
LOGI("========================================");
LOGI("final result: %s", output.c_str());
LOGI("========================================");
mtmd_input_chunks_free(chunks);
mtmd_bitmap_free(img);
return env->NewStringUTF(output.c_str());
}
extern "C" JNIEXPORT void JNICALL
Java_com_example_llamatest_llamaBridge_releaseModel(JNIEnv*, jobject)
{
LOGI("releasing model resources...");
if (g_ctx) { llama_free(g_ctx); LOGI("ctx freed"); }
if (g_model) { llama_model_free(g_model); LOGI("model freed"); }
if (g_mtmd) { mtmd_free(g_mtmd); LOGI("mtmd freed"); }
g_ctx = nullptr;
g_model = nullptr;
g_mtmd = nullptr;
g_vocab = nullptr;
LOGI("all resources freed ✅");
}
static llama_token sample_token_text() {
float* logits = llama_get_logits_ith(g_ctx, -1);
int n_vocab = llama_vocab_n_tokens(g_vocab);
int best = 0;
float max = -1e9;
for (int i = 0; i < n_vocab; i++) {
if (logits[i] > max) {
max = logits[i];
best = i;
}
}
return (llama_token)best;
}
// Streaming generation (safe version)
extern "C" JNIEXPORT jstring JNICALL
Java_com_example_llamatest_llamaBridge_streamGenerate(
JNIEnv* env, jobject /* thiz */, jstring prompt)
{
if (!g_ctx || !g_model || !g_vocab) {
return env->NewStringUTF("ERROR: model not loaded");
}
const char* prompt_c = env->GetStringUTFChars(prompt, nullptr);
std::string input = "<|im_start|>user\n"; // Qwen ChatML template, matching chatImage above
input += prompt_c;
input += "<|im_end|>\n<|im_start|>assistant\n";
env->ReleaseStringUTFChars(prompt, prompt_c);
std::vector<llama_token> tokens(512);
int n_tokens = llama_tokenize(g_vocab, input.c_str(), input.size(), tokens.data(), 512, true, false);
if (n_tokens <= 0) return env->NewStringUTF("tokenization failed");
// Feed the whole prompt at once
llama_batch batch = llama_batch_get_one(tokens.data(), n_tokens);
llama_decode(g_ctx, batch);
std::string full_output;
const int MAX_GEN = 40;
const llama_token eos = llama_vocab_eos(g_vocab);
for (int i = 0; i < MAX_GEN; i++) {
llama_token tok = sample_token_text();
if (tok == eos || tok == 0) break;
char buf[256] = {0};
llama_token_to_piece(g_vocab, tok, buf, sizeof(buf)-1, 0, false);
full_output += buf;
// Infer the next token
llama_batch b = llama_batch_get_one(&tok, 1);
llama_decode(g_ctx, b);
}
return env->NewStringUTF(full_output.c_str());
}
IV. App Runtime Requirements
- An arm64-v8a phone (99% of Android phones today)
- ≥ 6 GB RAM (8 GB recommended)
- Model footprint ≈ 2.5 GB
- Runs fully offline, no network
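The ≈2.5 GB figure is consistent with a back-of-envelope check (the ~3.4B weight count and ~4.9 bits/weight for Q4_K_M are rough assumptions, not measured values):

```shell
# Weights-only footprint: params × bits-per-weight ÷ 8, in MiB (integer math)
PARAMS=3400000000        # assumed weight count
BITS_X10=49              # ~4.9 bits/weight for Q4_K_M, scaled ×10
MIB=$((PARAMS * BITS_X10 / 10 / 8 / 1024 / 1024))
echo "weights only: ${MIB} MiB"
```

That is roughly 1.9 GiB for the weights alone; the mmproj file, KV cache, and runtime buffers account for the rest of the ≈2.5 GB.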
The results are as follows:
Plain-text inference runs normally; see the demo video:
Constrained by the phone's memory, the device hung while processing the image and text together; after half a day of debugging it still crashes at the image-embedding decode step, so optimization is the next task. The log:
04-18 12:33:28.660 26737 27804 I LlamaMM : loadMultimodal: [Qwen2.5-Omni-3B] load started
04-18 12:33:28.660 26737 27804 I LlamaMM : ==================================================
04-18 12:33:32.550 26737 27804 I LlamaMM : model loaded ✅
04-18 12:33:32.550 26737 27804 I LlamaMM : vocab obtained ✅
04-18 12:33:37.240 26737 27804 I LlamaMM : mmproj loaded ✅
04-18 12:33:37.240 26737 27804 I LlamaMM : ==================================================
04-18 12:33:37.240 26737 27804 I LlamaMM : ✅ mtmd vision support = yes
04-18 12:33:37.240 26737 27804 I LlamaMM : ==================================================
04-18 12:33:38.022 26737 27804 I LlamaMM : ==================================================
04-18 12:33:38.022 26737 27804 I LlamaMM : loadMultimodal: everything loaded ✅
04-18 12:33:38.022 26737 27804 I LlamaMM : ==================================================
04-18 12:34:04.671 26737 28036 I LlamaMM : ==================================================
04-18 12:34:04.671 26737 28036 I LlamaMM : chatImage: [image chat started]
04-18 12:34:04.671 26737 28036 I LlamaMM : ==================================================
04-18 12:34:04.671 26737 28036 I LlamaMM : all instances OK ✅
04-18 12:34:04.671 26737 28036 I LlamaMM : final prompt:
04-18 12:34:04.671 26737 28036 I LlamaMM : <|im_start|>user
04-18 12:34:04.671 26737 28036 I LlamaMM : <__media__>
04-18 12:34:04.671 26737 28036 I LlamaMM : Recognize the text in the image<|im_end|>
04-18 12:34:04.671 26737 28036 I LlamaMM : <|im_start|>assistant
04-18 12:34:04.671 26737 28036 I LlamaMM : reading Bitmap
04-18 12:34:04.671 26737 28036 I LlamaMM : image size: 448 x 448
04-18 12:34:04.681 26737 28036 I LlamaMM : Bitmap → RGB done, scaled to 224x224 ✅
04-18 12:34:04.683 26737 28036 I LlamaMM : mtmd_bitmap_init OK ✅
04-18 12:34:04.683 26737 28036 I LlamaMM : mtmd_input_chunks_init OK ✅
04-18 12:34:04.683 26737 28036 I LlamaMM : ==== mtmd_tokenize start ====
04-18 12:34:04.742 26737 28036 I LlamaMM : ==== mtmd_tokenize done ====
04-18 12:34:04.742 26737 28036 I LlamaMM : chunk count = 3
04-18 12:34:04.743 26737 28036 I LlamaMM : embd_dim = 2048
04-18 12:34:04.743 26737 28036 I LlamaMM : --------------------------------
04-18 12:34:04.743 26737 28036 I LlamaMM : processing chunk 0 | type = 0
04-18 12:34:04.743 26737 28036 I LlamaMM : text token count = 9
04-18 12:34:04.743 26737 28036 I LlamaMM : token [0] = 151644
04-18 12:34:04.743 26737 28036 I LlamaMM : text llama_decode ret = -1 | pos = 1
04-18 12:34:04.743 26737 28036 I LlamaMM : token [1] = 872
04-18 12:34:13.791 26737 28036 I LlamaMM : text llama_decode ret = 0 | pos = 2
04-18 12:34:13.792 26737 28036 I LlamaMM : token [2] = 198
04-18 12:34:27.729 26737 28036 I LlamaMM : text llama_decode ret = 0 | pos = 3
04-18 12:34:27.729 26737 28036 I LlamaMM : token [3] = 27
04-18 12:35:20.801 26737 28036 I LlamaMM : text llama_decode ret = 0 | pos = 4
04-18 12:35:20.801 26737 28036 I LlamaMM : token [4] = 91
04-18 12:35:20.805 26737 28036 I LlamaMM : text llama_decode ret = -1 | pos = 5
04-18 12:35:20.805 26737 28036 I LlamaMM : token [5] = 13013
04-18 12:35:20.805 26737 28036 I LlamaMM : text llama_decode ret = -1 | pos = 6
04-18 12:35:20.805 26737 28036 I LlamaMM : token [6] = 4906
04-18 12:35:20.805 26737 28036 I LlamaMM : text llama_decode ret = -1 | pos = 7
04-18 12:35:20.805 26737 28036 I LlamaMM : token [7] = 91
04-18 12:35:20.805 26737 28036 I LlamaMM : text llama_decode ret = -1 | pos = 8
04-18 12:35:20.805 26737 28036 I LlamaMM : token [8] = 29
04-18 12:35:20.805 26737 28036 I LlamaMM : text llama_decode ret = -1 | pos = 9
04-18 12:35:20.806 26737 28036 I LlamaMM : --------------------------------
04-18 12:35:20.806 26737 28036 I LlamaMM : processing chunk 1 | type = 1
04-18 12:35:20.806 26737 28036 I LlamaMM : image chunk → encoding (224x224)
04-18 12:41:05.321 26737 28036 I LlamaMM : image encoding done ✅
04-18 12:41:05.322 26737 28036 I LlamaMM : decoding image embedding ...
04-18 12:41:05.322 26737 28036 I LlamaMM : image token count = 64
--------- beginning of crash
04-18 12:41:05.322 26737 28036 F libc : Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0 in tid 28036 (Thread-12), pid 26737 (ample.llamatest)
04-18 12:41:05.650 30525 30525 E crash_dump64: failed to get the guest state header for thread 26737: Bad address
… (the same crash_dump64 message repeats for each remaining thread)
04-18 12:41:07.112 30525 30525 F DEBUG : Process name is com.example.llamatest, uid is 10432, not key_process
04-18 12:41:07.112 30525 30525 F DEBUG : keyProcess: 0
04-18 12:41:07.112 30525 30525 F DEBUG : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
04-18 12:41:07.112 30525 30525 F DEBUG : Build fingerprint: 'OPPO/PJC110/OP591D:15/AP3A.240617.008/T.221035c-7152f-1:user/release-keys'
04-18 12:41:07.112 30525 30525 F DEBUG : Revision: '0'
04-18 12:41:07.112 30525 30525 F DEBUG : ABI: 'arm64'
04-18 12:41:07.112 30525 30525 F DEBUG : Timestamp: 2026-04-18 12:41:05.910132574+0800
04-18 12:41:07.112 30525 30525 F DEBUG : Process uptime: 492s
04-18 12:41:07.112 30525 30525 F DEBUG : Cmdline: com.example.llamatest
04-18 12:41:07.112 30525 30525 F DEBUG : pid: 26737, tid: 28036, name: Thread-12 >>> com.example.llamatest <<<
04-18 12:41:07.112 30525 30525 F DEBUG : uid: 10432
04-18 12:41:07.112 30525 30525 F DEBUG : signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0000000000000000
04-18 12:41:07.112 30525 30525 F DEBUG : Cause: null pointer dereference
04-18 12:41:07.113 30525 30525 F DEBUG : x0 0000000000000001 x1 b4000077551e7480 x2 00000076da1a3cd8 x3 0000000000000003
04-18 12:41:07.113 30525 30525 F DEBUG : x4 0000000000000000 x5 8080000000000000 x6 000000006574616d x7 0000000000008080
04-18 12:41:07.113 30525 30525 F DEBUG : x8 0000000000000000 x9 0000000000000000 x10 b4000076de36f240 x11 0000000000000000
04-18 12:41:07.113 30525 30525 F DEBUG : x12 00000076da1a3d10 x13 0000000000000019 x14 00000076da1a5048 x15 0000000034155555
04-18 12:41:07.113 30525 30525 F DEBUG : x16 0000000000000001 x17 00000077dfca74ac x18 00000074d8184000 x19 b40000773a2c9c00
04-18 12:41:07.113 30525 30525 F DEBUG : x20 0000000000000000 x21 b40000773a2c9cd0 x22 0000000000000003 x23 0000000000000003
04-18 12:41:07.113 30525 30525 F DEBUG : x24 00000077e4b4e430 x25 0000000000000000 x26 00000077e4b4e430 x27 0000000000000000
04-18 12:41:07.113 30525 30525 F DEBUG : x28 00000076da1a59b0 x29 00000076da1a5980
04-18 12:41:07.113 30525 30525 F DEBUG : lr 00000076bb53569c sp 00000076da1a54a0 pc 00000076bb535728 pst 0000000080001000
04-18 12:41:07.113 30525 30525 F DEBUG : 24 total frames
04-18 12:41:07.113 30525 30525 F DEBUG : backtrace:
04-18 12:41:07.113 30525 30525 F DEBUG : #00 pc 0000000000076728 /data/app/~~lQupF08BDgVSXtqvfKhhSA==/com.example.llamatest-bKMVgzwXB9VK2ogYVV0EWg==/base.apk!libllama_jni.so (offset 0xc0000) (Java_com_example_llamatest_llamaBridge_chatImage+2440) (BuildId: 2eac2bbe892efb7db4758ef558a9b2ffbff17079)
04-18 12:41:07.113 30525 30525 F DEBUG : #01 pc 0000000000534170 /apex/com.android.art/lib64/libart.so (art_quick_generic_jni_trampoline+144) (BuildId: 6bcc64d9d5f20015626d4ef502940ca3)
04-18 12:41:07.113 30525 30525 F DEBUG : #02 pc 000000000051d974 /apex/com.android.art/lib64/libart.so (art_quick_invoke_stub+612) (BuildId: 6bcc64d9d5f20015626d4ef502940ca3)
04-18 12:41:07.113 30525 30525 F DEBUG : #03 pc 000000000051b298 /apex/com.android.art/lib64/libart.so (bool art::interpreter::DoCall<false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, bool, art::JValue*)+2192) (BuildId: 6bcc64d9d5f20015626d4ef502940ca3)
04-18 12:41:07.113 30525 30525 F DEBUG : #04 pc 0000000000673388 /apex/com.android.art/lib64/libart.so (void art::interpreter::ExecuteSwitchImplCpp<false>(art::interpreter::SwitchImplContext*)+16624) (BuildId: 6bcc64d9d5f20015626d4ef502940ca3)
04-18 12:41:07.113 30525 30525 F DEBUG : #05 pc 00000000005367d8 /apex/com.android.art/lib64/libart.so (ExecuteSwitchImplAsm+8) (BuildId: 6bcc64d9d5f20015626d4ef502940ca3)
04-18 12:41:07.113 30525 30525 F DEBUG : #06 pc 0000000000001534 /data/app/~~lQupF08BDgVSXtqvfKhhSA==/com.example.llamatest-bKMVgzwXB9VK2ogYVV0EWg==/base.apk (com.example.llamatest.MainActivity.onActivityResult$lambda$21+0)
04-18 12:41:07.113 30525 30525 F DEBUG : #07 pc 0000000000512a7c /apex/com.android.art/lib64/libart.so (art::interpreter::ExecuteSwitch(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame&, art::JValue, bool) (.__uniq.112435418011751916792819755956732575238.llvm.6886435955882106544)+68) (BuildId: 6bcc64d9d5f20015626d4ef502940ca3)
04-18 12:41:07.113 30525 30525 F DEBUG : #08 pc 0000000000512c34 /apex/com.android.art/lib64/libart.so (art::interpreter::Execute(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame&, art::JValue, bool, bool) (.__uniq.112435418011751916792819755956732575238.llvm.6886435955882106544)+384) (BuildId: 6bcc64d9d5f20015626d4ef502940ca3)
04-18 12:41:07.113 30525 30525 F DEBUG : #09 pc 000000000051b5e8 /apex/com.android.art/lib64/libart.so (bool art::interpreter::DoCall<false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, bool, art::JValue*)+3040) (BuildId: 6bcc64d9d5f20015626d4ef502940ca3)
04-18 12:41:07.113 30525 30525 F DEBUG : #10 pc 0000000000673388 /apex/com.android.art/lib64/libart.so (void art::interpreter::ExecuteSwitchImplCpp<false>(art::interpreter::SwitchImplContext*)+16624) (BuildId: 6bcc64d9d5f20015626d4ef502940ca3)
04-18 12:41:07.113 30525 30525 F DEBUG : #11 pc 00000000005367d8 /apex/com.android.art/lib64/libart.so (ExecuteSwitchImplAsm+8) (BuildId: 6bcc64d9d5f20015626d4ef502940ca3)
04-18 12:41:07.113 30525 30525 F DEBUG : #12 pc 0000000000001100 /data/app/~~lQupF08BDgVSXtqvfKhhSA==/com.example.llamatest-bKMVgzwXB9VK2ogYVV0EWg==/base.apk (com.example.llamatest.MainActivity$$ExternalSyntheticLambda3.run+0)
04-18 12:41:07.113 30525 30525 F DEBUG : #13 pc 0000000000512a7c /apex/com.android.art/lib64/libart.so (art::interpreter::ExecuteSwitch(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame&, art::JValue, bool) (.__uniq.112435418011751916792819755956732575238.llvm.6886435955882106544)+68) (BuildId: 6bcc64d9d5f20015626d4ef502940ca3)
04-18 12:41:07.113 30525 30525 F DEBUG : #14 pc 0000000000512c34 /apex/com.android.art/lib64/libart.so (art::interpreter::Execute(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame&, art::JValue, bool, bool) (.__uniq.112435418011751916792819755956732575238.llvm.6886435955882106544)+384) (BuildId: 6bcc64d9d5f20015626d4ef502940ca3)
04-18 12:41:07.113 30525 30525 F DEBUG : #15 pc 000000000051433c /apex/com.android.art/lib64/libart.so (artQuickToInterpreterBridge+532) (BuildId: 6bcc64d9d5f20015626d4ef502940ca3)
04-18 12:41:07.113 30525 30525 F DEBUG : #16 pc 0000000000534298 /apex/com.android.art/lib64/libart.so (art_quick_to_interpreter_bridge+88) (BuildId: 6bcc64d9d5f20015626d4ef502940ca3)
04-18 12:41:07.113 30525 30525 F DEBUG : #17 pc 000000000202a840 /memfd:jit-cache (deleted) (offset 0x2000000) (java.lang.Thread.run+144)
04-18 12:41:07.113 30525 30525 F DEBUG : #18 pc 000000000051d974 /apex/com.android.art/lib64/libart.so (art_quick_invoke_stub+612) (BuildId: 6bcc64d9d5f20015626d4ef502940ca3)
04-18 12:41:07.113 30525 30525 F DEBUG : #19 pc 00000000004a2d70 /apex/com.android.art/lib64/libart.so (art::ArtMethod::Invoke(art::Thread*, unsigned int*, unsigned int, art::JValue*, char const*)+140) (BuildId: 6bcc64d9d5f20015626d4ef502940ca3)
04-18 12:41:07.113 30525 30525 F DEBUG : #20 pc 0000000000bb0c4c /apex/com.android.art/lib64/libart.so (art::detail::ShortyTraits<(char)86>::Type art::ArtMethod::InvokeInstance<(char)86>(art::Thread*, art::ObjPtr<art::mirror::Object>, art::detail::ShortyTraits<>::Type...)+64) (BuildId: 6bcc64d9d5f20015626d4ef502940ca3)
04-18 12:41:07.113 30525 30525 F DEBUG : #21 pc 00000000006a380c /apex/com.android.art/lib64/libart.so (art::Thread::CreateCallback(void*)+788) (BuildId: 6bcc64d9d5f20015626d4ef502940ca3)
04-18 12:41:07.113 30525 30525 F DEBUG : #22 pc 00000000000a3ce8 /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_start(void*)+196) (BuildId: 1d6558a3b88dbb195284ac1e713c1e3c)
04-18 12:41:07.113 30525 30525 F DEBUG : #23 pc 000000000009614c /apex/com.android.runtime/lib64/bionic/libc.so (__start_thread+68) (BuildId: 1d6558a3b88dbb195284ac1e713c1e3c)
04-18 12:41:07.169 832 832 E tombstoned: Tombstone written to: tombstone_18
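Frame #00 of the backtrace puts the SIGSEGV inside `Java_com_example_llamatest_llamaBridge_chatImage` in `libllama_jni.so`, i.e. the native image-chat entry point itself crashes, most likely from a null/invalid input or an allocation failure while the image encoder builds its buffers on a low-memory phone. A minimal sketch of the defensive pattern that would turn such a crash into a recoverable error, using a hypothetical stand-in function `chat_image_safe` (the real JNI signature, encoder calls, and buffer sizes are assumptions, not the article's actual bridge code):

```cpp
#include <new>
#include <string>
#include <vector>

// Hypothetical stand-in for the native side of chatImage: validate
// inputs and trap allocation failure, so the JNI call returns an
// error string to Java instead of taking the whole process down.
std::string chat_image_safe(const std::string* image_path, size_t pixel_count) {
    if (image_path == nullptr || image_path->empty()) {
        return "error: empty image path";   // would otherwise be a null deref
    }
    try {
        // Simulate the encoder's working buffer: 4 bytes per pixel (RGBA).
        std::vector<unsigned char> pixels(pixel_count * 4);
        (void)pixels;
        return "ok";                        // real inference would run here
    } catch (const std::bad_alloc&) {
        return "error: out of memory";      // report instead of crashing
    }
}
```

On the Kotlin side the returned string can be checked and shown as a toast, which keeps a failed multimodal request from looking like an app bug.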
5. One-line summary
Porting the full Windows + Android Studio + llama.cpp + Qwen2.5-Omni-3B pipeline is entirely feasible: text chat works normally, but image-plus-text multimodal inference is more than this phone's hardware can handle. The plan is to upgrade the hardware later and optimize with heterogeneous computing.