程序员利用视觉分析API

123 阅读2分钟

市场上的产品:

Gpt 4v : 最佳

jina.ai  scenex.jina.ai/scenes

阿里巴巴的通义千问 tongyi.aliyun.com/

LLaVA : llava-vl.github.io/ , github.com/haotian-liu…  

提示语:

你是一名 专业假发分析师,请分析图片,按以下数据结构返回结果:


{

"properties": {

"details": {

"description": "list all the object details in the image.",

"type": "array",

"items": {

"type": "string"

}

},

"color_palette": {

"type": "array",

"description": "Captures the predominant colors in the image.",

"items": {

"type": "string"

}

},

"category": {

"description": "The best fitting category for the image.",

"enum": ["Nature", "Animals", "People and Portraits", "Architecture and Cities", "Food and Drink"],

"type": "string"

},

"location": {

"type": "string",

"description": "Specifies the location where the image was captured."

},

"visual_story": {

"description": "Crafts a captivating visual story around the image, enhancing your online reading experience.",

"type": "string"

},

"trending_keywords": {

"type": "array",

"description": "Highlights trending keywords associated with the image, optimizing your search experience.",

"items": {

"type": "string"

}

},

"season": {

"enum": ["Spring", "Summer", "Autumn", "Winter"],

"description": "Identifies the season or climate depicted in the image.",

"type": "string"

},

"editorial_commentary": {

"description": "Provides insightful editorial commentary on the image, adding depth to your understanding of German trends and culture.",

"type": "string"

},

"featured_person": {

"description": "Highlights the main person or character in the image.",

"type": "string"

},

"short_description": {

"description": "The short description of the image, at most 10 words.",

"type": "string"

}

},

"type": "object"

}


image.png

 

 

jina.ai人工智能回答:

 

{

  "details": \[

    "mannequin",

    "black top",

    "straight short bob wig",

    "crystal necklace",

    "silver necklace",

    "dark background"

  ],

  "color\_palette": \[

    "black",

    "silver"

  ],

  "category": "People and Portraits",

  "location": "",

  "visual\_story": "A mannequin styled with sophisticated simplicity, adorned with a sleek bob wig and a dazzling crystal necklace, standing against an obscure backdrop, evoking a sense of elegant fashion.",

  "trending\_keywords": \[

    "fashion",

    "mannequin",

    "wig",

    "bob haircut",

    "crystal necklace"

  ],

  "season": "",

  "editorial\_commentary": "",

  "featured\_person": "",

  "short\_description": "Mannequin with black wig and necklace"

}

 

 

 

阿里巴巴的通义千问:

image.png

 

 

 

 

Gpt4V 的分析:

 


{

"properties": {

"is\_advertisement": {

"type": "boolean",

"description": false

},

"details": {

"description": "list all the object details in the image.",

"type": "array",

"items": \["Woman", "Blonde hair", "Black top", "Blue jeans", "Necklace", "Bracelet", "Residential background"]

},

"color\_palette": {

"type": "array",

"description": "Captures the predominant colors in the image.",

"items": \["Black", "Blue", "Blonde", "Pink", "Brown", "White"]

},

"category": {

"description": "The best fitting category for the image.",

"enum": \["People and Portraits"],

"type": "string"

![image.png](https://p1-juejin.byteimg.com/tos-cn-i-k3u1fbpfcp/fa43487017e34dd2ac3726f1a4e6fb49~tplv-k3u1fbpfcp-jj-mark:0:0:0:0:q75.image#?w=834&h=620&s=2072363&e=png&b=ffffff)
},

"location": {

"type": "string",

"description": "A residential area."

},

"brand": {

"description": "The brand being advertised, if any.",

"type": "string"

},

"visual\_story": {

"description": "A young woman confidently poses in a casual outfit, showcasing her personal style in a suburban setting.",

"type": "string"

},

"trending\_keywords": {

"type": "array",

"description": "Highlights trending keywords associated with the image, optimizing your search experience.",

"items": \["Fashion", "Style", "Casual wear", "Portrait", "Suburban"]

},

"season": {

"enum": \["Spring"],

"description": "The attire and background suggest it could be spring.",

"type": "string"

},

"editorial\_commentary": {

"description": "The image captures a moment of everyday elegance, reflecting contemporary casual fashion trends.",

"type": "string"

},

"featured\_person": {

"description": "The main person in the image is a young woman.",

"type": "string"

},

"short\_description": {

"description": "Woman in casual attire posing outdoors.",

"type": "string"

}

},

"type": "object"

}

  结论: Gpt4v是最强大的。