lucataco/bakllava 🖼️📝🔢 → 📝

▶️ 39.8K runs 📅 Oct 2023 ⚙️ Cog 0.8.6
image-captioning image-to-text visual-question-answering

About

BakLLaVA-1 is a Mistral 7B base model augmented with the LLaVA 1.5 architecture.

Example Output

Prompt:

"Describe this image"

Output:

The image features a detailed illustration of a human heart, showcasing its various parts and blood vessels. The heart is depicted in full color, with its interior and exterior structures visible.

The heart is surrounded by a network of blood vessels, including arteries and veins. There are at least six distinct blood vessels radiating from the heart, some of which are larger and more prominent than others. This illustration provides a clear and comprehensive view of the heart and its vital connections within the circulatory system.

Performance Metrics

7.10s Prediction Time
357.02s Total Time
All Input Parameters
{
  "image": "https://replicate.delivery/pbxt/JklacZyHwJH9UPsYUwwUnh4YLYDbAsjmz53SqKgSWWo3yPTW/heart.jpg",
  "prompt": "Describe this image",
  "max_sequence": 512
}
Input Parameters
image (required) Type: string
Input image
prompt Type: string, Default: Describe this image
Input prompt
max_sequence Type: integer, Default: 512, Range: 8 - 2048
Maximum sequence length
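
The parameters above could be passed to the model from Python roughly as follows. This is a minimal sketch, not an official client example: it assumes the `replicate` package is installed and `REPLICATE_API_TOKEN` is set in the environment, and the helper names `build_input` and `describe` are hypothetical. The version ID is the one listed under Version Details on this page.

```python
import os

# Model reference: owner/name plus the version ID from "Version Details".
MODEL_REF = (
    "lucataco/bakllava:"
    "452b2fa0b66d8acdf40e05a7f0af948f9c6065f6da5af22fce4cead99a26ff3d"
)


def build_input(image, prompt="Describe this image", max_sequence=512):
    """Hypothetical helper: assemble the input dict with the documented defaults."""
    if not image:
        raise ValueError("image is required")
    # The page documents a range of 8 to 2048 for max_sequence.
    if not 8 <= max_sequence <= 2048:
        raise ValueError("max_sequence must be between 8 and 2048")
    return {"image": image, "prompt": prompt, "max_sequence": max_sequence}


def describe(image_url, **kwargs):
    """Hypothetical helper: run one prediction and return the model's output."""
    # Imported lazily so build_input can be used without the package installed.
    import replicate

    # The output schema on this page is a single string.
    return replicate.run(MODEL_REF, input=build_input(image_url, **kwargs))


# Only attempt a real prediction when an API token is available.
if __name__ == "__main__" and os.environ.get("REPLICATE_API_TOKEN"):
    print(describe(
        "https://replicate.delivery/pbxt/"
        "JklacZyHwJH9UPsYUwwUnh4YLYDbAsjmz53SqKgSWWo3yPTW/heart.jpg"
    ))
```

Validating `max_sequence` locally is a design choice in this sketch; it rejects out-of-range values before a prediction is started.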
Output Schema

Output

Type: string

Example Execution Logs
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Version Details
Version ID
452b2fa0b66d8acdf40e05a7f0af948f9c6065f6da5af22fce4cead99a26ff3d
Version Created
October 24, 2023