cjwbw/cogagent-chat 🖼️📝🔢 → 📝
About
A Visual Language Model for GUI Agents

Example Output
The screenshot shows a web page from Skyscanner for booking flights from London (Any) to Bangkok (Any) for 1 adult in Economy class. The page displays options to sort and filter results, with a total of 1009 results shown. Stops are listed on the left with options for direct and 1 stop flights from different airlines. On the right, flight options are presented with details such as departure and arrival times, duration, airlines, whether they have a stopover, and prices in both pounds sterling and Thai baht. The cheapest option is highlighted for £480, with the fastest being £855. Outbound flight options are shown on the left, with return options on the right. User controls for sorting and selecting flights are also visible.
Performance Metrics
- Prediction Time: 20.13s
- Total Time: 283.75s
All Input Parameters
{ "image": "https://replicate.delivery/pbxt/KLeBpmZL2GjRa2c77grUodPILFbYdY8re3AzfuoBmQ3rEH29/Screenshot%202024-02-05%20at%2001.54.19.png", "query": "what does this screenshot tell you?", "temperature": 0.9 }
Input Parameters
- image (required): Input image.
- query: Input query.
- temperature: Adjusts the randomness of the text output; 0 is deterministic, and values greater than 1 are random.
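As a minimal sketch of how these parameters fit together, the helper below assembles an input payload shaped like the one under "All Input Parameters". The function name `build_input` is hypothetical and not part of the model's API. Rejecting negative temperatures is an assumption; the page only states that 0 is deterministic. Optional fields are omitted when unset so that the model's own defaults, which this page does not list, would apply.

```python
from typing import Any, Dict, Optional


def build_input(
    image: str,
    query: Optional[str] = None,
    temperature: Optional[float] = None,
) -> Dict[str, Any]:
    """Assemble an input payload for the model (hypothetical helper)."""
    # `image` is the only parameter marked as required.
    if not image:
        raise ValueError("image is required")

    payload: Dict[str, Any] = {"image": image}

    # Leave optional fields out when unset rather than guessing defaults.
    if query is not None:
        payload["query"] = query

    if temperature is not None:
        # Assumption: negative values are not meaningful for temperature.
        if temperature < 0:
            raise ValueError("temperature must be non-negative")
        payload["temperature"] = temperature

    return payload


example = build_input(
    image="https://example.com/screenshot.png",  # placeholder URL
    query="what does this screenshot tell you?",
    temperature=0.9,
)
print(example)
```

Keeping validation in one place means a missing image fails before any request is sent.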
Output Schema
Output
Version Details
- Version ID: c3429fc8f69bc71d0ee109e99c62b1e223b7cdfc0839b8e44a4326e12752656e
- Version Created: February 5, 2024
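The version ID can be used to pin a call to this exact build. The sketch below assumes the `replicate` Python client is installed and that a `REPLICATE_API_TOKEN` environment variable is set; neither is stated on this page. The names `model_ref` and `describe_screenshot` are hypothetical. The client import is deferred so that building the reference string works without the package, and the return type of the call is left unannotated because the output schema is not shown here.

```python
MODEL = "cjwbw/cogagent-chat"
VERSION = "c3429fc8f69bc71d0ee109e99c62b1e223b7cdfc0839b8e44a4326e12752656e"


def model_ref(model: str = MODEL, version: str = VERSION) -> str:
    """Return the pinned 'owner/name:version' reference string."""
    return f"{model}:{version}"


def describe_screenshot(image_url: str, query: str, temperature: float = 0.9):
    """Run the pinned model version on one image (makes a network call)."""
    # Deferred import: requires `pip install replicate` and an API token
    # in the REPLICATE_API_TOKEN environment variable (assumptions).
    import replicate

    return replicate.run(
        model_ref(),
        input={"image": image_url, "query": query, "temperature": temperature},
    )


print(model_ref())
```

Calling `describe_screenshot(...)` with an image URL and a query such as "what does this screenshot tell you?" would mirror the example run above. At a temperature of 0.9, the wording of the response may differ between runs.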