intelligent-utilities/topic-tags 📝🔢 → 📝

▶️ 11 runs 📅 Aug 2025 ⚙️ Cog 0.14.0
About

Generate topic tags for a passage of text
Example Output

Output

ai-generated-art
consistent-character-generation
image-model-comparison
replicate-models
kontext-pro
gpt-image-1
gen-4-image
seededit-3
flux-kontext
face-persistence
Performance Metrics

2.19s Prediction Time
2.49s Total Time
All Input Parameters
{
  "text": "Generate consistent characters\nPosted July 21, 2025 by \nfofr\nA grid of 8 images showing the same character in different scenes\nUntil recently, the best way to generate images of a consistent character was from a trained lora. You would need to create a dataset of images and then train a FLUX lora on them.\n\nIf you want to go back further, you might remember having to use a ComfyUI workflow. A workflow that would combine SDXL, controlnets, IPAdapters and some non-commercial face landmark models. Things have got remarkably simpler.\n\nToday we have a choice of state of the art image models that can do this accurately from a single reference. In this blog post we’ll highlight which models can do this, and which is best depending on your needs.\n\nshe is wearing a pink t-shirt with the text “Replicate” on it\n\nOriginal reference image\nOriginal\nA grid of 4 outputs\n“she is wearing a pink t-shirt with the text “Replicate” on it”\n\nThe best models for consistent characters\nAs of July 2025, there are four models on Replicate that can create a realistic and accurate output from a single reference. In order of release:\n\nOpenAI’s gpt-image-1\nRunway’s Gen-4 Image\nBlack Forest Labs’s FLUX.1 Kontext\nBytedance’s SeedEdit 3\nSince this blog post was written, two new models have also been released:\n\nIdeogram’s Character\nRunway’s Gen-4 Image Turbo\nFLUX.1 Kontext comes in a few different flavors: pro, max and dev. Dev is an open source version of kontext, which is more controllable and fine-tunable, but isn’t as powerful as pro.\n\nTo help write this blog post, I put together a little Replicate model to make it easy to compare outputs. Here is our comparison model, it runs FLUX.1 Kontext, SeedEdit 3.0, gpt-image-1 and Runway’s Gen-4 in parallel: fofr/compare-character-consistency.\n\n(Did you know that anyone can create and push models to Replicate?)\n\nPrice and speed comparison\nFirst, the essentials: speed and cost. 
The table below shows the price and speed of each model. The price of gpt-image-1 depends on the output quality you choose (low, medium, high). The price of Gen-4 Image depends on whether you choose 720p or 1080p resolution.\n\nIn summary though, gpt-image-1 is the slowest and most expensive model, and Kontext Dev is the cheapest and fastest. The tradeoffs are in quality, and we’ll look at that in more detail below.\n\nModel\tPrice (per image)\tSpeed\tDate\nOpenAI\ngpt-image-1\t$0.04–$0.17\t16s–59s\tApril 2025\nRunway\nGen-4 Image\t$0.05–$0.08\t20s–27s\tApril 2025\nBlack Forest Labs\nFLUX.1 Kontext Pro\t$0.04\t5s\tMay 2025\nBlack Forest Labs\nFLUX.1 Kontext Max\t$0.08\t7s\tMay 2025\nBlack Forest Labs\nFLUX.1 Kontext Dev\t$0.025\t4s\tMay 2025\nBytedance\nSeedEdit 3\t$0.03\t13s\tJuly 2025\nPreserving a character’s identity\nLet’s compare how well each model preserves a character’s identity.\n\nIn the following comparisons, we are using gpt-image-1 with the high quality and high fidelity settings. We stick with FLUX.1 Kontext Pro as the best compromise between quality and speed. And we use Gen-4 Image at 1080p.\n\nPhotographic accuracy\nBelow are a varied set of examples, showing the strengths and weaknesses of each model, all focusing on photographic outputs.\n\nA new activity\nIn these two examples, we can see the strengths of Gen-4 coming through. 
The composition is the most compelling, and the character is the most accurate.\n\nshe is playing the piano\n\nOriginal reference image\nOriginal\nA grid of 4 outputs\n“she is playing the piano”\n\nhe is playing the guitar\n\nOriginal reference image\nOriginal\nA grid of 4 outputs\n“he is playing the guitar”\n\nTweak the scene\nIf you want to keep most of the original composition, and change just a small part of the scene, all models handle this well.\n\nremove the glass of drink\n\nOriginal reference image\nOriginal\nA grid of 4 outputs\n“remove the glass of drink”\n\nHalf-length portrait with unusual hair and eye color\nA more challenging comparison, here is a character with heterochromia and hair with two colors, as well as some facial marks.\n\nWe can see that every model is capable of handling the hair and eyes. (Some needed a few retries to get this right.)\n\na half-length portrait photo of her in a summer forest\n\nOriginal reference image\nOriginal\nA grid of 4 outputs\n“a half-length portrait photo of her in a summer forest”\n\nA shave, a coat and some rain\nRather than keeping everything consistent, let’s try to keep the same person but change some things.\n\nIt’s a bit of a mixed bag here, only SeedEdit 3 and gpt-image-1 can handle the clean-shaven request. But gpt-image-1 is also a completely different person, so that’s probably the worst result.\n\nremove his beard, put him in a raincoat, it is raining\n\nOriginal reference image\nOriginal\nA grid of 4 outputs\n“remove his beard, put him in a raincoat, it is raining”\n\nTrying tattoos\nHere we try a character with many distinct tattoos to see how well each model handles them. 
None are perfect, with Gen-4 and gpt-image-1 maintaining the neck tattoos the best.\n\nhe is a chef cooking a meal in a restaurant kitchen\n\nOriginal reference image\nOriginal\nA grid of 4 outputs\n“he is a chef cooking a meal in a restaurant kitchen”\n\nCreative tasks and full transformations\nIn these examples, we are looking to transform the character into something else, or show them in a different style. A good model will perform the transformation while maintaining the character’s identity.\n\nChanging the style\nWith these simple style changes, we can see quickly that Gen-4 should not be used for these stylistic tasks.\n\nrestyle this person as anime\n\nOriginal reference image\nOriginal\nA grid of 4 outputs\n“restyle this person as anime”\n\nmake this a watercolor painting\n\nOriginal reference image\nOriginal\nA grid of 4 outputs\n“make this a watercolor painting”\n\nBecoming something else\nIt’s halloween. We turn her into a witch, and him into an ogre, and someone else into a blue na’vi from Pandora. 
Gen-4 does the best witch output, but also the least convincing ogre.\n\nmake her a witch\n\nOriginal reference image\nOriginal\nA grid of 4 outputs\n“make her a witch”\n\nturn him into a green skinned ogre\n\nOriginal reference image\nOriginal\nA grid of 4 outputs\n“turn him into a green skinned ogre”\n\nFor this example, Kontext Pro didn’t want to create an image of a blue na’vi from Pandora, we’re showing Kontext Dev instead.\n\nturn him into a blue na’vi from pandora (avatar)\n\nOriginal reference image\nOriginal\nA grid of 4 outputs\n“turn him into a blue na’vi from pandora (avatar)“\n\nConclusion\nOverall, we found that:\n\nKontext Pro is versatile and can give fabulous results, but often there are too many artifacts around the face, and these frequently make the image unusable (these artifacts do not seem to be present in Kontext Dev, but Dev has overall lower quality)\ngpt-image-1 will always add a distinctive yellow tint, and even with the high quality and high fidelity settings enabled, the identity will frequently change. With the highest cost and slowest speed, we’d only use this for the most complex of tasks.\nSeedEdit 3 tends to restrict itself to the initial composition, making it difficult to prompt a new angle or scene. Outputs are typically softer and can look more AI generated. Coherency is also a problem in complex scenes.\nRunway’s Gen-4 is the most adaptable and accurate when it comes to likeness in photos. It’s main drawback is coherency in complex scenes, and you might find some unexpected arms, limbs or hands. Sometimes this can be fixed with a few retries, sometimes not. Gen-4 also cannot restyle a scene.\nOur recommendations\nFor photos you should start with Runway’s Gen-4 Image model. If you need faster or cheaper outputs, then Kontext Pro is the next best option. 
If you get some outputs from Gen-4 that aren’t coherent, you can always put them through Kontext Pro to fix them.\n\nFor more creative tasks, and complete character transformations, try Kontext Pro first. If the task is more complex, and if you can afford it, you should also try gpt-image-1. SeedEdit 3 is a good cheap alternative if you can’t afford gpt-image-1 and kontext isn’t working for you. Do not use Gen-4 for stylistic tasks.\n\nThat’s it for now, but stay tuned for more models, comparisons and experiments. Until then, try something new at replicate.com/explore, and follow us on X to see what we’re up to.",
  "number_of_tags": 10
}
Input Parameters
text (required) Type: string: Text to tag
number_of_tags Type: integer, Default: 5, Range: 1 - 50: Number of tags
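The two parameters above can be checked on the client side before a prediction is requested. The following is a minimal sketch, not part of the model itself: `build_input` is a hypothetical helper name, and rejecting empty or whitespace-only text is an assumption (the page only states that `text` is required). The default of 5 and the 1 to 50 range are taken from the parameter list.

```python
from typing import Any

# Values taken from the documented parameter list.
DEFAULT_NUMBER_OF_TAGS = 5
MIN_TAGS, MAX_TAGS = 1, 50


def build_input(text: str, number_of_tags: int = DEFAULT_NUMBER_OF_TAGS) -> dict[str, Any]:
    """Hypothetical helper: validate arguments and return the input payload."""
    # Assumption: treat empty or whitespace-only text as missing.
    if not isinstance(text, str) or not text.strip():
        raise ValueError("text is required and must be a non-empty string")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(number_of_tags, bool) or not isinstance(number_of_tags, int):
        raise TypeError("number_of_tags must be an integer")
    if not MIN_TAGS <= number_of_tags <= MAX_TAGS:
        raise ValueError(f"number_of_tags must be between {MIN_TAGS} and {MAX_TAGS}")
    return {"text": text, "number_of_tags": number_of_tags}


if __name__ == "__main__":
    # Mirrors the example above, which requested 10 tags.
    print(build_input("Generate consistent characters ...", number_of_tags=10))
```

Validating locally avoids spending a prediction on a request that falls outside the documented range.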
Output Schema
Output
Type: array • Items Type: string
Version Details
Version ID: 9f3b644db1a196a7e95fa640abddd7939ef7c8cdd498086a70768c9020fe273d
Version Created: August 14, 2025
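To run this exact version from Python, something like the sketch below should work. It assumes the third-party `replicate` client package is installed and that a `REPLICATE_API_TOKEN` environment variable is set. `model_ref`, `check_tags` and `generate_tags` are hypothetical helper names, and the check on the output simply follows the output schema listed above (an array of strings).

```python
from typing import Any, Iterable

MODEL = "intelligent-utilities/topic-tags"
VERSION = "9f3b644db1a196a7e95fa640abddd7939ef7c8cdd498086a70768c9020fe273d"


def model_ref() -> str:
    """Return the pinned 'owner/name:version' reference for this model."""
    return f"{MODEL}:{VERSION}"


def check_tags(output: Iterable[Any]) -> list[str]:
    """Hypothetical helper: confirm the output matches the schema (array of strings)."""
    if isinstance(output, str):
        # A bare string would otherwise be split into single characters.
        raise TypeError("expected an array of strings, got a single string")
    tags = list(output)
    if not all(isinstance(tag, str) for tag in tags):
        raise TypeError("expected every tag to be a string")
    return tags


def generate_tags(text: str, number_of_tags: int = 5) -> list[str]:
    """Run the model remotely; needs network access and an API token."""
    import replicate  # imported lazily so the helpers above work without it

    output = replicate.run(
        model_ref(),
        input={"text": text, "number_of_tags": number_of_tags},
    )
    return check_tags(output)


if __name__ == "__main__":
    print(model_ref())
```

Pinning the version ID keeps results tied to the version documented here, at the cost of not picking up later releases automatically.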