prompt-classifier

Maintainer: fofr

1.7K

Last updated 5/17/2024

Property	Value
Model Link	View on Replicate
API Spec	View on Replicate
Github Link	No Github link provided
Paper Link	No paper link provided

Model overview

prompt-classifier is a model that determines the toxicity of text-to-image prompts. It is a fine-tuned version of the llama-13b language model, with a focus on assessing the safety and appropriateness of prompts used to generate images. The model outputs a safety ranking between 0 (safe) and 10 (toxic) for a given prompt. This can be useful for content creators, AI model developers, and others who work with text-to-image generation to ensure their prompts do not produce harmful or undesirable content.

Similar models include codellama-13b-instruct, a 13 billion parameter Llama tuned for coding and conversation, and llamaguard-7b, a 7 billion parameter Llama 2-based input-output safeguard model.

Model inputs and outputs

prompt-classifier takes a text prompt as input and outputs a safety ranking between 0 and 10, indicating the level of toxicity or inappropriateness in the prompt.

Inputs

Prompt: The text prompt to be evaluated for safety.
Seed: A random seed value, which can be left blank to randomize the seed.
Debug: A boolean flag to enable debugging output.
Top K: The number of most likely tokens to consider when decoding the text.
Top P: The cumulative probability threshold for decoding; only the smallest set of most likely tokens whose probabilities add up to this value is considered.
Temperature: A value adjusting the randomness of the output, with higher values being more random.
Max New Tokens: The maximum number of new tokens to generate.
Min New Tokens: The minimum number of new tokens to generate (or -1 to disable).
Stop Sequences: A comma-separated list of sequences to stop generation at.
Replicate Weights: The path to fine-tuned weights produced by a Replicate fine-tune job.
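To make the decoding parameters above more concrete, here is a minimal sketch of how Temperature, Top K, and Top P are commonly applied to a set of candidate tokens. It illustrates the general technique only and is not the model's actual decoding code; the function name `filter_candidates` and the example logits are hypothetical.

```python
import math

def filter_candidates(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return {token: probability} after temperature, top-k and top-p filtering.

    Illustrative only: `logits` maps candidate tokens to raw scores.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Temperature rescales the logits; values above 1 flatten the distribution.
    scaled = {tok: logit / temperature for tok, logit in logits.items()}
    peak = max(scaled.values())
    weights = {tok: math.exp(v - peak) for tok, v in scaled.items()}
    total = sum(weights.values())
    ranked = sorted(
        ((tok, w / total) for tok, w in weights.items()),
        key=lambda item: item[1],
        reverse=True,
    )
    # Top K: keep only the k most likely tokens (0 means no limit in this sketch).
    if top_k > 0:
        ranked = ranked[:top_k]
    # Top P: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for tok, prob in ranked:
        kept.append((tok, prob))
        cumulative += prob
        if cumulative >= top_p:
            break
    # Renormalize so the surviving probabilities sum to 1.
    norm = sum(prob for _, prob in kept)
    return {tok: prob / norm for tok, prob in kept}

# Hypothetical scores for three candidate ranking tokens.
example_logits = {"0": 2.0, "5": 1.0, "10": 0.0}
print(filter_candidates(example_logits, top_k=2))
print(filter_candidates(example_logits, top_p=0.5))
```

For a classifier that should return a stable ranking, a low temperature or a small Top K narrows the candidates and makes repeated runs more consistent.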

Outputs

The model outputs a list of strings representing the predicted safety ranking for the input prompt.
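Because the output is a list of strings rather than a number, a caller will usually want to convert it into an integer score. The sketch below assumes the strings are text chunks that, once joined, contain the ranking as digits; that format is an assumption, and `parse_safety_score` is a hypothetical helper, not part of any official client.

```python
import re

def parse_safety_score(output_chunks):
    """Join string chunks and return the first integer in 0..10, or None.

    Assumes (not confirmed by the model page) that the joined text
    contains the ranking as a plain number.
    """
    text = "".join(output_chunks).strip()
    match = re.search(r"\d+", text)
    if match is None:
        return None
    score = int(match.group())
    # Anything outside the documented 0-10 range is treated as unparseable.
    return score if 0 <= score <= 10 else None

print(parse_safety_score(["1", "0"]))   # chunks joined before parsing
print(parse_safety_score(["unsafe"]))   # no digits, so no score
```

Returning None instead of raising lets the caller decide how to treat unexpected output, for example by rejecting the prompt.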

Capabilities

The prompt-classifier model is designed to assess the toxicity and safety of text-to-image prompts, returning a ranking from 0 (safe) to 10 (toxic) for each prompt it is given.

What can I use it for?

The prompt-classifier model can be used to screen text-to-image prompts before generating images, helping to ensure the prompts do not produce content that is toxic, inappropriate, or unsafe. This can be particularly helpful for content creators, publishers, and others who rely on text-to-image generation, as it allows them to proactively identify and address potentially problematic prompts.

Things to try

One interesting thing to try with the prompt-classifier model is to use it as part of a broader system for managing and curating text-to-image prompts. For example, you could integrate the model into a workflow where prompts are automatically evaluated for safety before being used to generate images. This could help streamline the content creation process and reduce the risk of producing harmful or undesirable content.
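A screening workflow like the one described above could be sketched as follows. The classifier is passed in as a plain callable so the example runs without network access; in practice it would wrap a call to prompt-classifier. The name `screen_prompts`, the threshold of 5, and the stub scores are all assumptions made for illustration.

```python
def screen_prompts(prompts, classify, max_score=5):
    """Split prompts into (allowed, blocked) using a 0-10 safety score.

    `classify` is any callable that returns an int score or None.
    Prompts with no usable score are blocked (fail closed).
    """
    allowed, blocked = [], []
    for prompt in prompts:
        score = classify(prompt)
        if score is not None and score <= max_score:
            allowed.append(prompt)
        else:
            blocked.append((prompt, score))
    return allowed, blocked

# Stub standing in for the real model; the scores are made up.
stub_scores = {"a cat in a garden": 0, "graphic violence": 9}
allowed, blocked = screen_prompts(
    ["a cat in a garden", "graphic violence", "unknown prompt"],
    classify=stub_scores.get,
)
print(allowed)
print(blocked)
```

Only the prompts in `allowed` would then be forwarded to the image generation step; the threshold should be tuned to the tolerance of the application.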

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Models

image-prompts

fofr

image-prompts is a model developed by Replicate creator fofr that can generate image prompts for the Midjourney text-to-image AI model. This model can be used to auto-complete prompts for any text-to-image model, including the DALL-E family. Similar models like text2image-prompt-generator and openjourney also focus on generating high-quality text prompts for different image generation models.

Model inputs and outputs

The image-prompts model takes a text prompt as input and generates an array of prompts as output. The key inputs are:

Inputs

Prompt: The text prompt to generate image prompts from
Max Length: The maximum number of tokens (generally 2-3 per word) to generate
Temperature: The randomness of the outputs, with higher values producing more diverse and unpredictable results
Repetition Penalty: The penalty for repeated words in the generated text, where values greater than 1 discourage repetition

Outputs

An array of generated image prompts

Capabilities

The image-prompts model can generate a diverse set of prompts based on a given text input. This can be useful for quickly generating a large number of prompts to test with image generation models like Midjourney, DALL-E, or Stable Diffusion. The model has been trained on a dataset of real prompts used with the Midjourney service, so the generated prompts reflect the style and phrasing used by Midjourney users.

What can I use it for?

The image-prompts model can be used to accelerate the process of finding high-quality prompts for text-to-image AI models. By generating a variety of prompts based on a given input, users can quickly experiment with different phrasings and ideas to see what works best for their desired image outputs. This can be particularly useful for Midjourney users, as the model is trained on real Midjourney prompts.

Things to try

Try experimenting with different temperature and repetition penalty values to see how they affect the diversity and quality of the generated prompts. You can also try providing more specific or detailed input prompts to see how the model responds. Additionally, you can combine the image-prompts model with other text-to-image models like openjourney or majicmix to further refine and improve your image generation workflow.

Text-to-Image

pulid-base

fofr

The pulid-base model is a face generation AI developed by fofr at Replicate. It uses SDXL fine-tuned checkpoints to generate images from a face image input. This model can be particularly useful for tasks like photo editing, avatar creation, or artistic exploration. Compared to similar models like stable-diffusion, pulid-base is specifically focused on face generation, while pulid is a more general ID customization model. The sdxl-deep-down model from the same creator is also fine-tuned on underwater imagery, making it suitable for different use cases.

Model inputs and outputs

The pulid-base model takes a face image as the primary input, along with a text prompt, seed, size, and various other options to control the style and output format. It then generates one or more images based on the provided inputs.

Inputs

Face Image: The face image to use for the generation
Prompt: The text prompt to guide the image generation
Seed: Set a seed for reproducibility (random by default)
Width/Height: The size of the output image
Face Style: The desired style for the generated face
Output Format: The file format for the output images
Output Quality: The quality level for the output images
Negative Prompt: Text to exclude from the generated image
Checkpoint Model: The model checkpoint to use for generation

Outputs

Output Images: One or more generated images based on the provided inputs

Capabilities

The pulid-base model can generate photo-realistic face images from a combination of a face image and a text prompt. It can be used to create unique, personalized images by blending the input face with different styles and scenarios described in the prompt. The model is particularly adept at maintaining the identity and features of the input face while generating diverse and visually compelling output images.

What can I use it for?

The pulid-base model can be a powerful tool for a variety of applications, such as:

Avatar and character creation: Generate unique, custom avatars or character designs for games, social media, or other digital experiences.
Face editing and enhancement: Enhance or modify existing face images, such as by changing the expression, style, or environment.
Digital art and illustration: Combine face images with imaginative prompts to create surreal, dreamlike, or stylized artworks.
Prototyping and visualization: Quickly generate face images to visualize concepts, ideas, or designs involving human subjects.

By leveraging the face-focused capabilities of the pulid-base model, you can create a wide range of personalized and visually striking images to suit your needs.

Things to try

Experiment with different combinations of face images, prompts, and model parameters to see how the pulid-base model can transform a face in unexpected and creative ways. Try using the model to generate portraits with specific moods, emotions, or artistic styles. You can also explore blending the face with different environments, characters, or fantastical elements to produce unique and imaginative results.

Text-to-Image

stable-diffusion

stability-ai

107.9K

Stable Diffusion is a latent text-to-image diffusion model capable of generating photo-realistic images given any text input. Developed by Stability AI, it is an impressive AI model that can create stunning visuals from simple text prompts. The model has several versions, with each newer version being trained for longer and producing higher-quality images than the previous ones.

The main advantage of Stable Diffusion is its ability to generate highly detailed and realistic images from a wide range of textual descriptions. This makes it a powerful tool for creative applications, allowing users to visualize their ideas and concepts in a photorealistic way. The model has been trained on a large and diverse dataset, enabling it to handle a broad spectrum of subjects and styles.

Model inputs and outputs

Inputs

Prompt: The text prompt that describes the desired image. This can be a simple description or a more detailed, creative prompt.
Seed: An optional random seed value to control the randomness of the image generation process.
Width and Height: The desired dimensions of the generated image, which must be multiples of 64.
Scheduler: The algorithm used to generate the image, with options like DPMSolverMultistep.
Num Outputs: The number of images to generate (up to 4).
Guidance Scale: The scale for classifier-free guidance, which controls the trade-off between image quality and faithfulness to the input prompt.
Negative Prompt: Text that specifies things the model should avoid including in the generated image.
Num Inference Steps: The number of denoising steps to perform during the image generation process.

Outputs

Array of image URLs: The generated images are returned as an array of URLs pointing to the created images.

Capabilities

Stable Diffusion is capable of generating a wide variety of photorealistic images from text prompts. It can create images of people, animals, landscapes, architecture, and more, with a high level of detail and accuracy. The model is particularly skilled at rendering complex scenes and capturing the essence of the input prompt.

One of the key strengths of Stable Diffusion is its ability to handle diverse prompts, from simple descriptions to more creative and imaginative ideas. The model can generate images of fantastical creatures, surreal landscapes, and even abstract concepts with impressive results.

What can I use it for?

Stable Diffusion can be used for a variety of creative applications, such as:

Visualizing ideas and concepts for art, design, or storytelling
Generating images for use in marketing, advertising, or social media
Aiding in the development of games, movies, or other visual media
Exploring and experimenting with new ideas and artistic styles

The model's versatility and high-quality output make it a valuable tool for anyone looking to bring their ideas to life through visual art. By combining the power of AI with human creativity, Stable Diffusion opens up new possibilities for visual expression and innovation.

Things to try

One interesting aspect of Stable Diffusion is its ability to generate images with a high level of detail and realism. Users can experiment with prompts that combine specific elements, such as "a steam-powered robot exploring a lush, alien jungle," to see how the model handles complex and imaginative scenes.

Additionally, the model's support for different image sizes and resolutions allows users to explore the limits of its capabilities. By generating images at various scales, users can see how the model handles the level of detail and complexity required for different use cases, such as high-resolution artwork or smaller social media graphics.

Overall, Stable Diffusion is a powerful and versatile AI model that offers endless possibilities for creative expression and exploration. By experimenting with different prompts, settings, and output formats, users can unlock the full potential of this cutting-edge text-to-image technology.
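The input constraints listed for Stable Diffusion (width and height in multiples of 64, at most 4 outputs) can be checked before a request is sent. The following is a small sketch of such a check; `check_image_request` is a hypothetical helper and not part of any Stable Diffusion or Replicate API.

```python
def check_image_request(width, height, num_outputs=1):
    """Return a list of problems with the requested settings (empty if none)."""
    problems = []
    for name, value in (("width", width), ("height", height)):
        # Dimensions must be multiples of 64 according to the input description.
        if value <= 0 or value % 64 != 0:
            problems.append(f"{name} must be a positive multiple of 64, got {value}")
    # The input description allows up to 4 images per request.
    if not 1 <= num_outputs <= 4:
        problems.append(f"num_outputs must be between 1 and 4, got {num_outputs}")
    return problems

print(check_image_request(512, 768, num_outputs=2))
print(check_image_request(500, 768, num_outputs=5))
```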

Text-to-Image

txt2img

fofr

The txt2img model is a collection of various text-to-image generation models from the Replicate platform, including RealVisXL, Juggernaut, Proteus, DreamShaper, and others. These models allow users to generate high-quality images from textual descriptions, leveraging the power of large language models and diffusion-based approaches. The txt2img model can be used through the ComfyUI web interface, providing a user-friendly way to experiment with different base weights and generate diverse visual outputs.

Model inputs and outputs

The txt2img model takes a variety of inputs, including a text prompt, image size, number of outputs, and various parameters to control the image generation process, such as the sampling method and guidance scale. The output of the model is an array of image URLs, representing the generated images.

Inputs

Prompt: The textual description that the model uses to generate the image.
Model: The base weights to use for the text-to-image generation.
Width/Height: The desired size of the output image.
Num Outputs: The number of images to generate.
Scheduler: The diffusion scheduler to use for image generation.
Sampler Name: The sampling method to use during the diffusion process.
Guidance Scale: The scale for classifier-free guidance, which controls the influence of the text prompt on the generated images.
Negative Prompt: The textual description to guide the model away from generating certain undesirable elements.
Num Inference Steps: The number of diffusion steps to perform during the generation process.
Disable Safety Checker: An option to disable the safety checker, which can be useful for generating artistic or experimental images.

Outputs

Array of Image URLs: The generated images are returned as an array of URLs, which can be used to display or download the output.

Capabilities

The txt2img model can be used to generate a wide variety of images from text prompts, ranging from realistic scenes to fantastical and imaginative creations. The model's capabilities are showcased in the examples provided by the maintainer, fofr, who has also created other Replicate models like face-to-many and sticker-maker.

What can I use it for?

The txt2img model can be used for a range of creative and practical applications, such as generating concept art, illustrating stories, creating custom graphics, and producing unique images for marketing or social media. The ability to fine-tune the model's outputs through various parameters allows users to experiment and find the right balance for their specific needs.

Things to try

One interesting aspect of the txt2img model is the ability to use different base weights, such as RealVisXL, Juggernaut, and Proteus. Experimenting with these different weights can result in varied visual styles and outputs, allowing users to explore different artistic and creative directions. Additionally, playing with the guidance scale and negative prompts can help users refine the generated images and achieve their desired results.

Text-to-Image