Stable Diffusion has many different parameters that can drastically affect the resulting images it generates.
While you can read the extensive documentation about each parameter, it is often more intuitive to visually understand the parameter space and the effects of these parameters through the use of XYZ plots. As the name suggests, the XYZ plot visually represents how the output of images changes depending on the increments of three chosen parameters that are set along the X, Y, and Z axes.
Generating an XYZ plot can be resource-intensive, because it computes many variations of an image at full resolution, each with slightly altered parameters.
While consumer laptops are certainly capable of running Stable Diffusion models, a dedicated GPU or rental GPU service will provide the necessary computational power to generate images in a reasonable time frame.
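To get a feel for how quickly the workload grows, here is a minimal sketch that counts the images a grid needs. The function name `grid_image_count` and the per-image timing are hypothetical and only for illustration; the axis values are examples.

```python
from math import prod

def grid_image_count(x_values, y_values=(None,), z_values=(None,)):
    """One image is rendered per combination of axis values.

    An unused axis is modelled as a single placeholder value.
    """
    return prod(len(axis) for axis in (x_values, y_values, z_values))

samplers = ["Euler a", "DPM++ 2M Karras", "DDIM"]
steps = [10, 20, 30, 40]
cfg_scales = [5, 7, 9]

total = grid_image_count(samplers, steps, cfg_scales)
print(total)  # 3 * 4 * 3 = 36

seconds_per_image = 6.0  # hypothetical timing, measure your own hardware
print(f"about {total * seconds_per_image / 60:.1f} minutes")
```

Adding a single value to any axis adds a whole slice of images, so it usually pays to start with two axes and a handful of values.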
Where is the XYZ Plot?
You can find it within the txt2img tab of Automatic1111’s WebUI. Scroll down to the Scripts section, where XYZ Plot is listed as one of the available scripts:
What You Can Compare With the XYZ Plot
As of November 2023, the following values are available for analysis with the XYZ Plot:
| Always discard next-to-last sigma | Nothing | Sigma Churn |
| CFG Scale | Prompt order | Sigma max |
| Checkpoint name | Prompt S/R | Sigma min |
| Clip skip | Refiner checkpoint | Sigma noise |
| Denoising | Refiner switch at | Steps |
| Extra noise | Sampler | Token merging ratio |
| Face restore | Schedule max sigma | Token merging ratio high-res |
| Hires sampler | Schedule min sigma | UniPC Order |
| Hires steps | Schedule rho | VAE |
| Hires upscaler | Schedule type | Var. seed |
| Initial noise multiplier | Seed | Var. strength |
| Negative Guidance minimum sigma | SGM noise multiplier | |
Note: More options are available when using extensions. For example, ControlNet adds several additional parameters.
Stable Diffusion starts with a seed value, which determines the initial random noise pattern that the generative algorithm turns into the final image. The seed is a critical parameter because it controls the stochastic element of the image generation process: changing the seed value results in a different image, even with all other parameters held constant.
When plotting 5 sequential seeds, we can see how the image differs from one seed to the next, even though the seeds are adjacent numbers:
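The idea that a seed fixes the starting noise can be sketched in a few lines. This is a toy stand-in that uses Python's `random` module rather than the tensors Stable Diffusion actually works with, and `initial_noise` is a hypothetical helper name.

```python
import random

def initial_noise(seed: int, size: int = 4) -> list[float]:
    """Toy stand-in for the starting noise: same seed, same numbers."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]

# Five sequential seeds, as on a Seed axis.
seeds = [1000 + i for i in range(5)]
noise_by_seed = {seed: initial_noise(seed) for seed in seeds}

# Re-running with the same seed reproduces the noise exactly.
print(initial_noise(1000) == noise_by_seed[1000])

# A neighbouring seed gives unrelated noise, not a slightly shifted copy.
print(noise_by_seed[1000] == noise_by_seed[1001])
```

This is why neighbouring seeds on a Seed axis are best read as independent samples of the same prompt.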
The CFG Scale, or Classifier-Free Guidance Scale, dictates how strictly the model adheres to your input prompt. A higher value follows the prompt more literally but can also lead to over-saturated colors, while a lower value gives the model more freedom at the cost of prompt accuracy. Finding the right balance is key to producing images that are both creative and relevant.
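Classifier-free guidance is commonly written as `uncond + scale * (cond - uncond)`, where the two predictions come from running the model without and with the prompt. The sketch below applies that formula to made-up numbers; the lists are toy values, not real model outputs, and `apply_cfg` is a hypothetical helper name.

```python
def apply_cfg(uncond, cond, cfg_scale):
    """Push the prediction away from the unconditional result,
    towards (and past) the prompt-conditioned result."""
    return [u + cfg_scale * (c - u) for u, c in zip(uncond, cond)]

uncond = [0.10, -0.20, 0.05]  # toy prediction without the prompt
cond = [0.30, -0.10, 0.00]    # toy prediction with the prompt

for scale in (1, 7, 15):
    print(scale, apply_cfg(uncond, cond, scale))
```

At a scale of 1 the result equals the conditioned prediction. Larger scales exaggerate the difference between the two predictions, which is one way to think about why very high values look overcooked.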
Here’s a plot of the CFG Scale:
The Prompt Search and Replace (S/R) function allows for quick word substitutions within your prompt, offering a straightforward way to alter the mood or setting of your image. This feature is particularly useful for adjusting lighting conditions, camera angles, or other specific details without rewriting the entire prompt.
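For example, if you wanted to see how a house looks as the seasons change, you might set up an XYZ plot where one axis cycles through a list of seasons like “spring”, “summer”, “autumn”, and “winter”. The first entry in the list is the search term, so it must already appear in your prompt; the remaining entries are swapped in for it one at a time.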
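The substitution itself is plain text replacement. Here is a minimal sketch, assuming the first axis value is the search term and every value (including the first) produces one prompt; the prompt text and the function name `prompt_search_replace` are made up for illustration.

```python
def prompt_search_replace(prompt, axis_values):
    """Return one prompt per axis value, replacing the first value
    (the search term) with each value in turn."""
    search_term = axis_values[0]
    if search_term not in prompt:
        raise ValueError(f"search term {search_term!r} is not in the prompt")
    return [prompt.replace(search_term, value) for value in axis_values]

prompt = "a cottage in the countryside, spring, detailed"
seasons = ["spring", "summer", "autumn", "winter"]

for variant in prompt_search_replace(prompt, seasons):
    print(variant)
```

Because it is a literal text match, a search term that also occurs inside another word of the prompt would be replaced there too, so pick a distinctive term.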
Here’s a plot of the Prompt S/R:
Variation Seed & Variation Strength: The Dance of Diversity
These two parameters, Variation Seed and Variation Strength, are a duo that must be adjusted together. They enable you to generate variations of your initial image by mixing features from different seeds.
This doesn’t mean you can mix two completely different images (e.g., a character and a landscape), as the prompt must remain the same; instead, if you like details from two different seeds, you can incorporate elements from both.
The variation strength is a single value that tells the model how far to move from the initial seed toward the variation seed. If you prefer the image from the initial seed but want just a hint of the characteristics from the other, set a low variation strength; raise it to lean further toward the variation seed.
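One common way to blend two noise patterns is spherical interpolation (slerp). The sketch below uses it on toy noise vectors to show what a strength between 0 and 1 means; whether the WebUI blends in exactly this way is an assumption here, and `slerp` and `noise` are illustrative helper names.

```python
import math
import random

def slerp(t, a, b):
    """Spherical interpolation: t=0 returns a, t=1 returns b."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    dot = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    dot = max(-1.0, min(1.0, dot))
    if abs(dot) > 0.9995:
        # Nearly parallel vectors: fall back to a straight-line blend.
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    theta = math.acos(dot)
    weight_a = math.sin((1 - t) * theta) / math.sin(theta)
    weight_b = math.sin(t * theta) / math.sin(theta)
    return [weight_a * x + weight_b * y for x, y in zip(a, b)]

def noise(seed, size=8):
    """Toy noise vector for a given seed."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]

base, variation = noise(1), noise(2)  # initial seed and variation seed
for strength in (0.0, 0.25, 0.5, 0.75, 1.0):
    blended = slerp(strength, base, variation)
    print(strength, [round(v, 2) for v in blended[:3]])
```

At strength 0 the blended noise is the initial seed's noise, at 1 it is the variation seed's noise, and the values in between move smoothly from one to the other.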
To make sense of this, here we demonstrate the variation strength in 0.25 increments from 0 (full adherence to the initial seed) to 1 (full commitment to the variation seed), with an intermediary scale that shows the gradual merger of features:
Prompt Order: The Power of Position
Stable Diffusion’s text encoder processes prompts in chunks of 75 tokens (Automatic1111 splits longer prompts across several chunks), and tokens near the start of a prompt tend to carry more weight, which affects the image’s overall composition and the salience of its elements.
Playing with the prompt order in the XYZ plot can reveal insights into how different arrangements of the same words can produce varying visual outcomes.
For example, when reordering the words “clouds”, “flowers”, “intricate”, and “countryside”, you can observe the nuanced shifts in the generated image’s focus:
It’s worth stressing here that the image isn’t expected to change drastically, because the words in the prompt are the same; rather, the subtle difference often lies in which aspect of the prompt is emphasized.
These differences may be more pronounced in long prompts, where moving a word or phrase from late in the prompt to the very start can make a significant impact.
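A Prompt order axis tries every ordering of the words you list, so the number of images grows factorially. The following sketch generates those orderings for a made-up prompt; `reorder_prompt` is a hypothetical helper that drops the words back into the slots they originally occupied.

```python
from itertools import permutations

def reorder_prompt(prompt, order):
    """Place the words of `order` into the positions those same
    words occupy in `prompt`, leaving the rest of the text alone."""
    slots = sorted((prompt.index(word), word) for word in order)
    pieces, cursor = [], 0
    for (start, original), replacement in zip(slots, order):
        pieces.append(prompt[cursor:start])
        pieces.append(replacement)
        cursor = start + len(original)
    pieces.append(prompt[cursor:])
    return "".join(pieces)

prompt = "a painting of clouds, flowers, intricate, countryside"
words = ["clouds", "flowers", "intricate", "countryside"]

variants = [reorder_prompt(prompt, order) for order in permutations(words)]
print(len(variants))  # 4 words give 4! = 24 prompts
print(variants[1])
```

Four words already produce 24 images, and five would produce 120, so this axis is best kept to three or four words.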
The sampler is the algorithm that “cleans up” the image during generation by removing noise step by step. There are several types to choose from, each with its own characteristics. The most common include Euler a, DPM++ 2M Karras, and DDIM.
By comparing samplers against one another using the XYZ plot, you can determine which one best suits the style and intricacy of the artwork you aim to create.
Here’s a plot of different samplers:
Note: It is best to use this parameter in conjunction with steps, as the sampler may require more or fewer steps to produce the desired result.
Comparing different checkpoints (essentially different versions of the Stable Diffusion model) can be challenging. They’re trained on varied datasets and have distinct expectations regarding prompts. For example, one checkpoint may rely on OpenAI’s CLIP ViT-L/14 as its text encoder, while another relies on OpenCLIP-ViT/H.
Variational Autoencoder (VAE)
The VAE decodes the model’s latent output into the final pixels, so it influences color and fine detail in Stable Diffusion images. Some checkpoints incorporate VAEs, while others offer them as separate enhancements. If you are curious about how the VAE impacts a model’s output, you can use the XYZ plot to compare different VAEs.
Negative Guidance Minimum Sigma (NG Min Sigma)
This setting lets the sampler skip the negative prompt for some steps, typically late ones when the image is almost finished, which speeds up generation with minimal loss of quality.
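To see why a higher threshold saves more work, here is a rough sketch that counts the steps falling under a given minimum sigma. It assumes the negative prompt can be skipped whenever the current noise level (sigma) is below the threshold; the exact rule the WebUI applies may differ, and the sigma schedule below is made up.

```python
def steps_below_threshold(sigmas, min_sigma):
    """Indices of steps whose noise level is under the threshold,
    i.e. the steps where the negative prompt could be skipped."""
    return [i for i, sigma in enumerate(sigmas) if sigma < min_sigma]

# Hypothetical 10-step schedule, from very noisy to nearly clean.
sigmas = [14.6, 9.7, 6.3, 4.0, 2.5, 1.5, 0.9, 0.5, 0.2, 0.03]

for threshold in (0, 1, 3):
    skippable = steps_below_threshold(sigmas, threshold)
    print(f"min sigma {threshold}: {len(skippable)} of {len(sigmas)} steps")
```

A threshold of 0 never matches, which corresponds to the feature being off. Putting this parameter on an axis lets you check at which value the quality loss becomes visible for your prompt.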