
ComfyUI Tutorial #2 - Building Your First Workflow


Alright, rookies! Gather ‘round. Today, we’re going to break down each node like we’re building an engine from scratch. I’ll explain why each piece exists, what it does, and how everything fits together to create an image. Ready? Let’s dive in!

1. Load the AI’s Brain (Load Checkpoint)

What does it do? Think of Stable Diffusion as a super-talented artist who’s asleep. The .safetensors or .ckpt file (your checkpoint) is the artist’s brain. Without it, the artist doesn’t even know how to hold a brush! The Load Checkpoint node wakes up the artist and gives them all their knowledge: styles, shapes, colors—everything they’ve learned during training.

Why is it essential?

  • Without a loaded model, ComfyUI can’t generate anything. It’s the foundation!

  • Each model has its specialty: some are great for landscapes, others for portraits, anime, etc.

How to do it:

  1. Add the Load Checkpoint node:

    • Right-click in the workspace → Add Node → Type Load Checkpoint.

  2. Select your model:

    • Click the dropdown for ckpt_name and pick your file (e.g., sd_xl_base_1.0.safetensors).

  3. Key outputs:

    • MODEL: The AI’s "brain," ready to follow your instructions.

    • CLIP: The "translator" that understands your text prompt.

    • VAE (sometimes): A tool to refine details (we’ll get to that later).
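If you're curious what this node looks like under the hood, ComfyUI represents each node as a JSON entry in its API workflow format: a class type plus its inputs, with outputs referenced by (node id, output index). Here's a minimal sketch; the node id "1" and the checkpoint filename are just examples:

```python
# Load Checkpoint in ComfyUI's API JSON format.
# Node id "1" is an arbitrary label; ckpt_name must match a file in models/checkpoints.
load_checkpoint = {
    "1": {
        "class_type": "CheckpointLoaderSimple",  # the Load Checkpoint node
        "inputs": {"ckpt_name": "sd_xl_base_1.0.safetensors"},
    }
}

# Downstream nodes reference outputs as [node_id, output_index]:
MODEL = ["1", 0]  # output 0: the model weights
CLIP = ["1", 1]   # output 1: the text encoder
VAE = ["1", 2]    # output 2: the VAE baked into the checkpoint (if any)
```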

2. Translate Your Idea into AI Language (CLIP Text Encode)

What does it do? Your prompt (e.g., "a fantasy landscape with glowing mountains") isn’t something the AI can read directly. It needs a translator to turn your words into something it understands: numbers and vectors. That’s the job of the CLIP Text Encode node!

Why connect it to the Checkpoint?

  • The CLIP output from Load Checkpoint contains the translation rules specific to your model.

  • Without this connection, the AI wouldn’t know what to draw.

How to do it:

  1. Add the CLIP Text Encode node:

    • Right-click → Add Node → Type CLIP Text Encode.

  2. Connect the CLIP:

    • Take the CLIP output from Load Checkpoint and plug it into the clip input of CLIP Text Encode.

  3. Write your prompt:

    • In the text field, describe what you want (e.g., "a fantasy landscape, golden mountains, cinematic lighting, 8k, highly detailed").

    • Tip: The more precise your prompt, the better the result. Use visual keywords (colors, styles, moods).
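The two steps above — wiring the CLIP output and writing your prompt — look like this in ComfyUI's API JSON format (node ids such as "1" and "2" are hypothetical labels for this sketch):

```python
# CLIP Text Encode: turns your prompt text into conditioning vectors.
# The "clip" input points at output 1 of a Load Checkpoint node (assumed id "1").
positive_prompt = {
    "2": {
        "class_type": "CLIPTextEncode",
        "inputs": {
            "text": "a fantasy landscape, golden mountains, cinematic lighting, 8k, highly detailed",
            "clip": ["1", 1],  # CLIP output of Load Checkpoint
        },
    }
}
```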

Why is this magical?

  • This node turns your text into a mental map the AI can follow to create the image.

  • You can also add a negative prompt (what you don’t want) in a second CLIP Text Encode node wired to the KSampler’s negative input.

3. The Heart of Generation (KSampler)

What does it do? The KSampler is the conductor. It takes:

  • The loaded model (which knows "how to draw").

  • Your translated prompt (which says "what to draw").

  • A blank latent "canvas" (from an Empty Latent Image node, where you set the width and height).

And it generates the image step by step, like a painter adding layers.

Why is this the most important node?

  • This is where you control quality, style, and consistency.

  • You can tweak:

    • steps: How many iterations the AI takes to refine the image. More steps = more detail (but slower).

      • Example: 20 steps = a good balance of speed and quality.

    • cfg: How closely the AI follows your prompt (vs. doing its own thing).

      • Example: cfg = 7 = the AI follows your instructions well without being too rigid.

    • sampler_name: The technique used to generate the image (e.g., euler, dpmpp_2m).

      • Example: euler = fast and creative, dpmpp_2m = more precise but slower.

    • seed: The number that determines the starting noise pattern; reuse it to reproduce the same image later.

      • Example: 123456789 (write it down if you like the result!).

How to do it:

  1. Add the KSampler node:

    • Right-click → Add Node → Type KSampler.

  2. Connect the inputs:

    • model → MODEL output from Load Checkpoint.

    • positive/negative → Outputs from your CLIP Text Encode nodes.

    • latent_image → Output of an Empty Latent Image node (set your width and height there, e.g., 1024x1024 for SDXL).

  3. Configure settings:

    • steps: 20 (good starting point).

    • cfg: 7 (balanced).

    • sampler_name: euler (or dpmpp_2m for more detail).

    • seed: 123456789 (or leave blank for randomness).

Pro tip: If your image is blurry or inconsistent, try increasing steps (30-50) or switching the sampler.
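The connections and settings above can be sketched in ComfyUI's API JSON format. The node ids ("1" through "5") are hypothetical labels for this example; the settings match the starting values recommended in this section:

```python
# KSampler: pulls everything together and runs the denoising loop.
ksampler = {
    "5": {
        "class_type": "KSampler",
        "inputs": {
            "model": ["1", 0],         # MODEL from Load Checkpoint
            "positive": ["2", 0],      # conditioning from the positive CLIP Text Encode
            "negative": ["3", 0],      # conditioning from the negative CLIP Text Encode
            "latent_image": ["4", 0],  # LATENT from an Empty Latent Image node
            "seed": 123456789,         # fix this to reproduce the same image
            "steps": 20,               # good speed/quality balance
            "cfg": 7,                  # follows the prompt without being rigid
            "sampler_name": "euler",
            "scheduler": "normal",
            "denoise": 1.0,            # full denoise for text-to-image
        },
    }
}
```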

4. Sharpen the Details with VAE (VAE Loader)

What does it do? The VAE (Variational AutoEncoder) is like a pair of precision glasses for the AI. It helps:

  • Refine colors.

  • Improve contrast.

  • Make the image sharper.

Why is it optional but recommended?

  • Some models include a VAE, but adding one manually can boost quality.

  • Without it, images might look dull or blurry.

How to do it:

  1. Add the VAE Loader node:

    • Right-click → Add Node → Type Load VAE (that’s the VAE Loader node).

  2. Load a VAE:

    • Select a file like vae-ft-mse-840000.safetensors.

  3. Connect it through a VAE Decode node:

    • KSampler outputs a LATENT, not a finished image. Add a VAE Decode node, plug the LATENT output of KSampler into its samples input, and the VAE output of VAE Loader into its vae input. The IMAGE output is your decoded picture.

When to use it?

  • Always with SDXL (VAE makes a big difference).

  • If your images lack sharpness or depth.
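In ComfyUI's API JSON format, the loader-plus-decode pair described above looks like this sketch (node ids are hypothetical; "5" is assumed to be your KSampler, and the VAE filename is the example from the text):

```python
# VAE Loader + VAE Decode: turn the KSampler's latent into a viewable image.
vae_nodes = {
    "6": {
        "class_type": "VAELoader",
        "inputs": {"vae_name": "vae-ft-mse-840000.safetensors"},
    },
    "7": {
        "class_type": "VAEDecode",
        "inputs": {
            "samples": ["5", 0],  # LATENT from KSampler
            "vae": ["6", 0],      # VAE from the loader (or ["1", 2] for the checkpoint's built-in VAE)
        },
    },
}
```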

5. Save Your Masterpiece (Save Image)

What does it do? You’ve generated an amazing image… but if you don’t save it, it’ll vanish like a dream! The Save Image node is your camera: it captures the final result and saves it to your computer.

Why is this crucial?

  • Without it, you’ll have to re-run the workflow to see your image again.

  • You can choose where and how to save it (format, name, etc.).

How to do it:

  1. Add the Save Image node:

    • Right-click → Add Node → Type Save Image.

  2. Connect the image:

    • Take the IMAGE output from VAE Decode and plug it into the images input of Save Image (KSampler’s LATENT output must be decoded first).

  3. Name your files:

    • In filename_prefix, type a name (e.g., my_fantasy_landscape).

    • By default, images are saved to ComfyUI’s output folder (e.g., ComfyUI_windows_portable/output).

Pro tip:

  • Add a Preview Image node if you just want a quick look in ComfyUI without saving to disk.

  • Use clear names to find your images later (e.g., golden_mountains_20steps_cfg7).
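As a sketch, here is the Save Image node in ComfyUI's API JSON format, using the naming tip above (node ids are hypothetical; "7" is assumed to be a VAE Decode node):

```python
# Save Image: writes the decoded image to ComfyUI's output folder.
save_image = {
    "8": {
        "class_type": "SaveImage",
        "inputs": {
            "images": ["7", 0],  # IMAGE output from VAE Decode
            "filename_prefix": "golden_mountains_20steps_cfg7",  # clear, searchable name
        },
    }
}
```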

6. Run the Workflow (Queue Prompt)

Moment of truth!

  1. Click Queue Prompt (bottom right).

  2. Watch the magic happen…

    • You’ll see a progress bar in the console.

    • The image will appear in the preview window.

  3. Admire your creation!

    • If you like it, note the seed to recreate it.

    • If not, tweak the prompt or settings and try again.

Common issues?

  • Blurry image → Increase steps or change the sampler.

  • AI ignores prompt → Check your cfg (7-12) and prompt clarity.

  • CUDA/GPU error → Close other apps or lower the resolution.

  • Nothing happens → Double-check all node connections.

7. Visual Recap of the Workflow

Here’s what your final workflow looks like:

[Load Checkpoint]
│
├── MODEL ─────────────────────────────────────→ model of [KSampler]
│
├── CLIP → [CLIP Text Encode] "a fantasy landscape..." ──→ positive of [KSampler]
│
└── CLIP → [CLIP Text Encode] "blurry, ugly, deformed" ──→ negative of [KSampler] (optional)

[Empty Latent Image] ──────────────────────────→ latent_image of [KSampler]

[KSampler] ── LATENT → [VAE Decode] ── IMAGE → [Save Image] → "my_fantasy_landscape_00001.png"
                           ↑
[VAE Loader] ──── VAE ─────┘ (or use the checkpoint’s built-in VAE)
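The whole graph above can also be built and queued from a script: ComfyUI exposes a /prompt HTTP endpoint that does the same thing as clicking Queue Prompt. This is a minimal sketch, assuming a default local install at 127.0.0.1:8188 and example filenames; adjust ids and names to your setup:

```python
import json
import urllib.request

def build_workflow() -> dict:
    """Assemble the tutorial's full graph in ComfyUI's API JSON format.
    Node ids and filenames are examples, not fixed names."""
    return {
        "1": {"class_type": "CheckpointLoaderSimple",
              "inputs": {"ckpt_name": "sd_xl_base_1.0.safetensors"}},
        "2": {"class_type": "CLIPTextEncode",
              "inputs": {"text": "a fantasy landscape, golden mountains, cinematic lighting",
                         "clip": ["1", 1]}},
        "3": {"class_type": "CLIPTextEncode",
              "inputs": {"text": "blurry, ugly, deformed", "clip": ["1", 1]}},
        "4": {"class_type": "EmptyLatentImage",
              "inputs": {"width": 1024, "height": 1024, "batch_size": 1}},
        "5": {"class_type": "KSampler",
              "inputs": {"model": ["1", 0], "positive": ["2", 0], "negative": ["3", 0],
                         "latent_image": ["4", 0], "seed": 123456789, "steps": 20,
                         "cfg": 7, "sampler_name": "euler", "scheduler": "normal",
                         "denoise": 1.0}},
        "7": {"class_type": "VAEDecode",
              "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},  # checkpoint's built-in VAE
        "8": {"class_type": "SaveImage",
              "inputs": {"images": ["7", 0], "filename_prefix": "my_fantasy_landscape"}},
    }

def queue_prompt(workflow: dict, host: str = "127.0.0.1:8188") -> None:
    """POST the graph to ComfyUI's /prompt endpoint (same as clicking Queue Prompt)."""
    data = json.dumps({"prompt": workflow}).encode("utf-8")
    req = urllib.request.Request(f"http://{host}/prompt", data=data,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        print(resp.read().decode())
```

Run `queue_prompt(build_workflow())` while ComfyUI is open, and the image appears in your output folder just as if you had clicked the button.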

8. Next Steps to Become a Pro

Now that you’ve mastered the basics, try:

  1. Adding LoRAs to refine styles (e.g., specific characters).

  2. Using ControlNet to control poses or compositions.

  3. Experimenting with Illustrious for advanced workflows (upscaling, model mixing).

  4. Downloading pre-made workflows from CivitAI and studying them.

9. Beginner FAQs

Q: Why is my image all black?

  • Answer: Your prompt might be too vague, or your cfg is too low. Try cfg = 10 and a more detailed prompt.

Q: How do I make a series of images with the same style?

  • Answer: Lock the seed (in KSampler, set your seed and switch control_after_generate to fixed) and only change part of the prompt.

Q: Can I mix two models?

  • Answer: Yes! Use nodes like ModelMerge or Illustrious Mix.

10. You’re Ready to Create!

Congrats! You’ve just built your first workflow like a pro. ComfyUI is like Lego—the more you play with nodes, the more creative you’ll get. So go ahead, build, experiment, and generate!

(Stuck? Ask questions—I’m here to help!)
