ComfyUI users sometimes run into unusual workflows or nodes that throw errors, and sometimes new tools or models appear that make generation faster. This guide starts with the one that causes the most headaches: SageAttention.

What is SageAttention?

Put as simply as possible, it is a library of optimized attention routines that makes image / video generation much faster, with very little or almost no loss in quality.

Without SageAttention

got prompt
Requested to load WanVAE
loaded completely; 20446.88 MB usable, 242.03 MB loaded, full load: True
switching model at step 2
Running high noise model...
Requested to load WAN21
loaded completely; 17057.76 MB usable, 13629.08 MB loaded, full load: True

  0%|          | 0/2 [00:00<?, ?it/s]
50%|█████     | 1/2 [00:27<00:27, 27.17s/it]
100%|██████████| 2/2 [00:47<00:00, 23.19s/it]
100%|██████████| 2/2 [00:47<00:00, 23.78s/it]
Running low noise model...
Requested to load WAN21
loaded completely; 17057.76 MB usable, 13629.08 MB loaded, full load: True
0%|          | 0/2 [00:00<?, ?it/s]
50%|█████     | 1/2 [00:27<00:27, 27.29s/it]
100%|██████████| 2/2 [00:47<00:00, 23.26s/it]
100%|██████████| 2/2 [00:47<00:00, 23.86s/it]
Requested to load WanVAE
loaded completely; 5057.62 MB usable, 242.03 MB loaded, full load: True
Prompt executed in 139.50 seconds

With SageAttention

got prompt
switching model at step 2
Running high noise model...
Requested to load WAN21
loaded completely; 17057.76 MB usable, 13629.08 MB loaded, full load: True
Patching comfy attention to use sageattn

  0%|          | 0/2 [00:00<?, ?it/s]
50%|█████     | 1/2 [00:17<00:17, 17.72s/it]
100%|██████████| 2/2 [00:31<00:00, 15.39s/it]
100%|██████████| 2/2 [00:31<00:00, 15.74s/it]
Restoring initial comfy attention
Running low noise model...
Requested to load WAN21
loaded completely; 17057.76 MB usable, 13629.08 MB loaded, full load: True
Patching comfy attention to use sageattn

  0%|          | 0/2 [00:00<?, ?it/s]
50%|█████     | 1/2 [00:17<00:17, 17.76s/it]
100%|██████████| 2/2 [00:31<00:00, 15.41s/it]
100%|██████████| 2/2 [00:31<00:00, 15.76s/it]
Restoring initial comfy attention
Requested to load WanVAE
loaded completely; 5057.62 MB usable, 242.03 MB loaded, full load: True
Prompt executed in 96.38 seconds

*Tested on an RTX 5090 32 GB on Runpod

Total time drops from 139.50 to 96.38 seconds, which is about 43 seconds (roughly 31%) faster.
The gain tends to be more noticeable on more powerful GPUs.
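
The arithmetic behind that comparison can be checked directly. This is only a small sketch using the two "Prompt executed in ..." values from the logs above; the variable names are made up for illustration.

```python
# Timings copied from the two "Prompt executed in ..." log lines above.
baseline_s = 139.50  # run without SageAttention
sage_s = 96.38       # run with SageAttention

saved_s = baseline_s - sage_s                    # seconds saved per run
percent_less_time = saved_s / baseline_s * 100   # share of the original time saved
speedup = baseline_s / sage_s                    # how many times faster overall

print(f"saved {saved_s:.2f} s, {percent_less_time:.1f}% less time, {speedup:.2f}x")
```

These figures cover the whole prompt, including model loading and VAE decoding, so the speedup of the sampling steps alone (23.78 s/it versus 15.74 s/it in the logs) is somewhat higher.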

Why installing SageAttention is hard

If no prebuilt file matches your PyTorch and CUDA versions, you have to build everything from source yourself.
If your PyTorch and CUDA versions match a ready-to-install file that someone has already built, installation is very easy.
SageAttention also needs Triton installed alongside it.

Ways to install SageAttention

Batch script that installs Triton and SageAttention automatically

Supports ComfyUI Portable running Python 3.10, 3.11, 3.12, and 3.13.
Download it here:
- https://huggingface.co/vjump21848/sageattention-pre-compiled-wheel/resolve/main/triton-and-sageattn-installer-for-comfyui-portable.zip

Download the zip from the link into the ComfyUI Portable folder you want to patch, extract it, and double-click the script to run it.

*Always back up ComfyUI before patching.

To install manually, follow the steps below.

Checklist: versions the SageAttention file must match

Python version
torch version
CUDA version

The easiest way to check these is the About ComfyUI page.

Click the ComfyUI logo -> Help -> About ComfyUI

The following Python versions work:

3.10.x
3.11.x
3.12.x
3.13.x
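
If the About page is not handy, the same check can be scripted. The following is a minimal sketch, assuming it is run with the interpreter that ComfyUI actually uses; the names `SUPPORTED` and `is_supported` are hypothetical helpers, not part of ComfyUI.

```python
import sys

# Python (major, minor) versions listed above as working.
SUPPORTED = {(3, 10), (3, 11), (3, 12), (3, 13)}


def is_supported(version_info=sys.version_info):
    """Return True if the (major, minor) pair is one of the listed versions."""
    return tuple(version_info[:2]) in SUPPORTED


# Report the running interpreter's version and whether it is on the list.
print(sys.version.split()[0], "supported:", is_supported())
```

On ComfyUI Portable this would be run through .\python_embeded\python.exe so that the embedded interpreter is the one being checked.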

How to read the torch and CUDA versions

x.x.x+cuzzz

example

2.9.1+cu130

x.x.x is the torch version
cuzzz is the CUDA version
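
The x.x.x+cuzzz format described above can be split mechanically. This is an illustrative sketch only; both function names are invented for this example, and the tag conversion assumes the usual convention that the last digit of the tag is the CUDA minor version (cu128 means CUDA 12.8).

```python
def split_torch_version(version):
    """Split 'x.x.x+cuzzz' into (torch_version, cuda_tag)."""
    torch_version, sep, cuda_tag = version.partition("+")
    if not sep:
        # No '+' suffix, so the string carries no CUDA tag.
        return torch_version, None
    return torch_version, cuda_tag


def cuda_tag_to_version(tag):
    """Turn a tag such as 'cu130' into a dotted CUDA version such as '13.0'."""
    digits = tag.removeprefix("cu")
    return f"{digits[:-1]}.{digits[-1]}"


torch_version, cuda_tag = split_torch_version("2.9.1+cu130")
print(torch_version, cuda_tag, cuda_tag_to_version(cuda_tag))
```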

Python + torch + CUDA combinations that work (on Windows)

Build it yourself

Python 3.10.x to 3.13.x with torch 2.6.0+cu124 to 2.9.1+cu130

Ready to install (with download links)

torch 2.9.0+cu130

Python 3.11.x with torch 2.9.0+cu130

torch 2.9.1+cu130

Python 3.11.x with torch 2.9.1+cu130

torch 2.7.1+cu128

torch 2.8.0+cu128

How to install SageAttention

Open cmd in the ComfyUI folder. Whenever you use pip to install a package, always put this prefix in front of the command, for example:

.\python_embeded\python.exe -m pip install uv

First, install triton-windows.

If you use torch 2.7

.\python_embeded\python.exe -m pip install "triton-windows<3.4"

If you use torch 2.8

.\python_embeded\python.exe -m pip install "triton-windows<3.5"

If you use torch 2.9

.\python_embeded\python.exe -m pip install "triton-windows<3.6"

Output when triton installs successfully

Collecting triton-windows<3.6
  Downloading triton_windows-3.5.1.post21-cp312-cp312-win_amd64.whl.metadata (1.8 kB)
Downloading triton_windows-3.5.1.post21-cp312-cp312-win_amd64.whl (46.5 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 46.5/46.5 MB 61.6 MB/s  0:00:00
Installing collected packages: triton-windows
Successfully installed triton-windows-3.5.1.post21
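
The torch-to-triton pairing used in the three commands above can be written as a small lookup. This is a sketch under the assumption that only the three torch series named in this guide matter; `TRITON_PIN` and `triton_requirement` are hypothetical names, and the constraints are copied from the commands above.

```python
# Requirement strings copied from the pip commands above, keyed by torch series.
TRITON_PIN = {
    "2.7": "triton-windows<3.4",
    "2.8": "triton-windows<3.5",
    "2.9": "triton-windows<3.6",
}


def triton_requirement(torch_version):
    """Return the triton-windows requirement for a torch version string."""
    base = torch_version.split("+")[0]            # drop the '+cuXXX' suffix
    major_minor = ".".join(base.split(".")[:2])   # '2.9.1' -> '2.9'
    if major_minor not in TRITON_PIN:
        raise ValueError(f"no triton-windows pin listed for torch {major_minor}")
    return TRITON_PIN[major_minor]


print(triton_requirement("2.9.1+cu130"))
```

The returned string is what goes in quotes after `pip install`; the quotes matter in cmd because `<` would otherwise be treated as input redirection.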

Next, download the SageAttention file and place it in the top-level ComfyUI Portable folder.

Warning: never rename the file.

Then open cmd and install it like this:

.\python_embeded\python.exe -m pip install "<name of the sageattention .whl file>"

When the installation finishes, you will see something like this:

Collecting sageattention==2.2.0+cu130torch2.9.1
  Downloading sageattention (9.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 9.1/9.1 MB 33.2 MB/s  0:00:00
Installing collected packages: sageattention
Successfully installed sageattention-2.2.0+cu130torch2.9.1
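
To double-check that both packages ended up in the right interpreter, the installed versions can be queried from Python. This is a minimal sketch using the standard `importlib.metadata` module; `installed_version` is a hypothetical helper, and the distribution names are taken from the pip output shown above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Distribution names as they appear in the pip output above.
for name in ("triton-windows", "sageattention"):
    print(name, installed_version(name) or "not installed")
```

Run it with .\python_embeded\python.exe, since a system-wide Python would report on a different set of packages.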

Install libs and include so that triton can be used

Check your Python version, looking only at X.YY (for example 3.10, 3.11, 3.12, or 3.13), download the matching archive, and extract it into python_embeded.

python 3.10

https://github.com/woct0rdho/triton-windows/releases/download/v3.0.0-windows.post1/python_3.10.11_include_libs.zip

python 3.11

https://github.com/woct0rdho/triton-windows/releases/download/v3.0.0-windows.post1/python_3.11.9_include_libs.zip

python 3.12

https://github.com/woct0rdho/triton-windows/releases/download/v3.0.0-windows.post1/python_3.12.7_include_libs.zip

python 3.13

https://github.com/woct0rdho/triton-windows/releases/download/v3.0.0-windows.post1/python_3.13.2_include_libs.zip

*The last number in the version does not need to match.

Extract the files into this folder, as shown in the image.
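
Choosing the archive by X.YY only can be sketched as a lookup that ignores the last number. The URLs below are the four links listed above; `BASE`, `ARCHIVES`, and `include_libs_url` are names invented for this illustration.

```python
# Common prefix of the four download links listed above.
BASE = (
    "https://github.com/woct0rdho/triton-windows/releases/download/"
    "v3.0.0-windows.post1/"
)

# Archive file names from the list above, keyed by Python X.YY.
ARCHIVES = {
    "3.10": "python_3.10.11_include_libs.zip",
    "3.11": "python_3.11.9_include_libs.zip",
    "3.12": "python_3.12.7_include_libs.zip",
    "3.13": "python_3.13.2_include_libs.zip",
}


def include_libs_url(python_version):
    """Return the archive URL for a Python version, matching on X.YY only."""
    major_minor = ".".join(python_version.split(".")[:2])
    return BASE + ARCHIVES[major_minor]


# The last number differs (3.12.10 vs 3.12.7) but the 3.12 archive is still chosen.
print(include_libs_url("3.12.10"))
```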

How to use SageAttention

You need the ComfyUI-KJNodes custom node (install it from ComfyUI Manager or from https://github.com/kijai/ComfyUI-KJNodes).

Then add this node and connect it between model and model.

Start with auto; if you get an error, try one of the options below:

- sageattn_qk_int8_pv_fp16_cuda
- sageattn_qk_int8_pv_fp16_triton

Or, if you use one of kijai's custom nodes such as WanVideoWrapper, you will find this menu.

After that, you can start generating.

If you use Runpod

These templates come with SageAttention preinstalled and ready to use.

CU128 template URL: https://console.runpod.io/deploy?template=7crlm3hxud&ref=6h6f9kga (click this link) (supports 50-series GPUs)
CU124 template URL: https://console.runpod.io/deploy?template=kzds77do4y&ref=6h6f9kga (works with most GPUs on Runpod)

How to install SageAttention for ComfyUI