In a test with Higgs Audio 2, ComfyUI generated audio in ~5 s on an RTX 5070 Ti while using little VRAM and RAM. A Python script with the same parameters took noticeably longer and consumed all available VRAM/RAM. A likely cause is that ComfyUI keeps the model in memory and manages resources more efficiently. #AI #Audio #ComfyUI #Python #GPU #HiggsAudio2 #TríTuệNhânTạo #ÂmThanh

https://www.reddit.com/r/LocalLLaMA/comments/1qpww5a/comfyui_vs_python_gpu_usage_higgs_audio_2/

Performance comparison of Higgs Audio 2 running in ComfyUI vs. a plain Python script:
- ComfyUI renders a sentence in ~5 seconds (RTX 5070 Ti), while the Python script takes longer with the same parameters
- ComfyUI uses significantly less VRAM/RAM than the Python script
Open question: what explains the performance gap between the two?
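
The poster's own hypothesis (ComfyUI keeping the model resident) would explain most of the gap: a standalone script that reloads weights on every run pays the full disk-to-VRAM cost each time. A minimal sketch of that caching pattern, with `load_higgs_model` as a hypothetical stand-in for whatever loader the real script uses:

```python
import functools

def load_higgs_model(checkpoint_path):
    # Hypothetical stand-in for the real Higgs Audio 2 loader; in practice
    # this is the expensive step that moves weights from disk into VRAM.
    ...

@functools.lru_cache(maxsize=1)
def get_model(checkpoint_path: str):
    # Cached per process: the first call pays the full load cost, every
    # later call reuses the resident model -- which is effectively what
    # ComfyUI's model management does between queued runs.
    return load_higgs_model(checkpoint_path)
```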

#AI #GPU #TốiƯuHóa #ComfyUI #HiggsAudio2 #Tech #SoSánhCôngNghệ

https://www.reddit.com/r/LocalLLaMA/comments/1qpww5a/comfyui_vs_python_gpu_usage_higgs_audio_2/

YES, IT SUCCEEDED!!!

Just rendered an image at 944×1152 (slightly above 1024×1024) using Flux1-Schnell-FP8 on my 6700 XT, and it works! (Image 1 is the Real-ESRGAN 2× upscaled version)

Workflow 1: Sampling (Image 2)

Prompt executed → UNet generates the latent

Step 1 (model load + latent generation) took 419 seconds

Output: Latent tensor saved to disk

Workflow 2: VAE Decode (Image 3)

Latent loaded → VAE decodes the image

Duration: 7.5 seconds

Advantage: UNet doesn’t need to stay in VRAM → VRAM freed, even on 12 GB GPUs
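
The split hinges on persisting the latent between the two runs. For orientation, a minimal sketch of that round trip, assuming ComfyUI's `.latent` files are safetensors holding the tensor under a `latent_tensor` key (the format the stock SaveLatent/LoadLatent pair uses):

```python
import torch
from safetensors.torch import save_file, load_file

def save_latent(samples: torch.Tensor, path: str) -> None:
    # End of Workflow 1: the UNet's output goes to disk, after which the
    # UNet itself can be unloaded and its VRAM reclaimed.
    save_file({"latent_tensor": samples.contiguous()}, path)

def load_latent(path: str) -> torch.Tensor:
    # Start of Workflow 2: only the VAE has to be resident to decode this.
    return load_file(path, device="cpu")["latent_tensor"].float()
```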

The problem with the stock LoadLatent Node

Dropdown only shows files if they were produced/annotated by a previous SaveLatent Node

Node is designed to pass latents inside a graph, not load arbitrary files from disk

Purpose: prevents accidentally loading wrong files

Workaround (Image 4)

Edited /ComfyUI/nodes.py, class LoadLatent

Hardcoded latent path → Node now loads directly from disk

Result: Workflow 2 runs instantly, UNet can be unloaded
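
The post doesn't include the actual diff, so here is a sketch of what the described edit to `class LoadLatent` in `nodes.py` could look like; the hardcoded path is hypothetical, and the rescaling logic mirrors the stock node's handling of `.latent` files:

```python
import safetensors.torch

# Hypothetical hardcoded path to whatever Workflow 1 saved.
LATENT_PATH = "/ComfyUI/output/latents/flux_step1.latent"

class LoadLatent:
    RETURN_TYPES = ("LATENT",)
    FUNCTION = "load"
    CATEGORY = "_for_testing"

    @classmethod
    def INPUT_TYPES(cls):
        # No dropdown any more: the node always reads LATENT_PATH directly,
        # bypassing the SaveLatent-annotation check that hid arbitrary files.
        return {"required": {}}

    def load(self):
        latent = safetensors.torch.load_file(LATENT_PATH, device="cpu")
        # The stock node rescales legacy files; newer SaveLatent output is
        # tagged "latent_format_version_0" and needs no rescaling.
        multiplier = 1.0 if "latent_format_version_0" in latent else 1.0 / 0.18215
        return ({"samples": latent["latent_tensor"].float() * multiplier},)
```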

Timing

Step 1 (model load + latent generation): 419 s

Step 2 (VAE decode): 7.5 s

Result: High-res images on a 12 GB RDNA2 GPU are now possible with Flux1-Schnell-FP8 without ComfyUI crashing! (Image 5 is the original output)

This might actually become my new Flux workflow: render quick 512×512 previews first (which works perfectly on RDNA2 GPUs), sort out the good ones, extract the seed from the PNG metadata, and then re-render only the selected images with the same seed using the split workflow at higher resolutions. This way, high-resolution Flux1-Schnell-FP8 renders become possible on 12 GB RDNA2 GPUs :D
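
Pulling the seed back out needs no special tooling; a small sketch, assuming ComfyUI's default behavior of embedding the prompt graph as JSON in the PNG's `prompt` text chunk:

```python
import json
from PIL import Image

def seeds_from_png(path: str) -> dict:
    # ComfyUI embeds the prompt graph as JSON in a PNG tEXt chunk by default.
    prompt = json.loads(Image.open(path).text["prompt"])
    # Collect the seed of every node that has one (typically the KSampler).
    return {node_id: node["inputs"]["seed"]
            for node_id, node in prompt.items()
            if "seed" in node.get("inputs", {})}

print(seeds_from_png("ComfyUI_00001_.png"))  # hypothetical file name
```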

Question at the end: Has anyone ever done this before? Because I have no clue xD

#ComfyUI #flux #Flux1SchnellFP8 #FP8 #AMD #RDNA2 #VAE #AIArt #Pixelfed #HighResolution #GPUOptimization #LatentWorkflow #AIWorkflow #AIHacks #RealESRGAN #Upscale #AIExperiment #CreativeAI #DigitalArt #AICommunity #python #linux #opensource #foss
One year ago today I opened my Pixelfed profile 🎉

Time for a short retrospective of how it all began.

Late 2024, 2 a.m.: I was manually integrating peaks from chromatograms in Chromeleon when I thought: why can’t an AI do this?
The idea didn’t go anywhere, but I started exploring AI frameworks and ended up with image generation. ROCm on Debian, EasyDiffusion, and then Pixelfed.

Later, Debian and ROCm drifted apart, so for a while I posted real-life photos instead. With an Ubuntu chroot everything ran cleanly again, even AUTOMATIC1111. SD 1.5 was my standard for a long time. Early this year I tried FLUX in ComfyUI but had to drop it: RDNA2 + no FP8 + incomplete HIP → the FLUX VAE just wasn't practical. In mid-January I finally fixed the NaNs in the SDXL VAE in A1111.

Now I'm fully on ComfyUI, can render 1024×1024, and anything above 512 px no longer OOMs.

At the end of 2025 I used Pixelfed for IT/FOSS networking: the FOSS Advent Calendar. Posts were seen thanks to ActivityPub, and I even started my own dev blog xD

Thanks 💜 to everyone who follows me, especially my regular viewers and those I really exchange with.
Pixelfed remains my place to share, experiment, and learn.

1 year on Pixelfed, and it all started with peaks at 2 a.m.

tl;dr: Thanks so much to everyone who follows me, especially my regular viewers and those I really exchange with, you are awesome (in Austrian slang: Ihr seid ur leiwand 💜)

#Pixelfed #Fediverse #OpenSource #FOSS #Anniversary #1Year #Celebration #Birthday #Milestone #BirthdayCake #Fireworks #Festive #Colorful #AI #AIArt #GenerativeArt #ComfyUI #SDXL #StableDiffusion #ROCm #Linux #ThankYou #AiCommunity
New post soon, let's rendez-vous on Friday at 8:00 AM
for a brand new Following Alice iteration. More info at https://billisdead.com
#billisdead #FollowingAlice #ComfyUI #aiart #aiartwork #aiillustration #flux1dev #AliceInWonderland
billisdead's neural stories

AI artist, FOSS & self-hosting enthusiast. I make dark artworks using open-source, self-hosted AI tools.

What happens when a gravure photographer creates generative-AI gravure? Part 59: FLUX.2 [klein] released by BFL! (Kazuhisa Nishikawa)

FLUX.2 [klein] arrives from BFL!

TechnoEdge

nefuron (@nefuron_23)

A shared test using Qwen3-tts to generate about three minutes of dialogue between multiple characters (an older-sister type, Kobayashi, and a narrator). It notes intonation and accent issues, but highlights that multi-character dialogue can be generated locally, and mentions that generation took slightly less time than the audio's length.

https://x.com/nefuron_23/status/2016116209926984167

#qwen3tts #tts #comfyui #speechsynthesis

nefuron (@nefuron_23) on X

A multi-character dialogue test with Qwen3-tts. It's about 3 minutes long, with three characters: an older sister, Kobayashi, and a narrator. There are accent and intonation issues, but it's amazing that this can be made locally! Generation time was roughly a bit less than the audio's length #Qwen3TTS #comfyui

Tongyi-MAI/Z-Image · Hugging Face


🧠 LTX-2-Workflows is a repository collecting a set of ready-to-use workflows designed for working with LTX-2 inside #ComfyUI, making video generation more accessible and modular.
👉 Details: https://www.linkedin.com/posts/alessiopomaro_comfyui-ai-genai-activity-7421887553450131457-ZsPG

___ 
✉️ If you want to stay up to date on these topics, subscribe to my newsletter: https://bit.ly/newsletter-alessiopomaro

#AI #GenAI #GenerativeAI #IntelligenzaArtificiale #LLM 

とよふくToyofuku (@Yeq6X)

Reports building and testing a feature that lets arbitrary ComfyUI workflow JSON be used as "skills" from Claude Code. For example, an instruction like "split this image into layers" invokes qwen image layered, and tasks like "convert this image with ○○'s LoRA" can be run as skills — an integration and usage example.

https://x.com/Yeq6X/status/2016027113946808767

#comfyui #claude #qwen #lora #aitools

とよふく🎍Toyofuku (@Yeq6X) on X

I tried turning arbitrary ComfyUI JSON into skills usable from Claude Code. With an instruction like "split this image into layers" it can use qwen image layered, or it can be used like "convert this image with the ○○ LoRA". Claude
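
The tweet shares no code, but the core of such a skill is just submitting an API-format workflow JSON to a running ComfyUI instance over its standard /prompt HTTP endpoint; a minimal sketch (default host/port, hypothetical file name):

```python
import json
import urllib.request

def queue_workflow(workflow_path: str, server: str = "http://127.0.0.1:8188"):
    # Load an API-format workflow JSON (exported via "Save (API Format)").
    with open(workflow_path) as f:
        workflow = json.load(f)
    # POST it to ComfyUI's /prompt endpoint to enqueue the graph.
    req = urllib.request.Request(
        f"{server}/prompt",
        data=json.dumps({"prompt": workflow}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())  # contains the queued prompt_id

print(queue_workflow("qwen_image_layered.json"))  # hypothetical file name
```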
