Last night I spent some time looking into how to generate video from a text prompt, and created this short clip. I expected it to need more GPU power, but it all worked out fine: the clip took around 100 seconds to generate. The model is Wan2.1, a 1.3B-parameter model that fits on a GPU with 8 GB of VRAM (AMD or Nvidia). I ran it with ComfyUI on my home lab. That is impressive for a "tiny" model, and it goes to show that model size alone does not determine the quality of the results. Give the little models a go. You may surprise yourself with what they can do.





