Say hello to #InstructPix2Pix - the #DeepLearning model that edits images based on human instructions!

Trained on synthetic data, it outperforms baseline AI image-editing models.

Discover the magic of InstructPix2Pix on #InfoQ: https://bit.ly/44EO2B1

#AI #ML #ComputerVision

Berkeley Open-Sources AI Image-Editing Model InstructPix2Pix

Researchers from the Berkeley Artificial Intelligence Research (BAIR) Lab have open-sourced InstructPix2Pix, a deep-learning model that follows human instructions to edit images. InstructPix2Pix was trained on synthetic data and outperforms a baseline AI image-editing model.


Just found out it's possible to merge the #InstructPix2Pix and #riffusion models using the recipe in the image attached to this post.

And the most interesting part: the resulting InstructPix2Pix-riffusion model indeed still only outputs spectrograms. The results are otherwise not that good (I guess the GPT-3 component of InstructPix2Pix was not optimized for spectrograms), but it's still interesting that this merge kinda works.
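The core idea behind such merge recipes is a weighted average of the two checkpoints' parameters. A minimal sketch of that idea, with illustrative layer names and plain floats standing in for real torch state-dict tensors:

```python
# Rough sketch of a weighted checkpoint merge: every parameter shared by
# both models is linearly interpolated. Layer names and values here are
# illustrative placeholders, not real InstructPix2Pix/riffusion weights.

def merge_checkpoints(ckpt_a, ckpt_b, alpha=0.5):
    """Blend two checkpoints: alpha=0 keeps A, alpha=1 keeps B."""
    merged = {}
    for name, value_a in ckpt_a.items():
        if name in ckpt_b:
            merged[name] = (1 - alpha) * value_a + alpha * ckpt_b[name]
        else:
            merged[name] = value_a  # layer only exists in checkpoint A
    return merged

ip2p = {"unet.down.0": 0.25, "unet.up.0": -0.5}
riffusion = {"unet.down.0": 0.75, "unet.up.0": 1.0}
print(merge_checkpoints(ip2p, riffusion))
# {'unet.down.0': 0.5, 'unet.up.0': 0.25}
```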

#StableDiffusion

#instructPix2Pix (ip2p) is finally working again in the most recent version of the automatic1111-webui.

Just update your webui installation with git pull, then update your ip2p extension from the Extensions tab of the webui.

If you had a depth model loaded before, switch to your ip2p model and then restart the webui. This ensures the depth model is unloaded and avoids out-of-memory errors while using ip2p.
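The git pull steps above can be sketched in a few lines of Python; the directory names are illustrative, so point them at your own install (updating the extension via the Extensions tab has roughly the same effect as pulling its repo directly):

```python
# Minimal sketch of updating a webui checkout and an extension with git.
# The paths below are illustrative placeholders, not canonical locations.
import subprocess
from pathlib import Path

WEBUI = Path("stable-diffusion-webui")
IP2P_EXT = WEBUI / "extensions" / "instruct-pix2pix"

def git_pull(repo: Path) -> str:
    """Run `git pull` inside `repo` and return git's stdout."""
    result = subprocess.run(
        ["git", "pull"], cwd=repo,
        capture_output=True, text=True, check=True,
    )
    return result.stdout

# git_pull(WEBUI)     # update the webui itself
# git_pull(IP2P_EXT)  # update the ip2p extension
```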

#StableDiffusion

Update: oh well, it turns out #instructpix2pix can actually run on systems with 6GB of VRAM and doesn't need 18GB of VRAM. This is huge news.

We now have proof that a #StableDiffusion model with an LLM component can indeed run locally on ordinary PCs. I think it's now only a matter of time until it's also possible to run a full LLM on 6GB of VRAM or less.

So there's now another very interesting new #StableDiffusion model out there named #InstructPix2Pix, or, to be more precise, a model that merges Stable Diffusion with a version of #GPT3, meaning with a large language model(!) (https://huggingface.co/timbrooks/instruct-pix2pix/tree/main).

This model modifies images using natural-language prompts similar to what you would use in ChatGPT (e.g. "What would it look like with rain?", "Add fireworks").
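If you want to try the released checkpoint outside a webui, here is a minimal sketch using Hugging Face diffusers. It assumes the diffusers, torch, and Pillow packages and a CUDA GPU; the pipeline class and model id are real, while the input file name is a placeholder:

```python
# Minimal sketch of running the released checkpoint with Hugging Face
# diffusers. Imports are kept inside the function so this file loads
# even without torch/diffusers installed.

def edit_image(image_path: str, instruction: str, steps: int = 20):
    """Apply a natural-language edit instruction to an image."""
    import torch
    from diffusers import StableDiffusionInstructPix2PixPipeline
    from PIL import Image

    pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
        "timbrooks/instruct-pix2pix", torch_dtype=torch.float16
    ).to("cuda")
    image = Image.open(image_path).convert("RGB")
    # image_guidance_scale > 1 keeps the result close to the input image
    result = pipe(
        instruction, image=image,
        num_inference_steps=steps, image_guidance_scale=1.5,
    )
    return result.images[0]

# edit_image("input.png", "Add fireworks")
```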


» Replacing clothes using AI in video. #InstructPix2Pix #EbSynth : #StableDiffusion
https://t.co/k71pfP0PDh

Posted in r/StableDiffusion by u/SweetEliz • 134 points and 13 comments
