Textual inversion

Winkletter • 3 Oct 2022
[Screenshot]

The GUI I’m using for Stable Diffusion has added a tab for training textual inversion embeddings. Previously, I could only use embeddings that others had trained. For example, I can type “concept-art-style” into my prompt because someone trained an embedding on samples of concept art. Now I’m able to train my own styles.

And now that I see it working, I finally understand a bit more about what it’s doing.

It doesn’t actually change the model. Instead, it trains a new text embedding that makes the existing model reproduce the training images I supply. It runs the current embedding through Stable Diffusion, compares the result to the original image, and nudges the embedding closer with each step.

Eventually, it should learn an embedding that recreates the style I want and save it as a roughly 4 KB file I can invoke with a single word.
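
Here’s my rough mental model of that loop, sketched in PyTorch. The `frozen_diffusion_loss` function is a made-up stand-in for the real frozen model, and the numbers are just illustrative; the point is that only one small embedding vector ever receives gradients.

```python
# Conceptual sketch of textual inversion training.
# The diffusion model itself is frozen; only one new token embedding is optimized.
# `frozen_diffusion_loss` is a hypothetical placeholder, not a real API.

import torch

def frozen_diffusion_loss(embedding: torch.Tensor, image: torch.Tensor) -> torch.Tensor:
    """Stand-in for: condition the frozen Stable Diffusion model on `embedding`,
    try to reconstruct `image`, and return the reconstruction error."""
    # Placeholder so the sketch runs; the real loss comes from the frozen U-Net.
    return (embedding.mean() - image.mean()) ** 2

# One new embedding vector -- the "pseudo-word" -- is the only trainable parameter.
# This is the tiny file that gets saved at the end.
new_token = torch.randn(768, requires_grad=True)
optimizer = torch.optim.AdamW([new_token], lr=5e-3)

training_images = [torch.rand(3, 512, 512) for _ in range(12)]  # e.g. my 12 logos

for step in range(100):  # the GUI runs many thousands of these
    for image in training_images:
        loss = frozen_diffusion_loss(new_token, image)
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

# Save only the learned embedding, not the model.
torch.save({"logo-style": new_token.detach()}, "logo-style.pt")
```

In the real implementation the comparison happens on predicted noise in the model’s latent space, but the shape of the loop is the same: the model stays frozen and only the embedding moves.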

This is different from the DreamBooth approach, which fine-tunes the base Stable Diffusion model into an entirely new model. For example, I’ve just downloaded the “waifu diffusion” model, which will now provide me with unlimited anime waifus.
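
To make the difference concrete, here’s a rough sketch using the Hugging Face diffusers library rather than my GUI. The “logo-style.pt” file and its token are placeholders for whatever my training run eventually produces.

```python
from diffusers import StableDiffusionPipeline

# Fine-tuned checkpoint route: a whole new multi-gigabyte model built on the base one.
waifu = StableDiffusionPipeline.from_pretrained("hakurei/waifu-diffusion")

# Textual inversion route: the base model stays as-is; only a tiny embedding is added.
base = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
base.load_textual_inversion("logo-style.pt", token="logo-style")  # hypothetical file

# The new pseudo-word can then be used like any other word in a prompt.
image = base("a coffee shop logo in logo-style").images[0]
image.save("coffee-logo.png")
```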

I’ve been too excited by textual inversion to try that out yet.

For my first training run, I’ve picked 12 logos at random to see if I can create a “logo style.” I’m running 100,000 steps and I’m at step 5,624 right now. I think the whole process will take about 6 hours, but I can pause it at any time.
