Introduction
Stable Diffusion is an open-source text-to-image AI model that can generate striking images from a text prompt in seconds. However, getting it running can take a bit of work. This is the first in a series of three articles, covering everything from installation to personalized prompts to the future that AI image generation implies.
1. Installing Stable-Diffusion WebUI locally
Getting Stable Diffusion running locally on your machine gives you complete control over all the elements, plus access to a wealth of extensions and options.
2. Training Personalized Models with Dreambooth
You may need to rent a GPU server powerful enough to train your personalized model (around $0.50 per hour), although if you have a powerful machine locally you might avoid this.
3. The Future Applications and Pitfalls
This third part of the series will chart the potential future of text-to-image AI, the copyright issues, and the backlash that this technology is already facing.
“Boston university in the 1920s, ultra-realistic, colour, concept art, intricate details, night, thunder, raining, eerie, Arkham Horror, Call of Cthulhu, elder sign, highly detailed, dark fantasy, photorealistic, octane render, 8k, unreal engine 5. art by artgerm and Greg Rutkowski and Alphonse Mucha”
Limitations
While Stable Diffusion will be running locally on your machine, we will be using a paid-for server online to run Dreambooth. The reason for this is the need for a very high-power GPU to run the training: the algorithm requires a GPU with a minimum of 24 GB of VRAM. While the EVGA GeForce RTX 3090 FTW3 is a great option if you can get your hands on one, the price of a new one at the time of writing is in excess of $1,500 USD.
Secondly, this tutorial has been tested mostly on Windows with an NVIDIA-based GPU, although there are links to pages that explain the process for Mac/Linux and AMD-based GPU machines.
After this first how-to is complete, you’ll be able to generate landscapes, scenes, famous people, and iconic settings on your local machine. If you want to generate images with your own face, you’ll probably need to run a remote per-hour server to do so. The cost is very low: you can generate many training models for less than $5. More on that later.
“((path in the forest)), leading to a dark cave!!! entrance, exquisite masterpiece painting by Vermeer, trending on artstation”
Local System Requirements
First, install all the various programs and programming systems we will need to run Stable-Diffusion locally:
- Python: https://www.python.org/downloads/release/python-3109/
- Anaconda: https://www.anaconda.com/
- Git: https://gitforwindows.org/
Create a folder on your local machine where you want to store the models and generate your images (make sure the disk drive has enough space for all this!).
- I will use (D:\AI), but the name and location are arbitrary; I merely have a very full C:\ drive, so I used my secondary one.
Files to download
- Hugging Face https://huggingface.co/
- You’ll need a free account to download the model
- Stable Diffusion Model Weights 1.4 (4 GB)
- GFPGAN Face Correction 1.3
Install Stable Diffusion WebUI
Stable Diffusion WebUI is not strictly necessary; you can run everything from the command line. However, the local web interface running in your browser makes the process far easier.
What does this tutorial cover?
There are lots of forks (versions) of the Stable Diffusion WebUI. Many tutorials assume an NVIDIA GPU with CUDA, which many Macs don’t have, but don’t despair: this tutorial aims to help you install it on Mac or Windows.
If you have trouble with your particular system or installation, please visit the Stable Diffusion subreddit and see if someone has documented a setup that matches your local machine.
One-click installer for Windows and a macOS alternative
While this article walks through the more detailed installation process for the Stable Diffusion WebUI project, the community is evolving rapidly, and an easier installation method is something most people want.
There is A1111’s Stable Diffusion WebUI Easy Installer, which automates most of the download/install steps below. If it works for you, great. If not, the manual process is not so bad.
Installing on a machine with an Nvidia GPU
Search and launch “Anaconda Prompt” from the Start Menu.
- Go into your new folder (D:\AI).
- To change drive in the Anaconda Prompt, just type the drive letter: “D:”
- To change directory, use the cd command
cd D:\AI
- Clone Stable-Diffusion-webui
- Download “Stable Diffusion Model Weights 1.4”
- You will need to create a free Hugging Face account to download sd-v1-4.ckpt
- Rename the downloaded file from “sd-v1-4.ckpt” to “model.ckpt”.
- Move the newly renamed “model.ckpt” file to the Stable Diffusion “Models” folder (D:\AI\stable-diffusion-webui\models).
- Download the GFPGAN File.
- GFPGAN Face Correction 1.3
- Move the “GFPGANv1.3.pth” file to the root Stable Diffusion WebUI folder (D:\AI\stable-diffusion-webui).
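The clone/rename/move steps above can be sketched as shell commands. This is an illustrative sketch only: dummy files stand in for the real multi-gigabyte downloads, the folder name “AI” stands in for your chosen location (e.g. D:\AI), and on Windows you would use the Anaconda Prompt with move instead of mv.

```shell
# Dummy layout standing in for the real install (illustration only).
# In the real install, the folder comes from:
#   git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
mkdir -p AI/stable-diffusion-webui/models

# Stand-ins for the downloaded model weights and GFPGAN file.
touch AI/sd-v1-4.ckpt AI/GFPGANv1.3.pth

# Rename the weights to model.ckpt and move them into the models folder.
mv AI/sd-v1-4.ckpt AI/stable-diffusion-webui/models/model.ckpt

# GFPGAN goes in the root of the WebUI folder.
mv AI/GFPGANv1.3.pth AI/stable-diffusion-webui/
```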
Installing with an AMD GPU (common on Macs)
You’ll need to run these extra steps if you’re using an AMD GPU, which is very common on Mac devices. This is based on the following guide: https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Install-and-Run-on-AMD-GPUs
Launch “Anaconda Prompt” from the Start Menu.
- Go into your new folder (D:\AI).
- To change drive in the Anaconda Prompt, just type the drive letter: “D:”
- To change directory, use the cd command
cd D:\AI
- Clone Stable-Diffusion-webui
- Download “Stable Diffusion Model Weights 1.4”
- You will need to create a free Hugging Face account to download sd-v1-4.ckpt
- Rename the downloaded file from “sd-v1-4.ckpt” to “model.ckpt”.
- Move the newly renamed “model.ckpt” file to the Stable Diffusion “Models” folder (D:\AI\stable-diffusion-webui\models).
- Download the GFPGAN File.
- GFPGAN Face Correction 1.3
- Move the “GFPGANv1.3.pth” file to the root Stable Diffusion WebUI folder (D:\AI\stable-diffusion-webui).
- In the root folder (D:\AI\stable-diffusion-webui), run the following terminal commands
For the Mac
We needed to do the following from the command line.
python -m venv venv
source venv/bin/activate
python -m pip install --upgrade pip wheel
# It's possible that you don't need "--precision full"
export TORCH_COMMAND='pip install torch torchvision --extra-index-url https://download.pytorch.org/whl/rocm5.1.1'
python launch.py --precision full --no-half --skip-torch-cuda-test
Issues and problems
There is an abundance of systems, graphics cards, and potential issues when installing something this complex. Again, if you have trouble with your particular system or installation, please visit the Stable Diffusion subreddit and see if someone has documented a setup, or linked to a how-to, that matches your local machine.
As stated before, if you’re not using an NVIDIA GPU, or you are on a Mac, it can be more challenging (but it is possible).
Step 2 - Run Webui
If you’re on Windows
Run (double-click) “webui-user.bat” inside the “stable-diffusion-webui” folder (this will take quite a while the first time you run the .bat).
If you’re on a Mac
Run “./webui.sh” from inside the “stable-diffusion-webui” folder (this will take quite a while the first time you run the file). If you don’t want to run it from the terminal every time, do the following:
- Copy webui.sh and name the copy webui.command
- Open the terminal and navigate to the root of the stable-diffusion-webui folder
- Run the command chmod +x webui.command
You can now open the WebUI by double-clicking webui.command instead of using the terminal.
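The steps above can be sketched as shell commands. A dummy folder and placeholder webui.sh are created here so the sketch is self-contained; in practice you would run the cp and chmod lines from inside your real stable-diffusion-webui folder.

```shell
# Placeholder folder and script so the sketch runs end-to-end;
# in a real install these already exist from the git clone.
mkdir -p stable-diffusion-webui
printf '#!/bin/sh\necho launching\n' > stable-diffusion-webui/webui.sh

# Copy webui.sh to webui.command so Finder can launch it by double-click.
cp stable-diffusion-webui/webui.sh stable-diffusion-webui/webui.command

# Mark the copy as executable.
chmod +x stable-diffusion-webui/webui.command
```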
Open a web browser and go to: http://127.0.0.1:7860
Step 3 - Play
Now you should have Stable Diffusion running on your machine! In the Txt2Img prompt box, you can type a description and generate an image from it. We found these example prompts on the website www.lexica.art
There are a few settings you’ll need to know, but we’ll cover those in future articles. The Stable Diffusion WebUI has over eight tabs covering extensions, training, and various other features.
The prompt
This is where you type the description of the image you want to create. There is also a negative prompt.
Negative prompt
These are things which you don’t want to see in your generated image. We found a list for reducing some of the worst issues.
3d, ugly face, (disfigured), (bad art), (deformed), (poorly drawn), (extra limbs), strange colours, blurry, boring, sketch, lacklustre, repetitive, cropped, hands
Other settings
The Batch Count and Batch Size settings are ones to watch: they set the number of batches and the number of images per batch. On a low-end GPU, we found that only a batch size of 2 worked, while on a gaming machine a batch of 8 images takes a few minutes.
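If larger batches fail with out-of-memory errors on Windows, one option worth trying is the WebUI’s --medvram (or --lowvram) flag, passed via the COMMANDLINE_ARGS line of webui-user.bat. This is a sketch of that file; check it against the version your install shipped with, as the exact contents may differ.

```bat
@echo off
set PYTHON=
set GIT=
set VENV_DIR=
rem --medvram trades speed for lower VRAM use; --lowvram is more aggressive still
set COMMANDLINE_ARGS=--medvram
call webui.bat
```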
Local Machine - Prompt 1
“Baroque oil painting of key visual portrait concept art of anime maids entrenched in the great war, brutalist, dark fantasy, rule of thirds golden ratio, fake detail, trending pixiv fanbox, acrylic palette knife, style of Makoto Shinkai studio Ghibli Genshin impact Jamie Wyeth James Gilleard, Greg Rutkowski, Chiho Aoshima“
Local Machine - Prompt 2
“Gandalf, D&D, fantasy, intricate, elegant, highly detailed, digital painting, artstation, concept art, matte, sharp focus, illustration, hearthstone, art by artgerm and Greg Rutkowski and Alphonse Mucha “
Finishing Points
This tutorial covered Stable Diffusion 1.4, and things have recently changed with the newer version 2. We’ll cover more about the evolution of Stable Diffusion tech and AI art in the third article.
Useful Websites
- Hugging Face - https://huggingface.co/
- Birme Bulk Image Resizing - https://www.birme.net/?target_width=512&target_height=512
- Openart prompt guide (pdf) https://openart.ai/promptbook