So for the past month or so I’ve been messing around with AI art generation, both img2img and text-to-img.
Once I upgraded my computer, I suddenly had a GPU powerful enough to run it on my desktop, and then things really ramped up. I’ll try to answer the question in the headline, but otherwise the only point of this blog post is a stream-of-consciousness ramble about AI art while sharing assorted results. If you have any questions, post them in the comments and I’ll make a separate blog post specifically about that.
My opinions on AI art are very much middle of the road. I think it is a powerful tool for generating art of a concept you can describe. It is also “dangerous,” because if it can consistently create high-quality images, it could completely destroy the already extremely challenging-to-navigate ecosystem of commercial art. It also sometimes generates extremely disturbing imagery you didn’t ask for: malformed, mangled bodies that can only be described as demonic. Third eyes, horns, pale skin, whatever image popped into your head when you read the word “demonic” has probably been generated at least once from a prompt as innocuous as “smiling caucasian boy with short hair.” (I don’t have an example. I think it’s perfectly reasonable not to save images of demons on your computer.)
In general, while there is a possible timeline where AI art takes over a significant chunk of the art market, I think it’s more likely that it’ll be used in niche applications and be unable to scale fast enough to make a dent. I could be wrong, but I doubt my tinkering in the margins will make a difference one way or the other.
I’ve been using it mostly to up the quality of my own art and to come up with concepts for potential projects. Also, I just think it’s neat.
I’ve mostly been using WaifuDiffusion and StableDiffusion as my go-to models, and I prefer WaifuDiffusion because I am mostly concerned with anime-style art. NovelAI does a good job with anime art, but I don’t want to pay for it. Technically I could use the leaked model they’re using, but it feels borderline illegal, so I shy away from it unless I’m just not getting what I’m asking for from other models. WaifuDiffusion is trained on images from Danbooru, an imageboard that meticulously categorizes anime art but also doesn’t shy away from porn. My experience has been that if you don’t ask for NSFW images, you don’t get them.
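If you’re curious what running one of these models locally actually looks like, here’s a rough sketch using the Hugging Face diffusers library. To be clear, this is illustrative only: the model ID is the public WaifuDiffusion checkpoint on the Hugging Face hub, but the prompt tags and settings are made-up examples, not anything I actually ran.

```python
# Rough text-to-img sketch with the Hugging Face diffusers library.
# The prompt tags and settings here are illustrative, not my actual workflow.
import torch
from diffusers import StableDiffusionPipeline

# WaifuDiffusion is published on the Hugging Face hub as "hakurei/waifu-diffusion";
# swap in a base Stable Diffusion checkpoint like "CompVis/stable-diffusion-v1-4" instead.
pipe = StableDiffusionPipeline.from_pretrained(
    "hakurei/waifu-diffusion",
    torch_dtype=torch.float16,
).to("cuda")  # this is where the GPU upgrade matters

# WaifuDiffusion responds well to Danbooru-style tags.
prompt = "1girl, smiling, pink dress, low twintails, hair scrunchie, best quality"
negative_prompt = "lowres, bad anatomy, extra limbs, nsfw"  # steer away from the demons

image = pipe(
    prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=30,
    guidance_scale=7.5,
).images[0]
image.save("anna_attempt.png")
```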
Anyway, now I want to talk about results. Unfortunately, I don’t have the prompts for any of the example images below. These were generated a while ago and are the result of repeated trial and error and tweaking. If I had known I was going to make this blog post, I would have documented my process more carefully.
Anna
Anna is the main protagonist of my webcomic, found here. In some ways she is an easy character to generate, and in other ways a difficult one. There is a wealth of examples of cute anime girls with pigtails who wear pink that the AI was undoubtedly fed. The problem is when you get into the details. I didn’t intend for Anna to wear the same clothes throughout the comic, but her current costume is a dress over an undershirt in two tones of pink, and apparently that is way too difficult for the AI to interpret correctly. Because of this, text-to-img generation gave lots of close, but also totally wrong, results:
Also, she has two pigtails tied low. It’s not especially uncommon in anime, but it’s way less common than normal pigtails, so the AI preferred to generate those instead. The pigtails are also supposed to be tied with scrunchies, which the AI did not know what to do with.
That said, sometimes, despite not being perfect, the result is close enough that I can declare success. I might end up just making this her dress in future chapters:
I had better success with img2img, though it’s still hit or miss. I turned this image:
Into this image:
You can see another example here:





Also, I accidentally made Vikki from Kamen America. To be honest, I don’t understand why my prompt made the AI add glasses and a green headband to the image, but because Mark Pellegrini retweets every reference to Kamen America he finds, I will shamelessly and cynically include it on the off chance he reads this:
Tracy
Tracy is another protagonist of my comic. Her design is a more generic redhead tomboy, just a t-shirt and leggings/shorts with very few creative additions, so it was way easier to generate images of her.
I also experimented more with different costumes for her. I think it’s because it took no time at all to get her base costume down, so I could move on to other themes faster, whereas for Anna I had to spend iteration after iteration just trying to find a way to get the prompt to understand that she’s got a pink shirt and a pink dress.
Img2img was also successful.
In this case, AI has helped me brainstorm new ways to take Tracy’s design going forward.
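For anyone wondering what img2img actually does under the hood: you hand the model a starting image along with a prompt, and a strength setting controls how far the result is allowed to drift from the original. Here’s a rough sketch with the same diffusers library; the file names, prompt, and strength value are made-up placeholders, not what I actually used.

```python
# Rough img2img sketch with diffusers. File names, prompt, and strength are placeholders.
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "hakurei/waifu-diffusion",
    torch_dtype=torch.float16,
).to("cuda")

# Start from an existing drawing and let the model redraw it.
init_image = Image.open("tracy_sketch.png").convert("RGB").resize((512, 512))

result = pipe(
    prompt="1girl, red hair, tomboy, t-shirt, shorts, best quality",
    image=init_image,
    strength=0.55,      # lower keeps more of the original drawing, higher drifts further
    guidance_scale=7.5,
).images[0]
result.save("tracy_img2img.png")
```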
I have more images to share, but Substack is complaining that this is getting too big for emails, and it’s already gone on for a bit longer than expected, so I think this is a good place to stop for now; I’ll pick it up again in another blog post. I’ll leave you with a showcase of the disastrous failures. Till next time, catch you later.