Monsters and Manuals: My Shoe is Safe

Thursday, 19 December 2024

My Shoe is Safe

Yesterday's quiz was a toughie. In it, to recap, commenters were encouraged to guess at what the following pieces of art, generated by Substack's own AI image generator, represented. Here, to begin with, are the answers:

1. Minotaur

2. Troll

3. Orc

4. Githyanki

5. Kuo-toa

6. Skeleton

7. Kobold

8. Tarrasque

9. Umber hulk

10. Pit fiend

11. Vampire

12. Lich (yeah, your guess is as good as mine as to how it came up with this)

13. Dragon

14. Ogre

15. Manticore

16. Hydra

17. Beholder

Ok, I confess the last one was a bit of a cheat (although a fascinating experiment, in its own way).

What I find interesting about these kinds of exercises is that they reinforce the point that AI is not 'intelligent'. It doesn't actually 'know' things. It isn't capable of assessing whether what it produces actually looks like what it is supposed to look like, because it isn't capable of assessing anything. It does not see what it produces, because it does not 'see' at all. And it does not exercise judgment because it cannot judge. It has no taste in the most literal sense.

With that said, it does come up with curiosities and some images which are thought-provoking. These pictures are mostly dross - a kind of implausibly well-executed dross, the kind of dross that a human being would never produce (try to imagine a human artist creating a piece that is simultaneously as competently executed and as spectacularly misjudged as the kuo-toa or ogre images above). And the ones that are interesting are interesting mostly because they cause one to reflect on the nature of AI art itself and the way it works - note for instance that the 'minotaur' patently isn't a minotaur, but the programme nonetheless 'knew' that vaguely Grecian columns ought to be involved and it also incorporated what appear to be sheep (which I took to be some sort of mangled intrusion of the cyclops myth).

But there are one or two images which at least spur the imagination - observe, for example, the tiny figure that is standing on top of the 'umber hulk'. Who is that and what is he doing riding around on that thing? And who are those different entities in the 'lich' picture? What is that little homunculus next to the 'vampire' up to? What is that a 'skeleton' of?

I also have to confess I do rather like the 'githyanki'. It looks nothing like a githyanki, but Satan didn't do a bad job with it all the same.

11 comments:

waywardwayfarer, 20 December 2024 at 00:57
I still can't get over that weird agglomeration of dragon parts, but on the other hand, the kobold as wide-eyed lizard-mouse is a rather refreshing take after so many tedious iterations of WotC's mini-dragonborn interpretation.
Simulated Knave, 20 December 2024 at 04:18
Other than the hair (which could be fixed quite easily) I rather like that ogre.

I'd say the githyanki, ogre, lich, pit fiend, kobold, and even the hydra all made me think interesting thoughts about the monster.
Anonymous, 20 December 2024 at 19:44
If you had the masochistic patience (and were willing to pay to use some of them, which obviously you wouldn’t) it would be interesting to see how the results varied across different AI art programs. - Jason Bradley Thompson
Anonymous, 20 December 2024 at 23:51
See, this is WHY you need to consider AI as part of a tool set that interacts with the human eye and mind, and not as the completed product by itself. You can make decent, cheapo illustrations of monsters with AI, but it needs more than just inputting a few meagre words.

This is just AI slop at its most brutish and stylistically lame. Believe me, I get it: you're the anti-AI guy (and for many good reasons, so I don't blame you), but there's definitely a better use of the tech. It does, however, require human refinement and experimentation in the prompting process. Just inputting 'ogre' or 'troll' will lead to lackluster and generic results.
Matt Halton, 22 December 2024 at 03:23
I've been experimenting with making shoggoths and other Lovecraftian monsters in Bing Image Creator, in the style of 1920s pulp illustrations. Works pretty well. A shoggoth is an amorphous blob so it doesn't matter if the machine gets some of the details wrong.

Try going to Bing and putting in something like "skeleton fighting a minotaur. 1920s pulp illustration". Results will be surreal but more aesthetically appealing than these.
Anonymous, 10 January 2025 at 04:36
Personally I love how the ogre’s skulls at the bottom of the picture have hard angles - one’s cranium is reminiscent of an icosahedron. They are faintly die-like.