2023-09-15

Whether I can write or not depends on whom you ask. Some say pretty well; others disagree. But one thing that everyone should agree on is that I can’t draw. I did better as a kid, but whatever graphic artistic ability I may have had at one time atrophied long ago. As of this writing, anything I turn out is extremely cartoonish, not much better than stick figures or the like.
But sometimes I need to create an image, either for business, such as a PowerPoint slide to make a point, or for one of my novels, usually something like a map or a sketch I can provide to a more accomplished artist. Using my own pencil, for the reasons stated above, is pretty much out. I’ve had two choices in the past. The first is to find a stock image online that’s in the public domain. The other is to get a friend or relative to help. My nephew, for example, is a pretty good cartoonist.
Now I have a third option, and that is to use generative AI (artificial intelligence) software. I’ve written a couple of times about using generative AI as an aid to creating written text, including some of the risks inherent in using such a program uncritically. Well, AI technology can also be used to generate pictures. I must not be the only artistically challenged person around, because several companies have perceived a need for software that creates images.
Actually, the image-generating AI may well be more advanced than the programs that crank out text.
Versions of image-creating software have been in use by the film industry for years. But it’s only recently that IT providers have started marketing products designed for individual use.
I have tried two image-generative AI platforms. One is called DALL-E and is offered by OpenAI, the same company that offers ChatGPT. The second is a product called MidJourney AI, produced by one of OpenAI’s competitors. DALL-E will sell the customer a certain number of images a la carte. MidJourney offers a free trial, and I’ve used that. I thought that sharing some of my experiences in experimenting with these two platforms might be useful, or at least amusing, for my readers.
I found DALL-E only marginally satisfactory. I wanted to create an image or two from the novel I’m writing now and tried cutting and pasting descriptive text from the work in progress. That approach didn’t work well. The images produced were cartoonish, did not capture the text well, and supplied details I thought inconsistent with the text.
Taking a cue from the platform’s suggestions, I tried adding a directive to use the style of a named artist, in this case Frank Frazetta. The result did not resemble any Frazetta illustration I’ve ever seen. DALL-E also pushed back, for reasons I didn’t understand, on creating some images because it thought they wouldn’t conform to the platform’s “standards.” Because I hadn’t asked for anything offensive, I had no idea what the AI was talking about.
Next, I tried typing in an abbreviated instruction, like “create an image of a dark-haired young woman in a riding habit suitable for fox hunting.” That worked better, but the result still didn’t look the way I wanted the illustration to look. I saved a few of the better images for later use, but I still wasn’t happy.
Then I saw a reference to MidJourney online and looked it up. The provider will, as I wrote above, allow limited trial use of the software without charge, after which the user must pay. I’m told that the cost of unlimited use is $9 per month, but I haven’t subscribed yet. Anyway, I started by trying what hadn’t worked with the other platform. I cut and pasted the description of a character into the dialogue box on my screen, and clicked on the button to “create image.”
That approach worked well, so I tweaked the description to provide more detail, such as the colors of eyes and clothing, and it worked better still. My conclusion is that MidJourney is a smarter AI than DALL-E. The program actually turned out a picture of one of the main characters that looked the way I’d imagined that person.
I haven’t gone any further, and I haven’t fully investigated the risks in using these programs. The webpages for the image-generating AI don’t come with disclaimers and warnings like those for ChatGPT. I’ll have an update in a future column.
