In my previous post, I discussed how humans, whether they are artists or laymen, are using Artificial Intelligence to create digital works of art. Currently, there are several AI art generation tools to choose from, such as DALL-E 2, DreamStudio (Stable Diffusion), Midjourney, NightCafe, and Prodia, just to name a few. Some of these programs are free to use, but many charge a subscription fee (monthly or yearly).
Eager to try out this new technology and see how far it could be pushed, I chose to experiment with Bing Image Creator, a highly regulated version of DALL-E 2 that is free to use with a Microsoft account. The way this tool works is quite simple: first you type in a word or sentence describing an idea you have, then you click the “create” button. The time it takes for the AI algorithm to generate art from your idea depends on how specific your request is.
For my first generation request, I simply typed in the word “dog.” After a minute of waiting, the AI program generated the four images of a dog that you see below.
Initially, I was very excited that the resulting images actually looked like a dog. Gradually, however, questions arose: “Why did the AI generate images of these particular dog breeds? Why did it choose to generate hyper-realistic images of dogs as opposed to, say, hand-drawn illustrations or 3D models? Why are they all headshots and not full-body views?” I understand that users must be more precise with their words in order to get varied results, but I wonder why these particular images (of dogs) are the program’s defaults. Interestingly, when I reverse-image-searched a few of these generations through Google, I found that a few websites were using very similar images of these dogs. In fact, some images had the Bing AI art watermark in the lower left-hand corner.
Next, I decided to repeat the same request, curious to see whether the AI would generate art of the same dog breed. Once again, four images appeared; the color, lighting, and position of the dog’s head were all the same. Perhaps the programmers (or those who built the original algorithm) chose a Retriever as the AI’s default idea of a “dog.”
Eager to create something different, I decided to include the additional detail “with alien” in my original search. In these new generations, I finally got four different breeds of dogs, different species of aliens, and varied head positions. I was astonished by the uniqueness of each alien’s features (the number of eyes, the colors of their leathery skin), and I especially enjoyed the expressions on each of the dogs’ faces (some scared and others confused).
Subsequently, I decided to modify the sentence even further by adding the words “playing catch.” It was at this point that I started to notice the AI program seemed to struggle with merging several figures into one image. For example, you may notice distortions around the eyes in some of the dogs’ faces. I was intrigued that the program interpreted the request “Dog playing catch with alien” in multiple ways. In two images, a dog and an alien are playing catch with a ball (just as I requested), but in another image, it looks like the alien has taken on a football shape and is perhaps being caught by the dog (like a chew toy). I also found it interesting that in two images the AI chose to include a UFO, even though I never requested one in the original prompt.
In the final step of my experiment with AI-generated art, I added a few more words to this ever-growing sentence: “Dog playing catch with alien at Fenway Park, photograph.” By adding the word “photograph,” I hoped to make the final image appear more realistic, with no blurring and crystal-clear detail, rather than a digital illustration with painterly brushstrokes. In the end, I am very happy with how three of the four art generations turned out. In each image, the viewer can clearly see that there is a dog and an alien throwing a ball back and forth, that the location is a baseball stadium, and that the AI used the correct colors of Fenway Park. The AI program really pushed itself to create dynamic movement in both the dog and alien bodies (specifically outstretched arms and bent knees). One question I would have for the AI artist is, “Why are all of the dogs portrayed in profile view and not three-quarter view?” I wonder if the program is capable of producing images where the dog has its back towards the camera.
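The experiment above boils down to iterative prompt refinement: start with a single noun and keep adding detail until the output matches your intent. As a purely illustrative sketch of that workflow, here are the four prompts from this post fed through a placeholder function (Bing Image Creator is driven through its web page and has no public API, so `generate_images` below is a hypothetical stand-in, not a real call):

```python
from typing import List

def generate_images(prompt: str, n: int = 4) -> List[str]:
    """Hypothetical stand-in for an image generator.

    A real backend would return n generated images; here we just
    return n labeled placeholders so the workflow is runnable.
    """
    return [f"{prompt} (image {i + 1})" for i in range(n)]

# The four prompts from the experiment, each more specific than the last.
prompts = [
    "Dog",
    "Dog with alien",
    "Dog playing catch with alien",
    "Dog playing catch with alien at Fenway Park, photograph",
]

for prompt in prompts:
    images = generate_images(prompt)
    print(f"{prompt!r} -> {len(images)} images")
```

The design point mirrors the experiment: each prompt keeps the whole earlier request and appends one new constraint (a second figure, an action, a location and medium), so you can see exactly which added words changed the output.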
If you would like to learn more about Bing AI, click this link:
https://www.bing.com/images/create/help?FORM=GENHLP
For those of you who are unfamiliar with the concept of AI Art, check out this article:
https://www.techtarget.com/searchenterpriseai/definition/AI-art-artificial-intelligence-art