AI Image Generators Like Dall.E and Imagen Steal Ideas from Humans

Artificial intelligence is slowly moving into knowledge processing and decision-making, a sphere long thought to belong only to humans. When GPT-3 was released, everyone was amazed at its capacity to churn out words and wondered where it would take the writing community. Then came the image generators Dall.E and Imagen, which the artists' community saw as a threat to steal their ideas or, put otherwise, their jobs. Why wouldn't they create such a panic? Dall.E and Imagen can bring together any combination of objects and creatures, in style and substance, all from a single cue. The AI image generators Dall.E 2 and Imagen, which have created ripples of curiosity and awe, were announced only recently and are still far from public reach. Against the uproar that AI image generators are creating in the artist community, one line of thought suggests otherwise.

Self-attention is all that AI knows:

To put the picture in perspective, let us understand how Imagen works. The image generator is a three-stage transformer model that takes text prompts as input, however incoherent they may be, and generates the corresponding images. A text encoder takes the caption as input and converts the semantic information within the text into a numeric code. The image generation model receives that code and produces a high-resolution image of what it encodes. It is very important that the text encoder captures how the words in the input relate to one another, a method called self-attention. One cannot expect the desired result just by feeding it random words, because a sentence in the English language carries meaning by virtue of its syntactic structure, not just the words it contains. If Imagen paid attention only to the words and not to the syntactic structure of the sentence, the result would be as random as the words it ingests. A prompt like "a dragon holding kitten litter" might result in an image of kitten litter holding a dragon. Whether that sounds funny or meaningful is up to the perceiver, but it is definitely not the right output. Dall.E 2, the latest version of Dall.E, works on a similar principle, except that it can also accept images as input. Its stunning capability to generate coherent images suggests that it has a good grasp of the world and the relationships between objects.
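To make the point about word order concrete, here is a minimal sketch of self-attention in plain Python. It is a toy illustration, not Imagen's actual text encoder: the three-dimensional word vectors are hypothetical, and the learned query, key and value projections of a real transformer are replaced by the identity. What it shows is that attention on its own is blind to word order. Swapping "dragon" and "kitten" in the input only swaps their outputs, which is why real encoders add position information to each word.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Toy scaled dot-product self-attention.

    Assumption: queries, keys and values are the input vectors themselves
    (identity projections); a trained model would learn these projections.
    """
    dim = len(vectors[0])
    outputs = []
    for query in vectors:
        # Score how strongly this word relates to every word in the prompt.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        # Each output is a weighted mix of all the word vectors.
        outputs.append(
            [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
        )
    return outputs


# Hypothetical embeddings, chosen by hand purely for illustration.
EMBEDDINGS = {
    "dragon": [1.0, 0.0, 0.5],
    "holding": [0.0, 1.0, 0.0],
    "kitten": [0.5, 0.0, 1.0],
}


def encode(words):
    return self_attention([EMBEDDINGS[w] for w in words])


original = encode(["dragon", "holding", "kitten"])
swapped = encode(["kitten", "holding", "dragon"])

# With no position information, "dragon" gets the same encoding whether it
# comes first or last, so the two prompts are indistinguishable as a set.
same = all(abs(a - b) < 1e-9 for a, b in zip(original[0], swapped[2]))
print("dragon encoded identically in both orders:", same)
```

In this sketch the encoder cannot tell who is holding whom, which is exactly the failure described above; adding a position-dependent component to each word vector before attention is the usual remedy.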

Can AI Image Generators Get Creative?

Now that it is established that AI image generators take cues from humans to generate images, the most dreaded question is whether they can steal ideas from humans. A more relevant concern is whether they can become creative enough to overtake humans. AI has advanced to the stage it is at today because of three factors: the rise of big data, the emergence of powerful GPUs, and the re-emergence of deep learning, which holds huge promise for taking AI towards AGI, or self-reasoning AI. Though self-reasoning and creativity are quite different things, it is believed that they are deeply related. Imagine a self-driving car having to resolve a situation it was not trained for: it has to come up with an out-of-the-box solution. Forget about creative thinking; industry experts believe that we are only in the early days of teaching machines deep reasoning. Ada Lovelace, the English mathematician and writer, wrote, "The Analytical Engine has no pretensions whatever to originate anything. It can do whatever we know how to order it to perform." In contrast, there is a great deal of news about AI becoming creative, not just generating images but pursuing other artistic interests too. As AI image generators move from generating dreamlike images to witchy and realistic ones, they are creating a lot of room for speculation about them overtaking humans. Apart from a few implications for the advertising and film industries, there is nothing to fear from Dall.E 2, its predecessor Dall.E, or Imagen, each of which has drawbacks that keep it from even being called a holistic image generator.

Analytics Insight
www.analyticsinsight.net