Video Gamer is reader-supported. When you buy through links on our site, we may earn an affiliate commission. Prices subject to change. Learn more
Now that OpenAI have launched its multimodal language model, you might be interested in what GPT-4 image input is capable of.
GPT-4 introduced multimodal models to ChatGPT, and one of the theorized new forms of input is images. Before, ChatGPT could only be trained with textual input, however, advancements in technology have reared it for a total change in paradigm.
Now, ChatGPT Plus users have been granted the ability to input images to the service, so let’s go over how.
Originality AI detector
What can you do with GPT-4 image input?
Best Cyber Monday deals 2023
- WD Black SN850X 1TB- $79.99 (was $180)
- ASUS TUF RTX 4060 Ti OC - $409.99 (was $460)
- Alienware AW2724DM QHD - $327.99 (was $500)
- Razer Huntsman V2 TKL - $79.99 (was $160)
- Logitech G502 Lightspeed Wireless - $89.99 (was $150)
GPT-4 image input allows you to receive natural language, code, instructions, or artificial opinions as a response to a photo.
This means that you’re going to be able to input a unique image, alongside a set of clear instructions, questions, or opinions, and GPT-4 can return a structured answer that uses both sets of data as inputs. For example, you might enter an image of a pattern of shapes, and ask GPT-4 which shape completes the pattern, though of course there are more complex and creative usages possible with the new update.
A better example of this could be sets of graphs or data, and you could extrapolate advanced business strategies based on this information. Uses like this, of course, might also be helped by the new addition of ChatGPT plugins, and the recent implementation of ChatGPT web browsing.
Examples of GPT-4 image processing
This is an example of what GPT-4’s image input and processing can do, via OpenAI’s GPT-4 whitepaper. It takes a popular Reddit post and explains what is funny about it.
Being able to process the image and determining what’s funny about it is honestly a fantastic display of the current advancements in modern technology – and they go further and explain the ‘steerability’ of the new GPT-4 model.
What is steerability?
Steerability is effectively the user’s ability to modify and manipulate the ‘personality’ of the AI. Now, you can proscribe the AI with its own custom behavior, which OpenAI is referring to as a ‘jailbreak.’ You can therefore use steerability to pull natural language responses from GPT that are adhered to what you particularly want.
ChatGPT can process images now
You can now input images to ChatGPT if you’re using GPT-4. You simply have to input the image, whether it’s a drag and drop or a file upload, and you’re going to be able to see ChatGPT’s response.
With ChatGPT now integrating DALL-E 3 into their chatbot, you would expect to be able to use the two modules in conjunction with each other. However, you can’t input images into a DALL-E 3 module chat, nor can you cross-information over.
Can ChatGPT interpret images?
With GPT-4, ChatGPT has the ability to analyze and interpret images.
To give you an example of this in your day-to-day life, you could enlist the help of ChatGPT to help you plan a leftovers meal. By simply taking a photo of items that you have in your cupboard, ChatGPT can give you recipe recommendations based on these images. Pretty impressive, right?
Can GPT-4 generate images?
GPT-4 is strictly a multimodal language model. While it can receive varying forms of data input, it can still only return natural language responses. Yet, despite this, ChatGPT can use DALL-E 3 to generate images, as the generation module is now integrated into the chatbot.
So, while GPT-4 can’t necessarily generate images itself – the service has integrated an image generation module into the UI to make it seamless. However, a current pain-point is that there is no way to crossover chats. For example, you can’t input an image into the same chat that you’re going to use to generate an image with DALL-E 3, nor can you use the Advanced Data Analysis tool while using DALL-E 3. Some users across Reddit have been granted access to a GPT-4 Alpha which allows you to integrate all modules into a single chat.
Can ChatGPT make art?
ChatGPT can make art, though you’ll need a bit of your own creativity for the most interesting results. It’s worth noting that ChatGPT is not an AI art generator that produces stylistic results like Midjourney. However, you can use ChatGPT to generate the text input for Midjourney.
Speaking of text generation, ChatGPT can generate poetry, though these results lean more towards the amusing than the profound or beautiful.
Can ChatGPT create ASCII art?
ASCII art is a popular type of digital art work made from ASCII characters that can be shared easily between online platforms. ChatGPT can create ASCII art, though it will struggle with some of your requests. While it coped fine with creating ASCII art of a cat, we didn’t think it did particularly well with out request of creating ASCII art of the lesser known capybara.
Frequently Asked Questions
Is ChatGPT an image generation AI?
ChatGPT does not make images, but GPT-4 takes them as an input via the API.