OpenAI’s ChatGPT-4 AI Can Understand Photos Too
The rather impressive ChatGPT AI platform has fascinated users of all stripes since it was first unveiled in November of 2022. Now, its latest version has taken a major leap forward in analytical capacity and human-like interaction.
According to OpenAI, the company behind ChatGPT, the newest version of its LLM (Large Language Model) can read photos and explain their contents to users.
The AI research company explains these and other abilities of GPT-4 in its recent research blog post on the subject.
Users were first stunned by ChatGPT-3 and its multitude of abilities, which include summarizing blog posts, generating complex snippets of programming code and giving clear, human-like explanations of deep technical subjects.
ChatGPT-3 could also write whole blog posts, letters, emails and many other types of content, usually at a level of quality that’s hard to distinguish from the work of an average educated, literate human writer.
Compared to the natural language capabilities of nearly all previous AI systems, the things that ChatGPT-3 could do represented an enormous leap forward in machine learning.
One limitation of the previous model, however, was that it accepted only text inputs. ChatGPT-4 removes this limitation by analyzing and interpreting images as well.
As OpenAI explains about GPT-4, “It generates text outputs given inputs consisting of interspersed text and images, over a range of domains — including documents with text and photographs, diagrams, or screenshots — GPT-4 exhibits similar capabilities as it does on text-only inputs.”
A basic example of this ability is the AI system being able to look at a photo with surreal elements and explain why it’s unusual.
For humans, doing this usually isn’t difficult. For AIs, it has been a persistently huge problem. According to OpenAI, it no longer is.
Microsoft, one of the chief investors in the OpenAI organization, has been enthusiastic about the potential of GPT-4. Just a week ago, the Chief Technology Officer of Microsoft Germany announced that GPT-4 will “offer completely different possibilities — for example, videos.”
Despite this, the current capabilities that OpenAI has described for the new ChatGPT edition only refer to photo interpretation. We’ll have to see how video analysis evolves.
Microsoft has reportedly invested more than 10 billion dollars in OpenAI’s efforts, so the company has a fair bit riding on GPT-4’s success.
ChatGPT’s growing popularity, its many practical uses and its ability to simulate human communication and content creation with remarkable fidelity have all fueled fears that it will eliminate many human jobs.
So far though, this doesn’t seem to be the case. Microsoft’s own CEO for operations in Germany, Marianne Janik, recently pushed back on these fears, stating that the platform should be considered a tool for reducing repetitive tasks instead of replacing human work.
According to Janik, “Disruption does not necessarily mean job losses. It will take many experts to make the use of AI value-adding.”
How this pans out in the real world of business cost and benefit calculations is something we’ll see in the coming years.
For now, ChatGPT is definitely useful in certain contexts, but in others it still needs plenty of refinement before it can become a trustworthy substitute for human task handling.