New open models unlock deep video comprehension with novel features like video tracking and multi-image reasoning, accelerating AI research toward a new generation of multimodal intelligence.
Meta’s Llama 3.2 has been developed to redefine how large language models (LLMs) interact with visual data. By introducing a groundbreaking architecture that seamlessly integrates image understanding ...
Building on a previous model called UniGen, a team of Apple researchers is showcasing UniGen 1.5, a system that can handle image understanding, generation, and editing within a single model. Here are ...
Over the past two years, AI-powered image generators have more or less become commodified, thanks to the widespread availability of the tech and the decreasing technical barriers around it. They’ve been ...
OpenAI’s GPT-4V is being hailed as the next big thing in AI: a “multimodal” model that can understand both text and images. This has obvious utility, which is why a pair of open source projects have ...
OpenAI has continually expanded its ChatGPT offerings, adding an AI voice assistant, file and image understanding, advanced research capabilities, AI agents, and more. However, there was one glaring ...
After seizing the summer with a blitz of powerful, freely available new open-source language and coding-focused AI models that matched or in some cases bested closed ...
Fresh off releasing the latest version of its Olmo foundation model, the Allen Institute for AI (Ai2) launched its open-source video model, Molmo 2, on Tuesday, aiming to show that smaller, open ...