A quick look over the fence at the competition: Google has moved its image model Gemini 3.1 Flash Image — known internally by the nickname “Nano Banana 2” — to general availability (GA), together with Gemini 3 Pro Image. In exchange, the existing -preview variants are being shut down.
From preview to production
Nano Banana 2 had been in preview since earlier this year and quickly became a developer favorite — mostly because it pairs fast, high-quality image generation with conversational editing at a mainstream price. That blend of low latency and solid quality is exactly what makes it the efficient counterpart to the bigger Gemini 3 Pro Image.
With the GA step, the preview era is over. If you’re still pointing at gemini-3.1-flash-image-preview or gemini-3-pro-image-preview, migrate soon — the preview endpoints are being retired.
The real highlight: video-to-image
More interesting than the GA label, to me, is a new capability exclusive to Gemini 3.1 Flash Image: video-to-image generation. You pass a video file as multimodal context — along with a text prompt — and the model generates matching stills from it. Google lists use cases like high-quality thumbnails, cinematic posters, and summary infographics.
That’s more than a gimmick. Anyone who regularly produces video content knows the hassle of building good preview images for it. A model that actually understands the clip and derives a coherent keyframe from it saves exactly that tedious in-between step.
My take
clauding.de is mostly about Claude and Anthropic — but the pace Google is setting on image models is hard to ignore. Competition in multimodal models pushes the whole field forward, and in the end we all benefit. Video-to-image is one of those features that sounds unremarkable and can save real time in day-to-day work. I’m curious when similar multimodal bridges become standard elsewhere too.
Sources: Gemini API Release Notes, Google Cloud: Generate images from video with Gemini