Gemini 2.5 Flash Image
(gemini-2.5-flash-image), also known as Nano Banana, can generate images in addition to text. This expands Gemini's capabilities to include the following:
- Iteratively generate images through conversation with natural language, adjusting images while maintaining consistency and context.
- Generate images with high-quality long text rendering.
- Generate interleaved text-image output. For example, a blog post with text and images in a single turn. Previously, this required stringing together multiple models.
- Generate images using Gemini's world knowledge and reasoning capabilities.
With this release, Gemini 2.5 Flash Image can generate images at 1024 px, supports generating images of people, and contains updated safety filters that provide a more flexible and less restrictive user experience.
It supports the following modalities and capabilities:
Text to image
- Example prompt: "Generate an image of the Eiffel tower with fireworks in the background."
Text to image (text rendering)
- Example prompt: "generate a cinematic photo of a large building with this giant text projection mapped on the front of the building: "Gemini 2.5 can now generate long form text""
Text to image(s) and text (interleaved)
- Example prompt: "Generate an illustrated recipe for a paella. Create images alongside the text as you generate the recipe."
- Example prompt: "Generate a story about a dog in a 3D cartoon animation style. For each scene, generate an image"
Image(s) and text to image(s) and text (interleaved)
- Example prompt: (With an image of a furnished room) "What other color sofas would work in my space? Can you update the image?"
Best practices
To improve your image generation results, follow these best practices:
Be specific: More details give you more control. For example, instead of "fantasy armor," try "ornate elven plate armor, etched with silver leaf patterns, with a high collar and pauldrons shaped like falcon wings."
Provide context and intent: Explain the purpose of the image to help the model understand the context. For example, "Create a logo for a high-end, minimalist skincare brand" works better than "Create a logo."
Iterate and refine: Don't expect a perfect image on your first attempt. Use follow-up prompts to make small changes, for example, "Make the lighting warmer" or "Change the character's expression to be more serious."
Use step-by-step instructions: For complex scenes, split your request into steps. For example, "First, create a background of a serene, misty forest at dawn. Then, in the foreground, add a moss-covered ancient stone altar. Finally, place a single, glowing sword on top of the altar."
Describe what you want, not what you don't: Instead of saying "no cars", describe the scene positively by saying, "an empty, deserted street with no signs of traffic."
Control the camera: Guide the camera view. Use photographic and cinematic terms to describe the composition, for example, "wide-angle shot", "macro shot", or "low-angle perspective".
Prompt for images: Describe the intent by using phrases such as "create an image of" or "generate an image of". Otherwise, the multimodal model might respond with text instead of the image.
Limitations:
- For best performance, use the following languages: en, es-MX, ja-JP, zh-CN, hi-IN.
- Image generation doesn't support audio or video inputs.
- The model might not create the exact number of images that you ask for.
- For best results, include a maximum of three images in an input.
- When generating an image that contains text, first generate the text and then generate an image with that text.
- Image or text generation might not work as expected in these situations:
  - The model might only create text. If you want images, clearly ask for images in your request. For example, "provide images as you go along."
  - The model might create text as an image. To generate text, specifically ask for text output. For example, "generate narrative text along with illustrations."
  - The model might stop generating content even when it's not finished. If this occurs, try again or use a different prompt.
  - If a prompt is potentially unsafe, the model might not process the request and instead returns a response indicating that it can't create unsafe images. In this case, the FinishReason is STOP.
Generate images
The following sections cover how to generate images using either Vertex AI Studio or the API.
For guidance and best practices for prompting, see Design multimodal prompts.
Console
To use image generation:
- Open Vertex AI Studio > Create prompt.
- Click Switch model and select gemini-2.5-flash-image from the menu.
- In the Outputs panel, select Image and text from the drop-down menu.
- In the Write a prompt text area, describe the image that you want to generate.
- Click the Prompt () button.
Gemini generates an image based on your description. This takes a few seconds, but can be slower depending on capacity.
Python
Install
pip install --upgrade google-genai
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True
Node.js
Install
npm install @google/genai
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True
Java
Learn how to install or update the Java SDK.
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True
REST
Run the following command in the terminal to send the request and write the response to a file named response.json in the current directory:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://${API_ENDPOINT}:generateContent \
-d '{
  "contents": {
    "role": "USER",
    "parts": {"text": "Create a tutorial explaining how to make a peanut butter and jelly sandwich in three easy steps."}
  },
  "generation_config": {
    "response_modalities": ["TEXT", "IMAGE"],
    "image_config": {
      "aspect_ratio": "16:9"
    }
  },
  "safetySettings": {
    "method": "PROBABILITY",
    "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
    "threshold": "BLOCK_MEDIUM_AND_ABOVE"
  }
}' 2>/dev/null >response.json
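The curl command writes the raw API response to response.json; generated images arrive base64-encoded in inlineData fields. The following sketch decodes them with only the standard library (field names assume the standard generateContent response shape):

```python
import base64
import json


def extract_inline_images(response):
    """Return (mime_type, raw_bytes) for every inline image part in a
    generateContent response dict."""
    images = []
    for candidate in response.get("candidates", []):
        for part in candidate.get("content", {}).get("parts", []):
            blob = part.get("inlineData")
            if blob and blob.get("mimeType", "").startswith("image/"):
                images.append((blob["mimeType"], base64.b64decode(blob["data"])))
    return images


# Usage, once response.json exists:
# with open("response.json") as f:
#     for i, (mime, data) in enumerate(extract_inline_images(json.load(f))):
#         with open(f"image_{i}.{mime.split('/')[-1]}", "wb") as out:
#             out.write(data)
```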
Gemini generates an image based on your description. This takes a few seconds, but can be slower depending on capacity.
Generate interleaved images and text
Gemini 2.5 Flash Image can generate interleaved images with its text responses. For example, you can generate images of what each step of a generated recipe might look like to go along with the text of that step, without having to make separate requests to the model to do so.
Console
To generate interleaved images with text responses:
- Open Vertex AI Studio > Create prompt.
- Click Switch model and select gemini-2.5-flash-image from the menu.
- In the Outputs panel, select Image and text from the drop-down menu.
- In the Write a prompt text area, describe what you want to generate. For example, "Create a tutorial explaining how to make a peanut butter and jelly sandwich in three easy steps. For each step, provide a title with the number of the step, an explanation, and also generate an image; generate each image in a 1:1 aspect ratio."
- Click the Prompt () button.
Gemini generates a response based on your description. This takes a few seconds, but can be slower depending on capacity.
Python
Install
pip install --upgrade google-genai
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True
Java
Learn how to install or update the Java SDK.
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True
REST
Run the following command in the terminal to send the request and write the response to a file named response.json in the current directory:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://${API_ENDPOINT}:generateContent \
-d '{
  "contents": {
    "role": "USER",
    "parts": {"text": "Create a tutorial explaining how to make a peanut butter and jelly sandwich in three easy steps. For each step, provide a title with the number of the step, an explanation, and also generate an image, generate each image in a 1:1 aspect ratio."}
  },
  "generation_config": {
    "response_modalities": ["TEXT", "IMAGE"],
    "image_config": {
      "aspect_ratio": "1:1"
    }
  },
  "safetySettings": {
    "method": "PROBABILITY",
    "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
    "threshold": "BLOCK_MEDIUM_AND_ABOVE"
  }
}' 2>/dev/null >response.json
Gemini generates an image based on your description. This takes a few seconds, but can be slower depending on capacity.
Edit images
Gemini 2.5 Flash Image (gemini-2.5-flash-image) can edit images in addition to generating them. It supports improved image editing and multi-turn editing, and contains updated safety filters that provide a more flexible and less restrictive user experience.
It supports the following modalities and capabilities:
Image editing (text and image to image)
- Example prompt: "Edit this image to make it look like a cartoon"
- Example prompt: [image of a cat] + [image of a pillow] + "Create a cross stitch of my cat on this pillow."
Multi-turn image editing (chat)
- Example prompts: [upload an image of a blue car] "Turn this car into a convertible."
- [Model returns an image of a convertible in the same scene] "Now change the color to yellow."
- [Model returns an image with a yellow convertible] "Add a spoiler."
- [Model returns an image of the convertible with a spoiler]
Edit an image
Console
To edit images:
- Open Vertex AI Studio > Create prompt.
- Click Switch model and select gemini-2.5-flash-image from the menu.
- In the Outputs panel, select Image and text from the drop-down menu.
- Click Insert media () and select a source from the menu, then follow the dialog's instructions.
- In the Write a prompt text area, describe the edits that you want to make to the image.
- Click the Prompt () button.
Gemini generates an edited version of the provided image based on your description. This takes a few seconds, but can be slower depending on capacity.
Python
Install
pip install --upgrade google-genai
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True
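An image-editing request sends the source image and the instruction together as parts of one message. A minimal sketch with the Gen AI SDK follows; the function names are illustrative, the MIME type is guessed from the file extension, and the SDK import is deferred so the MIME helper works on its own:

```python
import mimetypes


def guess_image_mime(path):
    """Best-effort MIME type for an image file, defaulting to image/png."""
    mime, _ = mimetypes.guess_type(path)
    return mime or "image/png"


def edit_image(image_path, instruction, out_path="edited.png"):
    """Send an image plus an edit instruction; save the first returned image."""
    # Deferred import; requires `pip install --upgrade google-genai`.
    from google import genai
    from google.genai import types

    client = genai.Client()  # reads the env vars set above
    with open(image_path, "rb") as f:
        image_bytes = f.read()
    response = client.models.generate_content(
        model="gemini-2.5-flash-image",
        contents=[
            types.Part.from_bytes(
                data=image_bytes, mime_type=guess_image_mime(image_path)
            ),
            instruction,
        ],
        config=types.GenerateContentConfig(response_modalities=["TEXT", "IMAGE"]),
    )
    for part in response.candidates[0].content.parts:
        if getattr(part, "inline_data", None):
            with open(out_path, "wb") as f:
                f.write(part.inline_data.data)
            return out_path
    return None  # the model returned no image part
```

For example, `edit_image("photo.jpg", "Edit this image to make it look like a cartoon")`.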
Java
Learn how to install or update the Java SDK.
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True
REST
Run the following command in the terminal to send the request and write the response to a file named response.json in the current directory:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://${API_ENDPOINT}:generateContent \
-d '{
  "contents": {
    "role": "USER",
    "parts": [
      {"file_data": {
        "mime_type": "image/jpeg",
        "file_uri": "FILE_NAME"
      }},
      {"text": "Convert this photo to black and white, in a cartoonish style."}
    ]
  },
  "generation_config": {
    "response_modalities": ["TEXT", "IMAGE"],
    "image_config": {
      "aspect_ratio": "16:9"
    }
  },
  "safetySettings": {
    "method": "PROBABILITY",
    "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
    "threshold": "BLOCK_MEDIUM_AND_ABOVE"
  }
}' 2>/dev/null >response.json
Gemini generates an edited image based on your description. This takes a few seconds, but can be slower depending on capacity.
Multi-turn image editing
Gemini 2.5 Flash Image also supports improved multi-turn editing, letting you respond to the model with changes after receiving an edited image response. This lets you continue to make edits to the image conversationally.
We recommend limiting the total request size to a maximum of 50 MB.
To test out multi-turn image editing, try our Gemini 2.5 Flash Image notebook.
Responsible AI
To ensure a safe and responsible experience, Vertex AI's image generation capabilities are equipped with a multi-layered safety approach. This is designed to prevent the creation of inappropriate content, including sexually explicit, dangerous, violent, hateful, or toxic material.
All users must adhere to the Generative AI Prohibited Use Policy. This policy strictly forbids the generation of content that:
- Relates to child sexual abuse or exploitation.
- Facilitates violent extremism or terrorism.
- Facilitates non-consensual intimate imagery.
- Facilitates self-harm.
- Is sexually explicit.
- Constitutes hate speech.
- Promotes harassment or bullying.
When provided with an unsafe prompt, the model might refuse to generate an image, or the prompt or generated response might be blocked by our safety filters.
- Model refusal: If a prompt is potentially unsafe, the model might refuse to process the request. If this happens, the model usually gives a text response saying it can't generate unsafe images. The FinishReason will be STOP.
- Safety filter blocking:
  - If the prompt is identified as potentially harmful by a safety filter, the API returns BlockedReason in PromptFeedback.
  - If the response is identified as potentially harmful by a safety filter, the API response includes a FinishReason of IMAGE_SAFETY, IMAGE_PROHIBITED_CONTENT, or similar.
  - If the prompt or response is blocked by a safety filter, the finishMessage field contains more details about the error codes. These codes map to a harm category:
| Error code | Safety category | Description |
|---|---|---|
| 89371032, 49114662, 72817394, 11030041, 47555161, 32549819, 51891163 | Prohibited content | Detects a request for prohibited content related to child safety. |
| 63429089 | Sexual | Detects content that's sexual in nature. |
| 87332966 | Dangerous content | Detects content that's potentially dangerous in nature. |
| 22137204 | Hate | Detects hate-related topics or content. |
| 49257518, 65344558 | Celebrity safety | Detects a photorealistic representation of a celebrity that violates Google's safety policies. |
| 14952152 | Other | Detects other miscellaneous safety issues with the request. |
| 39322892 | People/Face | Detects a person or face when it isn't allowed due to the request settings. |
| 17301594 | Child | Detects child content where it isn't allowed due to the API request settings. |