Edita imágenes con Gemini

Gemini 2.5 Flash Image (gemini-2.5-flash-image) admite la edición mejorada de imágenes y la edición en varios turnos, y contiene filtros de seguridad actualizados que brindan una experiencia del usuario más flexible y menos restrictiva.

Para obtener más información sobre las capacidades de los modelos, consulta Modelos de Gemini.

Cómo editar una imagen

Console

Para editar imágenes, haz lo siguiente:

Abre Vertex AI Studio > Crear instrucción.
Haz clic en Cambiar modelo y selecciona uno de los siguientes modelos del menú:
- gemini-2.5-flash-image
- gemini-3-pro-image-preview
En el panel Salidas, selecciona Imagen y texto en el menú desplegable.
Haz clic en Insertar medios () y selecciona una fuente del menú. Luego, sigue las instrucciones del diálogo.
Escribe los cambios que quieres realizar en la imagen en el área de texto Escribe una instrucción.
Haz clic en el botón Instrucción ().

Gemini genera una versión editada de la imagen proporcionada según tu descripción. Este proceso tarda unos segundos, pero puede ser comparativamente más lento según la capacidad.

Python

Instalar

pip install --upgrade google-genai

Para obtener más información, consulta la documentación de referencia del SDK.

Establece variables de entorno para usar el SDK de IA generativa con Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

from google import genai
from google.genai.types import GenerateContentConfig, Modality
from PIL import Image
from io import BytesIO

client = genai.Client()

# Using an image of Eiffel tower, with fireworks in the background.
image = Image.open("test_resources/example-image-eiffel-tower.png")

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=[image, "Edit this image to make it look like a cartoon."],
    config=GenerateContentConfig(response_modalities=[Modality.TEXT, Modality.IMAGE]),
)
for part in response.candidates[0].content.parts:
    if part.text:
        print(part.text)
    elif part.inline_data:
        image = Image.open(BytesIO((part.inline_data.data)))
        image.save("output_folder/bw-example-image.png")

Java

Obtén más información para instalar o actualizar Java.

Para obtener más información, consulta la documentación de referencia del SDK.

Establece variables de entorno para usar el SDK de IA generativa con Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True


import com.google.genai.Client;
import com.google.genai.types.Blob;
import com.google.genai.types.Candidate;
import com.google.genai.types.Content;
import com.google.genai.types.GenerateContentConfig;
import com.google.genai.types.GenerateContentResponse;
import com.google.genai.types.Part;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import javax.imageio.ImageIO;

public class ImageGenMmFlashEditImageWithTextAndImage {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String modelId = "gemini-2.5-flash-image";
    String outputFile = "resources/output/bw-example-image.png";
    generateContent(modelId, outputFile);
  }

  // Edits an image with image and text input
  public static void generateContent(String modelId, String outputFile) throws IOException {
    // Client Initialization. Once created, it can be reused for multiple requests.
    try (Client client = Client.builder().location("global").vertexAI(true).build()) {

      byte[] localImageBytes =
          Files.readAllBytes(Paths.get("resources/example-image-eiffel-tower.png"));

      GenerateContentResponse response =
          client.models.generateContent(
              modelId,
              Content.fromParts(
                  Part.fromBytes(localImageBytes, "image/png"),
                  Part.fromText("Edit this image to make it look like a cartoon.")),
              GenerateContentConfig.builder().responseModalities("TEXT", "IMAGE").build());

      // Get parts of the response
      List<Part> parts =
          response
              .candidates()
              .flatMap(candidates -> candidates.stream().findFirst())
              .flatMap(Candidate::content)
              .flatMap(Content::parts)
              .orElse(new ArrayList<>());

      // For each part print text if present, otherwise read image data if present and
      // write it to the output file
      for (Part part : parts) {
        if (part.text().isPresent()) {
          System.out.println(part.text().get());
        } else if (part.inlineData().flatMap(Blob::data).isPresent()) {
          BufferedImage image =
              ImageIO.read(new ByteArrayInputStream(part.inlineData().flatMap(Blob::data).get()));
          ImageIO.write(image, "png", new File(outputFile));
        }
      }

      System.out.println("Content written to: " + outputFile);

      // Example response:
      // No problem! Here's the image in a cartoon style...
      //
      // Content written to: resources/output/bw-example-image.png
    }
  }
}

Go

Obtén más información para instalar o actualizar Go.

Para obtener más información, consulta la documentación de referencia del SDK.

Establece variables de entorno para usar el SDK de IA generativa con Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

import (
	"context"
	"fmt"
	"io"
	"os"

	"google.golang.org/genai"
)

// generateImageMMFlashEditWithTextImg demonstrates editing an image with text and image inputs.
func generateImageMMFlashEditWithTextImg(w io.Writer) error {
	// TODO(developer): Update below lines
	outputFile := "bw-example-image.png"
	inputFile := "example-image-eiffel-tower.png"
	ctx := context.Background()

	client, err := genai.NewClient(ctx, &genai.ClientConfig{
		HTTPOptions: genai.HTTPOptions{APIVersion: "v1"},
	})
	if err != nil {
		return fmt.Errorf("failed to create genai client: %w", err)
	}

	image, err := os.ReadFile(inputFile)
	if err != nil {
		return fmt.Errorf("failed to read image: %w", err)
	}

	modelName := "gemini-2.5-flash-image"
	prompt := "Edit this image to make it look like a cartoon."
	contents := []*genai.Content{
		{
			Role: "user",
			Parts: []*genai.Part{
				{Text: prompt},
				{InlineData: &genai.Blob{
					MIMEType: "image/png",
					Data:     image,
				}},
			},
		},
	}
	resp, err := client.Models.GenerateContent(ctx,
		modelName,
		contents,
		&genai.GenerateContentConfig{
			ResponseModalities: []string{
				string(genai.ModalityText),
				string(genai.ModalityImage),
			},
		},
	)
	if err != nil {
		return fmt.Errorf("failed to generate content: %w", err)
	}

	if len(resp.Candidates) == 0 || resp.Candidates[0].Content == nil {
		return fmt.Errorf("no content was generated")
	}

	for _, part := range resp.Candidates[0].Content.Parts {
		if part.Text != "" {
			fmt.Fprintln(w, part.Text)
		} else if part.InlineData != nil {
			if len(part.InlineData.Data) > 0 {
				if err := os.WriteFile(outputFile, part.InlineData.Data, 0644); err != nil {
					return fmt.Errorf("failed to save image: %w", err)
				}
				fmt.Fprintln(w, outputFile)
			}
		}
	}

	// Example response:
	// Here's the image of the Eiffel Tower and fireworks, cartoonized for you!
	// Cartoon-style edit:
	//  - Simplified the Eiffel Tower with bolder lines and slightly exaggerated proportions.
	//  - Brightened and saturated the colors of the sky, fireworks, and foliage for a more vibrant, cartoonish look.
	//  ....
	return nil
}

Node.js

Instalar

npm install @google/genai

Para obtener más información, consulta la documentación de referencia del SDK.

Establece variables de entorno para usar el SDK de IA generativa con Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

const fs = require('fs');
const {GoogleGenAI, Modality} = require('@google/genai');

const GOOGLE_CLOUD_PROJECT = process.env.GOOGLE_CLOUD_PROJECT;
const GOOGLE_CLOUD_LOCATION =
  process.env.GOOGLE_CLOUD_LOCATION || 'us-central1';

const FILE_NAME = 'test-data/example-image-eiffel-tower.png';

async function generateImage(
  projectId = GOOGLE_CLOUD_PROJECT,
  location = GOOGLE_CLOUD_LOCATION
) {
  const client = new GoogleGenAI({
    vertexai: true,
    project: projectId,
    location: location,
  });

  const imageBytes = fs.readFileSync(FILE_NAME);

  const response = await client.models.generateContent({
    model: 'gemini-2.5-flash-image',
    contents: [
      {
        role: 'user',
        parts: [
          {
            inlineData: {
              mimeType: 'image/png',
              data: imageBytes.toString('base64'),
            },
          },
          {
            text: 'Edit this image to make it look like a cartoon',
          },
        ],
      },
    ],
    config: {
      responseModalities: [Modality.TEXT, Modality.IMAGE],
    },
  });

  for (const part of response.candidates[0].content.parts) {
    if (part.text) {
      console.log(`${part.text}`);
    } else if (part.inlineData) {
      const outputDir = 'output-folder';
      if (!fs.existsSync(outputDir)) {
        fs.mkdirSync(outputDir, {recursive: true});
      }
      const imageBytes = Buffer.from(part.inlineData.data, 'base64');
      const filename = `${outputDir}/bw-example-image.png`;
      fs.writeFileSync(filename, imageBytes);
    }
  }

  // Example response:
  // Okay, I will edit this image to give it a cartoonish style, with bolder outlines, simplified details, and more vibrant colors.
  return response;
}

REST

Ejecuta el siguiente comando en la terminal para crear o reemplazar este archivo en el directorio actual:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${API_ENDPOINT}:generateContent \
  -d '{
    "contents": {
      "role": "USER",
      "parts": [
        {"fileData": {
          "mimeType": "image/jpg",
          "fileUri": "FILE_NAME"
          }
        },
        {"text": "Convert this photo to black and white, in a cartoonish style."},
      ]

    },
    "generationConfig": {
      "responseModalities": ["TEXT", "IMAGE"],
      "imageConfig": {
        "aspectRatio": "16:9",
      },
    },
    "safetySettings": {
      "method": "PROBABILITY",
      "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    },
  }' 2>/dev/null >response.json

Gemini 2.5 Flash Image admite las siguientes relaciones de aspecto: 1:1, 3:2, 2:3, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9 y 21:9.

Gemini genera una imagen en función de tu descripción. Este proceso tarda unos segundos, pero puede ser comparativamente más lento según la capacidad.

Edición de imágenes de varios turnos

Gemini 2.5 Flash Image y Gemini 3 Pro Image admiten una edición mejorada en varios turnos, lo que te permite responder al modelo con cambios después de recibir una respuesta de imagen editada.

Recomendamos limitar el tamaño total del archivo de solicitud a un máximo de 50 MB.

Para probar la edición de imágenes en varios turnos, consulta los siguientes notebooks:

Para ver ejemplos de código relacionados con la creación y edición de imágenes en varios turnos con Gemini 3 Pro Image, consulta Ejemplo de edición de imágenes en varios turnos con firmas de pensamiento.

Próximos pasos

Consulta los siguientes vínculos para obtener más información sobre la generación de imágenes con Gemini:

Edita imágenes con Gemini Organiza tus páginas con colecciones Guarda y categoriza el contenido según tus preferencias.

Cómo editar una imagen

Console

Python

Instalar

Java

Go

Node.js

Instalar

REST

Edición de imágenes de varios turnos

Próximos pasos

Edita imágenes con Gemini