Generate and edit images with Gemini

Note: The gemini-2.0-flash-preview-image-generation and gemini-2.5-flash-image-preview models will be retired on October 31, 2025. Migrate all workflows to gemini-2.5-flash-image before that date to avoid service disruption.

In addition to text, the following Gemini models can also generate images:

Gemini 2.5 Flash Image, also known as Gemini 2.5 Flash (with Nano Banana)
Gemini 3 Pro Image (preview), also known as Gemini 3 Pro (with Nano Banana)

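Image generation extends Gemini's capabilities to include the following:

Iteratively generate images through natural-language conversation, adjusting images while preserving consistency and context.
Generate images with high-quality rendering of long text.
Generate interleaved text and image output. For example, a blog post with text and images in a single turn. Previously, this required chaining multiple models together.
Generate images using Gemini's world knowledge and reasoning capabilities.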

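Gemini 2.5 Flash Image (gemini-2.5-flash-image) and Gemini 3 Pro Image preview (gemini-3-pro-image-preview) support generating images of people and include updated safety filters that provide a more flexible and less restrictive user experience. Gemini 2.5 Flash Image can generate 1024-pixel images. Gemini 3 Pro Image can generate images up to 4096 pixels.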

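Both models support the following modalities and capabilities:

Text to image
- Example prompt: "Generate an image of the Eiffel tower with fireworks in the background."
Text to image (text rendering)
- Example prompt: "Generate a cinematic photo of a large building with this giant text projected on the front of the building: 'Gemini 3 can now generate long form text'"
Text to images and text (interleaved)
- Example prompt: "Generate an illustrated recipe for a paella. Create images alongside the text as you generate the recipe."
- Example prompt: "Generate a story about a dog in a 3D cartoon animation style. For each scene, generate an image."
Images and text to images and text (interleaved)
- Example prompt: (with an image of a furnished room) "What other color sofas would work in my space? Can you update the image?"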

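Best practices

To improve your image generation results, follow these best practices:

Be specific: The more detail you provide, the closer the result matches what you want. For example, instead of "fantasy armor," try "ornate elven plate armor, etched with silver leaf patterns, with a high collar and pauldrons shaped like falcon wings."
Provide context and intent: Explain the purpose of the image to help the model understand the context. For example, "Create a logo for a high-end, minimalist skincare brand" works better than "Create a logo."
Iterate and refine: Don't expect a perfect image on the first attempt. Use follow-up prompts to make small changes, for example, "Make the lighting warmer" or "Change the character's expression to be more serious."
Use step-by-step instructions: For complex scenes, split your request into steps. For example, "First, create a background of a serene, misty forest at dawn. Then, in the foreground, add a moss-covered ancient stone altar. Finally, place a single, glowing sword on top of the altar."
Describe what you want, not what you don't want: Describe the scene positively, for example "an empty, deserted street with no signs of traffic" rather than "no cars."
Control the camera: Guide the camera view. Use photographic and cinematic terms to describe the composition, for example "wide-angle shot," "macro shot," or "low-angle perspective."
Prompt for images: State your intent with phrases such as "create an image of" or "generate an image of." Otherwise, the multimodal model might respond with text instead of an image.
Pass thought signatures: When using Gemini 3 Pro Image, we recommend passing thought signatures back to the model during multi-turn image creation and editing. This preserves reasoning context across interactions. For code samples related to multi-turn image editing with Gemini 3 Pro Image, see "Example of multi-turn image editing using thought signatures."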

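Limitations:

For best performance with Gemini 2.5 Flash Image, use the following languages: English, Spanish (Mexico), Japanese, Chinese (China), or Hindi. For best performance with Gemini 3 Pro Image, use the following languages: ar-EG, de-DE, EN, es-MX, fr-FR, hi-IN, id-ID, it-IT, ja-JP, ko-KR, pt-BR, ru-RU, ua-UA, vi-VN, and zh-CN.
Image generation doesn't support audio or video inputs.
The model might not generate the exact number of images that you request.
For best results with Gemini 2.5 Flash Image, include at most three images in an input. For best results with Gemini 3 Pro Image, include at most 14 images in an input.
When generating an image that contains text, first generate the text and then generate an image with that text.
Image or text generation might not work as expected in the following situations:
- If the prompt is ambiguous, the model might generate only text and no images. To get images, ask for them explicitly, for example, "provide images as you go along."
- The model might generate text as an image. To get text output, ask for it explicitly, for example, "generate narrative text along with illustrations."
- The model might stop generating before it finishes. If this happens, try again or try a different prompt.
- If a prompt is potentially unsafe, the model might not process the request and instead return a response saying that it can't create unsafe images. In this case, the FinishReason is STOP.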
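
The input-image limits listed above can be checked on the client before a request is sent. The following is a minimal sketch, assuming only the two model IDs and the limits stated in this document; `MAX_INPUT_IMAGES` and `check_input_images` are hypothetical helper names and are not part of the Gen AI SDK.

```python
# Recommended maximum number of input images per model, taken from the
# limitations listed in this document. The names here are hypothetical.
MAX_INPUT_IMAGES = {
    "gemini-2.5-flash-image": 3,
    "gemini-3-pro-image-preview": 14,
}


def check_input_images(model: str, image_count: int) -> bool:
    """Return True if image_count is within the recommended limit for model."""
    if model not in MAX_INPUT_IMAGES:
        raise ValueError(f"Unknown image model: {model}")
    if image_count < 0:
        raise ValueError("image_count must not be negative")
    return image_count <= MAX_INPUT_IMAGES[model]


if __name__ == "__main__":
    for model, count in [("gemini-2.5-flash-image", 4), ("gemini-3-pro-image-preview", 4)]:
        status = "ok" if check_input_images(model, count) else "too many images"
        print(f"{model} with {count} images: {status}")
```

Because these limits are recommendations for best results rather than hard errors, a caller might choose to log a warning instead of rejecting the request.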

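Generate images

The following sections describe how to generate images using either Vertex AI Studio or the API.

For guidance and best practices for prompting, see "Design multimodal prompts."

Console

To use image generation, do the following:

Open Vertex AI Studio > Create prompt.
Click Switch model and select one of the following models from the menu:
- gemini-2.5-flash-image
- gemini-3-pro-image-preview
In the Outputs panel, select Image and text from the drop-down menu.
In the Write a prompt text area, enter a description of the image that you want to generate.
Click the Prompt button.

Gemini generates an image based on your description. This process usually takes a few seconds, but can be slower depending on capacity.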

Python

Install

pip install --upgrade google-genai

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

import os
from io import BytesIO

from google import genai
from google.genai.types import GenerateContentConfig, Modality
from PIL import Image

client = genai.Client()

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents="Generate an image of the Eiffel tower with fireworks in the background.",
    config=GenerateContentConfig(
        response_modalities=[Modality.TEXT, Modality.IMAGE],
    ),
)

# Create the output folder up front so image.save() doesn't fail.
os.makedirs("output_folder", exist_ok=True)

# Print text parts and save image parts to disk.
for part in response.candidates[0].content.parts:
    if part.text:
        print(part.text)
    elif part.inline_data:
        image = Image.open(BytesIO(part.inline_data.data))
        image.save("output_folder/example-image-eiffel-tower.png")

Go

Learn how to install or update Go.

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

import (
	"context"
	"fmt"
	"io"
	"os"

	"google.golang.org/genai"
)

// generateMMFlashWithText demonstrates how to generate both text and image outputs.
func generateMMFlashWithText(w io.Writer) error {
	ctx := context.Background()

	client, err := genai.NewClient(ctx, &genai.ClientConfig{
		HTTPOptions: genai.HTTPOptions{APIVersion: "v1"},
	})
	if err != nil {
		return fmt.Errorf("failed to create genai client: %w", err)
	}

	modelName := "gemini-2.5-flash-image"
	contents := []*genai.Content{
		{
			Parts: []*genai.Part{
				{Text: "Generate an image of the Eiffel tower with fireworks in the background."},
			},
			Role: genai.RoleUser,
		},
	}

	resp, err := client.Models.GenerateContent(ctx,
		modelName,
		contents,
		&genai.GenerateContentConfig{
			ResponseModalities: []string{
				string(genai.ModalityText),
				string(genai.ModalityImage),
			},
			CandidateCount: int32(1),
			SafetySettings: []*genai.SafetySetting{
				// A single setting: method, category, and threshold belong together.
				{
					Method:    genai.HarmBlockMethodProbability,
					Category:  genai.HarmCategoryDangerousContent,
					Threshold: genai.HarmBlockThresholdBlockMediumAndAbove,
				},
			},
		},
	)
	if err != nil {
		return fmt.Errorf("failed to generate content: %w", err)
	}

	if len(resp.Candidates) == 0 || resp.Candidates[0].Content == nil {
		return fmt.Errorf("no candidates returned")
	}
	var fileName string
	for _, part := range resp.Candidates[0].Content.Parts {
		if part.Text != "" {
			fmt.Fprintln(w, part.Text)
		} else if part.InlineData != nil {
			fileName = "example-image-eiffel-tower.png"
			if err := os.WriteFile(fileName, part.InlineData.Data, 0o644); err != nil {
				return fmt.Errorf("failed to save image: %w", err)
			}
		}
	}
	fmt.Fprintln(w, fileName)

	// Example response:
	// I will generate an image of the Eiffel Tower at night, with a vibrant display of
	// colorful fireworks exploding in the dark sky behind it.
	// ....
	return nil
}

Node.js

Install

npm install @google/genai

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

const fs = require('fs');
const {GoogleGenAI, Modality} = require('@google/genai');

const GOOGLE_CLOUD_PROJECT = process.env.GOOGLE_CLOUD_PROJECT;
const GOOGLE_CLOUD_LOCATION =
  process.env.GOOGLE_CLOUD_LOCATION || 'us-central1';

async function generateImage(
  projectId = GOOGLE_CLOUD_PROJECT,
  location = GOOGLE_CLOUD_LOCATION
) {
  const client = new GoogleGenAI({
    vertexai: true,
    project: projectId,
    location: location,
  });

  const response = await client.models.generateContentStream({
    model: 'gemini-2.5-flash-image',
    contents:
      'Generate an image of the Eiffel tower with fireworks in the background.',
    config: {
      responseModalities: [Modality.TEXT, Modality.IMAGE],
    },
  });

  const generatedFileNames = [];
  let imageIndex = 0;

  for await (const chunk of response) {
    const text = chunk.text;
    const data = chunk.data;
    if (text) {
      console.debug(text);
    } else if (data) {
      const outputDir = 'output-folder';
      if (!fs.existsSync(outputDir)) {
        fs.mkdirSync(outputDir, {recursive: true});
      }
      const fileName = `${outputDir}/generate_content_streaming_image_${imageIndex++}.png`;
      console.debug(`Writing response image to file: ${fileName}.`);
      try {
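        // chunk.data is a base64-encoded string, so decode it before writing the PNG bytes.
        fs.writeFileSync(fileName, Buffer.from(data, 'base64'));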
        generatedFileNames.push(fileName);
      } catch (error) {
        console.error(`Failed to write image file ${fileName}:`, error);
      }
    }
  }

  // Example response:
  //  I will generate an image of the Eiffel Tower at night, with a vibrant display of
  //  colorful fireworks exploding in the dark sky behind it. The tower will be
  //  illuminated, standing tall as the focal point of the scene, with the bursts of
  //  light from the fireworks creating a festive atmosphere.

  return generatedFileNames;
}

Java

Learn how to install or update Java.

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True


import com.google.genai.Client;
import com.google.genai.types.Blob;
import com.google.genai.types.Candidate;
import com.google.genai.types.Content;
import com.google.genai.types.GenerateContentConfig;
import com.google.genai.types.GenerateContentResponse;
import com.google.genai.types.Part;
import com.google.genai.types.SafetySetting;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.imageio.ImageIO;

public class ImageGenMmFlashWithText {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String modelId = "gemini-2.5-flash-image";
    String outputFile = "resources/output/example-image-eiffel-tower.png";
    generateContent(modelId, outputFile);
  }

  // Generates an image with text input
  public static void generateContent(String modelId, String outputFile) throws IOException {
    // Client Initialization. Once created, it can be reused for multiple requests.
    try (Client client = Client.builder().location("global").vertexAI(true).build()) {

      GenerateContentConfig contentConfig =
          GenerateContentConfig.builder()
              .responseModalities("TEXT", "IMAGE")
              .candidateCount(1)
              .safetySettings(
                  SafetySetting.builder()
                      .method("PROBABILITY")
                      .category("HARM_CATEGORY_DANGEROUS_CONTENT")
                      .threshold("BLOCK_MEDIUM_AND_ABOVE")
                      .build())
              .build();

      GenerateContentResponse response =
          client.models.generateContent(
              modelId,
              "Generate an image of the Eiffel tower with fireworks in the background.",
              contentConfig);

      // Get parts of the response
      List<Part> parts =
          response
              .candidates()
              .flatMap(candidates -> candidates.stream().findFirst())
              .flatMap(Candidate::content)
              .flatMap(Content::parts)
              .orElse(new ArrayList<>());

      // For each part print text if present, otherwise read image data if present and
      // write it to the output file
      for (Part part : parts) {
        if (part.text().isPresent()) {
          System.out.println(part.text().get());
        } else if (part.inlineData().flatMap(Blob::data).isPresent()) {
          BufferedImage image =
              ImageIO.read(new ByteArrayInputStream(part.inlineData().flatMap(Blob::data).get()));
          ImageIO.write(image, "png", new File(outputFile));
        }
      }

      System.out.println("Content written to: " + outputFile);
      // Example response:
      // Here is the Eiffel Tower with fireworks in the background...
      //
      // Content written to: resources/output/example-image-eiffel-tower.png
    }
  }
}

REST

Run the following command in the terminal to create or overwrite the file response.json in the current directory:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${API_ENDPOINT}:generateContent \
  -d '{
    "contents": {
      "role": "USER",
      "parts": [
        {
          "text": "Create a tutorial explaining how to make a peanut butter and jelly sandwich in three easy steps."
        }
      ]
    },
    "generationConfig": {
      "responseModalities": ["TEXT", "IMAGE"],
      "imageConfig": {
        "aspectRatio": "16:9"
      }
    },
    "safetySettings": {
      "method": "PROBABILITY",
      "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    }
  }' 2>/dev/null >response.json

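Note: Gemini 2.5 Flash Image supports the following aspect ratios: 1:1, 3:2, 2:3, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, and 21:9.

Gemini generates an image based on your description. This process usually takes a few seconds, but can be slower depending on capacity.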
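
An unsupported aspect ratio is easier to catch before the request is sent. The following is a minimal sketch that validates a ratio against the list in the note above and builds the same generationConfig fragment that the REST sample uses. `SUPPORTED_ASPECT_RATIOS` and `build_generation_config` are hypothetical names chosen for illustration.

```python
import json

# Aspect ratios listed in this document for Gemini image models.
SUPPORTED_ASPECT_RATIOS = (
    "1:1", "3:2", "2:3", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9",
)


def build_generation_config(aspect_ratio: str) -> dict:
    """Build a generationConfig fragment, rejecting unsupported aspect ratios."""
    if aspect_ratio not in SUPPORTED_ASPECT_RATIOS:
        raise ValueError(
            f"Unsupported aspect ratio {aspect_ratio!r}; "
            f"choose one of {', '.join(SUPPORTED_ASPECT_RATIOS)}"
        )
    return {
        "responseModalities": ["TEXT", "IMAGE"],
        "imageConfig": {"aspectRatio": aspect_ratio},
    }


if __name__ == "__main__":
    # json.dumps never emits trailing commas, so the output is valid JSON.
    print(json.dumps(build_generation_config("16:9"), indent=2))
```

Serializing the request body with a JSON library, rather than writing it by hand, also avoids syntax slips such as trailing commas.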

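Generate interleaved images and text

Gemini 2.5 Flash Image can generate interleaved images alongside its text responses. For example, you can generate an image for each step of a recipe to accompany that step's text, without making separate requests to the model.

Console

To generate interleaved images and text responses, do the following:

Open Vertex AI Studio > Create prompt.
Click Switch model and select one of the following models from the menu:
- gemini-2.5-flash-image
- gemini-3-pro-image-preview
In the Outputs panel, select Image and text from the drop-down menu.
In the Write a prompt text area, enter a description of the image that you want to generate. For example, "Create a tutorial explaining how to make a peanut butter and jelly sandwich in three easy steps. For each step, provide a title with the number of the step, an explanation, and also generate an image, generate each image in a 1:1 aspect ratio."
Click the Prompt button.

Gemini generates a response based on your description. This process usually takes a few seconds, but can be slower depending on capacity.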

Python

Install

pip install --upgrade google-genai

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

import os
from io import BytesIO

from google import genai
from google.genai.types import GenerateContentConfig, Modality
from PIL import Image

client = genai.Client()

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=(
        # Adjacent string literals are concatenated, so keep the trailing space.
        "Generate an illustrated recipe for a paella. "
        "Create images to go alongside the text as you generate the recipe."
    ),
    config=GenerateContentConfig(response_modalities=[Modality.TEXT, Modality.IMAGE]),
)

# Create the output folder up front so the writes below don't fail.
os.makedirs("output_folder", exist_ok=True)

# Write text parts to a Markdown file and link each saved image from it.
with open("output_folder/paella-recipe.md", "w") as fp:
    for i, part in enumerate(response.candidates[0].content.parts):
        if part.text is not None:
            fp.write(part.text)
        elif part.inline_data is not None:
            image = Image.open(BytesIO(part.inline_data.data))
            image.save(f"output_folder/example-image-{i+1}.png")
            fp.write(f"![image](example-image-{i+1}.png)")

Java

Learn how to install or update Java.

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True


import com.google.genai.Client;
import com.google.genai.types.Blob;
import com.google.genai.types.Candidate;
import com.google.genai.types.Content;
import com.google.genai.types.GenerateContentConfig;
import com.google.genai.types.GenerateContentResponse;
import com.google.genai.types.Part;
import java.awt.image.BufferedImage;
import java.io.BufferedWriter;
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.imageio.ImageIO;

public class ImageGenMmFlashTextAndImageWithText {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String modelId = "gemini-2.5-flash-image";
    String outputFile = "resources/output/paella-recipe.md";
    generateContent(modelId, outputFile);
  }

  // Generates text and image with text input
  public static void generateContent(String modelId, String outputFile) throws IOException {
    // Client Initialization. Once created, it can be reused for multiple requests.
    try (Client client = Client.builder().location("global").vertexAI(true).build()) {

      GenerateContentResponse response =
          client.models.generateContent(
              modelId,
              Content.fromParts(
                  Part.fromText("Generate an illustrated recipe for a paella."),
                  Part.fromText(
                      "Create images to go alongside the text as you generate the recipe.")),
              GenerateContentConfig.builder().responseModalities("TEXT", "IMAGE").build());

      try (BufferedWriter writer = new BufferedWriter(new FileWriter(outputFile))) {

        // Get parts of the response
        List<Part> parts =
            response
                .candidates()
                .flatMap(candidates -> candidates.stream().findFirst())
                .flatMap(Candidate::content)
                .flatMap(Content::parts)
                .orElse(new ArrayList<>());

        int index = 1;
        // For each part print text if present, otherwise read image data if present and
        // write it to the output file
        for (Part part : parts) {
          if (part.text().isPresent()) {
            writer.write(part.text().get());
          } else if (part.inlineData().flatMap(Blob::data).isPresent()) {
            BufferedImage image =
                ImageIO.read(new ByteArrayInputStream(part.inlineData().flatMap(Blob::data).get()));
            ImageIO.write(
                image, "png", new File("resources/output/example-image-" + index + ".png"));
            writer.write("![image](example-image-" + index + ".png)");
          }
          index++;
        }

        System.out.println("Content written to: " + outputFile);

        // Example response:
        // A markdown page for a Paella recipe(`paella-recipe.md`) has been generated.
        // It includes detailed steps and several images illustrating the cooking process.
        //
        // Content written to:  resources/output/paella-recipe.md
      }
    }
  }
}

Go

Learn how to install or update Go.

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

import (
	"context"
	"fmt"
	"io"
	"os"
	"path/filepath"

	"google.golang.org/genai"
)

// generateMMFlashTxtImgWithText demonstrates how to generate an illustrated recipe
// combining text and image outputs into a markdown file.
func generateMMFlashTxtImgWithText(w io.Writer) error {
	ctx := context.Background()

	client, err := genai.NewClient(ctx, &genai.ClientConfig{
		HTTPOptions: genai.HTTPOptions{APIVersion: "v1"},
	})
	if err != nil {
		return fmt.Errorf("failed to create genai client: %w", err)
	}

	modelName := "gemini-2.5-flash-image"
	contents := []*genai.Content{
		{
			Parts: []*genai.Part{
				{Text: "Generate an illustrated recipe for a paella. " +
					"Create images to go alongside the text as you generate the recipe."},
			},
			Role: genai.RoleUser,
		},
	}

	resp, err := client.Models.GenerateContent(ctx,
		modelName,
		contents,
		&genai.GenerateContentConfig{
			ResponseModalities: []string{
				string(genai.ModalityText),
				string(genai.ModalityImage),
			},
			CandidateCount: int32(1),
		},
	)
	if err != nil {
		return fmt.Errorf("failed to generate content: %w", err)
	}

	if len(resp.Candidates) == 0 || resp.Candidates[0].Content == nil {
		return fmt.Errorf("no candidates returned")
	}

	outputFolder := ""

	// Create the markdown file
	mdFile := filepath.Join(outputFolder, "paella-recipe.md")
	fp, err := os.Create(mdFile)
	if err != nil {
		return fmt.Errorf("failed to create markdown file: %w", err)
	}
	defer fp.Close()

	for i, part := range resp.Candidates[0].Content.Parts {
		if part.Text != "" {
			if _, err := fp.WriteString(part.Text); err != nil {
				return fmt.Errorf("failed to write text: %w", err)
			}
		} else if part.InlineData != nil {
			imgFile := filepath.Join(outputFolder, fmt.Sprintf("example-image-%d.png", i+1))
			if err := os.WriteFile(imgFile, part.InlineData.Data, 0644); err != nil {
				return fmt.Errorf("failed to save image: %w", err)
			}
			if _, err := fp.WriteString(fmt.Sprintf("![image](%s)", filepath.Base(imgFile))); err != nil {
				return fmt.Errorf("failed to write image reference: %w", err)
			}
		}
	}

	fmt.Fprintln(w, mdFile)

	// Example response:
	//  A markdown page for a Paella recipe (`paella-recipe.md`) has been generated.
	//  It includes detailed steps and several images illustrating the cooking process.
	return nil
}

Node.js

Install

npm install @google/genai

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

const fs = require('fs');
const {GoogleGenAI, Modality} = require('@google/genai');

const GOOGLE_CLOUD_PROJECT = process.env.GOOGLE_CLOUD_PROJECT;
const GOOGLE_CLOUD_LOCATION =
  process.env.GOOGLE_CLOUD_LOCATION || 'us-central1';

async function savePaellaRecipe(response) {
  const parts = response.candidates[0].content.parts;

  let mdText = '';
  const outputDir = 'output-folder';

  for (let i = 0; i < parts.length; i++) {
    const part = parts[i];

    if (part.text) {
      mdText += part.text + '\n';
    } else if (part.inlineData) {
      if (!fs.existsSync(outputDir)) {
        fs.mkdirSync(outputDir, {recursive: true});
      }
      const imageBytes = Buffer.from(part.inlineData.data, 'base64');
      const imagePath = `example-image-${i + 1}.png`;
      const saveImagePath = `${outputDir}/${imagePath}`;

      fs.writeFileSync(saveImagePath, imageBytes);
      mdText += `![image](./${imagePath})\n`;
    }
  }
  const mdFile = `${outputDir}/paella-recipe.md`;

  fs.writeFileSync(mdFile, mdText);
  console.log(`Saved recipe to: ${mdFile}`);
}

async function generateImage(
  projectId = GOOGLE_CLOUD_PROJECT,
  location = GOOGLE_CLOUD_LOCATION
) {
  const client = new GoogleGenAI({
    vertexai: true,
    project: projectId,
    location: location,
  });

  const response = await client.models.generateContent({
    model: 'gemini-2.5-flash-image',
    contents:
      'Generate an illustrated recipe for a paella. Create images to go alongside the text as you generate the recipe',
    config: {
      responseModalities: [Modality.TEXT, Modality.IMAGE],
    },
  });
  console.log(response);

  await savePaellaRecipe(response);

  return response;
}
// Example response:
//  A markdown page for a Paella recipe(`paella-recipe.md`) has been generated.
//  It includes detailed steps and several images illustrating the cooking process.

REST

Run the following command in the terminal to create or overwrite the file response.json in the current directory:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${API_ENDPOINT}:generateContent \
  -d '{
    "contents": {
      "role": "USER",
      "parts": [
        {
          "text": "Create a tutorial explaining how to make a peanut butter and jelly sandwich in three easy steps. For each step, provide a title with the number of the step, an explanation, and also generate an image, generate each image in a 1:1 aspect ratio."
        }
      ]
    },
    "generationConfig": {
      "responseModalities": ["TEXT", "IMAGE"],
      "imageConfig": {
        "aspectRatio": "16:9"
      }
    },
    "safetySettings": {
      "method": "PROBABILITY",
      "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    }
  }' 2>/dev/null >response.json

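Note: Gemini 2.5 Flash Image and Gemini 3 Pro Image support the following aspect ratios: 1:1, 3:2, 2:3, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, and 21:9.

Gemini generates an image based on your description. This process usually takes a few seconds, but can be slower depending on capacity.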

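Edit images

Gemini 2.5 Flash Image (gemini-2.5-flash-image) can edit images as well as generate them. It supports improved image editing and multi-turn editing, and includes updated safety filters that provide a more flexible and less restrictive user experience.

It supports the following modalities and capabilities:

Image editing (text and image to image)
- Example prompt: "Edit this image to make it look like a cartoon"
- Example prompt: [image of a cat] + [image of a pillow] + "Create a cross stitch of my cat on this pillow."
Multi-turn image editing (chat)
- Example prompts: [upload an image of a blue car.] "Turn this car into a convertible."
  - [Model returns an image of a convertible in the same scene] "Now change the color to yellow."
  - [Model returns an image of a yellow convertible] "Add a spoiler."
  - [Model returns an image of the convertible with a spoiler]

Edit an image

Console

To edit images, do the following:

Open Vertex AI Studio > Create prompt.
Click Switch model and select one of the following models from the menu:
- gemini-2.5-flash-image
- gemini-3-pro-image-preview
In the Outputs panel, select Image and text from the drop-down menu.
Click the Insert media icon, select a source from the menu, and then follow the dialog's instructions.
In the Write a prompt text area, describe the edits that you want to make to the image.
Click the Prompt button.

Gemini generates an edited version of the image based on your description. This process usually takes a few seconds, but can be slower depending on capacity.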

Python

Install

pip install --upgrade google-genai

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

import os
from io import BytesIO

from google import genai
from google.genai.types import GenerateContentConfig, Modality
from PIL import Image

client = genai.Client()

# Using an image of Eiffel tower, with fireworks in the background.
image = Image.open("test_resources/example-image-eiffel-tower.png")

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=[image, "Edit this image to make it look like a cartoon."],
    config=GenerateContentConfig(response_modalities=[Modality.TEXT, Modality.IMAGE]),
)

# Create the output folder up front so edited_image.save() doesn't fail.
os.makedirs("output_folder", exist_ok=True)

# Print text parts and save the edited image to disk.
for part in response.candidates[0].content.parts:
    if part.text:
        print(part.text)
    elif part.inline_data:
        edited_image = Image.open(BytesIO(part.inline_data.data))
        edited_image.save("output_folder/bw-example-image.png")

Java

Learn how to install or update Java.

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True


import com.google.genai.Client;
import com.google.genai.types.Blob;
import com.google.genai.types.Candidate;
import com.google.genai.types.Content;
import com.google.genai.types.GenerateContentConfig;
import com.google.genai.types.GenerateContentResponse;
import com.google.genai.types.Part;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import javax.imageio.ImageIO;

public class ImageGenMmFlashEditImageWithTextAndImage {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String modelId = "gemini-2.5-flash-image";
    String outputFile = "resources/output/bw-example-image.png";
    generateContent(modelId, outputFile);
  }

  // Edits an image with image and text input
  public static void generateContent(String modelId, String outputFile) throws IOException {
    // Client Initialization. Once created, it can be reused for multiple requests.
    try (Client client = Client.builder().location("global").vertexAI(true).build()) {

      byte[] localImageBytes =
          Files.readAllBytes(Paths.get("resources/example-image-eiffel-tower.png"));

      GenerateContentResponse response =
          client.models.generateContent(
              modelId,
              Content.fromParts(
                  Part.fromBytes(localImageBytes, "image/png"),
                  Part.fromText("Edit this image to make it look like a cartoon.")),
              GenerateContentConfig.builder().responseModalities("TEXT", "IMAGE").build());

      // Get parts of the response
      List<Part> parts =
          response
              .candidates()
              .flatMap(candidates -> candidates.stream().findFirst())
              .flatMap(Candidate::content)
              .flatMap(Content::parts)
              .orElse(new ArrayList<>());

      // For each part print text if present, otherwise read image data if present and
      // write it to the output file
      for (Part part : parts) {
        if (part.text().isPresent()) {
          System.out.println(part.text().get());
        } else if (part.inlineData().flatMap(Blob::data).isPresent()) {
          BufferedImage image =
              ImageIO.read(new ByteArrayInputStream(part.inlineData().flatMap(Blob::data).get()));
          ImageIO.write(image, "png", new File(outputFile));
        }
      }

      System.out.println("Content written to: " + outputFile);

      // Example response:
      // No problem! Here's the image in a cartoon style...
      //
      // Content written to: resources/output/bw-example-image.png
    }
  }
}

Go

Learn how to install or update Go.

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

import (
	"context"
	"fmt"
	"io"
	"os"

	"google.golang.org/genai"
)

// generateImageMMFlashEditWithTextImg demonstrates editing an image with text and image inputs.
func generateImageMMFlashEditWithTextImg(w io.Writer) error {
	// TODO(developer): Update below lines
	outputFile := "bw-example-image.png"
	inputFile := "example-image-eiffel-tower.png"
	ctx := context.Background()

	client, err := genai.NewClient(ctx, &genai.ClientConfig{
		HTTPOptions: genai.HTTPOptions{APIVersion: "v1"},
	})
	if err != nil {
		return fmt.Errorf("failed to create genai client: %w", err)
	}

	image, err := os.ReadFile(inputFile)
	if err != nil {
		return fmt.Errorf("failed to read image: %w", err)
	}

	modelName := "gemini-2.5-flash-image"
	prompt := "Edit this image to make it look like a cartoon."
	contents := []*genai.Content{
		{
			Role: "user",
			Parts: []*genai.Part{
				{Text: prompt},
				{InlineData: &genai.Blob{
					MIMEType: "image/png",
					Data:     image,
				}},
			},
		},
	}
	resp, err := client.Models.GenerateContent(ctx,
		modelName,
		contents,
		&genai.GenerateContentConfig{
			ResponseModalities: []string{
				string(genai.ModalityText),
				string(genai.ModalityImage),
			},
		},
	)
	if err != nil {
		return fmt.Errorf("failed to generate content: %w", err)
	}

	if len(resp.Candidates) == 0 || resp.Candidates[0].Content == nil {
		return fmt.Errorf("no content was generated")
	}

	for _, part := range resp.Candidates[0].Content.Parts {
		if part.Text != "" {
			fmt.Fprintln(w, part.Text)
		} else if part.InlineData != nil {
			if len(part.InlineData.Data) > 0 {
				if err := os.WriteFile(outputFile, part.InlineData.Data, 0644); err != nil {
					return fmt.Errorf("failed to save image: %w", err)
				}
				fmt.Fprintln(w, outputFile)
			}
		}
	}

	// Example response:
	// Here's the image of the Eiffel Tower and fireworks, cartoonized for you!
	// Cartoon-style edit:
	//  - Simplified the Eiffel Tower with bolder lines and slightly exaggerated proportions.
	//  - Brightened and saturated the colors of the sky, fireworks, and foliage for a more vibrant, cartoonish look.
	//  ....
	return nil
}

Node.js

Install

npm install @google/genai

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

const fs = require('fs');
const {GoogleGenAI, Modality} = require('@google/genai');

const GOOGLE_CLOUD_PROJECT = process.env.GOOGLE_CLOUD_PROJECT;
const GOOGLE_CLOUD_LOCATION =
  process.env.GOOGLE_CLOUD_LOCATION || 'us-central1';

const FILE_NAME = 'test-data/example-image-eiffel-tower.png';

async function generateImage(
  projectId = GOOGLE_CLOUD_PROJECT,
  location = GOOGLE_CLOUD_LOCATION
) {
  const client = new GoogleGenAI({
    vertexai: true,
    project: projectId,
    location: location,
  });

  const imageBytes = fs.readFileSync(FILE_NAME);

  const response = await client.models.generateContent({
    model: 'gemini-2.5-flash-image',
    contents: [
      {
        role: 'user',
        parts: [
          {
            inlineData: {
              mimeType: 'image/png',
              data: imageBytes.toString('base64'),
            },
          },
          {
            text: 'Edit this image to make it look like a cartoon',
          },
        ],
      },
    ],
    config: {
      responseModalities: [Modality.TEXT, Modality.IMAGE],
    },
  });

  for (const part of response.candidates[0].content.parts) {
    if (part.text) {
      console.log(`${part.text}`);
    } else if (part.inlineData) {
      const outputDir = 'output-folder';
      if (!fs.existsSync(outputDir)) {
        fs.mkdirSync(outputDir, {recursive: true});
      }
      const imageBytes = Buffer.from(part.inlineData.data, 'base64');
      const filename = `${outputDir}/bw-example-image.png`;
      fs.writeFileSync(filename, imageBytes);
    }
  }

  // Example response:
  // Okay, I will edit this image to give it a cartoonish style, with bolder outlines, simplified details, and more vibrant colors.
  return response;
}

REST

Run the following command in the terminal to create or overwrite the file response.json in the current directory:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${API_ENDPOINT}:generateContent \
  -d '{
    "contents": {
      "role": "USER",
      "parts": [
        {
          "fileData": {
            "mimeType": "image/jpeg",
            "fileUri": "FILE_NAME"
          }
        },
        {"text": "Convert this photo to black and white, in a cartoonish style."}
      ]
    },
    "generationConfig": {
      "responseModalities": ["TEXT", "IMAGE"],
      "imageConfig": {
        "aspectRatio": "16:9"
      }
    },
    "safetySettings": {
      "method": "PROBABILITY",
      "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    }
  }' 2>/dev/null >response.json

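Note: Gemini 2.5 Flash Image supports the following aspect ratios: 1:1, 3:2, 2:3, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, and 21:9.

Gemini generates an image based on your description. This process usually takes a few seconds, but can be slower depending on capacity.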

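Multi-turn image editing

Gemini 2.5 Flash Image and Gemini 3 Pro Image support improved multi-turn editing: after you receive an edited image, you can reply to the model with further changes. This lets you keep editing the image conversationally.

We recommend keeping the total request size under 50 MB.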
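
In a multi-turn session the request grows with every image that is sent back, so it can help to estimate its size before sending. The following is a rough sketch, assuming that images are sent inline as base64 inside a JSON body and that "50 MB" means 50 MiB. It ignores JSON overhead, and `estimate_request_bytes` and `within_recommended_size` are hypothetical helper names.

```python
# Assumption: the recommended "50 MB" limit is read as 50 MiB.
MAX_REQUEST_BYTES = 50 * 1024 * 1024


def estimate_request_bytes(prompt: str, image_sizes: list[int]) -> int:
    """Roughly estimate request size from the prompt and raw image sizes in bytes."""
    prompt_bytes = len(prompt.encode("utf-8"))
    # Base64 encodes every 3 raw bytes as 4 characters, padding the last group.
    encoded_images = sum(4 * ((size + 2) // 3) for size in image_sizes)
    return prompt_bytes + encoded_images


def within_recommended_size(prompt: str, image_sizes: list[int]) -> bool:
    """Return True if the estimated request stays within the recommended limit."""
    return estimate_request_bytes(prompt, image_sizes) <= MAX_REQUEST_BYTES


if __name__ == "__main__":
    sizes = [12 * 1024 * 1024, 30 * 1024 * 1024]  # two images, 12 MiB and 30 MiB raw
    print(estimate_request_bytes("Add a spoiler.", sizes))
    print(within_recommended_size("Add a spoiler.", sizes))
```

Base64 inflates binary data by about one third, which is why two images that total 42 MiB on disk already exceed the estimate here.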

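To test multi-turn image editing, try the following notebook:

For code samples that create and edit images over multiple turns with Gemini 3 Pro Image, see "Example of multi-turn image editing using thought signatures."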

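Responsible AI

To ensure a safe and responsible experience, Vertex AI image generation uses a multi-layered safety approach. It's designed to prevent the creation of inappropriate content, including sexually explicit, dangerous, violent, hateful, or toxic material.

All users must comply with the Generative AI Prohibited Use Policy. This policy strictly prohibits generating content that does the following:

Relates to child sexual abuse or exploitation.
Facilitates violent extremism or terrorism.
Facilitates non-consensual intimate imagery.
Facilitates self-harm.
Is sexually explicit.
Constitutes hate speech.
Promotes harassment or bullying.

When given an unsafe prompt, the model might refuse to generate an image, or the prompt or generated response might be blocked by the safety filters.

Model refusal: If a prompt is potentially unsafe, the model might refuse to process the request. If this happens, the model usually gives a text response saying that it can't generate unsafe images. The FinishReason is STOP.
Safety filter blocking:
- If the prompt is identified as potentially harmful by a safety filter, the API returns BlockedReason in PromptFeedback.
- If the response is identified as potentially harmful by a safety filter, the API response includes a FinishReason of IMAGE_SAFETY, IMAGE_PROHIBITED_CONTENT, or similar.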
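
The outcomes described above can be told apart by inspecting the parsed JSON response, such as the response.json file written by the REST samples. The following is a minimal sketch, not a definitive implementation. It assumes that the prompt-level block reason appears at promptFeedback.blockReason and that images arrive as inlineData parts. `classify_response` and its return labels are hypothetical.

```python
# Finish reasons named in this document for responses blocked by a safety filter.
# Other, similar values aren't listed there, so extend this set as needed.
IMAGE_BLOCK_FINISH_REASONS = {"IMAGE_SAFETY", "IMAGE_PROHIBITED_CONTENT"}


def classify_response(response: dict) -> str:
    """Classify a parsed generateContent JSON response into a coarse outcome."""
    # Assumption: a blocked prompt is reported at promptFeedback.blockReason.
    feedback = response.get("promptFeedback") or {}
    if feedback.get("blockReason"):
        return "prompt_blocked"

    candidates = response.get("candidates") or []
    if not candidates:
        return "no_candidates"

    candidate = candidates[0]
    if candidate.get("finishReason") in IMAGE_BLOCK_FINISH_REASONS:
        return "response_blocked"

    parts = (candidate.get("content") or {}).get("parts") or []
    if any("inlineData" in part for part in parts):
        return "image"
    # Text only with finishReason STOP can be a refusal or an ambiguous prompt.
    return "text_only"


if __name__ == "__main__":
    refusal = {
        "candidates": [
            {"finishReason": "STOP", "content": {"parts": [{"text": "I can't create that image."}]}}
        ]
    }
    print(classify_response(refusal))
```

A "text_only" result can't distinguish a model refusal from a prompt that simply didn't ask for an image clearly enough, so callers may want to show the returned text to the user in both cases.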

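Safety filter code categories

Depending on the safety filters that you configure, your output might contain a safety reason code similar to the following: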

    {
      "raiFilteredReason": "ERROR_MESSAGE. Support codes: 56562880"
    }

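The code listed corresponds to a specific harmful category. The mappings from codes to categories are as follows:

Error code	Safety category	Description	Content filtered: prompt input or image output
58061214, 17301594	Child	Detects child content where it isn't allowed due to the API request settings or allowlisting.	Input (prompt): 58061214; output (image): 17301594
29310472, 15236754	Celebrity	Detects a photorealistic representation of a celebrity in the request.	Input (prompt): 29310472; output (image): 15236754
62263041	Dangerous content	Detects content that is potentially dangerous.	Input (prompt)
57734940, 22137204	Hate	Detects hate-related topics or content.	Input (prompt): 57734940; output (image): 22137204
74803281, 29578790, 42876398	Other	Detects other miscellaneous safety issues with the request.	Input (prompt): 42876398; output (image): 29578790, 74803281
39322892	People/Face	Detects a person or face when it isn't allowed due to the request safety settings.	Output (image)
92201652	Personal information	Detects personally identifiable information (PII) in the text, such as a credit card number, home address, or other such information.	Input (prompt)
89371032, 49114662, 72817394	Prohibited content	Detects prohibited content in the request.	Input (prompt): 89371032; output (image): 49114662, 72817394
90789179, 63429089, 43188360	Sexual	Detects content that is sexual in nature.	Input (prompt): 90789179; output (image): 63429089, 43188360
78610348	Toxic	Detects toxic topics or content in the text.	Input (prompt)
61493863, 56562880	Violence	Detects violence-related content in the image or text.	Input (prompt): 61493863; output (image): 56562880
32635315	Vulgar	Detects vulgar topics or content in the text.	Input (prompt)
64151117	Celebrity or child	Detects a photorealistic representation of a celebrity or a child that violates Google's safety policies.	Input (prompt), output (image)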
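
The table above can be turned into a lookup so that a raiFilteredReason string is reported as a category instead of a bare number. The following is a minimal sketch, assuming that the reason string always carries its codes after the text "Support codes:" as in the sample output. `SUPPORT_CODES` and `explain_filter_reason` are hypothetical names.

```python
import re

# Support code -> (safety category, filtered side), transcribed from the table above.
SUPPORT_CODES = {
    "58061214": ("Child", "input"),
    "17301594": ("Child", "output"),
    "29310472": ("Celebrity", "input"),
    "15236754": ("Celebrity", "output"),
    "62263041": ("Dangerous content", "input"),
    "57734940": ("Hate", "input"),
    "22137204": ("Hate", "output"),
    "42876398": ("Other", "input"),
    "29578790": ("Other", "output"),
    "74803281": ("Other", "output"),
    "39322892": ("People/Face", "output"),
    "92201652": ("Personal information", "input"),
    "89371032": ("Prohibited content", "input"),
    "49114662": ("Prohibited content", "output"),
    "72817394": ("Prohibited content", "output"),
    "90789179": ("Sexual", "input"),
    "63429089": ("Sexual", "output"),
    "43188360": ("Sexual", "output"),
    "78610348": ("Toxic", "input"),
    "61493863": ("Violence", "input"),
    "56562880": ("Violence", "output"),
    "32635315": ("Vulgar", "input"),
    "64151117": ("Celebrity or child", "input or output"),
}


def explain_filter_reason(reason: str) -> list[tuple[str, str, str]]:
    """Return (code, category, filtered side) for each support code in reason."""
    # Assumption: codes follow the literal text "Support codes:".
    match = re.search(r"Support codes:\s*([\d,\s]+)", reason)
    if not match:
        return []
    codes = re.findall(r"\d+", match.group(1))
    return [(code, *SUPPORT_CODES.get(code, ("Unknown", "unknown"))) for code in codes]


if __name__ == "__main__":
    for code, category, side in explain_filter_reason(
        "ERROR_MESSAGE. Support codes: 56562880"
    ):
        print(f"{code}: {category} ({side})")
```

Codes that aren't in the table are reported as "Unknown" rather than raising an error, because the list of codes might change over time.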
