Generate and edit images with Gemini

Caution: The gemini-2.0-flash-preview-image-generation and gemini-2.5-flash-image-preview models will be retired on October 31, 2025. To avoid service disruption, migrate all workflows to gemini-2.5-flash-image before that date.

The following Gemini models can generate images in addition to text:

Gemini 2.5 Flash Image, also known as Gemini 2.5 Flash (with Nano Banana)
Gemini 3 Pro Image (preview), also known as Gemini 3 Pro (with Nano Banana)

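This extends Gemini's capabilities to include the following:

Iteratively generate and refine images through natural-language conversation while preserving consistency and context.
Generate images with high-quality rendering of long text.
Generate interleaved text and image output, such as a blog post with text and images in a single turn. Previously, this required chaining multiple models together.
Generate images using Gemini's world knowledge and reasoning capabilities.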

Gemini 2.5 Flash Image (gemini-2.5-flash-image) and Gemini 3 Pro Image preview (gemini-3-pro-image-preview) can generate images of people and include updated safety filters that provide a more flexible and less restrictive user experience. Gemini 2.5 Flash Image can generate 1024px images. Gemini 3 Pro Image can generate images of up to 4096px.

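Both models support the following modalities and capabilities:

Text to image
- Example prompt: "Generate an image of the Eiffel tower with fireworks in the background."
Text to image (text rendering)
- Example prompt: "Generate a cinematic photo of a large building with this giant text projection mapped on the front of the building: 'Gemini 3 can now generate long-form text.'"
Text to image(s) and text (interleaved)
- Example prompt: "Generate an illustrated recipe for a paella. Create images alongside the text as you generate the recipe."
- Example prompt: "Generate a story about a dog in a 3D cartoon animation style. For each scene, generate an image."
Image(s) and text to image(s) and text (interleaved)
- Example prompt: (with an image of a furnished room) "What color of sofa would work in my space? Update the image."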

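Best practices

To improve your image generation results, follow these best practices:

Be specific: The more detail you provide, the better the results. For example, instead of "fantasy armor", try "ornate elven plate armor etched with silver leaf patterns, with a high collar and pauldrons shaped like falcon wings".
Provide context and intent: Explain the purpose of the image so that the model understands the context. For example, "Create a logo for a high-end, minimalist skincare brand" works better than "Create a logo".
Iterate and refine: Don't expect a perfect image on the first attempt. Use follow-up prompts to make small changes, such as "Make the lighting warmer" or "Change the character's expression to be more serious".
Use step-by-step instructions: For complex scenes, break the request into steps. For example, "First, create a background of a serene, misty forest at dawn. Then, in the foreground, add a moss-covered ancient stone altar. Finally, place a single glowing sword on top of the altar."
Describe what you want, not what you don't want: Instead of saying "no cars", describe the scene positively, such as "an empty, deserted street with no signs of traffic".
Control the camera: Guide the camera view. Use photographic and cinematic terms to describe the composition, such as "wide-angle shot", "macro shot", or "low angle".
Prompt for images explicitly: Use phrases such as "create an image of" or "generate an image of" to state your intent. Otherwise, the multimodal model might respond with text instead of an image.
Pass thought signatures: When you use Gemini 3 Pro Image, we recommend passing thought signatures back to the model during multi-turn image generation and editing. This preserves reasoning context across interactions. For a code sample of multi-turn image editing with Gemini 3 Pro Image, see the example of multi-turn image editing using thought signatures.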
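Several of the practices above (explicit image request, stated purpose, camera terms, concrete details) can be combined mechanically. The following is a minimal sketch, not part of any SDK: `build_image_prompt` and its parameters are hypothetical names introduced here for illustration.

```python
def build_image_prompt(subject, purpose=None, camera=None, details=()):
    """Assemble a prompt string that follows the best practices listed above.

    Hypothetical helper for illustration; it only builds text and calls no API.
    """
    # Lead with an explicit image request so the model does not answer with text only.
    sentences = [f"Generate an image of {subject}."]
    if purpose:
        # State the intent so the model has context for style decisions.
        sentences.append(f"The image is for {purpose}.")
    if camera:
        # Photographic terms steer the composition.
        sentences.append(f"Compose it as a {camera}.")
    # Normalize each extra detail to end with exactly one period.
    sentences.extend(f"{detail.rstrip('.')}." for detail in details)
    return " ".join(sentences)


prompt = build_image_prompt(
    "ornate elven plate armor etched with silver leaf patterns",
    purpose="a fantasy game character sheet",
    camera="low-angle shot",
    details=["It has a high collar and pauldrons shaped like falcon wings"],
)
print(prompt)
```

The resulting string can be passed as `contents` to `generate_content` in the samples later on this page.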

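Limitations

For best performance with Gemini 2.5 Flash Image, use the following languages: EN, es-MX, ja-JP, zh-CN, or hi-IN.
For best performance with Gemini 3 Pro Image, use the following languages: ar-EG, de-DE, EN, es-MX, fr-FR, hi-IN, id-ID, it-IT, ja-JP, ko-KR, pt-BR, ru-RU, ua-UA, vi-VN, or zh-CN.
Image generation doesn't support audio or video inputs.
The model might not generate the exact number of images that you request.
For best results with Gemini 2.5 Flash Image, include at most 3 images in an input. For best results with Gemini 3 Pro Image, include at most 14 images in an input.
When you generate an image that contains text, first generate the text and then generate an image with that text.
Image or text generation might not work as expected in the following situations:
- If the prompt is ambiguous, the model might generate only text and no image. If you want images, ask for them explicitly in your request. For example, "Provide images as you go along."
- The model might render text as an image. To get text output, ask for it explicitly. For example, "Generate narrative text along with illustrations."
- The model might stop generating content before it finishes the task. If this happens, try again or use a different prompt.
- If the prompt is potentially unsafe, the model might not process the request and instead return a response saying that it can't create unsafe images. In this case, the FinishReason is STOP.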
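Two of the limitations above can be checked before a request is sent: the recommended number of input images per model and the lack of audio and video support. The sketch below is one way to do that; `check_inputs` and `RECOMMENDED_MAX_INPUT_IMAGES` are hypothetical names, and the check is purely local (it calls no API).

```python
# Recommended input-image counts taken from the limitations listed above.
RECOMMENDED_MAX_INPUT_IMAGES = {
    "gemini-2.5-flash-image": 3,
    "gemini-3-pro-image-preview": 14,
}


def check_inputs(model, mime_types):
    """Return a list of warnings for a planned request; an empty list means no known issue."""
    warnings = []
    # Image generation does not accept audio or video inputs.
    unsupported = [m for m in mime_types if m.startswith(("audio/", "video/"))]
    if unsupported:
        warnings.append(f"unsupported input types: {', '.join(unsupported)}")
    # Compare the number of image inputs with the recommended maximum for the model.
    image_count = sum(1 for m in mime_types if m.startswith("image/"))
    limit = RECOMMENDED_MAX_INPUT_IMAGES.get(model)
    if limit is not None and image_count > limit:
        warnings.append(
            f"{image_count} images exceeds the recommended maximum of {limit}"
        )
    return warnings


print(check_inputs("gemini-2.5-flash-image", ["image/png"] * 4 + ["audio/mpeg"]))
```

The image counts are recommendations for best results, so this sketch returns warnings instead of raising errors.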

Generate images

The following sections describe how to generate images by using Vertex AI Studio or the API.

For guidance and best practices on prompting, see Design multimodal prompts.

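Console

To use image generation, follow these steps:

Open Vertex AI Studio > Create prompt.
Click Switch model and select one of the following models from the menu:
- gemini-2.5-flash-image
- gemini-3-pro-image-preview
In the Outputs panel, select Image and text from the drop-down menu.
In the Write a prompt text area, describe the image that you want to generate.
Click the Prompt button.

Gemini generates an image based on your description. This usually takes a few seconds, but it can be slower depending on capacity.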

Python

Install

pip install --upgrade google-genai

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

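from io import BytesIO
from pathlib import Path

from google import genai
from google.genai.types import GenerateContentConfig, Modality
from PIL import Image

client = genai.Client()

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents="Generate an image of the Eiffel tower with fireworks in the background.",
    config=GenerateContentConfig(
        # Request both modalities so the model can return text alongside the image.
        response_modalities=[Modality.TEXT, Modality.IMAGE],
    ),
)

# Create the output folder first; Image.save() does not create missing directories.
output_dir = Path("output_folder")
output_dir.mkdir(parents=True, exist_ok=True)

for part in response.candidates[0].content.parts:
    if part.text:
        print(part.text)
    elif part.inline_data:
        # Image parts carry the raw image bytes in inline_data.data.
        image = Image.open(BytesIO(part.inline_data.data))
        image.save(output_dir / "example-image-eiffel-tower.png")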

Node.js

Install

npm install @google/genai

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

const fs = require('fs');
const {GoogleGenAI, Modality} = require('@google/genai');

const GOOGLE_CLOUD_PROJECT = process.env.GOOGLE_CLOUD_PROJECT;
const GOOGLE_CLOUD_LOCATION =
  process.env.GOOGLE_CLOUD_LOCATION || 'us-central1';

async function generateImage(
  projectId = GOOGLE_CLOUD_PROJECT,
  location = GOOGLE_CLOUD_LOCATION
) {
  const client = new GoogleGenAI({
    vertexai: true,
    project: projectId,
    location: location,
  });

  const response = await client.models.generateContentStream({
    model: 'gemini-2.5-flash-image',
    contents:
      'Generate an image of the Eiffel tower with fireworks in the background.',
    config: {
      responseModalities: [Modality.TEXT, Modality.IMAGE],
    },
  });

  const generatedFileNames = [];
  let imageIndex = 0;

  for await (const chunk of response) {
    const text = chunk.text;
    const data = chunk.data;
    if (text) {
      console.debug(text);
    } else if (data) {
      const outputDir = 'output-folder';
      if (!fs.existsSync(outputDir)) {
        fs.mkdirSync(outputDir, {recursive: true});
      }
      const fileName = `${outputDir}/generate_content_streaming_image_${imageIndex++}.png`;
      console.debug(`Writing response image to file: ${fileName}.`);
      try {
        fs.writeFileSync(fileName, data);
        generatedFileNames.push(fileName);
      } catch (error) {
        console.error(`Failed to write image file ${fileName}:`, error);
      }
    }
  }

  // Example response:
  //  I will generate an image of the Eiffel Tower at night, with a vibrant display of
  //  colorful fireworks exploding in the dark sky behind it. The tower will be
  //  illuminated, standing tall as the focal point of the scene, with the bursts of
  //  light from the fireworks creating a festive atmosphere.

  return generatedFileNames;
}

Java

Learn how to install or update Java.

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True


import com.google.genai.Client;
import com.google.genai.types.Blob;
import com.google.genai.types.Candidate;
import com.google.genai.types.Content;
import com.google.genai.types.GenerateContentConfig;
import com.google.genai.types.GenerateContentResponse;
import com.google.genai.types.Part;
import com.google.genai.types.SafetySetting;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.imageio.ImageIO;

public class ImageGenMmFlashWithText {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String modelId = "gemini-2.5-flash-image";
    String outputFile = "resources/output/example-image-eiffel-tower.png";
    generateContent(modelId, outputFile);
  }

  // Generates an image with text input
  public static void generateContent(String modelId, String outputFile) throws IOException {
    // Client Initialization. Once created, it can be reused for multiple requests.
    try (Client client = Client.builder().location("global").vertexAI(true).build()) {

      GenerateContentConfig contentConfig =
          GenerateContentConfig.builder()
              .responseModalities("TEXT", "IMAGE")
              .candidateCount(1)
              .safetySettings(
                  SafetySetting.builder()
                      .method("PROBABILITY")
                      .category("HARM_CATEGORY_DANGEROUS_CONTENT")
                      .threshold("BLOCK_MEDIUM_AND_ABOVE")
                      .build())
              .build();

      GenerateContentResponse response =
          client.models.generateContent(
              modelId,
              "Generate an image of the Eiffel tower with fireworks in the background.",
              contentConfig);

      // Get parts of the response
      List<Part> parts =
          response
              .candidates()
              .flatMap(candidates -> candidates.stream().findFirst())
              .flatMap(Candidate::content)
              .flatMap(Content::parts)
              .orElse(new ArrayList<>());

      // For each part print text if present, otherwise read image data if present and
      // write it to the output file
      for (Part part : parts) {
        if (part.text().isPresent()) {
          System.out.println(part.text().get());
        } else if (part.inlineData().flatMap(Blob::data).isPresent()) {
          BufferedImage image =
              ImageIO.read(new ByteArrayInputStream(part.inlineData().flatMap(Blob::data).get()));
          ImageIO.write(image, "png", new File(outputFile));
        }
      }

      System.out.println("Content written to: " + outputFile);
      // Example response:
      // Here is the Eiffel Tower with fireworks in the background...
      //
      // Content written to: resources/output/example-image-eiffel-tower.png
    }
  }
}

REST

Run the following command in the terminal to create or overwrite the response.json file in the current directory:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${API_ENDPOINT}:generateContent \
  -d '{
    "contents": {
      "role": "USER",
      "parts": [
        {
          "text": "Create a tutorial explaining how to make a peanut butter and jelly sandwich in three easy steps."
        }
      ]
    },
    "generationConfig": {
      "responseModalities": ["TEXT", "IMAGE"],
      "imageConfig": {
        "aspectRatio": "16:9"
      }
    },
    "safetySettings": {
      "method": "PROBABILITY",
      "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    }
  }' 2>/dev/null >response.json

Note: Gemini 2.5 Flash Image supports the following aspect ratios: 1:1, 3:2, 2:3, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, and 21:9.

Gemini generates an image based on your description. This usually takes a few seconds, but it can be slower depending on capacity.
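The REST command writes the raw JSON response to response.json, so the image still has to be extracted from it. The following is a rough sketch of that step. It assumes the response uses the usual `candidates[0].content.parts` layout, with images as base64 text under `inlineData.data`; `save_response_parts` and the file naming are hypothetical.

```python
import base64
import json
import tempfile
from pathlib import Path

# Assumed mapping from MIME type to file extension; extend as needed.
EXTENSIONS = {"image/png": ".png", "image/jpeg": ".jpg", "image/webp": ".webp"}


def save_response_parts(response, out_dir):
    """Collect text parts and write image parts of a generateContent response to out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    texts, files = [], []
    candidates = response.get("candidates") or []
    parts = candidates[0].get("content", {}).get("parts", []) if candidates else []
    for part in parts:
        if "text" in part:
            texts.append(part["text"])
        elif "inlineData" in part:
            blob = part["inlineData"]
            ext = EXTENSIONS.get(blob.get("mimeType", ""), ".bin")
            path = out / f"image-{len(files) + 1}{ext}"
            # Image bytes arrive base64-encoded inside the JSON document.
            path.write_bytes(base64.b64decode(blob["data"]))
            files.append(path)
    return texts, files


# Offline demonstration with a synthetic response; for a real run, load the file instead:
#   response = json.loads(Path("response.json").read_text())
sample = {
    "candidates": [{
        "content": {"parts": [
            {"text": "Here is your image."},
            {"inlineData": {"mimeType": "image/png",
                            "data": base64.b64encode(b"not-a-real-png").decode()}},
        ]},
        "finishReason": "STOP",
    }]
}
with tempfile.TemporaryDirectory() as tmp:
    texts, files = save_response_parts(json.loads(json.dumps(sample)), tmp)
    print(texts, [f.name for f in files])
```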

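Generate interleaved images and text

Gemini 2.5 Flash Image can generate images interleaved with its text responses. For example, for a generated recipe it can produce an image of each step and show it alongside that step's text, without separate requests to the model.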

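Console

To generate images interleaved with text responses, follow these steps:

Open Vertex AI Studio > Create prompt.
Click Switch model and select one of the following models from the menu:
- gemini-2.5-flash-image
- gemini-3-pro-image-preview
In the Outputs panel, select Image and text from the drop-down menu.
In the Write a prompt text area, describe the content that you want to generate. For example, "Create a tutorial explaining how to make a peanut butter and jelly sandwich in three easy steps. For each step, provide a title with the number of the step, an explanation, and also generate an image. Generate each image in a 1:1 aspect ratio."
Click the Prompt button.

Gemini generates a response based on your description. This usually takes a few seconds, but it can be slower depending on capacity.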

Python

Install

pip install --upgrade google-genai

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

from io import BytesIO
from pathlib import Path

from google import genai
from google.genai.types import GenerateContentConfig, Modality
from PIL import Image

client = genai.Client()

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=(
        # Adjacent string literals are concatenated, so keep the trailing space.
        "Generate an illustrated recipe for a paella. "
        "Create images to go alongside the text as you generate the recipe."
    ),
    config=GenerateContentConfig(response_modalities=[Modality.TEXT, Modality.IMAGE]),
)

# Create the output folder first; neither open() nor Image.save() creates it.
output_dir = Path("output_folder")
output_dir.mkdir(parents=True, exist_ok=True)

# Write text parts to a Markdown file and link each generated image in place.
with open(output_dir / "paella-recipe.md", "w", encoding="utf-8") as fp:
    for i, part in enumerate(response.candidates[0].content.parts):
        if part.text is not None:
            fp.write(part.text)
        elif part.inline_data is not None:
            image = Image.open(BytesIO(part.inline_data.data))
            image.save(output_dir / f"example-image-{i+1}.png")
            fp.write(f"![image](example-image-{i+1}.png)")

Java

Learn how to install or update Java.

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True


import com.google.genai.Client;
import com.google.genai.types.Blob;
import com.google.genai.types.Candidate;
import com.google.genai.types.Content;
import com.google.genai.types.GenerateContentConfig;
import com.google.genai.types.GenerateContentResponse;
import com.google.genai.types.Part;
import java.awt.image.BufferedImage;
import java.io.BufferedWriter;
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.imageio.ImageIO;

public class ImageGenMmFlashTextAndImageWithText {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String modelId = "gemini-2.5-flash-image";
    String outputFile = "resources/output/paella-recipe.md";
    generateContent(modelId, outputFile);
  }

  // Generates text and image with text input
  public static void generateContent(String modelId, String outputFile) throws IOException {
    // Client Initialization. Once created, it can be reused for multiple requests.
    try (Client client = Client.builder().location("global").vertexAI(true).build()) {

      GenerateContentResponse response =
          client.models.generateContent(
              modelId,
              Content.fromParts(
                  Part.fromText("Generate an illustrated recipe for a paella."),
                  Part.fromText(
                      "Create images to go alongside the text as you generate the recipe.")),
              GenerateContentConfig.builder().responseModalities("TEXT", "IMAGE").build());

      try (BufferedWriter writer = new BufferedWriter(new FileWriter(outputFile))) {

        // Get parts of the response
        List<Part> parts =
            response
                .candidates()
                .flatMap(candidates -> candidates.stream().findFirst())
                .flatMap(Candidate::content)
                .flatMap(Content::parts)
                .orElse(new ArrayList<>());

        int index = 1;
        // For each part print text if present, otherwise read image data if present and
        // write it to the output file
        for (Part part : parts) {
          if (part.text().isPresent()) {
            writer.write(part.text().get());
          } else if (part.inlineData().flatMap(Blob::data).isPresent()) {
            BufferedImage image =
                ImageIO.read(new ByteArrayInputStream(part.inlineData().flatMap(Blob::data).get()));
            ImageIO.write(
                image, "png", new File("resources/output/example-image-" + index + ".png"));
            writer.write("![image](example-image-" + index + ".png)");
          }
          index++;
        }

        System.out.println("Content written to: " + outputFile);

        // Example response:
        // A markdown page for a Paella recipe(`paella-recipe.md`) has been generated.
        // It includes detailed steps and several images illustrating the cooking process.
        //
        // Content written to:  resources/output/paella-recipe.md
      }
    }
  }
}

REST

Run the following command in the terminal to create or overwrite the response.json file in the current directory:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${API_ENDPOINT}:generateContent \
  -d '{
    "contents": {
      "role": "USER",
      "parts": [
        {
          "text": "Create a tutorial explaining how to make a peanut butter and jelly sandwich in three easy steps. For each step, provide a title with the number of the step, an explanation, and also generate an image, generate each image in a 1:1 aspect ratio."
        }
      ]
    },
    "generationConfig": {
      "responseModalities": ["TEXT", "IMAGE"],
      "imageConfig": {
        "aspectRatio": "16:9"
      }
    },
    "safetySettings": {
      "method": "PROBABILITY",
      "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    }
  }' 2>/dev/null >response.json

Note: Gemini 2.5 Flash Image and Gemini 3 Pro Image support the following aspect ratios: 1:1, 3:2, 2:3, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, and 21:9.

Gemini generates an image based on your description. This usually takes a few seconds, but it can be slower depending on capacity.
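Because `aspectRatio` accepts only the values listed in the note, a layout with arbitrary pixel dimensions has to be mapped to one of them. The sketch below picks the nearest supported ratio by comparing ratios on a log scale, so that "twice as wide" and "half as wide" count as equally distant. `closest_aspect_ratio` is a hypothetical helper, not an API function.

```python
import math

# Aspect ratios listed in the note above.
SUPPORTED_ASPECT_RATIOS = (
    "1:1", "3:2", "2:3", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9",
)


def ratio_value(ratio):
    """Convert a 'W:H' string to its numeric width/height value."""
    width, height = ratio.split(":")
    return int(width) / int(height)


def closest_aspect_ratio(width, height):
    """Return the supported ratio string nearest to width/height."""
    target = width / height
    # The absolute log of the quotient is a symmetric distance between two ratios.
    return min(
        SUPPORTED_ASPECT_RATIOS,
        key=lambda r: abs(math.log(ratio_value(r) / target)),
    )


print(closest_aspect_ratio(1920, 1080))
print(closest_aspect_ratio(2560, 1080))
```

The returned string can be used as the `aspectRatio` value under `imageConfig` in the REST request.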

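Edit images

Gemini 2.5 Flash Image (gemini-2.5-flash-image) supports image editing in addition to image generation. It offers improved editing and multi-turn editing, and includes updated safety filters that provide a more flexible and less restrictive user experience.

It supports the following modalities and capabilities:

Image editing (text and image to image)
- Example prompt: "Edit this image to make it look like a cartoon."
- Example prompt: [image of a cat] + [image of a pillow] + "Create a cross-stitch of my cat on this pillow."
Multi-turn image editing (chat)
- Example prompts: [upload an image of a blue car] "Turn this car into a convertible."
  - [model returns an image of a convertible in the same scene] "Now change the color to yellow."
  - [model returns an image with a yellow convertible] "Add a spoiler."
  - [model returns an image of the convertible with a spoiler]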

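Edit an image

Console

To edit an image, follow these steps:

Open Vertex AI Studio > Create prompt.
Click Switch model and select one of the following models from the menu:
- gemini-2.5-flash-image
- gemini-3-pro-image-preview
In the Outputs panel, select Image and text from the drop-down menu.
Click Insert media, select a source from the menu, and then follow the instructions in the dialog.
In the Write a prompt text area, describe the edits that you want to make to the image.
Click the Prompt button.

Gemini generates an edited version of the provided image based on your description. This usually takes a few seconds, but it can be slower depending on capacity.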

Python

Install

pip install --upgrade google-genai

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

from io import BytesIO
from pathlib import Path

from google import genai
from google.genai.types import GenerateContentConfig, Modality
from PIL import Image

client = genai.Client()

# Using an image of Eiffel tower, with fireworks in the background.
image = Image.open("test_resources/example-image-eiffel-tower.png")

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    # The source image and the edit instruction are sent together in one request.
    contents=[image, "Edit this image to make it look like a cartoon."],
    config=GenerateContentConfig(response_modalities=[Modality.TEXT, Modality.IMAGE]),
)

# Create the output folder first; Image.save() does not create missing directories.
output_dir = Path("output_folder")
output_dir.mkdir(parents=True, exist_ok=True)

for part in response.candidates[0].content.parts:
    if part.text:
        print(part.text)
    elif part.inline_data:
        image = Image.open(BytesIO(part.inline_data.data))
        image.save(output_dir / "bw-example-image.png")

Java

Learn how to install or update Java.

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True


import com.google.genai.Client;
import com.google.genai.types.Blob;
import com.google.genai.types.Candidate;
import com.google.genai.types.Content;
import com.google.genai.types.GenerateContentConfig;
import com.google.genai.types.GenerateContentResponse;
import com.google.genai.types.Part;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import javax.imageio.ImageIO;

public class ImageGenMmFlashEditImageWithTextAndImage {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String modelId = "gemini-2.5-flash-image";
    String outputFile = "resources/output/bw-example-image.png";
    generateContent(modelId, outputFile);
  }

  // Edits an image with image and text input
  public static void generateContent(String modelId, String outputFile) throws IOException {
    // Client Initialization. Once created, it can be reused for multiple requests.
    try (Client client = Client.builder().location("global").vertexAI(true).build()) {

      byte[] localImageBytes =
          Files.readAllBytes(Paths.get("resources/example-image-eiffel-tower.png"));

      GenerateContentResponse response =
          client.models.generateContent(
              modelId,
              Content.fromParts(
                  Part.fromBytes(localImageBytes, "image/png"),
                  Part.fromText("Edit this image to make it look like a cartoon.")),
              GenerateContentConfig.builder().responseModalities("TEXT", "IMAGE").build());

      // Get parts of the response
      List<Part> parts =
          response
              .candidates()
              .flatMap(candidates -> candidates.stream().findFirst())
              .flatMap(Candidate::content)
              .flatMap(Content::parts)
              .orElse(new ArrayList<>());

      // For each part print text if present, otherwise read image data if present and
      // write it to the output file
      for (Part part : parts) {
        if (part.text().isPresent()) {
          System.out.println(part.text().get());
        } else if (part.inlineData().flatMap(Blob::data).isPresent()) {
          BufferedImage image =
              ImageIO.read(new ByteArrayInputStream(part.inlineData().flatMap(Blob::data).get()));
          ImageIO.write(image, "png", new File(outputFile));
        }
      }

      System.out.println("Content written to: " + outputFile);

      // Example response:
      // No problem! Here's the image in a cartoon style...
      //
      // Content written to: resources/output/bw-example-image.png
    }
  }
}

REST

Run the following command in the terminal to create or overwrite the response.json file in the current directory:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${API_ENDPOINT}:generateContent \
  -d '{
    "contents": {
      "role": "USER",
      "parts": [
        {"fileData": {
          "mimeType": "image/jpeg",
          "fileUri": "FILE_NAME"
          }
        },
        {"text": "Convert this photo to black and white, in a cartoonish style."}
      ]
    },
    "generationConfig": {
      "responseModalities": ["TEXT", "IMAGE"],
      "imageConfig": {
        "aspectRatio": "16:9"
      }
    },
    "safetySettings": {
      "method": "PROBABILITY",
      "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    }
  }' 2>/dev/null >response.json

Note: Gemini 2.5 Flash Image supports the following aspect ratios: 1:1, 3:2, 2:3, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, and 21:9.

Gemini generates an image based on your description. This usually takes a few seconds, but it can be slower depending on capacity.

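Multi-turn image editing

Gemini 2.5 Flash Image and Gemini 3 Pro Image support improved multi-turn editing: after you receive an edited image, you can reply to the model with further changes. This lets you keep editing the image conversationally.

We recommend limiting the total request size to 50 MB.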
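A quick size estimate can catch oversized requests before they are sent. The sketch below assumes that images are sent inline, where base64 encoding produces 4 bytes for every 3 bytes of input, and it treats 50 MB as 50 × 1024 × 1024 bytes; both points are assumptions, as are the helper names. It ignores the small JSON overhead, so treat the result as approximate.

```python
import math

# Assumption: "50 MB" is interpreted as 50 * 1024 * 1024 bytes.
RECOMMENDED_MAX_REQUEST_BYTES = 50 * 1024 * 1024


def base64_size(raw_bytes):
    """Size in bytes of raw_bytes of data after padded base64 encoding."""
    return 4 * math.ceil(raw_bytes / 3)


def estimated_request_size(prompt, image_sizes):
    """Approximate request size: UTF-8 prompt text plus base64-encoded images."""
    return len(prompt.encode("utf-8")) + sum(base64_size(n) for n in image_sizes)


def fits_recommended_limit(prompt, image_sizes):
    """True if the estimated request stays within the recommended size."""
    return estimated_request_size(prompt, image_sizes) <= RECOMMENDED_MAX_REQUEST_BYTES


MB = 1024 * 1024
print(fits_recommended_limit("Add a spoiler.", [30 * MB]))
print(fits_recommended_limit("Add a spoiler.", [40 * MB]))
```

In a multi-turn session, earlier images may be resent as part of the conversation history, so the estimate should include every image in the history, not only the newest one.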

To test multi-turn image editing, try the following notebook.

For a code sample of multi-turn image generation and editing with Gemini 3 Pro Image, see the example of multi-turn image editing using thought signatures.
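The turn-by-turn flow can be sketched independently of the referenced sample. The helper below takes any object with a `send_message` method, which is the shape of a Gen AI SDK chat session created with `client.chats.create(...)`. The assumption here is that the chat session resends prior turns, including any thought signatures, on each call. `run_edit_turns` and `StubChat` are hypothetical names; the stub only lets the sketch run without network access.

```python
from types import SimpleNamespace


def run_edit_turns(chat, messages):
    """Send each message on one chat session; return the image bytes produced per turn."""
    images_per_turn = []
    for message in messages:
        response = chat.send_message(message)
        parts = response.candidates[0].content.parts or []
        # Keep only parts that carry inline image data.
        images_per_turn.append(
            [p.inline_data.data for p in parts if getattr(p, "inline_data", None)]
        )
    return images_per_turn


class StubChat:
    """Offline stand-in for an SDK chat session (illustration only)."""

    def __init__(self):
        self.history = []

    def send_message(self, message):
        # A real session would call the model; the stub returns one fake image per turn.
        self.history.append(message)
        data = f"image-{len(self.history)}".encode()
        part = SimpleNamespace(text=None, inline_data=SimpleNamespace(data=data))
        content = SimpleNamespace(parts=[part])
        return SimpleNamespace(candidates=[SimpleNamespace(content=content)])


turns = run_edit_turns(
    StubChat(),
    ["Turn this car into a convertible.", "Now change the color to yellow."],
)
print(turns)
```

With the real SDK, the first message would typically be a list that contains the source image and the first instruction, and the chat would be created with a config that requests the TEXT and IMAGE response modalities, as in the samples above.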

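Responsible AI

To ensure a safe and responsible experience, Vertex AI's image generation capabilities take a multi-layered approach to safety. This approach is designed to prevent the generation of inappropriate content, including sexually explicit, dangerous, violent, hateful, or toxic content.

All users must adhere to the Generative AI Prohibited Use Policy. This policy strictly prohibits generating content that does any of the following:

Relates to child sexual abuse or exploitation
Facilitates violent extremism or terrorism
Facilitates non-consensual intimate imagery
Facilitates self-harm
Is sexually explicit
Constitutes hate speech
Promotes harassment or bullying

When given an unsafe prompt, the model might refuse to generate an image, or the prompt or the generated response might be blocked by the safety filters.

Model refusal: If a prompt is potentially unsafe, the model might refuse to process the request. In this case, the model usually returns a text response saying that it can't generate unsafe images. The FinishReason is STOP.
Safety filter blocking:
- If a safety filter identifies the prompt as potentially harmful, the API returns a BlockedReason in PromptFeedback.
- If a safety filter identifies the response as potentially harmful, the API response includes a FinishReason such as IMAGE_SAFETY or IMAGE_PROHIBITED_CONTENT.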
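The three outcomes above (model refusal, blocked prompt, blocked response) can be told apart from the response itself. The sketch below works on the REST response parsed into a dict and assumes the JSON field names `promptFeedback.blockReason`, `candidates[].finishReason`, and `inlineData`. `classify_outcome` and its labels are hypothetical.

```python
# Finish reasons named above that indicate a blocked response; the list may not be exhaustive.
BLOCKING_FINISH_REASONS = {"IMAGE_SAFETY", "IMAGE_PROHIBITED_CONTENT"}


def classify_outcome(response):
    """Return a (label, reason) pair describing what happened to a request."""
    feedback = response.get("promptFeedback") or {}
    if feedback.get("blockReason"):
        # The prompt itself was blocked by a safety filter.
        return ("prompt_blocked", feedback["blockReason"])
    candidates = response.get("candidates") or []
    if not candidates:
        return ("no_candidates", None)
    candidate = candidates[0]
    reason = candidate.get("finishReason")
    if reason in BLOCKING_FINISH_REASONS:
        # The generated response was blocked by a safety filter.
        return ("response_blocked", reason)
    parts = (candidate.get("content") or {}).get("parts") or []
    if any("inlineData" in part for part in parts):
        return ("image", reason)
    # Text without an image and finishReason STOP may be a model refusal.
    return ("text_only", reason)


refusal = {
    "candidates": [{
        "content": {"parts": [{"text": "I can't create that image."}]},
        "finishReason": "STOP",
    }]
}
print(classify_outcome(refusal))
```

A `text_only` result is not proof of a refusal, because an ambiguous prompt can also yield text without an image; inspect the text before retrying.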

Safety filter code categories

Depending on the safety filters that you configure, your output might contain a safety reason code similar to the following:

    {
      "raiFilteredReason": "ERROR_MESSAGE. Support codes: 56562880"
    }

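Each code corresponds to a specific harmful category. The mapping between codes and categories is as follows:

Error code	Safety category	Description	Content filtered: prompt input or image output
58061214, 17301594	Child	Detects child content where it isn't allowed due to the API request settings or allowlisting.	Input (prompt): 58061214; output (image): 17301594
29310472, 15236754	Celebrity	Detects a photorealistic representation of a celebrity in the request.	Input (prompt): 29310472; output (image): 15236754
62263041	Dangerous content	Detects content that is potentially dangerous.	Input (prompt)
57734940, 22137204	Hate	Detects hate-related topics or content.	Input (prompt): 57734940; output (image): 22137204
74803281, 29578790, 42876398	Other	Detects other miscellaneous safety issues with the request.	Input (prompt): 42876398; output (image): 29578790, 74803281
39322892	People/Face	Detects a person or face when it isn't allowed due to the request safety settings.	Output (image)
92201652	Personal information	Detects personally identifiable information (PII) in the text, such as a credit card number, home address, or other such information.	Input (prompt)
89371032, 49114662, 72817394	Prohibited content	Detects a request for prohibited content.	Input (prompt): 89371032; output (image): 49114662, 72817394
90789179, 63429089, 43188360	Sexual	Detects content that is sexual in nature.	Input (prompt): 90789179; output (image): 63429089, 43188360
78610348	Toxic	Detects toxic topics or content in the text.	Input (prompt)
61493863, 56562880	Violence	Detects violence-related content in the image or text.	Input (prompt): 61493863; output (image): 56562880
32635315	Vulgar	Detects vulgar topics or content in the text.	Input (prompt)
64151117	Celebrity or child	Detects a photorealistic representation of a celebrity or a child that violates Google's safety policies.	Input (prompt); output (image)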
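The table can be turned into a lookup so that a `raiFilteredReason` string is translated into a category automatically. The sketch below copies the codes from the table and assumes that support codes appear as 8-digit numbers in the message, as in the example above. `SUPPORT_CODES` and `explain_rai_reason` are hypothetical names.

```python
import re

# (category, filtered content) per support code, copied from the table above.
SUPPORT_CODES = {
    "58061214": ("Child", "input"), "17301594": ("Child", "output"),
    "29310472": ("Celebrity", "input"), "15236754": ("Celebrity", "output"),
    "62263041": ("Dangerous content", "input"),
    "57734940": ("Hate", "input"), "22137204": ("Hate", "output"),
    "42876398": ("Other", "input"), "29578790": ("Other", "output"),
    "74803281": ("Other", "output"),
    "39322892": ("People/Face", "output"),
    "92201652": ("Personal information", "input"),
    "89371032": ("Prohibited content", "input"),
    "49114662": ("Prohibited content", "output"),
    "72817394": ("Prohibited content", "output"),
    "90789179": ("Sexual", "input"), "63429089": ("Sexual", "output"),
    "43188360": ("Sexual", "output"),
    "78610348": ("Toxic", "input"),
    "61493863": ("Violence", "input"), "56562880": ("Violence", "output"),
    "32635315": ("Vulgar", "input"),
    "64151117": ("Celebrity or child", "input or output"),
}


def explain_rai_reason(reason):
    """Map each 8-digit support code found in a raiFilteredReason string to its category."""
    results = []
    for code in re.findall(r"\b\d{8}\b", reason):
        category, side = SUPPORT_CODES.get(code, ("Unknown", "unknown"))
        results.append((code, category, side))
    return results


print(explain_rai_reason("ERROR_MESSAGE. Support codes: 56562880"))
```

Codes that are not in the table are reported as "Unknown" rather than raising an error, since the set of codes may change.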
