Google uses AI technology to translate content into your preferred language. AI translations can contain errors.

זיהוי אנשים

בדוגמת הקוד הבאה אפשר לראות איך לזהות אנשים בקובץ וידאו באמצעות Video Intelligence API.

הכלי Video Intelligence יכול לזהות את נוכחותם של בני אדם בקובץ וידאו ולעקוב אחרי אנשים בסרטון או בקטע וידאו.

זיהוי בני אדם מקובץ ב-Cloud Storage

בדוגמה הבאה מוצג איך לשלוח בקשת אנוטציה ל-Video Intelligence באמצעות התכונה לזיהוי בני אדם.

REST

שליחת בקשה להוספת הערה לסרטון

בדוגמה הבאה אפשר לראות איך שולחים בקשת POST לשיטה videos:annotate. בדוגמה נעשה שימוש ב-Google Cloud CLI כדי ליצור אסימון גישה. הוראות להתקנת ה-CLI של gcloud מופיעות במאמר מדריך למתחילים בנושא Video Intelligence API. מידע נוסף מופיע במאמר PersonDetectionConfig.

לפני שמשתמשים בנתוני הבקשה, צריך להחליף את הנתונים הבאים:

‫INPUT_URI: קטגוריה של Cloud Storage שמכילה את הקובץ שרוצים להוסיף לו הערות, כולל שם הקובץ. חייב להתחיל ב-gs://.
לדוגמה:
"inputUri": "gs://cloud-samples-data/video/googlework_short.mp4"
‫PROJECT_NUMBER: המזהה המספרי של Google Cloud הפרויקט

ה-method של ה-HTTP וכתובת ה-URL:

POST https://videointelligence.googleapis.com/v1/videos:annotate

תוכן בקשת JSON:

{
  "inputUri": "INPUT_URI",
  "features": ["PERSON_DETECTION"],
  "videoContext": {
    "personDetectionConfig": {
      "includeBoundingBoxes": true,
      "includePoseLandmarks": true,
      "includeAttributes": true
     }
  }
}

כדי לשלוח את הבקשה צריך להרחיב אחת מהאפשרויות הבאות:

‫Curl (Linux,‏ macOS או Cloud Shell)

הערה: הפקודה הבאה מבוססת על ההנחה שנכנסתם ל-CLI של gcloud באמצעות חשבון המשתמש שלכם, על ידי הרצת gcloud init או gcloud auth login, או באמצעות Cloud Shell שמחבר אתכם אוטומטית ל-CLI של gcloud. כדי לבדוק איזה חשבון פעיל, אפשר להריץ את הפקודה gcloud auth list.

שומרים את גוף הבקשה בקובץ בשם request.json ומריצים את הפקודה הבאה:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "x-goog-user-project: PROJECT_NUMBER" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://videointelligence.googleapis.com/v1/videos:annotate"

‎PowerShell (Windows)

הערה: הפקודה הבאה מבוססת על ההנחה שנכנסתם ל-CLI של gcloud באמצעות חשבון המשתמש שלכם, על ידי הרצת gcloud init או gcloud auth login. כדי לבדוק איזה חשבון פעיל, אפשר להריץ את הפקודה gcloud auth list.

שומרים את גוף הבקשה בקובץ בשם request.json ומריצים את הפקודה הבאה:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_NUMBER" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://videointelligence.googleapis.com/v1/videos:annotate" | Select-Object -Expand Content

אתם אמורים לקבל תגובת JSON שדומה לזו:

תשובה

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/operations/OPERATION_ID"
}

אם התגובה מצליחה, Video Intelligence API מחזיר את name עבור הפעולה שלכם. בדוגמה שלמעלה אפשר לראות תגובה כזו, שבה:

‫PROJECT_NUMBER: מספר הפרויקט
‫LOCATION_ID: האזור ב-Cloud שבו צריך להוסיף את ההערה. האזורים הנתמכים בענן הם: us-east1, ‏ us-west1,‏ europe-west1, ‏ asia-east1. אם לא מציינים אזור, המערכת תקבע אזור על סמך המיקום של קובץ הסרטון.
‫OPERATION_ID: המזהה של הפעולה הממושכת שנוצרה עבור הבקשה ומופיע בתגובה כשמתחילים את הפעולה, לדוגמה 12345...

קבלת תוצאות של אנוטציות

כדי לאחזר את תוצאת הפעולה, שולחים בקשת GET באמצעות שם הפעולה שמוחזר מהקריאה אל videos:annotate, כמו בדוגמה הבאה.

לפני שמשתמשים בנתוני הבקשה, צריך להחליף את הנתונים הבאים:

‫OPERATION_NAME: השם של הפעולה כפי שמוחזר על ידי Video Intelligence API. שם הפעולה הוא בפורמט projects/PROJECT_NUMBER/locations/LOCATION_ID/operations/OPERATION_ID
‫PROJECT_NUMBER: המזהה המספרי של Google Cloud הפרויקט

ה-method של ה-HTTP וכתובת ה-URL:

GET https://videointelligence.googleapis.com/v1/OPERATION_NAME

כדי לשלוח את הבקשה צריך להרחיב אחת מהאפשרויות הבאות:

‫Curl (Linux,‏ macOS או Cloud Shell)

מריצים את הפקודה הבאה:

curl -X GET \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "x-goog-user-project: PROJECT_NUMBER" \
     "https://videointelligence.googleapis.com/v1/OPERATION_NAME"

‎PowerShell (Windows)

מריצים את הפקודה הבאה:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_NUMBER" }

Invoke-WebRequest `
    -Method GET `
    -Headers $headers `
    -Uri "https://videointelligence.googleapis.com/v1/OPERATION_NAME" | Select-Object -Expand Content

אתם אמורים לקבל תגובת JSON שדומה לזו:

תשובה

{
  "name": "us-west1.10001026834554604237",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.videointelligence.v1.AnnotateVideoProgress",
    "annotationProgress": [
      {
        "inputUri": "/cloud-ml-sandbox/video/chicago.mp4",
        "progressPercent": 100,
         "startTime": "2020-02-08T21:26:56.577807Z",
        "updateTime": "2020-02-08T21:28:09.620665Z"
      }
    ]
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.videointelligence.v1.AnnotateVideoResponse",
    "annotationResults": [
      {
        "inputUri": "/cloud-ml-sandbox/video/chicago.mp4",
        "personDetectionAnnotations": [
          {
            "tracks": [
              {
                "segment": {
                  "startTimeOffset": "0s",
                  "endTimeOffset": "1.507436s"
                }
              },
              ...
            ]
          }
        ]
      }
    ]
  }
}

הערות לגבי זיהוי צילומים מוחזרות כרשימה shotAnnotations. הערה: השדה done מוחזר רק כשהערך שלו הוא True. הוא לא נכלל בתשובות שהפעולה שלהן לא הושלמה.

הורדת תוצאות ההערות

מעתיקים את ההערה מהמקור לדלי היעד: (ראו העתקת קבצים ואובייקטים)

gcloud storage cp gcs_uri gs://my-bucket

הערה: אם המשתמש מספק את ה-URI של GCS בפלט, ההערה מאוחסנת ב-URI הזה.

Java

כדי לבצע אימות ב-Video Intelligence, צריך להגדיר את Application Default Credentials. מידע נוסף זמין במאמר הגדרת אימות לסביבת פיתוח מקומית.


import com.google.api.gax.longrunning.OperationFuture;
import com.google.cloud.videointelligence.v1.AnnotateVideoProgress;
import com.google.cloud.videointelligence.v1.AnnotateVideoRequest;
import com.google.cloud.videointelligence.v1.AnnotateVideoResponse;
import com.google.cloud.videointelligence.v1.DetectedAttribute;
import com.google.cloud.videointelligence.v1.DetectedLandmark;
import com.google.cloud.videointelligence.v1.Feature;
import com.google.cloud.videointelligence.v1.PersonDetectionAnnotation;
import com.google.cloud.videointelligence.v1.PersonDetectionConfig;
import com.google.cloud.videointelligence.v1.TimestampedObject;
import com.google.cloud.videointelligence.v1.Track;
import com.google.cloud.videointelligence.v1.VideoAnnotationResults;
import com.google.cloud.videointelligence.v1.VideoContext;
import com.google.cloud.videointelligence.v1.VideoIntelligenceServiceClient;
import com.google.cloud.videointelligence.v1.VideoSegment;

public class DetectPersonGcs {

  public static void detectPersonGcs() throws Exception {
    // TODO(developer): Replace these variables before running the sample.
    String gcsUri = "gs://cloud-samples-data/video/googlework_short.mp4";
    detectPersonGcs(gcsUri);
  }

  // Detects people in a video stored in Google Cloud Storage using
  // the Cloud Video Intelligence API.
  public static void detectPersonGcs(String gcsUri) throws Exception {
    try (VideoIntelligenceServiceClient videoIntelligenceServiceClient =
        VideoIntelligenceServiceClient.create()) {
      // Reads a local video file and converts it to base64.

      PersonDetectionConfig personDetectionConfig =
          PersonDetectionConfig.newBuilder()
              // Must set includeBoundingBoxes to true to get poses and attributes.
              .setIncludeBoundingBoxes(true)
              .setIncludePoseLandmarks(true)
              .setIncludeAttributes(true)
              .build();
      VideoContext videoContext =
          VideoContext.newBuilder().setPersonDetectionConfig(personDetectionConfig).build();

      AnnotateVideoRequest request =
          AnnotateVideoRequest.newBuilder()
              .setInputUri(gcsUri)
              .addFeatures(Feature.PERSON_DETECTION)
              .setVideoContext(videoContext)
              .build();

      // Detects people in a video
      OperationFuture<AnnotateVideoResponse, AnnotateVideoProgress> future =
          videoIntelligenceServiceClient.annotateVideoAsync(request);

      System.out.println("Waiting for operation to complete...");
      AnnotateVideoResponse response = future.get();
      // Get the first response, since we sent only one video.
      VideoAnnotationResults annotationResult = response.getAnnotationResultsList().get(0);

      // Annotations for list of people detected, tracked and recognized in video.
      for (PersonDetectionAnnotation personDetectionAnnotation :
          annotationResult.getPersonDetectionAnnotationsList()) {
        System.out.print("Person detected:\n");
        for (Track track : personDetectionAnnotation.getTracksList()) {
          VideoSegment segment = track.getSegment();
          System.out.printf(
              "\tStart: %d.%.0fs\n",
              segment.getStartTimeOffset().getSeconds(),
              segment.getStartTimeOffset().getNanos() / 1e6);
          System.out.printf(
              "\tEnd: %d.%.0fs\n",
              segment.getEndTimeOffset().getSeconds(), segment.getEndTimeOffset().getNanos() / 1e6);

          // Each segment includes timestamped objects that include characteristic--e.g. clothes,
          // posture of the person detected.
          TimestampedObject firstTimestampedObject = track.getTimestampedObjects(0);

          // Attributes include unique pieces of clothing, poses (i.e., body landmarks)
          // of the person detected.
          for (DetectedAttribute attribute : firstTimestampedObject.getAttributesList()) {
            System.out.printf(
                "\tAttribute: %s; Value: %s\n", attribute.getName(), attribute.getValue());
          }

          // Landmarks in person detection include body parts.
          for (DetectedLandmark attribute : firstTimestampedObject.getLandmarksList()) {
            System.out.printf(
                "\tLandmark: %s; Vertex: %f, %f\n",
                attribute.getName(), attribute.getPoint().getX(), attribute.getPoint().getY());
          }
        }
      }
    }
  }
}

Node.js

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const gcsUri = 'GCS URI of the video to analyze, e.g. gs://my-bucket/my-video.mp4';

// Imports the Google Cloud Video Intelligence library + Node's fs library
const Video = require('@google-cloud/video-intelligence').v1;

// Creates a client
const video = new Video.VideoIntelligenceServiceClient();

async function detectPersonGCS() {
  const request = {
    inputUri: gcsUri,
    features: ['PERSON_DETECTION'],
    videoContext: {
      personDetectionConfig: {
        // Must set includeBoundingBoxes to true to get poses and attributes.
        includeBoundingBoxes: true,
        includePoseLandmarks: true,
        includeAttributes: true,
      },
    },
  };
  // Detects faces in a video
  // We get the first result because we only process 1 video
  const [operation] = await video.annotateVideo(request);
  const results = await operation.promise();
  console.log('Waiting for operation to complete...');

  // Gets annotations for video
  const personAnnotations =
    results[0].annotationResults[0].personDetectionAnnotations;

  for (const {tracks} of personAnnotations) {
    console.log('Person detected:');

    for (const {segment, timestampedObjects} of tracks) {
      console.log(
        `\tStart: ${segment.startTimeOffset.seconds}` +
          `.${(segment.startTimeOffset.nanos / 1e6).toFixed(0)}s`
      );
      console.log(
        `\tEnd: ${segment.endTimeOffset.seconds}.` +
          `${(segment.endTimeOffset.nanos / 1e6).toFixed(0)}s`
      );

      // Each segment includes timestamped objects that
      // include characteristic--e.g. clothes, posture
      // of the person detected.
      const [firstTimestampedObject] = timestampedObjects;

      // Attributes include unique pieces of clothing, poses (i.e., body
      // landmarks) of the person detected.
      for (const {name, value} of firstTimestampedObject.attributes) {
        console.log(`\tAttribute: ${name}; Value: ${value}`);
      }

      // Landmarks in person detection include body parts.
      for (const {name, point} of firstTimestampedObject.landmarks) {
        console.log(`\tLandmark: ${name}; Vertex: ${point.x}, ${point.y}`);
      }
    }
  }
}

detectPersonGCS();

Python

from google.cloud import videointelligence_v1 as videointelligence


def detect_person(gcs_uri="gs://YOUR_BUCKET_ID/path/to/your/video.mp4"):
    """Detects people in a video."""

    client = videointelligence.VideoIntelligenceServiceClient()

    # Configure the request
    config = videointelligence.types.PersonDetectionConfig(
        include_bounding_boxes=True,
        include_attributes=True,
        include_pose_landmarks=True,
    )
    context = videointelligence.types.VideoContext(person_detection_config=config)

    # Start the asynchronous request
    operation = client.annotate_video(
        request={
            "features": [videointelligence.Feature.PERSON_DETECTION],
            "input_uri": gcs_uri,
            "video_context": context,
        }
    )

    print("\nProcessing video for person detection annotations.")
    result = operation.result(timeout=300)

    print("\nFinished processing.\n")

    # Retrieve the first result, because a single video was processed.
    annotation_result = result.annotation_results[0]

    for annotation in annotation_result.person_detection_annotations:
        print("Person detected:")
        for track in annotation.tracks:
            print(
                "Segment: {}s to {}s".format(
                    track.segment.start_time_offset.seconds
                    + track.segment.start_time_offset.microseconds / 1e6,
                    track.segment.end_time_offset.seconds
                    + track.segment.end_time_offset.microseconds / 1e6,
                )
            )

            # Each segment includes timestamped objects that include
            # characteristics - -e.g.clothes, posture of the person detected.
            # Grab the first timestamped object
            timestamped_object = track.timestamped_objects[0]
            box = timestamped_object.normalized_bounding_box
            print("Bounding box:")
            print("\tleft  : {}".format(box.left))
            print("\ttop   : {}".format(box.top))
            print("\tright : {}".format(box.right))
            print("\tbottom: {}".format(box.bottom))

            # Attributes include unique pieces of clothing,
            # poses, or hair color.
            print("Attributes:")
            for attribute in timestamped_object.attributes:
                print(
                    "\t{}:{} {}".format(
                        attribute.name, attribute.value, attribute.confidence
                    )
                )

            # Landmarks in person detection include body parts such as
            # left_shoulder, right_ear, and right_ankle
            print("Landmarks:")
            for landmark in timestamped_object.landmarks:
                print(
                    "\t{}: {} (x={}, y={})".format(
                        landmark.name,
                        landmark.confidence,
                        landmark.point.x,  # Normalized vertex
                        landmark.point.y,  # Normalized vertex
                    )
                )

שפות נוספות

‫C#‎: צריך לפעול לפי הוראות ההגדרה של C# ‎ בדף של ספריות הלקוח ואז לעבור אל מאמרי העזרה של Video Intelligence בנושא ‎ .NET.

‫PHP: Please follow the PHP setup instructions on the client libraries page and then visit the Video Intelligence מאמרי עזרה for PHP.

‫Ruby: צריך לפעול לפי הוראות ההגדרה של Ruby בדף של ספריות הלקוח ואז לעבור אל מסמך העזר של Video Intelligence ל-Ruby.

זיהוי בני אדם מקובץ מקומי

בדוגמה הבאה נעשה שימוש בזיהוי אנשים כדי למצוא ישויות בסרטון מקובץ סרטון שהועלה מהמחשב המקומי.

REST

שליחת בקשת העיבוד

כדי לבצע זיהוי בני אדם בקובץ וידאו מקומי, צריך לקודד ב-Base64 את התוכן של קובץ הווידאו. מידע על קידוד Base64 של תוכן קובץ וידאו זמין במאמר בנושא קידוד Base64. לאחר מכן, שולחים בקשת POST ל-method‏ videos:annotate. כוללים את התוכן בקידוד base64 בשדה inputContent של הבקשה ומציינים את התכונה PERSON_DETECTION.

בדוגמה הבאה מוצגת בקשת POST באמצעות curl. בדוגמה נעשה שימוש ב-Google Cloud CLI כדי ליצור אסימון גישה. הוראות להתקנת ה-CLI של gcloud מופיעות במאמר מדריך למתחילים בנושא Video Intelligence API.

לפני שמשתמשים בנתוני הבקשה, צריך להחליף את הנתונים הבאים:

inputContent: קובץ סרטון מקומי בפורמט בינארי
לדוגמה: 'AAAAGGZ0eXBtcDQyAAAAAGlzb21tcDQyAAGVYW1vb3YAAABsbXZoZAAAAADWvhlR1r4ZUQABX5ABCOxo AAEAAAEAAAAAAA4...'
‫PROJECT_NUMBER: המזהה המספרי של Google Cloud הפרויקט

ה-method של ה-HTTP וכתובת ה-URL:

POST https://videointelligence.googleapis.com/v1/videos:annotate

תוכן בקשת JSON:

{
  "inputUri": "Local video file in binary format",
  "features": ["PERSON_DETECTION"],
  "videoContext": {
    "personDetectionConfig": {
      "includeBoundingBoxes": true,
      "includePoseLandmarks": true,
      "includeAttributes": true
     }
  }
}

כדי לשלוח את הבקשה צריך להרחיב אחת מהאפשרויות הבאות:

‫Curl (Linux,‏ macOS או Cloud Shell)

שומרים את גוף הבקשה בקובץ בשם request.json ומריצים את הפקודה הבאה:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "x-goog-user-project: PROJECT_NUMBER" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://videointelligence.googleapis.com/v1/videos:annotate"

‎PowerShell (Windows)

שומרים את גוף הבקשה בקובץ בשם request.json ומריצים את הפקודה הבאה:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_NUMBER" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://videointelligence.googleapis.com/v1/videos:annotate" | Select-Object -Expand Content

אתם אמורים לקבל תגובת JSON שדומה לזו:

תשובה

{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/var>/operations/OPERATION_ID"

}

אם הבקשה תצליח, Video Intelligence יחזיר את name עבור הפעולה. בדוגמה שלמעלה מוצגת תגובה כזו, כאשר project-number הוא מספר הפרויקט ו-operation-id הוא המזהה של הפעולה הממושכת שנוצרה עבור הבקשה.

{ "name": "us-west1.17122464255125931980" }

קבלת התוצאות

כדי לאחזר את תוצאת הפעולה, שולחים בקשת GET לנקודת הקצה operations ומציינים את שם הפעולה.

לפני שמשתמשים בנתוני הבקשה, צריך להחליף את הנתונים הבאים:

‫OPERATION_NAME: השם של הפעולה כפי שמוחזר על ידי Video Intelligence API. שם הפעולה הוא בפורמט projects/PROJECT_NUMBER/locations/LOCATION_ID/operations/OPERATION_ID
‫PROJECT_NUMBER: המזהה המספרי של Google Cloud הפרויקט

ה-method של ה-HTTP וכתובת ה-URL:

GET https://videointelligence.googleapis.com/v1/OPERATION_NAME

כדי לשלוח את הבקשה צריך להרחיב אחת מהאפשרויות הבאות:

‫Curl (Linux,‏ macOS או Cloud Shell)

מריצים את הפקודה הבאה:

curl -X GET \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "x-goog-user-project: PROJECT_NUMBER" \
     "https://videointelligence.googleapis.com/v1/OPERATION_NAME"

‎PowerShell (Windows)

מריצים את הפקודה הבאה:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_NUMBER" }

Invoke-WebRequest `
    -Method GET `
    -Headers $headers `
    -Uri "https://videointelligence.googleapis.com/v1/OPERATION_NAME" | Select-Object -Expand Content

אתם אמורים לקבל תגובת JSON שדומה לזו:

תשובה

{
  "name": "us-west1.10001026834554604237",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.videointelligence.v1.AnnotateVideoProgress",
    "annotationProgress": [
      {
        "progressPercent": 100,
        "startTime": "2020-02-08T21:26:56.577807Z",
        "updateTime": "2020-02-08T21:28:09.620665Z"
      }
    ]
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.videointelligence.v1.AnnotateVideoResponse",
    "annotationResults": [
      {
        "personDetectionAnnotations": [
          {
            "tracks": [
              {
                "segment": {
                  "startTimeOffset": "0s",
                  "endTimeOffset": "1.507436s"
                }
              },
              ...
            ]
          }
        ]
      }
    ]
  }
}

Java


import com.google.api.gax.longrunning.OperationFuture;
import com.google.cloud.videointelligence.v1.AnnotateVideoProgress;
import com.google.cloud.videointelligence.v1.AnnotateVideoRequest;
import com.google.cloud.videointelligence.v1.AnnotateVideoResponse;
import com.google.cloud.videointelligence.v1.DetectedAttribute;
import com.google.cloud.videointelligence.v1.DetectedLandmark;
import com.google.cloud.videointelligence.v1.Feature;
import com.google.cloud.videointelligence.v1.PersonDetectionAnnotation;
import com.google.cloud.videointelligence.v1.PersonDetectionConfig;
import com.google.cloud.videointelligence.v1.TimestampedObject;
import com.google.cloud.videointelligence.v1.Track;
import com.google.cloud.videointelligence.v1.VideoAnnotationResults;
import com.google.cloud.videointelligence.v1.VideoContext;
import com.google.cloud.videointelligence.v1.VideoIntelligenceServiceClient;
import com.google.cloud.videointelligence.v1.VideoSegment;
import com.google.protobuf.ByteString;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class DetectPerson {

  public static void detectPerson() throws Exception {
    // TODO(developer): Replace these variables before running the sample.
    String localFilePath = "resources/googlework_short.mp4";
    detectPerson(localFilePath);
  }

  // Detects people in a video stored in a local file using the Cloud Video Intelligence API.
  public static void detectPerson(String localFilePath) throws Exception {
    try (VideoIntelligenceServiceClient videoIntelligenceServiceClient =
        VideoIntelligenceServiceClient.create()) {
      // Reads a local video file and converts it to base64.
      Path path = Paths.get(localFilePath);
      byte[] data = Files.readAllBytes(path);
      ByteString inputContent = ByteString.copyFrom(data);

      PersonDetectionConfig personDetectionConfig =
          PersonDetectionConfig.newBuilder()
              // Must set includeBoundingBoxes to true to get poses and attributes.
              .setIncludeBoundingBoxes(true)
              .setIncludePoseLandmarks(true)
              .setIncludeAttributes(true)
              .build();
      VideoContext videoContext =
          VideoContext.newBuilder().setPersonDetectionConfig(personDetectionConfig).build();

      AnnotateVideoRequest request =
          AnnotateVideoRequest.newBuilder()
              .setInputContent(inputContent)
              .addFeatures(Feature.PERSON_DETECTION)
              .setVideoContext(videoContext)
              .build();

      // Detects people in a video
      // We get the first result because only one video is processed.
      OperationFuture<AnnotateVideoResponse, AnnotateVideoProgress> future =
          videoIntelligenceServiceClient.annotateVideoAsync(request);

      System.out.println("Waiting for operation to complete...");
      AnnotateVideoResponse response = future.get();

      // Gets annotations for video
      VideoAnnotationResults annotationResult = response.getAnnotationResultsList().get(0);

      // Annotations for list of people detected, tracked and recognized in video.
      for (PersonDetectionAnnotation personDetectionAnnotation :
          annotationResult.getPersonDetectionAnnotationsList()) {
        System.out.print("Person detected:\n");
        for (Track track : personDetectionAnnotation.getTracksList()) {
          VideoSegment segment = track.getSegment();
          System.out.printf(
              "\tStart: %d.%.0fs\n",
              segment.getStartTimeOffset().getSeconds(),
              segment.getStartTimeOffset().getNanos() / 1e6);
          System.out.printf(
              "\tEnd: %d.%.0fs\n",
              segment.getEndTimeOffset().getSeconds(), segment.getEndTimeOffset().getNanos() / 1e6);

          // Each segment includes timestamped objects that include characteristic--e.g. clothes,
          // posture of the person detected.
          TimestampedObject firstTimestampedObject = track.getTimestampedObjects(0);

          // Attributes include unique pieces of clothing, poses (i.e., body landmarks)
          // of the person detected.
          for (DetectedAttribute attribute : firstTimestampedObject.getAttributesList()) {
            System.out.printf(
                "\tAttribute: %s; Value: %s\n", attribute.getName(), attribute.getValue());
          }

          // Landmarks in person detection include body parts.
          for (DetectedLandmark attribute : firstTimestampedObject.getLandmarksList()) {
            System.out.printf(
                "\tLandmark: %s; Vertex: %f, %f\n",
                attribute.getName(), attribute.getPoint().getX(), attribute.getPoint().getY());
          }
        }
      }
    }
  }
}

Node.js

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const gcsUri = 'GCS URI of the video to analyze, e.g. gs://my-bucket/my-video.mp4';

// Imports the Google Cloud Video Intelligence library + Node's fs library
const Video = require('@google-cloud/video-intelligence').v1;
const fs = require('fs');
// Creates a client
const video = new Video.VideoIntelligenceServiceClient();

/**
 * TODO(developer): Uncomment the following line before running the sample.
 */
// const path = 'Local file to analyze, e.g. ./my-file.mp4';

// Reads a local video file and converts it to base64
const file = fs.readFileSync(path);
const inputContent = file.toString('base64');

async function detectPerson() {
  const request = {
    inputContent: inputContent,
    features: ['PERSON_DETECTION'],
    videoContext: {
      personDetectionConfig: {
        // Must set includeBoundingBoxes to true to get poses and attributes.
        includeBoundingBoxes: true,
        includePoseLandmarks: true,
        includeAttributes: true,
      },
    },
  };
  // Detects faces in a video
  // We get the first result because we only process 1 video
  const [operation] = await video.annotateVideo(request);
  const results = await operation.promise();
  console.log('Waiting for operation to complete...');

  // Gets annotations for video
  const personAnnotations =
    results[0].annotationResults[0].personDetectionAnnotations;

  for (const {tracks} of personAnnotations) {
    console.log('Person detected:');

    for (const {segment, timestampedObjects} of tracks) {
      console.log(
        `\tStart: ${segment.startTimeOffset.seconds}` +
          `.${(segment.startTimeOffset.nanos / 1e6).toFixed(0)}s`
      );
      console.log(
        `\tEnd: ${segment.endTimeOffset.seconds}.` +
          `${(segment.endTimeOffset.nanos / 1e6).toFixed(0)}s`
      );

      // Each segment includes timestamped objects that
      // include characteristic--e.g. clothes, posture
      // of the person detected.
      const [firstTimestampedObject] = timestampedObjects;

      // Attributes include unique pieces of clothing, poses (i.e., body
      // landmarks) of the person detected.
      for (const {name, value} of firstTimestampedObject.attributes) {
        console.log(`\tAttribute: ${name}; Value: ${value}`);
      }

      // Landmarks in person detection include body parts.
      for (const {name, point} of firstTimestampedObject.landmarks) {
        console.log(`\tLandmark: ${name}; Vertex: ${point.x}, ${point.y}`);
      }
    }
  }
}

detectPerson();

Python

import io

from google.cloud import videointelligence_v1 as videointelligence


def detect_person(local_file_path="path/to/your/video-file.mp4"):
    """Detects people in a video from a local file."""

    client = videointelligence.VideoIntelligenceServiceClient()

    with io.open(local_file_path, "rb") as f:
        input_content = f.read()

    # Configure the request
    config = videointelligence.types.PersonDetectionConfig(
        include_bounding_boxes=True,
        include_attributes=True,
        include_pose_landmarks=True,
    )
    context = videointelligence.types.VideoContext(person_detection_config=config)

    # Start the asynchronous request
    operation = client.annotate_video(
        request={
            "features": [videointelligence.Feature.PERSON_DETECTION],
            "input_content": input_content,
            "video_context": context,
        }
    )

    print("\nProcessing video for person detection annotations.")
    result = operation.result(timeout=300)

    print("\nFinished processing.\n")

    # Retrieve the first result, because a single video was processed.
    annotation_result = result.annotation_results[0]

    for annotation in annotation_result.person_detection_annotations:
        print("Person detected:")
        for track in annotation.tracks:
            print(
                "Segment: {}s to {}s".format(
                    track.segment.start_time_offset.seconds
                    + track.segment.start_time_offset.microseconds / 1e6,
                    track.segment.end_time_offset.seconds
                    + track.segment.end_time_offset.microseconds / 1e6,
                )
            )

            # Each segment includes timestamped objects that include
            # characteristic - -e.g.clothes, posture of the person detected.
            # Grab the first timestamped object
            timestamped_object = track.timestamped_objects[0]
            box = timestamped_object.normalized_bounding_box
            print("Bounding box:")
            print("\tleft  : {}".format(box.left))
            print("\ttop   : {}".format(box.top))
            print("\tright : {}".format(box.right))
            print("\tbottom: {}".format(box.bottom))

            # Attributes include unique pieces of clothing,
            # poses, or hair color.
            print("Attributes:")
            for attribute in timestamped_object.attributes:
                print(
                    "\t{}:{} {}".format(
                        attribute.name, attribute.value, attribute.confidence
                    )
                )

            # Landmarks in person detection include body parts such as
            # left_shoulder, right_ear, and right_ankle
            print("Landmarks:")
            for landmark in timestamped_object.landmarks:
                print(
                    "\t{}: {} (x={}, y={})".format(
                        landmark.name,
                        landmark.confidence,
                        landmark.point.x,  # Normalized vertex
                        landmark.point.y,  # Normalized vertex
                    )
                )

שפות נוספות

‫PHP: Please follow the PHP setup instructions on the client libraries page and then visit the Video Intelligence מאמרי עזרה for PHP.

‫Ruby: צריך לפעול לפי הוראות ההגדרה של Ruby בדף של ספריות הלקוח ואז לעבור אל מסמך העזר של Video Intelligence ל-Ruby.

זיהוי אנשים קל לארגן דפים בעזרת אוספים אפשר לשמור ולסווג תוכן על סמך ההעדפות שלך.

זיהוי בני אדם מקובץ ב-Cloud Storage

REST

שליחת בקשה להוספת הערה לסרטון

‫Curl (Linux,‏ macOS או Cloud Shell)

‎PowerShell (Windows)

תשובה

קבלת תוצאות של אנוטציות

‫Curl (Linux,‏ macOS או Cloud Shell)

‎PowerShell (Windows)

תשובה

הורדת תוצאות ההערות

Java

Node.js

Python

שפות נוספות

זיהוי בני אדם מקובץ מקומי

REST

שליחת בקשת העיבוד

‫Curl (Linux,‏ macOS או Cloud Shell)

‎PowerShell (Windows)

תשובה

קבלת התוצאות

‫Curl (Linux,‏ macOS או Cloud Shell)

‎PowerShell (Windows)

תשובה

Java

Node.js

Python

שפות נוספות

זיהוי אנשים