
Pose Estimation

Overview

A pose describes the body's position at one moment in time with a set of skeletal landmark points. The landmarks correspond to different body parts such as the shoulders and hips. The relative positions of landmarks can be used to distinguish one pose from another.


Install SDK

You can install our SDK for your device by following the Getting Started guide.

Downloading the Model

Download the CenterNet Keypoints model. You can use any pose estimation model from the EdgeStore.


Running Inference


Place the centernet-keypoints.edgem file in your project and run the following code:

Swift

let modelPath: String = Bundle.main.path(forResource: "centernet-keypoints", ofType: "edgem")!
let model = EdgeModel(modelPath: modelPath)

// Preparing data for model
let image = // your input image

// Run the model on prepared inputs and get recognitions as output
let results: Results = model.run([image])

// Draw the results on your Image or UI 
let drawnImage = results.draw(on: image)

// Use the results in your app logic:
for detection in results.recognitions {

    let label: EEMLabel = detection.label!
    // In this case it is always person
    let name: String = label.name!
    // Probability of detection between 0-1
    let confidence: Float = detection.confidence
    // location of person inside the input image
    let location: CGRect = detection.location
    // skeleton of the person
    let keypoints: Keypoints = detection.keypoints!

    for keypoint: Keypoint in keypoints.points {
        // for example Nose, Shoulders, Elbows...
        let keypointName: String = keypoint.name
        // x and y coordinate of point
        let x: Float = keypoint.x
        let y: Float = keypoint.y
        // confidence between 0-1
        let confidence: Float = keypoint.confidence
    }
}
Objective-C

NSString *modelPath = [[NSBundle mainBundle] pathForResource:@"centernet-keypoints" ofType:@"edgem"];
EdgeModel *model = [[EdgeModel alloc] initWithModelPath:modelPath];

// Preparing data for model        
CVPixelBufferRef image = // your input image

// Run the model on prepared inputs and get results as output
Results* results = [model run:@[ (__bridge id)image]];

// Draw the results on your Image or UI 
[results drawOnPixelBuffer:image];

// Use the results in your app logic:
for (Recognition* detection in results.recognitions) {
    // name of detection, always person in this case
    NSString *name = detection.label.name;
    // Probability of detection between 0-1
    float confidence = detection.confidence;
    // location of person inside the input image
    CGRect location = detection.location;
    // skeleton of the person
    Keypoints* keypoints = detection.keypoints;
    for (Keypoint* keypoint in keypoints.points){
        // for example Nose, Shoulders, Elbows...
        NSString* keypointName = keypoint.name;
        // x and y coordinate of point
        float x = keypoint.x;
        float y = keypoint.y;
        // confidence between 0-1
        float confidence = keypoint.confidence;

    }
}
Kotlin

val model = EdgeModel.fromAsset(context, "centernet-keypoints.edgem")

// Preparing data for model
val image = // your input image

// Run the model on prepared inputs and get recognitions as output
val recognitions: Recognitions = model.run(listOf(image))

// Draw the recognitions on your Image or UI
val drawing: Bitmap = recognitions.drawOnBitmap(image)

// Use the recognitions in your app:
for (detection in recognitions) {
    val id = detection.id!!
    val displayName = detection.displayName!!
    val confidence = detection.confidence!!
    val location = detection.location!!
    val color = detection.color!!
    val keypoints = detection.keypoints!!
}
Java

EdgeModel model = EdgeModel.fromAsset(context, "centernet-keypoints.edgem");

// Preparing data for model
Bitmap image = // your input image

// Run the model on prepared inputs and get recognitions as output
Recognitions recognitions = model.run(Arrays.asList(image));

// Draw the recognitions on your Image or UI
Bitmap drawing = recognitions.drawOnBitmap(image);

// Use the recognitions in your app:
for (Recognition detection : recognitions) {
    var id = detection.id;
    var displayName = detection.displayName;
    var confidence = detection.confidence;
    var location = detection.location;
    var color = detection.color;
    var keypoints = detection.keypoints;
}
Python

from edgeengine import EdgeModel
import matplotlib.pyplot as plt

model = EdgeModel("centernet-keypoints.edgem")
image = #your input np.ndarray image
results = model.run([image])

# draw results on input image
visualization = results.visualize(image)

# show results
plt.imshow(visualization)
plt.show()

# Use results
for detection in results.recognitions:
    class_id = detection.id
    display_name = detection.display_name
    confidence = detection.confidence
    location = detection.location
    color = detection.color

Supported Image Formats

We support the following input formats for images:

  • CVPixelBuffer (Recommended, iOS)

  • UIImage (Slow, iOS)

  • Bitmap (Android)
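
If your frames arrive as a UIImage on iOS, you can convert them to a CVPixelBuffer before inference to avoid the slower path. The helper below is a minimal sketch of one such conversion using Core Graphics; it is not part of the SDK, and the pixel format your pipeline expects may differ.

import UIKit
import CoreVideo

// Sketch only: render a UIImage into a newly allocated 32BGRA CVPixelBuffer.
// Not part of the SDK; adjust the pixel format if your pipeline expects something else.
func makePixelBuffer(from image: UIImage) -> CVPixelBuffer? {
    guard let cgImage = image.cgImage else { return nil }
    let width = cgImage.width
    let height = cgImage.height

    let attributes: [CFString: Any] = [
        kCVPixelBufferCGImageCompatibilityKey: true,
        kCVPixelBufferCGBitmapContextCompatibilityKey: true
    ]
    var buffer: CVPixelBuffer?
    guard CVPixelBufferCreate(kCFAllocatorDefault, width, height,
                              kCVPixelFormatType_32BGRA,
                              attributes as CFDictionary, &buffer) == kCVReturnSuccess,
          let pixelBuffer = buffer else { return nil }

    CVPixelBufferLockBaseAddress(pixelBuffer, [])
    defer { CVPixelBufferUnlockBaseAddress(pixelBuffer, []) }

    guard let context = CGContext(
        data: CVPixelBufferGetBaseAddress(pixelBuffer),
        width: width,
        height: height,
        bitsPerComponent: 8,
        bytesPerRow: CVPixelBufferGetBytesPerRow(pixelBuffer),
        space: CGColorSpaceCreateDeviceRGB(),
        bitmapInfo: CGImageAlphaInfo.premultipliedFirst.rawValue | CGBitmapInfo.byteOrder32Little.rawValue
    ) else { return nil }

    // Draw the image into the pixel buffer's backing memory.
    context.draw(cgImage, in: CGRect(x: 0, y: 0, width: width, height: height))
    return pixelBuffer
}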

Interpreting and Using Results

The output results variable has type Recognitions, which is a kind of List. All the useful information the model provides is packed into this variable; no matter your use case, the output of the model will always be a Recognitions object. You can use the location and keypoints data (inside the for loop) to get the coordinates of each person and their skeleton.
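
For example, building on the Swift snippet above, you can collect each person's skeleton into a simple name-to-point map and skip low-confidence results. This is only a sketch; the 0.5 and 0.3 thresholds are arbitrary values chosen for illustration, not SDK defaults.

// Sketch: collect one [joint name: point] map per sufficiently confident person.
// The 0.5 / 0.3 thresholds are illustrative, not SDK defaults.
var skeletons: [[String: CGPoint]] = []
for detection in results.recognitions {
    guard detection.confidence > 0.5, let keypoints = detection.keypoints else { continue }
    var joints: [String: CGPoint] = [:]
    for keypoint in keypoints.points where keypoint.confidence > 0.3 {
        joints[keypoint.name] = CGPoint(x: CGFloat(keypoint.x), y: CGFloat(keypoint.y))
    }
    skeletons.append(joints)
}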

Coordinate System

Our coordinate system is the same as the one used in UIKit, i.e. the origin is at the top-left corner of the image, with the x axis pointing right and the y axis pointing down.


The recognitions object returned after model execution provides coordinates in the image domain, i.e. 0 < x < input image width and 0 < y < input image height.

You can directly apply transformations to the Recognitions object by using the Recognitions.map function on Android or Recognitions.applying on iOS.
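
If you prefer to map the coordinates yourself, for example to place overlays in a view that shows the whole input image, you can scale the image-domain values directly. In the sketch below, view, imageWidth and imageHeight are placeholders for your own layout, and the code assumes the view displays the full image without cropping.

// Sketch: scale image-domain results into a view's coordinate space.
// `view`, `imageWidth` and `imageHeight` are placeholders for your own layout;
// this assumes the view displays the full image without cropping.
let scaleX = view.bounds.width / CGFloat(imageWidth)
let scaleY = view.bounds.height / CGFloat(imageHeight)
let toViewSpace = CGAffineTransform(scaleX: scaleX, y: scaleY)

for detection in results.recognitions {
    // Bounding box in view coordinates.
    let boxInView = detection.location.applying(toViewSpace)

    // Keypoints in view coordinates.
    if let keypoints = detection.keypoints {
        for keypoint in keypoints.points {
            let pointInView = CGPoint(x: CGFloat(keypoint.x), y: CGFloat(keypoint.y))
                .applying(toViewSpace)
            // ... draw boxInView / pointInView in your overlay ...
        }
    }
}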