Pose Estimation
Overview
A pose describes the body's position at one moment in time with a set of skeletal landmark points. The landmarks correspond to different body parts such as the shoulders and hips. The relative positions of landmarks can be used to distinguish one pose from another.
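For instance, the angle at a joint can be computed from the relative positions of three landmarks and used to tell poses apart. A minimal Python sketch (the coordinates and the 120° threshold are illustrative, not part of the SDK):

```python
import math

def joint_angle(a, b, c):
    """Angle at point b (in degrees) formed by the segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norm = math.hypot(*v1) * math.hypot(*v2)
    return math.degrees(math.acos(dot / norm))

# Shoulder, elbow, and wrist landmarks in image coordinates (illustrative values).
shoulder, elbow, wrist = (100, 100), (150, 150), (200, 100)
angle = joint_angle(shoulder, elbow, wrist)

# An extended arm has an elbow angle near 180 degrees; a bent arm, much less.
arm_is_bent = angle < 120
```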
Install SDK
You can install our SDK for your device by following the Getting Started guide.
Downloading the Model
Download the Centernet Keypoints model. You can use any pose detection model from the EdgeStore.
Running Inference
Place the centernet-keypoints.edgem file in your project and run the following code:
Swift
let modelPath: String = Bundle.main.path(forResource: "centernet-keypoints", ofType: "edgem")!
let model = EdgeModel(modelPath: modelPath)
// Preparing data for the model
let image = // your input image
// Run the model on the prepared inputs and get recognitions as output
let results: Results = model.run([image])
// Draw the results on your image or UI
let drawnImage = results.draw(on: image)
// Use the results in your app logic:
for detection in results.recognitions {
    let label: EEMLabel = detection.label!
    // In this case it is always "person"
    let name: String = label.name!
    // Probability of the detection, between 0 and 1
    let confidence: Float = detection.confidence
    // Location of the person inside the input image
    let location: CGRect = detection.location
    // Skeleton of the person
    let keypoints: Keypoints = detection.keypoints!
    for keypoint: Keypoint in keypoints.points {
        // For example Nose, Shoulders, Elbows...
        let keypointName: String = keypoint.name
        // x and y coordinates of the point
        let x: Float = keypoint.x
        let y: Float = keypoint.y
        // Confidence between 0 and 1
        let confidence: Float = keypoint.confidence
    }
}
Objective-C
NSString *modelPath = [[NSBundle mainBundle] pathForResource:@"centernet-keypoints" ofType:@"edgem"];
EdgeModel *model = [[EdgeModel alloc] initWithModelPath:modelPath];
// Preparing data for the model
CVPixelBufferRef image = // your input image
// Run the model on the prepared inputs and get results as output
Results *results = [model run:@[(__bridge id)image]];
// Draw the results on your image or UI
[results drawOnPixelBuffer:image];
// Use the results in your app logic:
for (Recognition *detection in results.recognitions) {
    // Name of the detection, always "person" in this case
    NSString *name = detection.label.name;
    // Probability of the detection, between 0 and 1
    float confidence = detection.confidence;
    // Location of the person inside the input image
    CGRect location = detection.location;
    // Skeleton of the person
    Keypoints *keypoints = detection.keypoints;
    for (Keypoint *keypoint in keypoints.points) {
        // For example Nose, Shoulders, Elbows...
        NSString *keypointName = keypoint.name;
        // x and y coordinates of the point
        float x = keypoint.x;
        float y = keypoint.y;
        // Confidence between 0 and 1
        float keypointConfidence = keypoint.confidence;
    }
}
Kotlin
val model = EdgeModel.fromAsset(context, "centernet-keypoints.edgem")
// Preparing data for the model
val image = // your input image
// Run the model on the prepared inputs and get recognitions as output
val recognitions: Recognitions = model.run(listOf(image))
// Draw the recognitions on your image or UI
val drawing: Bitmap = recognitions.drawOnBitmap(image)
// Use the recognitions in your app:
for (detection in recognitions) {
    val id = detection.id!!
    val displayName = detection.displayName!!
    // Probability of the detection, between 0 and 1
    val confidence = detection.confidence!!
    // Location of the person inside the input image
    val location = detection.location!!
    val color = detection.color!!
    // Skeleton of the person
    val keypoints = detection.keypoints!!
}
Java
EdgeModel model = EdgeModel.fromAsset(context, "centernet-keypoints.edgem");
// Preparing data for the model
Bitmap image = // your input image
// Run the model on the prepared inputs and get recognitions as output
Recognitions recognitions = model.run(Arrays.asList(image));
// Draw the recognitions on your image or UI
Bitmap drawing = recognitions.drawOnBitmap(image);
// Use the recognitions in your app:
for (Recognition detection : recognitions) {
    var id = detection.id;
    var displayName = detection.displayName;
    // Probability of the detection, between 0 and 1
    var confidence = detection.confidence;
    // Location of the person inside the input image
    var location = detection.location;
    var color = detection.color;
    // Skeleton of the person
    var keypoints = detection.keypoints;
}
Python
from edgeengine import EdgeModel
import matplotlib.pyplot as plt

model = EdgeModel("centernet-keypoints.edgem")
image = # your input np.ndarray image
results = model.run([image])
# Draw the results on the input image
visualization = results.visualize(image)
# Show the results
plt.imshow(visualization)
plt.show()
# Use the results
for detection in results.recognitions:
    class_id = detection.id
    display_name = detection.display_name
    confidence = detection.confidence
    location = detection.location
    color = detection.color
Supported Image Formats
We support the following input formats for images:
- CVPixelBuffer (recommended, iOS)
- UIImage (slow, iOS)
- Bitmap (Android)
Interpreting and Using Results
The output results variable has type Recognitions, which is a kind of List. All the useful information the model provides is packed into this variable. No matter your use case, the output of the model will always be a Recognitions object. You can use the location and keypoints data (inside the for loop) to get the coordinates of each person and their skeleton.
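A common first step is to discard low-confidence detections before using their locations and skeletons. A sketch in plain Python, with dictionaries standing in for the SDK's Recognition objects and an illustrative threshold:

```python
# Plain-Python stand-ins for Recognition objects (illustrative values only).
recognitions = [
    {"confidence": 0.92, "location": (34, 50, 120, 260), "keypoints": [("Nose", 80, 60)]},
    {"confidence": 0.31, "location": (300, 40, 90, 240), "keypoints": [("Nose", 340, 55)]},
]

# Keep only confident detections before using their locations and skeletons.
MIN_CONFIDENCE = 0.5
people = [r for r in recognitions if r["confidence"] >= MIN_CONFIDENCE]

for person in people:
    x, y, w, h = person["location"]          # bounding box of the person
    for name, kx, ky in person["keypoints"]:  # skeleton landmarks
        pass  # e.g. draw the keypoint at (kx, ky)
```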
Coordinate System
Our coordinate system is the same as the one used in UIKit, i.e. the origin is at the top-left corner of the image, with the x axis pointing right and the y axis pointing down.
The recognitions object returned after model execution provides coordinates in the image domain, i.e. 0 < x < input image width and 0 < y < input image height.
You can directly apply transformations to the Recognitions object by using the Recognitions.map function on Android or Recognitions.applying on iOS.
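To illustrate the kind of transformation you would pass to those functions, the sketch below scales a keypoint from image-domain coordinates into a view of a different size. This is plain Python with hypothetical sizes, showing only the underlying arithmetic, not the SDK API:

```python
def to_view_coords(point, image_size, view_size):
    """Scale a point from image-domain coordinates into view coordinates.

    Both coordinate systems follow the UIKit convention: origin at the
    top-left corner, x to the right, y downward.
    """
    ix, iy = point
    iw, ih = image_size
    vw, vh = view_size
    return (ix * vw / iw, iy * vh / ih)

# A keypoint at (320, 240) in a 640x480 input image lands at (160.0, 120.0)
# when the image is displayed in a 320x240 view.
view_point = to_view_coords((320, 240), (640, 480), (320, 240))
```

Note that this simple scaling assumes the view preserves nothing about aspect ratio; if your view letterboxes the image, you would also need to apply the corresponding offset.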