Artificial Intelligence Project: Pose Detection

Thenjiwe Kubheka · Published in Level Up Coding · Jan 4, 2022


I get asked a lot how to use AI to detect certain postures, and whether it is possible to infer emotions from them.

Well, my colleague (Gugu Sibanyoni) and I decided to play around with AI tracking.

The end goal is using AI to recognize posture and, eventually, emotions.

Setting Up Our Environment

We will be using OpenCV for the image recognition and MediaPipe for our posture recognition models.

!pip install mediapipe opencv-python

We will only be using these two dependencies, so it is time to move on to setting up our real-time feed in OpenCV.

To set up our real-time feed, we need to import our dependencies:

import mediapipe as mp
import cv2

Let's begin by setting up MediaPipe:

mp_drawing = mp.solutions.drawing_utils
mp_holistic = mp.solutions.holistic

So we have set up our drawing utilities, which will draw the different detections from our holistic model to the screen via OpenCV.

Then mp_holistic simply brings in our holistic model.

Inside MediaPipe we have a variety of models we can play around with, namely:

Face Mesh, Iris, Hands, Pose, Holistic, Selfie Segmentation.
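
For illustration, here is roughly how a few of these solutions are exposed in the Python package (this uses the classic mp.solutions API; availability of individual solutions can vary by mediapipe version):

import mediapipe as mp

face_mesh = mp.solutions.face_mesh              # dense face landmark mesh
hands = mp.solutions.hands                      # per-hand landmarks
pose = mp.solutions.pose                        # full-body pose landmarks
holistic = mp.solutions.holistic                # face + hands + pose combined
selfie_seg = mp.solutions.selfie_segmentation   # person/background mask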

Now we want to set up OpenCV for real-time imaging:

cap = cv2.VideoCapture(0)
while cap.isOpened():
    ret, frame = cap.read()
    cv2.imshow('Real Time Imaging', frame)

    if cv2.waitKey(10) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()

In the first line of code we declare cap as our variable and assign it cv2.VideoCapture (cv2 being the alias for calling OpenCV), choosing 0 as the default camera port.
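
As a side note, if your machine has more than one camera, a different index selects a different device; a minimal sketch (the fallback logic here is our own addition, not from the tutorial):

cap = cv2.VideoCapture(1)      # 1 selects a second camera, if attached
if not cap.isOpened():
    cap = cv2.VideoCapture(0)  # fall back to the default camera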

We begin our while loop by telling OpenCV that while cap, our capture, is open, it should keep reading frames, with cap.read() returning a status flag (ret) and the frame itself.

Then we use the imshow function to render each frame and label the window 'Real Time Imaging'.

Our conditional if statement handles the break: we tell cv2 to wait 10 milliseconds for a keypress, and when q is pressed, we break out of the while loop that is rendering the frames.

Then we release the capture and destroy all windows.

Sometimes the q key inside the if statement does not break the loop, so we can also run the following two lines on their own, just to make sure the capture is properly shut down.

cap.release()
cv2.destroyAllWindows()
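
If you want that cleanup to run no matter how the loop exits, one defensive variant (our own sketch, not part of the original tutorial) wraps the loop in try/finally and also bails out when a frame fails to read:

cap = cv2.VideoCapture(0)
try:
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:  # camera unplugged or stream ended
            break
        cv2.imshow('Real Time Imaging', frame)
        if cv2.waitKey(10) & 0xFF == ord('q'):
            break
finally:
    cap.release()            # always free the camera
    cv2.destroyAllWindows()  # always close the windows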

Now we want to overlay our MediaPipe holistic components onto the OpenCV feed.

We start by copying the OpenCV code into the next Jupyter cell, then we add the holistic code to it.

cap = cv2.VideoCapture(0)
with mp_holistic.Holistic(min_detection_confidence=0.5, min_tracking_confidence=0.5) as holistic:
    while cap.isOpened():
        ret, frame = cap.read()
        image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        results = holistic.process(image)
        print(results)
        cv2.imshow('Real Time Imaging', frame)

        if cv2.waitKey(10) & 0xFF == ord('q'):
            break
cap.release()
cv2.destroyAllWindows()

In the new code we have added, we use the with statement together with the mp_holistic module we imported earlier. Within the mp_holistic.Holistic class we set both min_detection_confidence and min_tracking_confidence to 0.5.

with mp_holistic.Holistic(min_detection_confidence=0.5, min_tracking_confidence=0.5) as holistic:

We alias the whole expression as holistic so we do not have to type it out every time.

For a model with higher tracking confidence, we would set min_detection_confidence and min_tracking_confidence to higher values.
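
For example, a stricter configuration might look like this (the 0.8 values are illustrative, not a recommendation from the tutorial):

with mp_holistic.Holistic(min_detection_confidence=0.8,
                          min_tracking_confidence=0.8) as holistic:
    # Higher thresholds reject weaker detections, trading responsiveness
    # for steadier, more confident tracking.
    ...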

We have also added the following lines of code:

image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
results = holistic.process(image)
print(results)

In the first line of code we recolor our image: we grab the frame from above and convert what the camera returns using cv2.COLOR_BGR2RGB, because we want the color passed to the model to be represented in RGB (OpenCV captures in BGR).

We take the recolored image, pass it to the holistic model, and assign the output to a variable called results. Then we print out the results, since we are not yet drawing anything to the screen.

When we run our code we will see a pop-up window.

Then a printout of what we need appears, just to check that we are still on track with everything.
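
If you want to dig into that printout, each detection comes back as a landmark list you can index. A small sketch, assuming a pose was actually detected in the frame:

if results.pose_landmarks:  # None when nothing was detected
    nose = results.pose_landmarks.landmark[mp_holistic.PoseLandmark.NOSE]
    print(nose.x, nose.y, nose.z, nose.visibility)  # normalized coordinates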

Now that we have checked that all is working OK, we can go ahead and draw these detections to our screen, starting with the face landmarks. We remove the print and add the draw call:

image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
mp_drawing.draw_landmarks(image, results.face_landmarks, mp_holistic.FACEMESH_TESSELATION)
cv2.imshow('Real Time Imaging', frame)

The first line of code, image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR), converts the image back from RGB (which we converted to above) to BGR, as this is how OpenCV wants its images for rendering.

Then we go ahead and draw using mp_drawing.draw_landmarks, passing in the image variable from above, the results from face_landmarks, and the mp_holistic.FACEMESH_TESSELATION connection set.

Since we are now drawing onto the image variable, we also want to render it instead of the raw frame, so we change the argument in cv2.imshow from frame to image.

So we have all of our different face landmarks drawn to the screen, which is pretty cool and very creepy: whichever direction I move my face, it keeps tracking it.

Now we want to draw the rest of our landmarks: the pose, right hand, and left hand.

mp_drawing.draw_landmarks(image, results.face_landmarks, mp_holistic.FACEMESH_TESSELATION)
mp_drawing.draw_landmarks(image, results.right_hand_landmarks, mp_holistic.HAND_CONNECTIONS)
mp_drawing.draw_landmarks(image, results.left_hand_landmarks, mp_holistic.HAND_CONNECTIONS)
mp_drawing.draw_landmarks(image, results.pose_landmarks, mp_holistic.POSE_CONNECTIONS)
cv2.imshow('Real Time Imaging', image)

As seen above, the model tracks my face, right and left hands, and my pose, very accurately and very fast. With the pose tracking we can also track the full body.

Well, we have the white and red defaults, and we want to color our landmarks a bit; we will do that using the DrawingSpec helper.

mp_drawing.DrawingSpec(color=(0,0,255), thickness=2, circle_radius=2)

We are able to pass the color, thickness, and circle radius; one spec styles the landmark dots and another styles the connecting lines.
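
In draw_landmarks the two specs can also be passed by keyword, which makes it explicit that the first styles the landmark dots and the second styles the connecting lines (the colors here are just examples):

mp_drawing.draw_landmarks(
    image, results.pose_landmarks, mp_holistic.POSE_CONNECTIONS,
    landmark_drawing_spec=mp_drawing.DrawingSpec(color=(0,0,255), thickness=2, circle_radius=2),
    connection_drawing_spec=mp_drawing.DrawingSpec(color=(0,255,0), thickness=2, circle_radius=2)
)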


We will do this for face, pose and hands.

cap = cv2.VideoCapture(0)
# Initiate holistic model
with mp_holistic.Holistic(min_detection_confidence=0.5, min_tracking_confidence=0.5) as holistic:

    while cap.isOpened():
        ret, frame = cap.read()

        # Recolor Feed
        image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        # Make Detections
        results = holistic.process(image)
        # print(results.face_landmarks)

        # face_landmarks, pose_landmarks, left_hand_landmarks, right_hand_landmarks

        # Recolor image back to BGR for rendering
        image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)

        # 1. Draw face landmarks
        mp_drawing.draw_landmarks(image, results.face_landmarks, mp_holistic.FACEMESH_TESSELATION,
                                  mp_drawing.DrawingSpec(color=(80,110,10), thickness=1, circle_radius=1),
                                  mp_drawing.DrawingSpec(color=(80,256,121), thickness=1, circle_radius=1)
                                  )

        # 2. Right hand
        mp_drawing.draw_landmarks(image, results.right_hand_landmarks, mp_holistic.HAND_CONNECTIONS,
                                  mp_drawing.DrawingSpec(color=(80,22,10), thickness=2, circle_radius=4),
                                  mp_drawing.DrawingSpec(color=(80,44,121), thickness=2, circle_radius=2)
                                  )

        # 3. Left Hand
        mp_drawing.draw_landmarks(image, results.left_hand_landmarks, mp_holistic.HAND_CONNECTIONS,
                                  mp_drawing.DrawingSpec(color=(121,22,76), thickness=2, circle_radius=4),
                                  mp_drawing.DrawingSpec(color=(121,44,250), thickness=2, circle_radius=2)
                                  )

        # 4. Pose Detections
        mp_drawing.draw_landmarks(image, results.pose_landmarks, mp_holistic.POSE_CONNECTIONS,
                                  mp_drawing.DrawingSpec(color=(245,117,66), thickness=2, circle_radius=4),
                                  mp_drawing.DrawingSpec(color=(245,66,230), thickness=2, circle_radius=2)
                                  )

        cv2.imshow('Raw Webcam Feed', image)
        if cv2.waitKey(10) & 0xFF == ord('q'):
            break

cap.release()
cv2.destroyAllWindows()

It looks much nicer in color, and this forms the first stage of our tracking.
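
As a sketch of where this could go next (our own illustration, not part of this tutorial): the landmark coordinates can be flattened into a single feature row, which is the kind of input you would feed a classifier trained to recognize postures or, eventually, emotions.

import numpy as np

def landmarks_to_row(results):
    # Flatten pose landmarks into [x1, y1, z1, v1, x2, ...] for a classifier.
    if not results.pose_landmarks:
        return None  # nothing detected in this frame
    return np.array([[lm.x, lm.y, lm.z, lm.visibility]
                     for lm in results.pose_landmarks.landmark]).flatten()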

