This is Shuochen Wang from the R&D department at Flect. In this blog, I am going to explain how to improve the hand gesture stability of Apple's Vision framework by using filters.
The Vision framework performs face and face landmark detection, text detection, barcode recognition, image registration, and general feature tracking. Vision also allows the use of custom Core ML models for tasks like classification or object detection. In this blog, we will use its body and hand pose detection capability: specifically, the hand pose detection request, which returns the coordinates of the finger joints.
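As a quick illustration of how those coordinates are obtained, here is a minimal sketch using `VNDetectHumanHandPoseRequest`. The function name `detectIndexFingertip` and the confidence threshold are my own illustrative choices, not code from the experiment.

```swift
import Vision
import CoreGraphics

// Minimal sketch: detect one hand in a CGImage and read the index fingertip.
func detectIndexFingertip(in image: CGImage) throws -> CGPoint? {
    let request = VNDetectHumanHandPoseRequest()
    request.maximumHandCount = 1  // track a single hand

    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])

    guard let observation = request.results?.first else { return nil }

    // Joint names such as .indexTip identify individual finger joints.
    let indexTip = try observation.recognizedPoint(.indexTip)

    // Discard low-confidence points; 0.3 is an arbitrary illustrative threshold.
    guard indexTip.confidence > 0.3 else { return nil }

    // Vision returns normalized coordinates (0...1), origin at the lower left.
    return indexTip.location
}
```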
In my previous blog, I explained how to manipulate an AR object using hand gestures. We used the Vision framework to obtain the hand position. Because we do not use actual sensors or trackers that can measure the finger positions accurately, the measurements inevitably contain errors.
In this blog, I have performed an experiment to examine how filters can reduce the uncertainty in the Vision framework's hand pose detection. Before explaining the experiment, I would like to demonstrate the problem with the current hand gesture detection.
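For context on what "filter" means here, the sketch below shows one simple example of the general idea: a low-pass (exponential moving average) filter that smooths a stream of detected points. The struct name and the `alpha` parameter are illustrative assumptions, not the specific filters evaluated in the experiment.

```swift
import CoreGraphics

/// A simple low-pass (exponential moving average) filter for 2D points.
/// `alpha` close to 1 trusts new measurements; close to 0 smooths heavily.
struct LowPassPointFilter {
    let alpha: CGFloat
    private var smoothed: CGPoint?

    mutating func filter(_ point: CGPoint) -> CGPoint {
        guard let previous = smoothed else {
            smoothed = point  // first sample passes through unchanged
            return point
        }
        let next = CGPoint(x: alpha * point.x + (1 - alpha) * previous.x,
                           y: alpha * point.y + (1 - alpha) * previous.y)
        smoothed = next
        return next
    }
}
```

Feeding each detected fingertip position through such a filter trades a little latency for much steadier output, which is exactly the trade-off the experiment below investigates.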