Flect Cloud Blog re:newal

We have moved again, this time from http://blog.flect.co.jp/cloud/

Social Distancing Based on ARKit and CoreML

I am 馮 志聖 (Mike) from the Research and Development Office.

Introduction

The coronavirus is spreading across the world.

Social distancing is very important.

What is social distancing?

In public health, social distancing, also called physical distancing, is a set of non-pharmaceutical interventions or measures intended to prevent the spread of a contagious disease by maintaining a physical distance between people and reducing the number of times people come into close contact with each other.

It typically involves keeping a certain distance from others (the distance specified may differ from time to time and country to country) and avoiding gathering together in large groups.

https://en.wikipedia.org/wiki/Social_distancing

Many companies are using technology to support social distancing.

One of them is Google.

Google released a tool named Sodar.

Sodar - use WebXR to help visualise social distancing guidelines in your environment.

Using Sodar on supported mobile devices, create an augmented reality two meter radius ring around you.

sodar.withgoogle.com

This tool is interesting and useful for public health.

I want to build on this idea.

I want to make it smarter, so that the user can detect when someone is getting closer.

So I built a tool that detects people and measures the distance to them.

It uses Core ML with ARKit on iOS.

It runs on the new iPad Pro released in 2020.

The new iPad Pro has a LiDAR sensor.

A LiDAR sensor is well suited to measuring distance.

CoreML

What is CoreML?

Core ML is the foundational machine learning framework from Apple that builds on top of Accelerate, BNNS, and Metal Performance Shaders.

It provides machine learning models that can be integrated to iOS applications and supports image analyses, natural language processing, audio to text conversion, and sound analysis.

Applications can take advantage of Core ML without the need to have a network connection or API calls because the Core ML framework works using on-device computing.

https://golden.com/wiki/Core_ML

This is Apple's official machine learning website.

https://developer.apple.com/machine-learning/

In this case I use object detection, and the model is YOLOv3-Tiny.

This is Apple's official page for Core ML models.

https://developer.apple.com/machine-learning/models/
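A general-purpose object detector reports many classes, but this tool only cares about people. The following is a minimal sketch of that filtering step, not the app's actual code. The `Detection` struct, the `"person"` label string, and the 0.5 confidence cutoff are assumptions made for illustration. In a real app the label and confidence would come from the detector's output.

```swift
import Foundation

// Hypothetical, simplified stand-in for one detector result.
struct Detection {
    let label: String       // class name, for example "person"
    let confidence: Double  // 0.0 ... 1.0
}

// Keep only detections labeled "person" at or above the cutoff.
// Both the label string and the default cutoff are assumptions.
func confidentPeople(in detections: [Detection],
                     minConfidence: Double = 0.5) -> [Detection] {
    return detections.filter { $0.label == "person" && $0.confidence >= minConfidence }
}

let sample = [
    Detection(label: "person", confidence: 0.91),
    Detection(label: "chair", confidence: 0.88),
    Detection(label: "person", confidence: 0.32),
]
let people = confidentPeople(in: sample)
print(people.count)
```

Raising `minConfidence` reduces false alarms but can miss people who are partly hidden, so the cutoff is a tuning choice.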

Object detection

Object detection is a computer technology related to computer vision and image processing that deals with detecting instances of semantic objects of a certain class (such as humans, buildings, or cars) in digital images and videos.

For more information, see this URL.

https://en.wikipedia.org/wiki/Object_detection

YOLO

You only look once (YOLO) is an object detection system targeted for real-time processing.

For more detail, see these URLs.

This is the original You only look once (YOLO) paper.

https://arxiv.org/pdf/1506.02640.pdf

This is the YOLOv3 paper.

https://pjreddie.com/media/files/papers/YOLOv3.pdf

This is an explanation of YOLO, YOLOv2 and YOLOv3.

https://medium.com/@jonathan_hui/real-time-object-detection-with-yolo-yolov2-28b1b93e2088

This is the official YOLO website.

https://pjreddie.com/darknet/yolo/

ARKit

In June 2017 Apple released the ARKit API tool for developers working on virtual reality and augmented reality applications.

The ARKit Tool is designed to accurately map the surrounding using SLAM (Simultaneous Localization and Mapping).

https://xinreality.com/wiki/ARKit

In this case I use ARKit's hitTest to get the distance based on a point in the view.

https://developer.apple.com/documentation/arkit/arscnview/2875544-hittest

This is the official ARKit website.

https://developer.apple.com/jp/augmented-reality/arkit/
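hitTest needs a point in view coordinates, while an object detector usually reports a bounding box. The sketch below shows one possible way to pick that point: take the center of a normalized bounding box and map it into the view. It assumes the box uses a bottom-left origin with values from 0 to 1 (the convention Vision uses), that the camera image fills the view, and that both have the same orientation. The function name `viewCenter` is hypothetical, and whether the app hit-tests the box center is also an assumption.

```swift
import Foundation

// Map the center of a normalized bounding box (origin bottom-left, 0...1)
// to view coordinates (origin top-left, in points).
func viewCenter(ofNormalizedBox box: CGRect, viewSize: CGSize) -> CGPoint {
    let midX = box.origin.x + box.size.width / 2
    let midY = box.origin.y + box.size.height / 2
    // Flip the y axis because the view's origin is at the top-left.
    return CGPoint(x: midX * viewSize.width,
                   y: (1.0 - midY) * viewSize.height)
}

let box = CGRect(x: 0.25, y: 0.5, width: 0.5, height: 0.25)
let center = viewCenter(ofNormalizedBox: box,
                        viewSize: CGSize(width: 800, height: 600))
print(center.x, center.y)
```

On a device, a point like this could be passed to `hitTest(_:types:)` on the `ARSCNView`, and the `distance` property of the first `ARHitTestResult` gives the distance from the camera in meters.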

Demo

This is the flow chart.

[Figure: flow chart]

It looks like this.

[Figure: social distance]

This is the demo.

In this demo you can see that it detects each person and the distance to them.

If the distance is less than 2 meters, the person is shown in red and a warning tone plays.

If the distance is more than 2 meters, the person is shown in green.

The real-world demo has sound, but the GIF files here do not.
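The 2 meter rule from the demo can be sketched as a small function. This is a minimal illustration under stated assumptions, not the app's code. The names `ProximityStatus`, `proximityStatus`, and `shouldPlayWarning` are hypothetical. Treating a distance of exactly 2 meters as safe, and playing the warning when at least one person is too close, are also assumptions.

```swift
import Foundation

// Display state for one detected person.
enum ProximityStatus {
    case tooClose  // shown in red, with the warning tone
    case safe      // shown in green
}

// Threshold used in the demo: 2 meters.
let safeDistanceMeters = 2.0

// Classify one measured distance.
// Treating exactly 2 m as safe is an assumption.
func proximityStatus(forDistance meters: Double,
                     threshold: Double = safeDistanceMeters) -> ProximityStatus {
    return meters < threshold ? .tooClose : .safe
}

// With several people in view, warn if at least one of them is too close.
func shouldPlayWarning(distances: [Double]) -> Bool {
    return distances.contains { proximityStatus(forDistance: $0) == .tooClose }
}

let measured = [3.4, 1.2, 2.0]
for distance in measured {
    print(distance, proximityStatus(forDistance: distance))
}
print(shouldPlayWarning(distances: measured))
```

Keeping the threshold as a parameter makes it easy to follow guidelines that differ by country.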

[Figure: Demo]

It also supports multiple people.

[Figure: Demo]

Final

I think this makes it easy to tell whether someone is near you.

It is much like autonomous driving.

The difference is that the car is replaced by a human.

The detection target is a human too.

YOLOv3-Tiny inference is very fast.

It runs at 30 FPS on the new iPad Pro.

Other

Tell us what you think about our results, or anything else that comes to mind.

We welcome all of your comments and suggestions.