Some cool applications with machine learning - フレクトのクラウドblog re:newal

I am 馮志聖 (Mike) from the Research and Development Office.

First, I will talk about a cool application of TensorFlow Lite.

Second, I will talk about running YOLACT: Real-time Instance Segmentation on iOS.

Style Transfer Introduction

TensorFlow Lite has published a new use case on its official website.

It is called style transfer.

It can run on mobile devices.

It applies a chosen style to an input image to create a new artistic image.

You can use it to take artistic photos and post them on social media such as Twitter.

Then you can share the cool photos with other people.

Style Transfer

This is the structure of style transfer.

f:id:fengchihsheng:20200831153555p:plain — style transfer structure

The image is taken from the official GitHub repository.

Style Transfer Demo

It uses a pre-trained model for prediction.

f:id:fengchihsheng:20200831155119g:plain — style transfer demo

Style Transfer Final

It is a cool application for creating new artistic images.

It is also very fast.

However, the only iPhone benchmark on the official website is for the iPhone XS (iOS 12).

https://www.tensorflow.org/lite/models/style_transfer/overview

So I measured the iPhone XS (iOS 13.1.3), the iPhone 11 (iOS 14.0), and the new iPad Pro (iOS 14.0), which has a LiDAR sensor.

I used the same image and the same style for every test.

Here are some results.

The pipeline has 4 steps.

1. Preprocessing. 2. Style prediction. 3. Style transform. 4. Post-processing.
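The four steps above can be timed one by one. The following is only a minimal sketch in Python of that kind of measurement, not the code used for the benchmarks in this post. The stage names come from the list above, but every stage body (`preprocess`, `predict_style`, `transform_style`, `postprocess`) is a hypothetical placeholder that stands in for the real image handling and TensorFlow Lite model calls.

```python
import time
from typing import Callable, Dict, List, Tuple

# A stage is a (name, function) pair. Each function takes and returns a state dict.
Stage = Tuple[str, Callable[[dict], dict]]


def time_stages(stages: List[Stage], state: dict) -> Tuple[dict, Dict[str, float]]:
    """Run the stages in order and record each one's wall-clock time in milliseconds."""
    timings: Dict[str, float] = {}
    for name, fn in stages:
        start = time.perf_counter()
        state = fn(state)
        timings[name] = (time.perf_counter() - start) * 1000.0
    timings["total"] = sum(timings.values())
    return state, timings


# Hypothetical placeholder stages: they only mimic the data flow.
def preprocess(state: dict) -> dict:
    # Scale 8-bit pixel values to the 0..1 range.
    state["content"] = [p / 255.0 for p in state["raw"]]
    return state


def predict_style(state: dict) -> dict:
    # Stand-in for running the style prediction model.
    state["style_vector"] = [0.5]
    return state


def transform_style(state: dict) -> dict:
    # Stand-in for running the style transform model.
    weight = state["style_vector"][0]
    state["stylized"] = [min(1.0, p * weight + 0.25) for p in state["content"]]
    return state


def postprocess(state: dict) -> dict:
    # Convert back to 8-bit pixel values.
    state["output"] = [round(p * 255) for p in state["stylized"]]
    return state


STAGES: List[Stage] = [
    ("Preprocessing", preprocess),
    ("Style prediction", predict_style),
    ("Style transform", transform_style),
    ("Post-processing", postprocess),
]

if __name__ == "__main__":
    _, timings = time_stages(STAGES, {"raw": [0, 128, 255]})
    for name, ms in timings.items():
        print(f"{name}: {ms:.3f} ms")
```

Timing each step separately is what makes it possible to see which step dominates the total, instead of only seeing one number per frame.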

The charts below show the time for all processing steps, run on CPU and on GPU.

f:id:fengchihsheng:20200908113702p:plain — All-processing (CPU) benchmark

f:id:fengchihsheng:20200908113724p:plain — All-processing (GPU) benchmark

In these benchmarks, the iPhone 11 is the fastest on both CPU and GPU.

The iPhone 11 is first, the new iPad Pro is second, and the iPhone XS is third.

But looking only at style prediction and style transform on GPU, the new iPad Pro is faster than the other devices.

This is the demo on the iPhone XS.

f:id:fengchihsheng:20200903105650g:plain — demo for Style transform

On the iPhone XS, each frame takes 490-500 milliseconds.

That is about 2 FPS.

It is slow.

Next.

I tried to make a faster style transfer demo.

Preprocessing is heavy.

So I tried to make it faster.

I pass the camera buffer directly to the next step.

No preprocessing is needed.

The charts below show the time for all processing steps after this change, run on CPU and on GPU.

f:id:fengchihsheng:20200908123113p:plain — Compare All-processing (CPU) benchmark

f:id:fengchihsheng:20200908123132p:plain — Compare All-processing (GPU) benchmark

In these benchmarks, the iPhone 11 is the fastest on CPU.

The new iPad Pro is the fastest on GPU.

The faster demo looks like this.

f:id:fengchihsheng:20200908104042g:plain — More fast demo for Style transform

On the iPhone XS, each frame takes 150 milliseconds.

That is about 6.6 FPS.
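The FPS figures in this section follow directly from the per-frame time. As a small sketch of that arithmetic in Python (the function names `fps_from_latency_ms` and `speedup` are my own, chosen only for illustration):

```python
def fps_from_latency_ms(latency_ms: float) -> float:
    """Convert a per-frame processing time in milliseconds to frames per second."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return 1000.0 / latency_ms


def speedup(before_ms: float, after_ms: float) -> float:
    """How many times faster the 'after' measurement is than the 'before' one."""
    if before_ms <= 0 or after_ms <= 0:
        raise ValueError("latencies must be positive")
    return before_ms / after_ms


if __name__ == "__main__":
    # Per-frame times on the iPhone XS quoted in this post.
    for ms in (490.0, 500.0, 150.0):
        print(f"{ms:.0f} ms per frame -> {fps_from_latency_ms(ms):.2f} FPS")
    # 495 ms is the midpoint of the 490-500 ms range, used here as an assumption.
    print(f"speedup: {speedup(495.0, 150.0):.1f}x")
```

With these numbers, removing the preprocessing step makes the iPhone XS demo a little more than 3 times faster.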

YOLACT Introduction

YOLACT: Real-time Instance Segmentation.

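Some readers may already know about semantic segmentation.

Semantic segmentation assigns a class label to every pixel, so it combines image classification with localization.

Instance segmentation combines object detection with semantic segmentation, so it also separates individual objects of the same class.

See the images below.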

f:id:fengchihsheng:20200831190520p:plain — Comparison of semantic segmentation, classification and localization, object detection and instance segmentation

https://medium.com/datadriveninvestor/deep-learning-for-image-segmentation-d10d19131113

f:id:fengchihsheng:20200831190848j:plain — Compare semantic segmentation and instance segmentation

https://mc.ai/detection-and-segmentation-through-convnets/
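The difference between the two outputs can also be shown with toy data. The sketch below is plain Python with made-up masks, not output from any real model. An instance segmentation result is represented as one binary mask per object, and `to_semantic_map` (a name I chose for illustration) collapses those masks into a single class-ID-per-pixel map, which is what a semantic segmentation result looks like.

```python
from typing import Dict, List, Set

Grid = List[List[int]]
BACKGROUND = 0
PERSON = 1  # hypothetical class ID for this example


def to_semantic_map(instances: List[Dict], height: int, width: int) -> Grid:
    """Collapse per-object binary masks into one class-ID-per-pixel map."""
    semantic = [[BACKGROUND] * width for _ in range(height)]
    for inst in instances:
        for y in range(height):
            for x in range(width):
                if inst["mask"][y][x]:
                    semantic[y][x] = inst["class_id"]
    return semantic


def classes_present(semantic: Grid) -> Set[int]:
    """Return the set of non-background class IDs found in a semantic map."""
    return {c for row in semantic for c in row if c != BACKGROUND}


# Two people standing side by side in a 2x4 image (made-up data).
instances = [
    {"class_id": PERSON, "mask": [[1, 1, 0, 0], [1, 1, 0, 0]]},
    {"class_id": PERSON, "mask": [[0, 0, 1, 1], [0, 0, 1, 1]]},
]

if __name__ == "__main__":
    semantic = to_semantic_map(instances, height=2, width=4)
    print("instance result: ", len(instances), "separate objects")
    print("semantic result: ", semantic)
    print("classes in semantic map:", classes_present(semantic))
```

In the semantic map every pixel is simply labeled as the person class, so the information that there are two people is lost. Instance segmentation keeps one mask per person.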

YOLACT

This is the structure of YOLACT.

f:id:fengchihsheng:20200831192227p:plain — YOLACT Structure

This is a graph of speed on a PC.

In this graph, YOLACT is in the real-time area.

f:id:fengchihsheng:20200902084534p:plain — YOLACT speed on pc

This graph does not include mobile results.

That is because some of the methods do not run on mobile.

So how fast is YOLACT on iOS?

First.

How do you use a YOLACT model with Core ML?

Follow the URL below.

Then run onnx_to_coreml.py from that GitHub repository.

https://github.com/Ma-Dan/yolact/tree/coreml

When the conversion finishes, add the converted model to your Xcode project and use it.

YOLACT Demo

This is the demo running on the iPhone XS.

It only reaches 1.5-1.8 FPS.

f:id:fengchihsheng:20200831192600g:plain — YOLACT demo

YOLACT Final

YOLACT is fast on a PC, and it also runs on iOS.

But it cannot run in real time on mobile.

Real-time processing needs more than 30 FPS.

I think it can only be used for business cases that do not need real-time processing.

One such business case is scanning multiple barcodes at once.
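To put the gap to real time in numbers, here is a short Python sketch of the arithmetic. It uses the 30 FPS threshold and the 1.5-1.8 FPS measurement from this post. The helper names (`frame_budget_ms`, `required_speedup`, `is_real_time`) are hypothetical and only for illustration.

```python
REAL_TIME_FPS = 30.0  # real-time threshold used in this post


def frame_budget_ms(target_fps: float = REAL_TIME_FPS) -> float:
    """Time available per frame, in milliseconds, at the target frame rate."""
    if target_fps <= 0:
        raise ValueError("target_fps must be positive")
    return 1000.0 / target_fps


def required_speedup(measured_fps: float, target_fps: float = REAL_TIME_FPS) -> float:
    """How many times faster processing must become to reach the target frame rate."""
    if measured_fps <= 0:
        raise ValueError("measured_fps must be positive")
    return target_fps / measured_fps


def is_real_time(measured_fps: float, target_fps: float = REAL_TIME_FPS) -> bool:
    """True when the measured frame rate is above the real-time threshold."""
    return measured_fps > target_fps


if __name__ == "__main__":
    print(f"budget per frame at 30 FPS: {frame_budget_ms():.1f} ms")
    # YOLACT on the iPhone XS, as measured in this post.
    for fps in (1.5, 1.8):
        print(f"{fps} FPS: real time = {is_real_time(fps)}, "
              f"needs {required_speedup(fps):.1f}x speedup")
```

At 1.5-1.8 FPS the model would need to become roughly 17 to 20 times faster to cross the threshold, which is why I only suggest it for cases that do not need real-time processing.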

Other

Tell us what you think about our results, or anything else that comes to mind.

We welcome all of your comments and suggestions.
