I am 馮 志聖 (Mike) from the Research and Development Office.
Contents
- Background
- Abstract
- Twilio version
- ARWorldMap version
- Real-time DB version
- Conclusion
- Other
- Reference
Background
The coronavirus pandemic continues to take a heavy toll, and many people now work from home for safety reasons.
Remote work has become common, and video chat tools have become a focus of attention. However, video chat tools have limitations that make them unsuitable for some tasks. For example, during a maintenance task it is hard to point out an exact location to the other person.
Our goal is to overcome this limitation and make remote work more convenient, so we developed an AR Remote Instructions program.
Abstract
We built three versions of AR Remote Instructions.
(1) Twilio version
First, we extended a video chat tool with AR functions. We chose the third-party library Twilio to develop the extension.
Benefits: Easy to use; the AR functions can be used during a video call.
Drawbacks: All devices can only view the camera of one device. If one party's network is slow, instructions can end up at the wrong location.
(2) ARWorldMap version
Second, we addressed the shortcomings of the Twilio version by synchronizing all devices to the same ARWorldMap in ARKit, so that every device displays the same set of AR instructions.
Benefits: Addresses the shortcomings of the Twilio version.
Drawbacks: Whenever one device uploads new information, the instructions temporarily stored on the other devices disappear.
(3) Real-time DB version
Third, we addressed the shortcomings of the ARWorldMap version. All devices first load the same ARWorldMap as their initial point, and the instructions temporarily stored on each device are then uploaded to a cloud server. Each new instruction updates only a small part of the data, so it does not affect the instructions temporarily stored on other devices.
Benefits: Addresses the shortcomings of the ARWorldMap version.
Which one is the best solution? It depends on the use case.
If the network environment is stable, (1) is a good solution because it does not need target objects in each real environment.
If not, (3) is the better solution.
Twilio version
Introduction
What is Twilio?
Twilio is an American cloud communications platform as a service (CPaaS) company based in San Francisco, California.
Twilio allows software developers to programmatically make and receive phone calls, send and receive text messages, and perform other communication functions using its web service APIs.
https://en.wikipedia.org/wiki/Twilio
Why choose Twilio?
1. Development speed:
Twilio already provides a complete library that supports iOS and is distributed through CocoaPods. Most importantly, it supports real-time data streaming, which I use to transmit the AR location information.
2. High-quality connections:
Twilio's service is rated at 99.95% in its SLA (Service Level Agreement).
That is a high figure, and it gives callers good reason to expect that their conversation will not be interrupted for technical reasons.
3. Free trial:
I think this is very important for developers, especially when developing new products. When it is not yet clear whether a third-party library is compatible with your own product, the free trial period lets you test it.
The following article has more information about Twilio.
https://agilie.com/en/blog/why-you-need-twilio-to-build-a-high-power-communication-platform
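To put the 99.95% figure in perspective, the short Swift sketch below converts a percentage into the downtime it would still allow. It assumes the figure means availability over a period, which the text does not state explicitly. The function name and the 30-day and 365-day periods are illustrative choices, not part of any Twilio API.

```swift
import Foundation

/// Minutes of downtime that a given availability percentage still permits
/// over a period of `periodDays` days. (Hypothetical helper for illustration.)
func allowedDowntimeMinutes(availabilityPercent: Double, periodDays: Double) -> Double {
    let minutesInPeriod = periodDays * 24 * 60
    return minutesInPeriod * (100 - availabilityPercent) / 100
}

// Assumption: 99.95% is read as availability over the whole period.
let perMonth = allowedDowntimeMinutes(availabilityPercent: 99.95, periodDays: 30)
let perYear = allowedDowntimeMinutes(availabilityPercent: 99.95, periodDays: 365)
print(String(format: "%.1f minutes per 30 days, %.1f hours per year", perMonth, perYear / 60))
```

Under that reading, 99.95% leaves room for roughly 21.6 minutes of downtime in a 30-day month, or about 4.4 hours in a year.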
Overview
This picture shows the flow of the entire system.
The point information is the coordinate of the position where a finger touched the screen. In this example there are three devices, A, B and C, each of which can be a customer or a supporter.
The application has two scenes.
In both scenes, device A is the customer and needs help from supporters B and C. All devices show customer A's screen, and all point information is displayed on customer A's device.
Scene 1: All devices start the software, and we focus on customer A's screen. Customer A's screen is sent to the Twilio server, and supporters B and C receive it and display it on their own screens. Everything customer A does, including drawing, is therefore shared with supporters B and C.
Scene 2: When supporters B and C perform actions, including drawing, the coordinates they touch on their screens are sent to the Twilio server and then forwarded to customer A. When customer A receives the coordinate data, it renders the points on its own device, and the rendered screen is then shared with supporters B and C.
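The point information exchanged in Scene 2 could be represented roughly as in the sketch below. This is not the actual message format of the app and it does not call the Twilio SDK. `PointMessage`, `makeMessage` and `position(of:)` are hypothetical names. The sketch also assumes that coordinates are normalized to the range 0 to 1 so that devices with different screen sizes agree on the position, and that the shared video fills each view with the same aspect ratio.

```swift
import Foundation

// Hypothetical message format for the "point information" (not Twilio's API).
struct PointMessage: Codable, Equatable {
    let senderID: String   // e.g. "B" for supporter B
    let x: Double          // horizontal position, 0.0 (left) to 1.0 (right)
    let y: Double          // vertical position, 0.0 (top) to 1.0 (bottom)
}

/// Converts a touch in the sender's view (in points) to a size-independent message.
func makeMessage(senderID: String, touchX: Double, touchY: Double,
                 viewWidth: Double, viewHeight: Double) -> PointMessage {
    return PointMessage(senderID: senderID, x: touchX / viewWidth, y: touchY / viewHeight)
}

/// Maps a received message back to points in the receiver's view.
func position(of message: PointMessage,
              viewWidth: Double, viewHeight: Double) -> (x: Double, y: Double) {
    return (x: message.x * viewWidth, y: message.y * viewHeight)
}

// Supporter B touches the centre of a 400 x 800 view.
let sent = makeMessage(senderID: "B", touchX: 200, touchY: 400,
                       viewWidth: 400, viewHeight: 800)
let payload = try JSONEncoder().encode(sent)   // bytes to hand to the data channel

// Customer A decodes the bytes and places the point in its own 300 x 600 view.
let received = try JSONDecoder().decode(PointMessage.self, from: payload)
let placed = position(of: received, viewWidth: 300, viewHeight: 600)
print(placed)
```

Sending a few numbers per touch keeps the message small, but the position is only correct if the video frame the supporter touched is still the frame the customer is showing, which is why network delay matters in this version.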
Demo
Note: In all demos, the devices are connected to different networks.
Demo 1:
The Twilio version places 3D arrows and shares them with other devices. In this demo, the shared view is the screen of the device on the right.
Demo 2:
The Twilio version runs the drawing function and shares the result with other devices. In this demo, the shared view is the screen of the device on the left.
Demo 3:
The Twilio version runs the distance detection function and shares the result with other devices. In this demo, the shared view is the screen of the device on the left.
Discussion
This version has the following shortcomings:
- Only one device's screen can be shared at a time.
- Smoothness depends on network speed. If the connection is slow, the following problems occur:
  - The drawing function does not run smoothly.
  - Screen data arrives with a delay.
  - Actions are displayed at the wrong location on the screen.
These shortcomings can be improved as follows:
Store all actions on the local device, upload them to the server, and then share them with the other devices, so that all actions are updated on the server and on all devices in real time. I will introduce the details in the next section.
ARWorldMap version
Introduction
What is ARWorldMap?
ARWorldMap is an ARKit class used to share AR data, including locations and 3D models. For details, please refer to the link below.
https://developer.apple.com/documentation/arkit/arworldmap
What is AR data?
The process is shown in the following picture.
How is it shared?
The official library only supports sharing over a local network; it does not support devices on different networks.
For details, please refer to the following link.
Because of this limitation, we share and update the ARWorldMap between the server and the devices in real time to achieve our goal.
Overview
This picture shows the flow of the entire system.
In this example there are three devices, A, B and C, each of which can be a customer or a supporter.
The application has one scene.
In this scene, device A is the customer and needs support from B and C. With the help of the author, all devices have the same ARWorldMap and render it on their own screens.
Scene 1: All devices start the software. Suppose supporter B uses the drawing function to mark a target location and uploads it to the server for a real-time update. Customer A and supporter C receive supporter B's update and display it on their own devices. However, customer A and supporter C are also using the drawing function to mark other target locations at the same time. Because supporter B's update arrives while they are drawing, the marks temporarily stored on customer A's and supporter C's devices disappear.
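The disappearing marks in Scene 1 can be reproduced with a small model. The sketch below is not ARKit code. `Device` and `applyFullSnapshot` are hypothetical, and a list of strings stands in for the contents of an ARWorldMap. It assumes that an update replaces a device's whole local state, which matches the behaviour described above.

```swift
import Foundation

// Hypothetical stand-in for a device's copy of the shared map: a list of mark names.
struct Device {
    let name: String
    var marks: [String]

    /// Replaces the whole local state with the server's snapshot,
    /// which is how a full-map update behaves in this sketch.
    mutating func applyFullSnapshot(_ snapshot: [String]) {
        marks = snapshot
    }
}

var serverSnapshot: [String] = []
var deviceA = Device(name: "A", marks: [])
var deviceB = Device(name: "B", marks: [])

// Customer A draws a mark locally but has not uploaded it yet.
deviceA.marks.append("mark-from-A")

// Supporter B draws a mark and uploads its whole map.
deviceB.marks.append("mark-from-B")
serverSnapshot = deviceB.marks

// A receives B's update as a full replacement, so A's own mark is lost.
deviceA.applyFullSnapshot(serverSnapshot)
print(deviceA.marks)   // prints ["mark-from-B"]
```

Because the unit of synchronization is the entire map, any mark that exists only on the receiving device is overwritten, however small the sender's change was.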
Demo
Note: In all demos, the devices are connected to different networks.
Both devices have the same ARWorldMap and render it on their own screens, and the drawing function is used to mark the target location.
Discussion
This version resolves the following issues:
- Only one device's screen can be shared.
- Slow network speed causes stuttering.
However, one disadvantage remains:
When one device uploads an update, the marks temporarily stored on the other devices disappear.
This shortcoming can be improved as follows:
We first tried to decompose the ARWorldMap data, but we could not find any official documentation on how to do so. Instead, we built a real-time update server that stores the ARWorldMap and the action data. Each device reads the ARWorldMap as its initial point, and only the small part of the action data that changed is updated in real time and rendered on each device.
I will introduce the details in the next section.
Real-time DB version
Introduction
What is Real-time DB?
A real-time database is a database system which uses real-time processing to handle workloads whose state is constantly changing.
For details, please refer to the following link.
https://en.wikipedia.org/wiki/Real-time_database
Overview
This picture shows the flow of the entire system.
In this example there are three devices, A, B and C, each of which can be a customer or a supporter.
The application has one scene.
In this scene, device A is the customer and needs support from B and C. With the help of the author, all devices load the same ARWorldMap as their initial location and render it on their own screens. Each device uploads only the actions it performs to the server, and these are pushed to the other devices in real time.
Scene 1: All devices start the software. Suppose customer A uses the drawing function to mark a target location and uploads only that small piece of data to the server. Supporters B and C watch the server for new data, receive the update in real time, and render it on their own devices. This update does not prevent supporters B and C from using the drawing function to mark other target locations, and the same applies to every other device.
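The partial update in Scene 1 can be sketched as a merge keyed by instruction id. This is a minimal model under assumptions, not the app's actual data layout or any real-time database API. `Instruction`, `Device`, `draw` and `applyRemote` are hypothetical names, and positions are assumed to be expressed relative to the shared ARWorldMap origin.

```swift
import Foundation

// Hypothetical record for one drawn instruction; the field names are illustrative.
struct Instruction: Codable, Equatable {
    let id: String          // unique per instruction, e.g. a UUID string
    let author: String
    let position: [Double]  // x, y, z relative to the shared ARWorldMap origin
}

struct Device {
    var instructions: [String: Instruction] = [:]

    /// Adds a locally drawn instruction and returns the small delta to upload.
    mutating func draw(_ instruction: Instruction) -> Instruction {
        instructions[instruction.id] = instruction
        return instruction
    }

    /// Merges one remote instruction without touching anything else.
    mutating func applyRemote(_ instruction: Instruction) {
        instructions[instruction.id] = instruction
    }
}

var deviceA = Device()
var deviceB = Device()

// B has an unsynced local mark while A draws and uploads a new one.
_ = deviceB.draw(Instruction(id: "b-1", author: "B", position: [0.2, 0.0, -0.5]))
let delta = deviceA.draw(Instruction(id: "a-1", author: "A", position: [0.0, 0.1, -1.0]))

// Only the delta travels through the server; B merges it by id.
deviceB.applyRemote(delta)
print(deviceB.instructions.keys.sorted())   // prints ["a-1", "b-1"]
```

Since each update only adds or replaces the entry with its own id, a device's unsynced marks survive incoming updates from other devices.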
Demo
Note: In all demos, the devices are connected to different networks.
Both devices load the same ARWorldMap as their initial location and render it on their own screens. Each device uses the drawing function to mark a target location and uploads it to the server, which synchronizes the update to the other device in real time.
Discussion
This solves all of the shortcomings above, but one limitation remains: the target objects must be present in each real environment.
This limitation can be improved as follows:
We are trying an approach closer to virtual reality, which I will explain next time.
Conclusion
These charts compare the response time of the three versions.
Comparison 1:
Response time when running locally.
Comparison 2:
Response time when running in the cloud.
I am very glad to have had this opportunity to develop the AR Remote Instructions software. Almost everyone now has a smartphone and Internet access, and the rapid development of technology keeps making life more convenient. In the future, AR, VR and MR will be a focus of development. These technologies will help various industries save resources, including time, space and manpower, so it is an honor for me to learn these cutting-edge technologies.
Other
Tell us what you think about our results, or anything else that comes to mind.
We welcome all of your comments and suggestions.