Teaching Robots how to Drive Vehicles in a Few Steps

Many things would be different if the robot was capable of learning through observing demonstrations. For example, you would introduce it to tasks in the same way you do to new employees. Your self-driving vehicle could also learn road safety practices from observing you as you drive around.

Extensive Research

Researchers have developed a system that allows a robot to learn sophisticated tasks autonomously from various demonstrations. These demonstrations do not have to be perfect. The system operates by analyzing the value of every demonstration, allowing the robot to learn from both the successes and mistakes it observes.

Current advanced methods require up to 100 demonstrations to master a particular task. However, with the new method, robots can learn from a few demonstrations. Further, robots can learn more intuitively just as humans learn from one another. For example, you can watch your colleague execute a task both imperfectly and perfectly then try it on your own.

According to Aniruddh Puranic, a computer specialist, “Many machine learning and reinforcement learning systems require large amounts of data and hundreds of demonstrations — you need a human to demonstrate over and over again, which is not feasible. Also, most people don’t have the programming knowledge to explicitly state what the robot needs to do, and a human cannot possibly demonstrate everything that a robot needs to know. What if the robot encounters something it hasn’t seen before? This is a key challenge.”

Learning through Demonstrations

Learning through demonstrations is fast becoming an ideal strategy when it comes to acquiring robust robot control guidelines. These guidelines regulate robot movements for intricate tasks. Apart from being prone to imperfections, there are safety concerns associated with learning through demonstrations. This is because robots can learn unacceptable and unsafe actions. It is worth mentioning that demonstrations vary.

Some of them can exhibit the desired action better compared to others. Again, the value of the exhibitions usually depends on how skilled the individual conducting the demonstration is.

Addressing Arising Issues

Researchers used signal temporal logic (STL) to analyze the value of demonstrations and rate them automatically to generate essential rewards. This would mean that even when some elements of the demonstrations weren’t sensible according to the logic concerns, this method would facilitate the robot’s learning from the imperfect components. The system guarantees a certain percentage of a demonstration’s success and accuracy.

Stefanos Nikolaidis, a computer science assistant professor says; “Let’s say robots learn from different types of demonstrations — it could be a hands-on demonstration, videos, or simulations — if I do something that is very unsafe, standard approaches will do one of two things: either, they will completely disregard it, or even worse, the robot will learn the wrong thing. In contrast, in a very intelligent way, this work uses some common sense reasoning in the form of logic to understand which parts of the demonstration are good and which parts are not. In essence, this is exactly what also humans do.”

In the case of a driving exhibition where a demonstrator misses a stop sign, the system would rate the action lower than one by an experienced drier. However, if the driver executes an intelligent action during the demonstration, such as applying brakes to avoid an accident, then the robot will learn from the smart action.

Conforming to Human Desires

STL (signal temporal logic is an articulate numerical symbolic language that facilitates robotic reasoning regarding prevailing and future consequences. Past research in this field utilized linear temporal logic, but Jyo Deshmukh, a one-time engineer at Toyota, says STL is more practical in this case.

He says “When we go into the world of cyber-physical systems, like robots and self-driving cars, where time is crucial, linear temporal logic becomes a bit cumbersome, because it reasons about sequences of true/false values for variables, while STL allows reasoning about physical signals.”

Testing the System

Researchers used a Minecraft-design game simulator to test the STL system. However, they mentioned that driving simulators or videos could also be used. Studies and trials are still ongoing, with researchers hoping to test the system on actual robots. They noted that the STL approach is ideal for applications where while maps have been determined, these are huge barriers.