What does an autonomous vehicle ‘see’? By William Sachiti, Academy of Robotics

Last Updated on: 22nd November 2023, 06:55 am

Have you ever wondered how an autonomous vehicle sees? How it manages to navigate its way around obstacles and avoid pedestrians – even if one runs out in front of it? And how it moves among the traffic avoiding collisions with other vehicles as they change lanes, make turns and stop/start? Can it spot a cat making a dash for it and avoid it? What about the wind blowing a dustbin into the road? Can the car predict what might happen next, a bit like we do when driving, and anticipate its next move?

To teach an autonomous vehicle to do all these things, we need to start by gathering huge quantities of data. To do this, we use a data-gathering car.

These custom-made vehicles, such as the example above produced by Pilgrim Motorsports with guidance from Academy of Robotics in the UK, carry specialist camera and computing equipment to gather the data an autonomous car needs. The car’s job is to drive around a town capturing visual data in the form of video footage from up to 12 cameras, with a combined 360-degree view around the car, as well as feedback from sensors and infrared detectors. This is all to gain a comprehensive understanding of the road environment and the road’s users, particularly in residential areas.

We then take this data back to a bank of supercomputers, which watch it over and over again to learn. This type of computer science is called machine learning, and here it uses evolutionary neural networks. A neural network is a computer system modelled on the human brain and nervous system, and we run our algorithms on these networks. In this way the algorithms not only learn but also evolve with each iteration. This is not dissimilar to how we, as humans, take driving lessons and learn a little more with each session.

Much like a child is taught at school what objects are, we take images of scenes similar to the roads where the car will drive. In these scenes we mark out what the objects are; we call this annotation. Using machine learning, we feed the annotated data to an algorithm, which begins to compare images and learn the difference between a car, a pedestrian, a cyclist, road, sky, etc. After some time of doing this, with us showing the computer more complex or harder-to-understand scenes, the algorithm eventually figures out the rest by applying what it has been taught to what it sees.
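To make the idea of annotation concrete, here is a minimal sketch in Python. It is not our neural network: it swaps in a much simpler technique (a nearest-centroid classifier) purely to show the pattern of learning from labelled examples and then labelling something new. The `Annotation` record, the box sizes and the function names are all made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from math import dist


@dataclass(frozen=True)
class Annotation:
    """One hand-labelled object: a class name and a pixel bounding box."""
    label: str
    x: int       # left edge of the box
    y: int       # top edge of the box
    width: int
    height: int


def train_centroids(annotations):
    """Average the (width, height) of every annotated box, per label."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for a in annotations:
        s = sums[a.label]
        s[0] += a.width
        s[1] += a.height
        s[2] += 1
    return {label: (w / n, h / n) for label, (w, h, n) in sums.items()}


def classify(centroids, width, height):
    """Return the label whose average box shape is closest to the new box."""
    return min(centroids, key=lambda label: dist(centroids[label], (width, height)))


# Made-up annotations, for illustration only.
ANNOTATED = [
    Annotation("car", 100, 220, 180, 90),
    Annotation("car", 400, 230, 200, 100),
    Annotation("pedestrian", 320, 200, 40, 110),
    Annotation("pedestrian", 600, 210, 36, 100),
    Annotation("cyclist", 500, 205, 70, 120),
    Annotation("cyclist", 50, 215, 80, 130),
]

centroids = train_centroids(ANNOTATED)
print(classify(centroids, 190, 95))   # a wide, low box
print(classify(centroids, 38, 105))   # a narrow, tall box
```

A real system learns from the pixels themselves rather than from box shapes, but the workflow is the same: people label examples, the algorithm generalises from them.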

Now that the algorithm can tell what objects are, we attach multiple cameras looking in all directions. In real time, the algorithm is able to identify pretty much everything that is relevant in a scene. The camera data is interpreted by onboard supercomputers, performing up to 7 trillion calculations per second, to reveal something like the image below.

Understanding what is in the scene is just one small part of the puzzle. The next step is to predict what each person, car, bicycle and traffic light is going to do next. Yes, we are going to predict the potential future of everything in the scene. While this sounds complex at first, if you break it down, it’s actually quite simple.

In the real world, if your smartphone were to slip from your fingers and start to fall, you know it will hit the ground. It is not going to stop all of a sudden and float, or spontaneously shoot up. To you, its falling and hitting the ground is a simple, predictable action with an inevitable result. Similarly, the vehicle is able to see and identify pedestrians, cars, bicycles and so on, then predict multiple realistic potential scenarios and take action based on which of them are more likely to happen.

We are using computer power not only to see everything in the scene but also to predict what everything is likely to do in the next three seconds. While three seconds does not sound like very long to calculate potential futures, from the car’s frame of reference everything is happening very, very slowly. It sees the world at 1000 frames per second. To the car, all the objects on screen are moving as slowly as snails do to you and me.
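The three-second look-ahead and the 1000 frames per second figure can be put into numbers with a small Python sketch. It assumes the simplest possible motion model, an object carrying on at its current speed and direction; the real system weighs up multiple scenarios, and the `Track` record and function names here are hypothetical.

```python
from dataclasses import dataclass

FRAME_RATE_HZ = 1000   # frames per second, as stated in the article
HORIZON_S = 3.0        # prediction window, as stated in the article


@dataclass(frozen=True)
class Track:
    """Hypothetical tracked object: position in metres, velocity in m/s."""
    x: float
    y: float
    vx: float
    vy: float


def metres_per_frame(speed_mps, frame_rate_hz=FRAME_RATE_HZ):
    """How far an object moves between two consecutive frames."""
    return speed_mps / frame_rate_hz


def predict(track, t):
    """Constant-velocity guess at where the object will be after t seconds."""
    return (track.x + track.vx * t, track.y + track.vy * t)


def predicted_path(track, horizon_s=HORIZON_S, step_s=0.5):
    """Predicted positions at regular steps up to the horizon."""
    steps = int(round(horizon_s / step_s))
    return [predict(track, i * step_s) for i in range(1, steps + 1)]


# A made-up cyclist 10 m to the side, riding at 5 m/s along the x axis.
cyclist = Track(x=0.0, y=10.0, vx=5.0, vy=0.0)

print(f"movement per frame: {metres_per_frame(5.0) * 1000:.0f} mm")
for point in predicted_path(cyclist):
    print(point)
```

At 5 m/s the cyclist moves only about 5 mm between frames, which is why the scene looks so slow from the car’s point of view, yet over the three-second window the same cyclist covers 15 m.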

Keeping in mind that there is more than one camera looking around the car, we fuse the findings from each camera, creating a combined view of the world as seen by the car. This combined view gives us a more accurate account of what is happening in the world around it.
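One way to picture this fusion step is the Python sketch below. It assumes each camera has already reported its detections as positions in metres relative to the car, and it simply merges same-label detections that land close together. The merge radius, the data and the function names are illustrative assumptions, not a description of the production method.

```python
from math import dist


def centroid(points):
    """Average a list of (x, y) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def fuse(detections, merge_radius_m=1.0):
    """Merge detections of the same label that fall within merge_radius_m.

    detections: list of (camera_id, label, (x, y)) in the car's own frame.
    Returns a list of (label, averaged position, cameras that saw it).
    """
    fused = []
    for camera_id, label, position in detections:
        for obj in fused:
            same_label = obj["label"] == label
            if same_label and dist(centroid(obj["points"]), position) <= merge_radius_m:
                obj["points"].append(position)
                obj["cameras"].add(camera_id)
                break
        else:
            # No existing object matched, so start a new one.
            fused.append({"label": label, "points": [position], "cameras": {camera_id}})
    return [(o["label"], centroid(o["points"]), sorted(o["cameras"])) for o in fused]


# Made-up detections: two cameras see the same pedestrian, one sees a car.
DETECTIONS = [
    ("front", "pedestrian", (5.0, 2.0)),
    ("left", "pedestrian", (5.4, 2.2)),
    ("rear", "car", (-8.0, 0.0)),
]

combined = fuse(DETECTIONS)
for label, position, cameras in combined:
    print(label, position, cameras)
```

Averaging the two pedestrian sightings gives one object with a steadier position than either camera provides alone, which is the sense in which the combined view is more accurate.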

Lanes and keeping in lane

There is a similar process for a vehicle to know how to keep in lane, where the road is and where it needs to be driving.

The example below shows a vehicle driving through a residential street in the UK. 

As the vehicle needs to give way, it highlights in red the areas where it cannot drive and in green the areas it considers free space on the road. There is an entire algorithm with its own neural network which has been trained to understand just the road, taking into account details like texture, colour, obstructions etc.
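The red and green overlay can be illustrated with a short Python sketch. It assumes the road network has already labelled each cell of a coarse grid over the scene; the labels, the grid and the function names are invented for the example.

```python
# Hypothetical per-cell labels that a road-understanding network might output.
DRIVABLE_LABELS = {"road"}


def colour_free_space(label_grid):
    """Map each labelled cell to 'green' (free road) or 'red' (cannot drive)."""
    return [
        ["green" if label in DRIVABLE_LABELS else "red" for label in row]
        for row in label_grid
    ]


def free_fraction(colour_grid):
    """Share of the grid that is marked as free road."""
    cells = [colour for row in colour_grid for colour in row]
    return cells.count("green") / len(cells)


# A made-up residential street: pavement on the left, parked cars on the
# right, and an oncoming car that the vehicle must give way to.
SCENE = [
    ["pavement", "road", "road", "parked_car"],
    ["pavement", "road", "oncoming_car", "parked_car"],
    ["pavement", "road", "road", "pavement"],
]

overlay = colour_free_space(SCENE)
for row in overlay:
    print(" ".join(f"{colour:5}" for colour in row))
print(f"free space: {free_fraction(overlay):.0%}")
```

In this toy scene the oncoming car turns part of the lane red, leaving a single green column, which is the kind of narrowing that tells the vehicle to give way.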

These are a few of the ways an autonomous car understands the world around it, but there is more. We also have subsystems for reading road markings, reading traffic signs, infrared and more. All these subsystems, each running in its own neural network, are combined to create one super view of the world as the car sees it.

The end result is that some of our test vehicles driven by neural networks are already outperforming their human counterparts in many scenarios. For what we do, autonomous last-mile delivery, we have no need to learn how to drive on every road in the UK; we only need to master specific postcodes for residential last-mile delivery, which is why we are already so close to deployment.

Other vehicles on the road and limitations

The first smartphones were giant bricks which could not do much more than make phone calls. As time went on, they became more advanced and could do more.

Self-driving vehicles are the result of years of computer science, and their arrival is the next step in the evolution of vehicles. First we saw vehicles with cruise control, then cruise control with lane assist, then self-parking, and now we’re moving on to self-driving. The first autonomous cars will do an excellent job of driving themselves on very specific routes. With time, the vehicles will begin to drive more complex roads and routes. Eventually they will connect to each other and share data; it is a step-by-step process.

I predict that we will begin to see passenger-carrying self-driving cars on the roads in earnest by 2020, followed by a period of mass adoption between 2021 and 2025. The first self-driving cars you will see on the road are likely to be autonomous cars which deliver goods and don’t carry people. This is a simple, low-risk start with a valid use. Our own autonomous delivery vehicle, Kar-go, is scheduled for trials later this year.
