It's a hot summer day, and your eyes spot an ice cream cart up ahead.
Without even really thinking, you start walking in that direction.
Planetary scientists would like to give robots that kind of visual
recognition - not for getting ice cream, but for finding scientifically
interesting targets. Currently, rovers and other space vehicles are largely dependent on commands from their human controllers back on Earth. But to decide what commands to send, operators must wait to receive images and other pertinent information from the spacecraft. Because rovers don't have powerful antennas, this so-called downlink usually takes a long time. The data bottleneck means rovers often "twiddle their thumbs" between commands.
"Our goal is to make smart instruments that can do more within each command cycle," says David Thompson of the
Jet Propulsion Laboratory in Pasadena, Calif.
Thompson is heading a project called TextureCam, which involves creating
a computer vision package that can map a surface by identifying
geological features. It is primarily envisioned for a rover, but it
could also benefit a spacecraft visiting an asteroid or an aerobot
hovering in the atmosphere of a distant world.
With funds from NASA's Astrobiology Science and Technology for Exploring Planets (ASTEP) program, Thompson's team is currently refining its computer algorithm, with an eventual plan to build a prototype instrument that can map an astrobiologically relevant field site.
Roam rover, roam rover
Rovers have already made great advances in autonomy. Current prototypes
can travel as much as a kilometer on their own using on-board navigation
software. This allows these vehicles to cover much more territory than they could if every move were directed from Earth.
But one concern is that a rover may literally drive over a potentially valuable piece of
scientific
real estate and not even realize it. Giving a rover some rudimentary visual identification capabilities could help it avoid missing "the needle in the haystack," as Thompson calls the hidden clues that astrobiologists hope to uncover on other planets.
"If the rover can make simple distinctions, we can speed up the
reconnaissance," he says. As it drives along, the rover could snap
several images and use on-board software to prioritize which images to
downlink to Earth.
And while waiting for its next set of commands, it could pick a
potentially interesting geological feature and then drive up close to
take a detailed picture or even perform some simple chemical analysis.
"You could start the next day with the instrument sitting in front of a
prime location," Thompson says.
Instead of spending time trying to get the rover from point A to point
B, mission controllers could concentrate on doing the higher-level scientific investigation that the rover can't do. At least, not yet.
"The field being investigated by David Thomson is vital to cope with the
flood of remote sensing data returned from spacecraft," says Anthony
Cook of Aberystwyth University in the UK, who is not involved with
TextureCam.
There are other projects working on computer vision for rovers. In 2010, the
Mars rover Opportunity
received a software upgrade called AEGIS that can identify
scientifically interesting rocks. A project in Chile's Atacama Desert used a similar rock-detection system on a rover called Zoe. And
ESA's ExoMars mission is developing computer vision that can detect
objects in the rover's vicinity.
TextureCam differs from these other efforts in that it maps the surface rather than trying to isolate particular objects. It's a more
general strategy that can identify terrain characteristics, such as
weathering or fracturing.
Recognizing a rock face
The new approach by Thompson's group focuses on the "texture" of an
image, which is computer vision terminology for the statistical patterns
that exist in an array of pixels. The same kind of image analysis is used in more familiar day-to-day applications.
For example, the web is inundated with huge
photo archives
that haven't been sorted in any systematic way. Several companies are
developing "search engines" that can identify objects in digital images.
If you were looking for, say, an image with a "blue dog" or a
"telephone booth," these programs could sift through a collection of
photos to find the ones that match those criteria.
Additionally, many digital cameras detect faces in the camera frame and
automatically adjust the focus depending on how far away the faces are.
And some new video game consoles have sensors to detect the bodily pose
of a game player.
What all these technologies have in common is a sophisticated analysis
of image pixels. The relevant software programs typically look for
signals in the variations of brightness or the shades of color that are
characteristic of a telephone or a face or a rock.
These signals often have little to do with the way we might describe these objects.
"The software identifies statistical properties that might not be obvious to the human eye," Thompson says.
Let the computer do the guesswork
In the case of TextureCam, the computer program takes a small patch, or thumbnail, from within the image and performs a number of different
pixel-to-pixel comparisons. Which comparisons? Actually, the computer
decides.
"We train the system from examples," Thompson explains. They take images that were previously analyzed by a
geologist
as having an outcrop or a sediment or a rock of a particular variety.
The computer program compares its pixel analysis to these labels and
builds a decision tree (or a more elaborate "decision forest") that best
discriminates between the different possibilities.
"These decision trees can be quite efficient even after just a few branches," Thompson says.
This so-called "machine learning" has advantages over other techniques
that construct a visual model of what the computer should be looking
for.
"The disadvantage with visual models is that you have to build a new
rule for every new thing you want to identify," Thompson says. It can be
hard for humans to find reliable distinctions that can help a computer.
It makes more sense to let the computer go out and explore the
possibilities with trial and error.
"The system trains itself, so we don't have to anticipate," Thompson says.
The "training regimen" for TextureCam began with a set of images from Mars and is now moving onto photos from the Mojave Desert.
The team plans to integrate its algorithm into a field-programmable gate array (FPGA), which is basically a special-purpose computer that would connect directly to a rover camera. This would allow TextureCam to work faster, without relying on the rover's main computer.

"Computers and software are not ready to take over the interpretation tasks of human geologists, but they will help to pre-sort and pre-identify regions of interest, thus reducing the amount of remote sensing data that geologists must examine," Cook says.