UWindsor researcher teaching computers to see like humans

Dibyendu Mukherjee is trying to perfect a computer’s ability to see and identify objects the way the human eye can. His research could help in designing a self-driving car or improving medical treatment.

“This research is proving invaluable in solving various vision-related problems, including traffic monitoring, surveillance, as well as medical image analysis,” says Dr. Mukherjee.

Using sensors and cameras, Mukherjee, who recently earned his PhD in electrical engineering at UWindsor, captures video or still images. The data is fed into a computer, which segments the images. His algorithms then train the computer to identify objects in the segmented video or still image.

In the Computer Vision and Sensing Systems Laboratory in the Centre for Engineering Innovation, Mukherjee also uses high-end stereo cameras, which take two images at a time to help build a three-dimensional image, and range cameras, which use infrared light to capture depth information. He has also been experimenting with lower-end cameras to cut costs.
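To illustrate how two images taken side by side can yield depth, here is a minimal sketch of the standard textbook relation for a rectified stereo pair: depth equals focal length times the distance between the cameras, divided by the disparity (how far a point shifts between the two images). The function name and the sample numbers are illustrative assumptions, not details of the lab's equipment or of Mukherjee's method.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Estimate the distance to a point seen by a rectified stereo pair.

    focal_length_px: camera focal length, in pixels (assumed known from calibration)
    baseline_m:      distance between the two camera centres, in metres
    disparity_px:    horizontal shift of the point between left and right images
    """
    if disparity_px <= 0:
        # Zero disparity means the point is effectively at infinity.
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical numbers: an 800-pixel focal length and cameras 0.5 m apart.
near = depth_from_disparity(800.0, 0.5, 40.0)  # large shift: close object
far = depth_from_disparity(800.0, 0.5, 4.0)    # small shift: distant object
print(near, far)
```

The smaller the shift between the two images, the farther away the object, which is why a stereo pair can recover three-dimensional structure from flat pictures.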

“The basic goal of all our work is to simplify processes, and that starts with finding cheaper instruments,” he says. “I captured video using a laptop camera, the kind you’d use for Skyping, and it was successful. Our robust algorithm still works with those cheaper cameras.”

The process works in real time, and his published findings are available through IEEE Xplore, the IEEE’s digital library. Mukherjee says real-time video is tricky because the computer has to identify moving objects and people as events take place.

“Generally, there’s movement in the foreground but the background is stable,” he says. “Practically speaking, the real challenge lies with movement in the background, like running water or swaying trees, but my algorithm can ignore all unwanted motion.”
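One common way to tolerate a moving background, sketched below, is to keep a running average and a running variance of each pixel's brightness and to flag a pixel as foreground only when it strays far outside its own usual range. This is a generic textbook technique shown for illustration, not Mukherjee's published algorithm. The class name, parameters, and brightness values are all hypothetical.

```python
class PixelBackgroundModel:
    """Running Gaussian model of one pixel's brightness (illustrative only)."""

    def __init__(self, first_value, learning_rate=0.05,
                 threshold_sigmas=3.0, min_std=2.0):
        self.mean = float(first_value)
        self.var = min_std ** 2
        self.min_var = min_std ** 2
        self.learning_rate = learning_rate
        self.threshold_sigmas = threshold_sigmas

    def update(self, value):
        """Blend a new background observation into the mean and variance."""
        a = self.learning_rate
        diff = value - self.mean
        self.mean += a * diff
        self.var = (1 - a) * self.var + a * diff * diff

    def is_foreground(self, value):
        """True if the value is far outside this pixel's usual range."""
        std = max(self.var, self.min_var) ** 0.5
        return abs(value - self.mean) > self.threshold_sigmas * std


# A steady pixel (say, pavement) and a flickering one (say, a swaying branch).
steady = PixelBackgroundModel(100)
flicker = PixelBackgroundModel(100)
for frame in range(200):
    steady.update(100)
    flicker.update(100 if frame % 2 == 0 else 130)

print(steady.is_foreground(130))   # a jump of 30 at a normally steady pixel
print(flicker.is_foreground(130))  # the same value where flicker is routine
print(flicker.is_foreground(250))  # a much brighter object passing through
```

In this toy setup, a reading of 130 counts as foreground at the steady pixel but as ordinary background at the flickering one, because each pixel is judged against its own history. A full system would run one such model per pixel and would need safeguards this sketch omits, such as handling lighting changes.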

Ignoring background motion would be especially useful for cameras mounted above traffic signals. Mukherjee says that in the future, real-time images collected from these cameras and sensors could help calculate the risk of an accident and, in turn, notify drivers. He has also collaborated with a major automotive industry partner to improve GPS localization using stereo cameras.

“High-tech cameras can also take pictures of the eye or other human body parts that can be analyzed for damaged tissues,” he says. “The computer can be trained by our algorithms to identify and differentiate between a normal tissue structure and a damaged one. This could mean avoiding surgery or catching a disease while it’s still early enough to cure.”

All of Mukherjee’s published graduate research, completed under the supervision of electrical and computer engineering professor and Canada Research Chair Jonathan Wu, is available as open source so others can use the data in their own research.