We’ve been tackling buzzwords in the tech industry recently, because a certain trend occurs once a term is coined: everyone uses it without fully getting it, and that causes misinformation, confusion, and sometimes fake news. This time around we are looking at the term computer vision.
When you look at an image of a crowd, your brain can immediately figure out who is a familiar face and who is a stranger, who is a man or a woman, who is a child or an adult, and roughly what someone’s ethnicity is. You can also see the clothing people are wearing, who looks put together and who does not, and what time of day or season it is, depending on the foreground and lighting.
A computer can look at the same image and see nothing, if we deem it so. With computer vision, though, it can recognize and identify all the faces, estimate the ages of everyone in the picture, and even estimate everyone’s ethnicity. It may have a harder time determining the season and time of day because of the shadows, lighting, and shapes, but crowd analytics, verification, and recognition are a breeze.
What is Computer Vision?
The short definition: computer vision is when a computer or machine has sight.
To get a little more technical, computer vision is the process of recording and playing back light fragments.
Turns Out SEEING Is Actually Really Hard
Is that really seeing? Some would argue no, as seeing includes processing these images in our brains into thoughts. These thoughts can translate into emotions, decisions, ideas, etc. However, computer vision paired with certain algorithms (i.e., machine learning and deep learning) can allow a machine to recognize images, interpret what it sees, and in some cases even learn.
At the beginning, though, we talked about the picture of a crowd and how a human could see beyond the crowd, understanding more about the scenery or the people in it. That’s what makes seeing so difficult: the knowledge and breadth that come with it. Computers can’t do that.
Computer vision does a great job at seeing what we tell it to see, unlike human vision, which can see many things in detail and interpret all the information at once. But when we tell a computer to see something, and we code it the right way, it can see it better than almost any human on earth.
Kairos' computer vision and machine learning algorithms are designed to detect and recognize (human) faces in nearly all video and image formats - Learn more about Kairos' face recognition features. (Image: © 2017 Marvel Studio)
The History Of Computer Vision
It might amaze you to know that computer vision was in the works for decades before Snapchat graced our phones. That means people in the 1950s understood the importance of computer vision before they knew all the ways in which we could use it.
Back in the day
A long time ago, from the late 50s into the late 60s, computer scientists started to tackle the idea of computer vision. They wanted to teach computers to predict what a photograph could contain. For example, a human face has two eyes, a mouth, a nose, and two ears, so if a computer identified those features, the photograph must have had a person in it.
However, this project failed as the technology just wasn’t there yet. Too many other factors could be at play in a photo and throw the whole system off, and no one could figure out how to put something like that to use.
In the 70s similar projects were started, and progress was made in the way computers interpreted certain images. Nothing groundbreaking happened, yet by the 80s computers could see shapes through mathematical methods. This changed everything, because by seeing shapes computers could finally identify patterns.
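As a rough illustration of what seeing shapes through mathematical methods can mean, here is a minimal Python sketch. It slides a small grid of weights (a Sobel kernel, a standard edge filter, and not necessarily what systems of that era used) across a tiny made-up grayscale image and reports where the brightness changes sharply. The image data, function names, and threshold are all assumptions chosen for illustration.

```python
# A tiny grayscale "image": 0 = dark, 9 = bright. (Made-up data.)
IMAGE = [
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
]

# Sobel kernel that responds to left-to-right brightness changes.
SOBEL_X = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]


def filter_image(image, kernel):
    """Slide the kernel over every position where it fully fits."""
    k = len(kernel)
    rows = len(image) - k + 1
    cols = len(image[0]) - k + 1
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            # Weighted sum of the k x k patch under the kernel.
            total = sum(
                kernel[i][j] * image[r + i][c + j]
                for i in range(k)
                for j in range(k)
            )
            row.append(total)
        out.append(row)
    return out


def edge_columns(response, threshold):
    """Return output columns where every row responds strongly."""
    cols = len(response[0])
    return [
        c for c in range(cols)
        if all(abs(row[c]) >= threshold for row in response)
    ]


response = filter_image(IMAGE, SOBEL_X)
for row in response:
    print(row)
print("strong response at output columns:", edge_columns(response, 20))
```

The flat dark and flat bright regions produce a response of zero, while the positions straddling the dark-to-bright boundary produce large values. That boundary is the simplest kind of shape a program can pick out from raw pixel values.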
Into the 90s
By the 90s facial recognition was a tool being used in government programs through Convolutional Neural Networks (CNNs). CNNs tried to process images in the same way the human brain does, by teaching and learning. Images were given labels, and through equations computers could start classifying the images by those labels.
Still, we weren’t there yet, and once again the technology was at a standstill. By the early 2000s government computer scientists started to crack the code, as they had the computer processing power to do so, and started to work on facial recognition. In 2012 researchers at the University of Toronto created AlexNet, which was trained on about 1.2 million labeled images drawn from the roughly 15 million in the ImageNet database and learned to sort them into 1,000 categories, changing the world of computer vision.
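The idea of learning from labeled images can be sketched without a neural network at all. The Python snippet below is a deliberately simple stand-in: a nearest-centroid classifier, not a CNN. It averages the example images for each label and then assigns a new image the label of the closest average. The 2x2 "images", the labels, and the function names are invented for illustration.

```python
# Labeled training "images": each is a flat list of 4 pixel values (2x2).
# All data here is invented for illustration.
TRAINING = [
    ([9, 9, 0, 0], "top-bright"),
    ([8, 9, 1, 0], "top-bright"),
    ([0, 0, 9, 9], "bottom-bright"),
    ([1, 0, 9, 8], "bottom-bright"),
]


def learn_centroids(examples):
    """Average the pixel values of all examples sharing a label."""
    sums, counts = {}, {}
    for pixels, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(pixels)
            counts[label] = 0
        sums[label] = [s + p for s, p in zip(sums[label], pixels)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def classify(pixels, centroids):
    """Pick the label whose average image is closest (squared distance)."""
    def distance(label):
        return sum((p - c) ** 2 for p, c in zip(pixels, centroids[label]))
    return min(centroids, key=distance)


centroids = learn_centroids(TRAINING)
print(classify([7, 8, 2, 1], centroids))
print(classify([2, 1, 7, 9], centroids))
```

A real CNN learns far richer features than a per-label average, but the workflow is the same in outline: labeled examples go in, numbers are fitted to them, and new images are classified by those numbers.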
Before AlexNet, 1 in every 4 images was incorrectly identified.
After AlexNet, 1 in every 7 images was incorrectly identified.
Today it is fewer than 1 in every 25 images, according to Google’s Inception.
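To compare these figures on one scale, the "1 in N" phrasing can be converted to approximate error rates. This small Python sketch uses only the three ratios quoted above; the labels and the function name are made up for illustration.

```python
# "1 in N images misidentified" figures quoted in the text above.
MISSES = {
    "before AlexNet": 4,
    "after AlexNet": 7,
    "today (upper bound)": 25,
}


def error_rate_percent(one_in_n):
    """Convert '1 in N wrong' into an error percentage."""
    return 100.0 / one_in_n


for era, n in MISSES.items():
    print(f"{era}: about {error_rate_percent(n):.1f}% of images wrong")
```

Put this way, the error rate fell from about 25% to about 14% with AlexNet, and to under 4% afterwards.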
Tesla's 'Autopilot' feature uses computer vision via eight surround cameras. This provides 360 degrees of visibility around the car at up to 250 meters of range. It's a great example of how computer vision is becoming part of everyday life. (Image: Tesla © 2017)
Why Is Computer Vision Important?
At Kairos we use computer vision for face recognition, identification, verification, emotion analysis, and crowd analytics. Without it our business would not exist, so it is extremely important to us.
Computer vision is also great for:
- Optical Character Recognition (OCR): Recognizing and identifying text in documents; a scanner does this.
- Vision Biometrics: Recognizing people who have gone missing through their iris patterns.
- Object Recognition: Great for retail and fashion, finding products in real time based on an image or scan.
- Special Effects: Motion capture and shape capture; think any movie with CGI.
- 3-D Printing and Image Capture: Used in movies, architectural structures, and more.
- Sports: When additional lines are drawn on the field during a game, yup, that’s computer vision.
- Social Media: Anything with a story that allows you to wear something on your face.
- Smart Cars: Through computer vision they can identify objects and humans.
- Medical Imaging: 3D imaging and image-guided surgery.
Really, the list goes on and on. We use computer vision in space, in video games, in mobile and industrial robots, and in so many other industries.
Computer vision is one of the easiest tech terms to define but has been one of the most difficult to teach computers. It has taken computer scientists almost 60 years to get to where we are today, and with AI and deep learning we are refining it even more. What is still to come with computer vision will be amazing.
If you want to read more about vision and computer vision, we suggest these publications:
- Introduction to Computer Vision by Erik G. Learned-Miller, Department of Computer Science, University of Massachusetts, Amherst.
- TED Talk: How we teach computers to understand pictures by Fei-Fei Li, creator of the ImageNet database.
- Computer Vision: A Modern Approach by David A. Forsyth, Jean Ponce.
If you want to learn how to code with computer vision algorithms, we suggest:
- Introduction to Computer Vision by Georgia Tech.
- Build a mobile document scanner for free.
- Learn Computer Vision with OpenCV Library using Python by Frederick Ngoiya, Python developer.