Computer Vision and AI Applications


As a precursor to the launch of its M.Tech/PGD in Artificial Intelligence, REVA Academy for Corporate Excellence (RACE) organized a webinar on ‘Computer Vision and AI Applications’ by Ratnakar Pandey, Lead for ML and Analytics for Customer Service at Amazon and mentor of the AI and Analytics programs at RACE. The webinar was held to share knowledge about various AI applications with the participants. Read on for an excerpt of the webinar:

Introduction to Computer Vision

Computer vision, an interdisciplinary field of science and a subfield of AI, aims to replicate the complexities of human vision. It trains computers to interpret images and videos and to draw conclusions comparable to those a human being would draw.

Why do we need computer vision?

We need computer vision:

  1. To automate tasks without human intervention and to perform them at scale
  2. To ensure high standards of quality and accuracy in carrying out those tasks

Applications of Computer Vision

Computer vision is used in application areas such as facial recognition software, self-driving cars, manufacturing production lines, medical imaging, Optical Character Recognition (OCR), robotics, drones, and animation.

Factors Driving Computer Vision and Its Business Impact

The factors that drive the need for computer vision are:

  1. Need for automation
  2. Cheaper memory and storage
  3. Explosion of visual data
  4. Omnipresent cameras
  5. Advances in computing power and algorithms

Even though computer vision has certain limitations, it opens up significant business opportunities, such as meeting productivity goals, streamlining work processes, and increasing business revenue.

How Does Computer Vision Work?

Human vision is complex, the product of hundreds of millions of years of evolution. It involves the eyes to capture an image, receptors to relay it to the brain, and the visual cortex to process it. A computer, by contrast, sees an image only as a set of numerical values. To understand the content of an image, a computer vision system therefore has to learn to interpret those values, and it does so through machine learning algorithms loosely inspired by the human brain.
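To make the idea of an image as a set of values concrete, here is a minimal sketch in plain Python. The 4x4 grid, its pixel values, and the helper names `mean_brightness` and `column_means` are made up for illustration; real images are much larger and are usually loaded with a library rather than typed by hand.

```python
# A tiny 4x4 grayscale "image": each number is one pixel's brightness
# (0 = black, 255 = white). The values are invented for illustration.
image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
]

height = len(image)      # number of rows
width = len(image[0])    # number of columns


def mean_brightness(img):
    """Average pixel value over the whole image."""
    total = sum(sum(row) for row in img)
    return total / (len(img) * len(img[0]))


def column_means(img):
    """Average brightness of each column, from left to right."""
    rows = len(img)
    return [sum(row[c] for row in img) / rows for c in range(len(img[0]))]


print(height, width)           # size of the grid
print(mean_brightness(image))  # overall brightness
print(column_means(image))     # dark left half, bright right half
```

To the computer, the "picture" of a dark left half and a bright right half is nothing more than this grid of numbers; any understanding has to be computed from it.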

With machine learning, computers can classify images with high accuracy, on some tasks matching or exceeding that of a human being. A specific kind of artificial neural network, the Convolutional Neural Network (CNN), loosely modeled on how the human brain processes visual input, is used to identify images. A CNN is made up of several layers of neurons and must be trained in advance. However, a CNN handles only the visual content of an image; it cannot handle temporal features, such as how a scene changes from one video frame to the next. To address this, a computer vision system can feed the output of the CNN into a Recurrent Neural Network (RNN), which handles the temporal features of image sequences or videos. The CNN processes each group of pixels independently, while the RNN retains the information already processed by the CNN. An RNN can handle several types of input and output data. The challenge with an RNN is that it has to be supplied with a sequence of descriptions of the image.
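The division of labor between the two networks can be sketched in a few lines of plain Python. This is a toy illustration, not a trained model: the kernel `EDGE_KERNEL`, the weights `w_input` and `w_state`, and the helper names are all hypothetical, and a real CNN learns many kernels from data instead of using one hand-picked filter.

```python
import math


def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    Like most deep learning libraries, this computes cross-correlation:
    the kernel is not flipped before it is applied.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        output.append(row)
    return output


def frame_feature(image, kernel):
    """Reduce one frame to a single number: the strongest filter response."""
    feature_map = convolve2d(image, kernel)
    return max(max(row) for row in feature_map)


def recurrent_states(features, w_input=0.5, w_state=0.5):
    """Minimal recurrent update: h_t = tanh(w_input * x_t + w_state * h_prev)."""
    h = 0.0
    states = []
    for x in features:
        h = math.tanh(w_input * x + w_state * h)
        states.append(h)
    return states


# Hand-picked kernel that responds to a dark-to-bright vertical edge.
EDGE_KERNEL = [[-1.0, 1.0],
               [-1.0, 1.0]]


def make_frame(edge_col, size=4):
    """Toy frame: 0.0 left of edge_col, 1.0 from edge_col onward."""
    return [[1.0 if c >= edge_col else 0.0 for c in range(size)]
            for _ in range(size)]


# A three-frame "video": a blank frame, then an edge appears and stays.
frames = [make_frame(4), make_frame(2), make_frame(2)]
features = [frame_feature(f, EDGE_KERNEL) for f in frames]  # CNN-like step, per frame
states = recurrent_states(features)                         # RNN-like step, across frames
print(features)
print(states)
```

The convolution step looks at each frame in isolation, so the second and third frames produce the same feature. The recurrent state differs between them because it also carries over what was seen earlier, which is the temporal memory that a CNN alone lacks.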

For an algorithm to understand images as a human does, we need to feed it an enormous amount of data, such as images of the same objects taken from different angles.

Computer Vision Tools

Humans first have to annotate this enormous amount of data by labeling it; computers then use the annotated data to identify similar patterns when they are fed new data. Various tools and libraries are used for computer vision, such as OpenCV, Dlib, face_recognition, and TensorFlow. Google Cloud Vision AI, Amazon Rekognition, and Microsoft Azure are examples of computer vision offered as a service or API.
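How annotated data lets a computer recognize new data can be illustrated with a nearest-neighbor sketch in plain Python. This is a deliberately simple stand-in for the neural networks described in the webinar, and the labeled examples, the labels "dark" and "bright", and the function names are hypothetical.

```python
def distance(a, b):
    """Sum of squared pixel differences between two flattened images."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(labeled_examples, new_image):
    """Return the label of the annotated example closest to new_image."""
    _, best_label = min(labeled_examples,
                        key=lambda example: distance(example[0], new_image))
    return best_label


# Hypothetical annotated data: flattened 2x2 images with human-assigned labels.
labeled_examples = [
    ([0, 0, 0, 0], "dark"),
    ([10, 20, 5, 15], "dark"),
    ([255, 255, 255, 255], "bright"),
    ([240, 230, 250, 245], "bright"),
]

# New, unlabeled images are assigned the label of their most similar example.
print(predict(labeled_examples, [30, 10, 20, 25]))
print(predict(labeled_examples, [200, 220, 210, 230]))
```

The quality of the predictions depends directly on the quantity and accuracy of the human labels, which is why annotation is such a large part of the work.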

Final Thoughts

The market for computer vision applications is growing rapidly, and continued growth in deep learning, machine learning, and artificial intelligence can be expected in the coming years. The AI revolution is underway, and its applications are not limited to customer engagement but extend across many other domains. By recreating the human ability to see, we can expect to develop a wide range of computer vision applications that will also create a large number of job opportunities.


Ratnakar Pandey

Leading ML and Analytics for Customer Service, Amazon

Ratnakar leads a talented team of data scientists and analytics professionals at Amazon, drawing insights and building actionable models from mostly unstructured data such as text and speech from India and other geographies. He has won prestigious awards such as the Project Excellence Award from Citigroup and the Annual Leadership Award from Target. He has also been recognized as one of the ‘Top 10 Data Scientists in India: 2020’.
