Making Machines See Class 12 Notes


Computer vision is a field of artificial intelligence that enables systems to see, observe and understand visual input, much as human vision does. In this chapter, “Making Machines See Class 12 Notes”, we are going to learn how machines see, the basics of digital images, image acquisition, preprocessing, feature extraction, etc.


How machines see?

Computer vision, also known as CV, is a field of artificial intelligence that works in a way similar to human vision. CV derives meaningful information from digital images, videos, and other visual input, and makes recommendations or takes appropriate actions based on it. Computer vision is sometimes called machine vision.

Definition: Computer Vision is a field of artificial intelligence (AI) that uses sensing devices and deep learning models to help systems understand and interpret the visual world.

Process flow of computer vision — image source: https://www.ciopages.com/wp-content/uploads/2020/07/vision-work.jpg

Working of Computer Vision

Computer vision is the field of study that focuses on processing and analyzing digital images and videos to comprehend their content. A fundamental aspect of computer vision lies in understanding the basics of digital images.

Basics of digital images

A digital image is a picture that is stored on a computer as a sequence of numbers that the computer can understand. Digital images can be created in several ways, such as drawing one in design software (like Paint or Photoshop), taking one with a digital camera, or scanning one with a scanner.

Interpretation of Image in digital form

When a computer processes an image, the image is represented as a grid of tiny square boxes known as pixels, and each pixel holds a specific colour value. This conversion of a picture into a grid of pixels is called digitization. The resolution of an image is determined by the number of pixels; higher-resolution images have more pixels. Every pixel in the image is assigned a number. For monochrome (grayscale) images, such as black and white pictures, the value ranges from 0 to 255, where 0 corresponds to black and 255 represents white.
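The idea of an image as a grid of numbers can be sketched in a few lines of Python. This is only an illustration: the 4 x 4 pixel values and the helper name `describe_pixel` are made up for the example, and plain Python lists are used so that it runs without extra libraries.

```python
# A tiny 4 x 4 grayscale "image": each number is one pixel's intensity,
# where 0 is black and 255 is white (values are made up for illustration).
image = [
    [0, 64, 128, 255],
    [0, 64, 128, 255],
    [0, 64, 128, 255],
    [0, 64, 128, 255],
]

height = len(image)          # number of rows of pixels
width = len(image[0])        # number of pixels in each row
resolution = width * height  # total number of pixels in the image


def describe_pixel(value):
    """Give a rough name for a grayscale intensity (hypothetical helper)."""
    if value == 0:
        return "black"
    if value == 255:
        return "white"
    return "grey"


print(f"{width} x {height} image, {resolution} pixels")
print("top-left pixel:", image[0][0], describe_pixel(image[0][0]))
print("top-right pixel:", image[0][3], describe_pixel(image[0][3]))
```

A real photograph works the same way, only with many more rows and columns.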

Computer Vision – Process

The Computer Vision process often involves five stages: Image Acquisition, Preprocessing, Feature Extraction, Detection/Segmentation, and High-Level Processing. They are explained below.

1. Image Acquisition:

Image acquisition is the initial stage of the computer vision process. It involves capturing digital images or videos, which form the raw data for the later stages. A high-resolution camera produces clearer images than a lower-resolution one, although capturing in low-light conditions may still result in poor image quality. In scientific and medical fields, MRI (Magnetic Resonance Imaging) or CT (Computed Tomography) scanners can produce highly detailed images of biological tissues or structures.

2. Preprocessing:

Preprocessing in computer vision aims to improve the quality of the acquired image. Some of the common techniques are –

a. Noise Reduction: Removes unwanted elements like blurriness, random spots or distortions. This makes the image clearer and reduces distractions for algorithms.

b. Image Normalization: Standardizes pixel values across images for consistency. Adjusts the pixel values of an image so they fall within a consistent range (e.g., 0–1 or -1 to 1).

c. Resizing/Cropping: Changes the size or aspect ratio of the image to make it uniform. Ensures all images have the same dimensions for analysis.

d. Histogram Equalization: Adjusts the brightness and contrast of an image by spreading out the pixel intensity values more evenly, which enhances details in dark or bright areas. Example: making a low-contrast image look sharper and more detailed.

The main goal of preprocessing is to prepare images for computer vision tasks by:

Removing noise (disturbances).
Highlighting important features.
Ensuring consistency and uniformity across the dataset.
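Three of the preprocessing techniques above (noise reduction, normalization and histogram equalization) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and sample pixel values are made up, a simple 3 x 3 mean filter stands in for noise reduction, and plain Python lists are used instead of an image library such as OpenCV or NumPy, which would normally do this work.

```python
def mean_filter(image):
    """Noise reduction: replace each pixel by the mean of its 3 x 3 neighbourhood."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # Border pixels simply use the neighbours that exist.
            neighbours = [
                image[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
            ]
            row.append(round(sum(neighbours) / len(neighbours)))
        out.append(row)
    return out


def normalize(image):
    """Normalization: scale 0-255 pixel values into the range 0.0-1.0."""
    return [[p / 255 for p in row] for row in image]


def equalize_histogram(image, levels=256):
    """Histogram equalization: spread intensities over the full 0..levels-1 range."""
    pixels = [p for row in image for p in row]
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Cumulative count of pixels at or below each intensity.
    cdf, total = [], 0
    for count in hist:
        total += count
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(pixels)
    if n == cdf_min:  # every pixel has the same value, nothing to spread
        return [row[:] for row in image]
    scale = (levels - 1) / (n - cdf_min)
    return [[round((cdf[p] - cdf_min) * scale) for p in row] for row in image]


# One bright "noisy" pixel in an otherwise dark patch (made-up values).
noisy = [[10, 10, 10], [10, 100, 10], [10, 10, 10]]
smoothed = mean_filter(noisy)

# A low-contrast patch whose values only span 100 to 140 (made-up values).
low_contrast = [[100, 100, 110], [110, 120, 120], [130, 130, 140]]
normalized = normalize(low_contrast)
equalized = equalize_histogram(low_contrast)

print("smoothed centre pixel:", smoothed[1][1])
print("equalized:", equalized)
```

After equalization the darkest value becomes 0 and the brightest becomes 255, so the patch uses the whole intensity range instead of a narrow band.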

3. Feature Extraction:

Feature extraction involves identifying and extracting relevant visual patterns or attributes from the pre-processed image. Feature extraction algorithms vary depending on the specific application and the types of features relevant to the task.

Edge detection: Edge detection identifies the boundaries between different regions in an image where there is a significant change in intensity.
Corner detection: Corner detection identifies points where two or more edges meet. These points are areas of high curvature in an image. Corner detection focuses on sharp changes in image gradients, which often correspond to corners or junctions in objects.
Texture analysis: Texture analysis extracts features like smoothness, roughness, or repetition in an image.
Colour-based feature: Colour-based feature extraction quantifies colour distributions within the image, enabling discrimination between different objects or regions based on their colour characteristics.
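As an illustration of edge detection, the sketch below applies the Sobel operator, one common edge detection technique (the notes above do not name a specific one, so this choice is an assumption). The function name `sobel_edges` and the sample image are made up, and the edge strength is approximated as |gx| + |gy| to keep the arithmetic simple.

```python
def sobel_edges(image):
    """Return an edge-strength map; border pixels are left at 0."""
    # Sobel kernels: kx responds to left-right changes, ky to up-down changes.
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0
            for j in range(3):
                for i in range(3):
                    pixel = image[y + j - 1][x + i - 1]
                    gx += kx[j][i] * pixel
                    gy += ky[j][i] * pixel
            # A large value means a significant change in intensity here.
            out[y][x] = abs(gx) + abs(gy)
    return out


# Left half black, right half white: one vertical edge in the middle.
image = [[0, 0, 0, 255, 255, 255] for _ in range(4)]
edges = sobel_edges(image)

for row in edges:
    print(row)
```

The large values appear only in the two columns next to the black-to-white boundary, which is exactly where the intensity changes.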

4. Detection/Segmentation:

Detection and segmentation are fundamental tasks in computer vision that focus on identifying objects or regions of interest within an image. These tasks play an important role in applications like autonomous driving, medical imaging, and object tracking. This crucial stage is categorized into two primary tasks:

Single Object Tasks
Multiple Object Tasks

a. Single Object Tasks: Single object tasks focus on analysing or describing individual objects within an image, with two main objectives:

Classification: Classification is the process of using algorithms to sort data into different categories. The KNN (K-Nearest Neighbour) algorithm is a classification algorithm used in supervised learning, where labelled examples are available, whereas the K-means clustering algorithm groups unlabelled data in unsupervised learning.
Classification + Localization: Classification with localisation combines classifying an object in an image with identifying its location. Localisation means predicting a bounding box that tightly encloses the object.
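The KNN idea mentioned above can be sketched as follows: a new example gets the label that is most common among its k nearest labelled neighbours. Everything specific here is an assumption for illustration, including the function name `knn_classify`, the two features (average brightness and edge density) and the labels and values in the training list.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Sort labelled examples by straight-line distance to the query point.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical labelled data: (average brightness, edge density) -> label.
train = [
    ((0.90, 0.10), "sky"),
    ((0.80, 0.20), "sky"),
    ((0.85, 0.15), "sky"),
    ((0.20, 0.70), "forest"),
    ((0.30, 0.80), "forest"),
    ((0.25, 0.75), "forest"),
]

print(knn_classify(train, (0.82, 0.18)))
print(knn_classify(train, (0.28, 0.72)))
```

Because the training examples carry labels, this is supervised learning; K-means, by contrast, would have to discover the two groups without being told their names.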

b. Multiple Object Tasks: Multiple object tasks deal with scenarios where an image contains multiple instances of objects or different object classes. These tasks aim to identify and distinguish between various objects within the image, and they include:

Object Detection: Object detection focuses on identifying and locating multiple objects within the image. It draws bounding boxes around the detected objects and assigns a class label to each box.

Image segmentation: It creates a mask around pixels with similar characteristics and identifies their class, which helps to understand the image in detail. Each pixel of an object is assigned a class, which helps to separate each object from the others. Techniques like edge detection, which works by detecting discontinuities in brightness, are used in image segmentation. There are different types of image segmentation. Two popular types are:
- Semantic Segmentation: It labels every pixel with the class it belongs to. For example, if an image contains multiple objects like animals, plants, and humans, all animal pixels are labelled as the animal class, but individual animals are not told apart.
- Instance Segmentation: All the objects in the image are differentiated even if they belong to the same class. For example, in an image with several animals, instance segmentation marks each animal as a separate object.
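The difference between these outputs can be sketched with small masks. In the example below, the semantic mask stores one class label per pixel, the instance mask stores one object ID per pixel, and a bounding box is the smallest rectangle enclosing an object. The masks and function names are made up, and the instance mask is derived with simple connected-component labelling, which is an assumption for illustration only: real systems usually predict masks and boxes with deep learning models, and this shortcut cannot separate objects that touch.

```python
def label_instances(semantic_mask, target_class):
    """Give each connected group of target_class pixels its own object ID."""
    h, w = len(semantic_mask), len(semantic_mask[0])
    instance_mask = [[0] * w for _ in range(h)]
    next_id = 0
    for y in range(h):
        for x in range(w):
            if semantic_mask[y][x] != target_class or instance_mask[y][x]:
                continue
            # Found an unlabelled object pixel: flood-fill its whole region.
            next_id += 1
            instance_mask[y][x] = next_id
            stack = [(y, x)]
            while stack:
                cy, cx = stack.pop()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and semantic_mask[ny][nx] == target_class
                            and instance_mask[ny][nx] == 0):
                        instance_mask[ny][nx] = next_id
                        stack.append((ny, nx))
    return instance_mask, next_id


def bounding_box(instance_mask, object_id):
    """Return (x_min, y_min, x_max, y_max) tightly enclosing one object."""
    coords = [(x, y) for y, row in enumerate(instance_mask)
              for x, value in enumerate(row) if value == object_id]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))


# Semantic mask: 0 = background, 1 = animal (two animals, same class).
semantic_mask = [
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 1, 1],
    [0, 0, 0, 0, 1, 1],
]

instance_mask, count = label_instances(semantic_mask, target_class=1)
boxes = [bounding_box(instance_mask, i) for i in range(1, count + 1)]

print("objects found:", count)
print("instance mask:", instance_mask)
print("bounding boxes:", boxes)
```

In the semantic mask both animals share the label 1, while in the instance mask they get the IDs 1 and 2, and each ID gets its own bounding box, as an object detector would report.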

Image classification, object localization, semantic segmentation and instance segmentation in computer vision

Computer vision source image, semantic segmentation and instance segmentation

5. High-Level Processing:

In the final stage of computer vision, high-level processing plays an important role in interpreting and extracting meaningful information from the detected objects or regions within digital images. High-level processing gives computers a deeper understanding of visual content and helps them make decisions based on the visual data. It includes recognising objects, understanding scenes, and analysing the context of the visual content using machine learning algorithms.

Applications of Computer Vision

Computer vision is integrated into the majority of the applications that we use in day-to-day life. Some of the applications are listed below, which you might have already learnt in lower classes.

Facial recognition: Face recognition is used on social media platforms like Facebook to identify users in photos and suggest tags.
Healthcare: In the healthcare sector, computer vision can identify a patient's disease or abnormalities from medical images, and object detection can locate and track them across images.
Self-driving vehicles: In a self-driving vehicle, computer vision captures views of the car's surroundings from different angles. It can also read traffic signals and detect other vehicles, pedestrian paths, etc.
Optical character recognition (OCR): OCR helps to extract text from visual data, for example, images of invoices, articles, bills, etc.
Machine inspection: Computer vision can inspect a machine and detect any defects, features, or functional flaws in it.
3D model building: Computer vision can help to construct a 3D computer model from existing objects, which can then be used in areas such as robotics, autonomous driving, 3D tracking, 3D scene reconstruction, and AR/VR.
Surveillance: Live footage from CCTV cameras in public places helps to identify suspicious behaviour and dangerous objects, and to prevent crimes and maintain law and order.
Fingerprint recognition and biometrics: Detects fingerprints and other biometrics to validate a user’s identity.

Challenges of Computer Vision

Computer vision plays an important role in artificial intelligence, but it faces multiple challenges in making sense of visual data. These challenges include:

Reasoning and Analytical Issues: Extracting meaningful insights from images requires accurate interpretation. Robust reasoning and analytical skills are needed to define attributes within visual content so that the extracted data is meaningful and accurate.
Difficulty in Image Acquisition: Every image is different; images vary in resolution, lighting, and scale. This variation makes it harder to analyse the data and interpret it accurately.
Privacy and Security Concerns: Surveillance through security cameras raises privacy concerns and can infringe on individuals' privacy rights. Face recognition technology in computer vision can also create dilemmas regarding privacy and security.
Duplicate and False Content: Computer vision introduces challenges related to the proliferation of duplicate and false content. Malicious actors can exploit vulnerabilities in image and video processing algorithms to create misleading or fraudulent content.

The future of Computer Vision

Computer vision technology has improved significantly; an ordinary image can now be processed by a complex system that understands and interprets visual data much like a human. Deep learning is capable of analysing vast amounts of labelled data. The possibilities of computer vision are awe-inspiring, from personalized healthcare diagnostics to immersive AR experiences. We can unlock the full potential of computer vision and harness its transformative power for the benefit of humanity.

Disclaimer: We have made an effort to provide you with an accurate handout of “Making Machines See Class 12 Notes”. If you feel that there is any error or mistake, please contact me at anuraganand2017@gmail.com. The CBSE study material presented on our website is for educational purposes only, and we do not hold the copyright. All the above content and screenshots are taken from the Artificial Intelligence Class 12 CBSE Textbook, Sample Paper, Old Sample Paper, Board Paper and Support Material available on the CBSEACADEMIC website. The textbook and support material are copyrighted by the Central Board of Secondary Education. We are only providing a medium and helping students improve their performance in the examination.

Images and content shown above are the property of individual organizations and are used here for reference purposes only.

For more information, refer to the official CBSE textbooks available at cbseacademic.nic.in
