In this chapter, we will learn about different types of data distributions and characteristics of each distribution in detail.
Distributions in Data Science Class 10 Notes
What is distribution in data science?
In data science, a distribution describes the possible values of a variable and how often each value occurs. Probability gives us the mathematical calculations, while a distribution helps us visualise what is happening underneath. For example:
When you toss a coin, it can land on one of two sides:
- Head → Probability = ½ (or 0.5)
- Tail → Probability = ½ (or 0.5)
But what about a third side? A coin cannot land on a third side, so the probability of that outcome is zero. The two possible outcomes and their probabilities together form the distribution of a coin toss, and the probabilities add up to 1.
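The coin example can be sketched in a few lines of Python. This is only an illustration and is not taken from the CBSE textbook; the names `fair_coin`, `probability` and `simulate_tosses` are made up for this sketch. It stores the theoretical distribution as a dictionary and then compares it with the frequencies from simulated tosses.

```python
import random
from collections import Counter

# Theoretical distribution of a fair coin: each outcome and its probability.
fair_coin = {"Head": 0.5, "Tail": 0.5}

def probability(distribution, outcome):
    """Return the probability of an outcome; outcomes not listed get 0."""
    return distribution.get(outcome, 0.0)

def simulate_tosses(n, seed=0):
    """Toss a simulated fair coin n times and return how often each side came up."""
    rng = random.Random(seed)
    counts = Counter(rng.choice(["Head", "Tail"]) for _ in range(n))
    return {side: counts[side] / n for side in ("Head", "Tail")}

print(probability(fair_coin, "Head"))        # 0.5
print(probability(fair_coin, "Third side"))  # 0.0
print(simulate_tosses(10_000))               # both values should be close to 0.5
```

With a large number of tosses, the simulated frequencies usually come close to the theoretical value of 0.5, though they rarely match it exactly.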
Why is distribution useful?
- It helps to visualise the behaviour of the data.
- It supports decisions in machine learning, statistics and real life.
- Every distribution can be shown as a graph to make patterns easier to see.
What are the different types of distributions?
The type of distribution depends on the kind of data we encounter while solving a problem. Data can be discrete or continuous.
- Discrete data takes only specific, separate values. For example, if you take a test, you can either pass or fail. The data is discrete in this case, as it has only two possible outcomes.
- Continuous data can take any value within a given range, which may be finite or infinite. For example, the depth of an ocean, the weight of a person or the length of a road.
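The difference between the two kinds of data can be sketched in Python. This is an illustration under assumptions: the variable names and the 40 to 90 kg weight range are made up for this example, and the values are randomly generated rather than real measurements.

```python
import random

rng = random.Random(42)

# Discrete: a test result can only be one of a fixed set of values.
test_results = [rng.choice(["pass", "fail"]) for _ in range(5)]

# Continuous: a weight can be any number inside a range
# (40 to 90 kg here is an assumed range, for illustration only).
weights_kg = [rng.uniform(40.0, 90.0) for _ in range(5)]

print(test_results)
print([round(w, 2) for w in weights_kg])
```

Every test result is exactly "pass" or "fail", while the weights can fall anywhere between the two limits.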
Statistical Problem-Solving Process
This investigative process involves four components, each of which involves exploring and addressing variability:
- Formulate Statistical Investigative Questions
- Collect/Consider the Data
- Analyse the Data
- Interpret the Results
Formulate Statistical Investigative Questions
A statistical investigative question is a question that can have many different answers because the data behind it varies. It helps us study a group of things and find the patterns within it. For example, if we want to study how sunlight affects plant growth, the following are all statistical investigative questions that anticipate variability and can lead to a rich data collection process and subsequent analysis of the data:
- How fast can my plant grow?
- Do plants exposed to more sunlight grow faster?
- How does sunlight affect the growth of a plant?
Collect/Consider the Data
In the statistical process, we collect data carefully so that we can notice and understand differences. Some methods, such as random sampling or statistical process control, help to reduce or detect variability. For example, to compare how plants grow with different amounts of sunlight, we must give different amounts of sunlight to different plants. This makes the difference easier to identify. Once we have collected the data, we can ask:
- What kind of data is this?
- How was it collected?
- Is it useful to answer our question?
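One way to set up the plant comparison is to let chance decide which plant gets how much sunlight, so that our own choices do not affect the result. The Python sketch below is a hypothetical illustration: the function name `assign_sunlight`, the plant labels and the sunlight levels of 2, 4 and 6 hours are all assumptions made for this example.

```python
import random

def assign_sunlight(plants, hours_options, seed=0):
    """Shuffle the plants, then deal them out evenly across the sunlight levels."""
    rng = random.Random(seed)
    shuffled = plants[:]          # copy, so the original list is not changed
    rng.shuffle(shuffled)
    groups = {hours: [] for hours in hours_options}
    for index, plant in enumerate(shuffled):
        hours = hours_options[index % len(hours_options)]
        groups[hours].append(plant)
    return groups

# Twelve made-up plants and three assumed sunlight levels (hours per day).
plants = [f"plant_{i}" for i in range(1, 13)]
groups = assign_sunlight(plants, [2, 4, 6])

for hours, members in groups.items():
    print(hours, "hours:", members)
```

Each plant ends up in exactly one group, and every group has the same number of plants.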
Analyse the Data
In this step, we analyse the data to understand how it varies, that is, how the values are spread out. This can be done using graphs and numbers. The pattern of this spread is known as the distribution of the data. For example, to compare the batting scores in an India-Australia cricket match, a good method is a graph such as a dot plot or a boxplot, which shows how the scores are spread.
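The spread of a set of scores can be described with numbers and with a simple text graph. The Python sketch below is an illustration only: the scores are made up (not real match data), and the helper names `five_number_summary` and `dot_plot` are chosen for this example. It uses `statistics.quantiles` from the Python standard library to find the quartiles that a boxplot is drawn from.

```python
import statistics
from collections import Counter

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the numbers a boxplot is drawn from."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)

def dot_plot(values):
    """Build a text dot plot: one row per distinct value, one star per occurrence."""
    counts = Counter(values)
    return "\n".join(f"{value:>3} | {'*' * counts[value]}" for value in sorted(counts))

# Made-up scores, for illustration only.
scores = [12, 25, 25, 31, 40, 47, 52, 58, 73]

print(five_number_summary(scores))  # (12, 25.0, 40.0, 52.0, 73)
print(dot_plot(scores))
```

The summary shows that the middle half of these scores lies between 25 and 52, and the dot plot shows that only the score 25 occurs more than once.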
Interpret the Results
After analysing the data, the final step is to explain what the results mean. We have to remember that data always varies, so we must be careful about when and how firmly we draw a conclusion. For example, in a medical experiment, people are randomly put into different treatment groups. Each person is different, and some may respond better than others, so a difference between groups may come from the treatment or from natural variation between people.
All the above content and screenshots are taken from the Data Science Class 10 Microsoft textbook published on the CBSE website, CBSE sample papers, CBSE old sample papers, CBSE board papers and CBSE support material available on the CBSEACADEMIC website. This textbook and support material are the copyright of the Central Board of Secondary Education. We only provide a medium to help students improve their performance in the examination.
Images and content shown above are the property of individual organisations and are used here for reference purposes only. To make it easy to understand, some of the content and images are generated by AI and cross-checked by the teachers. For more information, refer to the official CBSE textbooks available at cbseacademic.nic.in.