Analysing Line Graphs
When interpreting a graph, always begin by identifying what the graph is about, i.e. the variables involved, as indicated by the labels on the axes and any other information. It is also essential to be aware of the range of each axis, since this affects the visual appearance of the graph. This is particularly important when comparing two or more similar graphs.
When thinking about the meaning of the line on a graph, a starting point is to identify whether the line goes up or down.
The graphs in Figure 7.2(a) go up: 'As x increases, y increases'
The graphs in Figure 7.2(b) go down: 'As x increases, y decreases'
What we can see from a line graph, but not so well from the table results, is whether the line is straight or curved. For straight line graphs, we could say:
Figure 7.3(a) 'As x increases, y steadily increases'
Figure 7.3(b) 'As x increases, y steadily decreases'
We use the term 'steadily' because it gives a good sense of what is happening.
The relationships shown in curved graphs are more complex to describe. One possibility is that:
Figure 7.4(a) 'As x increases, y increases slowly at first and then more rapidly'; or
Figure 7.4(b) 'As x increases, y increases rapidly at first and then more slowly'.
Similar descriptions can be used for the curved graphs that show a 'decrease of y with x'.
The formal term to describe a straight line graph is 'linear', whether or not it goes through the origin, and the relationship between the two variables is called a linear relationship. Similarly, the relationship shown by a curved graph is called 'non-linear'.
When we talk of a variable changing 'slowly' or 'rapidly', we are using these terms in a relative sense to describe how the gradient (or slope) of a line changes. For a linear relationship, the gradient at any point along the line is the same. For a curve, the gradient varies at different points along the curve.
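The difference between a constant gradient and a varying gradient can be checked with a quick calculation. The following is a minimal Python sketch and is not part of the original notes: the helper name `segment_gradients` and both sets of values are made up for illustration. It works out the gradient between each pair of neighbouring points.

```python
def segment_gradients(xs, ys):
    """Return the gradient (rise over run) between each pair of neighbouring points."""
    return [
        (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
        for i in range(len(xs) - 1)
    ]


xs = [0, 1, 2, 3, 4]
linear_ys = [1, 3, 5, 7, 9]    # made-up values lying on the straight line y = 2x + 1
curved_ys = [0, 1, 4, 9, 16]   # made-up values lying on the curve y = x squared

linear_gradients = segment_gradients(xs, linear_ys)  # the same value for every segment
curved_gradients = segment_gradients(xs, curved_ys)  # a different value for every segment

print("linear:", linear_gradients)
print("curved:", curved_gradients)
```

For the straight line every segment has the same gradient, while for the curve the gradient keeps growing, which is what 'slowly at first and then more rapidly' means.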
An important feature of a relationship is whether the line goes through the origin (the point at which the values of x and y are zero). If the line does not go through the origin, the point at which the line meets the y-axis is called the intercept.
Figure 7.5(a) 'A straight line that goes through the origin'. This figure shows a proportional relationship (i.e. doubling the value of x doubles the value of y). So 'as x increases, y increases, and y is proportional to x'.
Figure 7.5(b) 'A straight line with an intercept on the y-axis'. Although this figure represents a linear relationship, it is not a proportional relationship, since the line does not go through the origin.
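One way to test whether a set of values is proportional is to check that y divided by x gives the same number for every point. The sketch below is an illustration only: the function name `is_proportional`, the tolerance and the data are all assumptions, not taken from the figures.

```python
def is_proportional(xs, ys, tolerance=1e-9):
    """Return True when y / x is the same constant for every point.

    A point at x = 0 must also have y = 0, otherwise the line has an
    intercept on the y-axis and the relationship is not proportional.
    """
    for x, y in zip(xs, ys):
        if x == 0 and abs(y) > tolerance:
            return False
    ratios = [y / x for x, y in zip(xs, ys) if x != 0]
    return all(abs(r - ratios[0]) <= tolerance for r in ratios)


xs = [1, 2, 4, 8]
through_origin = [3, 6, 12, 24]   # made-up values on y = 3x
with_intercept = [5, 8, 14, 26]   # made-up values on y = 3x + 2

proportional_a = is_proportional(xs, through_origin)
proportional_b = is_proportional(xs, with_intercept)

print(proportional_a, proportional_b)
```

Both data sets are linear, but only the first doubles y each time x is doubled, so only the first is proportional.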
Finally, while some curves may appear to increase indefinitely (Figure 7.6a), others may 'level out towards a maximum' (Figure 7.6b). Similarly, other curves showing decreasing values may 'level out towards a minimum' (Figure 7.6c).
(Figure 7.7) This graph provides an opportunity to put all of these ideas together. The relevant phrases to describe this graph are:
curved graph
intercept on y-axis
as x increases, y increases
y increases slowly at first, then more rapidly, then slows down again
reaches a maximum level.
On a graph that shows a change over time, the steepness of the line represents how fast the change is happening. In other words, the gradient of the line represents a rate of change. For example, Figure 7.9(a) shows a graph that represents the progress of a chemical reaction between a carbonate and an acid to produce carbon dioxide. At first, the increase in the volume of carbon dioxide is quite fast but then it slows down. The gradient at any single point on the curve gives the rate of change at that particular instant in time, so it is called an instantaneous rate of change.
Other line graphs don't represent a change over time, but we still use the same language to describe them. For example, Figure 7.9(b) shows the way that the current through a light bulb varies with the potential difference across it, and we would describe it as rising rapidly at first and then more slowly.
Sometimes it happens that when the data points are plotted, a straight line can be drawn which appears to pass exactly through all of the points. But more often, even if the underlying relationship is linear, the data points don't lie exactly on a straight line because of measurement uncertainty. In such cases, the line of best fit would pass as close as possible to the points.
Remember that a fitted line for a linear relationship has a very different meaning from the line segments on a graph where each pair of neighbouring data points is joined (connect-the-dots); it also has a very different meaning from the line of best fit on a scatter graph!
A linear relationship is one where a straight line could be fitted. A useful technique with points plotted on graph paper is to hold the paper at almost right angles to your face and then to rotate the paper to look down along the direction of the data points. This is a quick way of seeing how close the points would be to a straight line.
To draw a line on graph paper, it is better to have a transparent ruler so that all of the data points can be seen.
Figure 7.10(a) shows a set of eight data points plotted on a graph.
Figure 7.10(b) shows a line of best fit drawn by eye.
Figure 7.11 shows two lines that are NOT good lines of best fit. The 'badness' of fit has been deliberately exaggerated in each case to illustrate the criteria for fitting a good line.
In Figure 7.11(a), the gradient of the line matches the gradient of the data points, but the line is too high. In the good line of best fit, there were data points on both sides of the line, but here, they are all below the line.
By contrast, in Figure 7.11(b), there are similar numbers of points above and below the line, but the gradient of the line is wrong because all the points below the line are on the left, and all the points above the line are on the right.
So when drawing a line by hand on graph paper, there are two things to think about so that the line is as close to all the points as possible:
getting the height of the ruler right
getting its slope right
Sometimes, a line might pass through some of the points, but this is not essential - it is possible to have a line of best fit that doesn't actually pass through any of the points. There are no hard-and-fast rules for producing a line of best fit by eye... it is a matter of judgement to find the one that looks best.
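A computer can also find a line of best fit by calculation instead of by eye. The sketch below uses the least-squares method, which is a different technique from the ruler method described above and is not mentioned in the original notes. The function name `fit_line` and the eight data points are invented for illustration.

```python
def fit_line(xs, ys):
    """Least-squares straight line: return (gradient, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    return gradient, intercept


# Eight made-up data points that lie close to, but not exactly on, a straight line.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

gradient, intercept = fit_line(xs, ys)

# Count how many points sit above and below the fitted line.
residuals = [y - (gradient * x + intercept) for x, y in zip(xs, ys)]
points_above = sum(1 for r in residuals if r > 0)
points_below = sum(1 for r in residuals if r < 0)

print(round(gradient, 3), round(intercept, 3), points_above, points_below)
```

As with a good line drawn by eye, the calculated line has points on both sides of it and does not have to pass through any of them.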
Once a line of best fit has been drawn, it is possible to use it to estimate a value for y corresponding to any value for x.
Figure 7.12(a) shows how a value for y can be 'read off' the graph for a value of x that is in between the original data points. This technique is called interpolation. It can be used, for example, in calibrating instruments such as thermometers.
Getting good estimates from interpolation assumes that the fitted line is a good representation of what happens in between the measured data points, and that there are no unexpected variations. The more measurements that are made, the greater the chance that the interpolation will give good estimates. In fact, since the fitted line may compensate for measurement uncertainties in the data, it can actually give better estimates of the y values for the original data points than the actual data values themselves. This is the reason that a fitted line is used to find the gradient on such a graph, and not just the two extreme values.
The fitted line may also be extended in order to make estimates of values beyond the range of the original data. Figure 7.12(b) shows the line being extended to higher values, and a value of y being 'read off' the graph for a value of x that is greater than the original range. This process is called extrapolation.
For example, a graph showing the extension of a spring against applied force could be extended to find the extension of the spring for a greater force. Care needs to be taken with extrapolation, however, since the linear relationship may NOT apply outside the data range. In the example of the spring, a point is reached when it becomes 'overstretched' and the extension is no longer proportional to the force.
Extrapolation can also be done by extending the line towards lower values. In such a case, it may be of interest to find out whether it passes through the origin, or if it does not, to find the value of the intercept on the x-axis or y-axis.
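Reading a value off a fitted straight line is the same as putting x into the equation of that line. The sketch below assumes a fitted line with a gradient of 2.0 and an intercept of 1.0, and measurements taken at x = 1 to 8. These numbers and the function name `estimate` are hypothetical and are not taken from Figure 7.12. The function also reports whether the estimate is an interpolation or an extrapolation.

```python
def estimate(x, gradient, intercept, measured_xs):
    """Read y off a fitted straight line and say which kind of estimate it is."""
    y = gradient * x + intercept
    if min(measured_xs) <= x <= max(measured_xs):
        kind = "interpolation"   # x lies inside the range of the measurements
    else:
        kind = "extrapolation"   # x lies outside the range, so treat with care
    return y, kind


# Hypothetical fitted line and measurement range, for illustration only.
gradient = 2.0
intercept = 1.0
measured_xs = [1, 2, 3, 4, 5, 6, 7, 8]

inside = estimate(4.5, gradient, intercept, measured_xs)
outside = estimate(10, gradient, intercept, measured_xs)

print(inside)
print(outside)
```

The calculation is identical in both cases. The difference is how far the result can be trusted, because the code cannot tell whether the linear relationship still holds outside the measured range.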
If a straight line on a graph goes through the origin, then this represents a proportional relationship. Deciding whether the first data point is at the origin (i.e. (0,0)) is important.
Figure 7.13(a) shows a proportional relationship - it is a graph of current against potential difference for a resistance that follows Ohm's Law. This shows us that the line MUST go through the origin because if there is no potential difference, there is no current. We can easily confirm this using a voltmeter and ammeter.
For other relationships, we may know that theoretically the line must start at the origin, but in practice there might not be any measurements to show this. In general, be careful about making any assumptions about the values at the intercepts. Only draw a line through the origin when you're sure it can't be otherwise.
Figure 7.13(b) shows how the length of a spring varies with force. It is a linear relationship (Hooke's Law), but it doesn't pass through the origin, so it is not a proportional relationship. Usually in cases like this, the intercept on the y-axis has a real-world meaning. Here, it represents the length of the spring when the value on the x-axis (the force on the spring) is zero. In other words, it is the 'normal' length when no force is acting on it to stretch it.
If the original length of the spring is subtracted from all of the data values, this gives the extension of the spring: plotting this against force would be a straight line through the origin, and this is a proportional relationship. In this example, the intercept on the y-axis can easily be found by measurement - it is simply the length of the spring with no force. However, in other situations, the intercept cannot be measured directly, though it may be found by extrapolation.
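The subtraction described above can be sketched in a few lines of Python. The forces and lengths below are made-up values chosen to lie exactly on a straight line. They are not readings from Figure 7.13(b).

```python
# Made-up spring data for illustration: force in newtons, length in centimetres.
forces = [0, 1, 2, 3, 4]
lengths = [10.0, 12.0, 14.0, 16.0, 18.0]

# The intercept on the y-axis is the length of the spring with no force applied.
original_length = lengths[0]

# Subtracting the original length from every reading gives the extension.
extensions = [length - original_length for length in lengths]

# For a proportional relationship, extension divided by force is constant.
extension_per_newton = [e / f for e, f in zip(extensions, forces) if f != 0]

print(extensions)
print(extension_per_newton)
```

Length against force has an intercept of 10.0 cm, but extension against force starts at zero and has the same ratio for every reading, so it is a proportional relationship.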
Figure 7.13(c) represents a graph of the temperature of the reaction mixture during an exothermic chemical exchange (that releases heat) in an insulated container. It's not easy to find the 'true' value of the temperature rise because the reaction takes a little time to complete and during this time, energy escapes from the warm reaction mixture. After the initial temperature rise and the completion of the reaction, the mixture starts to cool. Since the container is insulated, the cooling is relatively slow and is approximately a linear decrease over a small time period. By extrapolating this line backwards, the 'true' temperature rise (i.e. if there had been no cooling) can be estimated from the intercept on the y-axis.
An interesting historical use of extrapolation was to estimate the temperature at absolute zero. The solid line in Figure 7.13(d) represents the relationship between the volume of a fixed mass of gas and its temperature. As the gas cools, its volume decreases. The theoretical interpretation of this is that the decrease is due to the molecules of the gas moving more slowly. If the molecules stop moving at absolute zero, then the volume would approach zero (assuming that the volume of the molecules themselves is negligible). Extrapolating the line back to zero volume gives a temperature of about -273°C, which is close to the accepted value.
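The same extrapolation can be done by calculation: fit a straight line to volume against temperature, then find the temperature at which the line reaches zero volume (the intercept on the x-axis, which is minus the intercept divided by the gradient). In the sketch below the volumes are idealised, rounded values invented for illustration. They are not real measurements or the data behind Figure 7.13(d), and the least-squares helper `fit_line` is an assumption of this example.

```python
def fit_line(xs, ys):
    """Least-squares straight line: return (gradient, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    gradient = sxy / sxx
    return gradient, mean_y - gradient * mean_x


# Idealised, invented values: temperature in degrees Celsius, volume in cm^3.
temperatures = [0, 25, 50, 75, 100]
volumes = [100.0, 109.2, 118.3, 127.5, 136.6]

gradient, intercept = fit_line(temperatures, volumes)

# Zero volume means 0 = gradient * T + intercept, so T = -intercept / gradient.
temperature_at_zero_volume = -intercept / gradient

print(round(temperature_at_zero_volume, 1))
```

This extrapolates a very long way beyond the measured range of 0 to 100°C, so a small change in the fitted gradient shifts the answer noticeably.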
Not every graph has data points that clearly all lie close to a straight line. Two possibilities which may arise are:
The underlying relationship is linear, but there are outliers perhaps due to mistakes in measurement. These outliers may need to be ignored or rechecked.
The underlying relationship is not linear. The line of best fit is a curve, not a straight line.
The more data that are collected, the clearer the nature of the relationship becomes.
Figure 7.14(a) shows a graph with just four data points. It's not obvious what the line of best fit might be.
Figure 7.14(b) shows what a straight line drawn close to all of the points could look like...
Figure 7.14(c) shows what the straight line could look like if the final point was treated as an outlier, and a straight line was drawn through the other three points...
Figure 7.14(d) shows a curved line drawn close to the points.
With only four points, you cannot really decide.
Figure 7.15(a) shows the original four points, but now includes an additional four points to give eight in total. It is now much easier to see a pattern - it looks like this is a linear relationship, but that one of the measurements is an outlier.
Figure 7.15(b) shows a straight line as a line of best fit, using seven of the points and ignoring the outlier.
These additional four points were necessary to identify the relationship, but if they had been different, then the relationship might have looked very different.
Figure 7.16(a) also shows the original four points but with a different set of four points added. Now, the pattern of data points suggests that the line of best fit should be curved. As before, if the data points are plotted on graph paper, holding it up and looking along the points by eye is a good way of making sense of the shape. Drawing good curves by hand needs practice - it can be done using a sweeping movement of the hand with the wrist or elbow as a pivot 'inside' the curve.
If a straight line had been drawn (instead of a curve) then the points in the middle would have been below the line, and the ones at each end would have been above the line. This is a sign that a curve would be a better fit.
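This warning sign can be checked by calculation. The sketch below fits a straight line to points that actually lie on a curve and then records whether each point is above (+) or below (-) the line. The data are made up (y = x squared) and the least-squares helper `fit_line` is an assumption of this example, not the by-eye method used in the figures.

```python
def fit_line(xs, ys):
    """Least-squares straight line: return (gradient, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    gradient = sxy / sxx
    return gradient, mean_y - gradient * mean_x


# Made-up points lying on the curve y = x squared.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [x ** 2 for x in xs]

gradient, intercept = fit_line(xs, ys)

# Difference between each point and the straight line, read from left to right.
residuals = [y - (gradient * x + intercept) for x, y in zip(xs, ys)]
signs = "".join("+" if r > 0 else "-" for r in residuals)

print(signs)
```

The points at both ends are above the line and the points in the middle are below it. When the points are scattered around a truly linear relationship, the plus and minus signs are mixed along the whole line instead of being grouped like this.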
Biozone AP Biology
Interpreting data may include:
Describing patterns or trends (e.g. does a variable rise or fall over time).
Describing how the dependent variable changes in response to changes in the independent variable.
Explaining why the dependent variable changes in response to changes in the independent variable.
Making predictions based on trends in the data and justifying the prediction.
Justifying the predictions of others based on the data presented.
Data can be collected from many different situations. The data we collect are not much use without analysis and interpretation. The analysis involves finding patterns and trends. For example:
The graph shows that the height of the plants increased when fertiliser was applied.
The interpretation is where you explain what those patterns and trends mean.
Being able to analyse and interpret the results and to write a valid conclusion depends on a good understanding of the purpose of the investigation and of the scientific principles related to the topic being investigated.
This is where your research on the topic becomes important and you explain your results. Relationships should be stated clearly and concisely. Refer to the measurements as evidence, and draw conclusions from the data. Be sure that conclusions are supported by evidence.
Sometimes, it may be possible to explain data in more than one way. Your experiment may not always reveal the causes of your observations, so other explanations should always be considered. For example, if the leaves of a plant start to discolour and become yellow, it could indicate that the plant is suffering from a lack of water. It could also mean that the plant has a nutrient deficiency or a disease caused by an insect. It is important to consider alternative explanations when discussing data and analysing results.
When we describe quantitative data, it is usual to give a measure of central tendency. This is a single value (a mean, a median or a modal value) identifying the central position within that set of data. The type of statistic calculated depends on the type of data (quantitative, qualitative) and its distribution (normal, skewed, bimodal).
When quantitative data is collected and put on a graph, a trend or relationship can be explained in terms of the two variables shown on the axes.
A trend - identify the consequences of changing the independent variable, e.g. the larger the force the faster the speed of the trolley.
A relationship - show a quantitative link between the independent and dependent variables, e.g. when the force is doubled the acceleration of the trolley is doubled.
Qualitative data is best described using the mode (the most common value or values). The mode is used to compare between the control and the experiment, or between features before and after.
Drawing a line of best fit helps you see the trend or general pattern of the results. Then you can see the relationship between the input variable and the outcome variable.
You can also extend the line to read off the graph and make a PREDICTION. Extending the line beyond the plotted points is to EXTRAPOLATE (extra = outside). Extrapolating is best done where the trends are known and predictable. If the trend in the data is not known then extrapolating should not be done.
For both the line and scatter graphs, the fitted line can be used to find an unknown value inside the set of data points. This is called interpolation.
The line graph to the right shows the speed of an object that has been dropped from an aircraft. To find out the speed of the object six seconds after it was dropped, follow these steps.
Go to 6 seconds on the x-axis.
Follow this grid line up to the graph line.
Go across to the y-axis and read the value. It is 152 km h−1.
So after six seconds, the object was moving downwards at 152 km h−1.
To read between the points on a graph, the reverse can be done. For example, when is the object travelling at 100 km h−1? Following the green dotted line on the graph, the value on the x-axis is 4.1 s. So the object was travelling at 100 km h−1 4.1 seconds after it was dropped.
The equation for a linear (straight) line on a graph is y = mx + c. The equation can be used to calculate the gradient (slope) of a straight line and tells us about the relationship between x and y (how fast y is changing relative to x). For a straight line, the rate of change of y relative to x is always constant.
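In y = mx + c, m is the gradient and c is the intercept on the y-axis. As a minimal sketch, both can be worked out from any two points on the line: m is the change in y divided by the change in x, and c then follows from rearranging the equation. The function name `line_through` and the two points are made up for illustration.

```python
def line_through(point_1, point_2):
    """Return (m, c) for the straight line y = mx + c through two points."""
    x1, y1 = point_1
    x2, y2 = point_2
    m = (y2 - y1) / (x2 - x1)   # gradient: change in y divided by change in x
    c = y1 - m * x1             # intercept: rearranged from y1 = m * x1 + c
    return m, c


# Two made-up points read from a straight line.
m, c = line_through((1, 5), (3, 11))

# Use the equation to find y for another value of x.
y_at_4 = m * 4 + c

print(m, c, y_at_4)
```

Any other pair of points on the same straight line gives the same m and c, because the rate of change is constant.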
All graphs are drawn to show the relationship, or trend, between two variables. Some variables increase when another variable increases, and some variables decrease when another variable increases. These relationships produce trend lines with characteristic shapes.
Note that:
Variables that change in linear or direct proportion to each other produce a straight trend line.
A variable that changes exponentially in response to the other variable produces a curved trend line.
'Inverse' means that one variable decreases as the other variable increases.
Correlation does not imply causation. You may come across the phrase "correlation does not necessarily imply causation." This means that even when there is a strong correlation between variables (they vary together in a predictable way), you cannot assume that change in one variable caused change in the other.
For example, when data from the organic food association and the office of special education programmes is plotted (below), there is a strong correlation between the increase in organic food and rates of diagnosed autism. However, it is unlikely that eating organic food causes autism, so we cannot assume a causative effect here.
It is important to understand how variables are related, or whether there is any connection between them at all.
Descriptive statistics, such as mean and standard deviation, are used to summarize a set of data values and describe its features. The type of statistic calculated depends on the type of data and its distribution.
When describing a set of data, it is usual to give a measure of central tendency. This is a single value identifying the central position within that set of data.
Outliers are usually excluded from calculations of the mean. For very skewed data sets, it is better to use the median.
Variability in continuous data is often displayed as a frequency distribution. There are several types of distribution.
Normal distribution (A): Data has a symmetrical spread about the mean (the peak). It has a classical bell shape when plotted.
Skewed data (B): Data is not centred around the middle (skewed peak) but has a "tail" to the left or right.
Bimodal data (C): Data which has two peaks.
The shape of the distribution will determine which statistic (mean, median, or mode) should be used to describe the central tendency of the sample data.
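The effect of an outlier on the three measures of central tendency can be seen with Python's built-in `statistics` module. The values below are made up for illustration: five similar values plus one very large one.

```python
import statistics

# Made-up data: the final value is an outlier.
values = [4, 5, 5, 6, 7, 30]

mean_value = statistics.mean(values)      # pulled upwards by the outlier
median_value = statistics.median(values)  # middle value, barely affected
mode_value = statistics.mode(values)      # most common value

# The mean again, this time with the outlier excluded.
mean_without_outlier = statistics.mean(values[:-1])

print(mean_value, median_value, mode_value, mean_without_outlier)
```

The mean of the full set is larger than five of the six values, so it is a poor description of the centre. The median stays close to the bulk of the data, which is why it is preferred for very skewed data sets.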
Data need to be processed by calculating standard deviations to show how spread out the data are. These need to be recorded in a table or chart in the results section of the report. The processed data are used to draw an appropriate graph(s) to illustrate a pattern or trend (or its absence).
When we take measurements from samples of a larger population, we are using the samples as indicators of what the whole population looks like. Therefore, when we calculate a sample mean for a variable, it is useful to know how close that value is to the true population mean for that same variable (i.e. its accuracy). If you are confident that your data set fairly represents the entire population, you are justified in making inferences about the population from your sample.
You can start by calculating a simple measure of dispersion called standard deviation. Standard deviation is a measure of the amount of variation in a set of values. Are the individual data values all close to the mean, or are the data values highly variable? Standard deviation provides a way to evaluate the distribution of your data, which can help you decide the next step in your analysis.
Sample standard deviation (s) is presented as x̄ ± s.
In normally distributed data, 68% of all data values will lie within one standard deviation (1s) of the mean. 95% of all values will lie within two standard deviations (2s) of the mean (see the distribution plotted to the right).
The lower the standard deviation, the more closely the data values cluster around the mean. The formula for calculating standard deviation is...
Both the histograms to the left show a normal distribution of data, with the values spread symmetrically about the mean. However, their standard deviations are different. In histogram A, the data values are widely spread around the mean. In histogram B, most of the data values are close to the mean. Sample B has a smaller standard deviation than sample A.
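The comparison between a widely spread sample and a tightly clustered one can be sketched in Python. This example assumes the usual formula for sample standard deviation: the square root of the sum of squared differences from the mean, divided by (n - 1). The two samples are made-up values with the same mean. They are not the data behind histograms A and B.

```python
import math


def sample_standard_deviation(values):
    """Sample standard deviation: sqrt of sum((x - mean)^2) / (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    squared_differences = [(v - mean) ** 2 for v in values]
    return math.sqrt(sum(squared_differences) / (n - 1))


# Two made-up samples, both with a mean of 6.
sample_a = [2, 4, 6, 8, 10]       # values widely spread around the mean
sample_b = [5, 5.5, 6, 6.5, 7]    # values close to the mean

s_a = sample_standard_deviation(sample_a)
s_b = sample_standard_deviation(sample_b)

print(round(s_a, 2), round(s_b, 2))
```

Both samples would be reported with the same mean, but the standard deviation of the second sample is much smaller because its values cluster more closely around that mean.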
So how can you tell if your sample is giving a fair representation of the entire population? We can do this using the 95% confidence interval. This statistic allows you to make a claim about the reliability of your sample data. The mean ± the 95% confidence interval (95% CI) gives the 95% confidence limits (95% CL). This tells us that, on average, 95 times out of 100, the true population mean will lie within the confidence limits. Having a large sample size greatly increases the reliability of your data.
You can plot the 95% CL on to graphs to determine if observed differences between sets of sample data (between field plots or treatments) are statistically significant. If the 95% CL do not overlap, it is likely that the differences between the sample sets are significant. The 95% CI is very easy to calculate.
Step 1 is to calculate the standard error of the mean (SE). It is simple to calculate and is usually a small value.
Step 2 is to calculate the 95% confidence interval (95% CI). It is calculated by multiplying SE by the value of t at P=0.05 for the appropriate degrees of freedom (df) in your sample.
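The two steps can be sketched in Python. This example makes several assumptions that are not stated in the notes: SE is calculated as the sample standard deviation divided by the square root of the sample size, df is taken as n - 1, and the t value of 2.262 (P = 0.05, two-tailed, df = 9) is copied from a standard t table. The two samples of ten values are invented for illustration.

```python
import math
import statistics

# t at P = 0.05 (two-tailed) for df = 9, taken from a standard t table.
T_VALUE_DF_9 = 2.262


def confidence_limits(values, t_value):
    """Return (mean, SE, 95% CI, (lower limit, upper limit)) for a sample."""
    n = len(values)
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(n)   # Step 1: standard error
    ci = t_value * se                              # Step 2: 95% CI
    return mean, se, ci, (mean - ci, mean + ci)


def limits_overlap(limits_1, limits_2):
    """True when two (lower, upper) ranges share at least one value."""
    return limits_1[0] <= limits_2[1] and limits_2[0] <= limits_1[1]


# Two made-up samples of ten values each (so df = 9 for both).
plot_a = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]
plot_b = [17, 20, 16, 19, 18, 21, 17, 20, 19, 18]

mean_a, se_a, ci_a, limits_a = confidence_limits(plot_a, T_VALUE_DF_9)
mean_b, se_b, ci_b, limits_b = confidence_limits(plot_b, T_VALUE_DF_9)
overlap = limits_overlap(limits_a, limits_b)

print(mean_a, round(ci_a, 3), mean_b, round(ci_b, 3), overlap)
```

The t value depends on the sample size, so a different value must be looked up for a sample with a different number of degrees of freedom. Here the two sets of confidence limits do not overlap, which suggests that the difference between the two sample means is likely to be significant.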
How your data is analysed depends on the type of data you have collected. Plotting your initial data can help you to decide what statistical analysis to carry out. Data analysis provides information on the biological significance.
Data analysis provides information on the biological significance of your investigation. How your data is analysed depends on the type of data you have collected. Plotting your data, even at the very early stages of your investigation, can help you to decide what statistical analysis to carry out. Sometimes, a statistical test is not needed to determine the significance of your findings. Plotting your data with 95% confidence intervals will very often provide you with the information you need to make a conclusion.
The panels below briefly describe the criteria for some of the simplest and most common statistical analyses you will come across. A test is not appropriate if your data do not meet its criteria, e.g. a t test is not appropriate when you are comparing more than two groups, and a linear regression is not appropriate if your data plot is a curve. There are other tests for these types of data.
Correlation and causality (Khan Academy) 📕 🎦
Introduction to Descriptive Statistics (Jay Hill, University of Illinois) 📕
Handbook of Biological Statistics (John H. McDonald) 📕 wow, what an amazing resource!
Descriptive statistics: Statistics of central tendency | Statistics of dispersion | Standard error of the mean | Confidence limits
Tests for nominal variables: Chi-square test of goodness-of-fit
Tests for one measurement variable: Student's t-test for one sample
Tests for multiple measurement variables: Correlation and linear regression