## How Do You Find Statistics From Frequency Data In A Spreadsheet?

Suppose we are given some frequencies corresponding to some data in intervals.

To find the mean, we need to find a representative data value from each interval.

We’ll use the midpoint of each interval. The midpoint for the first interval is

We can find the other midpoints in a similar manner. Let’s add this to the table.

Let’s put this information in a spreadsheet.

To find the mean of this frequency distribution, multiply each data value times its corresponding frequency. In the spreadsheet, put =B2*C2 in cell D2.

Press Enter to compute this value. Next we need to fill this value into the cells D3 through D7. Click on cell D2. The cell will be outlined like you see below.

Place your mouse over the box in the lower right hand corner of this outline. The cursor will change to a black cross. Hold down the left mouse button and drag the cursor to cell D7.

When you release the button, the products will be calculated for each row. Adding a label in cell D1 might help you to remember what the numbers in the cell are.

The function SUM is useful for adding up lists of numbers. For instance, we can find the sum of the frequencies N by typing =SUM(C2:C7). Put this formula in cell B9.

In this picture a label has been added in A9 to help the reader understand what is in the cell to the right. Add another label in cell A10 with the text “Mean”. Next to that cell we’ll calculate the mean of the data. This is done by adding the entries in column D and dividing by the sum of the frequencies in cell B9.

Finding the variance is a bit more complicated. In this calculation we need to subtract the mean from each data value and square the result. This needs to be done with an absolute reference to cell B10 so that the fill always refer to that cell in making the calculation. Start by clicking in cell E2 and typing =B2-\$B\$10.

This means that the data value 9.5 is approximately 59.3 units below the mean. Fill the rest of the column and we end up with this worksheet.

We could sum these deviation now, but the positive and negative nature of each row would mask the spread of the data. Squaring each of the entries makes each of them positive. In cell F2, type =E2^2.

Fill the rest the rows using a fill.

The entries in this column are often called the deviations from the mean squared. Each of these occur with the frequencies in column C. To find the sample variance, multiply the entries in column F by the frequencies in column C to give the worksheet below.

Let’s calculate the variance. Start by typing a label for the sample variance in cell D9. In the adjacent cell we’ll put the value. The variance is the sum of the entries divided by the sum of the frequencies minus 1. In cell E9, type =SUM(G2:G7)/(B9-1).

The sample standard deviation is the square root of the standard variance. We can take the square root using the function SQRT. Put a label in cell D10 and type =SQRT(E9) in the adjacent cell.

The standard deviation is a measure of how spread out the frequency distribution is around the mean.

## How Do I Do Linear Regression in Google Sheets?

Like other spreadsheets, Google Sheets may be used to find a regression model for data. Following a few simple steps, we can graph a set of data in a scatter plot and find the corresponding model.
Suppose we have the demand data show in the table below.

To find a linear model for the Average Price per Gallon as a function of the Weekly Demand, we need to make a scatter plot of this data and add the linear regression model to it.

1. Start by opening a Google Sheet. In column A put the weekly demand data (the independent variables) and in column B put the average price per gallon (the dependent variable).

2. To graph the data, we first need to select it in the spreadsheet. Left mouse click in cell A1. While holding the mouse button down, drag the cursor to cell B6.

The data and labels at the top of the column should be highlighted.

3. From the Insert menu at the top of the window, select Chart.

4. A window will appear that allows you to create a variety of different types of graphs including scatter plots.

This window indicates where the data is located (A1:B6:) and indicates that the first rows contains headers or labels. There are several recommended chart, but none of them are scatter plots. To find a scatter plot, select Charts in the upper left corner of the window.

5. Select Scatter from the column of chart types on the left side of the window.

6. In the next column, choose the scatter plot at the top.

Now the preview on the right side of the window show a scatter plot.

7. To change the chart titles and axes labels, select Customize.

Under Chart Title, you should put a name that describes your chart. You can also change the color and size of the chart title font.

8. Use the scroll bar in the middle of the window to move farther through the chart options. Look for Axis Horizontal.

Give the horizontal axis an appropriate label under Title. You can also change the minimum and maximum extent of the graph by entering values into Min and Max.

9. Once you have given the horizontal axis a title, use the box next to Axis to choose Left Vertical.

Give the vertical axis an appropriate label under Title. Like the horizontal axis, you can change the vertical extent of the axis using Min and Max.

10. You may also add a trendline to the scatter plot. To do this, scroll farther down in the Customize window. Under the area labeled Series, look for Trendline. Selecting the box to the right give you several choices of regression for the data.

For example, select Linear to find the line of best fit. If you choose Use Equation for the Label and Show R2, the legend on the graph will show the equation as well as the coefficient of determination.

Select Insert to place the scatter plot in the sheet.

11. The equation in the legend may not include enough decimal places. If this is the case for a linear trend, the SLOPE and INTERCEPT commands may be used to calculate the slope and intercept of the trend. For this data, we would write

=SLOPE(B2:B6,A2:A6)

or

=INTERCEPT(B2:B6,A2:A6)

in a cell. Note that each of the functions require the range of the dependent variable to be listed first (B2:B6)and the range of the independent variable second (A2:A6).

12. If you need an output on the trend, the command FORECAST may be used. For instance, to find the y value on the linear trend at x = 130, we would type

=FORECAST(130,B2:B6,A2:A6)

## Compute the Sample Mean, Variance and Standard Deviation Using a Spreadsheet

The table below shows the dividend yields of six companies in the New York Stock Exchange energy sector.

 Company Dividend Yield  July 2012 (%) BP 4.80 Chevron 3.41 Exxon Mobil 2.66 PetroChina 3.50 Petroleo Brasiliero 1.20 Royal Dutch Shell 4.30

a. Find the sample mean.

Solution Use the AVERAGE command in a spreadsheet to compute the mean.

1. Enter the data from the table into the different cells in the spreadsheet.

2. In cell A9, type Mean to indicate that the number in B9 will be the mean.

3. Click on cell B9. This is where we will place the mean of the data. Type =AVERAGE( as shown to the right. To indicate the location of the data, type B2:B7. You can also click in cell B2, hold the left mouse button down and drag the cursor to cell B7. Type ) to complete the command.

4. Press Enter to compute the mean.

b. Find the sample variance.

Solution Use the VARA (VARPA for population variance) command to compute the sample variance of the data.

1. Type Variance in cell A10. Since we need the sample variance, type =VARA( in cell B10. Type B2:B7 or drag select these cells to identify the location of the data.

2. Press Enter to compute the sample variance.

c. Find the sample standard deviation.

Solution Use the STDEVA command to find the sample standard deviation of the data.

1. In cell A11, type Standard Deviation. In cell B11, type STDEVA( . Type B2:B7 or drag select these cells to identify the location of the data. To compute the population standard deviation, you would use the command STDEVPA. Type ) to complete the command.

2. Press Enter to compute the sample standard deviation.

## Find the Mode Using a Spreadsheet

During the week of 6/7/2012 through 6/14/2012, eight homes were sold in Paradise Valley, Arizona in the area code 85253. The sales prices for these homes are listed below.

900,000
535,000
182,500
1,550,000
2,250,000
1,525,000
490,000
1,525,000

Find the mode of the home sales prices.

Solution Use the MODE command in a spreadsheet to compute the mode of the data.

1. Enter the data from the table into the different cells in a spreadsheet.

2. Click on cell A11. This is where we will place the mode of the data. Type =MODE( as shown to the right. The command will be shown in the cell as well as the function bar. To indicate the location of the data, type A2:A9. You can also click in cell A2, hold the left mouse button down and drag the cursor to cell A9.

Type ) to complete the command.

3. Press Enter to compute the mode.

## Compare the Mean and Median Using a Spreadsheet

During the week of 6/7/2012 through 6/14/2012, eight homes were sold in Paradise Valley, Arizona in the area code 85253. The sales prices for these homes are listed below.

900,000
535,000
182,500
1,550,000
2,250,000
1,525,000
490,000
1,525,000

a. Find the mean sales price.

Solution Use the AVERAGE command to compute the mean of the data.

1. Enter the data from the table into the different cells in a spreadsheet.

2. Click on cell A11. This is where we will place the mean of the data. Type =AVERAGE( as shown to the right. The command will be shown in the cell as well as the function bar. To indicate the location of the data, type A2:A9. You can also click in cell A2, hold the left mouse button down and drag the cursor to cell A9. Type ) to complete the command.

Press Enter to compute the mean. In cell B11, type Mean to identify the type of central tendency.

b. Find the median sales price.

Solution Use the MEDIAN command to compute the median of the data.

1. In cell A12, type =MEDIAN(A2:A9).

3. Press Enter to compute the median.

4. In cell B12, type Median to identify the measure of central tendency.