MTH 3063 Point Loma Nazarene University Mathematics and Statistics Lab Report


You must have R and RStudio installed and know how to use them well. The assignment and the data are attached below. Please take screenshots in RStudio and add them to a Word document, and fill out the rest of the assignment in the Word document as well. (Complete the 5 questions at the end of the document. The beginning is just a run-through of what to do.)

Lab 5: Hypothesis Tests and Confidence Intervals
Goal: The title says it. We will focus here on implementing one-sample t-tests, two-sample
t-tests, matched pairs t-tests, and ANOVA.
As usual, open a Word file or a new notebook to complete your lab in. Answer questions
using narrative when requested and provide commands and output to verify what you did.
In R, as with other statistical software, confidence intervals come for free when you
run a hypothesis test for which they would make sense. Hence, we really only need to learn
how to perform hypothesis tests. Although it may sometimes seem as though the final
conclusion to reject or fail to reject H0 is what is most important, all of the other supporting
evidence that you would typically compute by hand is informative and important to report
for your readers. In order to report the results of a hypothesis test you should include the
summary statistics, the test statistic, degrees of freedom if appropriate, the p-value and the
conclusion, and it is usually good practice to provide a confidence interval as well. R does
all the work, but you must identify the correct pieces of the output.
We will be using the FakeData.csv data for the demo portion of the lab. Load the data as
FakeData.
1. Two-sample t-tests. We are going to see if there is a significant difference in age
between the two genders in the FakeData set. For most tests, R prefers stacked format,
where all quantitative measurements are in one column and the grouping variable is in
a second column, such as the following:
Age Gender
25 F
32 F
21 M
19 M
Other software platforms, such as Excel, often use data in what is called wide or
unstacked format, where each group has its data values stored in its own
column:
Female Male
25 21
32 19
Although R can handle wide format for t-tests, it does not do so for other tests, and
so for consistency we will only implement the t.test command on data in the stacked
format. The additional benefit is that it allows us to retain the formula notation y ~
x, where y is the response and x is the independent variable.
Pro tip: If you happen to have data in the wide format, R can easily convert between the
two formats using the melt command from the reshape2 package.
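As a minimal sketch of the conversion, base R's stack() function can do the same job as melt without an extra package. The data frame name wide and its values are assumptions for illustration, copied from the small wide-format example above.

```r
# Hypothetical wide-format data: one column per group
wide <- data.frame(Female = c(25, 32), Male = c(21, 19))

# stack() returns two columns: "values" (the measurements)
# and "ind" (a factor holding the original column names)
stacked <- stack(wide)

# Rename the columns to match the stacked layout used in this lab
names(stacked) <- c("Age", "Gender")

print(stacked)
```

The result has one row per measurement, so it can be passed directly to a formula such as Age ~ Gender.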
Back to the question: suppose we want to see if there is a significant difference
between the ages of the males and females in this dataset. In other words, we
want to test whether there is a relationship between Gender and Age, i.e. Age ~
Gender. The command and output are below:
t.test(Age~Gender, data = FakeData)
##
## Welch Two Sample t-test
##
## data: Age by Gender
## t = 0.39287, df = 331.8, p-value = 0.6947
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -1.482673 2.222702
## sample estimates:
## mean in group F mean in group M
##        56.31707        55.94706
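The pieces of this output can also be pulled out by name, which helps when writing up results. The sketch below uses simulated data as a stand-in for FakeData (the data frame demo and its values are made up for illustration); the component names are those of the object that t.test returns.

```r
set.seed(1)

# Simulated stand-in for FakeData; these values are NOT the lab data
demo <- data.frame(
  Age = c(rnorm(30, mean = 56, sd = 8), rnorm(30, mean = 55, sd = 8)),
  Gender = rep(c("F", "M"), each = 30)
)

# Store the test result instead of only printing it
res <- t.test(Age ~ Gender, data = demo)

res$statistic  # the t test statistic
res$parameter  # the degrees of freedom
res$p.value    # the p-value
res$conf.int   # the confidence interval for the difference in means
res$estimate   # the two group means
```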
What would the command be to test whether there is a significant relationship
between Gender and Height? Run the test and write out the four steps of the method
for hypothesis tests, incorporating the appropriate values output from R.
Pro tip: If you receive an error like
Error in t.test.formula(Age ~ Gender, data = FakeData) :
grouping factor must have exactly 2 levels
and you know that you have two values determining your groups, then it is likely that your
grouping variable is not stored as a categorical measurement. You can verify this by
printing a numerical summary. You can then fix the problem by wrapping the grouping
variable in the command as.factor(), for example
t.test(Age~as.factor(Gender), data = FakeData).
2. The t.test command has several additional options you should be aware of; enter the
command ?t.test at the console to view these. List three of the optional arguments
and what they are used for. What option would you have to add to skip missing
data points?
3. When reporting statistics you would not actually write out the four-step method; you
would write something like the following:
A two-sample t-test was used to compare the fuzziness of
chocolate labs and yellow labs. There was a significant difference between the
fuzziness of chocolate labs (M = 27.5, SD = 2.1) and yellow labs (M = 26.4, SD = 3.8);
t(21.9) = 2.051, p = 0.027. A 95% confidence interval for the actual mean difference in
fuzziness is (0.65, 1.55) units.
Note that no animals were harmed during this study and that the degrees of freedom
were reported in parentheses after the t.
When reporting on statistical analyses it is often helpful to look up the APA
reporting format for that particular test.
Write a similar paragraph reporting the results of your analysis of the relationship
between height and gender in question 1. Note that you will need to compute the
standard deviation for each of the groups (the means are given in the R output). In a
previous lab you learned how to summarize a variable by group and you will need to
do that here.
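As a reminder of one way to summarize by group, the sketch below uses base R's tapply on a small made-up data frame. The name demo and its values are assumptions for illustration only. The previous lab may have used a different command.

```r
# Made-up example data; NOT the FakeData values
demo <- data.frame(
  Height = c(64, 66, 63, 70, 72, 69),
  Gender = c("F", "F", "F", "M", "M", "M")
)

# tapply(values, groups, function) applies the function within each group
group_means <- tapply(demo$Height, demo$Gender, mean)
group_sds   <- tapply(demo$Height, demo$Gender, sd)

print(group_means)
print(group_sds)
```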
4. One-sample t-test. This command can be easily modified to run a one-sample t-test.
Suppose we wanted to determine if the Age variable is significantly less than 57. The
command would be
t.test(FakeData$Age, mu = 57, alternative = "less")
##
## One Sample t-test
##
## data: FakeData$Age
## t = -1.8524, df = 333, p-value = 0.03242
## alternative hypothesis: true mean is less than 57
## 95 percent confidence interval:
## -Inf 56.90452
## sample estimates:
## mean of x
##  56.12874
Note that we often want a confidence interval in our statistical summary. A 95%
confidence interval is included by default; however, if for example you wanted an 80%
confidence interval, it can be output by adding the additional option conf.level =
0.80 inside the parentheses.
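A minimal sketch of the conf.level option, using simulated values in place of FakeData$Age (the vector x is made up for illustration):

```r
set.seed(2)

# Simulated stand-in for FakeData$Age; NOT the lab data
x <- rnorm(50, mean = 56, sd = 8)

# Default 95% interval and an 80% interval for the same one-sided test
res95 <- t.test(x, mu = 57, alternative = "less")
res80 <- t.test(x, mu = 57, alternative = "less", conf.level = 0.80)

res95$conf.int
res80$conf.int

# The confidence level is stored as an attribute of the interval
attr(res80$conf.int, "conf.level")
```

Because alternative = "less" is a one-sided test, the interval is one-sided as well and its lower bound is -Inf. For a two-sided interval, leave alternative at its default value.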
Find a 99% confidence interval for the mean Pinkylen and determine if this value is
significantly greater than 3.10.
5. Matched pairs t-test. A matched pairs t-test is used for what cases?
6. We will not actually implement this test here, but will only provide an example. In this
case, R requires that your data is in wide format, with the first item in the pair in one
column and the second in another, as shown below.
Obs1 Obs2
12 13
11 14
15 14
In this case, we do not use the formula notation, but instead refer to each column
separately using the following version of the t.test command:
t.test(DataSet$Obs1, DataSet$Obs2, paired = TRUE)
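The sketch below runs this command on the three pairs shown above and checks that it matches a one-sample t-test on the pairwise differences, which is what a matched pairs test computes. The name DataSet follows the command above; the comparison against a one-sample test is added here for illustration.

```r
# The small paired example from the table above
DataSet <- data.frame(Obs1 = c(12, 11, 15), Obs2 = c(13, 14, 14))

# Matched pairs t-test on the two columns
paired <- t.test(DataSet$Obs1, DataSet$Obs2, paired = TRUE)

# Equivalent one-sample t-test on the differences, testing mu = 0
diffs <- DataSet$Obs1 - DataSet$Obs2
one_sample <- t.test(diffs, mu = 0)

# Both approaches give the same t statistic and p-value
all.equal(unname(paired$statistic), unname(one_sample$statistic))
all.equal(paired$p.value, one_sample$p.value)
```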
7. ANOVA. In what scenario would you use an ANOVA test?
8. Because you are comparing whether means of different groups are all equal or not,
this is equivalent to determining if the grouping variable has a significant relationship
with the quantitative variable. How many levels must the grouping variable have to
use ANOVA? Also, write out the steps in an ANOVA test. This is technically what is
called a one-way ANOVA. There are many other types of ANOVA for related
scenarios that are not covered in this course.
9. Suppose we want to determine if there is a significant relationship between the
FavColor variable and Height in the FakeData set. That is to say, is the mean Height of
those who prefer Grey the same as that of those who prefer Green and so on, or is at
least one of the means different? The command looks very similar to that used for the
t-test. Try it using the command
aov(Height~FavColor, data=FakeData)
## Call:
##    aov(formula = Height ~ FavColor, data = FakeData)
##
## Terms:
##                 FavColor Residuals
## Sum of Squares    27.932  7029.972
## Deg. of Freedom        3       330
##
## Residual standard error: 4.615511
## Estimated effects may be unbalanced
You will notice something strange here in that there is no F statistic or p-value reported.
This has to do with the wide range of ways ANOVA can be utilized, e.g. in testing the
significance of a regression line. To get the necessary values for the hypothesis test,
you will need to modify the command to
summary(aov(Height~FavColor, data=FakeData))
##              Df Sum Sq Mean Sq F value Pr(>F)
## FavColor      3     28   9.311   0.437  0.727
## Residuals   330   7030  21.303
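To see where the F value and p-value come from, the sketch below recomputes them from the sums of squares and degrees of freedom that the plain aov output reports. The variable names are chosen here for illustration; the four input numbers are taken from the output above.

```r
# Values reported by aov(Height~FavColor, data=FakeData)
ss_group <- 27.932    # Sum of Squares, FavColor
ss_resid <- 7029.972  # Sum of Squares, Residuals
df_group <- 3         # numerator degrees of freedom
df_resid <- 330       # denominator degrees of freedom

# Mean squares are sums of squares divided by their degrees of freedom
ms_group <- ss_group / df_group
ms_resid <- ss_resid / df_resid

# The F statistic is the ratio of the mean squares
f_value <- ms_group / ms_resid

# The p-value is the upper-tail area of the F distribution
p_value <- pf(f_value, df_group, df_resid, lower.tail = FALSE)

round(c(MeanSqGroup = ms_group, MeanSqResid = ms_resid,
        F = f_value, p = p_value), 3)
```

The square root of the residual mean square is the residual standard error reported by aov.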
Run an ANOVA to determine if there is a significant relationship between breakfast and
handspan.
10. Write a formal report of the results of the previous test, following a pattern similar to what
you did for t-tests. As with a t-test, you will need to report the degrees of freedom;
however, with F there are two degrees of freedom, which you would report following
the pattern F(numerator df, denominator df).
Trying it on your own
Now we will apply our skills to the cereal data stored in the file CEAREALS.csv. The
columns contain the nutrition info of several major types of cereal for sale today. You may
want to open and view the file to see the values for yourself.
For each of the following questions, you will need to identify the appropriate test and then
implement it, writing your results in paragraph form and including the R command you used
to produce them.
1. Do these major cereals contain significantly more than 3 g of protein in a serving?
2. Predict the mean sodium content in a serving of cereal.
3. Is there a significant difference in the sodium content between the different cereal
manufacturers?
4. Is there a significant difference in the mean calories between the hot and cold cereals?
5. Predict the mean difference in Potassium levels between hot and cold cereals.
Name Manufacturer Cold/Hot Calories Protein Fat Sodium Fiber Carbohydrates Sugars Potassium
100%_Bran N C 70 4 1 130 10 5 6 280
100%_Natural_Bran Q C 120 3 5 15 2 8 8 135
All-Bran K C 70 4 1 260 9 7 5 320
All-Bran_with_Extra_Fiber K C 50 4 0 140 14 8 0 330
Almond_Delight R C 110 2 2 200 1 14 8 .
Apple_Cinnamon_Cheerios G C 110 2 2 180 1.5 10.5 10 70
Apple_Jacks K C 110 2 0 125 1 11 14 30
Basic_4 G C 130 3 2 210 2 18 8 100
Bran_Chex R C 90 2 1 200 4 15 6 125
Bran_Flakes P C 90 3 0 210 5 13 5 190
Cap’n’Crunch Q C 120 1 2 220 0 12 12 35
Cheerios G C 110 6 2 290 2 17 1 105
Cinnamon_Toast_Crunch G C 120 1 3 210 0 13 9 45
Clusters G C 110 3 2 140 2 13 7 105
Cocoa_Puffs G C 110 1 1 180 0 12 13 55
Corn_Chex R C 110 2 0 280 0 22 3 25
Corn_Flakes K C 100 2 0 290 1 21 2 35
Corn_Pops K C 110 1 0 90 1 13 12 20
Count_Chocula G C 110 1 1 180 0 12 13 65
Cracklin’_Oat_Bran K C 110 3 3 140 4 10 7 160
Cream_of_Wheat_(Quick) N H 100 3 0 80 1 21 0 .
Crispix K C 110 2 0 220 1 21 3 30
Crispy_Wheat_&_Raisins G C 100 2 1 140 2 11 10 120
Double_Chex R C 100 2 0 190 1 18 5 80
Froot_Loops K C 110 2 1 125 1 11 13 30
Frosted_Flakes K C 110 1 0 200 1 14 11 25
Frosted_Mini-Wheats K C 100 3 0 0 3 14 7 100
Fruit_&_Fibre_Dates,_Walnuts,_and_Oats P C 120 3 2 160 5 12 10 200
Fruitful_Bran K C 120 3 0 240 5 14 12 190
Fruity_Pebbles P C 110 1 1 135 0 13 12 25
Golden_Crisp P C 100 2 0 45 0 11 15 40
Golden_Grahams G C 110 1 1 280 0 15 9 45
Grape_Nuts_Flakes P C 100 3 1 140 3 15 5 85
Grape-Nuts P C 110 3 0 170 3 17 3 90
Great_Grains_Pecan P C 120 3 3 75 3 13 4 100
Honey_Graham_Ohs Q C 120 1 2 220 1 12 11 45
Honey_Nut_Cheerios G C 110 3 1 250 1.5 11.5 10 90
Honey-comb P C 110 1 0 180 0 14 11 35
Just_Right_Crunchy__Nuggets K H 110 2 1 170 1 17 6 60
Just_Right_Fruit_&_Nut K H 140 3 1 170 2 20 9 95
Kix G C 110 2 1 260 0 21 3 40
Life Q C 100 4 2 150 2 12 6 95
Lucky_Charms G C 110 2 1 180 0 12 12 55
Maypo A H 100 4 1 0 0 16 3 95
Muesli_Raisins,_Dates,_&_Almonds R C 150 4 3 95 3 16 11 170
Muesli_Raisins,_Peaches,_&_Pecans R C 150 4 3 150 3 16 11 170
Mueslix_Crispy_Blend K C 160 3 2 150 3 17 13 160
Multi-Grain_Cheerios G C 100 2 1 220 2 15 6 90
Nut&Honey_Crunch K C 120 2 1 190 0 15 9 40
Nutri-Grain_Almond-Raisin K C 140 3 2 220 3 21 7 130
Nutri-grain_Wheat K C 90 3 0 170 3 18 2 90
Oatmeal_Raisin_Crisp G C 130 3 2 170 1.5 13.5 10 120
Post_Nat._Raisin_Bran P C 120 3 1 200 6 11 14 260
Product_19 K C 100 3 0 320 1 20 3 45
Puffed_Rice Q C 50 1 0 0 0 13 0 15
Puffed_Wheat Q C 50 2 0 0 1 10 0 50
Quaker_Oat_Squares Q C 100 4 1 135 2 14 6 110
Quaker_Oatmeal Q H 100 5 2 0 2.7 . . 110
Raisin_Bran K C 120 3 1 210 5 14 12 240
Raisin_Nut_Bran G C 100 3 2 140 2.5 10.5 8 140
Raisin_Squares K C 90 2 0 0 2 15 6 110
Rice_Chex R C 110 1 0 240 0 23 2 30
Rice_Krispies K C 110 2 0 290 0 22 3 35
Shredded_Wheat N C 80 2 0 0 3 16 0 95
Shredded_Wheat_’n’Bran N C 90 3 0 0 4 19 0 140
Shredded_Wheat_spoon_size N C 90 3 0 0 3 20 0 120
Smacks K C 110 2 1 70 1 9 15 40
Special_K K C 110 6 0 230 1 16 3 55
Strawberry_Fruit_Wheats N C 90 2 0 15 3 15 5 90
Total_Corn_Flakes G C 110 2 1 200 0 21 3 35
Total_Raisin_Bran G C 140 3 1 190 4 15 14 230
Total_Whole_Grain G C 100 3 1 200 3 16 3 110
Triples G C 110 2 1 250 0 21 3 60
Trix G C 110 1 1 140 0 13 12 25
Wheat_Chex R C 100 3 1 230 3 17 3 115
Wheaties G C 100 3 1 200 3 17 3 110
Wheaties_Honey_Gold G C 110 2 1 200 1 16 8 60
7_Grain_Hot_Cereal R H 140 6 2 0 6 28 1 0
10_Grain_Hot_Cereal R H 140 6 1 5 5 28 0 0
5_Grain_Rolled R H 120 5 2 0 5 24 0 0
Strawberries&Cream Q H 110 5 3 0 2 22 1 0
Bananas&Cream Q H 110 5 2 0 2 22 1 0
Peaches&Cream Q H 110 5 3 0 2 22 1 0
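The data above mark missing values with a period. The sketch below shows one way to make R treat such entries as NA when reading a table. It reads three rows copied from the data from an inline string; the actual layout of CEAREALS.csv is not shown here, so the separator and the options used are assumptions that may need adjusting for the real file.

```r
# Three rows from the cereal data, space-separated (assumed layout)
raw <- "Name Manufacturer Cold/Hot Calories Potassium
100%_Bran N C 70 280
Almond_Delight R C 110 .
Maypo A H 100 95"

# na.strings = "." turns the period into NA;
# quote = "" and comment.char = "" stop apostrophes and # in names
# from being treated as special characters
cereal <- read.table(text = raw, header = TRUE, na.strings = ".",
                     quote = "", comment.char = "")

# R replaces the "/" in the header, so the column is named Cold.Hot
names(cereal)

# Summaries must then skip the missing value explicitly
mean(cereal$Potassium, na.rm = TRUE)
```

If a file stores decimals with commas instead of points, read.table and read.csv also accept a dec = "," argument.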
