Part I: Tests
1.
Fill out the chart above!
α = Significance level
z or t = is it a z or t test
z or t Value = Critical Value
A.
α = .1
Use Z test
z vals = 1.64, -1.64
B.
α=.05
Use t-test
t-vals = 2.59, -2.59
C.
α=.05
Use z-test
z-val = 1.64
D.
α=.01
Use z test
z vals = 2.57, -2.57
E.
α=.2
Use z-test
z-val = .84
F.
α=.01
use t-test
t val = 2.82
G.
α= .01
use t-test
t-vals = 3.33, -3.33
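The z critical values in the chart can be cross-checked with a short sketch like the one below. It assumes that rows listing two values (A, D) are two-tailed and rows listing one value (C, E) are one-tailed, since the chart itself is not reproduced here; the helper name `z_critical` is made up for illustration. The t rows are left out because their degrees of freedom are not given in the text. Values rounded to three decimals can differ slightly from the two-decimal table values above.

```python
from statistics import NormalDist


def z_critical(alpha, two_tailed=True):
    """Positive critical z value for a given significance level."""
    # A two-tailed test splits alpha between both tails of the normal curve.
    tail = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail)


# (alpha, two_tailed) for the z-test rows; the tail counts are an assumption
# inferred from how many critical values each row lists.
rows = {
    "A": (0.10, True),
    "C": (0.05, False),
    "D": (0.01, True),
    "E": (0.20, False),
}

z_values = {label: round(z_critical(a, two), 3) for label, (a, two) in rows.items()}
for label, z in z_values.items():
    print(label, z)
```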
2. The Department of the Interior in Washington, D.C. estimates that the number of particular invasive species in a certain county (Bucks County) should be as follows per acre (averages based on data from the whole state of Pennsylvania): Asian-Long Horned Beetle, 4; Emerald Ash Borer Beetle, 10; and Golden Nematode, 75. A survey of 50 fields had the following results: (10 pts)
Species                     Mean (μ)   Std. dev. (σ)
Asian-Long Horned Beetle    3.2        0.73
Emerald Ash Borer Beetle    11.7       1.3
Golden Nematode             77         5.71
a. Test the hypothesis for each of these species. Assume that each test is two-tailed with a confidence level of 95%. *Use the appropriate test
b. Be sure to present the null and alternative hypotheses for each as well as conclusions
Asian-Long Horned Beetle
1) State Null Hypothesis: There is no difference between the Asian-Long Horned Beetle population in Bucks County and the expected value from Pennsylvania
2) State Alternative Hypothesis: There is a difference between the sampled population and the expected value.
3) This hypothesis test will use a z-test because the sample size (50) is above 30
4) α = .05 (95% confidence level)
5) Calculate Test Statistic: Z-value= -7.7491, well outside critical values (-1.96, 1.96)
6) Make decision about hypotheses: Reject null hypothesis in favor of the alternative hypothesis. Asian Long Horned Beetle population is significantly lower than the expected value
Emerald Ash Borer Beetle
1) State Null Hypothesis: There is no difference between the Emerald Ash Borer Beetle population in Bucks County and the expected value from Pennsylvania
2) State Alternative Hypothesis: There is a difference between the sampled population and the expected value.
3) This hypothesis test will use a z-test because the sample size (50) is above 30
4) α = .05 (95% confidence level)
5) Calculate Test Statistic: Z-value= 9.2468 which is outside of critical values (-1.96, 1.96)
6) Make decision about hypotheses: Reject null hypothesis in favor of the alternative hypothesis. Emerald Ash Borer Beetle population is significantly higher than the expected value
Golden Nematode
1) State Null Hypothesis: There is no difference between the Golden Nematode population in Bucks County and the expected value from Pennsylvania
2) State Alternative Hypothesis: There is a difference between the sampled population and the expected value.
3) This hypothesis test will use a z-test because the sample size (50) is above 30
4) α = .05 (95% confidence level)
5) Calculate Test Statistic: Z-value= 2.48, outside critical values (-1.96, 1.96)
6) Make decision about hypotheses: Reject null hypothesis in favor of the alternative hypothesis. Golden Nematode population is significantly higher than the expected value
c. What can be ascertained pertaining to the findings about these invasive species in Bucks County?
With these results in mind, it is safe to assume that Bucks County has an invasive species problem, with two of the three species (Emerald Ash Borer Beetle and Golden Nematode) significantly exceeding their expected presence. The Asian Long-Horned Beetle was significantly less numerous than expected, though.
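The three z statistics in question 2 can be reproduced with a short sketch like this one. It uses the survey means, standard deviations, and n = 50 from the table above; the function and variable names are only illustrative.

```python
import math


def z_statistic(sample_mean, expected_mean, sd, n):
    """z = (sample mean - expected mean) / (sd / sqrt(n))."""
    return (sample_mean - expected_mean) / (sd / math.sqrt(n))


# species: (sample mean, expected mean per acre, standard deviation)
species = {
    "Asian-Long Horned Beetle": (3.2, 4, 0.73),
    "Emerald Ash Borer Beetle": (11.7, 10, 1.3),
    "Golden Nematode": (77, 75, 5.71),
}
N = 50          # number of fields surveyed
Z_CRIT = 1.96   # two-tailed critical value at alpha = .05

results = {}
for name, (mean, expected, sd) in species.items():
    z = z_statistic(mean, expected, sd, N)
    results[name] = z
    if abs(z) > Z_CRIT:
        direction = "higher" if z > 0 else "lower"
        decision = f"reject H0 ({direction} than expected)"
    else:
        decision = "fail to reject H0"
    print(f"{name}: z = {z:.4f}, {decision}")
```

The sign of z gives the direction of the difference: negative means fewer than expected, positive means more.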
3. An exhaustive survey of all users of a wilderness park taken in 1960 revealed that the average number of persons per party was 2.1. In a random sample of 25 parties in 1985, the average was 3.4 persons with a standard deviation of 1.32 (one tailed test, 95% Con. Level)
a. Test the hypothesis that the number of people per party has changed in the intervening years. (State null and alternative hypotheses)
b. What is the corresponding probability value
a.
1) State Null Hypothesis: There is no difference between the average number of persons per party between the years of 1960 and 1985
2) State Alternative Hypothesis: The average number of persons per party differs between these years; because the test is one-tailed, the specific alternative is that it was higher in 1985 than in 1960.
3) This hypothesis test will use a t-test because the sample size (25) is below 30
4) α = .05 (95% confidence level), one-tailed.
5) Calculate Test Statistic: t-value= 4.92424, well outside the critical value of 1.711 (24 degrees of freedom)
6) Make decision about hypotheses: Reject null hypothesis in favor of the alternative hypothesis. Persons per party increased significantly between 1960 and 1985
b. The corresponding one-tailed probability (p) value is less than .0005: with 24 degrees of freedom the table value at .0005 is 3.745, and 4.92424 is well beyond it.
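The t statistic and its one-tailed probability value for question 3 can be sketched as follows. The p-value here is approximated by integrating the Student's t density numerically with Simpson's rule, which is a stand-in for a statistics package or a t table, not a method taken from the assignment; the helper names are made up for illustration.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def upper_tail_p(t, df, steps=10_000):
    """Approximate P(T >= t) for t >= 0 using Simpson's rule on [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # Half of the distribution lies above 0; subtract the area between 0 and t.
    return 0.5 - total * h / 3


n = 25
t_stat = (3.4 - 2.1) / (1.32 / math.sqrt(n))
p_value = upper_tail_p(t_stat, df=n - 1)
print(f"t = {t_stat:.5f}, one-tailed p = {p_value:.6f}")
```

A p-value below α = .05 leads to the same decision as comparing the t-value with the critical value.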
Part II. 'Up North' Study
Introduction:
This exercise was designed to introduce students to the concepts associated with Chi Square testing, providing a scenario in which we were asked to assume the role of a research consultant on the concept of 'up north.' This involves using Chi Square testing and a series of maps to examine a number of variables. WI SCORP (Statewide Comprehensive Outdoor Recreation Plan) data provides information on recreation variables for each county in Wisconsin. Chi Square tests are used to evaluate the relationships between these variables as they apply to Northern and Southern counties.
Methods:
First, I downloaded Wisconsin Counties data. This was available already on our UW-Eau Claire ArcGIS drive, so I imported it into a new geodatabase. Next, I had to determine how to define North vs. South for these counties. We were advised to use Highway 29 as a reference, so I imported a Major Roads dataset, queried for Hwy 29, and created a new feature class for it. I overlaid this on my counties dataset, then selected the counties I wanted to categorize.
Next, I added the SCORP table to the GIS, and joined it to my counties feature class. I then added a field to the joined feature class specifying North vs. South for each county. This is shown below in the Results section. Next, I selected a number of variables that I thought could demonstrate the difference between northern and southern Wisconsin. I chose number of Bike Trails, number of ATV Trails and Acreage of Parks. For each variable, I created a new field and used the select by attribute tool along with the field calculator to populate my new field. I divided the range of values for each variable into four equal parts and used those quarters as break points, yielding a linear breaks categorization from 1-4. I exported the feature class's table as a dBASE Table for use in SPSS to calculate statistics.
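The four-class linear breaks categorization described above can be sketched as follows, assuming it means four equal-width intervals between the minimum and maximum of each variable. The trail counts are made-up numbers for illustration, not SCORP data, and `equal_interval_class` is a hypothetical helper rather than an ArcGIS function.

```python
def equal_interval_class(value, lo, hi, n_classes=4):
    """Return a class from 1 to n_classes using equal-width breaks over [lo, hi]."""
    if hi == lo:
        return 1  # all values identical, so only one class is meaningful
    width = (hi - lo) / n_classes
    k = int((value - lo) / width) + 1
    return min(k, n_classes)  # the maximum value belongs to the top class


# Hypothetical trail counts for five counties (illustration only).
trails = [0, 2, 3, 5, 40]
lo, hi = min(trails), max(trails)
classes = [equal_interval_class(v, lo, hi) for v in trails]
print(classes)  # [1, 1, 1, 1, 4]
```

With one large outlier, almost every county lands in class 1, and the outlier sits alone in class 4.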
I used SPSS' crosstabs statistics to calculate Chi Squared Tests for each variable. The results of these tests are shown below in the results section.
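For readers without SPSS, the Pearson Chi Square statistic behind a crosstab can be sketched in a few lines. The counts below are invented for illustration and are not the SCORP results; the function name is hypothetical. SPSS reports a p-value ('Asymp. Sig.'), whereas this sketch compares the statistic with the table critical value, which leads to the same decision.

```python
def chi_square(observed):
    """Pearson chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Count expected in this cell if row and column were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (count - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, df


# Made-up counts, NOT the SCORP data: rows are North and South,
# columns are classes 1-4 of some recreation variable.
example = [
    [20, 8, 3, 2],
    [30, 6, 2, 1],
]
stat, df = chi_square(example)
CRITICAL_05_DF3 = 7.815  # chi-square table value for alpha = .05 with df = 3
decision = "reject H0" if stat > CRITICAL_05_DF3 else "fail to reject H0"
print(round(stat, 3), df, decision)
```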
Results:
This is the final map created using ESRI ArcGIS. The classification method used affects the display significantly, as shown by the two trail maps.
Chi Square Test for distribution of Parks.
Crosstabulation for Parks Distribution.
Chi Square Test for Bike Trails
Crosstabulation for Bike Trails
Chi Square Test for ATV Trails
Crosstabulation for ATV Trails
Discussion:
The results of the Chi Square Tests are not what I anticipated. Refer to the 'Asymp. Sig.' number in the Pearson Chi-Square row: this is the p-value of the test, which is to be compared with the significance level of .05 (95% confidence level). As you can see, only the ATV Trails data set's p-value is below this. This means that you would reject the null hypothesis, which states that Northern and Southern WI's distributions of ATV Trails are statistically identical, in favor of the alternative hypothesis, which states that there is indeed a difference. For bike trails and park acreage, though, there is no statistically significant difference because the p-value is above .05, so you would fail to reject the null hypothesis.
With this in mind, it becomes obvious that further research into the idea of 'up north' is necessary, and more pertinent variables should be selected to highlight the differences between northern and southern WI. Another interesting thought is that while I expected bike trails to be more prevalent in northern WI, they really were more prevalent in the southern part of the state. In fact, this variable was almost significant, with a p-value of just .127.
Another important lesson from this exercise is that it is important to select a break method that accurately reflects the data. I used a linear break method, resulting in many counties with the same values and little variation. Since the linear break method does not account for outliers, the counties with high numbers of bike/ATV trails stood out and were basically the only members of their range of values. Perhaps in the future, a natural breaks method or standard deviation method would provide a more informative visualization.