ECON 306 – Homework 3

The following two problems require many calculations in STATA (or whatever tool you choose to
execute the calculations) and will generate many pages of output. Organize your submission as
follows. The first pages should contain your answers to all the questions, along with any key
algebraic equations or explanations you need along the way. After that, include a printout of
the output from the regressions you ran in support of your answers, and highlight any numbers in
this output that you used in the first section. (To save paper, you may print this section
double-sided and/or in 2-up format.) Last, include a copy of the DO file that contains the
commands you asked STATA to execute. Organize these commands so that they will be clear to the
reader.

1. (52 points total, 4 points each part) With this assignment you will find a STATA data file

called boston.dta. For reference, the variables in this file are:

nox = nitric oxides concentration (parts per 10 million)

rm = average number of rooms per dwelling

age = proportion of owner-occupied units built prior to 1940

dis = weighted distances to five Boston employment centers

ptratio = pupil-teacher ratio by town

lstat = percent lower status of population

medv = median value of owner-occupied homes (in thousands of dollars)

Open this dataset within STATA (only STATA can open it). Before you begin answering
the following questions, it is a good idea to ask STATA to summarize the data using the command
summarize. You should also start a log file to store your results.

a.) Run the following regression: $\text{MEDV} = \beta_0 + \beta_1 \cdot \text{RM} + \varepsilon$

b.) Hypothesize the sign of the bias, if any, resulting from excluding age from the
regression. Explain your reasoning. (There is no wrong answer as long as you tell
a sensible story.)

c.) Use the data to verify (or refute) your claim from (b). Break down the bias into its
pieces.
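
For part (c), one standard way to break the bias into pieces is the omitted-variable-bias identity. The notation below is assumed for illustration and does not come from the assignment: the tilde marks the short regression from (a), the hats mark the regression of medv on both rm and age, and the δ terms come from an auxiliary regression of age on rm.

```latex
\begin{aligned}
\text{short:}\quad & \text{medv} = \tilde\beta_0 + \tilde\beta_{rm}\,\text{rm} + \tilde u \\
\text{long:}\quad & \text{medv} = \hat\beta_0 + \hat\beta_{rm}\,\text{rm} + \hat\beta_{age}\,\text{age} + \hat u \\
\text{auxiliary:}\quad & \text{age} = \hat\delta_0 + \hat\delta_{rm}\,\text{rm} + \hat v \\
\text{identity:}\quad & \tilde\beta_{rm} = \hat\beta_{rm} + \hat\beta_{age}\,\hat\delta_{rm}
\end{aligned}
```

When all three regressions use the same observations, the identity holds exactly in the sample, so the sign of the bias term is the sign of $\hat\beta_{age}$ times the sign of $\hat\delta_{rm}$.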

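d.) Now, run the regression:

$\text{MEDV} = \beta_0 + \beta_1 \cdot \text{NOX} + \beta_2 \cdot \text{RM} + \beta_3 \cdot \text{AGE} + \beta_4 \cdot \text{DIS} + \beta_5 \cdot \text{PTRATIO} + \beta_6 \cdot \text{LSTAT} + \varepsilon$

e.) At a level of α=.05, for which values of βi, if any, would you reject the null
hypothesis that βi=0?

f.) What is the predicted medv with nox=0.5, rm=4, age=60, dis=3, ptratio=20,
and lstat=10?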

g.) Redo (f) but with nox=0.6. What is the difference in predicted medv between these

two communities? Compare this with the coefficient of nox.

h.) Ceteris paribus, compared to (f), what is the impact of reducing the pupil-teacher
ratio to 18?
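
Parts (f) through (h) all evaluate the fitted equation at chosen regressor values. The arithmetic can be sketched in a few lines of Python, used here only as a tool-neutral illustration. The coefficient values are made-up placeholders, not estimates from boston.dta, and the function name `predict_medv` is hypothetical; substitute the coefficients from your own regression output.

```python
# Placeholder coefficients for illustration only; NOT results from boston.dta.
# Substitute the estimates from your own regression in part (d).
coef = {
    "_cons": 30.0,
    "nox": -15.0,
    "rm": 4.0,
    "age": -0.05,
    "dis": -1.0,
    "ptratio": -1.0,
    "lstat": -0.5,
}

def predict_medv(coef, x):
    """Fitted value: intercept plus the sum of coefficient * regressor value."""
    return coef["_cons"] + sum(coef[name] * value for name, value in x.items())

base = {"nox": 0.5, "rm": 4, "age": 60, "dis": 3, "ptratio": 20, "lstat": 10}  # part (f)
higher_nox = dict(base, nox=0.6)        # part (g)
smaller_class = dict(base, ptratio=18)  # part (h)

pred_f = predict_medv(coef, base)
pred_g = predict_medv(coef, higher_nox)
pred_h = predict_medv(coef, smaller_class)

# Holding the other regressors fixed, a change in one regressor moves the
# prediction by (its coefficient) * (the size of the change).
print(pred_g - pred_f)  # equals coef["nox"] * 0.1 up to floating-point rounding
print(pred_h - pred_f)  # equals coef["ptratio"] * (18 - 20)
```

Because the model is linear, each difference depends only on the coefficient of the variable that changed, not on the values of the other regressors.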

i.) What percentage of the variation in medv is explained by the six X-variables?

Now change the measurement of nox. Use the ‘gen’ command:

gen noxppm=nox/10

and then use this in place of nox in the regression command

regress medv noxppm rm age dis ptratio lstat

j.) Compare the coefficient, standard error, and t-ratio for noxppm to that of nox.

Interpret the difference between this model and the previous.

k.) Also compare the $R^2$ and the remaining coefficients. Interpret the difference between this model and the original regression model.
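
The effect of dividing a regressor by 10 can be previewed on a toy example. The Python sketch below is an illustration under stated assumptions: it uses a single regressor and synthetic numbers (not boston.dta), and the helper `simple_ols` is hypothetical. The same scaling behaviour carries over to the rescaled variable in a multiple regression.

```python
import math

def simple_ols(x, y):
    """Slope, its standard error, and t-ratio from a one-regressor OLS fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ssr = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(ssr / (n - 2) / sxx)  # residual variance divided by Sxx
    return slope, se, slope / se

# Synthetic numbers for illustration only; they are not from boston.dta.
x = [4.0, 5.0, 5.5, 6.0, 7.0, 8.0]
y = [30.0, 27.0, 24.0, 25.0, 20.0, 18.0]
x_rescaled = [xi / 10 for xi in x]  # mirrors: gen noxppm=nox/10

b, se, t = simple_ols(x, y)
b10, se10, t10 = simple_ols(x_rescaled, y)

print(b10 / b, se10 / se)  # both ratios are 10, up to rounding
print(t, t10)              # the t-ratio is unchanged
```

Dividing the regressor by 10 multiplies its coefficient and standard error by 10, so their ratio, the t-statistic, stays the same.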

Now change the variable age to newage

gen newage=100-age

and then use this in place of age in the original regression command. That is,

execute:

regress medv nox rm newage dis ptratio lstat
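
The effect of this substitution can be anticipated algebraically. As a sketch, write $\beta_0$ for the intercept and $\beta_3$ for the coefficient on age in the original model, and substitute age = 100 − newage:

```latex
\begin{aligned}
\beta_0 + \beta_3\,\text{age}
  &= \beta_0 + \beta_3\,(100 - \text{newage}) \\
  &= (\beta_0 + 100\,\beta_3) - \beta_3\,\text{newage}
\end{aligned}
```

The other terms in the model are untouched by the substitution, so only the intercept and the sign of the age coefficient would be expected to change.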

l.) Compare the coefficient, standard error, and t-ratio for newage to that of age.

Interpret the difference between this model and the previous.

m.) Also compare the $R^2$ and the remaining coefficients. Interpret the difference between this model and the original regression model.

2. (48 points total. 5 points each part, +3 for free.) For the following problem, use the

STATA dataset called crime.dta. This data set was compiled by Christopher Cornwell

and William Trumbull to study factors that influence crime rates. The data set contains

observations for 90 counties in North Carolina for 1981. The definitions of the variables

represented in the data set are:

crmrte=crime rate

prbarr=probability of arrest

prbconv=probability of conviction

prbpris=probability of a prison sentence

avgsen=average sentence in days

polpc=number of police per capita

density=population density

pctmin=percent minority

pctymle=percent young males

wmfg=average weekly wage in manufacturing

wcon=average weekly wage in construction

wtuc=average weekly wage in transportation, utilities, and communications

wtrd=average weekly wage in wholesale and retail trade

wfir=average weekly wage in finance, insurance, and real estate

wser=average weekly wage in services

wfed=average weekly wage in federal government

wsta=average weekly wage in state government

wloc=average weekly wage in local government

According to the economic model of crime rates, lower crime rates are associated with

better labor markets (higher wages), more police presence and tougher sentences, and

lower population density. We will use this data set to examine these hypotheses. Use a

significance level of α=.05 for all hypothesis tests.

a.) Run a regression of crmrte on all of the other variables. Call this Model 1.

b.) Do any t-statistics indicate that a variable is not statistically significant? Which?

c.) Interpret the F-statistic STATA has calculated for Model 1.

d.) Test the hypothesis that the coefficients on wfed and wsta are equal to each other.
Use the t-test method described in the lectures. What transformation do you need to
do here? Be specific.
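
One common way to turn the equality of two coefficients into a single-coefficient t-test is to reparameterize the model. The following is a sketch of that standard approach and may differ in detail from the method presented in the lectures; the symbols $\theta$, $\beta_{fed}$, and $\beta_{sta}$ are assumed notation.

```latex
\begin{aligned}
H_0 &: \beta_{fed} = \beta_{sta}
  \quad\Longleftrightarrow\quad \theta \equiv \beta_{fed} - \beta_{sta} = 0 \\
\beta_{fed}\,\text{wfed} + \beta_{sta}\,\text{wsta}
  &= (\theta + \beta_{sta})\,\text{wfed} + \beta_{sta}\,\text{wsta} \\
  &= \theta\,\text{wfed} + \beta_{sta}\,(\text{wfed} + \text{wsta})
\end{aligned}
```

Under this parameterization, a regression that replaces wsta with the sum wfed + wsta (keeping wfed and every other Model 1 variable) has $\theta$ as the coefficient on wfed, so the t-statistic reported for wfed tests $H_0$ directly.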

e.) Test the hypothesis that the coefficients on wfed, wsta and wloc are all equal to

each other. Do this by writing down the formula for the relevant F-statistic.

Calculate it (by running the appropriate restricted regression) and test the hypothesis.

Report these results. This restricted version of the regression will be called Model 2.

f.) Return to Model 1. Now test the hypothesis that the coefficients on pctmin and pctymle both
equal zero. Do this by writing down the formula for the relevant F-statistic. Calculate it
(by running the appropriate restricted regression) and test the hypothesis. Report
these results. This restricted version of the regression will be called Model 3.
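
The restricted-versus-unrestricted comparison used in parts (e) and (f) can be sketched numerically. This Python sketch assumes the usual sum-of-squared-residuals form of the F-statistic; the function name `f_statistic` and the input numbers are hypothetical placeholders, not output from crime.dta.

```python
def f_statistic(ssr_restricted, ssr_unrestricted, q, df_unrestricted):
    """F = [(SSR_r - SSR_u) / q] / [SSR_u / df_u].

    q is the number of restrictions and df_u is the residual degrees of
    freedom of the unrestricted model (observations minus estimated
    coefficients, including the intercept).
    """
    numerator = (ssr_restricted - ssr_unrestricted) / q
    denominator = ssr_unrestricted / df_unrestricted
    return numerator / denominator

# Made-up sums of squared residuals, purely to show the arithmetic.
f_value = f_statistic(ssr_restricted=120.0, ssr_unrestricted=100.0,
                      q=2, df_unrestricted=50)
print(f_value)  # (20 / 2) / (100 / 50) = 5.0
```

The computed value is then compared with the critical value of the F distribution with q and df_u degrees of freedom at α=.05. Requiring three coefficients to be equal, as in (e), counts as two restrictions, and setting two coefficients to zero, as in (f), also counts as two.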

The model could potentially be simplified by replacing all the wage variables with an
average. Specifically, let us define

$\text{avgwage} = \dfrac{\text{wmfg} + \text{wcon} + \text{wtuc} + \text{wtrd} + \text{wfir} + \text{wser} + \text{wfed} + \text{wsta} + \text{wloc}}{9}$

Generate this variable.

g.) Return to Model 1 and run it using avgwage in place of the individual wage variables.
Check the validity of this restriction. As before, do this by writing down the formula
for the relevant F-statistic. Calculate it (by running the appropriate restricted
regression) and test the hypothesis. Report these results. This restricted version of the
regression will be called Model 4.
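
To see what restriction Model 4 imposes, it may help to expand the avgwage term. In this sketch $\gamma$ denotes the coefficient on avgwage (assumed notation, not from the assignment):

```latex
\gamma\,\text{avgwage}
  = \frac{\gamma}{9}\,\bigl(\text{wmfg} + \text{wcon} + \text{wtuc} + \text{wtrd} + \text{wfir}
      + \text{wser} + \text{wfed} + \text{wsta} + \text{wloc}\bigr)
```

Read this way, Model 4 is Model 1 with all nine wage coefficients forced to the common value $\gamma/9$, which amounts to eight equality restrictions.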

h.) Let’s focus our attention on the coefficient for the variable polpc. How do the
value of this coefficient and its statistical significance change as we move
from model to model? To answer this, construct a table containing the results for
this coefficient for each of the four models. In this table, include the coefficient
values, the values of the t-statistic (for the hypothesis that the coefficient equals 0), and
whether you would reject that hypothesis.

i.) What do your results in the last question imply about the relationship between the
number of police and the crime rate? Are you confident in these results based on the

work you have done? Why or why not?
