Applied Econometrics Exam 1 (100 points)

1. [70 points] You wish to predict the sale price of single-family residences in Massachusetts using
property features (commonly called a “hedonic pricing model”). You collect price and property features
data on properties sold in the state for the year 2010 and obtain the following regression:
Price_i = 14407.60 - 759.92*houseage_i + 0.24*lotsize_i + 354.35*bldarea_i + 12015.61*rooms_i + µ_i
Observations = 2691    R^2 = 0.49    F = 65.10
Where:
houseage = age of the house (in years)
lotsize = total square feet of the land
bldarea = total square feet of the interior of the house
rooms = total number of rooms in the house
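As an illustration of how the fitted equation maps property features to a predicted price, here is a minimal Python sketch. The coefficients are copied from the regression above; the function name `predict_price` and the example property are hypothetical and are not part of the exam data. The error term is set to zero, so the result is a fitted value and not an actual sale price.

```python
# Coefficients copied from the fitted regression reported above.
COEFFICIENTS = {
    "constant": 14407.60,
    "houseage": -759.92,   # per year of age
    "lotsize": 0.24,       # per square foot of land
    "bldarea": 354.35,     # per square foot of interior
    "rooms": 12015.61,     # per room
}


def predict_price(houseage, lotsize, bldarea, rooms):
    """Return the fitted price for one property (error term set to zero)."""
    return (
        COEFFICIENTS["constant"]
        + COEFFICIENTS["houseage"] * houseage
        + COEFFICIENTS["lotsize"] * lotsize
        + COEFFICIENTS["bldarea"] * bldarea
        + COEFFICIENTS["rooms"] * rooms
    )


# Hypothetical property, for illustration only (not taken from the dataset).
example = predict_price(houseage=20, lotsize=10_000, bldarea=2_000, rooms=8)
print(f"Predicted price: {example:,.2f}")
```

Because lotsize is measured in single square feet, its coefficient is multiplied by a number in the thousands for a typical lot, which is worth keeping in mind when comparing coefficient magnitudes.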
A. [6 points] How would you categorize, or label, this dataset? Defend your answer.
B. [6 points] What is the interpretation of the constant term in this regression? Why is it included?
C. [8 points] How do we interpret the coefficient on lotsize? Why is the coefficient on lotsize
nominally small if we expect it to have a large impact on the price of a house?
D. [6 points] What is the predicted price of a house that is 7 years old, with a lot size of 800 square
feet, a building interior of 400 square feet, and 5 rooms? Will this predicted price be close to the
actual price? Why or why not?
E. [6 points] Explain what is meant by an R^2 value of 0.49. What is one good reason and one
bad reason to use R^2 as a measure of the “goodness of fit” of a regression?
F. [10 points] Test the significance of each independent variable in the model using α = .05. Are
these findings expected? Why or why not, and what could explain your findings?
G. [8 points] Construct a 95% confidence interval for the coefficient on houseage in the model above. What is this
measure telling us? How will consistency in your OLS estimation affect your confidence
interval?
H. [10 points] After thinking about your model further, you wish to add median income as a variable
in your regression. You collect data on the median income of each census tract in Massachusetts
in the year 2010, and match that to your housing data. Assuming you believe that your original
form of the model suffered from omitted variable bias, in what direction would you expect your
estimates to change with the inclusion of median income? Defend your answers.
I. [10 points] Suppose you ran the same model as above, only using log(price) instead of price, and
obtained an R^2 of 0.54 and an F-statistic of 68.17. Based on this information, are we able to say
which version (level or log) of the model is better? Explain why or why not.

2. [10 points] The following questions are based on airline data collected from routes in the U.S. for the
year 2000. You are interested in examining the determinants of ticket prices.
You decide to generate a scatter diagram to visually assess if the average one-way fare for a route is
related to the distance of that route.
[Scatter diagram: avg. one-way fare, $ (vertical axis) against distance, in miles (horizontal axis)]
Based on this scatter diagram, would OLS be BLUE if we ran a regression of the average one-way fare for
a route on the distance of that route? Explain why or why not and if this would affect any estimated
coefficients.

3. [20 points] Suppose that 5 years ago, a new job training program was introduced in Massachusetts
that accepted 1000 applicants. For $200, participants would undergo a 6-week training program that
would teach them about basic computer skills, resume building, interviewing, and job searching. Now, 5
years later, you wish to survey some of these participants to see if the training has helped them.
You create a simple survey asking participants how much they thought the job training helped them in
their professional life on a scale of 1 to 5 (with 5 being most helpful and 1 being least helpful), how much
they currently earn per month, and how many jobs they are currently working.
You mail out 100 surveys to known participants, and after waiting several weeks you receive 32
responses. You wish to use the income and number-of-jobs responses to predict how favorably a
participant viewed the training (using the 1 through 5 scale above). If you estimate your model using
OLS, do you believe that any of our Classical Linear Model assumptions would be violated? If so, explain
which ones, why they are violated, and what potential problems that could pose for your estimation.