\documentclass[12pt]{article}

\addtolength{\textheight}{2.0in}
\addtolength{\topmargin}{-1.15in}
\addtolength{\textwidth}{1.2in}
\addtolength{\evensidemargin}{-0.75in}
\addtolength{\oddsidemargin}{-0.7in}
\setlength{\parskip}{0.1in}
\setlength{\parindent}{0.0in}

\pagestyle{empty}
\raggedbottom

\newcommand{\given}{\, | \,}

\begin{document}

\begin{flushleft}
Prof.~David Draper \\
Department of \\
\hspace*{0.1in} Applied Mathematics and Statistics \\
University of California, Santa Cruz
\end{flushleft}

\begin{center}
\textbf{\large AMS 206: Quiz 2 \textit{[45 total points]}}
\end{center}

\begin{tabular}{ll}
\hspace*{-0.14in} Name: \underline{\hspace*{6.0in}} \\
\end{tabular}

\vspace*{0.1in}

You're an economist interested in patterns of unemployment of U.S.~adults over time. As part of this interest, You decide to take a sample from the population $\mathcal{ P }$ of people 18 years of age or older who were living in Santa Cruz (city, not county) as of time $T =$ (16 Jan 2019).
The most recent U.S.~census, extrapolated to the beginning of 2019, estimates the total population of the city of Santa Cruz at that time as 64,465, and data from the website \texttt{suburbanstats.org} lead to an estimate of $N \doteq $ 54,342 as the total number of those people whose age was at least 18 at time $T$.
You decide to take a representative sample of $n = 921$ people from $\mathcal{ P }$ and ask each sampled person ``Do you consider yourself fully employed at the time of this survey?'', with possible responses \{\textit{yes, no, other} (e.g., refuse to answer)\}.
Let $\theta$ be the proportion of the 54,342 people who would have answered \textit{yes} to this question, if You had been able to survey the entire population, and let $s$ (an integer between 0 and $n$, inclusive) be the number of people in Your sample who actually do answer \textit{yes}.
\begin{itemize}
\item[(1)] In class we agreed that the simplest method for obtaining a \textit{representative} sample from a (finite) population is \textit{random sampling}. Given that there's no list of \{all $N$ people, with their addresses and other contact information\} from which You could draw a random sample (which is true; for one thing, what about homeless people?), in practice would it be easy, hard, or in between for You to construct a sample that You and other reasonable people would agree is representative (like a random sample) from the population $\mathcal{ P }$? Explain briefly. \textit{[5 points]}
Describe (e.g., on another sheet of paper) how You personally would attempt to obtain an arguably representative sample from $\mathcal{ P }$. \textit{[5 points]}
\vspace*{0.8in}
\end{itemize}
For the rest of this problem, let's assume that You have indeed been able to create a sample that's similar to what You would have obtained with random sampling, and that Your results were as follows: $n_{ yes } = s = 830$ people said \textit{yes}, $n_{ no } = 72$ said \textit{no}, and $n_{ other } = 19$ were recorded as \textit{other}.
\begin{itemize}
\item[(2)] Before You get Your sample data, is the logical status of $\theta$ known or unknown? What about $s$? Answer both questions again at a moment in time after Your sample data have arrived. \textit{[5 points]}
\end{itemize}
\newpage
\begin{itemize}
\item[(3)] In class we saw that calculations relevant to uncertainty quantification were of two types --- \textit{probability} and \textit{statistics} --- and that statistical activities in turn were of four types --- \textit{description}, \textit{inference}, \textit{prediction}, and \textit{decision-making} --- making a total of five classes of methods relevant to AMS 206. For each of the following (\textit{[5 points each]}), identify the activity or calculation as one of these five classes, and briefly explain Your choice.
\begin{itemize}
\item[(a)] After the data are available, You estimate that a future sample survey of size $n_{ future } = 614$ from $\mathcal{ P }$ in early 2020 would contain about $\hat{ n }_{ yes } = 553$ \textit{yes} responses.
\vspace*{0.6in}
\item[(b)] Before the data set arrives, and temporarily pretending that $\theta$ is known, under IID random sampling the sampling distribution (probability mass function) of $s$ given $\theta$ (and $n$) is $( s \given n \, \theta \, \mathcal{ B } ) \sim \textrm{Binomial} ( n, \theta )$, where $\mathcal{ B }$ summarizes the background context of Your sample survey.
\vspace*{0.6in}
\item[(c)] In consultation with You and on the basis of Your survey, the Santa Cruz City Council votes (5 in favor, 2 opposed) to allocate \$57,300 in the fiscal year 2020 budget to be distributed to winning grant proposals for ways to reduce unemployment in the city.
\vspace*{0.6in}
\item[(d)] After the data set has been collected, You estimate $\theta$ to be about $\hat{ \theta } = \frac{ s }{ n } = \frac{ 830 }{ 921 } \doteq$ 90.1\%, with a give-or-take of about 1.0\% and a 95\% interval estimate of about $( 88.2\%, 92.0\% )$.
\vspace*{0.6in}
\item[(e)] You summarize Your data set with the vector $( n_{ yes }, n_{ no }, n_{ other } ) = ( 830, 72, 19 )$.
\vspace*{0.6in}
\end{itemize}
\item[(4)] In estimating the unemployment rate in $\mathcal{ P }$ at time $T$, You have to decide what to do about the $n_{ other } = 19$ people who answered \textit{other}. One possible approach is \textit{sensitivity analysis}: at one extreme You could imagine that all 19 of those people would have answered \textit{yes} if they had given a \textit{yes}/\textit{no} answer, and at the other extreme You could imagine them all answering \textit{no}. This defines a range of possible unemployment rate estimates, and if this range is narrow enough You've demonstrated that it doesn't matter much what You do with the \textit{other} people.
Compute the lower and upper endpoints of this range with the data set in this problem. Would You say that the effect of the \textit{other} people is negligible here? Explain briefly. \textit{[5 points]} \end{itemize} \end{document}