10.6 Exercises

Exercise 10.1 This exercise fits \(y=bx\), as in this section, to the following data:

\(x\)	\(y\)
1	3
2	5
4	4
4	10

Using calculus, show that the cost function \(S(b)=(3-b)^2+(5-2b)^2+(4-4b)^2+(10-4b)^2\) attains its minimum at \(b \approx 1.86\).
Use a similar approach to determine the minimum of the revised cost function \(\tilde{S}(b)=(3-b)^2+(5-2b)^2+(4-4b)^2+(10-4b)^2 + (b-1.3)^2\). Call this value \(\tilde{b}\).
Make a plot of the cost functions \(S(b)\) and \(\tilde{S}(b)\) to verify the optimum values.
Make a scatter plot of the data together with the functions \(y=bx\) and \(y=\tilde{b}x\). How do the two estimates compare with the data?
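
As a rough numerical check on the calculus in this exercise, the sketch below evaluates both cost functions in Python (not the R used elsewhere in the text). The function names `S` and `S_tilde` and the grid spacing are illustrative choices, not part of the exercise.

```python
# Data from the table in Exercise 10.1.
x = [1, 2, 4, 4]
y = [3, 5, 4, 10]

def S(b):
    """Sum of squared residuals for the model y = b*x."""
    return sum((yi - b * xi) ** 2 for xi, yi in zip(x, y))

def S_tilde(b, prior=1.3):
    """S(b) plus the penalty (b - prior)^2 from the revised cost function."""
    return S(b) + (b - prior) ** 2

# Setting S'(b) = 0 gives b = sum(x*y) / sum(x^2).
sxy = sum(xi * yi for xi, yi in zip(x, y))
sxx = sum(xi ** 2 for xi in x)
b_hat = sxy / sxx

# The penalty adds the prior value to the numerator and 1 to the denominator.
b_tilde = (sxy + 1.3) / (sxx + 1)

# Coarse grid search over 1 <= b <= 3 as an independent check.
grid = [i / 1000 for i in range(1000, 3001)]
b_grid = min(grid, key=S)
b_tilde_grid = min(grid, key=S_tilde)

print(round(b_hat, 3), round(b_tilde, 3))
print(b_grid, b_tilde_grid)
```

The grid search only locates each minimum to the nearest 0.001, so it confirms the closed-form values rather than replacing the derivation.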

Exercise 10.2 (Inspired by van den Berg (2011)) Consider the nutrient equation \(\displaystyle y = c x^{1/\theta}\) using the dataset phosphorous.

Write down a formula for the objective function \(S(c,\theta)\) for this equation, expressed in terms of the data in phosphorous.
Fix \(c=1.737\). Make a ggplot of \(S(1.737,\theta)\) for \(1 \leq \theta \leq 10\).
How many critical points does this function have over this interval? Which value of \(\theta\) is the global minimum?
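
One way to explore a cost function of this kind before plotting is to evaluate it on a grid of \(\theta\) values. The Python sketch below assumes a least-squares objective and uses synthetic placeholder data generated from the model itself, not the phosphorous dataset; the names `x_demo`, `y_demo`, and `theta_true` are hypothetical.

```python
def S(c, theta, x, y):
    """Sum of squared residuals for the model y = c * x**(1/theta)."""
    return sum((yi - c * xi ** (1 / theta)) ** 2 for xi, yi in zip(x, y))

# Synthetic placeholder data (NOT the phosphorous dataset): generated exactly
# from the model with c = 1.737 and a hypothetical theta_true = 4.
c = 1.737
theta_true = 4.0
x_demo = [0.5, 1.0, 2.0, 4.0, 8.0]
y_demo = [c * xi ** (1 / theta_true) for xi in x_demo]

# Evaluate S(1.737, theta) on a grid over 1 <= theta <= 10.
thetas = [1 + 0.01 * i for i in range(901)]
costs = [S(c, th, x_demo, y_demo) for th in thetas]

# Location of the smallest cost on the grid.
theta_best = thetas[costs.index(min(costs))]
print(theta_best)
```

Because the placeholder data contain no noise, the grid minimum falls at the value used to generate them. With the real dataset the curve will generally look different, which is what the plot in this exercise is meant to reveal.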

Exercise 10.3 Use the cost function \(S(1.737,\theta)\) from Exercise 10.2 to answer the following questions:

Researchers believe that \(\theta \approx 7\). Rewrite \(S(1.737,\theta)\) to account for this additional (prior) information.
How does the inclusion of this additional information change the shape of the cost function and the location of the global minimum?
Finally, suppose the prior information is refined to \(\theta \approx 7 \pm 0.5\). How does this further modify \(S(1.737,\theta)\) and the location of the global minimum?
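
As a hint, one possible way to encode the prior information mirrors the penalty terms used for \(y=bx\) in Exercises 10.1 and 10.5. Treating the \(\pm 0.5\) as an uncertainty \(\sigma = 0.5\) is an assumption of this sketch, not the only reasonable choice:

\[
\tilde{S}(\theta) = S(1.737,\theta) + (\theta-7)^2, \qquad \tilde{S}_{revised}(\theta) = S(1.737,\theta) + \frac{(\theta-7)^2}{0.5^{2}}.
\]

A smaller uncertainty makes the penalty term steeper, so it is worth comparing plots of both versions against the original cost function.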

Exercise 10.4 Navigate to this Desmos file, which you will use to answer the following questions:

By adjusting the sliders for \(a\) and \(b\), determine the values of \(a\) and \(b\) that you think minimize the objective function.
Desmos can do linear regression! To do that, start a new cell and enter the regression formula \(y_{1} \sim c + d x_{1}\). (We need to use different parameters \(c\) and \(d\) because \(a\) and \(b\) are defined above.) How do the values of \(c\) and \(d\) compare to what you found with \(a\) and \(b\)?
Alternatively, you can define an objective function with absolute values: \(\displaystyle S_{mod}(a,b) = \sum_{i=1}^{n} | y_{i}-(a+bx_{i}) |\). Implement the absolute value objective function in Desmos and manipulate the slider values for \(a\) and \(b\) to determine where \(S_{mod}\) is minimized. How do those values compare to the least squares estimate?

Exercise 10.5 One way to generalize the notion of prior information using cost functions is to include a term that represents the degree of uncertainty in the prior information, such as \(\sigma\). For the problem \(y=bx\) this leads to the following cost function: \(\displaystyle \tilde{S}_{revised}(b)=(3-b)^2+(5-2b)^2+(4-4b)^2+(10-4b)^2 + \frac{(b-1.3)^2}{\sigma^{2}}\).

Use calculus to determine the optimum value for \(\tilde{S}_{revised}(b)\), expressed in terms of \(\tilde{b}_{revised} = f(\sigma)\) (your optimum value will be a function of \(\sigma\)). What happens to \(\tilde{b}_{revised}\) as \(\sigma \rightarrow \infty\)?
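
A sketch of the first step, using the sums \(\sum x_i y_i = 69\) and \(\sum x_i^2 = 37\) computed from the data in Exercise 10.1:

\[
\frac{d\tilde{S}_{revised}}{db} = -2(69-37b) + \frac{2(b-1.3)}{\sigma^{2}} = 0 \quad\Longrightarrow\quad \tilde{b}_{revised} = \frac{69\sigma^{2}+1.3}{37\sigma^{2}+1}.
\]

As a sanity check, setting \(\sigma = 1\) gives \(70.3/38 = 1.85\), which is the minimizer of \(\tilde{S}(b)\) from Exercise 10.1.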

Exercise 10.6 For this problem you will minimize some generic functions.

Using calculus, verify that the optimum value of \(y=ax^{2}+bx+c\) occurs at \(x=-b/2a\). (You can assume \(a>0\).)
Using calculus, verify that the optimum value of \(z=e^{-(ax^{2}+bx+c)}\) also occurs at \(x=-b/2a\).
Algebraically show that \(\ln(z) = -y\).
Explain why \(y\) is similar to a cost function \(S(b)\) and \(z\) is similar to a likelihood function.
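
A sketch of the calculus, assuming \(z = e^{-y}\), which is the form implied by the identity \(\ln(z) = -y\) in the third part:

\[
\frac{dy}{dx} = 2ax+b = 0 \;\Longrightarrow\; x = -\frac{b}{2a}, \qquad \frac{dz}{dx} = -\frac{dy}{dx}\,e^{-y} = -(2ax+b)\,e^{-(ax^{2}+bx+c)}.
\]

Because \(e^{-y}\) is never zero, \(dz/dx\) vanishes exactly where \(dy/dx\) does. With \(a>0\) this point is a minimum of \(y\) and a maximum of \(z\).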

Exercise 10.7 This problem continues the example of the President's re-election and voters' views on the economy. Determine the following conditional probabilities:

Determine the probability that you voted for the president given that you have a pessimistic view on the economy.
Determine the probability that you did not vote for the president given that you have a pessimistic view on the economy.
Determine the probability that you did not vote for the president given that you have an optimistic view on the economy.

Exercise 10.8 Incumbents have an advantage in re-election due to wider name recognition, which may boost their re-election chances. Complete the following table, estimating the following probabilities. Please report percentages as decimals.

Probability	Being elected for office	Not being elected for office	Total
Having name recognition	0.55	0.25	0.80
Not having name recognition	0.05	0.15	0.20
Total	0.60	0.40	1.00

Use Bayes’ Rule to determine the probability of being elected, given that you have name recognition.
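
The arithmetic can be checked with a few lines of Python (rather than the R used elsewhere in the text). The variable names are illustrative; the numbers are copied from the table above.

```python
# Probabilities copied from the table in Exercise 10.8.
p_elected_and_known = 0.55   # elected AND has name recognition
p_elected = 0.60             # column total: being elected
p_known = 0.80               # row total: having name recognition

# Conditional probability of name recognition given being elected.
p_known_given_elected = p_elected_and_known / p_elected

# Bayes' Rule: P(elected | known) = P(known | elected) * P(elected) / P(known).
p_elected_given_known = p_known_given_elected * p_elected / p_known

print(round(p_elected_given_known, 4))
```

The same value follows directly from dividing the joint probability by the row total, which is a useful cross-check on the Bayes' Rule calculation.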

Exercise 10.9 Show how you can derive Bayes’ Rule from the law of conditional probability.
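
As a starting point, the derivation can be organized around two ways of writing the same joint probability. The event names \(A\) and \(B\) are generic placeholders, and both \(P(A)\) and \(P(B)\) are assumed to be nonzero:

\[
P(A \mid B) = \frac{P(A \cap B)}{P(B)}, \qquad P(B \mid A) = \frac{P(A \cap B)}{P(A)} \quad\Longrightarrow\quad P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}.
\]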

References

Berg, Hugo van den. 2011. Mathematical Models of Biological Systems. Illustrated edition. Oxford; New York: Oxford University Press.