Statistics in Engineering
With examples in MATLAB® and R

Andrew Metcalfe, David Green, Tony Greenfield, Mahayaudin Mansor, Andrew Smith and Jonathan Tuke.


Chapter 2 solutions to odd numbered exercises


In these solutions we use natural numbers in ascending order as seeds for pseudo-random number generators.
You might find this link an amusing distraction
Remember that if solutions depend on random numbers your answers will differ slightly from ours, but should be similar.
  • Exercise 2.1

    The following R script generates 10 sequences of 20 pseudo-random digits and prints each sequence as a column. Your sequences will be different unless you used R with the same seed i

    set.seed(1)
    randig=matrix(rep(0,200),nrow=20,ncol=10)
    for (i in 1:10){
      randig[,i]=sample(0:9,size=20,replace=TRUE)
    }
    print(randig)

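          [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
     [1,]    2    9    8    9    4    6    9    6    2     2
     [2,]    3    2    6    2    7    3    4    6    8     1
     [3,]    5    6    7    4    3    2    4    2    4     8
     [4,]    9    1    5    3    3    9    1    2    7     5
     [5,]    2    2    5    6    7    6    7    7    8     8
     [6,]    8    3    7    2    2    2    4    4    4     1
     [7,]    9    0    0    4    7    1    5    1    0     7
     [8,]    6    3    4    7    1    4    2    7    3     7
     [9,]    6    8    7    0    2    9    2    1    7     9
    [10,]    0    3    6    8    1    5    5    8    3     5
    [11,]    2    4    4    3    2    9    5    6    6     7
    [12,]    1    5    8    8    0    7    0    5    8     3
    [13,]    6    4    4    3    6    3    0    3    8     1
    [14,]    3    1    2    3    8    4    6    4    3     9
    [15,]    7    8    0    4    7    1    9    5    3     2
    [16,]    4    6    0    8    7    0    5    1    8     5
    [17,]    7    7    3    8    4    7    5    5    6     1
    [18,]    9    1    5    3    4    1    5    0    7     8
    [19,]    3    7    6    7    8    4    9    2    6     3
    [20,]    7    4    4    9    6    6    5    2    9     7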
    1. Sequence   1  2  3  4  5  6  7  8  9 10
       (i)        1  0  2  2  3  0  5  3  2  1
       (ii)       2  3  2  2  2  2  2  3  2  1
       (iii)      0  0  0  0  0  0  1  0  0  0
       (iv)       0  0  1  1  3  0  1  3  0  1
      notes:
      The \(555\) in Sequence \(7\) gives two cases of the same digit repeated. The instances of three consecutive digits are \(123\) and \(345\) in Sequence \(2\) and \(345\) in Sequence \(8\).
      1. There are \(19\) times \(10\) consecutive pairs of digits. There are \(19\) instances of the same digit repeated. The proportion is \(19/190\) which is \(0.1\). By chance for these sequences, this happens to precisely match the proportion we would expect in a very long sequence. This proportion is \(0.1\) because the first digit of a pair can be any digit and there is a \(1\) in \(10\) chance that the second digit will be identical.
      2. The longest run of digits in natural order is \(3\). This occurred three times. With these sequences there is no longer run, and no additional runs of \(3\) digits, if they are read as one long sequence - reading down columns.
      3. There are \(18\) times \(10\) consecutive triples. We would expect \(0.01\) of a very long sequence of triples to be the same digit repeated three times. This is because the first digit can be any digit and there is a \(1\) in \(100\) chance that the same digit will occur in the next two positions. There was one instance, \(555\), in the sequences which is consistent with an expected number of \(180/100 = 1.8\). Notice that the mathematical expectation need not be an integer.
      4. If the sequences of length \(20\) are considered as sequences of \(10\) disjoint pairs of digits there are \(10\) instances of the same digit being repeated. The proportion of \(10/100\) same digit repeated happens to precisely match the expected proportion in a very long sequence.
        Note:   Following a similar argument to (iii) we would expect \(1.8\) instances of a run of three consecutive digits in natural order. That is, in a very large number of sets of ten sequences of length \(20\) the average number of three consecutive digits in natural order would be \(1.8\). We have just one set, and the observed \(3\) instances is quite consistent with the expected \(1.8\).
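    The four statistics (i) to (iv) can also be counted by machine. The following is a minimal sketch in R; the function names and the short example vector `x` are illustrative assumptions and are not part of the original solution.

```r
# (i) number of adjacent pairs in which the same digit is repeated
count_repeats <- function(x) sum(diff(x) == 0)

# (ii) length of the longest run of digits in natural order (each digit 1 more than the last)
longest_natural_run <- function(x) {
  r <- rle(diff(x) == 1)
  if (!any(r$values)) return(1)
  max(r$lengths[r$values]) + 1
}

# (iii) number of overlapping triples consisting of the same digit three times
count_triples <- function(x) {
  n <- length(x)
  sum(x[1:(n - 2)] == x[2:(n - 1)] & x[2:(n - 1)] == x[3:n])
}

# (iv) number of disjoint pairs (positions 1-2, 3-4, ...) with the same digit repeated
count_disjoint_repeats <- function(x) {
  first <- x[seq(1, length(x) - 1, by = 2)]
  second <- x[seq(2, length(x), by = 2)]
  sum(first == second)
}

x <- c(5, 5, 5, 1, 2, 3, 3, 0)   # hypothetical short sequence
c(count_repeats(x), longest_natural_run(x), count_triples(x), count_disjoint_repeats(x))
```

    Each function takes one sequence, so `apply(randig, 2, count_repeats)` would give one count per column of the matrix generated above.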
  • Exercise 2.3

    1. The \(5\) blocks of \(4\) binary digits give the following sequence of integers:
      \(1   12   7   2   9\)
      ignoring the \(12\) leaves \(1729\) (Hardy-Ramanujan number).
    2. The \(4\) blocks of \(4\) binary digits give the following sequence of integers:
      \(14   4   10   2\)
      and these are \(E   4   A   2\) in hexadecimal.
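    The conversions can be checked in R. This is a sketch only: the bit strings below are reconstructed from the integers quoted in the solution rather than copied from the exercise, and the function name `blocks_to_int` is an assumption.

```r
# Split a string of binary digits into blocks of 4 and convert each block to an integer
blocks_to_int <- function(bits) {
  starts <- seq(1, nchar(bits), by = 4)
  strtoi(substring(bits, starts, starts + 3), base = 2)
}

part1 <- blocks_to_int("00011100011100101001")  # blocks 0001 1100 0111 0010 1001
digits <- part1[part1 <= 9]                     # keep only values that are decimal digits

part2 <- blocks_to_int("1110010010100010")      # blocks 1110 0100 1010 0010
hex <- sprintf("%X", part2)                     # hexadecimal digit for each block

print(part1)
print(paste(digits, collapse = ""))
print(hex)
```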
  • Exercise 2.5

    1. Each lifeboat has a \(2\) in \(100\) chance of selection for every pair of digits.
      The sequence of two digit integers is: \(62   42   83   08   81   35   16   52   86   70\)
      The selected lifeboats would be \(12 - - 8 - - - 2\)
    2. Associate \(01, ... , 12;   21, ... , 32;   41, ... , 52;   61, ... , 72;   81, ... , 92\) with lifeboats \(1, ... , 12\).
      Then each lifeboat has a \(5\) in \(100\) chance of selection for every pair of digits.
      The sequence of two digit integers is: \(62   42   83   08   81   35   16   52   86   70\)
      The selected lifeboats would be \(2 - 3 8\)
      Notice that selection is without replacement.
    3. The lifeboats are not equally likely to be selected.
      Lifeboats \(12, 1, 2, 3\) have a \(9\) in \(100\) chance of selection for every pair of digits whereas the other lifeboats have an \(8\) in \(100\) chance of selection.
      This is because \(96, 97, 98, 99\) correspond to lifeboats \(12, 1, 2, 3\).
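    The schemes in parts 2 and 3 can be sketched in R. Two assumptions are made here and are not stated in the solution: that three lifeboats are to be selected (inferred from the answers above), and that part 3 uses the remainder on division by 12 with remainder 0 standing for lifeboat 12 (inferred from \(96\) corresponding to lifeboat \(12\)). The function name `to_lifeboat` is hypothetical.

```r
pairs <- c(62, 42, 83, 8, 81, 35, 16, 52, 86, 70)

# Part 2: 01-12, 21-32, 41-52, 61-72, 81-92 map to lifeboats 1-12; other pairs are skipped
to_lifeboat <- function(p) {
  r <- p %% 20
  ifelse(r >= 1 & r <= 12, r, NA)
}
boats <- to_lifeboat(pairs)
selected <- unique(boats[!is.na(boats)])[1:3]   # first three distinct lifeboats
print(selected)

# Part 3 (assumed scheme): remainder of 00-99 on division by 12, with 0 read as lifeboat 12
remainder_boat <- ifelse((0:99) %% 12 == 0, 12, (0:99) %% 12)
counts <- table(remainder_boat)                 # number of pairs, out of 100, per lifeboat
print(counts)
```

    The duplicate lifeboat 2 from the pair 42 is dropped by `unique`, which corresponds to selection without replacement.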
  • Exercise 2.7

    The following R script generates a sequence of length \(6075\), including the seed I0, after which it repeats. The order appears to be haphazard. The distribution of digits is almost discrete uniform, and a plot of the \((i+1)^{th}\) digit against the \(i^{th}\) does not show any patterns. Is the choice of seed value critical?

    #Exercise 2.7
    I0=1234
    a=106;b=1283;m=6075
    K=6074
    I=rep(0,K)
    I[1]=(a*I0+b)%%m
    for (i in 2:K){
      I[i]=(a*I[i-1]+b)%%m
    }
    print(sort(I))
    plot(as.ts(I))
    D=floor(10*I/m)
    print(D)
    stem(D)  # stem() prints the display itself; wrapping it in print() would also print NULL
    plot(I[1:(K-1)],I[2:K],pch=".")
    #note the sequence repeats after m integers have been generated
    Im=(a*I[K]+b)%%m
    print(Im)
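
    One way to examine the question about the seed is to check whether the generator visits every integer from \(0\) to \(m-1\) once per cycle. The sketch below does this for the seed used above; the function name `lcg_cycle` is an assumption. If the check succeeds, any seed lies on the same cycle, so a different seed would only change the starting point.

```r
a <- 106; b <- 1283; m <- 6075

# Generate m successive values of the generator, starting with the seed itself
lcg_cycle <- function(seed) {
  x <- numeric(m)
  x[1] <- seed
  for (i in 2:m) x[i] <- (a * x[i - 1] + b) %% m
  x
}

cycle <- lcg_cycle(1234)
full_period <- all(sort(cycle) == 0:(m - 1))   # TRUE if every residue appears exactly once
next_value <- (a * cycle[m] + b) %% m          # value that follows the m-th term
print(full_period)
print(next_value)
```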

  • Exercise 2.9

    1. The following R script generates a sequence of \(30000\) numbers \(U_i\) using RANDU, given in Example 2.3, and forms a set of \(10000\) triples \((x, y, z) = (U_i, U_{i+1}, U_{i+2})\) from the \(30000\) generated values. It also plots the triples, with the 3D plot angled to show the \(15\) well-defined planes on which the data lie.

      # This script generates data from an lcg (RANDU in particular)
      # plots a data histogram, makes triples and plots a scatterplot in 3d

      require('scatterplot3d')

      N=30000; # The number of data to produce
      Z0 = 1; # the initial seed value
      a=65539;
      c=0;
      m=2^(31);
      myData=numeric(length=(N+1))
      myData[1] = Z0
      for (i in 2:(N+1)){
        myData[i] = ((a*myData[i-1]+c)%%m)
      }
      myData=myData[2:(N+1)]/m
      hist(myData)
      myTriples<-array(0,dim=c(10000,3))
      for (i in 1:10000){
        myTriples[i,] = myData[(3*(i-1)+1):(3*i)]
      }
      scatterplot3d(myTriples[,1],myTriples[,2],myTriples[,3],angle=160,pch='.')

    2. Consider the following
      \( \begin{eqnarray} Z_{i+1} & = & (2^{16} + 3) Z_i \pmod{2^{31}}\\ Z_{i+2} & = & (2^{16} + 3) Z_{i+1} \pmod{2^{31}}\\ & = & (2^{16} + 3)^2 Z_i \pmod{2^{31}}\\ & = & (2^{32} + 6 \times 2^{16} + 9) Z_i \pmod{2^{31}}\\ & = & (2^{32} + 6 \times (2^{16} + 3) - 9) Z_i \pmod{2^{31}}\\ & = & 6 Z_{i+1} - 9 Z_i \pmod{2^{31}} \end{eqnarray} \)

      The last step holds because \(2^{32} \bmod 2^{31} = 0\). It follows that \(9 Z_i - 6 Z_{i+1} + Z_{i+2}\) is a multiple of \(2^{31}\).
      Dividing by \(2^{31}\) to get a number in \([0,1]\), we get the triplets \((x, y, z) = (Z_i, Z_{i+1}, Z_{i+2})/2^{31}\).
      Hence \(9x - 6y + z\) is an integer. Since \(x\), \(y\) and \(z\) all lie between \(0\) and \(1\), this integer lies strictly between \(-6\) and \(10\), so all triplets lie in the 15 planes given by
      \(9x - 6y + z = w\), where \(w = -5, -4, ... , 9\).
      These values of \(w\) can be obtained by direct calculation on the triplets. Thus the lcg RANDU produces random numbers in \((0,1)\), but there are strong correlations between subsequent values (serial correlation).
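    The direct calculation of \(w\) can be sketched as follows. The script regenerates the RANDU integers with seed \(1\), as in part 1, and works with the integers \(Z_i\) before division by \(2^{31}\) so that the arithmetic is exact; the variable names are assumptions.

```r
a <- 65539; m <- 2^31
N <- 30000

# RANDU integers Z_1, ..., Z_N starting from the seed Z_0 = 1
Z <- numeric(N)
Z[1] <- (a * 1) %% m
for (i in 2:N) Z[i] <- (a * Z[i - 1]) %% m

# Disjoint triples (Z_i, Z_{i+1}, Z_{i+2}) for i = 1, 4, 7, ...
idx <- seq(1, N - 2, by = 3)
w <- (9 * Z[idx] - 6 * Z[idx + 1] + Z[idx + 2]) / m

print(table(w))   # every w should be an integer between -5 and 9
```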
  • Exercise 2.11

    Selection is random such that each of the pumps available for selection is equally likely to be chosen. There are \(5F\) and \(12G\). Assume sampling without replacement. Numerical answers are correct to \(4\) decimal places.
    1. \(P(G) = 12/17 = 0.7059\)
    2. \(P(GG) = 12/17 \times 11/16 = 0.4853\)
    3. \(P(FF) = 5/17 \times 4/16 = 0.0735\)
    4. \(P(GF \; or \; FG) = 12/17 \times 5/16 + 5/17 \times 12/16 = 0.4412\)
    5. \(1 - P(FF) = 0.9265\)
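    As a cross-check, the same probabilities can be obtained by counting unordered pairs with `choose`. This is a sketch using variable names that are assumptions.

```r
n_F <- 5; n_G <- 12
n <- n_F + n_G

p_G        <- n_G / n                        # part 1
p_GG       <- choose(n_G, 2) / choose(n, 2)  # part 2: both G
p_FF       <- choose(n_F, 2) / choose(n, 2)  # part 3: both F
p_one_each <- n_G * n_F / choose(n, 2)       # part 4: one G and one F, in either order
p_not_FF   <- 1 - p_FF                       # part 5

answers <- round(c(p_G, p_GG, p_FF, p_one_each, p_not_FF), 4)
print(answers)
# 0.7059 0.4853 0.0735 0.4412 0.9265
```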
  • Exercise 2.13

      1. \(A = (A \; and \; not \; B) \; or \; (A \; and \; B)\)
        The events on the right are mutually exclusive so by Axiom 2
        \(P(A) = P(A \; and \; not \; B) + P(A \; and \; B)\)
      2. \(A\; or \; B = (A \; and \; not \; B) \; or\; (A \; and \; B) \; or \; (not \; A \; and \; B)\)
        The events on the right are mutually exclusive so by Axiom 2
        \(P(A \; or \; B) = P(A \; and \; not \; B) + P(A \; and \; B) + P(not \; A \; and \; B)\)
    1. Start from (ii)
      \(P(A \; or \; B) = P(A \; and \; not \; B) + P(A \; and \; B) + P(not\; A \; and\; B)\)
      Now use (i) to replace \(P(A \; and \; not \; B)\) by \(P(A) - P(A \; and \; B)\), and hence obtain
      \(P(A \; or \; B) = P(A) - P(A \; and \; B) + P(A \; and \; B) + P(not \; A \; and \; B)\)
      Exchanging \(A\) for \(B\) in Equation (i) leads to
      \(P(B) = P(not \; A \; and \; B) + P(A \; and \; B)\)
      and substitution for \(P(not \; A \; and \; B)\) gives
      \(P(A \; or \; B) = P(A) - P(A \; and \; B) + P(A \; and \; B) + P(B) - P(A \; and \; B) = P(A) + P(B) - P(A \; and \; B)\)
      Therefore, the addition rule is a consequence of the axioms of probability.
  • Exercise 2.15

    1. \(P(L \; or\; B)= P(L) + P(B) - P(L \; and \; B) = 0.70 + 0.60 - 0.50 = 0.80\)
    2. \(P(L|B) = P(L\; and \; B)/P(B) = 0.50/0.60 = 5/6\)
    3. \(P(B|L) = P(L \; and \; B)/P(L) = 0.50/0.70 = 5/7\)
  • Exercise 2.17

    We are given: \(P(H) = 0.20, P(D) = 0.15, P(D|H) = 0.40\)
    1. \(P(H \; and \; D) = P(H)P(D|H) = 0.20 \times 0.40 = 0.08\)
    2. \(P(H \; or \; D) = P(H) + P(D) - P(H \; and \; D) = 0.20 + 0.15 - 0.08 = 0.27\)
    3. \(1 - 0.27 = 0.73\)
    4. \(P(H|D) = 0.08/0.15 = 0.5333\)
  • Exercise 2.19

    1. The Venn diagram is not drawn to scale. Start with \(0.01\) for the intersection of all three events and then work out to the intersections of two events, and notice that this leaves \(0.01\) for the three events of the form, for example \((A \; and \; not \; E \; and \; not \; G)\).
    2. \(1 - 0.24 - 0.24 - 0.24 - 0.01 - 0.01 - 0.01 - 0.01 = 0.24\)
    3. \(P(A \; and \; E) = 0.25 = P(A)P(E) = 0.50 \times 0.5\) and similarly for \(P(A \; and \; G), P(E \; and \; G)\).
    4. \(P(A \; and \; E \; and \; G) = 0.01\) which is not equal to \(0.5 \times 0.5 \times 0.5 = 0.125\).
    5. The three events are pairwise independent but they are not mutually independent.
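    The probabilities read from the Venn diagram can be tabulated and checked in R. The region probabilities below are those used in the solution: \(0.01\) for regions in exactly one or all three events, and \(0.24\) for regions in exactly two events or none. The data frame layout and names are assumptions.

```r
# The eight regions of the Venn diagram for events A, E and G
regions <- expand.grid(A = c(TRUE, FALSE), E = c(TRUE, FALSE), G = c(TRUE, FALSE))
n_events <- with(regions, A + E + G)                    # number of events each region lies in
regions$prob <- ifelse(n_events %% 2 == 1, 0.01, 0.24)  # 1 or 3 events: 0.01; 0 or 2: 0.24

prob_of <- function(keep) sum(regions$prob[keep])
p_A   <- prob_of(regions$A)
p_E   <- prob_of(regions$E)
p_G   <- prob_of(regions$G)
p_AE  <- prob_of(regions$A & regions$E)
p_AEG <- prob_of(regions$A & regions$E & regions$G)

print(c(total = sum(regions$prob), p_A = p_A, p_AE = p_AE, p_AEG = p_AEG))
print(c(pairwise_product = p_A * p_E, triple_product = p_A * p_E * p_G))
```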
  • Exercise 2.21

    We are told errors occur independently.
      1. \(1 - 0.99^2 = 0.0199\)
      2. \(1 - 0.99^3 = 0.0297\)
      3. \(1 - 0.99^n\)
    1. Assuming all the digits equally likely to be any of \(\{0, 1, ..., 9\}\), the probability is \(0.1^{12} = 10^{-12}\).
  • Exercise 2.23

    \(P(functions) = 1 - P(system \; fails) = 1 - P((A\; or \; B \; fail) \; and \; C \; fails) = 1 - (a + b - ab)c\)
    Alternatively
    \(P(functions) = P((A \; and \; B \; work) \; or \; C \; works) = (1-a)(1-b) + (1-c) - (1-a)(1-b)(1-c)\)
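    The two expressions are algebraically identical, which can be confirmed numerically. The sketch below evaluates both over a grid of failure probabilities \(a\), \(b\) and \(c\); the function names are assumptions.

```r
# Reliability via the probability that the system fails
reliability_via_failure <- function(a, b, c) 1 - (a + b - a * b) * c

# Reliability via the probability that (A and B work) or C works
reliability_via_working <- function(a, b, c) {
  both_ab_work <- (1 - a) * (1 - b)
  both_ab_work + (1 - c) - both_ab_work * (1 - c)
}

grid <- expand.grid(a = seq(0, 1, 0.1), b = seq(0, 1, 0.1), c = seq(0, 1, 0.1))
difference <- with(grid, reliability_via_failure(a, b, c) - reliability_via_working(a, b, c))
max_diff <- max(abs(difference))
print(max_diff)   # zero apart from floating-point rounding
```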
  • Exercise 2.25

    1. Imagine a long period of \(N\) years. Suppose the annual maximum flood exceeds a critical level \(c\) on \(k\) years during this period.
      The probability of exceeding \(c\) in any one year is \(p = k/N\) from the relative frequency definition of probability.
      The average time between events (ARI) is \(N/k\) years, and this is \(1/p\).
    2. The probability of at least one exceedance is the complement of the probability of no exceedance and, if exceedances are independent, this equals \(1 - (1 - 1/T)^n\).
      1. > T=100
        > n=c(50,100,200)
        > p=1-(1-1/T)^n
        > p
        [1] 0.3949939 0.6339677 0.8660203

      2. > log(0.5)/log(0.99)
        [1] 68.96756
  • Exercise 2.27

    1. \(P(O \; or \; G) = P(O) + P(G) - P(O \; and \; G) = 0.25 + 0.08 - 0.05 = 0.28\)
    2. \(P(O \; or \; G \; but \; not \; both) = P(O \; or \; G) - P(O \; and \; G) = 0.28 - 0.05 = 0.23\)
    3. \(P(O|G) = P(O \; and \; G)/P(G) = 0.05/0.08 = 0.625\)
    4. \(P(G|O) = P(O \; and \; G)/P(O) = 0.05/0.25 = 0.20\)
  • Exercise 2.29

    1. \((2p - p^2)^2\)
    2. \(\theta \times (2q^2 - q^4) + (1 - \theta) \times (2p - p^2)^2\)
    3. The following function, written in R, gives \(p = 0.37\) and \(q = 0.27\).

      theta=0.02;costfailstop=100;coststop=1
      p=seq(0.1,0.999,0.001)
      solution229 <- function(theta,p,costfailstop,coststop){
        q=0.1/p
        EMV=theta*(2*q^2-q^4)*costfailstop+(1-theta)*(2*p-p^2)^2*coststop
        return(EMV)
      }
      EMV=solution229(theta,p,costfailstop,coststop)
      plot(p,EMV,type="l")
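
    The minimum of the plotted curve can also be located numerically rather than read from the plot. This is a sketch that repeats the calculation above on the same grid and uses `which.min`; the names `best`, `p_opt` and `q_opt` are assumptions.

```r
theta <- 0.02; costfailstop <- 100; coststop <- 1
p <- seq(0.1, 0.999, 0.001)
q <- 0.1 / p

# Same expected monetary value as in solution229, evaluated on the grid of p
EMV <- theta * (2 * q^2 - q^4) * costfailstop + (1 - theta) * (2 * p - p^2)^2 * coststop

best <- which.min(EMV)   # index of the smallest EMV on the grid
p_opt <- p[best]
q_opt <- q[best]
print(round(c(p = p_opt, q = q_opt), 2))
```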

  • Exercise 2.31

    It is not true.
    Notice that \(B\) favoring \(A\) is equivalent to \(P(A \; and \; B) > P(A)P(B)\), and if \(B\) favors \(A\) so too \(A\) favors \(B\). A counter example is shown in the figure.

    In the figure \(P(A) = 0.2, P(B) = 0.4\), and \(P(A \; and \; B) = 0.1\), so \(B\) favors \(A\). Similarly, \(P(C) = 0.2, P(B) = 0.4\), and \(P(C \; and \; B) = 0.1\), so \(C\) favors \(B\). However, \(P(A \; and \; C) = 0\), so \(C\) does not favor \(A\).
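    The counter example can be checked with the criterion \(P(X \; and \; Y) > P(X)P(Y)\). The helper `favors` below is a hypothetical name used only for this sketch.

```r
# TRUE if the joint probability exceeds the product of the two marginal probabilities
favors <- function(p_x, p_y, p_xy) p_xy > p_x * p_y

B_favors_A <- favors(0.2, 0.4, 0.1)   # P(A) = 0.2, P(B) = 0.4, P(A and B) = 0.1
C_favors_B <- favors(0.2, 0.4, 0.1)   # P(C) = 0.2, P(B) = 0.4, P(C and B) = 0.1
C_favors_A <- favors(0.2, 0.2, 0)     # P(A) = 0.2, P(C) = 0.2, P(A and C) = 0
print(c(B_favors_A, C_favors_B, C_favors_A))
# TRUE TRUE FALSE
```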
  • Exercise 2.33

    The probability of flooding is \(0.3 \times 0.2 + 0.6 \times 0.4 + 0.1 \times 0.5 = 0.35\).
    This is an application of the law of total probability. It is also an example of a weighted average.
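    Written as a weighted average in R, the calculation is a single line. The vector names are assumptions: the weights are the probabilities of the three scenarios and the values are the conditional probabilities of flooding given each scenario.

```r
p_scenario <- c(0.3, 0.6, 0.1)      # probabilities of the three scenarios (sum to 1)
p_flood_given <- c(0.2, 0.4, 0.5)   # probability of flooding given each scenario

p_flood <- sum(p_scenario * p_flood_given)   # law of total probability
print(p_flood)
```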
  • Exercise 2.35

    > IgC=0.96;IgN=0.02;prior=0.03
    > Bayes=function(IgC,IgN,prior){
      + postgI=prior*IgC/(prior*IgC+(1-prior)*IgN)
      + return(postgI)
      + }
    > Bayes(IgC,IgN,prior)
    [1] 0.5975104


    The probability of an engine problem given the warning is \(0.598\).
  • Exercise 2.37

    1. See figure
    2. \(0.004 \times (1 - 0.01) + (1 - 0.004) \times 0.02 = 0.02388\)
    3. > IgC=1-0.01;IgN=0.02;prior=0.004
      > Bayes=function(IgC,IgN,prior){
        + postgI=prior*IgC/(prior*IgC+(1-prior)*IgN)
        + return(postgI)
        + }
      > Bayes(IgC,IgN,prior)
      [1] 0.1658291

      The probability that the engineer finds the gallery unsafe, given the warning, is \(0.166\).
  • (To be continued)
