Statistics in Engineering
With examples in MATLAB® and R

Andrew Metcalfe, David Green, Tony Greenfield, Mahayaudin Mansor, Andrew Smith and Jonathan Tuke.



  • Exercise 4.1

    The sample space is {bad, poor, fair, good, ideal}
    A random variable is defined by:
    \(X(bad) = 1\)
    \(X(poor) = 2\)
    \(X(fair) = 3\)
    \(X(good) = 4\)
    \(X(ideal) = 5\)
    Any five distinct real numbers could be used, but the integers from \(1\) to \(5\) are a natural choice.
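    As an illustration, this mapping can be represented in R as a named vector (the object name X is our own choice):

```r
# Map each outcome in the sample space to a real number
X <- c(bad = 1, poor = 2, fair = 3, good = 4, ideal = 5)
X[["good"]]   # value of the random variable at the outcome "good", i.e. 4
```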
  • Exercise 4.3

    > x=c(0:4)
    > p=1/5
    > #(a)
    > m=sum(x*p)
    > print(m)
    [1] 2
    > #(b)
    > v=sum((x-m)^2*p)
    > sigma=sqrt(v)
    > print(c(v,sigma))
    [1] 2.000000 1.414214
    > #(c)
    > k=sum((x-m)^4*p)/sigma^4
    > print(k)
    [1] 1.7

    1. The mean is \(2\) as we can check from the symmetry of the probability distribution.
    2. The variance and standard deviation are \(2\) and square root of \(2\) respectively.
    3. The kurtosis is \(1.7\).
  • Exercise 4.5

    For a Bernoulli trial the mean is \(0 \times (1 - p) + 1 \times p = p\)
    The variance is \((-p)^{2} \times (1 - p) + (1 - p)^{2} \times p = (1 - p)(p^{2} + p(1 - p)) = p(1 - p)\)
    The expectation \(E[(X-\mu)^{3}]\) is \((-p)^{3} \times (1 - p) + (1 - p)^{3} \times p = (1 - p)((-p)^{3} + p(1 - 2p + p^{2})) = p(1 - p)(1 - 2p)\)
    It follows that the skewness is \((1 - 2p)/\sqrt{p(1 - p)}\)
    We check that the skewness is \(0\) when \(p = 0.5\)
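    These formulas can also be checked numerically for any particular \(p\); a minimal sketch in R, with the arbitrary choice \(p = 0.3\):

```r
# Check the Bernoulli mean, variance and skewness formulas at p = 0.3
p <- 0.3
x <- c(0, 1); prob <- c(1 - p, p)
m <- sum(x * prob)                     # mean, equals p
v <- sum((x - m)^2 * prob)             # variance, equals p*(1 - p)
skew <- sum((x - m)^3 * prob) / v^1.5  # skewness, equals (1 - 2p)/sqrt(p*(1 - p))
print(c(m, v, skew))
```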
  • Exercise 4.7

    Using R, the solution to parts (a) and (b) is

    > set.seed(28)
    > NS=1000
    > N=20
    > m=rep(0,NS)
    > v=rep(0,NS)
    > n=20;p=0.0475
    > for (i in 1:NS){
      + x=rbinom(N,n,p)
      + m[i]=mean(x)
      + v[i]=var(x)
      + }
    > print(c("mean",n*p,mean(m)))
    [1] "mean" "0.95" "0.94505"
    > print(c("variance",n*p*(1-p),mean(v)))
    [1] "variance" "0.904875" "0.895755263157895"

    There is close agreement between the theoretical values and the values obtained by the simulation. The agreement will be even closer if NS is increased.
    For part (c)

    > length(which(v>=2.2))
    [1] 3

    So, referring to Example 4.8, if non-conforming elements occur independently the probability of an observed variance as high as or higher than \(2.2\) is around \(3\) in \(1000\). It is possible that we have observed an unusual event, but a more plausible explanation is that non-conforming items tend to occur in clusters, perhaps because a batch of raw material is slightly different from usual or a person performing the assembly is distracted. If non-conforming items do tend to occur in clusters, the variance will be higher for the same overall proportion of defects (Exercise 4.11).
  • Exercise 4.9

    Let \(X\) be the number of fish returning data after one year.
    \(X\) has a binomial distribution with \(n = 20\) and \(p = 0.2\)
    The expected number returning data after one year is \(20 \times 0.2 = 4\).
    We require \(P(X \ge 5) = 1 - P(X \le 4)\)

    > 1-pbinom(4,20,0.2)
    [1] 0.3703517

    So the probability of five or more fish returning data after one year is \(0.37\).
  • Exercise 4.11

    1. \(X\) can take the values \(0, 1\) or \(2\).
      The associated probabilities are \((1 - p)(1 - \psi), p(1 - \theta) + (1 - p)\psi\) and \(p \theta.\)
    2. \(E[X] = 0 \times (1 - p)(1 - \psi) + 1 \times (p(1 - \theta) + (1 - p)\psi) + 2p \theta = 2p\) implies \(\psi = p(1 - \theta)/(1 - p)\)
    3. \(var{(X)} = E[X^{2}] - (E[X])^{2}\)
      \(E[X^{2}] = p(1 - \theta) +(1 - p)\psi + 4p \theta\) and substituting for \(\psi\)
      \(E[X^{2}] = 2p(1 - \theta) + 4p \theta\)
      So, the variance is \( 2p(1 - \theta) + 4p \theta - 4p^{2}\)
    4. \(CV = \sqrt{2p(1 - \theta) + 4p \theta - 4p^{2}}/(2p)\)
    5. In the case that \(\theta = p\) the variance reduces to \(2p(1 - p)\) and we get \(CV = \sqrt{2p(1 - p)}/(2p)\), which is correct for a binomial distribution with two trials.
    6. > CV <- function(p,theta){
       + cv=sqrt(2*p*(1-theta)+4*p*theta-4*p^2)/(2*p)
       + return(cv)
       + }
       > CV(0.1,0.2)
       [1] 2.236068
       > CV(0.1,0.1)
       [1] 2.12132

      If theta is greater than \(p\) there is a tendency for clustering and the variance increases relative to independent events.
    7. > CV(0.1,0.05)
       [1] 2.061553

      If theta is less than \(p\) there is a tendency against clustering and the variance decreases relative to independent events.
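    The algebra in parts 2 and 3 can be verified numerically. A minimal sketch in R, taking \(p = 0.1\) and \(\theta = 0.2\) as arbitrary values and computing \(\psi\) from part 2:

```r
# Verify E[X] = 2p and var(X) = 2p(1 - theta) + 4p*theta - 4p^2
# for the three-point distribution on {0, 1, 2}
p <- 0.1; theta <- 0.2
psi <- p * (1 - theta) / (1 - p)   # from part 2
x <- c(0, 1, 2)
prob <- c((1 - p) * (1 - psi), p * (1 - theta) + (1 - p) * psi, p * theta)
m <- sum(x * prob)                 # should equal 2p
v <- sum(x^2 * prob) - m^2         # should match the part 3 formula
print(c(m, v))
```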
  • Exercise 4.13

    1. \(1/0.02 = 50\)
    2. \(P(\text{no flood in 50 years}) = (1 - 0.02)^{50} = 0.98^{50} = 0.364\)
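    Part 2 can be checked in R, since fifty independent years each with no flood has probability \(0.98^{50}\):

```r
# Probability of no flood in 50 years when the annual probability is 0.02
(1 - 0.02)^50          # about 0.364
# equivalently, zero floods in binomial(50, 0.02)
dbinom(0, 50, 0.02)
```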
  • Exercise 4.15

    1. > n=15
      > p=c(0.02,0.05,0.10)
      > probcompa=1-pbinom(0,n,p)
      > print(probcompa)
      [1] 0.2614309 0.5367088 0.7941089

    2. > n=30
      > probcompb=1-pbinom(2,n,p)
      > print(probcompb)
      [1] 0.02171783 0.18782119 0.58864876

  • Exercise 4.17

    1. Each gold particle is a trial, so \(n = 28\). Each particle has a probability of \(1/6\) of being in the one sixth portion taken for assay, so \(p = 1/6\).
    2. The gold particles are randomly and independently distributed throughout the crushed rock in the container. If you imagine the volume of crushed rock divided into a large number of small cubes, each the size of a gold particle, then each cube is equally likely to contain a gold particle.
    3. The expected number of particles in the one sixth portion is \(28 \times (1/6) = 4.67\). The distribution of the number of particles in the one sixth portion is \(\text{binomial}(28, 1/6)\).
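    The distribution can be used directly in R; for example (tabulating the probabilities of \(0\) to \(2\) particles is our own illustrative choice):

```r
# Number of gold particles in the one sixth assay portion: binomial(28, 1/6)
n <- 28; p <- 1/6
n * p                 # expected number of particles, 28/6 = 4.67
dbinom(0:2, n, p)     # probabilities of exactly 0, 1 and 2 particles
pbinom(2, n, p)       # probability of at most 2 particles
```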
  • (To be continued)
