The Gestation Period and Life Span of 22 Animals

The data file animals.dat gives the names of 22 animals (in alphabetical order), their average gestation periods (length of pregnancy) in days, and their average life spans in years. (Source: New York Zoological Society)

Problem Statement

Determine the median, mean, and standard deviation of the gestation periods and the life spans of the 22 selected animals.

Input/Output Description

Input - The names of the animals, the number of days in the gestation period, and the number of years in the life span of each animal are read from a data file called animals.dat into arrays animal, gest, and life, respectively.
Output - The median, mean, and standard deviation of the gestation periods and life spans of the 22 animals are printed.

Mathematical Equations

The mean is the sum of the values divided by the number of values.
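
Written in the notation used for the standard deviation formula in this section (the values are the x's, N is the number of values, and μ is the mean), this definition reads:

$$\mu = \frac{1}{N} \sum_{i=1}^{N} x_i$$

For example, the hypothetical values 1, 3, 7, 9 (not taken from animals.dat) have the mean (1 + 3 + 7 + 9)/4 = 5.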

The median is the middle number in an ordered (from lowest to highest or vice versa) set of values. If there is an odd number of values, the middle number is one of the numbers of the set. If there is an even number of values, the median is the mean of the two values in the middle, which may be a value not in the set. To order the numbers, a sort algorithm is needed. The selection sort is used in this program.
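
The following is a minimal sketch of these two steps on a small array, written in the same fixed-form style as the program in this document. The program name medsk, the array vals, and the four sample values are assumptions made for illustration. They are not taken from animals.dat.

```fortran
c       Minimal sketch: selection sort of a small hypothetical array,
c       followed by the median of the sorted values.

        PROGRAM medsk

        INTEGER n
        PARAMETER (n=4)
        INTEGER vals(n), hold, place, j, m
        REAL med

c       Hypothetical sample values (not taken from animals.dat)
        DATA vals/7, 3, 9, 1/

c       Selection sort, lowest to highest: on pass j, find the
c       smallest value in positions j..n and swap it into position j.
        DO 20 j = 1, n - 1
           place = j
           DO 10 m = j + 1, n
              IF (vals(m) .LT. vals(place)) place = m
10         CONTINUE
           hold = vals(j)
           vals(j) = vals(place)
           vals(place) = hold
20      CONTINUE

c       Median: the middle value if n is odd, the mean of the two
c       middle values if n is even.
        IF (mod(n,2) .NE. 0) THEN
           med = real(vals(n/2 + 1))
        ELSE
           med = real(vals(n/2) + vals(n/2 + 1))/2.0
        END IF

        PRINT *, 'Sorted values:', vals
        PRINT *, 'Median:', med

        END
```

With these values the sorted order is 1, 3, 7, 9. Because n is even, the median is the mean of the two middle values, (3 + 7)/2 = 5, which is not itself one of the values in the set.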

The standard deviation of a set of values is a standard statistical measure of how far the values spread from the mean. This program uses the sample standard deviation, which divides by N-1 rather than N. It is obtained from the formula

$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N - 1}}$$

where $\mu$ is the mean of the set of values $x_1, \dots, x_N$, $\sum_{i=1}^{N} (x_i - \mu)^2$ is the sum of the squares of the differences between each value $x_i$ and the mean, and $N$ is the number of values.
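
As a small worked example, take the hypothetical values 1, 3, 7, 9 (chosen for illustration, not taken from animals.dat), so that N = 4 and the mean is 5. The differences from the mean are -4, -2, 2, and 4:

$$\sum_{i=1}^{4} (x_i - \mu)^2 = (-4)^2 + (-2)^2 + 2^2 + 4^2 = 40, \qquad \sigma = \sqrt{\frac{40}{4 - 1}} \approx 3.65$$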

Algorithm

1. Read the names of the animals, the gestation period, and the life span from the data file animals.dat into the arrays animal, gest, and life, respectively.
2. Create a sorted list of gestation period values and a sorted list of life spans to determine the median for each set of values.
3. Find the sum of each set of values and divide the sum by the number of values to determine the mean for each set.
4. In each set of values, find the sum of the squares of the differences between each value and the mean of that set, divide by the number of values minus one, and take the square root of this quotient to find the standard deviation for each set.
5. Print out the median, mean, and standard deviation for each set of values.

Code

c       This program determines the median, mean, and standard deviation 
c       of the average gestation period and of the average life span of 
c       the 22 selected animals.
 
        PROGRAM stats

c       Specification statements

        PARAMETER (numdat=22)

        REAL mdgest, mdlife, mngest, mnlife, sdevgest, sdevlife
        REAL sumgest, sumlife, sumsqg, sumsql
        INTEGER numdat, hold, place, keep, start, j, k, m, p, q
        INTEGER gest(numdat), life(numdat)
        CHARACTER*15 animal(numdat), save

        DATA sumgest, sumlife, sumsqg, sumsql/0,0,0,0/

c       Variable Definitions
c       mdgest = median gestation period
c       mdlife = median life span
c       mngest = mean gestation period
c       mnlife = mean life span
c       sdevgest = standard deviation for the gestation period
c       sdevlife = standard deviation for the life span
c       sumgest = sum of the gestation periods of the 22 animals
c       sumlife = sum of the life spans of the 22 animals
c       sumsqg = sum of the squares of the differences of each gestation
c                value and the mean gestation value
c       sumsql = sum of the squares of the differences of each life span
c                and the mean life span
c       hold, place, keep, save, start = variables used in sorting values
c       numdat = number of data values (22 in this case)
c       gest = set of 22 gestation periods
c       animal = set of 22 animals' names
c       life = set of 22 life spans

        OPEN(UNIT=20, FILE='animals.dat', STATUS='OLD')

c       Read the values from the data file animals.dat into the arrays
c       animal (contains the animals' names), gest (contains the gestation
c       period in days for each animal), and life (contains the life span
c       in years for each animal).

        DO 100 k = 1, numdat
           READ (20,*) animal(k), gest(k), life(k)
100     CONTINUE

c       All of the data has been read, so the file can be closed.

        CLOSE(20)

c       Sort (from lowest to highest) the values in the array gest using
c       the selection sort.  So that each animal's name, gestation
c       period, and life span stay together, all three are moved at the
c       same time.

        DO 200 j = 1,numdat - 1
           place = j
           start = j + 1
           DO 300 m = start, numdat
              IF (gest(m) .LT. gest(place)) place = m
300        CONTINUE
           save = animal(j)
           hold = gest(j)
           keep = life(j)
           animal(j) = animal(place)
           gest(j) = gest(place)
           life(j) = life(place)
           animal(place) = save
           gest(place) = hold
           life(place) = keep
200     CONTINUE


c       Sort (from lowest to highest) the values in the array life using
c       the selection sort.  Only life and animal are moved, so gest
c       stays in the sorted order produced above.

        DO 400 j = 1,numdat - 1
           place = j
           start = j + 1
           DO 500 m = start, numdat
              IF (life(m) .LT. life(place)) place = m
500        CONTINUE
           hold = life(j)
           save = animal(j)
           life(j) = life(place)
           animal(j) = animal(place)
           life(place) = hold
           animal(place) = save
400     CONTINUE

        PRINT *

c       Determine the median for the set of gestation values and the set
c       of life span values.  If numdat is odd, the location of the median
c       in the array is (numdat/2 + 1).  If numdat is even, there are two
c       middle values and the locations of these values are (numdat/2) and
c       (numdat/2 + 1).  The median is the mean of these two middle values.

        IF (mod(numdat,2) .NE. 0) THEN
           mdgest = real(gest(numdat/2 + 1))
           mdlife = real(life(numdat/2 + 1))
        ELSE
           mdgest = (real(gest(numdat/2)+gest(numdat/2 + 1)))/2.0
           mdlife = (real(life(numdat/2)+life(numdat/2 + 1)))/2.0
        END IF

        PRINT *, 'The median gestation period is', mdgest
        PRINT *, 'The median life span is', mdlife

c       Find the mean gestation period (in days) and the mean life span
c       (in years) by summing the values in each set and dividing by
c       numdat.

        DO 800 p = 1, numdat
           sumgest = sumgest + gest(p)
           sumlife = sumlife + life(p)
800     CONTINUE

        mngest = real(sumgest)/real(numdat)
        mnlife = real(sumlife)/real(numdat)

c       Find the sample standard deviation, the standard measure of
c       variations from the mean, for both sets of values.

        DO 900 q = 1, numdat
           sumsqg = sumsqg + (real(gest(q)) - mngest)**2
           sumsql = sumsql + (real(life(q)) - mnlife)**2
900     CONTINUE

        sdevgest = sqrt(sumsqg/real(numdat-1))
        sdevlife = sqrt(sumsql/real(numdat-1))

        PRINT *
        PRINT *, 'The mean gestation period is', mngest, ' days'
        PRINT *, 'with a standard deviation of', sdevgest
        PRINT *
        PRINT *, 'The mean life span is', mnlife, ' years'
        PRINT *,  'with a standard deviation of', sdevlife

        END

Output

The median gestation period is    112.000
The median life span is    11.0000
The mean gestation period is    154.000 days
with a standard deviation of    143.870

The mean life span is    16.0909 years
with a standard deviation of    14.8032