# FORTRAN ARRAYS

This file defines or explains many terms from FORTRAN.

## Array Constructors

Array constructors are a new feature of FORTRAN90 which allow you to define an array "on the fly". You can specify the value of an array by listing its elements between two special delimiters. For example, the statement

```
a = (/ 2, 3, 5, 7, 11, 13, 17 /)
```
assigns the first few small primes to A, and would be equivalent to:
```
a(1) = 2
a(2) = 3
...
a(7) = 17
```

Instead of constants, we may use expressions and variables:

```
integer a(9)
integer b(5)
integer c

a = (/ b, c, b(3), b(1)+2, 3.14159265 /)
```
Here, the first five entries of A are set to the contents of B, with the next four entries set to various values.

How could we use an expression like 3.14159265/12.0 in this statement?

Once we see that we can use an array inside of an array constructor, you know what's next; we can use an array constructor inside of another array constructor:
```  integer iarray(3)
integer jarray(3)
integer karray(9)

iarray = (/ 1, 2, 3 /)
jarray = (/ 3, 2, 1 /)
karray = (/ iarray, (/ 3, 2, 3 /), jarray /)
```
Array constructors may only be used to create vectors, or "rank one arrays", but you may use the RESHAPE operator to reshape the vector into any allowable shape. Thus, it would be easy to turn KARRAY in the last example into a 3 by 3 array.

## Array Declarations

FORTRAN90 has added new methods of declaring arrays. The old ways of declaring an array may still be used. In particular, you can declare a two dimensional real array A via statements like
real a(100,200)
However, the new standard expects that users will take advantage of the new array features by declaring many arrays of the same type and size. For this reason, there is an abbreviated declaration which lists a type, a dimension, and then all the array names that belong in that class. The following are examples of this format:
```  real,    dimension(100,200) :: a, b
complex, dimension(300)     :: d
integer, dimension(400,500) :: e, f, g
```
Be sure not to forget the comma after the type statement! Note that the current version of the UNICOS CF77 compiler does NOT recognize this new declaration! Sometimes it can be convenient to be able to index an array in an unusual fashion. C programmers, for instance, expect the first entry of an array to have index 0. There may be other cases where it would be natural to index an array unusually. This facility was added to Fortran 77, and is still available in FORTRAN90, although many people are unfamiliar with it. To use an indexing scheme different from the default, you simply specify the indexing scheme you wish to use as part of the declaration statement. Thus, to set aside 27 entries in X, and to be able to index them as X(-13), X(-12), ..., X(-1), X(0), X(1), ..., X(12), X(13), it is simply necessary to declare
real x(-13:13)
Of course, this indexing can be declared in the new declaration form as well:
real, dimension (-13,13) :: x, y, z
The EXTENT of an array in a given dimension is the number of index entries available in that direction. In the usual case, this is identical with the dimensions of the array. Thus the extent of A(100,200) is 100 in the first dimension and 200 in the second. However, an array can be declared using a more complicated form, such as B(10:19,50:100). In this case, the extent of B is 10 in the first dimension, and 51 (not 50!) in the second. Essentially, B has as much space as an array C(10,51), but we're just indexing it differently. FORTRAN90 allows arrays to be declared whose size is zero. Arrays may even be declared whose upper bound is less than the lower bound. In such a case, FORTRAN90 treats the array as though it had size zero. No space is allocated for it, but no error message is issued. It is assumed that the programmer wants the flexibility of handling all cases of a problem the same way. Two arrays are of the same SHAPE if they have the same number of dimensions and the same extent in each dimension. The arrays declared B(10:19,50:100) and C(10,51) have the same shape. The arrays D(100) and E(10,10) do not have the same shape, even though they both contain 100 elements. Arrays of the same shape are also called CONFORMABLE. Such arrays may be used together in ARRAY EXPRESSIONS.

## Array Expressions

In FORTRAN90, assignment statements of the form A=B can be used, where both A and B are arrays. There is much more flexibility available, however, since the right hand side can be any array expression. Array expressions are similar to scalar expressions, with the added complication that arrays come in many shapes and sizes. Array expressions include -A, B+C, SQRT(D), E+7, F/G, and H*X, and Y.EQ.Z. It's hard to tell these aren't simply scalar expressions! Of course, the compiler knows, because it has already compiled the declaration statements. So how does it handle these expressions? The basic rule is fairly simple. All the objects in an array expression should be arrays of the same shape, and possibly some scalars. If scalars appear with arrays, they will be treated as though they represented an array full of a single value, and of the appropriate shape. The result of the array expression is an array, whose entries are computed elementwise from a scalar interpretation of the formula. In particular,
• -A is an array whose entries are the negatives of those in A.
• B+C is an array whose entries are the sums of the entries in B and C.
• SQRT(D) is an array whose entries are the square roots of entries in D.
• E+7 is an array, each of whose elements are 7 more than the entries in E.
• F/G is an array whose elements are the ratios of elements in F and G. Do NOT assume that 1.0/G is the matrix inverse of G!
• H*X is an array whose elements are the products of elements in H and X. Do NOT assume that H*X is the matrix product of H and X.
• Y.EQ.Z is an array of logical values which are TRUE whenever the entries of Y and Z are equal.
The array assignment statement may be represented as "Array = Array expression", where "array expression" is a formula similar to those discussed above. The array expression and the array into which the value are stored must normally have the same shape. The only exception is that an array may be assigned a scalar value, such as "A=1.0". The compiler will correctly interpret this statement as assigning every entry of A to 1.0. The same type conversions will be done for the array expressions as are done for scalar expressions.

## Assumed Size Arrays

When an array is passed to a subroutine as an argument, the array must be declared in the subroutine. If the array is a vector (just one dimension), then there are several alternatives for the declaration. Supposing X is a real vector, we might declare
real x(n)
where N is also an input argument. We are therefore declaring both that X is a vector, and how large a vector it is. Or we can declare
real x(1)
Here we are stating that X is a vector. We are not actually stating that X is of size 1. The true size of X is determined in the calling routine, where the space for X was allocated. But using '(1)' is a traditional way of stating that X is an input vector of unspecified size. Since this could cause confusion, the preferred declaration in such a case is
real x(*)
This states that X is an input vector, and no information is given about the dimension of X. In effect, all three declarations are equivalent: X may be used as a vector in this routine. Some differences may occur if you use a debugger that checks for arrays out of bounds. In the second declaration ( REAL X(1) ), the out-of-bounds checker will complain about references to X(2), for example. In the case of arrays of two or more dimensions, the situation is a little more complicated. A subroutine which receives such an array as input must have information on the sizes of all but the last dimension in order to correctly extract the information. For example, if A is an array of size 10 by 8, here are some code fragments that demonstrate how A is declared in the main program, and passed to a subprogram:
```  real a(10,8)
call sub ( a, 10, 8)
...
subroutine sub ( a, m, n )
real a(m,n)
```
which passes both dimensions of A, or
```  real a(10,8)
call sub ( a, 10 )
...
subroutine sub ( a, m )
real a(m,1)
```
or
```  real a(10,8)
call sub ( a, 10 )
...
subroutine sub ( a, m )
real a(m,*)
```
Thus, you should see that the options available in the case of a vector (declare dimension, use "1" as the dimension, use "*" as the dimension) are all available for two dimensional arrays, but only for the last dimension. An array which is passed in to a subroutine, and whose only or final dimension is declared as "*", is known as an ASSUMED SIZE ARRAY. In order to use FORTRAN90 notation, you are strongly urged to pass the full dimension information for the array. If you use an assumed size array, then Fortran does not have enough information to interpret some array selection statements, such as A(:). Thus, the rule is that an array section of an assumed size array is allowed only if the section selector for the last dimension is an explicit subscript expression (such as "3" or "I+1"), an indexed section selector with the upper bound specified (such as ":N" or "3:5"), or a vector-valued section selector with a known size. Thus, assuming that we have declared
```  real a(*)
real b(10,*)
```
the following expressions are illegal:
```  a=3.0
a(:)=b(10,:)
a(1:)=6.0
```
while these expressions are legal:
```  a(1:10)=3.0
a( :5) =3.0
a(3:4) =b(10,1:2)
```
Note that an assumed size array cannot be conformable with any other arrays, nor can it be used in formulas with a scalar. The compiler simply does not have enough information to make the necessary assignments correctly. Even if two arrays actually have the same shape in the calling program, but one or both are assumed size in a subroutine, they are not conformable in that subroutine. Even though they are the same shape, the compiler can't tell this. In other words, this won't work:
```  real a(10)
real b(10)

b = 7.0                 <--- This one is legal.
a = b                   <--- This one is legal.

call sub ( a, b )
...

subroutine sub ( a, b )
real a(*)
real b(*)  <--- Information about shape is lost!

a = b                   <---  This one is NOT legal!
b = 2.0                 <---  This one is NOT legal!

a(1:5) = b(5:9)         <---  and this one is legal.
```
This example also should make clear that a SECTION of an assumed size array may be conformable in a subroutine, even though the whole array is not.

## Dynamic Array Allocation

A feature of FORTRAN90 that may be very useful to programmers is the ability to create DYNAMIC ARRAYS. These are essentially work vectors that may be created in a subroutine, used, and then discarded upon exiting the routine. Using dynamic arrays can decrease the memory requirements of a program, and simplify the source code of a program, since one does not have to use COMMON blocks or subroutine arguments to pass a limited amount of workspace about to various routines. For example, suppose we had a subroutine which computes a matrix inverse. Such a subroutine might, typically, need a work vector of size N for pivoting information, as well as a two dimensional array of size N by N in which to store the factored matrix. Here is how a traditional program might look:
```  integer n
parameter ( n = 100 )

real a(n,n)
real ainv(n,n)
integer ipivot(n)
real work(n,n)

call invert ( a, ainv, work, ipivot, n )

stop
end

subroutine invert ( a, ainv, work, ipivot, n )

integer n

real a(n,n)
real ainv(n,n)
integer ipivot(n)
real work(n,n)
...
return
end
```
The new standard allows you to declare IPIVOT and WORK in the subroutine INVERT without declaring them in the calling program. The calling program's only responsibility is to pass in the dimensions of the dynamic arrays. The arrays will be created on entry into the routine and discarded on exit. Hence, they must be true work arrays, whose values are not needed outside of the routine. Although INVERT may pass these values to subroutines that it calls, it will not be able to return the information in IPIVOT and WORK to the routine that called it. Thus, a modern program would look like
```  integer n
parameter ( n = 100 )

real a(n,n)
real ainv(n,n)

call invert ( a, ainv, n )

...
stop
end
subroutine invert ( a, ainv, n )

integer n

real a(n,n)
real ainv(n,n)
integer ipivot(n)
real work(n,n)
...
return
end
```
It may take you a minute to realize that this program is significantly different! If you have ever used a library package that required you to set up, dimension, and pass several work vectors, you may appreciate the convenience of this feature. However, it really does also decrease the amount of memory required. This program begins by requiring 2*N*N entries of array storage. Only when INVERT is called are N*N+N more entries asked for, and once INVERT is exited, those entries are freed up again. A program with many subroutines and work vectors could thus run in much smaller space.

## The FORALL Statement

On the Connection Machine, there is a FORALL statement, which is not part of the FORTRAN90 standard, but which you may find useful. This statment is NOT available on other machines, in particular the Cray. The format of the statement is
FORALL (index=lower bound:upper bound:increment) statement
For example, to set X(I) to I for entries 1 through N: FORALL (I=1:N) X(I)=I To set the entries of the N by N matrix A(I,J) to 1/(I+J-1): FORALL (I=1:N, J=1:N) A(I,J)=1.0/REAL(I+J-1.0) To set entries 9, 7, 5, 3 of X(I) to 1.0, FORALL (I=9:3:-2) X(I)=1.0 Notice that this FORALL statement is "halfway" between the old styled DO loops and the modern abbreviated assignment statements like "A=1.0". Of course, there's a very good reason for wanting a statement like this. The abbreviated statements are fine as long as the dummy DO loop index only occurs on the right hand side as an array index, and only in the same way it occurs on the left hand side. Thus, it's easy to rewrite loops whose inner statement is "A(I)=1.0", "A(I,J)=A(I,J)+X(I,J)" and so on. But it is impossible to rewrite the simple loop:
```      do I = 1, N
X(I) = I
end do
```
using the abbreviated FORTRAN90 constructs. For cases where the dummy DO loop index does something "interesting" on the right hand side, the FORALL statement can be used. It allows you to skip the "DO" statement and the labeled statement, and on the Connection Machine the FORALL statement will execute in parallel (assuming the arrays that occur in the statement reside on the Connection Machine!).

## FORTRAN90

The FORTRAN90 standard includes declarations and operations for use with arrays. As a very simple example, consider the following Fortran fragment:
```  do I = 1, N
do J = 1, N
A(I,J) = B(I,J)
end do
end do
```
Presumably, the user simply wants to copy array B into array A. However, because Fortran has no way to express that idea, the user has to request that each individual entry of B must be copied into the corresponding entry of A. Moreover the user must specify an order in which this happens and create temporary variables to hold the dummy indices (I and J). The corresponding Fortran90 statement is terse:
```  A = B
```
It should be clear why code optimizers prefer the second notation over the first. The compiler can choose how the copying is done. On most machines, in fact, the copying should be done as though the inner DO loop was on the first index, not the second. Users don't care about a detail like that, but compiler writers do. They can make sure that "A=B" is done in the most efficient way the given computer can achieve. New or modified features of FORTRAN90 include: Array declarations Array expressions Selecting regularly spaced array entries Selecting arbitrary array entries Assumed size arrays Dynamic array allocation The WHERE statement

## FORTRAN Arrays

You should first read the information about FORTRAN vectors, since all of that material applies to FORTRAN matrices as well. An additional complication occurs, however, because of the multiple indexing of the data in a matrix. The matrix is actually stored as a vector in memory, one column following the next. It is fairly easy to see that, if A is a matrix that was declared to be of size (M,N), then A(1,1) will be the first element of the vector, A(2,1) the next, ..., A(M,1) the M th, A(1,2) the M+1 th, ... A(M,2) the 2*M th, ..., and A(M,N) the M*N th. The formulas that relate the array indices I and J to the relative vector location K are
```  K = (J-1)*LDA + I
J = (K-1)/LDA + 1
I = K - (J-1)*LDA
```
It should be clear that the leading dimension of the array must be known by the program in order to figure out how to store the two dimensional quantity in a one dimensional way. This leading dimension is often not the same as the size of the information stored in the array; in other words, a 10 by 20 array may be set aside with a DIMENSION statement, but perhaps only a 5 by 6 matrix is assigned and used. In this case, the dimension of "10" is the leading dimension, or physical first dimension, whereas 5 is the logical first dimension. The only restriction is that the physical dimension must be at least as large as the logical dimension. It is extremely easy to cause an error by storing information into a two dimensional array with one leading dimension, and retrieving it with another. FORTRAN does not "remember" the dimensions of the array for you. For example, consider this situation:
```  program main
integer A(9,9)

do I = 1, 9
do J = 1, 9
A(I,J) = 10 * I + J
end do
end do

write ( *, * ) A(1,2)
call show ( A )

stop
end

subroutine show ( A )
integer A(5,5)

write ( *, * ) A(1,2)

return
end
```
The first WRITE statement will print the value "12", corresponding to A(1,2), as understood by the main program. A(1,2) corresponds to entry 11 when the array is written as a vector. But in the subprogram, the WRITE statement will print out "61", which, in the main program at least, was entry A(6,1), but which has now become entry A(1,2) in the subprogram! How did that happen? Well, in the main program, we put the value "61" into A(6,1), which is the 6-th element when the array is written as a vector. But in the subprogram, when we ask for entry A(1,2), the program looks for the element K = (J-1)*LDA + I = (2-1)*5 + 1 = 6. So rather than looking at entry 11, we look at entry 6. The problem is that the data was stored with a leading physical dimension of 10, and must be retrieved in the same way! Of course, it is possible to use triple indexing in FORTRAN, which becomes even more complicated. But triple indexing is fairly rare. It is enough to remember that whenever you setup a two dimensional array and it passes from one routine to another, you must also see to it that the array is declared with the same first dimension. In our example above, it would be enough to change the PRINT routine to declare A(10,10), or to pass in the leading dimension as an argument, say "N", and declare "DIMENSION A(N,*)".

## FORTRAN Vectors

There are several matters to be aware of when declaring and using vectors in a FORTRAN program. The first item concerns the declaration of vectors. If a variable is a vector, it must be declared so, in every routine in which it appears. This can be done with a DIMENSION statement, or by modifying the type statement, if any. Thus, to declare that X is a real vector with 10 elements, we could declare
```  DIMENSION X(10)
REAL X
```
or
```  REAL X(10)
```
These statements define the type of X, declare it to be a vector, and ask that storage for 10 elements be set aside. However, if X is passed to a subroutine, it is not necessary to specify the size of X. It is still necessary, however, to declare X to be a vector. This could be done in any of the following ways:
• DIMENSION X(10)
• DIMENSION X(N)
• DIMENSION X(1)
• DIMENSION X(*)
• REAL X(10)
• REAL X(N)
• REAL X(1)
• REAL X(*)
The use of "*" as a dimension allows us to say that X is a vector, that it has been passed in to this routine, but that we don't care to specify the size of the vector. The statement "DIMENSION X(1)" is the old fashioned way of doing this, before the "*" was legal as a dummy dimension. The traditional way of declaring an array passed in to a subroutine is the "DIMENSION X(N)" statement, where N is also passed in to the routine, either in the argument list, or through a COMMON block. The advantage of a declaration like this is that array bounds checkers will have enough information to tell if references to X go out of bounds. The second item is the amount of storage used for vectors. In FORTRAN, the amount of storage allocated for a vector is determined by the declaration of the vector in the highest level routine using that vector. In other words, if a vector X is declared as a vector in subroutines A, B, and C, and A calls B and B calls C, then it is only the declaration in A that determines the amount of storage set aside. When passing a vector to a subroutine, normally just the name of the vector is used in the calling sequence, as in: CALL SUB(N,X) However, it is possible to use a subscripted entry of the vector as the argument: CALL SUB(N,X(1)) or CALL SUB(N,X(10)) The call using X(1) should be no different from the call using X. However the call which passes X(10) will actually give the subroutine a "smaller" vector than the original X, which will start at the 10-th element of X. Thus, if the original X contained 13 entries, the subroutine will receive what looks like 4, X(10), X(11), X(12) and X(13). This method of sending information to subroutines is generally only used to break up a large workspace vector into individual chunks. For example, a workspace vector TEMP might be set up, of size 1000, and then, once N were specified, say, interactively, three vectors of size N might be passed to a subroutine by a call like CALL SUB(TEMP(1),TEMP(N+1),TEMP(2*N+1)) Another item to note is the new ability of FORTRAN77 to allow the user to declare the range of indices for the vector. If a vector has N elements, they no longer must be addressed 1 through N, although that is the default that will be in force if the usual dimension statement DIMENSION X(10) or REAL X(10) is used. But if you wish to index these 10 entries by counting from 0 to 9, or from 3 to 12, or -10 to -1, you simply use an explicit range indicator, which is the least value, followed by the greatest value, separated by a colon, as in: DIMENSION X(0:9) or REAL X(3:12) or INTEGER X(-10:-1). Nothing has changed in the actual storage of the vector, and the nontraditional indexing only applies to a single routine where it is declared in the dimension or type statement.

## FORTRAN90 WHERE Statement

The FORTRAN90 standard includes the new "WHERE" statement, a new control statement that allows the conditional execution of statements depending on the values of an array expression. Note that "WHERE" is available on the Connection Machine, but not on the Cray CFT77 compiler. For example, suppose that we are studying a single molar volume of ideal gas, and that we have a set of pressures and temperatures at various locations, and wish to know the gas densities at those same locations. The formula D=P/(R*T) can be used. In this formula, D is the density, P the pressure, T is the temperature, R is a constant. If that were all that was going on, we could write a Fortran formula like:
```  do i = 1, n
do j = 1, n
dens(i,j) = press(i,j) / ( r * t(i,j) )
end do
end do
```
or, in FORTRAN90,
```  dens = press / ( r * t )
```
which would compute all the densities at once. Now suppose that whenever the pressure is less than 30.0, we just set the density to some fixed value, such as 1.23. We could assign the value 1.23 to all entries of DENS, and then reassign individual values:
```      do i = 1, n
do j = 1, n
dens(i,j) = 1.23
end do
end do

do i = 1, n
do j = 1, n
if ( press(i,j) .ge. 30.0 ) then
dens(i,j) = press(i,j) / ( r * t(i,j) )
end if
end do
end do
```
or, in FORTRAN90,
```  DENS = 1.23

WHERE ( PRESS .GE. 30.0 ) DENS = PRESS /( R * T )
```
By the way, the only reason for writing the old Fortran version in two separate loops was to make clear the correspondence with the new Fortran version. But let's further suppose that whenever the pressure is less than, say, 30.0, we want to use 30.0 in the formula. The old Fortran version would look like:
```  do i = 1, n
do j = 1, n
if ( press(i,j) .ge. 30.0 ) then
dens(i,j) = press(i,j) / ( r * t(i,j) )
else
dens(i,j) = 30.0 / ( r * t(i,j) )
end if
end do
end do
```
The corresponding FORTRAN90 statements would be
```      where ( press .ge. 30.0 )
dens = press / ( r * t )
elsewhere
dens = 30.0 /( r * t )
endwhere
```
The first WHERE statement is much like the one line IF statement, and the second is much like the block IF statement. The difference is again one of scalar versus vector notation. It may help to think of the WHERE statement as being the replacement for simple DO loops with a single IF or a block IF controlling the assignment of values to an array. We've demonstrated the WHERE statement before explaining it. Sometimes that works better! Here is the formal definition of the WHERE statement's syntax: WHERE (Logical Array Expression) Array = Array Expression or
```  WHERE (Logical Array Expression)
Array = Array Expression
ELSEWHERE
Array = Array Expression
ENDWHERE
```
Note that the arrays, array expressions, and logical array expression in this statement must be conformable. The logical array expression may be either a LOGICAL array or an array expression with a LOGICAL value.

## Loop Unrolling

Linear algebra algorithms involve vectors and matrices. A corresponding routine will therefore contain arrays and loops. A loop is a piece of a program which is to be repeated a number of times. In many cases, these loops are "nested". That is, a first loop is begun, containing operations to be repeated many times. One of those operations is itself another loop. For instance, to obtain the row sums of a matrix, you repeat an operation, namely, get the sum of the elements of a particular row. To add a particular row together, you repeat an operation: add the element from the next column to the current sum. This corresponds to the FORTRAN code:
```  do I = 1, M
do J = 1, N
ROWSUM(I) = ROWSUM(I) + A(I,J)
end do
end do
```
This set of nested loops has the property that we could interchange the loops and achieve the same results. That is, we could write
```  do J = 1, N
do I = 1, M
ROWSUM(I) = ROWSUM(I) + A(I,J)
end do
end do
```
which carries out the operations in a different order. Many linear algebra operations involving nested loops also have this property, that the loops can be interchanged. One feature of FORTRAN DO loops is the ability to increment the loop counter by a value other than 1. This is done by explicitly listing the value of the increment in the DO statement. Assuming N was even, for instance, we could rewrite our algorithm as:
```  do I = 1, M
do J = 1, N, 2
ROWSUM(I) = ROWSUM(I) + A(I,J) + A(I,J+1)
end do
end do
```
What's happened? The idea of using a DO loop is to take a sequence of operations that are very similar and "roll" them up into one generic sequence. We have slightly reversed this process, by writing out every two steps of the operation rather than a single one. This is called "unrolling" a loop. Notice that we could also unroll the I loop. Again, for convenience, we'll assume that M is a multiple of 2. The resulting program is slightly different:
```  do I = 1, M, 2
do J = 1, N
ROWSUM(I) = ROWSUM(I) + A(I,J)
ROWSUM(I+1) = ROWSUM(I+1) + A(I+1,J)
end do
end do
```
Here we are updating the value of two different row sums, whereas in the previous loop, we were updating the value of one row sum, but doing it twice as much. There's no reason to restrict ourselves to using 2 as the increment. On the other hand, it's not at all obvious yet why we are doing this. The surprising thing is that both the order of loops and the amount of unrolling can affect the speed with which a loop is executed on a computer, especially the Cray. There are a lot of technical reasons for this. However, the order of the loops, at least on the Cray, determines which variables will be treated as vectors, and how large those vectors will be. But more interestingly, on any machine, unrolling can cut down on "useless" operations. In particular, on every iteration of the simple loop, we have to: Get A(I,J); Get ROWSUM(I); Add A(I,J) to ROWSUM(I); Store ROWSUM(I). Only the addition step does us any real good. Each A(I,J) is fetched just once, so there's no way around that. But each ROWSUM(I) is fetched M times, and surely we can cut down on that! Well, in fact, we can. Certain ways of unrolling the loop essentially result in us fetching ROWSUM(I) just M/2 times. For instance, it turns out that the best way to unroll on the Cray would be:
```  do J = 1, N, 2
do I = 1, M
ROWSUM(I) = ROWSUM(I) + A(I,J) + A(I,J+1)
end do
end do
```
• we're not unrolling the inner loop index, (the one which is vectorized) so we are still getting vector loops of size M. (This is an advantage only on the Cray.)
• the resulting unrolled code has only a single assignment operation. So for each iteration of the loop, we only have to fetch and store ROWSUM(I) once. And actually, we are halving the number of times that each ROWSUM entry is fetched and stored.
The advantages of unrolling vary from computer to computer, and there's no guarantee that it always helps. The amount of unrolling, that is, the value of the increment, is also a system dependent quantity. Unrolling makes the inner loop more complicated, meaning that more instructions are generated. On the Cray, this can start to penalize performance when the increment becomes, say 16 or 32. The early BLAS routines used unrolling to try to improve performance. However, in many cases, this did not help, at least on the Cray, because the loops were only simple, rather than nested. This is one reason why the higher level BLAS routines (matrix-vector and matrix-matrix levels) were developed.

## Selecting Regularly Spaced Array Entries

FORTRAN90 includes a new array notation that is supposed to replace simple DO loops. But DO loops have a lot of power. We can modify what the statement A(I,J)=B(I,J) does in a DO loop in many ways, by changing the starting and stopping values of the index, or also the increment. The new array notation includes the same abilities. It does so by borrowing the ":" notation introduced with Fortran CHARACTER variables. Recall that if A was a CHARACTER string, then A(3:9) was the string of 7 characters that started with the third, and ended with the ninth character of A. Similarly, if A is a vector, A(3:9) is also a vector, but of a smaller range. Using the ":" notation, it is clear that we can manipulate pieces of vectors and arrays. For example, if A were a vector of size 10, and B a vector of size 20, then we could say A = B(1:10) or B(1:10) = A or B(3:12) = A or even B(3:5) = A(5:7) If we omit the value before the ":", the first index (usually, but not necessarily, equal to 1) is assumed. If we omit the value after the ":", the last index is assumed. Thus, assuming the declarations:
```  REAL A(10)
REAL B(20)
REAL C(-2:17),
```
then the statement
```  A(:) = B(1:10)
```
is equivalent to any of the statements:
```  A(1:10) = B(1:10)
A = B(1:10)
A(:10) = B(1:10)
A(1:) = B(1:10)
A = B(:10)
```
and the statement C(:7) = A is equivalent to: C(-2:7) = A Note at this point that we could write the formula A(2:5) = A(1:4) What is the result? If we interpreted this the way it would be done as a Fortran DO loop, the value in A(1) would be copied into A(2), then the value in A(2) would be copied into A(3)...but wait, the value in A(2) was just overwritten by the value in A(1). So in fact the first five entries of A will all have the same value. This is not the way FORTRAN90 will interpret the statement A(2:5)=A(1:4). FORTRAN90 requires that an array assignment's value must not depend on the order that the individual assignments are actually carried out. Thus, FORTRAN90 will essentially shift the values over in the vector. In the figure below we suggest what the results of A(2:5)=A(1:4) would be if carried out in a DO loop, or in FORTRAN90: Old array A: 1.0 2.0 3.0 4.0 5.0 DO loop: 1.0 1.0 1.0 1.0 1.0 FORTRAN90: 1.0 1.0 2.0 3.0 4.0 The effect of the DO loop increment in scalar Fortran is modeled in the FORTRAN90 indexing scheme by the use of an optional second ":" followed by an increment. Thus, assuming REAL A(10) and REAL B(20), to copy the odd entries of B into A we would write A = B(1:20:2) which we could abbreviate to A = B(::2) This is equivalent to the DO loop:
```  do I = 1, 20, 2
A(I) = B(1+2*(I-1))
end do
```
The even entries would be copied by A = B(2:20:2) For simplicity, the discussion of indexing has been given for vectors. The same rules apply to arrays. For example, REAL A(10), B(20,10), C(50) B(1,:) = A copies A into the first row of B. B(5:14,3) = A copies A into entries 5 through 14 of column 3 of B. C(1:10) = A + B(20,:) adds A and row 20 of B elementwise, and copies the results into the first 10 elements of C. The ":" notation is an example of an ARRAY SELECTOR. Array selectors are operators that define a new array by selecting elements of a given one. Thus, the expression X(4:10) defines a new array comprising 7 elements starting with X(4). A(:,1:N:2) defines a two dimensional array comprising the even columns of A. An array which is defined by an array selector is called an ARRAY SECTION. An array section may appear on the right or left hand side of an assignment statement, but no where else. In particular, array sections may not appear as arguments to a subroutine, except for intrinsic functions, such as SQRT. If X is an array of dimension 10, and you wish to pass to a subroutine the subvector starting at entry 3, a correct call is
CALL SUBROUTINE ( X(3) )
but it is INCORRECT to say
CALL SUBROUTINE ( X(3:10) )
If you need the power of array selection to create arguments to a subroutine, you may have to use a temporary variable. Thus, to pass the even columns of an array A, you might set up an array TEMP and use commands like:
```  REAL A(10,10)
real TEMP(10,5)

TEMP = A(:, 1:10:2)
CALL SUBROUTINE ( TEMP )
A(:, 1:10:2) = TEMP
```

## Selecting Arbitrary Array Entries

The FORTRAN90 ":" array selector is only capable of creating array sections comprising regularly spaced entries of the primary array. Using a vector as our sample array, we can define X(1:N:2) = ( X(1), X(3), X(5), ... X(5:10:3) = ( X(5), X(8) ) X(N:1:-1) = ( X(N), X(N-1), ..., X(2), X(1) ) but there is no way to create an array section of the form ( (X(1), X(2), X(5) ) since the spacing between indexes varies. The irregular accessing of elements of a vector or an array is very common. Extracting arbitrary elements of an array is called GATHERING, and changing or updating arbitrary elements is called SCATTERING. The FORTRAN90 construct for handling this operation is called VECTOR VALUED SELECTION. FORTRAN90 allows you to carry out gathering and scattering by setting up an index vector and using the same sort of brief symbolic notation to denote the entire operation. For example, if you have a vector IP of indices, you might gather elements of X by a statement like Y = X(IP) and scatter elements of X by a command like X(IP) = Z What is meant by such statements? The Fortran 77 equivalents would be
```      REAL X(N), Y(M), IP(M), Z(M)

do I = 1, M
Y(I) = X ( IP (I) )
end do

do I = 1, M
X ( IP (I) ) = Z(I)
end do
```
Thus, you can see that nothing extremely mysterious is going on here. FORTRAN90 has simply allowed us to write a complicated expression in a simple shorthand. Let's go back to our example earlier, and see how we would express it using this notation. We want to create an array section containing the elements (X(1), X(2), X(5)). To do so, we must set up a vector containing the indices 1, 2 and 5. Let's assume that we want to add 1 to each of these elements of X. One version of our program might be:
```  integer ip(3)
real x(10)
real y(3)

IP(1) = 1
IP(2) = 2
IP(3) = 5

Y = X(IP)

Y = Y + 1.0

X(IP) = Y
```
In fact, if all we want to do is add 1.0 to those elements of X, we can write X(IP) = X(IP) + 1.0 It is possible to write ambiguous formulas when using index vectors. It is not a problem when "reading" values. Consider, for example, the following program fragment:
```      integer IP(2)
real X(10)
real Y(2)

IP(1) = 1
IP(2) = 1

Y = X(IP)
```
This is equivalent to saying
```  Y(1) = X(1)
Y(2) = X(1)
```
However, if we flipped the assignment line,
```  X(IP) = Y
```
we will be trying to assign the values Y(1) and Y(2) to X(1), simultaneously. In scalar mode, we must write the two formulas down in one order or the other, and thus there is no ambiguity. But in the array syntax, we are essentially saying "replace the value of X(1) by the value of Y(1), and while you're at it, replace the value of X(1) by the value of Y(2)". The standard says that in such a case the compiler is free to assign any one of the multiple values to the target entry. Thus X(1) will be equal to Y(1) or Y(2), but we have no guarantee which. A well written program will not have to worry about such a detail, but it is important to know that such things can occur. If you enjoy complication, you can try to figure out what the following expressions mean:
```  real A(10)
integer IS1(20,20)
integer IS2(5)
integer IS3(5)

A(IS2)=2.0

A(IS2+IS3)=3.0

A(IS1(IS2,3))=4.0
```