2-13  USING STRINGS AND CHARACTER ARRAYS
 ****************************************

 Comparison between strings and character arrays
 -----------------------------------------------

                              |   Strings            |   Character arrays
 =============================|======================|========================
  Substring notation          | ST(I:J)              | not allowed
 -----------------------------|----------------------|------------------------
  Array notation              | not allowed          | AR(I)
 =============================|======================|========================
  Constant declaration syntax | CHARACTER ST*10      | CHARACTER AR(10)
 -----------------------------|----------------------|------------------------
  Block I/O operations        | whole [sub]string    | only with an implied DO
  of the constant notation    |                      |
 =============================|======================|========================
  Star declaration syntax     | CHARACTER ST*(*)     | CHARACTER AR(*)
 -----------------------------|----------------------|------------------------
  Semantics of the star       | the length is passed | no length information
  declaration syntax          | transparently        | is passed
 -----------------------------|----------------------|------------------------
  Block I/O operations        | whole [sub]string    | only with an implied DO
  of the star notation        |                      |
 -----------------------------|----------------------|------------------------
  Mechanisms used for         | hidden argument or   | you are responsible
  passing the length          | descriptor           | for staying in bounds
 =============================|======================|========================
  Variable declaration syntax | CHARACTER ST*(N)     | CHARACTER AR(N)
 -----------------------------|----------------------|------------------------
  Variable declaration        |                      | the usual adjustable
  semantics                   |                      | array mechanism
 -----------------------------|----------------------|------------------------
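
 The difference between the two star declarations can be sketched as
 follows. The names STLEN, FILLAR and STARDM are made up for this
 illustration; the point is only that ST*(*) brings its length along,
 while AR(*) needs the size as a separate argument.

```fortran
C     Sketch only: STLEN, FILLAR and STARDM are hypothetical names.
      INTEGER FUNCTION STLEN(ST)
C     The length of the actual argument arrives with ST*(*).
      CHARACTER         ST*(*)
      STLEN = LEN(ST)
      RETURN
      END

      SUBROUTINE FILLAR(AR, N, CH)
C     AR(*) carries no size, so the caller must pass N explicitly.
      INTEGER           N, I
      CHARACTER         AR(*), CH
      DO 10 I = 1, N
        AR(I) = CH
   10 CONTINUE
      RETURN
      END

      PROGRAM STARDM
      INTEGER           STLEN, I
      CHARACTER         ST*10, AR(10)
      ST = 'FORTRAN'
      CALL FILLAR(AR, 10, '*')
C     Prints 10, the declared length of ST.
      WRITE (*,*) STLEN(ST)
C     The array elements are written with an implied DO.
      WRITE (*,'(1X,10A1)') (AR(I), I = 1, 10)
      END
```

 Passing a wrong N to FILLAR is not detected by the language; this is
 the "you are responsible" entry of the table.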



 The blank padding
 -----------------
 When you declare a FORTRAN string, you define the maximal length it can
 have (the "physical length"). Whole-string operations that use only part
 of it, e.g. assigning a shorter string or reading a shorter record,
 automatically pad the rest of the string with blanks (spaces).

      CHARACTER         STRING*12
      ...........................
      STRING = 'FORTRAN'


    |--------------- Physical Length ---------------|
    +---+---+---+---+---+---+---+---+---+---+---+---+
    | F | O | R | T | R | A | N |   |   |   |   |   |
    +---+---+---+---+---+---+---+---+---+---+---+---+
    |------ Logical Length -----|---- Blank tail ---|


 It is clear that the physical length doesn't change, but the logical 
 length may change on each assignment or read operation.

 A subtle point about FORTRAN strings is that the "logical" length of the
 string is not well defined: it is defined only up to an arbitrary number
 of trailing blank characters. Once some text has been assigned to a
 string, all information about the original number of trailing blanks is
 irreversibly lost. Concatenating the text of two strings is therefore
 ambiguous: you can't tell whether there were zero, one or more blanks at
 the end of the first one. For the same reason, strings that differ only
 in trailing blanks compare as equal:

      CHARACTER         ST1*10, ST2*10
      ...........................
      ST1 = 'FORTRAN'
      ST2 = 'FORTRAN '
      IF (ST1 .EQ. ST2) WRITE (*,*) 'The strings are equal! '

 The blank padding at the end of the string is counted when you use the 
 LEN() function to find the string's length, or when you WRITE the string.
 To find the "true" length of the string use:

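      integer function strlen(st)
c     returns the position of the last non-blank character of st,
c     or 0 if st contains only blanks
      integer           i
      character         st*(*)
      do 10 i = len(st), 1, -1
        if (st(i:i) .ne. ' ') goto 20
   10 continue
      i = 0
   20 strlen = i
      return
      end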

 Strings are not initialized with blanks. If the compiler initializes
 them at all (VMS, Sun), they are initialized to NULs. Note that some
 terminals (e.g. VTnnn) ignore NUL characters, so if such a string is
 written to the screen there will be no visible output (except the start
 of a new line).



 Self-assignment of strings
 --------------------------
 Be careful when assigning strings to themselves: the FORTRAN 77 standard
 prohibits some common situations (Fortran 90 lifted this restriction).
 A string (or substring) STR may not be assigned a character expression
 that references any of the character positions being assigned to, e.g.
 an overlapping substring of STR itself.

 A small example program:

      PROGRAM SLFASS
      CHARACTER         STRING*10
      STRING = '1234567890'
      STRING(2:) = STRING
      WRITE (*,*) ' Correct result is: 1123456789 '
      WRITE (*,*) ' Local result is:   ', STRING
      END

 On some machines you'll get a string composed of 1s only!

 The FORTRAN 77 standard didn't allow "self assignments" because the
 "right" way to do them requires (in the general case) a temporary
 character variable whose length is known only at run-time. Some older
 machines in use when the FORTRAN 77 standard was written had problems
 with dynamic (run-time) memory allocation, and to accommodate them
 the standard chose to restrict character assignments.

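 A possible workaround for "self assignments" is concatenation with a
 null string, which usually makes the compiler evaluate the right-hand
 side into a temporary first. Note that it relies on the compiler
 accepting the zero-length constant '' (not allowed by FORTRAN 77, see
 "Null strings" below), and that the restriction of the standard
 formally still applies: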

      STRING(2:) = STRING // ''
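
 A way that stays inside the FORTRAN 77 rules is to make the copy
 yourself. The following is a minimal sketch, not a general solution:
 the names SHIFT1 and SLFTMP and the fixed 256-character buffer are
 assumptions made for the illustration.

```fortran
C     Sketch only: SHIFT1, SLFTMP and the 256-character limit are
C     assumptions made for this illustration.
      SUBROUTINE SHIFT1(ST)
C     Shift the text in ST one place to the right, keeping ST(1:1).
      CHARACTER         ST*(*), TEMP*256
C     Nothing to shift, or the buffer is too small: leave ST alone.
      IF (LEN(ST) .LT. 2 .OR. LEN(ST) .GT. 256) RETURN
C     Copy first, so that source and target no longer overlap.
      TEMP = ST
      ST(2:) = TEMP
      RETURN
      END

      PROGRAM SLFTMP
      CHARACTER         STRING*10
      STRING = '1234567890'
      CALL SHIFT1(STRING)
C     Prints 1123456789.
      WRITE (*,*) STRING
      END
```

 The fixed-size buffer is exactly the compromise the standard was
 written around: the length of the temporary is chosen at compile
 time, so longer strings have to be rejected or handled separately.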



 Null strings
 ------------
 The FORTRAN 77 standard doesn't allow null constant strings (strings
 with length = 0; Fortran 90 allows them). You can check what your
 compiler does with a small program:


      PROGRAM NULSTR
C     ------------------------------------------------------------------
      CHARACTER*1 STRING
C     ------------------------------------------------------------------
      STRING = ''
      WRITE(*,*) ' STRING= |', STRING, '|'
C     ------------------------------------------------------------------
      END



 Input/Output
 ------------
 FORTRAN supports input and output of strings, a very convenient 
 feature, and a rich set of string operations.

 You can use WRITE and READ with passed-length strings (declared with
 ST*(*)) since they are not assumed-size arrays, although the syntax
 looks similar.
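
 As an illustration, here is a small sketch of a routine that writes
 whatever string it is given. The names SHOWST and PASSIO are
 hypothetical; the bars are there only to make the blanks visible.

```fortran
C     Sketch only: SHOWST and PASSIO are hypothetical names.
      SUBROUTINE SHOWST(ST)
C     ST*(*) knows its own length, so it can be written as a whole.
      CHARACTER         ST*(*)
      WRITE (*,'(1X,3A)') '|', ST, '|'
      RETURN
      END

      PROGRAM PASSIO
      CHARACTER         NAME*12
      NAME = 'FORTRAN'
C     Prints |FORTRAN     | (the blank tail is part of the string).
      CALL SHOWST(NAME)
C     A substring is passed with its own length: prints |FORT|.
      CALL SHOWST(NAME(1:4))
      END
```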



 Sub-string manipulations
 ------------------------
 The following code shows some elementary 'tricks':

      INTEGER           OFFSET1, OFFSET2
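      CHARACTER         STRING1*40, STRING2*20
      ......................................
      STRING1 = 'bla bla bla (FORTRAN) bla bla ... '
C     INDEX(string, substring): the string to be searched comes first
      OFFSET1 = INDEX(STRING1, '(') + 1
      OFFSET2 = INDEX(STRING1, ')') - 1
C     build the result in one assignment; STRING2 = ' ' // STRING2
C     would be an overlapping self-assignment
      STRING2 = ' ' // STRING1(OFFSET1:OFFSET2)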
      WRITE(UNIT=*, FMT=*) STRING2
      WRITE(UNIT=*, FMT=*) STRING2 // STRING2
      WRITE(UNIT=*, FMT=*) STRING2 // STRING2 // STRING2


 INDEX() is a standard FORTRAN intrinsic function, i.e. a function that
 every FORTRAN compiler knows. INDEX takes two arguments, both of them
 strings: it looks for the second string inside the first and returns
 the position at which it begins, or 0 if it does not occur.

 For example:

        ST1 = 'good'
        ST2 = 'Fortran is good'
               123456789012345

        INDEX(ST2,ST1) is equal to 12


 The // is the concatenation operator: it takes two strings and joins
 them one after the other to form one longer string.

 You may use the // operator with passed-length string operands only in
 assignment statements. Some compilers accept such concatenations in
 other FORTRAN statements (e.g. WRITE), but that is against the standard.
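
 The standard-conforming pattern is to concatenate in an assignment
 and then use the resulting variable. A minimal sketch follows; the
 names MKLINE and CATDEM and the text 'Hello, ' are invented for the
 example.

```fortran
C     Sketch only: MKLINE and CATDEM are hypothetical names.
      SUBROUTINE MKLINE(NAME, LINE)
      CHARACTER         NAME*(*), LINE*(*)
C     Passed-length operands may be concatenated in an assignment.
      LINE = 'Hello, ' // NAME
      RETURN
      END

      PROGRAM CATDEM
      CHARACTER         LINE*20
      CALL MKLINE('FORTRAN', LINE)
C     WRITE gets the finished variable, not the concatenation.
      WRITE (*,'(1X,3A)') '|', LINE, '|'
      END
```

 If LINE is shorter than the concatenated text, the result is simply
 truncated on the right, so the caller decides how much room there is.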


 For example:

C     ------------------------------------------------------------------
      CHARACTER
     *              ST1*7,
     *              ST2*3,
     *              ST3*5
C     ------------------------------------------------------------------
      ST1 = 'Fortran'
      ST2 = ' is'
      ST3 = ' good'

        ST1 // ST2 // ST3  is equal to  'Fortran is good'

 If the strings were declared with lengths larger than their non-blank
 content, they would be padded with blanks, and the // operator would
 concatenate the strings complete with their padding blanks, producing
 a rather ugly result.

 You can find the beginning of the blank padding (e.g. with the strlen
 function shown earlier; INDEX(ST, ' ') works only if the text contains
 no embedded blanks) and use a substring that excludes it.
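
 The effect of trimming before concatenating can be sketched like
 this. LENTRM and TRMCAT are hypothetical names; LENTRM is just a
 self-contained stand-in for a "true length" function. The sketch
 assumes that neither string is entirely blank, since a substring
 such as ST1(1:0) is not allowed in FORTRAN 77.

```fortran
C     Sketch only: LENTRM and TRMCAT are hypothetical names.
      INTEGER FUNCTION LENTRM(ST)
C     Position of the last non-blank character, 0 if ST is all blank.
      CHARACTER         ST*(*)
      INTEGER           I
      DO 10 I = LEN(ST), 1, -1
        IF (ST(I:I) .NE. ' ') GOTO 20
   10 CONTINUE
      I = 0
   20 LENTRM = I
      RETURN
      END

      PROGRAM TRMCAT
      INTEGER           LENTRM, N1, N2
      CHARACTER         ST1*10, ST2*10, ST3*30
      ST1 = 'Fortran'
      ST2 = ' is good'
      N1 = LENTRM(ST1)
      N2 = LENTRM(ST2)
C     Untrimmed: the blank tail of ST1 ends up inside the result.
      ST3 = ST1 // ST2
      WRITE (*,'(1X,3A)') '|', ST3, '|'
C     Trimmed: only the logical part of each string is used.
      ST3 = ST1(1:N1) // ST2(1:N2)
      WRITE (*,'(1X,3A)') '|', ST3, '|'
      END
```

 The leading blank of ST2 survives the trimming, because only the
 trailing blanks of a string are indistinguishable from padding.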

  +-------------------------------------------------+
  |    USE STRINGS TO MANIPULATE FILE NAMES, ETC    |
  +-------------------------------------------------+

