Strings
Peter Suber, Computer Science, Earlham College

Standard Pascal can deal with strings of characters. But it is widely regarded as clumsy and cumbersome with this essential data type. Turbo Pascal has a much more friendly and powerful way of dealing with strings. We will use it in this course, but only with this warning: it is not standard Pascal. That means your TP programs that use its string operations may not be portable to standard Pascal environments. However, that warning given, you should also know that virtually every implementation of Pascal has non-standard extensions for dealing with strings, and most of them resemble TP's very closely.

A string is a series of characters. A word is a string; a sentence is a string; a paragraph is a string; a program is a string. In TP we can declare a variable to be of type string. ("String" is a reserved word in TP.)

var Name : string;

"Name" is a variable that can take as a value any string up to 255 characters in length, the maximum allowed by TP. Hence

Name := 'Santa Claus';

is a valid assignment. Note that the string is quoted in the assignment statement, just as a character would be. The space between 'Santa' and 'Claus' is part of the string. A string is a series of characters, and a space is a character. If we assigned a string to Name that was longer than 255 characters, only the first 255 characters would be stored in the variable; the rest would be discarded.

If we know we will be using strings about as long as 'Santa Claus', then we won't want to waste the memory required for a 255 character string. So we can declare Name this way:

var Name : string[20];

This carves out only enough memory to hold a string of 20 characters. If we assign Name a longer string, only the first 20 characters will be stored and the rest will be discarded.

Actually, TP carves out enough memory for the characters in your string, and then reserves one extra byte to hold the length of the string. This is handy because when you write

writeln(Name);

you want to print only the characters occupied by the variable's value (that is, 'Santa Claus'), not all 20 characters whether they are occupied or not. The "writeln" procedure consults the length of the string before printing it, and prints only that many characters.
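The difference between a string's declared capacity and its current length can be seen in a small program. This is only a sketch: the program name and the variable LongText are made up for illustration, and it assumes TP or a TP-compatible compiler.

```pascal
program StringStorage;
{ Sketch: declared capacity versus current length of a string[20]. }
var
  Name     : string[20];
  LongText : string;
begin
  Name := 'Santa Claus';
  writeln(Name);            { prints Santa Claus }
  writeln(Length(Name));    { prints 11, the current length }
  writeln(SizeOf(Name));    { prints 21: 20 characters plus the length byte }

  { Assigning a longer string keeps only the first 20 characters. }
  LongText := 'Santa Claus of the North Pole';
  Name := LongText;
  writeln(Name);            { prints Santa Claus of the N }
  writeln(Length(Name));    { prints 20 }
end.
```

SizeOf reports the memory reserved for the variable, while Length reports how much of it is currently in use.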

TP has many standard procedures and functions for the most common operations you would want to perform on strings. If you want to do something more exotic, you can write your own procedures and functions, using those provided by TP as the basic ingredients.

1. The byte in memory indicating the length of a string is accessible to "writeln" automatically. It is also accessible to you if you ask. You ask with Length.

HowLong := Length(Name);

After this assignment, HowLong has value 11.

2. Copy lets you take a substring out of a string and use it for another purpose. Given a string, it returns the substring you specify.

FirstName := Copy(Name,1,5);

In this statement, Copy takes from Name the substring that starts at character 1 and occupies the next 5 characters. FirstName would get the value 'Santa'. Copy does not alter the original string.

3. Delete removes a substring from a string. Given a string variable, it changes that variable so that it holds the original string minus the substring you specify.

Delete(Name,1,5);

This is a procedure, not a function, so it cannot stand in an assignment statement. Delete alters the original string. It deletes from Name the substring that starts with character number 1 and occupies the next 5 characters. After this procedure call, Name would be ' Claus'.

(Technical aside: In the procedure Delete, the string identifier is passed as a variable parameter; hence it can come back with a different value.)
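To see the contrast between Copy (a function that leaves its argument alone) and Delete (a procedure that changes it), here is a minimal sketch. The program name is hypothetical, and the square brackets in the output are only there to make the leading space visible.

```pascal
program CopyAndDelete;
{ Sketch: Copy returns a substring; Delete changes the string itself. }
var
  Name, FirstName : string[20];
begin
  Name := 'Santa Claus';

  FirstName := Copy(Name, 1, 5);
  writeln(FirstName);          { prints Santa }
  writeln(Name);               { prints Santa Claus: Copy left Name intact }

  Delete(Name, 1, 5);
  writeln('[', Name, ']');     { prints [ Claus]: the leading space remains }

  Delete(Name, 1, 1);          { remove the leading space as well }
  writeln('[', Name, ']');     { prints [Claus] }
end.
```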

4. Insert puts a substring into a string, at any location in the string that you specify.

Alias := '"Sled Wizard" ';
Insert(Alias,Name,7);

The result of these statements (taking Name to be 'Santa Claus' again) is that Alias is inserted into Name starting at the 7th character position, just after the space. Name is now

Santa "Sled Wizard" Claus

Insert also alters the original string.

(Technical aside: Insert passes the addition to be inserted as a value parameter, and the string to take the insertion as a variable parameter.)
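The position given to Insert is the position at which the inserted text will begin; the character formerly at that position, and everything after it, moves to the right. A minimal sketch, with a hypothetical program name and a string[40] chosen so that the longer result fits:

```pascal
program InsertAlias;
{ Sketch: Insert places Alias so that it begins at the given position. }
var
  Name  : string[40];
  Alias : string[20];
begin
  Name := 'Santa Claus';
  Alias := '"Sled Wizard" ';

  { Position 7 is the C of Claus, just after the space. }
  Insert(Alias, Name, 7);
  writeln(Name);     { prints Santa "Sled Wizard" Claus }
  writeln(Alias);    { prints "Sled Wizard" followed by a space: unchanged }
end.
```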

5. Pos searches for a substring in a string. If it finds the substring, it returns its position; otherwise it returns 0.

Position := Pos('ant',Name);

Since 'ant' is a substring of Name ('Santa Claus'), Pos will return its position, which is 2; that is, 'ant' begins with the second character of Name. So Position would have value 2 after the assignment above. After this assignment,

Position := Pos('hubba hubba',Name);

Position would have value 0.
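Pos is often combined with Copy. As an illustration only, the following sketch splits a name at its first space; the program name and the variable SpaceAt are hypothetical, and the check for 0 handles a name with no space in it.

```pascal
program SplitName;
{ Sketch: use Pos to find the first space, then Copy the two halves. }
var
  Name, FirstName, Surname : string;
  SpaceAt : integer;
begin
  Name := 'Santa Claus';
  SpaceAt := Pos(' ', Name);          { 6 for 'Santa Claus' }

  if SpaceAt = 0 then
    writeln('No space found in ', Name)
  else
  begin
    FirstName := Copy(Name, 1, SpaceAt - 1);
    Surname   := Copy(Name, SpaceAt + 1, Length(Name) - SpaceAt);
    writeln(FirstName);               { prints Santa }
    writeln(Surname);                 { prints Claus }
  end;
end.
```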

6. Concat concatenates two or more strings into one jumbo string.

FullName := Concat(Name,', Esquire');

This statement adds a comma, then a space, then 'Esquire' to the end of Name, yielding

Santa Claus, Esquire

Concat can take any number of strings as arguments for concatenation, separated by commas. They may be quoted strings or variables of type string. However, if the concatenated result would exceed 255 characters, then only the first 255 characters survive. Concat does not affect the argument-strings.

7. You can use the plus sign '+' as an alternative to the Concat function.

FullName := Name + ', Esquire';

The plus sign is actually more flexible than Concat. Its operands can be strings, characters, or packed arrays (or any combination of these), whether they are variables or tokens. The result, however, is always of type string. Of course it has the usual 255 character limit. Like Concat, '+' does not affect its argument-strings.
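A short sketch of both ways of joining strings, including a character operand for the plus sign. The program name and the variable Initial are made up for the example.

```pascal
program JoinStrings;
{ Sketch: Concat and + both build a new string from their operands. }
var
  Name, FullName : string;
  Initial : char;
begin
  Name := 'Santa Claus';

  FullName := Concat(Name, ', ', 'Esquire');
  writeln(FullName);                 { prints Santa Claus, Esquire }

  { The plus sign also accepts a char operand. }
  Initial := 'S';
  FullName := Initial + '. ' + 'Claus';
  writeln(FullName);                 { prints S. Claus }

  writeln(Name);                     { prints Santa Claus: unchanged }
end.
```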

8. Strings can be compared by the standard Pascal boolean operators. For example, if we let

FirstName := 'Santa';
Surname := 'Claus';

then these boolean expressions are either true or false:

FirstName = Surname
{false: these strings are not identical}
FirstName <> Surname
{true: these strings are not identical}
FirstName < Surname
{false: 'Santa' is not alphabetically prior}
FirstName > Surname
{true: 'Santa' is alphabetically posterior}
FirstName <= Surname
{false: 'Santa' is not alphabetically prior or equal}
FirstName >= Surname
{true: 'Santa' is alphabetically posterior or equal}

Note that these equalities and inequalities refer to alphabetical order, not length. Actually, "alphabetical" order must be qualified. These operators use ASCII order. That means, among other things, that all the capital letters are prior to all the lower case letters.

These operators can compare any two variables or tokens of type string. They can compare packed arrays, however, only if they are of the same length.

If you want to perform such comparisons on length, rather than "ASCII alphabetical order", the syntax is easy:

length(FirstName) = length(Surname);
length(FirstName) <> length(Surname);
length(FirstName) < length(Surname);
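The effect of ASCII order, and the difference between comparing strings and comparing their lengths, can be checked with a few writeln statements. This is a sketch with a hypothetical program name; writeln prints a boolean value as TRUE or FALSE.

```pascal
program CompareStrings;
{ Sketch: string comparison follows ASCII order, not length. }
var
  FirstName, Surname : string;
begin
  FirstName := 'Santa';
  Surname := 'Claus';

  writeln(FirstName > Surname);                   { TRUE: S follows C }
  writeln('Zebra' < 'apple');                     { TRUE: capital Z (90) precedes small a (97) }
  writeln(FirstName = 'santa');                   { FALSE: S and s are different characters }
  writeln(Length(FirstName) = Length(Surname));   { TRUE: both have 5 characters }
end.
```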

9. Val converts a string to a number, if the string consists of nothing but numerals. If we have

ZipCode := '47374';

then we have a string consisting of numerals. We can print it, and compare it with other strings for ASCII order priority, but we cannot do arithmetic with it.

Val(ZipCode,Number,ErrorCode);

After this command, Number contains the numerical value represented by the string ZipCode. Number can be declared to be of type integer, or type real; note that a TP integer is a 16-bit value with a maximum of 32767, so a number as large as 47374 needs a variable of type longint or real. If the string to be converted to a number contains an initial hyphen, then it is converted to a negative number. If it contains a decimal point, it can be converted only to a real number; if Number is declared to be an integer, the conversion fails and ErrorCode returns the position of the decimal point. The third parameter ErrorCode returns zero if the conversion went smoothly. It returns a non-zero value if the string was not suited to numerical conversion. The non-zero value is the position in the string of the first ineligible character. Val does not affect the original string.

For example, '23Skidoo' starts as a numeral but includes non-numerical characters. A call to

Val('23Skidoo',Number,ErrorCode);

would not produce a usable value in Number, whose contents should not be relied on after a failed conversion. ErrorCode would have value 3, since the third character of '23Skidoo' is the first non-numerical character. Hence, you will usually want to use code somewhat like the following when calling Val:

{ NumString is a variable of type string; the word String itself is reserved }
Val(NumString,Number,ErrorCode);
if ErrorCode <> 0
then begin
write('Error caused by ',NumString[ErrorCode]);
writeln(' at position ',ErrorCode,' of string.');
end
else GoForIt;
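Here is a complete program built around the same check, offered as a sketch rather than a definitive routine. The program name, the procedure TryConvert, and its parameter Digits are all hypothetical. Number is declared longint so that values above 32767 also fit.

```pascal
program ValDemo;
{ Sketch: convert strings with Val and test the error code each time. }
var
  Number    : longint;
  ErrorCode : integer;

procedure TryConvert(Digits : string);
begin
  Val(Digits, Number, ErrorCode);
  if ErrorCode <> 0 then
    writeln('[', Digits, '] failed at position ', ErrorCode)
  else
    writeln('[', Digits, '] gives ', Number);
end;

begin
  TryConvert('47374');      { prints [47374] gives 47374 }
  TryConvert('-40');        { prints [-40] gives -40 }
  TryConvert('23Skidoo');   { prints [23Skidoo] failed at position 3 }
  TryConvert('3.14');       { prints [3.14] failed at position 2: Number is not real }
end.
```

Number is used only when ErrorCode is zero, which is the safe habit whenever the string comes from a user.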

10. Str is the opposite of Val. It converts numbers into strings. Let Temperature be of type real, and set to 98.6.

Str(Temperature,NewString);

After this statement, NewString would be ' 9.8600000000E+01' (the leading blank is where a minus sign would go). If we want the string to take a friendlier appearance, we can use the field width and decimal place indicators just as we would in a "writeln" statement.

Str(Temperature:1:1,NewString);

Now NewString is '98.6'.
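The field width and decimal place indicators behave in Str as they do in writeln. The following is a sketch with a hypothetical program name and an extra variable, Count, added to show an integer conversion; the brackets make any padding visible.

```pascal
program StrDemo;
{ Sketch: Str with field width and decimal place indicators. }
var
  Temperature : real;
  Count       : integer;
  NewString   : string;
begin
  Temperature := 98.6;

  Str(Temperature:1:1, NewString);
  writeln('[', NewString, ']');        { prints [98.6] }

  { A width of 8 pads on the left with blanks. }
  Str(Temperature:8:2, NewString);
  writeln('[', NewString, ']');        { prints [   98.60] }

  { Integers convert the same way, with no decimal places. }
  Count := 12;
  Str(Count, NewString);
  writeln('Count is ' + NewString);    { prints Count is 12 }
end.
```

Once a number is in string form it can be joined to other strings with the plus sign, as the last line does.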

11. A string is a kind of array, namely, an array of characters. A string has properties that a simple array of characters lacks, but you can still refer to the characters within a string using array notation. For example, Name[2] refers to the second letter of Name. If we have

Which := 1;

then Name[Which] refers to the first character of Name.

12. A single element such as Name[2] is of type char, so it can be used anywhere a character is expected. A variable declared to be of type string[1], by contrast, is a string of length 1, not a character. A string of length 1 and a character are indistinguishable for most purposes, but starting with version 4.0, TP has used very strict type-checking, and it keeps the two types apart.

Because Name[2] is of type char, if Letter is a variable of type char, then these two assignments

Letter := Name[2];
Name[2] := Letter;

will both compile and have their intended effects. This gives us a quick way to change single characters inside strings.
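As a sketch of changing characters in place, the following program reads one character, replaces one, and then converts a whole string to capitals. The program name and the loop variable I are made up for the example; UpCase is TP's standard function for capitalizing a single character.

```pascal
program UpperCaseName;
{ Sketch: read and change single characters with array notation. }
var
  Name   : string;
  Letter : char;
  I      : integer;
begin
  Name := 'Santa Claus';

  Letter := Name[2];
  writeln(Letter);                 { prints a }

  Name[2] := 'A';
  writeln(Name);                   { prints SAnta Claus }

  { Capitalize every character; UpCase leaves non-letters alone. }
  for I := 1 to Length(Name) do
    Name[I] := UpCase(Name[I]);
  writeln(Name);                   { prints SANTA CLAUS }
end.
```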

The strict type-checking shows up with variables declared to be of type string[1] in the first place. If we let ShortString be such a variable, then

Letter := ShortString;

is rejected by the compiler as a type mismatch. Nevertheless,

ShortString := Letter;

will compile.

(A string variable can take a packed array in an assignment statement, but not vice versa.)

*

Review of string operations, procedures, functions

  1. Length(String) is a function that returns the length of String in characters.
  2. Copy(String,StartChar,HowManyChar) is a function that returns a substring from String, starting with StartChar and running for HowManyChar characters. Copy leaves String intact.
  3. Delete(String,StartChar,HowManyChar) is a procedure that removes a substring from String, starting with StartChar and running for HowManyChar characters. Delete alters String.
  4. Insert(NewString,String,StartChar) is a procedure that inserts NewString at the StartChar position of String.
  5. Pos('test',String) is a function that returns the position of the 'test' string inside String. If 'test' does not occur in String, Pos returns 0.
  6. Concat(String1,String2,...,StringN) is a function that returns a single long string composed of String1, String2, ..., StringN laid end to end in that order.
  7. String1 + String2 + ... + StringN has the same effect as Concat. The plus sign can join string variables or tokens, or mix strings with characters and packed arrays.
  8. String1 = String2 (and so on with: <>, >, <, >=, and <=) are boolean expressions that are either true or false. They compare strings for their place in "ASCII alphabetical" order.
  9. Val(NumericalString,Number,Result) is a procedure that assigns the numerical value represented by NumericalString to Number, when Number is an already-declared variable either of type integer or real. If NumericalString is not suitable for conversion (if it has any non-numerical characters in it) then Result will return the position of the first ineligible character in it; otherwise Result will return 0.
  10. Str(Number,NumericalString) is a procedure that assigns the string of numeral-characters composing Number to the string, NumericalString.
  11. String[n] refers to the nth character of String.
  12. Expect strict type-checking between characters and strings of length 1. String[n] is of type char, so String[n] := Character and Character := String[n] are both valid assignments. If ShortString is of type string[1], then ShortString := Character is a valid assignment, but Character := ShortString is a type mismatch that will not compile.

Reserved word:

string.

Standard (predefined) identifiers:

length, copy, delete, insert, pos, concat, val, str.

This file is an electronic hand-out for the course, Programming and Problem Solving.

Peter Suber, Department of Philosophy, Earlham College, Richmond, Indiana, 47374, U.S.A.
peters@earlham.edu. Copyright © 1997, Peter Suber.