Proving programs correct: Some techniques and examples



BIT 10 (1970), 168-182

PROVING PROGRAMS CORRECT: SOME TECHNIQUES AND EXAMPLES

RALPH L. LONDON

Abstract. Proving the correctness of computer programs is justified as both advantageous and feasible. The discipline of proof provides a systematic search for errors, and a completed proof gives sufficient reasons why the program must be correct. Feasibility is demonstrated by exhibiting proofs of five pieces of code. Each proof uses one or more of the illustrated proof techniques of case analysis, assertions, mathematical induction, standard prose proof, sectioning and a table of variable value changes. Proofs of other programs, some quite lengthy, are cited to support the claim that the techniques work on programs much larger than the examples of the paper. Hopefully, more programmers will be encouraged to prove programs correct.

The advantage and feasibility of proved programs.

The often asked question, "How can one demonstrate that a computer program does what it is supposed to do?" has at least three answers: (i) use the standard debugging technique of testing "representative" data and checking the results, (ii) read the program and mysteriously discern that it works, thereby convincing oneself that the program is right or (iii) give a rigorous mathematical proof of the correctness of the program. The purpose of this paper is to demonstrate that it is advantageous, feasible and realistic to employ the third alternative, that is, to give proofs of correctness.

It is advantageous because the discipline of proof provides a systematic search for errors; a completed proof shows the absence of program errors and gives sufficient reasons why the program must be correct. In other words, proof provides a new way to certify programs [10]. In fact, errors in several existing programs have been uncovered only in this way, by the proof process. Additional advantages are given by Dijkstra [2] and Naur [15].

The proofs are not superfluous, nor are the advantages illusory. It is simply not true that even expert programmers design programs in such a way that from their design they can demonstrate that the programs achieve the desired results. (Notable exceptions are Dijkstra [2, 3] and Tobey [23].) While one may argue that programs should always be so designed, in fact this is seldom the case, since, for example, programs are often written to be efficient and elegant rather than to be proved. Since we do care that programs produce correct results, there is need for the actual existence of proofs of correctness. However, the proofs are not intended to be of interest in their own right as mathematical objects of study but rather are of value for what they actually prove.

That the enterprise of giving proofs is feasible and realistic will be demonstrated by exhibiting proofs of five pieces of code. The intent is, of course, to show proof techniques being used and not merely to prove these particular pieces of code. The sample proofs also show (i) just what is meant by a proof, (ii) how one might state what needs to be proved and (iii) the implicit assumptions made. The illustrated techniques are also candidates for what Cooper [1] sought when he asked for "techniques for producing shallow results about large programs." Useful for dealing with practical programs, the techniques include case analysis, assertions, mathematical induction, standard prose proof, sectioning and a table of variable value changes. Where necessary, they are explained as they are used. Hopefully, by using these and other techniques, more programmers will be encouraged and able to prove their own programs and those of others.

Received July 2, 1969.

An overview of the examples.

Each example code segment is small, permitting the inclusion of both a statement of what is to be proved and a proof of the statement. More realistic examples would simply be longer without additional gain. Naur [15] even notes a simple 4-line Algol segment consisting of "some 7 operators and 12 operands" and states, "This is more than what can be grasped immediately by anyone . . . ." Each of the five examples has been selected to illustrate one or more useful proof techniques. Since none of the examples involves a deep mystery of why the code works, it might be argued that several of the examples have a one-word proof, "obvious," in accord with the second answer to the opening question. This may be, but the intent here is to show how to make the reasons explicit. The same techniques have been used successfully to prove the correctness of realistic and running programs where the complexity of each program is such that the correctness is far from obvious [4, 6, 8-12, 14, 20]. Moreover, the relative simplicity of the examples allows the reader to concentrate on the techniques of proof and the concept of proof unencumbered by unnecessarily complex code.


Only the last example and the recursive version of the fourth example were written by the author and expressly for this paper. The rest are taken from other sources and were written by others for purposes totally unrelated to any attempt at proof. The examples are therefore not concocted.

All the sample code segments are ones that might be useful in a bridge bidding program. Bridge bidding is a useful source of exercises in programming texts (for example [16, 19]), especially exercises that emphasize logical rather than numeric tasks. Bridge bidding is a finite domain with a small but convenient set of relations which often must be exploited in a proof. Of course, there is no round-off error to consider in these code segments. Indeed, all numeric variables are assumed to be of type integer. Bridge bidding, in short, provides convenient examples which the reader can follow without knowing the game of bridge. It is assumed only that the reader knows that a bridge hand consists of 13 cards from a standard 52-card deck.

The technique of case analysis.

The first example demonstrates the value of case analysis as a proof technique. The lack of interest, even dullness, of this method should not obscure its extreme usefulness. The example appears as a programming exercise in [19] in which the student is asked to write a routine to make the opening bid for a bridge hand according to the specifications in Figure 1.

    Condition                                                           Bid
    A. POINTS <= 12 and LONGEST <= 6                                    Pass
    B. 13 <= POINTS <= 19 and (STOPPERS = false or BALANCED = false)    1 longest suit
    C. 16 <= POINTS <= 19 and (STOPPERS = true and BALANCED = true)     1 no trump
    D. POINTS >= 21 and (STOPPERS = false or BALANCED = false)          2 longest suit
    E. POINTS >= 21 and (STOPPERS = true and BALANCED = true)           2 no trump
    F. 7 <= POINTS <= 12 and LONGEST >= 7                               3 longest suit

    Definitions of terms used:
    BALANCED -- true if each suit contains at least three cards.
    LONGEST  -- number of cards in the longest suit.
    POINTS   -- Ace, 4; King and suitlength >= 2, 3; Queen and suitlength >= 3, 2;
                Jack and suitlength >= 4, 1; Void (suitlength = 0), 3;
                Singleton (suitlength = 1), 2; Doubleton (suitlength = 2), 0 [not 1].
    STOPPERS -- true if each suit contains Ace or (King and suitlength >= 2)
                or (Queen and suitlength >= 3).

Figure 1. Opening bidder from reference 19.

It is irrelevant that the specifications may be unsound strategy for expert bridge bidding; nothing will be proved concerning this. The intent is to state and to prove several properties of this bidding algorithm.

LEMMA 1.1. Each hand will meet the conditions of at most one bid.

PROOF. Tests A and F require POINTS <= 12, tests B and C require 13 <= POINTS <= 19 and tests D and E require POINTS >= 21. Hence it remains only to show that a hand cannot bid at both A and F, nor at both B and C, nor at both D and E. Since by DeMorgan's law, not (STOPPERS = true and BALANCED = true) is (STOPPERS = false or BALANCED = false), pairs B-C and D-E are eliminated. The pair A-F is eliminated since A requires LONGEST <= 6 while F requires LONGEST >= 7. Q.E.D.

It is not true that at least one bid is prescribed for each hand, namely

LEMMA 1.2. The case analysis in Figure 2 shows which classes of hands make which bid.

    Class of Hand                                     Bid Made (Test)
    1. POINTS <= 6
       a. LONGEST <= 6                                Pass (A)
       b. LONGEST >= 7                                NO BID MADE
    2. 7 <= POINTS <= 12
       a. LONGEST >= 7                                3 longest suit (F)
       b. LONGEST <= 6                                Pass (A)
    3. 13 <= POINTS <= 15
       a. STOPPERS = false or BALANCED = false        1 longest suit (B)
       b. STOPPERS = true and BALANCED = true         NO BID MADE
    4. 16 <= POINTS <= 19
       a. STOPPERS = false or BALANCED = false        1 longest suit (B)
       b. STOPPERS = true and BALANCED = true         1 no trump (C)
    5. POINTS = 20                                    NO BID MADE
    6. POINTS >= 21
       a. STOPPERS = false or BALANCED = false        2 longest suit (D)
       b. STOPPERS = true and BALANCED = true         2 no trump (E)

Figure 2. Analysis of opening bidder of Figure 1.

PROOF. Classes 1 through 6 are disjoint since the respective POINTS are disjoint. Moreover, the six classes exhaust the range of POINTS and also subclass b in each case is the negative of subclass a. Hence Figure 2 applies to all hands and gives a unique result for each hand. The proof consists of verifying first that each class of hand meets the test for the given bid, for example class 3a, namely 13 ...

    Condition                                                           Bid (1)
    G. ... 17 and TOTALPTS >= 19) or
       (SUITLENGTH[spades] = 5 and SUITPOINTS[spades] >= 5) or
       (SUITLENGTH[hearts] = 5 and SUITPOINTS[hearts] >= 5))            1 no trump
    H. HCP >= 17 and LONGEST >= 5                                       1 club
    I. HCP >= 17 and LONGEST = 4 and
       SUITPOINTS[longest suit] >= SP[2]                                1 club
    J. HCP >= 18 and EVENDIST and STOP = 4                              1 club

    (1) 1 club is a strong conventional bid of the bidding system.

    Definitions of terms used:
    DISTP            -- the number of distributional points: void, 3; singleton, 2; doubleton, 1.
    EVENDIST         -- true if DISTP <= 1.
    HCP              -- the number of high card points: Ace, 4; King, 3; Queen, 2; Jack, 1.
    LONGEST          -- length of the longest suit.
    SP[1], ..., SP[4] -- the four SUITPOINTS ordered such that SP[1] >= SP[2] >= SP[3] >= SP[4].
    STOP             -- the number of suits for which suitlength >= 3 or
                        (suitlength = 2 and Ace or King present) or singleton Ace.
    SUITLENGTH[suit] -- the number of cards in the suit.
    SUITPOINTS[suit] -- the number of HCPs in the suit.
    TOTALPTS         -- HCP suitably modified: add to HCP for each suit with at least five cards,
                        2(suitlength - 5) + 1; for each singleton Ace, 1; for having all four Aces, 1;
                        for the hand, 1; and subtract from HCP for having no Aces, 1;
                        for 4333 distribution, 1.

Figure 3. Excerpt of opening bidder from reference 18.


Thus even the short, seemingly uncomplicated specifications of Figure 1 yielded a surprise: hands for which no bid is made. However, the proof of its properties is straightforward. While the cases of Figure 2 are not copied directly from the bidding algorithm, they are not difficult to produce.

A complete analysis, similar to lemmas 1.1 and 1.2, has been successfully accomplished [12] on the entire opening bid procedure (some 145 lines of Algol code, excluding declarations) of a running bridge bidding program [18]. The analysis consists of over 500 cases with a maximum depth of 11. Those cases are definitely not copied directly from the bidding algorithm. (For comparison, the number of bridge hands is the binomial coefficient C(52, 13), approximately 6.3 × 10^11. Of course, smaller a priori bounds on the number of cases needed can be derived.)

The second example consists, then, of a small portion of the analysis in [12]. The purpose is to show case analysis in a more representative use and to show the exploitation of the relations of the problem domain. Figure 3 is the excerpt of the code which is analyzed as well as definitions of the terms involved.

LEMMA 2.1. If a hand has HCP >= 17, then it will meet at least one condition in Figure 3.

PROOF. The proof is by case analysis, 15 cases to a maximum depth of 5. The phrase, for example, "distribution is 4441" means "4 cards in each of three suits, 1 card in the remaining suit." There are only three distributions if LONGEST = 4.

    I.  If LONGEST >= 5, condition H is met.
    II. If LONGEST = 4
        A. If the distribution is 4441, the 1-card suit is at best Ace and then the total
           HCPs for the three 4-card suits is at least 13. Thus one 4-card suit has the
           highest (>= 5) SUITPOINTS. Condition I is met.
        B. If the distribution is 4432, then EVENDIST holds.
           1. If HCP >= 18
              a. If STOP = 4, condition J is met.
              b. If STOP ≠ 4, then the 2-card suit is at best QJ else STOP = 4. The 3-card
                 suit is at best AKQ. Thus the total HCPs for the two 4-card suits is at
                 least 6, i.e., 18 - (9 + 3). But then SUITPOINTS[one 4-card suit] >= 6/2 = 3,
                 at least the second best SUITPOINTS. Condition I is met.
           2. If HCP = 17
              a. If all four aces are in the hand, the only other card contributing to HCP
                 is a Jack. Thus SUITPOINTS[one 4-card suit] >= 4, at least the second best
                 SUITPOINTS. Condition I is met.
              b. If all four aces are not in the hand
                 i.  If STOP = 4, TOTALPTS <= 18, since the only addition to HCP is 1 for
                     the hand. Since LONGEST = 4, condition G is met.
                 ii. If STOP ≠ 4, then proceed as in Case 1b. The HCPs for the two 4-card
                     suits total at least 5. Thus SUITPOINTS[one 4-card suit] >= 3.
                     Condition I is met.
        C. If the distribution is 4333, then STOP = 4 and EVENDIST holds.
           1. If HCP >= 18, condition J is met.
           2. If HCP = 17, then TOTALPTS <= 18 since 1 is added for the hand, possibly 1 is
              added for all four aces, but 1 is subtracted for 4333 distribution.
              Condition G is met.
Q.E.D.

This proof depends heavily upon the problem domain (bridge hands) and the properties of the terms involved; only incidentally are the semantics of the code of Figure 3 used. This is natural since the programmer will take advantage of his knowledge of the problem domain and, accordingly, the prover must expect to do likewise. One should not expect, in general, that the proof will follow without the use of relations peculiar to the problem domain. Any implications of this to automatic generation of correctness proofs will not be pursued here.

Of special note is the use of the so-called pigeonhole principle [17] which states, "if kn + 1 pigeons are in n pigeonholes, at least one of the holes contains k + 1 or more pigeons." Its use occurred at case IIA and elsewhere. While certainly applicable, this principle has nothing a priori to do with bridge or with programming. It may thus be necessary to use as well in a proof a general principle seemingly unrelated to the problem domain or to programming.
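As a cross-check on lemmas 1.1 and 1.2, the case analysis of the first example can also be run mechanically. The following Python sketch is not part of the original paper: the function names bids and analyse are hypothetical, the conditions A-F are transcribed from Figure 1 as read here, and the ranges POINTS = 0..40 and LONGEST = 4..13 are assumed bounds. It enumerates abstract combinations of the four quantities rather than actual bridge hands, so it over-approximates the set of hands; that is sufficient for the "at most one bid" property.

```python
from itertools import product

def bids(points, longest, stoppers, balanced):
    """Return the tests of Figure 1 (A-F) whose conditions are met."""
    both = stoppers and balanced
    tests = {
        "A": points <= 12 and longest <= 6,
        "B": 13 <= points <= 19 and not both,   # DeMorgan: not (S and B)
        "C": 16 <= points <= 19 and both,
        "D": points >= 21 and not both,
        "E": points >= 21 and both,
        "F": 7 <= points <= 12 and longest >= 7,
    }
    return [name for name, met in tests.items() if met]

def analyse(max_points=40):
    """Enumerate abstract hands (assumed ranges, not real card hands).

    Returns the largest number of tests met by any combination and the
    set of (POINTS, LONGEST >= 7, STOPPERS and BALANCED) classes for
    which no test is met.
    """
    most = 0
    no_bid = set()
    for points, longest, stoppers, balanced in product(
            range(0, max_points + 1), range(4, 14),
            (False, True), (False, True)):
        met = bids(points, longest, stoppers, balanced)
        most = max(most, len(met))
        if not met:
            no_bid.add((points, longest >= 7, stoppers and balanced))
    return most, no_bid

if __name__ == "__main__":
    most, no_bid = analyse()
    print("largest number of tests met by one hand:", most)
    print("POINTS values with some no-bid class:",
          sorted({p for p, _, _ in no_bid}))
```

Under these assumptions the enumeration agrees with Figure 2: no combination meets more than one test, and the combinations meeting none fall in classes 1b, 3b and 5.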

The technique of proof by assertions.

The third example, also taken from [18], demonstrates an extremely useful proof technique, especially in dealing with loops. It is described, with examples, in [5], [7] and [15]. The idea is that assertions concerning the progress of the computation are made between lines of code, and the proof consists of demonstrating that each assertion is true each time control reaches that assertion, under the assumption that the previously encountered assertions are true. Using induction on the number of lines of code, it can be shown [7] once and for all that this yields a valid proof procedure. In effect, the technique shows there is no first false assertion. Termination of the program is shown separately.

The code for this example is in the form of an Algol Boolean procedure named OUTSIDEACE:

    Boolean procedure OUTSIDEACE(SUIT); value SUIT; integer SUIT;
    begin integer I, K;
      OUTSIDEACE := false;
      for I := 1 step 1 until 13 do
        for K := 4 step -1 until 1 do
          if HAND[I] = 13 · K - 12 and K ≠ SUIT then OUTSIDEACE := true;
    end of OUTSIDEACE;

It is necessary to explain the data representation of this example. OUTSIDEACE is to return true if the hand contains an outside ace, i.e., an ace other than the ace of the suit denoted by the integer parameter SUIT, and is to return false otherwise. The hand is the array HAND[1:13]. SUIT will be either 4, 3, 2, or 1 denoting the four suits. The internal card representations are the integers 1 to 52 according to the formula, 13 · (SUIT - 1) + RANKVALUE, where

    RANK       A   2   3   4   5   6   7   8   9   10   J    Q    K
    RANKVALUE  1   2   3   4   5   6   7   8   9   10   11   12   13

LEMMA 3.1. OUTSIDEACE returns the value true or false according as HAND contains an ace other than the ace of SUIT or not.

    Boolean procedure OUTSIDEACE(SUIT); value SUIT; integer SUIT;
    begin integer I, K;
      OUTSIDEACE := false;
      comment 1: OUTSIDEACE = false;
      for I := 1 step 1 until 13 do
      begin
        comment 2: OUTSIDEACE = true iff some HAND[J], J = 1, ..., I-1,
                   is an outside ace;
        for K := 4 step -1 until 1 do
        begin
          comment 3: OUTSIDEACE = true iff some HAND[J], J = 1, ..., I-1,
                     is an outside ace or HAND[I] is the L-th outside ace,
                     L = K+1, ..., 4;
          if HAND[I] = 13 · K - 12 and K ≠ SUIT then OUTSIDEACE := true;
          comment 4: OUTSIDEACE = true iff some HAND[J], J = 1, ..., I-1,
                     is an outside ace or HAND[I] is the L-th outside ace,
                     L = K, ..., 4;
        end K loop;
        comment 5: OUTSIDEACE = true iff some HAND[J], J = 1, ..., I,
                   is an outside ace;
      end I loop;
      comment 6: OUTSIDEACE = true iff some HAND[J], J = 1, ..., 13,
                 is an outside ace, i.e., iff the hand contains an outside ace;
    end of OUTSIDEACE;

Figure 4. Code for OUTSIDEACE including assertions.

PROOF. Using Figure 4 the proof consists of verifying the six assertions that have been inserted as comments in the code for OUTSIDEACE. Extra begin-end pairs have been added in the bodies of the for loops to distinguish points of control, namely the end of the body of the loop from the end of the entire loop.

Comment 1 is reached only from the immediately preceding assignment statement, and that statement verifies comment 1. Comment 2 may be reached either from comment 1 or from comment 5. In the former case, I = 1 and the range of J is empty; hence comment 2 says OUTSIDEACE is false and this is so from comment 1. In the latter case comment 2 follows from comment 5 noting the appropriate change in I caused by the for statement on I. Comment 4 is reached only from comment 3 and the if statement. Since the aces are represented by 13 · (SUIT - 1) + 1 = 13 · SUIT - 12, the Boolean expression is true iff HAND[I] is the Kth ace and an outside ace. Accordingly OUTSIDEACE is set true if HAND[I] is an outside ace and otherwise is not changed. Hence comment 3 holds for L = K also, i.e., comment 4 holds. Comments 3, 5 and 6 follow by similar arguments which are omitted. Comment 6 is the lemma to be proved.

OUTSIDEACE terminates since I and K are changed only in the for statements and hence the if statement is executed precisely 52 times. Q.E.D.

This example is also proved in [9] but by the techniques demonstrated in the fifth example below.
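The six comments of Figure 4 can also be written as executable checks. The Python sketch below is an illustration only, not the paper's method: it transliterates OUTSIDEACE and tests each comment with an assert every time control reaches it. The helper names suit_of and is_outside_ace are hypothetical, and "the L-th outside ace" is read here as "the ace of suit L, with L different from SUIT". Running it on sample hands merely tests the assertions on those hands; the proof above is what establishes them for every hand.

```python
def suit_of(card):
    """Suit number 1..4 of a card coded as 13*(suit - 1) + rank value."""
    return (card - 1) // 13 + 1

def is_outside_ace(card, suit):
    """Specification: the card is an ace of some suit other than `suit`."""
    return (card - 1) % 13 == 0 and suit_of(card) != suit

def outside_ace(hand, suit):
    """Mirror OUTSIDEACE, checking comments 1-6 of Figure 4 as it runs.

    `hand` is a list of 13 card codes in 1..52; hand[i - 1] plays the
    role of HAND[I].
    """
    result = False
    assert result is False                                          # comment 1
    for i in range(1, 14):
        card = hand[i - 1]
        earlier = any(is_outside_ace(c, suit) for c in hand[:i - 1])
        assert result == earlier                                    # comment 2
        for k in range(4, 0, -1):
            # comment 3: suits k+1..4 have already been tried on HAND[I]
            assert result == (earlier or
                              (is_outside_ace(card, suit) and suit_of(card) > k))
            if card == 13 * k - 12 and k != suit:
                result = True
            # comment 4: suits k..4 have now been tried on HAND[I]
            assert result == (earlier or
                              (is_outside_ace(card, suit) and suit_of(card) >= k))
        assert result == any(is_outside_ace(c, suit) for c in hand[:i])  # comment 5
    assert result == any(is_outside_ace(c, suit) for c in hand)     # comment 6
    return result

if __name__ == "__main__":
    # All thirteen cards of suit 1: the only ace is the ace of suit 1.
    hand = list(range(1, 14))
    print(outside_ace(hand, 1), outside_ace(hand, 2))
```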

The technique of mathematical induction.

The fourth example comes from [18] and is a utility procedure. Given a card in the form of SUIT and RANK, the Boolean procedure CHECKFOR(SUIT, RANK) is to be true iff the card is in HAND. The same representations are used as in OUTSIDEACE. The code is

    Boolean procedure CHECKFOR(SUIT, RANK); value SUIT, RANK; integer SUIT, RANK;
    begin integer I;
      CHECKFOR := false;
      for I := 1 step 1 until 13 do
        if HAND[I] = 13 · (SUIT - 1) + RANK then
          begin CHECKFOR := true; go to QUIT end;
    QUIT: end of CHECKFOR;

It is straightforward to supply and to verify the necessary assertions for proving this short program terminates correctly. Instead, an alternative recursive version of this program, CHECKFOR1, will be proved to meet the above definition by using the techniques of mathematical induction and case analysis. Thus define

    CHECKFOR1(SUIT, RANK) := CF(SUIT, RANK, 1, 13)

where

    CF(S, R, I, N) := if I > N then false
                      else if HAND[I] = 13 · (S - 1) + R then true
                      else CF(S, R, I + 1, N);

CF is in the form McCarthy [13] called iterative form.

LEMMA 4.1. CF(S, R, I, N) is true iff the card given by S and R is in the hand consisting of HAND[J], J = I, ..., N. CF terminates in all cases.

PROOF. Use backwards induction on I, i.e., induction on N - I. If I >= N + 1, CF terminates with false, the correct result. Assume the lemma holds for N + 1 >= I >= K + 1 and consider I = K. K <= N so control reaches the second if of the definition. If HAND[K] = 13 · (S - 1) + R, HAND[K] is the sought card, and CF terminates with true, the correct result. If HAND[K] is not the sought card, the result is CF(S, R, I + 1, N) which, by the induction assumption for I + 1 = K + 1, terminates with the correct result. Q.E.D.

Using lemma 4.1, the correctness of CHECKFOR1 follows since I = 1 and N = 13 in the call to CF. Assuming CHECKFOR has been proved, this shows that CHECKFOR and CHECKFOR1 are the same Boolean procedure.

The techniques of a prose proof and two others.

The final example computes the quantity POINTS, the point-count of a hand as defined in Figure 1 except that now a doubleton is worth 1 point. The points for specific cards are called high card points; the points for low suitlengths are called distributional points. An Algol block for computing POINTS appears in Figure 5. It assumes that POINTS and the HAND array are global to this block and further assumes the existence of two integer procedures, FINDRANK(CARD) and FINDSUIT(CARD), for computing the rank and the suit of a card according to the conventions used by the OUTSIDEACE example.
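For readers who wish to experiment with the point-count just described, here is a minimal Python sketch written independently of the Algol block of Figure 5. The names find_suit, find_rank and points are hypothetical stand-ins for FINDSUIT, FINDRANK and the POINTS computation; the card coding 13 · (SUIT - 1) + RANKVALUE is the one given for OUTSIDEACE, and the point values are those of Figure 1 with a doubleton worth 1. It follows the same three steps: count the suit lengths, add the high card points, then add the distributional points.

```python
def find_suit(card):
    """Suit 1..4 of a card coded as 13*(suit - 1) + rank value."""
    return (card - 1) // 13 + 1

def find_rank(card):
    """Rank value 1 (ace) .. 13 (king) of a card code in 1..52."""
    return (card - 1) % 13 + 1

def points(hand):
    """Point-count of a hand: high card points plus distributional points."""
    # Step 1: length of each suit.
    length = {suit: 0 for suit in (1, 2, 3, 4)}
    for card in hand:
        length[find_suit(card)] += 1

    # Step 2: high card points; honours below the ace need enough
    # cards in their suit to count.
    total = 0
    for card in hand:
        rank = find_rank(card)
        ls = length[find_suit(card)]
        if rank == 1:                      # ace
            total += 4
        elif rank == 13 and ls >= 2:       # king
            total += 3
        elif rank == 12 and ls >= 3:       # queen
            total += 2
        elif rank == 11 and ls >= 4:       # jack
            total += 1

    # Step 3: distributional points: void 3, singleton 2, doubleton 1.
    for suit in length:
        total += max(0, 3 - length[suit])
    return total

if __name__ == "__main__":
    # All thirteen cards of suit 1: A + K + Q + J plus three voids.
    print(points(list(range(1, 14))))
```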


     0  begin integer I, LS, RANK, SUIT; integer array LENGTH[1:4];
     1    POINTS := 0;
     2    LENGTH[1] := LENGTH[2] := LENGTH[3] := LENGTH[4] := 0;
     3    for I := 1 step 1 until 13 do
     4      begin SUIT := FINDSUIT(HAND[I]);
     5        LENGTH[SUIT] := LENGTH[SUIT] + 1
     6      end;
     7    for I := 1 step 1 until 13 do
     8      begin RANK := FINDRANK(HAND[I]);
     9        if 2 <= ...
        ...
    13        ... >= 2
    14          then POINTS := POINTS + 3
    15          else if RANK = 12 and LS >= 3
    16          then POINTS := POINTS + 2
    17          else if RANK = 11 and LS >= 4 then POINTS := POINTS + 1;
    18      DONE: end;
    19    for I := 1 step 1 until 4 do
    20      begin LS := LENGTH[I] - 3;
    21        if LS < 0 then POINTS := POINTS - LS
    22      end
    23  end of POINTS computation;

Figure 5. Algol block for computing POINTS.

The code naturally divides into four sections each of which will be proved (lemmas 5.2-5.5). The proofs could take the form of verified assertions, but instead it will be demonstrated that it is possible to give a convincing, yet still rigorous, proof by standard mathematical arguments given in prose form. This example also demonstrates the obvious but often overlooked technique of proving a program a section at a time. This is useful for many of the same reasons that one debugs a program one section at a time and that one uses subroutining in programming. The usefulness of a table such as the one in lemma 5.1 is also shown. The first lemma gives an overall property of the code.

LEMMA 5.1. Values of variables are changed only as follows:

    Variable      Changed at Line Number
    HAND[ ]       (none)
    I             3, 7, 19
    LENGTH[ ]     2, 5
    LS            12, 20
    POINTS        1, 11, 14, 16, 17, 21
    RANK          8
    SUIT          4


PROOF. Inspection of the code.

LEMMA 5.2. Lines 1-2 initialize POINTS and the four elements of the LENGTH array to 0.

PROOF. Obvious.

LEMMA 5.3. Lines 3-6 compute the LENGTH of each suit without altering POINTS.

PROOF. For fixed I, i.e., each card, line 4 finds the suit of the card and line 5 increments the proper LENGTH element. Lines 3, 4 (the begin) and 6 and lemma 5.1 for I ensure that each of the 13 cards is counted exactly once. Thus after line 6, using lemma 5.1, POINTS = 0 from line 1 and LENGTH[J], J = 1, ..., 4, is the correct length of the Jth suit.

LEMMA 5.4. Lines 7-18 add the high card points to POINTS without altering the LENGTH array.

PROOF. For fixed I, i.e., each card, line 8 finds the rank of the card with 1 <= RANK <= 13. If the rank is 2-10, POINTS is unchanged (lemma 5.1) and control goes to DONE at line 18. If the rank is one (an ace), 4 points are added and control goes to DONE. At line 12 RANK must be 11, 12 or 13 by lines 9-11. Line 12 computes LS, the length of the suit of the card. Lines 13-17 change POINTS at most once according to the point-count and control goes to DONE. Lines 7, 8 and 18 ensure that each card is counted exactly once. Thus after line 18, using lemma 5.1, POINTS equals the high card points of the hand and the LENGTH array is unchanged from line 6.

LEMMA 5.5. Lines 19-22 add the distributional points to POINTS.

PROOF. For fixed I, i.e., each suit, line 20 computes LS, the length of the suit less 3. That is,

    LENGTH[I]    LS      -LS
    >= 3         >= 0
    2            -1      1
    1            -2      2
    0            -3      3

Hence line 21 adjusts POINTS by -LS if LS is negative, agreeing with the distributional points of the point-count. Lines 19, 20 and 22 ensure that each of the four suits is counted exactly once.

LEMMA 5.6. The Algol block in Figure 5 computes POINTS as required.

PROOF. The code is executed by sections in the order covered in lemmas 5.2-5.5. Thus after line 22, POINTS is correct, namely the high card points plus the distributional points. Since the for variable I is not changed in the body of any for loop (lemma 5.1), it is clear that the code terminates. Q.E.D.

The presentation of the proof of the last example is patterned after the so-called "inside-out" strategy of programming: start with the innermost code and work out. That is, lemmas 5.2-5.5 correspond to inner code and lemma 5.6 to the outermost code.

Conclusion.

The feasibility of proving the correctness of programs, at least small ones, should be clear from the preceding five examples. Techniques of proof have been presented by example. Larger, more realistic programs can be proved in roughly the same manner. This claim is supported in another paper [9] which discusses in detail techniques and strategies of program proving. That paper also summarizes the salient features of five additional proofs [4, 6, 8, 12, 14] which are more representative of real-world proofs.

It is not suggested that other techniques, including uninvented ones, are unnecessary nor that an arbitrary program can be proved using this set of techniques. Instead the techniques are but reasonable starting points for the human as he employs his ingenuity and creativity in stating what needs to be proved and in constructing a proof. However, he now has existing proofs to serve as models, and he has available techniques of proof [22].

As the reader surely noted, there are several assumptions implicit in this type of proof of correctness, for example that (i) a common understanding of the programming language exists between people (no explanation of the semantics of any code is here given), (ii) the problem domain is specified sufficiently and (iii) the proof is error-free. If one also considers the environment in which the program is executed, then correctness can at best be assumed for the compiler or interpreter, the operating system and the hardware.


That a proof depends upon these, and possibly other assumptions, is not reason to avoid proofs entirely. A completed proof of correctness, even with these limitations, is advantageous because (i) it gives sufficient reasons why the program must be correct, (ii) it makes explicit the assumptions on which correctness rests, certainly more so than the program alone does and (iii) it leads to increased understanding of the program. More programs can and should be proved correct.

Acknowledgements. This work is supported by NSF Grant GP 7069 and the Mathematics Research Center, United States Army under Contract Number DA-31124-ARO-D-462.

REFERENCES

1. Cooper, D. C., Mathematical proofs about computer programs, Machine Intelligence 1, Collins, N. L. and Michie, D. (Eds.), American Elsevier, New York, 1967, 17-28.
2. Dijkstra, E. W., A constructive approach to the problem of program correctness, BIT, Vol. 8, No. 3, 1968, 174-186.
3. Dijkstra, E. W., The structure of the "THE"-multiprogramming system, Comm. ACM, Vol. 11, No. 5, May, 1968, 341-346.
4. Evans, A., Jr., Syntax analysis by a production language, Ph.D. thesis, Carnegie-Mellon University, 1965.
5. Floyd, R. W., Assigning meanings to programs, Proc. of a Symposium in Applied Mathematics, Vol. 19, Mathematical Aspects of Computer Science, Schwartz, J. T. (Ed.), American Mathematical Society, Providence, R.I., 1967, 19-32.
6. Good, D. I. and London, R. L., Interval arithmetic for the Burroughs B5500: Four Algol procedures and proofs of their correctness, Computer Sciences Technical Report No. 26, University of Wisconsin, 1968. See also [21].
7. Knuth, D. E., The Art of Computer Programming, Vol. 1, Fundamental Algorithms, Addison-Wesley, Reading, Mass., 1968, section 1.2.1.
8. London, R. L., Correctness of the Algol procedure ASKFORHAND, Computer Sciences Technical Report No. 50, University of Wisconsin, 1968.
9. London, R. L., Computer programs can be proved correct, Theoretical Approaches to Nonnumerical Problem Solving, Proc. of Systems Symposium at Case Western Reserve University, Banerji, R. B. and Mesarovic, M. D. (Eds.), Springer-Verlag, 1970, 281-303.
10. London, R. L., Proof of algorithms: A new kind of certification (Certification of Algorithm 245 TREESORT 3), Comm. ACM, to appear.
11. London, R. L. and Halton, J. H., Proofs of algorithms for asymptotic series, Computer Sciences Technical Report No. 54A, University of Wisconsin, 1969.
12. London, R. L. and Wasserman, A. I., The anatomy of an Algol procedure, Computer Sciences Technical Report No. 5, University of Wisconsin, 1967.
13. McCarthy, J., Towards a mathematical science of computation, Information Processing 1962, Proc. of IFIP Congress 62, Popplewell, C. M. (Ed.), North-Holland, Amsterdam, 1963, 21-28.
14. McCarthy, J. and Painter, J. A., Correctness of a compiler for arithmetic expressions, Proc. of a Symposium in Applied Mathematics, Vol. 19, Mathematical Aspects of Computer Science, Schwartz, J. T. (Ed.), American Mathematical Society, Providence, R.I., 1967, 33-41.
15. Naur, P., Proof of algorithms by general snapshots, BIT, Vol. 6, No. 4, 1966, 310-316.
16. Newell, A. et al., Information Processing Language-V Manual, Second Edition, Prentice-Hall, Englewood Cliffs, N.J., 1964, 48-67.
17. Niven, I., Mathematics of Choice, Random House, New York, 1965, 120.
18. Wasserman, A. I., Achievement of skill and generality in an artificial intelligence program, Ph.D. thesis to be submitted to University of Wisconsin, 1970.
19. Weissman, C., Lisp 1.5 Primer, Dickenson, Belmont, Calif., 1967, 157-159.
20. Wood, D., A proof of Hamblin's algorithm for translation of arithmetic expressions from infix to postfix form, BIT, Vol. 9, No. 1, 1969, 59-68.
21. Good, D. I. and London, R. L., Computer interval arithmetic: Definition and proof of correct implementation, J. ACM, to appear.
22. London, R. L., Bibliography on proving the correctness of computer programs, Machine Intelligence 5, Meltzer, B. and Michie, D. (Eds.), Edinburgh University Press, Edinburgh, 1970, 569-580.
23. Tobey, R. G., Rational function integration, Ph.D. thesis, Harvard University, 1967.

COMPUTER SCIENCES DEPARTMENT AND
MATHEMATICS RESEARCH CENTER, UNITED STATES ARMY
UNIVERSITY OF WISCONSIN
MADISON, WISCONSIN 53706
