Version 0.20 Last updated 2002-10-13-12:32 -0700 (pdt)
Summary
References
What is the relationship between algorithms and their implementations that is important to the mastery of computer programming? Is it foundational merely for computer science or does it inform the development of all successful computing practice? Is there a firm distinction or is it conceptual and qualitative?
Reviewing Sedgewick's approach leads me to explore the separateness of algorithms more closely. I conclude that writing down an algorithm in any form commits one to a framework, like it or not. Formalism is like that.
Even so, I maintain, differentiating statements of algorithms from programs is indispensable. Abstracting algorithms encourages students and practitioners to develop their ability to identify, operate with, and interpret levels of abstractions. It is at the heart of computing.
-- Dennis E. Hamilton
2001 January 7
Every time we look at the implementation of an algorithm, we see more than the algorithm. An implementation always adds more detail than what is essential to the algorithm. An implementation is also likely to take something away from the algorithm.
There is an useful example on the Miser Project page of notations for code and data: two implementations of the same algorithm are presented. One implementation is in Python, the other is in C Language. The implementations are clearly different, far more than the superficial similarity of language constructs reveals. But the algorithm is the same. It is a flavor of Donald E. Knuth's Algorithm 6.2.1B.
Here is the more-general implementation, the one in Python:
def BinSearch(List, item): "return the index of item in the already-sorted List" "return -1 when item is not in List" "based on Donald E. Knuth's Algorithm 6.2.1B" " cleaned up to work with 0-origin lists, tuples, and indexes" L = 0 # Current left-most position R = len(List) # Current right boundary while L < R: j = (L + R) / 2 # A place to try in the if List[j] == item: # candidate interval return j if List[j] < item: # When no match yet, close off L = j + 1 # the area where item can't be else: # and continue in the interval R = j # that remains return -1 # Return impossible index if no match
This version is similar to a statement of the essential algorithm, yet it does not provide a precise statement of the algorithm. Far from it.
Notice that there are a number of assumptions lurking behind this implementation:
We assume that
len(List)
is always representable as an exact integer value and there is no risk ofList
having more elements than the largest representable integer. It is further assumed that (len(List)-1)+len(List)
(the worst-case value ofL+R
) is also computed precisely.The Python version operates properly with any
item
andList
-item data type such that the values are comparable and are simply ordered by operator <.
BinSearch
employs an impossible index (for Python lists) of-1
to report unsuccessful completion of the search. The implementation assures that this value cannot arise as a successful result.The program variables
L
,R
, andj
are private to the implementation and the only operations with them are those that are shown.The parameters
List
anditem
are assumed to be stable and non-volatile, with no alteration occurring to them whileBinSearch
is being carried out.The operations are clearly stated, but the identification of the particular algorithm and preservation of essential conditions is by appeal to a separate publication [ACP 6.2.1B]
To see the difference between the statement of an algorithm for binary search and the
BinSearch
implementation, we will use the style of algorithm presentation introduced by Donald Knuth[Knuth97: 1.1]:Algorithm Bin (Binary search). Given a list of items K0, K1, ..., KN-1
in non-decreasing order K0 < K1 < ··· < KN-1,
this algorithm finds a position, if any, having a given item, K.
- Bin1. [Initialize.]
- Set L to 0; set R to N, the number of items.
(If there are no items, N is 0.)- Bin2. [Out of Luck?]
- Let Rem = R - L, the number of eligible items remaining.
If Rem = 0, terminate the search as unsuccessful.
(Each time we return to Bin2, Rem is smaller and we must eventually find a matching item or terminate with all possibilities eliminated. See Bin4 and Bin5 to confirm that.)- Bin3. [Take a Look.]
- (L < R and if K is among the eligible items then KL < K < KR-1.)
Set j to floor((L+R)/2).- (j is always an index within one place of the middle of the interval L through R-1. L < j < R-1 always.)
If Kj < K, go to Bin4. If K < Kj, go to Bin5.
Otherwise, the search is successful and Kj is a matching item.- Bin4. [Look Higher Next.]
- (If K is among the eligible items, then Kj+1 < K < KR-1.)
- Set L to j+1. Go to Bin2.
(0 < R - L < Rem. We have eliminated at least half of the current items as ineligible.)- Bin5. [Look Lower Next.]
- (If K is among the eligible items, then KL < K < Kj-1.)
Set R to j. Go to Bin2.
(0 < R - L < Rem. We have eliminated at least half of the current items as ineligible.)The statement of the algorithm relies on a supposedly pre-existing framework. We will discuss that more in section 2. For now, it is valuable to notice that there is an implicit framework and perhaps ponder (1) how much of it we take for granted in reading the algorithm and (2) how much the author of this description takes for granted about what needs to be said and what it takes to understand it.
This statement of the algorithm is contrived so that someone skilled in the analysis of algorithms can independently confirm some important characteristics:
- The preconditions in which the algorithm is assured to operate are clearly established.
- We rely on mathematical objects, not computational ones, so there is a generality beyond the incomplete data representations that arise in typical implementations.
- All operations are definite and presumed to be carried out in finite time and space. This includes use of floor(x) for the largest integer not greater than x.
- At the same time, we are not so definite as to fix the operations in a rigid way. It is presumed that there is or can be made to be such a fixation when the algorithm is embodied in an implementation.
- No indefinite operations (e.g., reference to KN or K-1) are ever attempted.
- The algorithm must terminate.
- If the input conditions are satisfied, the algorithm accurately determines an index for a K-matching item if and only if there is one. The algorithm will terminate unsuccessfully if and only if there is no K-matching item.
- One can analyze the performance of the algorithm in terms of the number of times step Bin3 is ever performed.
- There is a way of speaking of algorithms as if they exist, yet the statements of them are not the algorithms (not unlike the concern we began with, involving recognition of algorithms embodied in implementations).
Also, notice that the annotations made as parenthetical comments are not strictly part of the statement of the algorithm. The annotations provide assertions about the algorithm that are important in comprehending the achievement of its function and of its efficacy. The annotations also support confirmation that the recipe given is indeed that of an algorithm for the identified function.
We are looking here at a few ideas that might be worth developing separately:
1. That the statement of an algorithm has a certain looseness to it that is important. Yet it is easy to reach agreement and demonstration that there is an interpretation of the statement in an implementation that preserves the essentials of the algorithm.
2. That implementations can be viewed as interpretations of (formal or informal) algorithms
3. That the annotations are indispensible to the statement of the algorithm and it being seen as satisfying the requirements for for accomplishment of the stated purpose with an algorithmic solution.It will probably be valuable to identify what the technical conditions on algorithms are and how they are satisfied with this particular example. I will do that in Miser Note N010207: What's an Algorithm?
At some point, we want to nail down this way of speaking about the statement of an algorithm, versus the algorithm, versus an implementation of the algorithm in a computer program. We want to establish that, whatever the difficulties that even the statement of an algorithm introduces on "grokking" the algorithm, it is valuable to take that route. (2001-02-20)
When I am working with statements of algorithms and implementations, I toggle back and forth between them. Experiences in developing the implementation suggest refinements -- usually simplifications -- to the algorithm and its statement that are then reflected by economies in the implementation. The process can continue until I am satisfied with the result.
When I developed
BinSearch
, I started by consulting Knuth's statement of Algorithm 6.2.1B. I immediately introduced transformations that allowed my preferred style to be expressed inBinSearch
,. I satisfied myself that none of the transformations sacrificed anything that is essential to preserving the algorithm, its correctness, or its performance.I did all of this informally, the objective at the time being providing an useful worked example of notational conventions and their application, not demonstrating algorithmic analysis.
In questioning whether programs teach algorithms, I took a fresh look. Although I again consulted Algorithm 6.2.1B, this time I created Algorithm Bin by re-abstraction from BinSearch. I Also restated it in a way where I could demonstrate that I had captured everything that there is in Algorithm 6.2.1B, and more.
Discuss how the statement of Algorithm Bin has a pleasant separation of concerns. It is abstracted completely away from any application. Contrast this with Knuth's statement, and also with Sedgewick's. This is an useful illustration in the abstraction of design from algorithm and of algorithm from implementation. (Both Knuth and Sedgewick cast the algorithm in the context of a wider purpose. We can reflect that in the discussion of design vs. algorithm.) 2001-02-20.
Talk about why I don't go the final step and usewhile (L != R):
, leaving the safety cord attached even when it is demonstrably unnecessary.
In creating Algorithm Bin as a re-abstraction ofBinSearch
, it is not like I walked up with a blank mind and then figured out an abstraction that gave me an algorithm I could use. I was certainly aware of the function ofBinSearch
and what it was intended to accomplish by me, its author.We can consider decorating code with information about the algorithm. This might be a form of tangling that has the algorithm carried with the code so that someone can inspect it as a whole.
The value of that is that the code is tied to the algorithmic statement that it provides a valid interpretation of. It puts rigor around what the intended function is, and the only knowledge of the implementation that it is safe for an user of the implementation to make (assuming that one might want to substitute a different implementation that preserves the agreed interface and the algorithm).
If I were to do that, I would want a very clear notation for distinguishing discussion about the implementation from the annotations that tie the implementation to the algorithm. This might make for some wordy code. It would certainly have a valuable self-contained property to it. It is apparently not what most programmers want to look at.
I would also want a way of tying implementation conditions and restraints, so that the mapping of the algorithm is very clear, and the restraints of the implementation are also separately very clear. I am thinking of constraints like Bin's L+R being a defined quantity in the operational system of the implementation.
If I attempt this, it will be by coming up with a new version ofBinSearch
here.
Point out that we pretty much always say more than is essential. And it is still important to have a form of expression that has it be understood that the algorithm is definitely and always more abstract than the implementation.
Every expression of an algorithm has baggage. This includes assumptions about the existence of orderings, the assumption of 0-origin indexing, the use of natural numbers for indexes, and the stability of the objects involved. That the algorithm is effectively carried out atomically, while itself being composed of definite steps, is a key assumption in almost all expression of algorithms. This is part of the framework.
Then there is the simple use of ordinary language along with borrowing of mathematical objects and notations, all more framework for expressing algorithms.
The value of using ordinary language and distinct notations from that for program code must not be underestimated. It gives us a way of distinguishing the two, and of being able to establish what is essential and what is incidental (but perhaps inescapable baggage) of the given implementation. We can also begin to separate the baggage of the algorithmic framework and the baggage of the implementation system.
It is important that we appreciate the value of having more than one perspective on this. It gives us a place to separate problem from solution, and be clear which is a statement of what. the algorithm-implementation separation contributes to the separation of the abstraction from the statements of it.
Consider that the level of abstraction for statements of algorithms might need to be improved.
One way is to notice that the objects appealed to in the statement of the algorithm might be at the wrong level. A large part of the baggage in our example is related to the level at which we deal with ordered lists via indexes. There might be a way to change the level such that we address lists in other terms that are somehow liberated from the baggage about indexes and their origins. It might be a quite different algorithm, assuming quite different basic operations, and we won't go there.
Another approach is to make reliance on abstract elements explicit rather than tacit. What we might get to is a form of expression of algorithms that can be taken as concrete, given a representation of the abstract elements.We are always going to have algorithmic baggage. Can we arrange baggage in such a way that the baggage is understandable and the abstracted algorithm is pretty clear? We take a look at a toy case using an abstract structure for the bit.
This might be better described as manifesting abstractions. That is, we want to make the subject-matter of the algorithm, the structures that are understood, explicitly abstracted so that we can see what the dependencies are. To do this, we still need a framework, but one that makes explicit that which was originally tacit. (2001-02-20, 2001-02-23)I don't know that I want to go far down this road here. What I might want to say is more about how algorithms are not what we write down. When we make a statement of an algorithm, we provide an expression in which is to be seen a method. What we are up to here is finding a couple of viewpoints that allow the abstraction, the algorithm, to be discerned. The idea of manifesting abstractions is an useful one, but the illustration of bits and such creates a long diversion. What I am going to do is posit the value of manifesting abstractions, include a diagram, and leave it like that. I argue that it is particularly useful to have the statement of an algorithm and also an implementation. The difference between stating an algorithm and being one is also important and perhaps most to the point here. (2001-02-23)
We then look at what may well be the likely case in a practical situation.
1. We confess that we came in with a point of view about conceptual abstractions, and make the best of it.
2. We look at the motivation for doing this, and why we don't want to collapse algorithms and implementations together. Part of it has to do with coming to grips with what is essential. We have a practical example of what is available when we bother to do that. (I have another example in the Dr. Dobb's paper on integer powers.)
3. There is also an example in why we often do applied mathematics the way we do, whether balancing a checkbook or building a table or playing pool or pitching at baseball, when we bother to look at these activities that way.There is an example of design versus algorithm in what we want to represent versus how we are representing it (algorithmically). Also, the combination of algorithms that go into the design of something can have algorithms at a quite different level of abstraction than that of the design.
We should also differentiate a requirement from a design. That is, something that provides a purposive setting that any design must fit within as its context. The design is a bridge between the requirements and the computational methods that will satisfy the requirements. The design tends to allocate functions, and these functions will be mechanized using methods and algorithms. There are different abstractions at these different levels. And they are related but not in a strict way. The design establishes what is essential about those relationships. (2001-02-23)
- [ACP 6.2.1B]
- Knuth, Donald E. Binary Search. Algorithm 6.2.1B on p. 410 of The Art of Computer Programming, v.3: Sorting and Searching. ed.2. Addison-Wesley Longman (Reading, MA: 1998). ISBN 0-201-89685-0.
BinSearch
is derived from the form given by Knuth in Scientific American 236, 4 (April 1977), as reprinted on pp.66-67 of Selected Papers on Computer Science, CSLI (Palo Alto: 1996), ISBN 1-881526-91-7 pbk.
ThisBinSearch
is adjusted to avoid wondering whatm/2
might be for integerm<0
, while preserving a clean terminating condition and loop invariant. It is required that(2*i+1)/2
bei
for non-negativei
.- [Knuth97]
- Knuth, Donald E. The Art of Computer Programming, vol.1: Fundamental Algorithms. ed.3. Addison Wesley Longman (Reading, MA: 1997). ISBN 0-201-89683-4.
- [Pólya57]
- Pólya, George. How to Solve It. ed.2. Princeton University Press (Princeton, NJ: 1945, 1957). ISBN 0-691-08097-6.
- [Sedgewick89]
- Sedgewick, Robert. Algorithms. Second edition. Addison-Wesley (Reading, MA: 1983, 1988). 1989 reprint with authors corrections. ISBN 0-201-06673-4.
- 0.20 2001-02-23 Separate extended material (orcmid)
- A long-winded version of this note is frozen and separated. This draft is now available for pruning down to the essential ideas, with references to articles having extended development under the Miser Project.
- 0.10 2001-02-23 Prepare for division into separate articles (orcmid).
- This note becomes the common basis for several articles that address the underlying themes more carefully.
- 0.00 2001-01-04 Initiate "Do Programs Teach Algorithms?" (orcmid)
- On re-examinination of Sedgwick's Algorithms, I noticed that there seems to be a collapse of algorithms with their implementations, and that algorithms are not directly addressed. This reminded me of earlier work, "Programs Are Rarely Algorithms" that I'd discussed with Bill Anderson in the 80's. It seemed important to speak up about it here, especially in light of my inquiry into "concrete abstractions" (now termed manifest abstractions).
created 2001-01-04-12:40 -0800 (pst) by orcmid
$$Author: Orcmid $
$$Date: 03-05-25 10:50 $
$$Revision: 17 $