This is an extraction of R010101a, which is to be credited after this is edited. All of the material between here and the construction block at the bottom requires rewriting and revision.
Algorithms as specifications for programs. A dangerous idea? Like executable specifications?
What are formal algorithms?
Satisfying ourselves that Programs are rarely algorithms.
We also want to move into the value of computational methods separate from algorithms.
We want to move to the idea of computations as something that can occur on a larger scale than algorithms, as between cooperating / coordinated processes. The computation is in the aggregation or the network, as some would say, and is not revealed in the computational methods of the individual participants.
Version 0.10 Last updated 2006-06-08-19:51 -0700 (pdt)
This is a snapshot of in-progress work. There is no further development on this note. Continuation of this inquiry is in the following and related materials:
R010101: Do Programs Teach Algorithms?
Miser Note N010201: Manifest Abstractions
Miser Note N010202: Homage to the Bit
Miser Note N010204: Framework Versus Structure
Miser Note N010205: Programs Are Rarely Algorithms
- see also:
- 2006-03-08 Professor von Clueless: What We See Is Not What We Get - Character Codes and Computers
This is becoming too long and wandering quite a ways beyond the original inspiration in Algorithms. The article will be separated as an independent work and this commentary will simply link to it. The article will still be on Do Programs Teach Algorithms, but the longer, detailed excursions will be pulled into articles of their own (such as Homage to the Bit and others being worked up as part of the Miser Project.)
Summary
References
What is the relationship between algorithms and their implementations that is important to the mastery of computer programming? Is it foundational merely for computer science or does it inform the development of all successful computing practice? Is there a firm distinction or is it conceptual and qualitative?
Reviewing Sedgewick's approach leads me to explore the separateness of algorithms more closely. I conclude that writing down an algorithm in any form commits one to a framework, like it or not. Formalism is like that.
Even so, I maintain, differentiating statements of algorithms from programs is indispensable. Abstracting algorithms encourages students and practitioners to develop their ability to identify, operate with, and interpret levels of abstractions. It is at the heart of computing.
Every time we look at the implementation of an algorithm, we see more than the algorithm. An implementation always adds more detail than what is essential to the algorithm. An implementation is also likely to take something away from the algorithm.
There is an useful example on the Miser Project page of notations for code and data: two implementations of the same algorithm are presented. One implementation is in Python, the other is in C Language. The implementations are clearly different, far more than the superficial similarity of language constructs reveals. But the algorithm is the same. It is a flavor of Donald E. Knuth's Algorithm 6.2.1B.
Here is the more-general implementation, the one in Python version 1.5.2 for win32:
def BinSearch(List, item): "return the index of item in the already-sorted List" "return -1 when item is not in List" "based on Donald E. Knuth's Algorithm 6.2.1B" " cleaned up to work with 0-origin lists, tuples, and indexes" L = 0 # Current left-most position R = len(List) # Current right boundary while L < R: j = (L + R) / 2 # A place to try in the if List[j] == item: # candidate interval return j if List[j] < item: # When no match yet, close off L = j + 1 # the area where item can't be else: # and continue in the interval R = j # that remains return -1 # Return impossible index if no match
This version is similar to a statement of the essential algorithm, yet it does not provide a precise statement of the algorithm. Far from it.
Things to notice:
We assume that
len(List)
is always representable as an exact integer value and there is no risk ofList
having more elements than the largest representable integer. It is further assumed that(2*len(List)-1 (the worst-case value oflen(List)-1)+len(List)
L+R
) is also computed precisely. [dh:2006-06-08 You can guard this assumption on practically all platforms by asserting that len(List) < 2*len(List). Why this works is left as a valuable exercise for the student of programming-language implementations.]The Python version operates properly with any
item
andList
-item data type such that the values are comparable and are simply ordered by operator <.
BinSearch
employs an impossible index (for Python 1.5.2 lists) of-1
to report unsuccessful completion of the search. The implementation assures that this value cannot arise as a successful result.The program variables
L
,R
, andj
are private to the implementation and the only operations with them are those that are shown.The parameters
List
anditem
are assumed to be stable and non-volatile, with no alteration occuring to them whileBinSearch
is being carried out.The operations are clearly stated, but the identification of the particular algorithm and preservation of essential conditions is by appeal to a separate publication [ACP 6.2.1B]
[dh:2006-06-08 It is also important to know the version of the language and its implementation for which this code implements a correct binary search algorithm. There is an assumption about array bounds and the origin of indices that is tacitly reflected in this code. There is also an assumption about division of integers that is a breaking change for later versions of Python. These are all indicative of the degree to which this program is not reliably the algorithm.]
To see the difference between the statement of an algorithm for binary search and the
BinSearch
implementation, we will use the style of algorithm presentation introduced by Donald Knuth[Knuth97: 1.1]:Algorithm Bin (Binary search). Given a list of items K0, K1, ..., KN-1
in non-decreasing order K0 < K1 < ··· < KN-1,
this algorithm finds a position, if any, having a given item, K.
- Bin1. [Initialize.]
- Set L to 0; set R to N, the number of items.
(If there are no items, N is 0.)- Bin2. [No Luck?]
- Let Rem = R - L, the number of eligible items remaining.
If Rem = 0, terminate the search as unsuccessful.
(Each time we return to Bin2, Rem is smaller and we must eventually find a matching item or terminate with all possibilities eliminated. See Bin4 and Bin5 to confirm that.)- Bin3. [Take a Look.]
- (L < R and if K is among the eligible items then KL < K < KR-1.)
Set j to floor((L+R)/2).- (j is always an index within one place of the middle of the interval L through R-1. L < j < R-1 always.)
If Kj < K, go to Bin4. If K < Kj, go to Bin5.
Otherwise, the search is successful and Kj is a matching item.- Bin4. [Look Higher Next.]
- (If K is among the eligible items, then Kj+1 < K < KR-1.)
- Set L to j+1. Go to Bin2.
(0 < R - L < Rem. We have eliminated at least half of the current items as ineligible.)- Bin5. [Look Lower Next.]
- (If K is among the eligible items, then KL < K < Kj-1.)
Set R to j. Go to Bin2.
(0 < R - L < Rem. We have eliminated at least half of the current items as ineligible.)The statement of the algorithm relies on a supposedly pre-existing framework. We will discuss that more in section 2. For now, it is valuable to notice that there is an implicit framework and perhaps ponder (1) how much of it we take for granted in reading the algorithm and (2) how much the author of this description takes for granted about what needs to be said and what it takes to understand it.
This statement of the algorithm is contrived so that someone skilled in the analysis of algorithms can independently confirm some important characteristics:
- The preconditions in which the algorithm is assured to operate are clearly established
- We rely on mathematical objects, not computational ones, so there is a generality beyond the incomplete data representations that arise in typical implementations.
- All operations are definite and presumed to be carried out in finite time and space. This includes use of floor(x) for the largest integer not greater than x.
- At the same time, we are not so definite as to fix the operations in a rigid way. It is presumed that there is or can be made to be such a fixation when the algorithm is embodied in an implementation.
- No indefinite operations (e.g., reference to KN or K-1) are ever attempted.
- The algorithm must terminate.
- If the input conditions are satisfied, the algorithm accurately determines an index for a K-matching item if and only if there is one. The algorithm will terminate unsuccessfully if an only if there is no K-matching item.
- One can analyze the performance of the algorithm in terms of the number of times step Bin3 is ever performed.
- There is a way of speaking of algorithms as if they exist, yet the statements of them are not the algorithms (not unlike the concern we began with, involving recognition of algorithms embodied in implementations).
Also, notice that the annotations made as parenthetical comments are not strictly part of the statement of the algorithm. The annotations provide assertions about the algorithm that are important in comprehending the achievement of its function and of its efficacy. The annotations also support confirmation that the recipe given is indeed that of an algorithm for the identified function.
We are looking here at a few ideas that might be worth developing separately:
1. That the statement of an algorithm has a certain looseness to it that is important. Yet it is easy to reach agreement and demonstration that there is an interpretation of the statement in an implementation that preserves the essentials of the algorithm.
2. That implementations can be viewed as interpretations of (formal or informal) algorithms
3. That the annotations are indispensible to the statement of the algorithm and it being seen as satisfying the requirements for for accomplishment of the stated purpose with an algorithmic solution.It will probably be valuable to identify what the technical conditions on algorithms are and how they are satisfied with this particular example. I don't want to do that here. Perhaps in a companion article, or a brief sidebar.
At some point, we want to nail down this way of speaking about the statement of an algorithm, versus the algorithm, versus an implementation of the algorithm in a computer program. We want to establish that, whatever the difficulties that even the statement of an algorithm introduces on "grokking" the algorithm, it is valuable to take that route. (2001-02-20)
Someplace we might talk about focal attention versus subordinate intention, the distinctions that Polyani introduced concerning language. That and how statements of algorithms are methods and there is a relationship to behavior that always has the algorithm be "off the paper" and yet something we might see ourselves having our focal attention on when we examine the statement of the algorithm. (2001-02-20)
When I am working with statements of algorithms and implementations, I toggle back and forth between them. Experiences in developing the implementation suggest refinements -- usually simplifications -- to the algorithm and its statement that are then reflected by economies in the implementation. The process can continue until I am satisfied with the result.
When I developed
BinSearch
, I started by consulting Knuth's statement of Algorithm 6.2.1B. I immediately introduced transformations that allowed my preferred style to be expressed inBinSearch
,. I satisfied myself that none of the transformations sacrificed anything that is essential to preserving the algorithm, its correctness, or its performance.I did all of this informally, the objective at the time being providing an useful worked example of notational conventions and their application, not demonstrating algorithmic analysis.
In questioning whether programs teach algorithms, I took a fresh look. Although I again consulted Algorithm 6.2.1B, this time I created Algorithm Bin by re-abstraction from BinSearch. I Also restated it in a way where I could demonstrate that I had captured everything that there is in Algorithm 6.2.1B, and more.
Discuss how the statement of Algorithm Bin has a pleasant separation of concerns. It is abstracted completely away from any application. Contrast this with Knuth's statement, and also with Sedgewick's. This is an useful illustration in the abstraction of design from algorithm and of algorithm from implementation. (Both Knuth and Sedgewich cast the algorithm in the context of a wider purpose. We can reflect that in the discussion of design vs. algorithm.) 2001-02-20.
Talk about why I don't go the final step and usewhile (L != R):
, leaving the safety cord attached even when it is demonstrably unnecessary.
In creating Algorithm Bin as a re-abstraction ofBinSearch
, it is not like I walked up with a blank mind and then figured out an abstraction that gave me an algorithm I could use. I was certainly aware of the function ofBinSearch
and what it was intended to accomplish by me, its author.We can consider decorating code with information about the algorithm. This might be a form of tangling that has the algorithm carried with the code so that someone can inspect it as a whole.
The value of that is that the code is tied to the algorithmic statement that it provides a valid interpretation of. It puts rigor around what the intended function is, and the only knowledge of the implementation that it is safe for an user of the implementation to make (assuming that one might want to substitute a different implementation that preserves the agreed interface and the algorithm).
If I were to do that, I would want a very clear notation for distinguishing discussion about the implementation from the annotations that tie the implementation to the algorithm. This might make for some wordy code. It would certainly have a valuable self-contained property to it. It is apparently not what most programmers want to look at.
I would also want a way of tying implementation conditions and restraints, so that the mapping of the algorithm is very clear, and the restraints of the implementation are also separately very clear. I am thinking of constraints like Bin's L+R being a defined quantity in the operational system of the implementation.
If I attempt this, it will be by coming up with a new version ofBinSearch
here.
Point out that we pretty much always say more than is essential. And it is still important to have a form of expression that has it be understood that the algorithm is definitely and always more abstract than the implementation.
Every expression of an algorithm has baggage. This includes assumptions about the existence of orderings, the assumption of 0-origin indexing, the use of natural numbers for indexes, and the stability of the objects involved. That the algorithm is effectively carried out atomically, while itself being composed of definite steps, is a key assumption in almost all expression of algorithms. This is part of the framework.
Then there is the simple use of ordinary language along with borrowing of mathematical objects and notations, all more framework for expressing algorithms.
The value of using ordinary language and distinct notations from that for program code must not be underestimated. It gives us a way of distinguishing the two, and of being able to establish what is essential and what is incidental (but perhaps inescapable baggage) of the given implementation. We can also begin to separate the baggage of the algorithmic framework and the baggage of the implementation system.
It is important that we appreciate the value of having more than one perspective on this. It gives us a place to separate problem from solution, and be clear which is a statement of what. the algorithm-implementation separation contributes to the separation of the abstraction from the statements of it.
Consider that the level of abstraction for statements of algorithms might need to be improved.
One way is to notice that the objects appealed to in the statement of the algorithm might be at the wrong level. A large part of the baggage in our example is related to the level at which we deal with ordered lists via indexes. There might be a way to change the level such that we address lists in other terms that are somehow liberated from the baggage about indexes and their origins. It might be a quite different algorithm, assuming quite different basic operations, and we won't go there.
Another approach is to make reliance on abstract elements explicit rather than tacit. What we might get to is a form of expression of algorithms that can be taken as concrete, given a representation of the abstract elements.We are always going to have algorithmic baggage. Can we arrange baggage in such a way that the baggage is understandable and the abstracted algorithm is pretty clear? We take a look at a toy case using an abstract structure for the bit.
This might be better described as concretizing abstractions. That is, we want to make the subject-matter of the algorithm, the structures that are understood, explicitly abstracted so that we can see what the dependencies are. To do this, we still need a framework, but one that makes explicit that which was originally tacit. (2001-02-20)
We then look at what may well be the likely case in a practical situation.
1. We confess that we came in with a point of view about conceptual abstractions, and make the best of it.
2. We look at the motivation for doing this, and why we don't want to collapse algorithms and implementations together. Part of it has to do with coming to grips with what is essential. We have a practical example of what is available when we bother to do that. (I have another example in the Dr. Dobb's paper on integer powers.)
3. There is also an example in why we often do applied mathematics the way we do, whether balancing a checkbook or building a table or playing pool or pitching at baseball, when we bother to look at these activities that way.There is an example of design versus algorithm in what we want to represent versus how we are representing it (algorithmically). Also, the combination of algorithms that go into the design of something can have algorithms at a quite different level of abstraction than that of the design.
- [ACP 6.2.1B]
- Knuth, Donald E. Binary Search. Algorithm 6.2.1B on p. 410 of The Art of Computer Programming, v.3: Sorting and Searching. ed.2. Addison-Wesley Longman (Reading, MA: 1998). ISBN 0-201-89685-0.
BinSearch
is derived from the form given by Knuth in Scientific American 236, 4 (April 1977), as reprinted on pp.66-67 of Selected Papers on Computer Science, CSLI (Palo Alto: 1996), ISBN 1-881526-91-7 pbk.
ThisBinSearch
is adjusted to avoid wondering whatm/2
might be for integerm<0
, while preserving a clean terminating condition and loop invariant. It is required that(2*i+1)/2
bei
for non-negativei
.- [Knuth97]
- Knuth, Donald E. The Art of Computer Programming, vol.1: Fundamental Algorithms. ed.3. Addison Wesley Longman (Reading, MA: 1997). ISBN 0-201-89683-4.
- [Pólya57]
- Pólya, George. How to Solve It. ed.2. Princeton University Press (Princeton, NJ: 1945, 1957). ISBN 0-691-08097-6.
- [Sedgewick89]
- Sedgewick, Robert. Algorithms. Second edition. Addison-Wesley (Reading, MA: 1983, 1988). 1989 reprint with authors corrections. ISBN 0-201-06673-4.
- 0.10 2006-06-08-19:51 Coalesce to the Subject Area
- I will preserve the basic sections, but I will abbreviate sections 2-4 to keep the main points only. This article will be converted into the standard folio structure and moved under Astraendo. What I am adjusting immediately is the bound on the sum of the lower and upper indices and also making the version of Python used explicit. (The reason for that is left as a puzzle for the reader.) The deletions are stricken and the insertions are underlined. They are all in section 1.1.) I also used this opportunity to correct the relative links to the proper locations on Orcmid's Lair where this material was originally located.
- 0.00 2002-12-07-22:47 Split off from Orcmid Reading R010101a to provide initial material (orcmid).
- The original too-comprehensive note is pasted here as a boilerplate to be distilled down to the basic Programs Are Rarely Algorithms discussion, if I can remember what I had in mind for that too.
created 2002-12-07-22:47 -0800 (pst) by orcmid
$$Author: Orcmid $
$$Date: 06-06-08 19:52 $
$$Revision: 5 $