The Miser Project
Notations

Version 0.12 Last updated 2002-02-24-15:24 -0800 (pst)


Content

1. Notation Requirements

2. Code and Data Notation

3. Metalinguistic Notations


1. Notation Requirements

We want to use separate fonts and styles of characters to aid in notational separation.  At the same time, we must be prepared to have there only be one available font and limited style selections.  

In using HTML, we are never certain how the default behavior of browsers, use of accessibility aids, and other local variations will cause the material to be presented.  Note that it will be very difficult to understand this page if the typography being discussed has been altered.  Here's a place where there's no workaround.

We could adopt a style of language that makes explicit separation of the domain we are speaking in and the domain we are speaking about.  We could simply not allow duplicate symbologies in different domains.  That is, we couldn't reuse symbols like +, -, =, and common words.  We're not willing to go quite that far. 

2. Code and Data Notation

For specimens and expression of code and data elements, a fixed-pitch font is always used.  The elements are usually shown in bold-face as well, to have it be clear that mention/use of code and data elements is being made.  

This convention is consistent with the example used in the specification of the C Language and its library [Pflauger92].

When prototypes of functions and procedures are given, it is customary to show the formal parameters as italicized code elements.  This does not mean that such an element is expressed in a special typography.  We use this convention to emphasize that we are actually speaking of the values or forms that are supplied as the parameters.  

In code specimens and algorithms, these conventions are also honored, except that we use bold-face elements selectively as a way to improve readability.  Also, we may italicize tags that have roles as variables just as in prototypes of functions.  Comments may be in different fonts, and other typographical techniques may be used to enhance readability.  It is to be understood that there is no such special typography in the actual computer data elements, which are represented as plain text.

Examples:

Using this convention, the McCarthy "91 function" can be expressed this way in a  hypothetical computer language [Lifschitz1991]:

def rec f(x) = if x > 100 then x - 10 else f(f(x+11)) .

In text we would speak of f or f(x) and discuss the behavior of calculations with different values of x.  This is to emphasize that we are discussing computational interpretations and computer processing.  It reduces confusion with similarly-named abstract or mathematical entities and formulae that will often be discussed at the same time. 

The following example is for a brief Python definition to which these conventions have been applied:

def BinSearch(List, item):
    "return the index of item in the already-sorted List"
    "return -1 when item is not in List"
    "based on Donald E. Knuth's Algorithm 6.2.1B"
    "  cleaned up to work with 0-origin lists, tuples, and indexes"
    L = 0                    # Current left-most position
    R = len(List)            # Current right boundary
    while L < R:
        j = (L + R) / 2      # A place to try in the 
        if List[j] == item:  #   candidate interval
            return j
        if List[j] < item:   # When no match yet, close off
            L = j + 1        #   the area where item can't be
        else:                #   and continue in the interval
            R = j            #   that remains
    return -1                # Return impossible index if no match

The following illustrates a sample dialog with a Python processor:

>>> L1=(-8,-3,0,1, 15)
>>> BinSearch(L1,-10)
-1
>>> BinSearch(L1,-8)
0
>>> BinSearch(L1,15)
4
>>> BinSearch(L1,-7)
-1
>>> L2=['Chocolate', 'Strawberry', 'Vanilla']
>>> BinSearch(L2, 'Pistachio')
-1
>>> BinSearch(L2, 'Strawberry')
1

Finally, a small C Language example presented using the notational conventions:

int BinSearch(int List[], int R, int item)
    { /* Return the index of item in the already-sorted List[R].
         Return -1 when item is not in List[0] to List[R-1].
         Based on Donald E. Knuth's Algorithm 6.2.1B cleaned up to
         work with 0-origin arrays. 
         */
      int L = 0;
      int j;
      while (L < R)
         { j = (L + R) / 2;
           if (List[j] == item)
                return j;
           if (List[j] < item)
                L = j+1;
           else R = j;  
           }
      return -1;
      } ;

Notice that there are two different binary search implementations here.  BinSearch|Python and BinSearch|C realize the same algorithm, yet their functions are quite different and not at all equivalent, despite the similarities between the two languages.  Wherever we need to be more rigorous about separation of implementations, we'll use different names for the implemented functions.

3. Metalinguistic Notations

When we provide and discuss the rules of construction and interpretation of concrete languages, we will use special metalinguistic notations.  This helps us be clear that we are using language to talk about language.  

3.1 BNF For Context-Free Grammars

We use Backus-Naur Form (BNF) notation with little modification.  When we are specifying a grammar for data or programming-language code, the literal occurrence of text characters will be presented in ordinary bold-face fixed-pitch characters.  Metalinguistic characters and terms are kept in the overall language of expression, just like mathematical forms:

‹numeral› ::= ‹digit› | ‹digit› ‹numeral›

‹digit› ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

‹numeric-tag› ::= #‹numeral›

We use differences in graphic appearance to assist the reader.  For example, the bracketed term, ‹numeral›, can also be written <numeral>, using the signs for less-than and greater-than.   We use alternative angled brackets to avoid confusion.

To have more rigorous differentiation between characters in the subject language and metalinguistic text, we will use quotation of the subject-language characters, as in

‹category-name› ::= “<” ‹tag› “>”

‹form-definition› ::= ‹category-name› “:” “:” “=” ‹alternative-forms›

‹alternative-forms› ::= ‹sentential-form› | ‹alternative-forms› “|” ‹sentential-form›

Sometimes we will refer to encoded text characters and symbols by name, rather than literally depicting them text of the metalanguage usage.  This is done by using Unicode or other agreed names for the elements in small capitals [Unicode00].  Compound names will be hyphenated to avoid confusion about word boundaries:

‹category-name› ::=  left-pointing-single-guillemet
‹tag›
right-pointing-single-guillemet

We will employ the small-capitals convention in many other situations when we need to name a coded data element or fixed concrete entity without exhibiting a literal occurrence in the structure being described.

References

[ACP 6.2.1B]
Knuth, Donald E.  Binary Search.  Algorithm 6.2.1B on p. 410 of The Art of Computer Programming, v.3: Sorting and Searching.  ed.2.  Addison-Wesley Longman (Reading, MA: 1998).  ISBN 0-201-89685-0.
     BinSearch is derived from the form given by Knuth in Scientific American 236, 4 (April 1977), as reprinted on pp.66-67 of Selected Papers on Computer Science, CSLI (Palo Alto: 1996),  ISBN 1-881526-91-7 pbk.  
     This BinSearch is adjusted to avoid wondering what m/2 might be for integer m<0, while preserving a clean terminating condition and loop invariant.  It is required that (2*i+1)/2 is i for non-negative integer i.
 
[Lifschitz1991]
Lifschitz, Vladimir (ed.). Artificial Intelligence and Mathematical Theory of Computation: Papers in Honor of John McCarthy. Academic Press (San Diego: 1991). ISBN 0-12-450010-2.
     There are two articles featuring McCarthy's "91 function" here, one by Feferman and another by Knuth.  I picked this one for an example simply because I had recently looked at those papers and the book was in easy reach.
 
[Plauger92]
Plauger, P.J.  The Standard C Library.  Prentice-Hall (Englewood Cliffs, NJ: 1992).  ISBN 0-13-131509-9.
 
[Unicode00]
The Unicode Consortium.  The Unicode Standard, Version 3.0.  Addison Wesley Longman (Reading, MA: 1991-2000).  ISBN 0-201-61633-5 with CD-ROM. 

version 0.12 2000-07-24 (orcmid).  I created a place for introduction of metalinguistic notation and tidied up some wording so people can see where we are going in this compilation of notations.

version 0.11 2000-07-21 (orcmid).  A lesson in never including code that I haven't actually tested.  I replace the Python BinSearch example with one that adjusts the window on the correct side and then make it work for both Python and C.  The program files are linked to this page as demonstration that the examples have been verified.  The sample Python dialog is pasted from an interactive Python session.

version 0.10 2000-07-18 (orcmid).   Initial creation of documentation for proposed notation.

created 2000-07-18-22:31 -0700 (pdt) by orcmid
$$Author: Orcmid $
$$Date: 02-07-14 12:36 $
$$Revision: 11 $

Home