Trailing-Edge
-
PDP-10 Archives
-
decus_20tap2_198111
-
decus/20-0066/starn.txt
There are 2 other files named starn.txt in the archive. Click here to see a list.
TITLE: *n - A Machine-Independent List Processing Language
AUTHOR(S): Richard A. Stone
DEPARTMENT: 18PR02314
LOCATION: Western Electric, ERC, Princeton
LOCAL REPORT NUMBER: CC 5269 CASE: 15006
DATE: May 3, 1973
VERSION: 6.0
ABSTRACT:
*n is a highly efficient list processing language which will
run on a variety of small and large computers. It can be
interfaced with standard programming languages to add smaller,
faster list processing subroutines to a program.
*n is a version of *1 (decendant of L6).
REFERENCES: CC4868 - User's Manual for ERC Version of *1
INDEXING TERMS: Computers, list processing, machine independence, S/360,
DECsystem10, PDP-11, compilers, *1
BACKROUND
---------
HISTORY
-------
*n began with L6 (Low Level Linked List Language) by K. Knowlton
at Bell Laboratories on the 7090. Carnegie Mellon University
modified it to run as macros on the 360, and called it *1
(pronounced "star one"). The ERC of Westen Electric wrote a
general compiler (called *n) that can be modified to produce
code for any of several machine. Three examples are the
compiler for the PDP11 (STAR11),the IBM 360 (STAR360), and the
PDP10 (STAR10). The compiler is written in SNOBOL IV and will
run on any machine that supports SNOBOL IV.
VERSIONS
--------
Different versions of *n use the designation m.i.d., where 'm'
is the version of this manual, 'i' is the number of changes in
the machine independent version since the last manual change,
and 'd' is the number of changes in the particular machine
dependent part since the last manual change. Hence, "5.1.2" says
that the manual last changed for the 5th time, the machine
independent code changed once since then and the machine
dependent code twice.
USING THE STARn COMPILER
----- --- ----- --------
COMPILING ON THE 360
--------------------
To compile a program:
//jobname JOB (.....
// EXEC STARn
//STEPC.SYSIN DD *
-*1 deck-
/*
To compile a program and punch out assembly source:
//jobname JOB (.....
// EXEC STARn,PUNCH=1
//STEPC.SYSIN DD *
-*1 deck-
/*
NOTE: For the PDP11, use STAR11.
For the 360, use STAR360.
For the PDP10, use STAR10.
For compile and link for the 360, use STAR360L.
COMPILING ON THE PDP10
----------------------
To compile a program:
.R STARn
*assembly-file,list-file=source-file/switch:setting
.COMPILE assembly-file/LIST
NOTE: The ",list-file" may be ommitted.
The "source-file" may be files separated by commas,
which will be concatenated
Files may be devices such as "TTY:" or "LPT:"
As many switches as desired may be set.
Switches are as described under "Control Cards"
NOTE: For the PDP11, use STAR11.
For the 360, use STAR360.
For the PDP10, use STAR10.
NOTE: For the PDP11, use the extension ".P11" for the assembly
For the PDP10, use the extension ".MAC" for the assembly
DIFFERENCES BETWEEN VERSIONS OF *1
----------- ------- -------- -- --
CARNEGIE-MELLON *1 FEATURES MISSING FROM *n
-------------------------------------------
(1) Fields are restricted to the following sizes: single bits,
whole bytes, half words, whole words.
(2) In operations of the form (X,op,Y), X and Y shall determine
fields of the same length. (NOTE: For convenience, because
based fields cannot have a defined size, the size will be taken
from the size of the right side with which it is first used)
(3) The only form of subroutine call is CALLS.
(4) Conversion Operations are not supported.
(5) The tests (S) and (not S) are not supported.
(6) Continuations are not permitted
(7) The only single bit operations are: (B,=,0), (B,=,1),
(B,<-,0), (B,<-,1), (B1,<-,B2).
(8) Tests for sequences that yield the same location (=o and
=/o) are not supported.
*n FEATURES NOT SUPPORTED BY STAR11
-----------------------------------
(1) Shift operations of more than 1 bit or a variable number of
bits is not supported (except on the 11/40 or above).
*n FEATURES NOT SUPPORTED BY STAR360
------------------------------------
none
*n FEATURES NOT SUPPORTED BY STAR10
-----------------------------------
none
DEBUGGING
---------
When the debugging switch ("+DEBUG" - see CONTROL CARDS) is on,
additional code is generated to allow the user to determine where a
programme terminated abnormally.
DEBUGGING ON THE PDP11
----------------------
On termination, the STARn statement number may be found in
octal in register 0.
DEBUGGING ON THE PDP10
----------------------
On termination, the STARn statement number may be found in
register 0, and the routine name in register 2 (in 6 bit
ascii). These may be found, following a fault, by typing "E 0"
and "E 1", respectively.
DEBUGGING ON THE 360
--------------------
On termination, a traceback is given showing the routine name
and STARn statement number in parenthesis, followed by the
location from which that routine was called, and so on.
CONTROL CARDS
-------------
GENERAL FORMAT
--------------
Control cards are all of the form "*+xxx" or "*-xxx". Where xxx
is the switch name, plus turns the switch on, and minus turns
it off. Switches may be placed anywhere in a deck and affect
all subsequent cards. The "*" is optional and has been included
in the syntax to allow control cards to be ignored on the CMU
version.
CAVEAT PROGRAMETEUR
-------------------
Most of the switches have not been fully tested, and care should
be taken when they are first used.
SWITCHES
--------
INITIAL
SWITCH SETTING MEANING
------ ------- ---------
OPT1 -(off) Register optimization
OPT2 -(off) Machine dependent optimization (indirect)
DEBUG +(on) Adds inline statement numbers, etc.
REG +(on) Allow REG to be in a register
REENT -(off) Generate reenterable code
RCALL -(off) Calls use a reenterable sequence
M40 (PDP11) -(off) Model 40 or above
PIC (PDP11) -(off) Position independent code
OPTIMIZATION
------------
Optimized code may be generated by adding the cards "+OPT1"
(for machine independent register optimization) and "+OPT2"
(for machine dependent address optimization).
The two options may be turned on (with the "+") or off (with a
"-") for different sections of code.
It is not obvious what code will be produced, and hence, they
should not be used for debugging.
In addition, the two types of optimization may conflict -- so
that the two together might be produce less optimization than
one or the other. Still, the one test performed showed the both
to be better than either separately (see Appendix I).
DEBUGGING
---------
The debugging switch causes one extra instruction to be
generated per STARn statement to allow determining the location
of abnormal terminations at execution time. The program will be
larger and run slower as a result (see DEBUGGING).
USING A REGISTER FOR A BASEFIELD
--------------------------------
The name "REG" has been reserved to tell the compiler that the
basefield is to go into a register. Only one register is
available, but any number of basefields may be defined in it
(note, though, that they will wipe each other out).
Note, also, that this scheme is compatible with *1 for the 360,
and will not be effected by the lack of this feature on another
compiler. In fact, the feature may be turned off by a "-REG"
card.
EXAMPLE:
BFIELDE P,REG
BFIELD R,REG
BFIELD Z,REG
DO (P,+,5),...
REENTERABLE CODE
----------------
The "+REENT" card can be used to cause the compiler to generate
reenterable code.
NOTE: Routines compiled with the "+REENT" switch are also
recursive, and hense, may call themselves or their caller.
However, on the 360 this is highly inefficient and should be
avoided if possible.
In the present implementation, "+REG" cannot be used with
"+REENT".
M40 (PDP11 only)
---
Generates code using the op codes available on the models 11/40
and 11/45. Otherwise, the existance of an EAE is assumed.
POSITION INDEPENDENT CODE (PDP11 ONLY)
-------------------------
It is sometimes desirable to have routines that can be moved
anywhere in core at execution time. This is done with the
"+PIC" card, and is only very slightly less efficient.
NOTES ON *n IMPLEMENTATION
--------------------------
GENERAL NOTES FOR ALL MACHINES
-----------------------------
(1) The above rules imply that products and dividends cannot
exceed 1 word.
(2) FORTRAN may call, or be called from *n on any machine. The
calling mechanism is transparent across machines, so that
programs should require no changes.
(3) The field boundaries are taken modulo the machine word
size. For the PDP11 this implies that "(0,15)","(0,31)", and
"(16,31)" are all equivalent.
(4) The operations (X,<-,0), (X,<-,1), (X,=,0), and (X,=,1)
produce efficient code on all machines for X as a single bit
and can be used for packing.
(5) Code is better optimized when constants are placed on the
right. Thus, "(L,=,0)" will produce better code than "(0,=,L)".
MACHINE DEPENDENT NOTES
-----------------------
(1) The following table gives a rough approximation of the
relative efficiency of different operations on the various
machines (word operations=1)
MACHINE WORD 1/2W BYTE BIT OTHER
------- ---- ---- ---- --- -----
360 1 1 2 1
PDP11 1 1 1
PDP10 1 2 3 1 3
(2) In most cases, the PDP10 can handle bytes of any size.
(3) In most cases, the 360 can handle differing right and left
sizes.
(4) Some operations are slower on the PDP11. These are, in order
of increasing inefficiency: AND, XOR, MULTIPLICATION,
DIVISION, GENERALIZED POINTERS.
WRITING MACHINE INDEPENDENT CODE
--------------------------------
In general, most of the programs in *n will be independent of machine.
The following are a few pitfalls to avoid if the code is intended to
move from machine to machine:
--Never intermix assembly language with the *n.
--Make sure values don't exceed the word or byte size (as appropriate)
of the smallest machine.
--Use either 'Y' format (bytes) or non-'Y' format (words), but don't
mix them, as this course leads only to madness. (e.g. byte 8 is the
same on the PDP11 and IBM/360, word 2 is the same on both machines, but
byte 8 corresponds to word 2 on the 360 and word 4 on the PDP11).
Further, bytes are numbered backwards on the PDP11!
--Decide which storage locations contain numbers, and which contain
addresses! Performing arithmetic on addresses, or pointer operations on
numbers, will cause endless grief when going to different computers, or
sometimes even going to larger versions of the same computer.
APPENDIX I - COMPILER COMPARISONS
---------------------------------
SYSTEM COMPILE COMPILE COMPILE COMPILE EXECUTE
USED MINUTES MINUTES MINUTES BYTES BYTES
(*N) (ASSMB) (TOTAL)
------------- ------- ------- ------- ------- ----------
*1 (CMU) 6.4m .9m 7.3m 170K 6420
*360 (ERC) .75m .9m 1.5m 150K 4188
+OPT1 " " " " 4054(3.1%)
*11 (ERC) " .75m " " 2922
+OPT1 " " " " 2802(4.1%)
+OPT2 " " " " 2780(4.8%)
+OPT1+OPT2 " " " " 2710(7.2%)
note: compilation is on the IBM S/360 model 50
Note: PDP10 execution bytes is just slightly less than the 360,
assuming 4.5 bytes/word.
The following conclusions may be drawn from the above:
1. The ERC compiler runs in 12% of the time of the CMU compiler
in roughly the same core.
2. The ERC compiler can produce code that is 60% of the size of
the CMU compiler. (Only 3.1% is due to register optimization)
3. The PDP11 architechure allows programs to be produced at 66%
of the size of the 360.
APPENDIX II - STARn EXTENSIONS
------------------------------
STARn has some minor extensions over the CMU *1. However, for
compatability, it is not recommended that these features be used (they
are included here only for completeness).
1. Negative constants may be used, as in: (S,<-,-5)
2. Shifting may be specified as an infix operator, as in: (F,<-<-,4)
3. The character "&" may be used in place of "/&/", as in: (F,&,Q)
4. Recursive code (routines which call themselves or the caller) may be
used if the "+REENT" switch is set.
5. Character strings (see appropriate appendix).
6. User defined operations (see appropriate appendix).
APPENDIX III - CRITERIA FOR A *N MACHINE
----------------------------------------
A *n compiler can be written for any computer. However is should be
recognized that some machines will be superior to others. The
following list are those features which could appear in specifications
for a *n machine.
***********************************************************************
[The following are considered manditory]
--Words must have at least K bits each (K is application dependent).
--The must be at least N words (N is application dependent).
--Word size a multiple of 8.
--Four registers must be available (excluding those used by hardware,
as base registers, for standard linkage conventions, etc.).
--It must be possible to do "indexed loads" using at least two of the
registers. The instructions should permit fixed offsets from the index.
--The following operations should be available: ADD, SUBTRACT, CALL,
RETURN, STORE, COMPLIMENT, MULTIPLY, COMPARISON (LEQ, GEQ, GT, LT,
NEQ, EQ, UN).
***********************************************************************
[The following are considered highly desirable, and are listed in
order of decreasing desirability]
--An assembler running on the host (not necessarily vendor supplied).
--Bytes and half-words (with load/store operations).
--Addressability through all of core (no pages, base addressable
regions, etc.)
--All operations can be performed on half words and bytes.
--The following operations should be available: SET-BIT, CLEAR-BIT,
TEST-BIT, SHIFT, DIVIDE.
--The following operations should be available: AND, OR, EXCLUSIVE-OR.
--Byte addressability.
--Efficient core to core operations.
--Instructions for zeroing, incrementing, decrementing.
--Indirect addressing.
--An assembler with the ability to handle literals.
--Eight bit bytes.
***********************************************************************
APPENDIX IV - CHARACTER STRING EXTENSIONS
-----------------------------------------
DESCRIPTION
-----------
Character strings are considered block sequences and may appear
where-ever a block sequence appears. The strings are preceded by
" =C' " and followed by " ' ". Any character may be used in
place of the quote.
Character strings are always padded to a full word with zeros.
At least one null (zero) character will be included.
Character strings may only be used on the right of an operation,
and take their length from the length of the left operand.
EXAMPLES
--------
IF (PA,=,C'.'),THEN,(P,<-[,=C"Help! I'm trapped in a *n program!")
CALLS PRINT,(=C'END OF PROGRAM')
NOTES
-----
This is an experimental feature, and may change with time.
APPENDIX V - USER DEFINED OPERATIONS
------------------------------------
DESCRIPTION
-----------
The operation field of the DO, IF, or IFANY statements may
contain either the standard *1 operations, or user defined
operations. A user defined operation is any name starting with a
letter and followed by letters and numbers. This will then
become the name of a subroutine that is called with the left and
right operands as arguments. I.E., if the operation is
(left,op,right), then it becomes the equivalent of:
CALLS op,(left,right)
EXAMPLES
--------
DO (LAA,<-,2),(PRINT,LAA) a printing routine
IF (L,EQ,=C'Hi'),THEN,(L,GETS,=C'Bye') string handling
IF (A,GT,B),THEN,(A,MU,B) shades of L6
NOTES
-----
1. It is up to the system programmers at an installation to
write and put useful routines in libraries.
2. Some of the more useful routines, particularly in the
character handling area, might be incorporated into *n at some
future date.
3. This is a convenient way of doing "inline" subroutine calls.