Google
 

Trailing-Edge - PDP-10 Archives - decus_20tap2_198111 - decus/20-0066/starn.txt
There are 2 other files named starn.txt in the archive. Click here to see a list.
TITLE:		*n - A Machine-Independent List Processing Language

AUTHOR(S):	Richard A. Stone

DEPARTMENT:	18PR02314
LOCATION:	Western Electric, ERC, Princeton
LOCAL REPORT NUMBER:	CC 5269		CASE: 15006
DATE:		May 3, 1973
VERSION:	6.0

ABSTRACT:

	*n is a highly efficient list processing  language  which  will
	run  on  a  variety  of  small  and  large computers. It can be
	interfaced with standard programming languages to add  smaller,
	faster list processing subroutines to a program.

	*n is a version of *1 (decendant of L6).




REFERENCES: CC4868 - User's Manual for ERC Version of *1

INDEXING TERMS: Computers, list processing, machine independence, S/360,
		DECsystem10, PDP-11, compilers, *1

			       BACKROUND
			       ---------


HISTORY
-------

	*n began with L6 (Low Level Linked List Language) by K. Knowlton
	at  Bell  Laboratories  on the 7090.  Carnegie Mellon University
	modified it to run as macros  on  the  360,  and  called  it  *1
	(pronounced  "star  one").   The  ERC of Westen Electric wrote a
	general compiler (called *n) that can  be  modified  to  produce
	code  for  any  of  several  machine.   Three  examples  are the
	compiler for the PDP11 (STAR11),the IBM 360 (STAR360),  and  the
	PDP10  (STAR10).   The compiler is written in SNOBOL IV and will
	run on any machine that supports SNOBOL IV.


VERSIONS
--------

	Different versions of *n use the designation m.i.d.,  where  'm'
	is  the  version of this manual, 'i' is the number of changes in
	the machine independent version since the  last  manual  change,
	and  'd'  is  the  number  of  changes in the particular machine
	dependent part since the last manual change. Hence, "5.1.2" says
	that  the  manual  last  changed  for  the 5th time, the machine
	independent  code  changed  once  since  then  and  the  machine
	dependent code twice.

		       USING THE STARn COMPILER
		       ----- --- ----- --------


COMPILING ON THE 360
--------------------

	To compile a program:

		//jobname    JOB   (.....
		//	     EXEC  STARn
		//STEPC.SYSIN     DD    *
			-*1 deck-
		/*

	To compile a program and punch out assembly source:

		//jobname    JOB   (.....
		//	     EXEC  STARn,PUNCH=1
		//STEPC.SYSIN     DD    *
			-*1 deck-
		/*

	NOTE: 	For the PDP11, use STAR11.
		For the 360, use STAR360.
		For the PDP10, use STAR10.
		For compile and link for the 360, use STAR360L.


COMPILING ON THE PDP10
----------------------

	To compile a program:

		.R STARn
		*assembly-file,list-file=source-file/switch:setting

		.COMPILE assembly-file/LIST

	NOTE:	The ",list-file" may be ommitted.
		The "source-file" may be files separated by commas,
			which will be concatenated
		Files may be devices such as "TTY:" or "LPT:"
		As many switches as desired may be set.
		Switches are as described under "Control Cards"

	NOTE: 	For the PDP11, use STAR11.
		For the 360, use STAR360.
		For the PDP10, use STAR10.

	NOTE:	For the PDP11, use the extension ".P11" for the assembly
		For the PDP10, use the extension ".MAC" for the assembly

		  DIFFERENCES BETWEEN VERSIONS OF *1
		  ----------- ------- -------- -- --


CARNEGIE-MELLON *1 FEATURES MISSING FROM *n
-------------------------------------------

	(1) Fields are restricted to the following sizes: single  bits,
	whole bytes, half words, whole words.

	(2) In operations of the form (X,op,Y), X and Y shall determine
	fields  of  the  same  length.  (NOTE: For convenience, because
	based fields cannot have a defined size, the size will be taken
	from the size of the right side with which it is first used)

	(3) The only form of subroutine call is CALLS.

	(4) Conversion Operations are not supported.

	(5) The tests (S) and (not S) are not supported.

	(6) Continuations are not permitted

	(7) The only  single  bit  operations  are:  (B,=,0),  (B,=,1),
	(B,<-,0), (B,<-,1), (B1,<-,B2).

	(8) Tests for sequences that yield the same  location  (=o  and
	=/o) are not supported.



*n FEATURES NOT SUPPORTED BY STAR11
-----------------------------------

	(1) Shift operations of more than 1 bit or a variable number of
	bits is not supported (except on the 11/40 or above).


*n FEATURES NOT SUPPORTED BY STAR360
------------------------------------

	none

*n FEATURES NOT SUPPORTED BY STAR10
-----------------------------------

	none

			       DEBUGGING
			       ---------


When the debugging  switch  ("+DEBUG"  -  see  CONTROL  CARDS)  is  on,
additional  code  is  generated  to allow the user to determine where a
programme terminated abnormally.


DEBUGGING ON THE PDP11
----------------------

	On  termination,  the  STARn  statement  number may be found in
	octal in register 0.


DEBUGGING ON THE PDP10
----------------------

	On termination, the STARn statement  number  may  be  found  in
	register  0,  and  the  routine  name  in  register 2 (in 6 bit
	ascii).  These may be found, following a fault, by typing "E 0"
	and "E 1", respectively.


DEBUGGING ON THE 360
--------------------

	On termination, a traceback is given showing the  routine  name
	and  STARn  statement  number  in  parenthesis, followed by the
	location from which that routine was called, and so on.

			     CONTROL CARDS
			     -------------


GENERAL FORMAT
--------------

	Control cards are all of the form "*+xxx" or "*-xxx". Where xxx
	is  the  switch name, plus turns the switch on, and minus turns
	it off.  Switches may be placed anywhere in a deck  and  affect
	all subsequent cards. The "*" is optional and has been included
	in the syntax to allow control cards to be ignored on  the  CMU
	version.


CAVEAT PROGRAMETEUR
-------------------

	Most of the switches have not been fully tested, and care should
	be taken when they are first used.


SWITCHES
--------

			INITIAL
	SWITCH		SETTING	MEANING
	------		-------	---------
	OPT1		-(off)	Register optimization
	OPT2		-(off)	Machine dependent optimization (indirect)
	DEBUG		+(on)	Adds inline statement numbers, etc.
	REG		+(on)	Allow REG to be in a register
	REENT		-(off)	Generate reenterable code
	RCALL		-(off)	Calls use a reenterable sequence
	M40 (PDP11)	-(off)	Model 40 or above
	PIC (PDP11)	-(off)	Position independent code



OPTIMIZATION
------------

	Optimized code may be generated by  adding  the  cards  "+OPT1"
	(for  machine  independent  register  optimization) and "+OPT2"
	(for machine dependent address optimization).

	The two options may be turned on (with the "+") or off (with  a
	"-") for different sections of code.

	It is not obvious what code will be produced, and  hence,  they
	should not be used for debugging.

	In  addition,  the two types of optimization may conflict -- so
	that the two together might be produce less  optimization  than
	one or the other. Still, the one test performed showed the both
	to be better than either separately (see Appendix I).


DEBUGGING
---------

	The  debugging  switch  causes  one  extra  instruction  to  be
	generated per STARn statement to allow determining the location
	of abnormal terminations at execution time. The program will be
	larger and run slower as a result (see DEBUGGING).


USING A REGISTER FOR A BASEFIELD
--------------------------------

	The  name "REG" has been reserved to tell the compiler that the
	basefield is to go into  a  register.   Only  one  register  is
	available,  but  any  number of basefields may be defined in it
	(note, though, that they will wipe each other out).

	Note, also, that this scheme is compatible with *1 for the 360,
	and will not be effected by the lack of this feature on another
	compiler.  In fact, the feature may be turned off by  a  "-REG"
	card.

		EXAMPLE:
			BFIELDE	P,REG
			BFIELD	R,REG
			BFIELD	Z,REG
			DO	(P,+,5),...



REENTERABLE CODE
----------------

	The "+REENT" card can be used to cause the compiler to generate
	reenterable code.

	NOTE:  Routines  compiled  with  the  "+REENT"  switch are also
	recursive, and hense, may  call  themselves  or  their  caller.
	However,  on the 360  this is  highly inefficient and should be
	avoided if possible.

	In  the  present  implementation,  "+REG"  cannot  be used with
	"+REENT".


M40 (PDP11 only)
---

	Generates code using the op codes available on the models 11/40
	and 11/45. Otherwise, the existance of an EAE is assumed.


POSITION INDEPENDENT CODE (PDP11 ONLY)
-------------------------

	It  is  sometimes  desirable to have routines that can be moved
	anywhere in core at execution time.   This  is  done  with  the
	"+PIC" card, and is only very slightly less efficient.

		      NOTES ON *n IMPLEMENTATION
		      --------------------------


GENERAL NOTES FOR ALL MACHINES
-----------------------------

	(1) The above rules imply that products  and  dividends  cannot
	exceed 1 word.

	(2) FORTRAN may call, or be called from *n on any machine.  The
	calling mechanism  is  transparent  across  machines,  so  that
	programs should require no changes.

	(3)  The  field  boundaries  are  taken modulo the machine word
	size. For the PDP11 this implies  that  "(0,15)","(0,31)",  and
	"(16,31)" are all equivalent.

	(4)  The  operations  (X,<-,0),  (X,<-,1), (X,=,0), and (X,=,1)
	produce efficient code on all machines for X as  a  single  bit
	and can be used for packing.

	(5)  Code  is better optimized when constants are placed on the
	right. Thus, "(L,=,0)" will produce better code than "(0,=,L)".


MACHINE DEPENDENT NOTES
-----------------------

	(1)  The  following  table  gives  a rough approximation of the
	relative efficiency of  different  operations  on  the  various
	machines (word operations=1)

		MACHINE	WORD	1/2W	BYTE	BIT	OTHER
		-------	----	----	----	---	-----
		360	1	1	2	1
		PDP11	1		1	1
		PDP10	1	2	3	1	3

	(2) In most cases, the PDP10 can handle bytes of any size.

	(3) In most cases, the 360 can handle differing right and  left
	sizes.

	(4) Some operations are slower on the PDP11. These are, in order
	of   increasing   inefficiency:     AND,   XOR,  MULTIPLICATION,
	DIVISION, GENERALIZED POINTERS.

		   WRITING MACHINE INDEPENDENT CODE
		   --------------------------------


In  general, most of the programs in *n will be independent of machine.
The following are a few pitfalls to avoid if the code  is  intended  to
move from machine to machine:

--Never intermix assembly language with the *n.


--Make sure values don't exceed the word or byte size (as  appropriate)
of the smallest machine.


--Use either 'Y' format (bytes) or non-'Y' format  (words),  but  don't
mix  them,  as  this  course leads only to madness. (e.g. byte 8 is the
same on the PDP11 and IBM/360, word 2 is the same on both machines, but
byte  8  corresponds  to  word  2  on the 360 and word 4 on the PDP11).
Further, bytes are numbered backwards on the PDP11!


--Decide which storage locations contain  numbers,  and  which  contain
addresses! Performing arithmetic on addresses, or pointer operations on
numbers, will cause endless grief when going to different computers, or
sometimes even going to larger versions of the same computer.

		   APPENDIX I - COMPILER COMPARISONS
		   ---------------------------------


SYSTEM		COMPILE	COMPILE	COMPILE	COMPILE	EXECUTE
USED		MINUTES	MINUTES	MINUTES	BYTES	BYTES
		(*N)	(ASSMB)	(TOTAL)
-------------	-------	-------	-------	-------	----------

*1 (CMU)	6.4m	.9m	7.3m	170K	6420

*360 (ERC)	.75m	.9m	1.5m	150K	4188

   +OPT1	"	"	"	"	4054(3.1%)

*11 (ERC)	"	.75m	"	"	2922

   +OPT1	"	"	"	"	2802(4.1%)

   +OPT2	"	"	"	"	2780(4.8%)

   +OPT1+OPT2	"	"	"	"	2710(7.2%)



note: compilation is on the IBM S/360 model 50

Note: PDP10 execution  bytes  is  just  slightly  less  than  the  360,
assuming 4.5 bytes/word.



The following conclusions may be drawn from the above:

	1. The ERC compiler runs in 12% of the time of the CMU compiler
	in roughly the same core.

	2. The ERC compiler can produce code that is 60% of the size of
	the CMU compiler. (Only 3.1% is due to register optimization)

	3. The PDP11 architechure allows programs to be produced at 66%
	of the size of the 360.

		    APPENDIX II - STARn EXTENSIONS
		    ------------------------------


STARn  has  some  minor  extensions  over  the  CMU  *1.  However,  for
compatability,  it is not recommended that these features be used (they
are included here only for completeness).

1. Negative constants may be used, as in: (S,<-,-5)

2. Shifting may be specified as an infix operator, as in: (F,<-<-,4)

3. The character "&" may be used in place of "/&/", as in: (F,&,Q)

4. Recursive code (routines which call themselves or the caller) may be
used if the "+REENT" switch is set.

5. Character strings (see appropriate appendix).

6. User defined operations (see appropriate appendix).

	       APPENDIX III - CRITERIA FOR A *N MACHINE
	       ----------------------------------------

A *n compiler can be written for any computer. However  is  should  be
recognized  that  some  machines  will  be  superior  to  others.  The
following list are those features which could appear in specifications
for a *n machine.

***********************************************************************

[The following are considered manditory]


--Words must have at least K bits each (K is application dependent).

--The must be at least N words (N is application dependent).

--Word size a multiple of 8.

--Four registers must be available (excluding those used by  hardware,
as base registers, for standard linkage conventions, etc.).

--It must be possible to do "indexed loads" using at least two  of  the
registers. The instructions should permit fixed offsets from the index.

--The following operations should be available: ADD,  SUBTRACT,  CALL,
RETURN,  STORE,  COMPLIMENT,  MULTIPLY,  COMPARISON (LEQ, GEQ, GT, LT,
NEQ, EQ, UN).

***********************************************************************

[The following are considered highly  desirable,  and  are  listed  in
order of decreasing desirability]


--An assembler running on the host (not necessarily vendor supplied).

--Bytes and half-words (with load/store operations).

--Addressability through all  of  core  (no  pages,  base  addressable
regions, etc.)

--All operations can be performed on half words and bytes.

--The  following  operations  should be available: SET-BIT, CLEAR-BIT,
TEST-BIT, SHIFT, DIVIDE.


--The following operations should be available: AND, OR, EXCLUSIVE-OR.

--Byte addressability.

--Efficient core to core operations.

--Instructions for zeroing, incrementing, decrementing.

--Indirect addressing.

--An assembler with the ability to handle literals.

--Eight bit bytes.

***********************************************************************

	       APPENDIX IV - CHARACTER STRING EXTENSIONS
	       -----------------------------------------


DESCRIPTION
-----------

	Character  strings are considered block sequences and may appear
	where-ever a block sequence appears. The strings are preceded by
	"  =C'  "  and  followed  by " ' ". Any character may be used in
	place of the quote.

	Character strings are always padded to a full word  with  zeros.
	At least one null (zero) character will be included.

	Character strings may only be used on the right of an operation,
	and take their length from the length of the left operand.


EXAMPLES
--------

  IF  (PA,=,C'.'),THEN,(P,<-[,=C"Help! I'm trapped in a *n program!")
  CALLS  PRINT,(=C'END OF PROGRAM')


NOTES
-----

	This is an experimental feature, and may change with time.

		  APPENDIX V - USER DEFINED OPERATIONS
		  ------------------------------------


DESCRIPTION
-----------

	The operation field of the  DO,  IF,  or  IFANY  statements  may
	contain  either  the  standard  *1  operations,  or user defined
	operations. A user defined operation is any name starting with a
	letter  and  followed  by  letters  and  numbers. This will then
	become the name of a subroutine that is called with the left and
	right   operands   as  arguments.  I.E.,  if  the  operation  is
	(left,op,right), then it becomes the equivalent of:

		CALLS	op,(left,right)


EXAMPLES
--------

	DO	(LAA,<-,2),(PRINT,LAA)			a printing routine
	IF	(L,EQ,=C'Hi'),THEN,(L,GETS,=C'Bye')	string handling
	IF	(A,GT,B),THEN,(A,MU,B)			shades of L6


NOTES
-----

	1.  It  is  up  to  the system programmers at an installation to
	write and put useful routines in libraries.

	2.  Some  of  the  more  useful  routines,  particularly  in the
	character handling area, might be incorporated into *n  at  some
	future date.

	3. This is a convenient way of doing "inline" subroutine calls.