EXTENDED ADDRESSING ON THE DECSYSTEM-20
Programming Implications and Methodology
Written by: Dan Murphy
Revised for V5 SWSKIT by: John R. Grout
The basic architecture of the DECsystem-10 family of processors up
through the KL10 provided an 18-bit, 256K-word addressing space. This
means in particular that:
1. The Program Counter register is 18 bits;
2. Each and every instruction executed computes an 18-bit
effective address. The contents of this address may or may
not be referenced depending on the actual instruction, but
the algorithm for calculating it (including indexing and
indirecting rules) is the same for all instructions.
The above is true regardless of the size of the physical core memory
available on any particular configuration. Note also that paging (or
relocation on earlier processors) will determine if a particular
virtual address corresponds to an existing physical word in memory,
but the fundamental size of the virtual address space is a constant.
During the design of the KL10 processor, the need for a larger virtual
address space was recognized. A design was developed which provides
an extended address space to new programs while still allowing
existing unmodified programs to execute correctly on the same
processor. Although most of the essential data paths were provided in
the original KL10 implementation, various design changes caused actual
availability of extended addressing operation as described here to be
deferred to the "model B" KL10 CPU.
VIRTUAL ADDRESS SPACE
Under the extended addressing design, the virtual address space of the
machine is now 30 bits, or 1,073,741,824 words. Although one can
think of this as a single homogeneous space, it is generally more
useful to consider an address as consisting of two components, the
section number, and the word number.
0     5 6           17 18                    35
- - - - ---------------------------------------
!       !              !                      !
!       !   section    !         word         !
!       !   12 bits    !       18 bits        !
- - - - ---------------------------------------
The word (more precisely word-within-section) field consists of 18
bits and thus represents a 256K-word address space similar to the
single address space on earlier machines. The section number field is
12 bits and thus provides 4096 separate sections, each of 256K words.
Each section is further divided into pages of 512 words each just as
on earlier machines. The paging facilities allow the monitor to
independently establish the existence and protection of each section
as a unit.
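The decomposition described above can be sketched in a few lines of Python. This is only an illustrative model, not monitor or hardware code; the function name split_address and the returned tuple layout are assumptions made for the example.

```python
SECTION_BITS = 12          # section number field, bits 6-17
WORD_BITS = 18             # word-within-section field, bits 18-35
PAGE_SIZE = 512            # words per page

def split_address(addr):
    """Split a 30-bit virtual address into (section, word, page, offset).

    'page' is the page number within the section and 'offset' is the
    word offset within that page.
    """
    if not 0 <= addr < 1 << (SECTION_BITS + WORD_BITS):
        raise ValueError("address does not fit in 30 bits")
    section = addr >> WORD_BITS
    word = addr & ((1 << WORD_BITS) - 1)
    page, offset = divmod(word, PAGE_SIZE)
    return section, word, page, offset

# Address 3,,1000 (octal): section 3, word 1000, first word of page 1.
print([oct(field) for field in split_address(0o3001000)])
```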
In order to implement this extended 30-bit address space, the PC must
be extended to hold a full address. The PC is always considered to
consist of a section field and a word field, and the incrementing of
the PC never allows a carry from the word field into the section
field. That is, if a program allows flow of control to reach the last
word of a section and the instruction in that location is not a jump,
then the PC will "wrap-around", and the next instruction fetched will
be word 0 of the same section. This will in fact be AC 0 as described
below. The consequence of this is that flow of control of a program
cannot implicitly cross section boundaries. In general, it would be a
programming error to allow the PC to reach the last location of a
section and execute a non-jump instruction or to execute a skipping
instruction in either of the last two locations of a section.
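The wrap-around rule can be illustrated with a small Python model. It is a sketch under the assumption that a PC is held as a single 30-bit integer; the name next_pc and the skip parameter are hypothetical and chosen only for this example.

```python
WORD_MASK = 0o777777       # 18-bit word-within-section field

def next_pc(pc, skip=False):
    """Advance a 30-bit PC by one word (two if the instruction skips).

    Only the word field is incremented; a carry out of it is discarded
    rather than propagated into the section field.
    """
    section = pc & ~WORD_MASK
    word = (pc + (2 if skip else 1)) & WORD_MASK
    return section | word

# A non-jump instruction in the last word of section 3 continues at
# word 0 of section 3 (that is, AC 0), not at word 0 of section 4.
print(oct(next_pc(0o3777777)))
```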
COMPATIBILITY
In order to allow efficient use of the extended address space, it was
necessary to modify the operation of various machine instructions and
to change the algorithm for the calculation of effective addresses.
Because these changes have a high probability of causing any existing
program to malfunction, the following convention was adopted:
If a program is executing in section 0,
all instructions are executed exactly as
they would be on a non-extended machine.
If a program is executing in any section
other than 0, the extended addressing
algorithms are used for effective
address calculation and instruction
execution.
A program is said to be executing in section 0 when the section field
of the PC contains 0. Effective address calculations and instruction
executions are performed exactly as on a non-extended machine; hence
any existing program will work correctly if run in section 0. Note,
however, that this also implies that no addresses outside of section 0
can be generated, either for data references or for jumps. That is, a
program executing in section 0 cannot leave section 0 except via a
monitor call.*
It is easy, however, to write or generate code which is compatible with
both extended and non-extended execution. This was a specific goal of
the design, and in general requires only that certain precautions be
taken regarding previously unused fields. Because these precautions
must be taken, however, one cannot generally assume that any existing
program will have observed them and so will execute correctly in an
extended section.
- - - - - - - - - - - - - - - - -
*The only exception to this is the XJRSTF instruction.
EFFECTIVE ADDRESS COMPUTATION
Unless explicitly stated otherwise, everything in the following
discussion refers to execution of instructions with the PC in a non-0
section; nothing applies to instructions executed from section 0.
1. Instruction format
The format of a machine instruction is the same as on a
non-extended machine. In particular, the effective address
computation is dependent on three quantities from the
instruction, the Y (address) field, the X (index) field, and
the I (indirect) field. These are 18 bits, 4 bits, and 1 bit
respectively.
0           8 9   12 13 14 17 18                     35
-------------------------------------------------------
!            !      ! !      !                        !
!   OPCODE   !  AC  !I!  X   !           Y            !
!            !      ! !      !                        !
-------------------------------------------------------
Depending on the format of the index and indirect words, the
effective address algorithm will perform either 18-bit or
30-bit address computations. When a 30-bit quantity is
indicated, an explicit section number is being specified and
the address is called a global address. When an 18-bit
quantity is indicated, the section field is defaulted from
some other quantity (e.g., the PC), and the address is thus
local to the default section and is called a local address.
In the simplest case, consider an instruction which specifies
no indexing or indirection. E.g.,
3,,400/ MOVE T,1000
Here the effective address computation yields a local address
1000, and the section used for the reference is 3, i.e., the
section from which the instruction itself was fetched.
Precisely stated:
Whenever an instruction or indirect word
specifies a local address, the default
section is the section from which the
word containing that address was taken.
This means that the default section will change during the
course of an effective address calculation which uses
indirection. The default section will always be the section
from which the last indirect word was fetched.
2. Indexing
The first step in the effective address calculation is to
perform indexing if specified by the instruction. The
calculation performed depends on the contents of the
specified index register:
1. If the left half of the contents of X is negative or 0,
the right half of X is added to Y (from the instruction
word) to yield an 18-bit local address.
This is consistent with indexing on a non-extended
machine, and means, for example, that the usual AOBJN and
stack pointer formats can be used for tables and stacks
which are in the same section as the program.
2. If the left half of the contents of X is positive and
non-0, the entire contents of X are added to Y (sign
extended) to yield a 30-bit global address.
This means that index registers may be used to hold
complete addresses which are referenced via indexed
instructions. A Y field of 0 will commonly be used to
reference exactly the address contained in X. Small
positive or negative offsets (magnitude less than 2**17)
may also be specified by the Y field, e.g., for
referencing data structure items in other sections.
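The two indexing cases above can be modelled as follows. This Python fragment is a sketch, not a description of the microcode: the function name index_address, its argument order, and the (kind, address) return value are assumptions introduced for illustration.

```python
MASK18 = 0o777777
MASK30 = (1 << 30) - 1
SIGN36 = 1 << 35           # bit 0 of a 36-bit word

def index_address(y, x_contents, section):
    """Apply the indexing rule for an instruction or IFIW.

    y          -- 18-bit Y field
    x_contents -- 36-bit contents of the index register
    section    -- default section (where the word holding Y was fetched)
    Returns (kind, 30-bit address), kind being "local" or "global".
    """
    left = x_contents >> 18
    if (x_contents & SIGN36) or left == 0:
        # Case 1: left half negative or 0 -> 18-bit local address.
        word = (y + (x_contents & MASK18)) & MASK18
        return "local", (section << 18) | word
    # Case 2: left half positive and non-0 -> add sign-extended Y to all of X.
    y_signed = y - (1 << 18) if y & (1 << 17) else y
    return "global", (x_contents + y_signed) & MASK30

# AOBJN-style pointer -10,,2 indexing TABL=1000 from section 3.
print(index_address(0o1000, (0o777770 << 18) | 2, 3))
```

With an index register holding 5,,100 and a Y field of 0, the same function yields the global address 5,,100 regardless of the default section.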
3. Indirection
If indirection is specified by the instruction, an indirect
word is fetched from the address determined by Y and indexing
(if any). The indirect word is considered to be "instruction
format" if bit 0 is a 1, and "extended format" if bit 0 is a
0.
1. Instruction format indirect word (IFIW):
This word contains Y, X, and I fields of the same size
and in the same position as instructions, i.e., in bits
13-35. Bit 1 must be 0 (its use is reserved for future
hardware); bits 2-12 are not used.
 0 1 2            12 13 14 17 18                     35
-------------------------------------------------------
! !M!               ! !      !                        !
!1!B!    UNUSED     !I!  X   !           Y            !
! !Z!               ! !      !                        !
-------------------------------------------------------
The effective address computation continues with the
quantities in this word just as for the original
instruction. That is, indexing may be specified and may
be local or global depending on the left half of the
index, and further indirection may be specified. Note
that the default section for any local addresses produced
from this indirect word will be the section from which
the word itself was fetched.
2. Extended Format Indirect Word (EFIW):
This word contains Y, X, and I fields also, but in a
different format such as to allow a full 30-bit address
field.
 0 1 2    5 6                                        35
-------------------------------------------------------
! ! !      !                                          !
!0!I!  X   !   <--------------- Y --------------->    !
! ! !      !                                          !
-------------------------------------------------------
If indexing is specified in this indirect word, the
entire contents of X are added to the 30-bit Y to produce
a global address. A local address is never produced by
this operation, and the type of operation is not
dependent on the contents of X. Hence, either Y or C(X)
may be used as an address or an offset within the
extended address space just as is done in the 18-bit
address space.
If further indirection is specified, the next indirect
word is fetched from Y as modified by indexing if any.
The next indirect word may be instruction format or
extended format, and its interpretation does not depend
on the format of the previous indirect word.
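The field layouts of the two indirect word formats can be checked with a short decoder. The following Python sketch assumes a 36-bit word held as an integer with bit 0 as the most significant bit; the function name decode_indirect_word and the dictionary keys are hypothetical.

```python
def decode_indirect_word(word):
    """Decode a 36-bit indirect word into its format and I, X, Y fields."""
    if word & (1 << 35):                       # bit 0 = 1: instruction format
        if word & (1 << 34):                   # bit 1 must be zero
            raise ValueError("bit 1 of an IFIW must be 0")
        return {"format": "IFIW",
                "I": (word >> 22) & 1,         # bit 13
                "X": (word >> 18) & 0o17,      # bits 14-17
                "Y": word & 0o777777}          # bits 18-35
    return {"format": "EFIW",                  # bit 0 = 0: extended format
            "I": (word >> 34) & 1,             # bit 1
            "X": (word >> 30) & 0o17,          # bits 2-5
            "Y": word & ((1 << 30) - 1)}       # bits 6-35

# An EFIW naming index register 12 (octal) and global address 5,,100.
print(decode_indirect_word((0o12 << 30) | 0o5000100))
```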
4. Some examples
1. Simple instruction reference within the current PC
section:
3,,400/ MOVE T,1000 ;fetches from 3001000
JRST 2000 ;jumps to 3002000
2. Local tables may be scanned with standard AOBJN loops:
MOVSI X,-SIZ
LP: CAMN T,TABL(X) ;TABL in current section
JRST FOUND
AOBJN X,LP
3. Global tables may be scanned with full address and
index:
MOVEI X,0
LP: CAMN T,@[EFIW TABL,X] ;TABL(X) in ext fmt
JRST FOUND
CAIGE X,SIZ-1
AOJA X,LP
4. Subroutine argument pointer may be passed to subroutine
in another section:
word in arglist:
IFIW @VAR(X) ;indexing and indirecting
;if specified will be relative
;to the section in which this
;pointer resides, i.e., the
;section of the caller
IMMEDIATE INSTRUCTIONS
All effective address computations yield a 30-bit address, defaulting
the section if necessary, as described above. Immediate instructions,
however, use only the low-order 18 bits of this address as their
operand and set the high-order 18 bits to 0. Hence, instructions such
as MOVEI, CAI, etc. produce identical results regardless of the section
in which they are executed.
Two immediate instructions are implemented which do retain the section
field of their effective addresses.
1. XMOVEI (opcode 415) Extended Move Immediate:
This instruction loads the entire 30-bit effective address
into the designated AC, setting bits 0-5 to 0. If no
indexing or indirection is specified, the current PC section
will appear in the section field of the result. This
instruction would replace MOVEI in those cases where an
address (rather than a small constant) is being loaded, and
the full address is needed.
Example: calling a subroutine in another section (assuming
arglist in same section as caller):
XMOVEI AP,ARGLIST
PUSHJ P,@[EFIW SUBR]
The subroutine could reference arguments as:
MOVE T,@1(AP)
or could construct argument addresses by:
XMOVEI T,@2(AP)
In both cases, the arglist pointer would be found in the
caller's section because of the global address in AP. The
actual section of the effective address is determined by the
caller: it is implicitly the caller's own section if an IFIW
is used as the arglist pointer, or is explicitly given if an
EFIW is used.
NOTE
One cannot construct section zero global addresses by
using indexing, since a zero left half in the index
AC causes a local address calculation to be
performed. So, one cannot directly use section zero
global addresses (like argument pointers to arguments
in section zero) in index registers. If the
arguments must be referenced in section zero (they
aren't mapped in a non-zero section), one more level
of indirection must be used to reference them.
2. XHLLI (opcode 501)
This instruction replaces the left half of the designated
accumulator with the section number of its effective address.
It is convenient where global addresses must be constructed.
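The difference between an ordinary immediate instruction and the two extended ones can be seen by applying each to the same 30-bit effective address. The Python functions below are an illustrative model only; they take an already-computed effective address as input, ignore any special cases not described in this memo, and use hypothetical names (movei, xmovei, xhlli) that merely echo the mnemonics.

```python
MASK18 = 0o777777
MASK30 = (1 << 30) - 1

def movei(ea):
    """MOVEI: keep only the low-order 18 bits of the effective address."""
    return ea & MASK18

def xmovei(ea):
    """XMOVEI: keep the full 30-bit address; bits 0-5 of the AC are 0."""
    return ea & MASK30

def xhlli(ac, ea):
    """XHLLI: put the section number of the address in the left half
    of the AC and leave the right half of the AC unchanged."""
    section = (ea & MASK30) >> 18
    return (section << 18) | (ac & MASK18)

ea = 0o3001000                       # local address 1000 taken in section 3
print(oct(movei(ea)), oct(xmovei(ea)), oct(xhlli(0o777777000042, ea)))
```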
AC REFERENCES
Any reference to a local address in the range 0-17(8) will be made to
the hardware ACs. Also, any global reference to an address in section
1 in the range 0-17(8) (i.e., 1000000-1000017) will be made to the
hardware ACs. Global references to locations 0-17(8) in any section
other than section 1 will reference memory. Thus:
1. Simple addresses in the usual AC range will reference ACs
as expected, e.g., MOVE 2,3 will fetch from hardware AC 3
regardless of the current section.
2. To pass a global pointer to an AC, a section number of 1 must
be included.
3. Very large arrays may lie across section boundaries; they
will be referenced with global addresses which will always go
to memory, never to the hardware ACs.
4. PC references are always considered local references; hence
a jump instruction which yields an effective address of 0-17
in any section will cause code to be executed from the ACs.
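The rule that decides whether a data reference goes to the hardware ACs or to memory can be stated as a single predicate. The Python sketch below is a model of the rule as given in this section; the name refers_to_ac and the is_global flag are assumptions for the example.

```python
def refers_to_ac(addr, is_global):
    """Return True if a reference to 30-bit address 'addr' goes to the ACs.

    is_global says whether the effective address computation produced a
    global address (True) or a local one (False).
    """
    section, word = addr >> 18, addr & 0o777777
    if word > 0o17:
        return False                 # outside the AC range in any case
    # Local references in 0-17 always hit the ACs; global references
    # do so only when the section number is 1.
    return (not is_global) or section == 1

print(refers_to_ac(0o3000003, is_global=False))   # MOVE 2,3 from section 3
print(refers_to_ac(0o3000003, is_global=True))    # global 3,,3 is memory
```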
SPECIAL CASE INSTRUCTIONS
In addition to the differences in effective address calculation,
certain instructions are affected in other ways by extended
addressing.
1. Instructions which store/load the PC: PUSHJ, POPJ, JSR, JSP.
These instructions store a 30-bit PC without flags and with
bits 0-5 of the destination word set to 0. POPJ restores the
entire 30-bit PC from the stack word. JSA and JRA are not
affected by extended addressing and store/load only 18 bits
of PC. Hence they are not useful for inter-section calls.
2. Stack instructions (PUSHJ, POPJ, PUSH, POP, ADJSP) use a
local or global address for the stack according to the
contents of the stack register following the same convention
as for indexing. That is, if the left half of the stack
pointer is 0 or negative (prior to incrementing or
decrementing), a local address using the right half of the
stack pointer is computed and used for the stack reference.
If the left half of the stack pointer is positive and non-0,
the entire word is taken as a global address. In the latter
case, incrementing and decrementing of the stack pointer is
done by adding or subtracting 1, not 1000001. Hence, a
global stack for routines in many sections may be used in a
similar manner to present stacks. Stack overflow and
underflow protection would be done by making the pages before
and after the stack inaccessible, since a space count field is
not present in a global stack pointer.
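The stack pointer convention in item 2 can be sketched as follows. This Python model covers only the pointer update performed by a PUSH; it treats the two halves of a local pointer independently, which is an assumption of the sketch, and it does not model the choice of default section for a local stack address. The name push_update is hypothetical.

```python
MASK18 = 0o777777
MASK30 = (1 << 30) - 1
MASK36 = (1 << 36) - 1
SIGN36 = 1 << 35

def push_update(sp):
    """Increment a stack pointer as a PUSH would.

    Returns (new pointer, kind, address), where address is the 18-bit
    word address for a local stack or the 30-bit address for a global one.
    """
    left = sp >> 18
    if (sp & SIGN36) or left == 0:
        # Local stack: count and address are both incremented (1,,1).
        new_left = (left + 1) & MASK18
        new_right = ((sp & MASK18) + 1) & MASK18
        return (new_left << 18) | new_right, "local", new_right
    # Global stack: the whole word is an address; add 1, not 1000001.
    new_sp = (sp + 1) & MASK36
    return new_sp, "global", new_sp & MASK30

print([oct(v) if isinstance(v, int) else v for v in push_update(0o5001777)])
```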
3. Byte instructions. Four formats of byte pointer are
recognized by the byte instructions: two one-word formats,
and two two-word formats. P (bits 0-5) greater than 44
(octal) causes a one-word global byte pointer format to be
recognized, which contains the fields as shown:
 0        5 6                                        35
-------------------------------------------------------
!          !                                          !
!    P     !   <--------------- Y --------------->    !
!          !                                          !
-------------------------------------------------------
P can specify any aligned 6-, 7-, 8-, 9-, or 18-bit byte
pointer.* Otherwise (P of 44 octal or less), if bit 12
(previously unused) is 0, a one-word local byte pointer, which
is identical in format with the unextended byte pointer, is
recognized. Unlike its counterpart in section 0, a one-word
local byte pointer can form a global address through indexing
or indirection.
If bit 12 is 1, one of the two two-word byte pointer formats
is recognized, which contain the fields as shown:
 0    5 6   11 12 13  17 18                          35
-------------------------------------------------------
!      !      ! !       !                             !  Word
!  P   !  S   !1!  MBZ  !       AVAIL TO SFTW.        !  0
!      !      ! !       !                             !
-------------------------------------------------------
followed by:
 0 1 2    5 6                                        35
-------------------------------------------------------
! ! !      !                                          !  Word
!0!I!  X   !   <--------------- Y --------------->    !  1
! ! !      !                                          !
-------------------------------------------------------
or
 0 1 2            12 13 14 17 18                     35
-------------------------------------------------------
! !M!               ! !      !                        !  Word
!1!B!    UNUSED     !I!  X   !           Y            !  1
! !Z!               ! !      !                        !
-------------------------------------------------------
- - - - - - - - - - - - - - - - -
*See the Processor Reference or Monitor Calls Reference Manuals for
details.
Note that the second word can be either an EFIW (in a
two-word global byte pointer) or an IFIW (in a two-word local
byte pointer). The right half of the first word is
specifically reserved to software for byte counts, etc.
Incrementing of this format of byte pointer is consistent
with non-extended format; the P field is reduced until the
end of the word is reached, whereupon the address in the
second word is incremented. Incrementing may cause a carry
from the word field to the section field of the address (in a
two-word global byte pointer); hence extended byte arrays
may be referenced without indexing by a two-word global byte
pointer.
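The recognition of the four byte pointer formats described in item 3 can be summarized as a decision procedure. The Python sketch below classifies a pointer from its first word (and, when needed, its second word); it does not decode the P values of one-word global pointers, for which the reference manuals should be consulted. The name classify_byte_pointer is an assumption of the example.

```python
def classify_byte_pointer(word0, word1=None):
    """Classify a byte pointer as seen by code running in a non-0 section."""
    p = word0 >> 30                           # P field, bits 0-5
    if p > 0o44:
        return "one-word global"
    if not word0 & (1 << 23):                 # bit 12 = 0
        return "one-word local"
    if word1 is None:
        raise ValueError("bit 12 is set: a second word is required")
    # Second word: bit 0 = 1 is an IFIW (local), bit 0 = 0 an EFIW (global).
    return "two-word local" if word1 & (1 << 35) else "two-word global"

first = (0o44 << 30) | (0o07 << 24) | (1 << 23)   # P=44, S=7, bit 12 set
print(classify_byte_pointer(first, 0o5000100))
```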
4. Other new or modified instructions are LUUOs, BLT, XBLT, XCT,
XJRSTF, XJEN, XPCW, XSFM. Some of these are valid only in
exec mode. Consult the System Reference Manual or chapter
2.2 of the KL10 Functional Specifications for details.
COMPATIBLE PROGRAMMING
It is possible to generate code which works correctly in both extended
and non-extended environments. Such code may even utilize
inter-section references when running in an extended environment; it
is not limited to local addressing.
The basic rule is to observe the extended addressing rules in the
construction or computation of:
1. Index words: be sure the left half is cleared or contains a
negative quantity (e.g., a count) when setting up and using
an index register. This will cause a local reference in the
extended environment which will produce the same result as on
the non-extended machine.
2. Indirect words: always set bit 0 of indirect words used in
argument lists, etc. so as to produce local addresses.
3. Stack pointers: the most common present format (negative
count in left half) works consistently under extended
addressing; modifying a program to use an extended stack
should require no change to stack instructions except if the
code expects to find the processor flags in the stacked PC
words.
When generating addresses to be passed to subroutines or used by other
code, XMOVEI and XHLLI instructions may be used. In the non-extended
environment, these opcodes are SETMI (equivalent to MOVEI) and HLLI
respectively, which always load 0 into the left half.
PROGRAM ARCHITECTURE AND FACILITIES
Ultimately, the extended addressing hardware in conjunction with the
monitor should provide some of the power of general segmentation and
some other useful conventions and facilities. The following are some
of the ideas which have been advanced.
1. The fact that instructions are generally executed relative to
the current section suggests that subroutine libraries and
facilities packages can be written which can be dynamically
loaded into any available section and used by many programs.
An important point to observe in connection with this and
most of the following conventions is that programs must not
be compiled with fixed section numbers built into the code.
Programs should be built so as to be loadable into any
section and to request additional section allocations from
the monitor as necessary. This is the only reasonable way to
ensure that conflicting use of particular sections is
avoided.
2. An entire section can be mapped by the monitor as quickly as
a single page. Hence, an entire file (up to 256K) can be
mapped into a section, and the data therein manipulated with
ordinary instructions.
3. Programs can greatly reduce memory management logic by merely
assuming very large sizes for all data bases. It is not,
however, very efficient to use many sections, each with only
a small amount of data. A reasonable middle ground should be
chosen.
APPENDIX A
PROGRAMMING IN NON-ZERO SECTIONS ON THE DECSYSTEM-20
This appendix outlines the changes in the use of TOPS-20 system
facilities by programs running in one or more non-zero sections. It
outlines the several new extended JSYSes provided and the changes in
the use of existing JSYSes in non-zero sections.
Universal Device Designators in Non-zero Sections
The introduction of the one-word global byte pointer to the KL10
hardware requires a change in the interpretation of arguments passed
as universal device designators to JSYSes in non-zero sections. In
section 0, an argument containing 5 in bits 0-2 is interpreted as an
immediate-mode argument, and one containing 6 in bits 0-2 is
interpreted as a device designator (.DVDES+device type code,,unit
number). In non-zero sections, however, arguments beginning with 5 or 6
are also plausible byte pointers (one-word globals).
The situation was resolved as follows:
1. If a source/destination designator is accepted (i.e., a byte
pointer is a meaningful argument), one-word global byte
pointers are accepted and device designators beginning with 6
(those containing .DVDES) are not accepted.
2. If a file designator is accepted (i.e., a byte pointer would
be a meaningless argument), all device designators are
accepted.
3. If only a byte pointer to an ASCIZ string is accepted (e.g.,
to a user name string), only 7-bit byte pointers, local or
global, are accepted. Since all 7-bit one-word global byte
pointers begin with 6, there is no conflict with
immediate-mode arguments, such as user and directory numbers,
which begin with 5.
Using the Software Interrupt System in Non-Zero Sections
In programs that are not confined to a single section, the old level
and channel table formats are inadequate: the addresses contained in
them are 18-bit addresses (the left half is reserved in the level
table and holds the level number in the channel table), so no explicit
section number can be supplied. The new formats require a 30-bit
global address in bits 6-35 and, for the channel table, the
corresponding level number in bits 0-5. The XSIR% and XRIR% JSYSes
set and return the addresses of level and channel tables in the new
format. Use of the SIR% and RIR% JSYSes is legal in single-section
programs, but their use in non-zero sections is not recommended; XSIR%
and XRIR% must be used in programs that are not confined to a single
section.
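The new channel table word layout can be illustrated by packing and unpacking one entry. The Python fragment below is a sketch of the bit layout only (level number in bits 0-5, global address in bits 6-35); it does not call any JSYS, and the function names are hypothetical.

```python
def channel_table_entry(level, address):
    """Build a new-format channel table word: level,,30-bit global address."""
    if not 0 <= level < 1 << 6:
        raise ValueError("level number must fit in bits 0-5")
    if not 0 <= address < 1 << 30:
        raise ValueError("address must fit in bits 6-35")
    return (level << 30) | address

def split_channel_table_entry(word):
    """Return (level, address) from a new-format channel table word."""
    return word >> 30, word & ((1 << 30) - 1)

# Interrupt routine at 3,,1000 (octal) assigned to level 1.
print(oct(channel_table_entry(1, 0o3001000)))
```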
Inferior Processes Running in Non-zero Sections
Creation, termination, freezing, and resumption of inferior processes
running in non-zero sections are unchanged. However, starting an
inferior process with an explicit global PC requires the use of the
XSFRK% JSYS; SFORK% (and the CR%ST option of the CFORK% JSYS) will
only start an inferior process in section 0. As before, the SFRKV%
JSYS will start a process using its entry vector.
Use of the XSVEC% and XGVEC% JSYSes is required to save and get the
global address of a multiple-section process's entry vector. The GET%
JSYS will now load single-section save files into any section,
specified in word .GBASE of the argument block pointed to by AC2 (if
GT%ARG is on in AC1, and GT%BAS is on in .GFLAG).
Mapping Pages and Files in Non-zero Sections
The SMAP% (Section MAP), RSMAP% (Read Section MAP), and XRMAP%
(eXtended Read MAP) JSYSes have been added to TOPS-20, along with
slight changes to PMAP%, RMAP%, RPACS%, and SPACS%, in order to map
pages and files into non-zero sections, and read the mapping
information for sections and pages in non-zero sections.
All the JSYSes which accept process page numbers will now accept
18-bit process page numbers (9 bits of section number and 9 bits of
page number), which is sufficient to handle all the sections (0-37
octal) supported by the KL10 hardware, provided that flag FH%EPN is
set in the associated process handle or, in a PMAP% JSYS, the PM%EPN
flag is set in AC 3.
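The 18-bit process page number is simply the section number and the page number within the section placed side by side. The Python sketch below shows the packing; it models the number format only, not the JSYS argument words or the FH%EPN and PM%EPN flags, and the function names are assumptions for the example.

```python
PAGE_BITS = 9              # 512 pages per section

def process_page_number(section, page):
    """Combine a section number and a page-within-section number."""
    if not 0 <= section < 1 << 9:
        raise ValueError("section number must fit in 9 bits")
    if not 0 <= page < 1 << PAGE_BITS:
        raise ValueError("page number must fit in 9 bits")
    return (section << PAGE_BITS) | page

def split_process_page_number(ppn):
    """Return (section, page) from an 18-bit process page number."""
    return ppn >> PAGE_BITS, ppn & ((1 << PAGE_BITS) - 1)

# Page 1 of section 3 is process page 3001 (octal).
print(oct(process_page_number(3, 1)))
```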
One cannot create a previously non-existent section by simply writing
into it; one must use PMAP% or SMAP%. Mapping a page via PMAP% into a
previously non-existent section causes a private section to be created
(again, PM%EPN or FH%EPN must be set).
RSMAP% returns the section map information for a given process handle
and section number. XRMAP% is like RMAP%, but accepts an argument
block to return information for multiple groups of pages at once.
SMAP% can do one of four things: map file sections to a process
(mapping process sections to a file is done by PMAP%); map process
sections to a process; create a private section; delete process
sections. SMAP% accepts a source designator in AC1 and a destination
designator in AC2, like PMAP%: the acceptable designators are
JFN,,file section number for files and handle,,section number for
process sections.