EXTENDED ADDRESSING ON THE DECSYSTEM-20
Programming Implications and Methodology
Dan Murphy
The basic architecture of the DECsystem-10 family of processors up
through the KL10 provided an 18-bit, 256K-word addressing space. This
means in particular that:
1. The Program Counter register is 18 bits;
2. Each and every instruction executed computes an 18-bit
effective address. The contents of this address may or may
not be referenced depending on the actual instruction, but
the algorithm for calculating it (including indexing and
indirecting rules) is the same for all instructions.
The above is true regardless of the size of the physical core memory
available on any particular configuration. Note also that paging (or
relocation on earlier processors) will determine if a particular
virtual address corresponds to an existing physical word in memory,
but the fundamental size of the virtual address space is a constant.
During the design of the KL10 processor, the need for a larger virtual
address space was recognized. A design was developed which provides
an extended address space to new programs while still allowing
existing unmodified programs to execute correctly on the same
processor. Although most of the essential data paths were provided in
the original KL10 implementation, various design changes caused actual
availability of extended addressing operation as described here to be
deferred to the "model B" KL10 CPU.
VIRTUAL ADDRESS SPACE
Under the extended addressing design, the virtual address space of the
machine is now 30 bits, or 1,073,741,824 words. Although one can
think of this as a single homogeneous space, it is generally more
useful to consider an address as consisting of two components, the
section number, and the word number.
0     5 6                17 18                   35
- - - - -------------------------------------------
!     !                   !                       !
!     !      section      !         word          !
!     !      12 bits      !        18 bits        !
- - - - -------------------------------------------
The word (more precisely word-within-section) field consists of 18
bits and thus represents a 256K-word address space similar to the
single address space on earlier machines. The section number field is
12 bits and thus provides 4096 separate sections, each of 256K words.
Each section is further divided into pages of 512 words each just as
on earlier machines. The paging facilities allow the monitor to
independently establish the existence and protection of each section
as a unit.
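The decomposition above can be sketched in a few lines of Python. This
is only an illustrative model, not part of the hardware definition; the
names split_address and join_address are hypothetical.

```python
SECTION_BITS = 12      # section number field, bits 6-17
WORD_BITS = 18         # word-within-section field, bits 18-35
PAGE_WORDS = 512       # words per page

def split_address(addr):
    """Split a 30-bit virtual address into
    (section, word, page within section, offset within page)."""
    if not 0 <= addr < 1 << (SECTION_BITS + WORD_BITS):
        raise ValueError("address does not fit in 30 bits")
    section = addr >> WORD_BITS
    word = addr & ((1 << WORD_BITS) - 1)
    page, offset = divmod(word, PAGE_WORDS)
    return section, word, page, offset

def join_address(section, word):
    """Build a 30-bit global address from a section and a word number."""
    return (section << WORD_BITS) | word
```

For example, the address 3001000 (octal) splits into section 3, word
1000 (octal), which is offset 0 of page 1 in that section.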
In order to implement this extended 30-bit address space, the PC must
be extended to hold a full address. The PC is always considered to
consist of a section field and a word field, and the incrementing of
the PC never allows a carry from the word field into the section
field. That is, if a program allows flow of control to reach the last
word of a section and the instruction in that location is not a jump,
then the PC will "wrap-around", and the next instruction fetched will
be word 0 of the same section. This will in fact be AC 0 as described
below. The consequence of this is that flow of control of a program
cannot implicitly cross section boundaries. In general, it would be
a programming error to allow the PC to reach the last location of a
section and execute a non-jump instruction or to execute a skipping
instruction in either of the last two locations of a section.
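The wrap-around rule can be modelled as follows. This is a minimal
sketch under the assumption that the PC is held as a single 30-bit
integer; next_pc is a hypothetical name, and the skip argument stands
for an instruction that skips the following word.

```python
WORD_MASK = 0o777777   # low 18 bits: word within section

def next_pc(pc, skip=False):
    """Advance the PC by 1 (or by 2 for a skip) without ever carrying
    from the word field into the section field."""
    section = pc & ~WORD_MASK
    word = (pc + (2 if skip else 1)) & WORD_MASK
    return section | word
```

With this model, a non-jump instruction at 3777777 (octal) is followed
by location 3000000, i.e., word 0 of the same section, and never by
4000000.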
COMPATIBILITY
In order to allow efficient use of the extended address space, it was
necessary to modify the operation of various machine instructions and
to change the algorithm for the calculation of effective addresses.
Because these changes have a high probability of causing any existing
program to malfunction, the following convention was adopted:
If a program is executing in section 0,
all instructions are executed exactly as
they would be on a non-extended machine.
If a program is executing in any section
other than 0, the extended addressing
algorithms are used for effective
address calculation and instruction
execution.
A program is said to be executing in section 0 when the section field
of the PC contains 0. Effective address calculations and instruction
executions are performed exactly as on a non-extended machine; hence
any existing program will work correctly if run in section 0. Note
however, that this also implies that no addresses outside of section 0
can be generated, either for data references or for jumps. That is, a
program executing in section 0 cannot leave section 0 except via a
monitor call.*
It is easy however to write or generate code which is compatible with
both extended and non-extended execution. This was a specific goal of
the design, and in general requires only that certain precautions must
be taken regarding previously unused fields. Since these precautions
must be taken however, one cannot generally assume that any existing
program will have observed them and so execute correctly in an
extended section.
- - - - - - - - - - - - - - - - -
*The only exception to this is the XJRSTF instruction.
EFFECTIVE ADDRESS COMPUTATION
Unless explicitly stated otherwise, everything in the following
discussion refers to execution of instructions with the PC in a non-0
section; nothing applies to instructions executed from section 0.
1. Instruction format
The format of a machine instruction is the same as on a
non-extended machine. In particular, the effective address
computation is dependent on three quantities from the
instruction, the Y (address) field, the X (index) field, and
the I (indirect) field. These are 18 bits, 4 bits, and 1 bit
respectively.
0        8 9  12 13 14 17 18                    35
--------------------------------------------------
!         !     !  !     !                       !
! OPCODE  ! AC  ! I!  X  !           Y           !
!         !     !  !     !                       !
--------------------------------------------------
Depending on the format of the index and indirect words, the
effective address algorithm will perform either 18-bit or
30-bit address computations. When a 30-bit quantity is
indicated, an explicit section number is being specified and
the address is called a global address. When an 18-bit
quantity is indicated, the section field is defaulted from
some other quantity (e.g., the PC), and the address is thus
local to the default section and is called a local address.
In the simplest case, consider an instruction which specifies
no indexing or indirection. E.g.,
3,,400/ MOVE T,1000
Here the effective address computation yields a local address
1000, and the section used for the reference is 3, i.e., the
section from which the instruction itself was fetched.
Precisely stated:
Whenever an instruction or indirect word
specifies a local address, the default
section is the section from which the
word containing that address was taken.
This means that the default section will change during the
course of an effective address calculation which uses
indirection. The default section will always be the section
from which the last indirect word was fetched.
2. Indexing
The first step in the effective address calculation is to
perform indexing if specified by the instruction. The
calculation performed depends on the contents of the
specified index register:
1. If the left half of the contents of X is negative or 0,
the right half of X is added to Y (from the instruction
word) to yield an 18-bit local address.
This is consistent with indexing on a non-extended
machine, and means, for example, that the usual AOBJN and
stack pointer formats can be used for tables and stacks
which are in the same section as the program.
2. If the left half of the contents of X is positive and
non-0, the entire contents of X are added to Y (sign
extended) to yield a 30-bit global address.
This means that index registers may be used to hold
complete addresses which are referenced via indexed
instructions. A Y field of 0 will commonly be used to
reference exactly the address contained in X. Small
positive or negative offsets (magnitude less than 2**17)
may also be specified by the Y field, e.g., for
referencing data structure items in other sections.
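The two indexing cases can be sketched as a small Python function.
This is an illustrative model only: index_address and sign_extend_18
are hypothetical names, words are held as 36-bit unsigned integers,
and masking the global result to 30 bits is an assumption of the
sketch.

```python
HALF = 0o777777           # mask for an 18-bit half word
MASK30 = (1 << 30) - 1    # mask for a 30-bit global address

def sign_extend_18(y):
    """Interpret an 18-bit Y field as a signed offset."""
    return y - (1 << 18) if y & 0o400000 else y

def index_address(y, x_contents, default_section):
    """Apply indexing for an instruction executed in a non-0 section.
    Returns (is_global, 30-bit address)."""
    left = x_contents >> 18
    negative = bool(x_contents & (1 << 35))
    if negative or left == 0:
        # Case 1: local. Add right half of X to Y, keep 18 bits,
        # and take the section from the default.
        word = (y + (x_contents & HALF)) & HALF
        return False, (default_section << 18) | word
    # Case 2: global. Add all of X to the sign-extended Y.
    return True, (x_contents + sign_extend_18(y)) & MASK30
```

For instance, an AOBJN pointer such as -3,,2 with Y = 1000 in section
3 gives the local address 3001002, while an index register holding
4,,1000 with Y = 777777 (that is, -1) gives the global address
4000777.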
3. Indirection
If indirection is specified by the instruction, an indirect
word is fetched from the address determined by Y and indexing
(if any). The indirect word is considered to be "instruction
format" if bit 0 is a 1, and "extended format" if bit 0 is a
0.
1. Instruction format indirect word (IFIW):
This word contains Y, X, and I fields of the same size
and in the same position as instructions, i.e., in bits
13-35. Bit 1 must be 0 (its use is reserved for future
hardware); bits 2-12 are not used. The effective
address computation continues with the quantities in this
word just as for the original instruction. That is,
indexing may be specified and may be local or global
depending on the left half of the index, and further
indirection may be specified. Note that the default
section for any local addresses produced from this
indirect word will be the section from which the word
itself was fetched.
2. Extended Format Indirect Word (EFIW):
This word contains Y, X, and I fields also, but in a
different format such as to allow a full 30-bit address
field.
 0  1  2   5 6            17 18                  35
----------------------------------------------------
!  !  !     !               !                      !
!  !  !     ! <-------------!--------------------> !
!0 ! I!  X  !   (section)   !       Y (word)       !
!  !  !     !               !                      !
----------------------------------------------------
If indexing is specified in this indirect word, the
entire contents of X are added to the 30-bit Y to
produce a global address. A local address is never
produced by this operation, and the type of operation is
not dependent on the contents of X. Hence, either Y or
C(X) may be used as an address or an offset within the
extended address space just as is done in the 18-bit
address space.
If further indirection is specified, the next indirect
word is fetched from Y as modified by indexing if any.
The next indirect word may be instruction format or
extended format, and its interpretation does not depend
on the format of the previous indirect word.
4. Some examples
1. Simple instruction reference within the current PC
section:
3,,400/ MOVE T,1000 ;fetches from 3001000
JRST 2000 ;jumps to 3002000
2. Local tables may be scanned with standard AOBJN loops:
        MOVSI   X,-SIZ          ;X := -SIZ,,0 (negative left half)
LP:     CAMN    T,TABL(X)       ;TABL in current section
        JRST    FOUND
        AOBJN   X,LP            ;step both halves, loop until done
3. Global tables may be scanned with full address and
index:
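        MOVEI   X,0             ;X := 0 (an offset, not an address)
LP:     CAMN    T,@[EFIW TABL,X] ;TABL(X) in ext fmt
        JRST    FOUND
        CAIGE   X,SIZ-1         ;skip if X has reached the last entry
        AOJA    X,LP            ;else increment X and loop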
4. Subroutine argument pointer may be passed to subroutine
in another section:
word in arglist:
IFIW @VAR(X) ;indexing and indirecting
;if specified will be relative
;to the section in which this
;pointer resides, i.e., the
;section of the caller
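The complete calculation (indexing followed by any number of indirect
words) can be modelled as below. This is a sketch under stated
assumptions, not a definition of the hardware: effective_address is a
hypothetical name, memory is a Python dict from 30-bit address to
36-bit word, the ACs are a 16-element list, and AC addressing, page
faults, and section 0 behavior are all ignored.

```python
HALF = 0o777777
MASK30 = (1 << 30) - 1

def effective_address(mem, acs, pc_section, i, x, y):
    """Return the 30-bit effective address for an instruction with
    fields I, X, Y fetched from pc_section (assumed non-0)."""
    section = pc_section      # default section for local addresses
    extended = False          # True while the current word is an EFIW
    while True:
        if extended:
            # EFIW: Y is already 30 bits; indexing is always global.
            addr = y
            if x:
                addr = (addr + acs[x]) & MASK30
        else:
            # Instruction or IFIW: local unless a global index is used.
            addr = (section << 18) | (y & HALF)
            if x:
                c = acs[x]
                if c & (1 << 35) or (c >> 18) == 0:
                    addr = (section << 18) | ((y + c) & HALF)
                else:
                    sy = y - (1 << 18) if y & 0o400000 else y
                    addr = (c + sy) & MASK30
        if not i:
            return addr
        word = mem[addr]          # fetch the indirect word
        section = addr >> 18      # new default: where it was fetched
        if word & (1 << 35):      # bit 0 = 1: instruction format
            extended = False
            i, x, y = (word >> 22) & 1, (word >> 18) & 0o17, word & HALF
        else:                     # bit 0 = 0: extended format
            extended = True
            i, x, y = (word >> 34) & 1, (word >> 30) & 0o17, word & MASK30
```

Run against example 4 above, with AP holding the global address of an
arglist in section 3 and the callee executing in section 5, the model
resolves an IFIW argument pointer to an address in section 3, the
section in which the pointer itself resides.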
IMMEDIATE INSTRUCTIONS
All effective address computations yield a 30-bit address defaulting
the section if necessary, as described above. Immediate instructions
use only the low-order 18 bits of this address as their operand, however,
setting the high-order 18 bits to 0. Hence, instructions such as
MOVEI, CAI, etc. produce identical results regardless of the section
in which they are executed.
Two immediate instructions are implemented which do retain the section
field of their effective addresses.
1. XMOVEI (opcode 415) Extended Move Immediate:
This instruction loads the entire 30-bit effective address
into the designated AC, setting bits 0-5 to 0. If no
indexing or indirection is specified, the current PC section
will appear in the section field of the result. This
instruction would replace MOVEI in those cases where an
address (rather than a small constant) is being loaded, and
the full address is needed.
Example: calling a subroutine in another section (assuming
arglist in same section as caller):
XMOVEI AP,ARGLIST
PUSHJ P,@[EFIW SUBR]
The subroutine could reference arguments as:
MOVE T,@1(AP)
or could construct argument addresses by:
XMOVEI T,@2(AP)
In both cases, the arglist pointer would be found in the
caller's section because of the global address in AP. The
actual section of the effective address is determined by the
caller, and is implicitly the same as the caller if an IFIW
is used as the arglist pointer, or is explicitly given if an
EFIW is used.
2. XHLLI (opcode 501)
This instruction replaces the left half of the designated
accumulator with the section number of its effective address.
It is convenient where global addresses must be constructed.
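The difference between the ordinary and the extended immediate
instructions can be illustrated with a short Python model. The
function names are hypothetical, each function takes an already
computed 30-bit effective address, and any special cases not described
in this memo are ignored.

```python
HALF = 0o777777
MASK30 = (1 << 30) - 1

def movei(ea):
    """MOVEI: keep only the low-order 18 bits; left half becomes 0."""
    return ea & HALF

def xmovei(ea):
    """XMOVEI: keep the full 30-bit address; bits 0-5 become 0."""
    return ea & MASK30

def xhlli(ac, ea):
    """XHLLI: replace the left half of the AC with the section number
    of the effective address; the right half is unchanged."""
    section = (ea >> 18) & 0o7777
    return (section << 18) | (ac & HALF)
```

Given the effective address 3001000 (octal), movei yields 1000 while
xmovei yields 3001000; xhlli applied to an AC whose right half is 123
yields 3,,123.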
AC REFERENCES
Any reference to a local address in the range 0-17(8) will be made to
the hardware ACs. Also, any global reference to an address in section
1 in the range 0-17(8) (i.e., 1000000-1000017) will be made to the
hardware ACs. Global references to locations 0-17(8) in any section
other than section 1 will reference memory. Thus:
1. Simple addresses in the usual AC range will reference ACs
as expected, e.g., MOVE 2,3 will fetch from hardware AC 3
regardless of the current section.
2. To pass a global pointer to an AC, a section number of 1 must
be included.
3. Very large arrays may lie across section boundaries; they
will be referenced with global addresses which will always go
to memory, never to the hardware ACs.
4. PC references are always considered local references; hence
a jump instruction which yields an effective address of 0-17
in any section will cause code to be executed from the ACs.
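The AC reference rule for data references reduces to a small
predicate. The following Python sketch uses a hypothetical name,
references_ac, and assumes the caller already knows whether the
address was produced as a local or a global address.

```python
def references_ac(addr, is_global):
    """Return True if a data reference to this 30-bit address goes to
    the hardware ACs rather than to memory."""
    section, word = addr >> 18, addr & 0o777777
    if word > 0o17:
        return False              # outside the AC range entirely
    # Local references in 0-17 always reach the ACs; global ones
    # reach them only when the section number is 1.
    return (not is_global) or section == 1
```

Thus a global reference to 1000003 (octal) reaches AC 3, while a
global reference to 3000003 goes to memory.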
SPECIAL CASE INSTRUCTIONS
In addition to the differences in effective address calculation,
certain instructions are affected in other ways by extended
addressing.
1. Instructions which store/load the PC: PUSHJ, POPJ, JSR, JSP.
These instructions store a 30-bit PC without flags and with
bits 0-5 of the destination word set to 0. POPJ restores the
entire 30-bit PC from the stack word. JSA and JRA are not
affected by extended addressing and store/load only 18 bits
of PC. Hence they are not useful for inter-section calls.
2. Stack instructions (PUSHJ, POPJ, PUSH, POP, ADJSP) use a
local or global address for the stack according to the
contents of the stack register following the same convention
as for indexing. That is, if the left half of the stack
pointer is 0 or negative (prior to incrementing or
decrementing), a local address using the right half of the
stack pointer is computed and used for the stack reference.
If the left half of the stack pointer is positive and non-0,
the entire word is taken as a global address. In the latter
case, incrementing and decrementing of the stack pointer is
done by adding or subtracting 1, not 1000001. Hence, a
global stack for routines in many sections may be used in a
similar manner to present stacks. Stack overflow and
underflow protection would be done by making the pages before
and after the stack inaccessible, since a space count field is not
present in a global stack pointer.
3. Byte instructions. Two formats of byte pointer are
recognized by the byte instructions. The non-extended format
is identical to the present standard byte pointer format and
is recognized if bit 12 (previously unused) is 0. If bit 12
is 1, a two-word extended byte pointer format is recognized
which contains the fields as shown:
 0     5 6    11 12 13  17 18                       35
-------------------------------------------------------
!       !       !  !      !                           !
!   P   !   S   ! 1! MBZ  !      AVAIL TO SFTW.       !
!       !       !  !      !                           !
-------------------------------------------------------
!  !  !     !             !                           !
!  ! I!  X  ! <-----------! Y ----------------------> !
!  !  !     !             !                           !
-------------------------------------------------------
 0  1  2   5                                        35
Note that the second word is identical to the Extended Format
Indirect Word (EFIW). The right half of the first word is
specifically reserved to software for byte counts, etc.
Incrementing of this format of byte pointer is consistent
with non-extended format; the P field is reduced until the
end of the word is reached, whereupon the address in the
second word is incremented. Incrementing may cause a carry
from the word field to the section field of the address;
hence extended byte arrays may lie across section boundaries.
4. Other new or modified instructions are LUUOs, BLT, XBLT, XCT,
XJRSTF, XJEN, XPCW, XSFM. Some of these are valid only in
exec mode. Consult the System Reference Manual or chapter
2.2 of the KL10 Functional Specifications for details.
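The stack pointer convention in item 2 above can be sketched for the
PUSH case as follows. This is a model under assumptions: push_pointer
is a hypothetical name, the default section for a local stack is
taken to be the section supplied by the caller of the function, and
the local case is modelled by adding 1 to each half separately.

```python
HALF = 0o777777
MASK30 = (1 << 30) - 1
MASK36 = (1 << 36) - 1

def push_pointer(sp, section):
    """Return (new stack pointer, 30-bit address written) for a PUSH
    executed in a non-0 section."""
    left = sp >> 18
    if sp & (1 << 35) or left == 0:
        # Local stack: count in the left half, 18-bit address in the
        # right half; both halves are stepped (the 1000001 case).
        new_left = (left + 1) & HALF
        new_right = ((sp & HALF) + 1) & HALF
        return (new_left << 18) | new_right, (section << 18) | new_right
    # Global stack: the whole word is an address; add 1 only.
    new_sp = (sp + 1) & MASK36
    return new_sp, new_sp & MASK30
```

A global pointer of 4,,777777 therefore advances to 5,,0, crossing
the section boundary, whereas a local pointer never leaves its
section.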
COMPATIBLE PROGRAMMING
It is possible to generate code which works correctly in both extended
and non-extended environments. Such code may even utilize
inter-section references when running in an extended environment; it
is not limited to local addressing.
The basic rule is to observe the extended addressing rules in the
construction or computation of:
1. Index words: be sure the left half is cleared or contains a
negative quantity (e.g., a count) when setting up and using
an index register. This will cause a local reference in the
extended environment which will produce the same result as on
the non-extended machine.
2. Indirect words: always set bit 0 of indirect words used in
argument lists, etc. so as to produce local addresses.
3. Stack pointers: the most common present format (negative
count in left half) works consistently under extended
addressing; modifying a program to use an extended stack
should require no change to stack instructions except if the
code expects to find the processor flags in the stacked PC
words.
When generating addresses to be passed to subroutines or used by other
code, XMOVEI and XHLLI instructions may be used. In the non-extended
environment, these opcodes are SETMI (equivalent to MOVEI) and HLLI
respectively, which always load 0 into the left half.
PROGRAM ARCHITECTURE AND FACILITIES
Ultimately, the extended addressing hardware in conjunction with the
monitor should provide some of the power of general segmentation and
some other useful conventions and facilities. The following are some
of the ideas which have been advanced.
1. The fact that instructions are generally executed relative to
the current section suggests that subroutine libraries and
facilities packages can be written which can be dynamically
loaded into any available section and used by many programs.
An important point to observe in connection with this and
most of the following conventions is that programs must not
be compiled with fixed section numbers built into the code.
Programs should be built so as to be loadable into any
section and to request additional section allocations from
the monitor as necessary. This is the only reasonable way to
ensure that conflicting use of particular sections is
avoided.
2. An entire section can be mapped by the monitor as quickly as
a single page. Hence, an entire file (up to 256K) can be
mapped into a section, and the data therein manipulated with
ordinary instructions.
3. Programs can greatly reduce memory management logic by merely
assuming very large sizes for all data bases. It is not
however, very efficient to use many sections, each with only
a small amount of data. A reasonable middle ground should be
chosen.