Trailing-Edge - PDP-10 Archives - bb-jr93d-bb - 7,6/ap016/mon703.d16
MCO: 13067		Name: TL		Date: 31-Aug-86:14:43:32


[Symptom]
SETCLK utility causes UAF crash on system with malfunctioning
or missing TCU.  

[Diagnosis]
PF.IOP is treated as an absolutely fatal error, on the assumption
that the monitor did it.  In fact, J. Random privileged user (e.g., a user
who can get User IO) can do IO instructions.  Should the user look for a
non-responding device, a UAF results.

If the user messes up, the system certainly can continue.

[Cure]
Treat PF.IOP pagefails as "illegal memory references" if they come
from user jobs.  Otherwise, UAF as before.  Since a Unibus isn't really
memory, disallow APRENB trapping using AP.ILM to avoid confusing older
programs.  Allow PSI (.PCIMR) trapping of this condition, in which case
the data item (PFW) tells the user the IO address that failed.
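
The decision described in this cure can be sketched as follows.  This is a
toy model in Python, not the monitor code; the function name, its arguments
and the returned strings are hypothetical.

```python
def handle_io_page_fail(from_user_job, psi_enabled):
    """Return the action taken for a PF.IOP page fail (toy model)."""
    if not from_user_job:
        return "UAF"                    # monitor context: still fatal, as before
    if psi_enabled:
        return "PSI trap"               # user handler is told the failing IO address
    return "illegal memory reference"   # job gets an error; the system continues
```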

[Comments]
I did this for 7.01, and thought I MCO'd it long ago.  
No documentation impact, as the new behavior is as consistent with the
documentation as the old behavior was.

[Keywords]
PF.IOP
UAF
Don't

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
Restricted distribution

[MCO attributes]
New development MCO
KS10 only

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
703A		KSSER	SEILMU,SEPDLO
704		APRSER	SEILMU


[End of MCO 13067]

MCO: 13069		Name: KBY		Date:  1-Sep-86:14:50:51


[Symptom]
Swapper hung during migration.

[Diagnosis]
<Inhale>
If a virtual job has JXPN set in the middle of a UUO and we then
decide to try to process its bad pages, PFHMIG will never complete as it won't
change the job's working set until it's SIMCHK-able, but it will never get in
that state since JXPN is set, and we'll never swap it out and expand it since
we check for migration before almost everything else in the swapper.

[Cure]
If JXPN is set for the job we're trying to migrate, go to CHKXPN instead
of FLGNUL if the job is not SIMCHKable.

[Comments]

[Keywords]

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
Restricted distribution

[MCO attributes]
None

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	314	SCHED1	CHKM1A
703A	


[End of MCO 13069]

MCO: 13072		Name: JAD		Date:  2-Sep-86:13:55:25


[Symptom]
DI hangs after a hung or offline disk (multi-CPU configurations only).

[Diagnosis]
FILIO decides to start I/O at UUO level.  Before it set up the UDB, KDB,
etc., FILIO did a "DSKOFF" which turned off the disk PI channel.  If the
call to the device-dependent  driver  takes  the  error  return,  STRTIO
branches  to  BADUNI.   BADUNI  notices  the  call  was at UUO level and
decides it can type  a  disk  offline  message  to  the  operator.   The
"DSKOFF" is still in effect at this time.

When the various text output routines  are  called  they  dispatch  into
SCNSER.   SCNSER does a "SCNOFF" to turn off the scanner PI channel.  In
a multi-CPU configuration, "SCNOFF" also turns off the disk PI  channel.
Eventually  SCNSER does a "SCNON" to turn on the scanner PI channel.  In
a multi-CPU configuration, "SCNON" also turns on the  disk  PI  channel.
Consequently,  the  PI system state after a "SCNON" may not match the PI
system state before a "SCNOFF".

After BADUNI completes FILIO attempts to start another I/O.  If  a  disk
interrupt should happen to occur during this time FILSER's database will
not  be  protected  against  multiple   accesses.    Under   the   right
circumstances the UUO level start I/O will get overlaid by the interrupt
level start I/O, causing the job which was starting I/O at UUO level  to
hang forever waiting for I/O completion which will never occur since the
I/O request never got fully queued.

[Cure]
Teach the SCNOFF support code in CPNSER (LOKSCI) to read the  PI  system
state and to only enable those channels at SCNON which were turned on at
the time of the SCNOFF.
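
A minimal sketch of the save-and-restore idea, in Python rather than
MACRO-10.  The PISystem class and the channel names are assumptions made
for illustration; only the ordering (remember what was on, re-enable only
that) comes from the text above.

```python
SCANNER, DISK = "scanner", "disk"

class PISystem:
    """Toy model: the set of PI channels that are currently enabled."""
    def __init__(self, enabled):
        self.enabled = set(enabled)

def scnoff(pi):
    """Turn off both channels and return the ones that were on."""
    saved = pi.enabled & {SCANNER, DISK}
    pi.enabled -= {SCANNER, DISK}
    return saved

def scnon(pi, saved):
    """Re-enable only the channels that were on at SCNOFF time."""
    pi.enabled |= saved
```

If the disk channel was already off (an earlier "DSKOFF"), it stays off
after scnon(), so the state after matches the state before.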

[Comments]
The reasoning behind this came from Eugene at Copley.  I
checked it out and sure enough, there was a bug.

[Keywords]
DI HANGS
SCNOFF

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
None

[MCO attributes]
New development MCO
KL10 only
Multi CPU only

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	315	COMMON	.CPSCI
		CPNSER	LOKSCI,ULKSCI

703A	


[End of MCO 13072]

MCO: 13101		Name: JMF		Date:  7-Oct-86:08:21:36


[Symptom]
Paging I/O counts are wrong (more reads than writes show up on a
unit).

[Diagnosis]
IO is wrong in S at DONE when the swapout of a paging queue
completes. Don't know why that's true, but no one cares except the test for
updating counts.

[Cure]
Test SL.DIO instead.

[Comments]
SCCed to BLKK:[1,2].

[Keywords]
Paging I/O counts
SYSTAT
SYSDPY

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
None

[MCO attributes]
New development MCO
VM only

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	321	VMSER	DONE

703A	


[End of MCO 13101]

MCO: 13111		Name: KBY		Date: 20-Oct-86:08:54:19


[Symptom]
Clearing the PM.CSH bit for the low seg doesn't work

[Diagnosis]
Drugs?

[Cure]
Fix code.

[Comments]
Something Eugene's program does...

[Keywords]
LOCK
CACHE

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
Restricted distribution

[MCO attributes]
None

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	323	VMSER	SETCSH
703A	


[End of MCO 13111]

MCO: 13113		Name: KBY		Date: 20-Oct-86:09:07:34


[Symptom]
Strange things might happen if FTCIDISK is off and FTMP is off
(or something like that...)

[Diagnosis]
Conditional wrong

[Cure]
Fix conditional.

[Comments]
Tnx to Spider.  The conditional is in SWPSCN, but I don't
think it's significant.

[Keywords]

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
Restricted distribution

[MCO attributes]
None

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	323	VMSER	SWPSCN
703A	


[End of MCO 13113]

MCO: 13121		Name: KBY		Date: 26-Oct-86:14:52:13


[Symptom]
Various security holes in PAGE. UUO:
1.  It's possible to create "indirect section loops".  The monitor KAFs
    the next time it tries to resolve an indirect section pointer contained
    in the loop.
2.  With the advent of IDC, the monitor gets confused if you try to delete
    (for example) pages 2, 3, and 4, and create pages 3, 4, and 5.  It
    will wind up putting physical pages 0 and 1 in the user's address
    space for virtual pages 4 and 5 (I think).
3.  Monitor too big.

[Diagnosis]
1.  No code to check for loops.
2.  Monitor sees it wants to delete pages 2, 3, and 4 (which exist), then
    sees it wants to create 3, 4, and 5 with IDC lit.  Since 3 and 4
    already exist when the list is scanned, we say they're already created
    and allocate only one page to add to the user's PAGTAB chain.  Previous
    to IDC this couldn't happen because pages 3 and 4 would have caused
    the create part of the PAGE. UUO to fail since they already existed.
    The monitor then processes the deletes for all 3 pages so that when
    it gets around to making sure that pages exist for pages 3, 4, and 5,
    it really needs 3 pages and falls off the end of the pages it allocated.
    Similar things can happen for PAGE IN/OUT.
3.  SCNPTB isn't used anymore.

[Cure]
1.  Add (lots of) code to check for loops.  Force the target sections
    (i.e. right half of the argument word) to be in monotonically increasing
    order, analogous to other PAGE. UUO functions so that the checking
    is a little (but not much) easier.
2.  Although individual cases I might think of may be fixable by just
    not assuming the correct number of pages are always pre-allocated
    through PLTSN (actually through the number returned by PLTSN), it
    seems safer to just restrict the arg list to be strictly monotonically
    increasing.  Current behaviour is that the creates must be increasing
    and the deletes must be increasing, but not that the entire list
    must be increasing.  However, this anomaly hasn't ever been documented
    and I can't think of anyone who uses it.  Restricting the list to
    be monotonically increasing means the user can't do two things to
    the same page in one arg list.
3.  REPEAT 0
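
The restriction in item 2 amounts to a single pass over the page numbers
in the argument list.  A sketch in Python; the function name is
hypothetical and the real check lives in the monitor's argument scanning.

```python
def is_strictly_increasing(pages):
    """True if every page number is greater than the one before it."""
    return all(a < b for a, b in zip(pages, pages[1:]))
```

With this rule the list from the symptom (delete 2, 3, 4 then create
3, 4, 5) is rejected, since page 3 follows page 4.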

[Comments]
(1) isn't as easy to fix as it sounds.

[Keywords]
Speer Attn.

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
Restricted distribution

[MCO attributes]
Documentation change

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	324	VMSER	PLTSN,CHGSEC
703A	


[End of MCO 13121]

MCO: 13132		Name: JAD		Date: 10-Nov-86:15:38:25


[Symptom]
Doing a USETO UUO to extend a file doesn't zero the intervening blocks.
Only the first "n" blocks (cluster size minus one) of each group get zeroed.

[Diagnosis]
Slightly over-trusting programmer thought when the monitor did I/O it
would actually write the number of blocks requested.  Silly programmer.
The monitor decided to write to the end of a group, leaving any blocks
between the end of the group and the end of the file in an "indeterminate"
state.

[Cure]
Get the actual number of blocks written rather than assuming (in blind
faith) that SETDMP will do the entire I/O word in one swell foop.  Use
that number to figure out how many blocks are left to zero.
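
A sketch of the corrected loop, in Python.  The write_zeros callback
stands in for the I/O call and is assumed to return the number of blocks
it actually wrote; make_group_writer is a hypothetical test double that
stops at the end of each group, as described in the diagnosis.

```python
def zero_fill(first_block, count, write_zeros):
    """Zero `count` blocks, trusting only the count each write reports."""
    block, left = first_block, count
    while left > 0:
        done = write_zeros(block, left)
        if done <= 0:
            raise RuntimeError("write made no progress")
        block += done
        left -= done

def make_group_writer(zeroed, group_size):
    """Fake writer that never crosses a group boundary in one call."""
    def write_zeros(start, count):
        end_of_group = (start // group_size + 1) * group_size
        done = min(count, end_of_group - start)
        zeroed.update(range(start, start + done))
        return done
    return write_zeros
```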

[Comments]
This happened between 7.03 and the first update tape.  Raw PCO time.

Probably should be a CMCO since it can possibly be regarded as a security
hole (reading stale data from a disk after someone deletes a file?).

[Keywords]
USETO
ZERO FILL

[Related MCOs]
12501

[Related SPRs]
None

[MCO status]
Restricted distribution

[MCO attributes]
PCO required

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	326	FILIO	USET9A

703A	


[End of MCO 13132]

MCO: 13133		Name: JMF		Date: 12-Nov-86:10:52:07


[Symptom]
PFNOIO Stopcd

[Diagnosis]
PFH won't page out pages at the end of the IOWD so it can page
in the pages at the beginning of the IOWD so it can do the I/O.

[Cure]
Page out pages at the end of the IOWD if it is necessary to break down
the IOWD.

[Comments]
SCC'ed to BLKK:[1,2]. Attn. Spider for CH2M.

[Keywords]
PFNOIO

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
Restricted distribution

[MCO attributes]
New development MCO

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	326	MONPFH	PFHBRK
703A	


[End of MCO 13133]

MCO: 13138		Name: KBY		Date: 16-Nov-86:13:30:05


[Symptom]
Stopcode SNO

[Diagnosis]
A sharable high seg which exists only in some user's saved
context but which is swapped on a unit from which swap space is being
migrated doesn't have a findable owner using the monitor's current
algorithms.

[Cure]
Don't worry if we can't find the owner; no one really cares.
Remove the stopcode and fix a few paranoid routines.

[Comments]
I haven't actually tried to migrate something with this, but
it's the right thing to do even if it doesn't yet work quite right.

[Keywords]
migrate

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
Restricted distribution

[MCO attributes]
None

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	326	SCHED1	CHKMIG
703A	


[End of MCO 13138]

MCO: 13140		Name: JAD		Date: 17-Nov-86:11:12:06


[Symptom]
PDL overflow doing lots of I/O to a CI disk.

[Diagnosis]
The virtual circuit ran out of credits (we never thought this
would happen).  When we get to BADUNC we call RTNDRB to return the
DRB.  RTNDRB calls GIVDRB, which can wind up calling CRNPOS to start
up another I/O operation.  This one will probably also fail due to
no credits, so we start a PDL excursion, eventually overflowing.
There is actually code in BADUNC to light the "no credits" flag in
the KDB so CRNPOS doesn't try to start I/O, but unfortunately the
code is AFTER the call to RTNDRB.

[Cure]
Light the "no credits" flag BEFORE calling RTNDRB.
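
Why the order matters can be shown with a toy model in Python.  The class
and function names are hypothetical; return_request stands in for the
RTNDRB/GIVDRB/CRNPOS path and fail_start for the BADUNC path.

```python
class Controller:
    def __init__(self, queued):
        self.no_credits = False
        self.queued = queued        # requests still waiting to start
        self.depth = 0
        self.max_depth = 0          # deepest nesting seen

def return_request(kdb, flag_first):
    """Give back the request block; may try to start the next request."""
    if kdb.no_credits or kdb.queued == 0:
        return
    kdb.queued -= 1
    fail_start(kdb, flag_first)     # the next start fails too: no credits

def fail_start(kdb, flag_first):
    """A start that failed for lack of credits."""
    kdb.depth += 1
    kdb.max_depth = max(kdb.max_depth, kdb.depth)
    if flag_first:
        kdb.no_credits = True       # the cure: flag set BEFORE returning
    return_request(kdb, flag_first)
    kdb.no_credits = True           # the old code set the flag only here
    kdb.depth -= 1
```

With the flag set first the nesting depth stays at 1; with the old order
it grows with the number of queued requests.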

[Comments]
More hazardous waste from Pan Am's CLD (pronounced "CLOD").

[Keywords]
CREDITS
PDL OVERFLOW

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
None

[MCO attributes]
New development MCO
KL10 only

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	326	FILIO	BADUNC

703A	


[End of MCO 13140]

MCO: 13143		Name: RCB		Date: 17-Nov-86:21:57:29


[Symptom]
Jobs stuck in the command-wait queue.  This is usually reported as being stuck
in RN state, since CMWQ is a sub-queue of PQ1.  It happens most frequently to
swapped-out users with user-defined commands, but will also happen to users who
type control-R too many times in too rapid succession or when the
clock-level NETSER interlock is not available.

[Diagnosis]
Use of DLYCM to either bring a swapped-out job into core or to defer a
command.  DLYCM will always put the job into CMWQ, and the job then needs to be
requeued in order to remove it from that queue.

[Cure]
Make sure that DLYCM is not called merely to defer a command.  Make DLYCM1
global, and call it instead.  When it is necessary to bring a job into core,
and thus we have to call DLYCM, make sure that we will perform the requeueing
needed to allow the job to run normally again.

[Comments]
SCCed to BLKK:[1,2].

I finally got CH2M-Hill to test this for me, and Mike says that he went from 20
such jobs a day to none.  Seems like it works.

[Keywords]
CMWQ
Command wait

[Related MCOs]
None

[Related SPRs]
35040, 35587, 35604

[MCO status]
Checked

[MCO attributes]
HOSS attention

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	326	COMCON	COMGO,DLYCM1
703A		SCNSER	RETYPE
		NETSER	NTLCKC
		FILIO	XCHDSW


[End of MCO 13143]

MCO: 13145		Name: RCB		Date: 18-Nov-86:04:50:53


[Symptom]
STOPCD SICDNA after swap read errors.

[Diagnosis]
Assuming that we have to be in core to kill off a hiseg.

[Cure]
Recognize this case.  Just go back to where we detect that the segment should
be made dormant (or maybe even deleted) when we find that we were the last
sharer, even if we didn't have the segment in core.

This completely removes the SICDNA stopcode.

[Comments]
SCCed to BLKK:[1,2].

[Keywords]
Swap read error
You lose your mind

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
Checked

[MCO attributes]
Documentation change
HOSS attention

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	326	SEGCON	SCNLOG,KILFN0

703A	


[End of MCO 13145]

MCO: 13146		Name: RCB		Date: 18-Nov-86:05:09:24


[Symptom]
Terminals hang in TO or TI, even though there is something for the driver to do.

[Diagnosis]
LDLIDL gets set long after we determine that there are no characters, and not
under the same incarnation of the SCNSER interlock.

[Cure]
Don't release the interlock in between.  That way, UUO level and interrupt
level can't sneak things past each other at inopportune moments.
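
A sketch of the idea in Python: the emptiness test and the idle flag
update happen under one hold of the lock.  The Line class, its fields and
both function names are assumptions for illustration; `idle` plays the
role of LDLIDL.

```python
import threading

class Line:
    def __init__(self):
        self.lock = threading.Lock()   # stands in for the SCNSER interlock
        self.chars = []
        self.idle = False

def driver_finish(line):
    """Driver side: take a character, or mark the line idle."""
    with line.lock:                    # one hold covers test and flag update
        if line.chars:
            return line.chars.pop(0)
        line.idle = True
        return None

def queue_char(line, ch):
    """Producer side: queue a character, report if a restart is needed."""
    with line.lock:
        line.chars.append(ch)
        restart = line.idle
        line.idle = False
    return restart
```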

[Comments]
SCCed to BLKK:[1,2]

[Keywords]
Hung terminal

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
Checked

[MCO attributes]
New development MCO
HOSS attention

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	326	SCNSER	LOTS
703A	


[End of MCO 13146]

MCO: 13147		Name: RCB		Date: 18-Nov-86:05:55:46


[Symptom]
Fix some minor annoyances in NRTSER:
	1)  Sometimes we drop the link without sending the unbind message.
	2)  NETOP. can return junk for a node name when we don't have the
		remote node defined.

[Diagnosis]
	1)  Send the data, then wait before performing the synchronous
		disconnect.
	2)  Don't expect random routines to preserve T1.  Save data in P2
		instead.

[Cure]
Yes.

[Comments]
SCCed to BLKK:[1,2]

[Keywords]
Connection aborted
Trashy INITIA banner

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
Checked

[MCO attributes]
New development MCO

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	326	NRTSER	CTHRLS,NTDID1
703A	


[End of MCO 13147]

MCO: 13152		Name: JMF		Date: 19-Nov-86:09:21:29


[Symptom]
IME stopcd on E or D command with no address specified or on
an invisible break point if there is junk in .JBBPT.

[Diagnosis]
If the high order bits of JOBEXM or .JBBPT contain cruft,
FLTTC indirects through this cruft into the boonies.

[Cure]
Clear high order bits of T2 in FLTTC.

[Comments]
SCCed.

[Keywords]
IME
D
E
.JBBPT

[Related MCOs]
None

[Related SPRs]
35608

[MCO status]
Restricted distribution

[MCO attributes]
New development MCO

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	327	VMSER	FLTTC
703A	


[End of MCO 13152]

MCO: 13156		Name: JAD		Date: 20-Nov-86:11:45:52


[Symptom]
Disks don't get re-attached properly when a CPU is REMOVEd and
subsequently ADDed.

[Diagnosis]
Calling ATTCPD with a detached unit's UDB without flagging the
UDB address as such so ATTCPD will call UNLDET.  Using the UNISYS
link word in a detached UDB after calling ATTCPD gets the next
attached unit rather than the next detached unit, since the unit
was moved to the SYSUNI chain in ATTCPD.

[Cure]
TLO U,-1 if attaching a detached UDB, saving UNISYS link before
calling ATTCPD then using that as the link rather than UNISYS.
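
The link-saving half of this cure is the usual rule for moving nodes
between chains while walking one of them.  A sketch in Python; Unit,
attach and attach_all are hypothetical names, and attaching at the head of
the chain is an assumption.

```python
class Unit:
    def __init__(self, name):
        self.name = name
        self.next = None            # plays the role of the UNISYS link

def attach(unit, attached_head):
    """Move one unit onto the attached chain; this rewrites unit.next."""
    unit.next = attached_head
    return unit

def attach_all(detached_head, attached_head):
    """Attach every unit on the detached chain."""
    unit = detached_head
    while unit is not None:
        following = unit.next       # save the link BEFORE attach() changes it
        attached_head = attach(unit, attached_head)
        unit = following            # continue along the detached chain
    return attached_head

def names(head):
    out = []
    while head is not None:
        out.append(head.name)
        head = head.next
    return out
```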

[Comments]
Probably will fix the KAFs on 1026 when spinning up a new RA60
(I hope).

[Keywords]
SYSDET
ATTCPD

[Related MCOs]
None

[Related SPRs]
35642

[MCO status]
None

[MCO attributes]
New development MCO
KL10 only

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	327	COMMON	SPRIN7

703A	


[End of MCO 13156]

MCO: 13158		Name: JMF		Date: 21-Nov-86:11:08:59


[Symptom]
After assigning more than 8 logical names (using up all of the
space in the logical name table) a DEVCHR on a logical name assigned to
a device which has been INITed in the current context returns 0. Also,
it is possible to INIT a disk with a logical name which was assigned
and INITed in a previous context (this really doesn't matter since the
DDB gets copied but it does seem unclean).

[Diagnosis]
1) JCH and job number confusion.
2) JCH doesn't get checked.

[Cure]
1) Find the DDB if JCHs match, or if the DDB contains a job number
(not a JCH so not INITed) which matches the JCH.
2) Make the same test as in 1 if the logical name is found via the logical
name table.

[Comments]
SCCed. Attn: Narf. This edit is quite extensive but a very simple
patch is available.

[Keywords]
Logical names
DEVCHR

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
None

[MCO attributes]
New development MCO

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	327	UUOCON	DEVCHR,DDBSRC

703A		UUOCON	DDSC2A


[End of MCO 13158]

MCO: 13174		Name: JAD		Date:  4-Dec-86:14:38:41


[Symptom]
RTTRP is appropriately named in 7.03.

[Diagnosis]
Multiple choice.  First of all, when the AC definitions changed in
7.03, RTTRP's real-time interrupt level UUO handler (UUOHND) didn't
get the message.  UUOHND assumes ACs 16 and 17 are preserved registers
(they used to be P3 and P4 in 7.02).  Unfortunately, they are now F
and R.  UUOHND loads the job's protection/relocation into R, which
clobbers the contents of AC 17 (the MUUO flags,,PC).

Secondly, when UUOHND calls WAKEUP on a WAKE UUO, a trap to RTTILM
occurs due to P containing a global stack pointer.  UUOHND loads
P with a HRROI P,CnPD1+RTPOPN, then does an AOBJN P,.+1 to prevent
a PDL overflow from occurring on the first PUSH/PUSHJ.  UUOHND does
a PUSHJ to WAKEUP, which eventually gets to PSICND via a SIGNAL of
the wake.  PSICND calls CTXPSI which calls SSEC1.  SSEC1 saves T1
on the stack, then XJRSTs to TPOPJ in section 1.  By now the left
half of P has something like positive 15 in it.  When TPOPJ does
a POP P,T1 a trap occurs due to an illegal memory reference.

Unfortunately, RTTILM winds up dispatching to the user's trap
handling routine as an exec virtual address, rather than a user
virtual address.  Depending on what is in location xyzzy in the
monitor, various bizarre stopcodes will result.

[Cure]
Fix the AC definitions, and put a PRINTX in to warn of future
confusion.

Load P with a real stack pointer which has been offset in both
halves by RTPOPN.

Puzzle over RTTILM for a while longer.

[Comments]
The first two should be enough to get Rockwell running again.
The last may take quite a while so I may wind up writing a PCO
on the first two and answer the last one later.

[Keywords]
RTTRP

[Related MCOs]
None

[Related SPRs]
35657

[MCO status]
None

[MCO attributes]
New development MCO
KL10 only

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	331	COMMON	CHNDXX
		RTTRP	UUOHND,ETC.

703A	


[End of MCO 13174]

MCO: 13176		Name: DPM		Date:  5-Dec-86:07:21:26


[Symptom]
IME in DSKTIC and DI hangs.  A day one 7-series monitor problem.

[Diagnosis]
Queued protocol I/O processing in DSKTIC isn't interlocked
against other CPUs adding or removing things from the queue.

[Cure]
Surround most of DSKTIC with a DSKOFF/DSKON pair.  I know it's
not very elegant, but it's the only way to fix it.

[Comments]

[Keywords]
DI HANG

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
None

[MCO attributes]
New development MCO
Field service attention
KI10 only
KL10 only
Multi CPU only

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	331	FILIO	DSKTIC

703A	

702A	

701A	

700A	


[End of MCO 13176]

MCO: 13183		Name: JAD		Date: 10-Dec-86:13:50:57


[Symptom]
Monitor's idea of amount of EVM available for locking is wrong.

[Diagnosis]
.C0EVM is set up by using a value computed in COMMON which doesn't
take into account space allocated by ONCMAP, etc.  Consequently, .C0EVM
is usually way too high in comparison to reality.

[Cure]
Since SYSINI already marks off the bit table, have it count up the
pages it marks off and use that to set up .C0EVM.

[Comments]
Deleted some junk in COMMON computing EVLN and EVBN, only
added 3 instructions to SYSINI.

[Keywords]
EVM
.C0EVM

[Related MCOs]
None

[Related SPRs]
35660

[MCO status]
None

[MCO attributes]
New development MCO

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	332	COMMON	EVBN,EVLN
		SYSINI	MMTINI

703A		SYSINI	KIIN10,KII10C


[End of MCO 13183]

MCO: 13192		Name: JAD		Date: 19-Dec-86:10:40:35


[Symptom]
An exec page fault while running at real-time interrupt UUO level
will wind up dispatching into what it thinks is the user's trap
handler, but as an exec PC, winding up in the boonies.

[Diagnosis]
RTTILM assumes all faults will be caused by the user.

[Cure]
Check for exec mode fault and execute a stopcode RTTIME if the trap
occurs in exec mode.

[Comments]
Last half of Rockwell's SPR.

[Keywords]
REAL TIME
RTTRP

[Related MCOs]
13174

[Related SPRs]
35657

[MCO status]
None

[MCO attributes]
New development MCO
Documentation change
KL10 only

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	333	COMMON	RTTILM

703A	


[End of MCO 13192]

MCO: 13193		Name: JAD		Date: 22-Dec-86:08:12:48


[Symptom]
KAF stopcode spinning up an RA60 drive.

[Diagnosis]
Loop in SYSUNI chain caused by UNLDET assuming unit being attached
isn't already on the SYSUNI chain.  This may not be the case if the
drive is attached via the same port it was previously attached on.

[Cure]
Teach UNLDET not to link a unit into the SYSUNI chain if the unit is
already on the SYSUNI chain.
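
A sketch of the check in Python.  The names are hypothetical, and linking
at the head of the chain is an assumption; the point is only that a unit
already on the chain is left alone, since linking it a second time is what
creates the loop.

```python
class Unit:
    def __init__(self, name):
        self.name = name
        self.next = None

def on_chain(head, unit):
    """Walk the chain looking for this exact unit."""
    node = head
    while node is not None:
        if node is unit:
            return True
        node = node.next
    return False

def link_unit(head, unit):
    """Link a unit at the head of the chain unless it is already there."""
    if on_chain(head, unit):
        return head                 # already linked: do nothing
    unit.next = head
    return unit
```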

[Comments]
About time ...

[Keywords]
RA60
KAF
UNLDET

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
None

[MCO attributes]
KL10 only

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	333	FILIO	UNLDET

703A	


[End of MCO 13193]

MCO: 13196		Name: JAD		Date: 29-Dec-86:12:00:17


[Symptom]
Can't run newer versions of RSX-20F with released versions of
7.03 or 7.03 Update "A".

[Diagnosis]
Newer versions of -20F will send the "here are drive serial numbers"
message to the -10.  Since there is no entry in ALLDSP/FNCTAB for the
new message type (40), DTESER will bitterly complain and crash the -10
with a DTEPCI stopcode.

[Cure]
Add dummy entries in ALLDSP and FNCTAB so the "7.04 version" of -20F
will run on prior monitors.
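
The effect of a dummy dispatch entry can be sketched in Python.  All names
here are hypothetical, the table is a dictionary rather than the real
ALLDSP/FNCTAB layout, and message type 40 is treated as octal, which is an
assumption.

```python
class ProtocolError(Exception):
    """Stands in for the DTEPCI stopcode."""

def handle_line_data(payload):      # hypothetical existing handler
    return payload

def ignore_message(payload):        # dummy entry: accept and discard
    return None

DISPATCH = {
    0o1: handle_line_data,
    0o40: ignore_message,           # "here are drive serial numbers"
}

def dispatch(message_type, payload):
    try:
        handler = DISPATCH[message_type]
    except KeyError:
        raise ProtocolError("unknown message type %o" % message_type)
    return handler(payload)
```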

[Comments]
Paving the way for 7.04 field test ...

[Keywords]
DTESER
DRIVE SERIAL NUMBERS
OH BOY

[Related MCOs]
13194

[Related SPRs]
None

[MCO status]
None

[MCO attributes]
KL10 only

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
703A		DTESER	ALLDSP,FNCTAB


[End of MCO 13196]

MCO: 13197		Name: TL		Date: 29-Dec-86:19:19:34


[Symptom]
DDT11 doesn't work reliably in hard-to-pin-down circumstances.
Tony's whizzy code based on my hacks fails every blue moon.
Software installation using the FE: device can install corrupt data.
Software diagnosis of -20F resident files transferred via FE: can be
confused by corrupt data.

Also, Grey hair and Baldness.

[Diagnosis]
FEDSER carefully nails down its data base whilst switching
monitor buffers.  Unfortunately, the test to decide whether to switch buffers
is not nailed down.  Thus, data delivered by the -11 in this 20-odd
instruction window never gets handed to the FE device owner.

[Cure]
Check again to see if data was delivered after we have the interlock,
but before we switch buffers.  If so, go back and use it.  If not, switch
buffers.

This approach prevents getting the SYSPIF interlock most of the time.
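
The cure is a check, lock, check-again pattern.  A sketch in Python,
assuming a two-buffer model; FrontEndDevice, its fields and next_data are
hypothetical, and the Lock stands in for the SYSPIF interlock.  It models
only the ordering of the tests, not real interrupt-level delivery.

```python
import threading

class FrontEndDevice:
    def __init__(self):
        self.interlock = threading.Lock()
        self.current = []           # buffer the owner is reading from
        self.spare = []             # the other monitor buffer

def next_data(fed):
    """Return the next item for the device owner, or None."""
    if fed.current:                 # fast path: no interlock needed
        return fed.current.pop(0)
    with fed.interlock:
        if not fed.current:         # check again now that we hold the interlock
            fed.current, fed.spare = fed.spare, fed.current
        return fed.current.pop(0) if fed.current else None
```

Without the second test, data that arrived between the first test and the
interlock would be swapped away instead of being handed to the owner.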

[Comments]
As I remember the code, this bug has existed since 6.03A.
I've wondered about DDT11's hiccups since I taught it about FE:.
I'm gonna autopatch this for DDT11 and FE (the program).

It was time for a colorful metaphor.

[Keywords]
603A
KL LIR
DDT11
FE
LOST DATA
FE:
Gosh.

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
None

[MCO attributes]
HOSS attention
KL10 only

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
703A		FEDSER	FEDUI2,FEDUI3

704	


[End of MCO 13197]

MCO: 13202		Name: TL		Date: 30-Dec-86:17:02:08


[Symptom]
Even after MCO 13197, FED devices will return a random amount
of junk data under obscure circumstances.

FEDs can trash monitor free core under obscure circumstances.

FEDs get lost on PUSH/POP.  Contexts hung in FI/FO after PUSH/POP.

FEDs are hard to use.

[Diagnosis]
Too many uses for too few pointers.

FED monitor buffers are allocated on FED GET, but the
pointers used to access them aren't initialized.  

FEDSER marks data as available to the user when the DTE hasn't
delivered it yet.

RESET can release buffers that are being filled (or emptied).

FEDs don't understand CTXSER.

No provision for a generic open forces user programs into percentage roulette.

[Cure]
Yes.

Add the necessary book-keeping to the FED.  Teach FEDSER to use it.

Make sure that the interrupt service doesn't think that the
FED has been initialized before it has buffers and pointers.  This
prevents trashing random memory locations in "free" core.

Allow the FED OPEN function to accept -1 for the FED unit number.  In
this case, search for a free FED on the specified CPU/DTE, and if one
is found, assign it and return its unit number in AC.
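
The generic open can be sketched in Python.  The function name and the
list-of-owners representation are assumptions; only the meaning of -1
(find any free unit and return its number) comes from the text.

```python
def fed_open(units, unit_number, job):
    """units[i] is the owning job, or None if unit i is free.

    Returns the assigned unit number, or None if nothing could be assigned.
    """
    if unit_number == -1:                       # generic open
        for i, owner in enumerate(units):
            if owner is None:
                units[i] = job
                return i
        return None
    if 0 <= unit_number < len(units) and units[unit_number] is None:
        units[unit_number] = job
        return unit_number
    return None
```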

[Comments]
Keep 'em coming Tony.

[Keywords]
FED
DTE
DDT11
RSX20F
KL LIR
Gosh.

[Related MCOs]
13197

[Related SPRs]
None

[MCO status]
None

[MCO attributes]
Documentation change
HOSS attention
KL10 only

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
703A		FEDSER	FEDGET,FEDGTW,FEDASG,FEDGTC,FEDUIN,FEDUIB,FEDUBX,FEDSAK,FEDTKD,FEDLMG,FEDTDD
		DTEPRM	FE.MIP,FE.JCH,FEDTCT

704	


[End of MCO 13202]

MCO: 13203		Name: KBY		Date: 31-Dec-86:09:56:19


[Symptom]
JOBPEK still doesn't quite work.

[Diagnosis]
Previous fixes got unfixed, and new ones needed to be
added anyway.

[Cure]
Add all fixes.  Things fixed include making sure the funny
space monitor buffer for reading and writing is uncached, and
being sure .JBPK gets updated if we get swapped out (the funny
space page moves but .JBPK doesn't get updated by the swapper).

[Comments]
CH2M?

[Keywords]
jobpek

[Related MCOs]
13195

[Related SPRs]
None

[MCO status]
None

[MCO attributes]
None

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
703A		UUOCON	JOBPEK
704	335


[End of MCO 13203]

MCO: 13204		Name: KBY		Date: 31-Dec-86:11:49:09


[Symptom]
Potential stopcode BSN

[Diagnosis]
13138 not quite complete.

[Cure]
Don't look for a high seg to be owned by job 0.

[Comments]
Hohum.

[Keywords]
migrate

[Related MCOs]
13138

[Related SPRs]
None

[MCO status]
Restricted distribution

[MCO attributes]
None

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
703A		SEGCON	FINHGH
704	335


[End of MCO 13204]

MCO: 13207		Name: KBY		Date:  4-Jan-87:14:27:44


[Symptom]
Can't convince LOKCON to LOCK something in EVM with the
cache turned on in EVM.

[Diagnosis]
LOKCON always turns off the cache for pages in EVM.
Also, VMSER for the LOCK PAGES function always leaves the cache
as is currently.

[Cure]
Although it should be possible to turn the cache back on for the
user pages, it still won't go on in EVM.  Add code to follow the user's
request bit.  VMSER will currently always turn the cache off until/unless
I can figure out a way to pass the argument in.

[Comments]
If the user wants to...

[Keywords]
cache
LOCK

[Related MCOs]
None

[Related SPRs]
35658

[MCO status]
None

[MCO attributes]
None

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
703A		LOKCON
704	336	VMSER


[End of MCO 13207]

MCO: 13208		Name: JAD		Date:  5-Jan-87:10:07:32


[Symptom]
IME, etc., stopcode possible trying to return a BHD for a CI disk.

[Diagnosis]
The BHD we're trying to return is garbage because DRBBHD contains
garbage.  Garbage begets garbage and things go downhill from there.

[Cure]
Zero DRBs before letting FILIO fill them in.  That will (possibly)
prevent the problem.

[Comments]
This was the problem Pan Am had.  The patch appears to have fixed
the problem.

[Keywords]
DRBS

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
None

[MCO attributes]
New development MCO
KL10 only

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	335	FILIO	GETDRB

703A	


[End of MCO 13208]

MCO: 13228		Name: RCB		Date: 18-Jan-87:23:32:04


[Symptom]
Still too hard to debug SMP.  Non-policy CPUs don't have symbols when
hitting breakpoints.

[Diagnosis]
The EDV points to a hidden symbol switching block which is per-CPU.
Thus, when CPU1 hits a breakpoint, and we update CPU0's map, we still
don't have a symbol table.

[Cure]
Add yet another word to the EDV, .EDLNK, to point to the next EDV in
the system.  This is a ring pointer, not a zero-terminated chain.
EDDT knows how to handle this (as of edit 662).
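
Because the link is a ring, a walker stops when it gets back to where it
started rather than when it sees zero.  A sketch in Python; the EDV class
here is a stand-in with a hypothetical `link` field playing the role of
.EDLNK.

```python
class EDV:
    def __init__(self, cpu):
        self.cpu = cpu
        self.link = self            # a ring of one points at itself

def ring_members(start):
    """Visit every EDV on the ring exactly once, beginning at `start`."""
    members = [start]
    node = start.link
    while node is not start:        # terminate on return to start, not on zero
        members.append(node)
        node = node.link
    return members
```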

[Comments]
SCCed.

[Keywords]
Symbol table lost
Debugging

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
Checked

[MCO attributes]
New development MCO
Documentation change
UUOSYM change

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	337	COMMON	.CPEDV
703A		S	.EDLEN
		UUOSYM	.EDLEN


[End of MCO 13228]

MCO: 13233		Name: KDO		Date: 21-Jan-87:15:33:42


[Symptom]
ANF-10 file transfers don't.

[Diagnosis]
DAP message type gets lost: IDC type deposited in wrong location.
NTDSIB contains an "AOS (P)" followed by a PJRST to a routine which does
a skip return.  The result is a POPJ2, which leaves T1 in routine T.SIB with
the wrong value.  

[Cure]
Remove the offending code.
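
The double skip can be modelled by tracking the return address.  This
Python sketch assumes the code removed is the "AOS (P)"; the function
names are hypothetical.

```python
def skip_return(return_pc):
    """A routine that ends with a skip return: resumes one past the usual spot."""
    return return_pc + 1

def ntdsib_with_aos(return_pc):
    return_pc += 1                      # AOS (P): bump the saved return address
    return skip_return(return_pc)       # PJRST: the callee's return is ours

def ntdsib_without_aos(return_pc):
    return skip_return(return_pc)       # single skip, as the caller expects
```

With the extra increment the caller resumes two locations past the normal
return point (the POPJ2 effect) instead of one.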

[Comments]

[Keywords]
ANF-10
file transfer

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
None

[MCO attributes]
New development MCO

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	340	NETSER	NTDSIB

703A	


[End of MCO 13233]

MCO: 13239		Name: KBY		Date: 25-Jan-87:13:05:16


[Symptom]
(code reading) Not sure but I don't want to think about it.

[Diagnosis]
GVIPCP looks for IPCF paging entries but doesn't check to
be sure SL.SIO is also off (paging queue entry) which means (I think)
that if we have both types of entries present (so the counts will be
right and we can get here), we'll never get to IP2OUT as we'll come
through here first.

[Cure]
Check SL.SIO (only adds one instruction).

[Comments]
Not the SLZs.

[Keywords]
paging queue
IPCF pages.

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
Restricted distribution

[MCO attributes]
None

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
703A		VMSER	GVIPCP
704	340


[End of MCO 13239]

MCO: 13247		Name: JMF/BF		Date: 29-Jan-87:07:22:48


[Symptom]
System hung setting memory containing the monitor's high segment off
line.

[Diagnosis]
SWPSER can allocate space on a unit accessible only by CPUs that are
not running when trying to swap out all of the jobs in core so that the monitor's
high segment can be moved. Since SWPSCN can't find a unit it can do the swapping
I/O on, the system is hung.

[Cure]
If setting monitor memory off line, only allocate swapping space on a unit
which is accessible to the policy CPU.
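
A small Python sketch of the selection rule, under assumed data
structures: each unit is a dict with a name, a free-space count, and
the set of CPUs that can reach it.  None of these names come from
SWPSER; they only illustrate the filter.

```python
def pick_swap_unit(units, policy_cpu, setting_monitor_memory_offline):
    """Return the name of the first eligible unit with free space, or None."""
    for unit in units:
        if unit["free"] <= 0:
            continue
        # Only restrict the choice while monitor memory is being set off line.
        if setting_monitor_memory_offline and policy_cpu not in unit["cpus"]:
            continue
        return unit["name"]
    return None

# Hypothetical configuration: DSKA is reachable only from CPU1.
UNITS = [
    {"name": "DSKA", "free": 100, "cpus": {"CPU1"}},
    {"name": "DSKB", "free": 50, "cpus": {"CPU0", "CPU1"}},
]
```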

[Comments]
SCCed. This can lead to an SRO STOPCD if all of the units accessible to
the BOOT CPU are full but tough rocks for those nasty cases. Left as an exercise
for the reader (read that BF) to retrofit the patch into 7.02 for CNA.

[Keywords]
set memory off line
hung

[Related MCOs]
None

[Related SPRs]
35565

[MCO status]
None

[MCO attributes]
New development MCO
Multi CPU only

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	341	SWPSER	GT2
703A	
702	


[End of MCO 13247]

MCO: 13248		Name: JMF		Date: 29-Jan-87:07:31:13


[Symptom]
System hung setting monitor memory off line.

[Diagnosis]
If a job owns a resource which will not be given up until it runs
after the completion of disk I/O, e.g. the AU, and the disk that the I/O is
being done on lives on a CPU that's going to stick its head in the sand while
the memory is getting set off line, and if that job gets picked by LOKCON to
swap out, the job gets put in FORCEF and will never go out thus hanging the
swapper.

[Cure]
Ignore sharable resources when trying to swap out jobs to set monitor
memory off line.

[Comments]
SCCed. This is another CNA required patch.

[Keywords]
set memory off line
hung

[Related MCOs]
None

[Related SPRs]
35565

[MCO status]
None

[MCO attributes]
New development MCO
Multi CPU only

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	341	SCHED1	FORC0A
703A	
702	


[End of MCO 13248]

MCO: 13249		Name: JMF		Date: 29-Jan-87:10:32:25


[Symptom]
If memory is set off line and there are no pages of monitor memory
to be moved, the job winds up sleeping forever and the pages don't get set
off line.

[Diagnosis]
Sometime during 7.03 development, a label got moved which causes
the BOOT CPU to wait for other CPUs to jump into their ACs but since there
are no monitor pages to set off line, there is nothing to cause the other
CPUs to go away. Thus, the job setting the memory off line waits forever.

[Cure]
Move the tag back where it belongs.

[Comments]
SCCed.

[Keywords]
set memory off line
hung

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
None

[MCO attributes]
New development MCO
Multi CPU only

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	341	LOKCON	MEMOF2
703A	


[End of MCO 13249]

MCO: 13250		Name: JMF		Date: 30-Jan-87:05:59:45


[Symptom]

1) IME STOPCD deleting a section which contains a sharable high segment with
a PAGE. UUO.
2) PQW, BAC, PIF STOPCDs when passing over a section which already exists (IDC)
and which contains a sharable high segment.

[Diagnosis]

1) KILHSS gets called in section 1 and then tries to use J as an index when the
left half of J contains high segment bits.
2) KILSEC gets called without deleting sharable high segments first.

[Cure]

1) S0PSHJ.
2) Call KILHSS before calling KILSEC.

[Comments]
SCCed. Leave it to Spider (he's about as random as Speer).

[Keywords]
IME
PQW
BAC
PIF
sharable high segments

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
Restricted distribution

[MCO attributes]
New development MCO

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	341	VMSER	CHGSC5,CHGS10
703A	


[End of MCO 13250]

MCO: 13264		Name: JAD		Date:  9-Feb-87:13:15:34


[Symptom]
ENTER error 14 when trying to extend allocation of an existing file.
Space is available on disk and the user has not exceeded their quota.

[Diagnosis]
The RIB is full of retrieval pointers.  UPDALC can't hack extended
RIBs, so the allocation attempt fails.

[Cure]
Teach UPDALC how to extend a RIB when the current RIB is full.
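
The idea can be sketched in Python as a chain of fixed-capacity blocks,
where a full block gets a successor instead of failing the request.
RIB_CAPACITY and the list-of-lists layout are hypothetical; the real
RIB format and the UPDALC logic are not shown in this MCO.

```python
RIB_CAPACITY = 4  # hypothetical; the real pointer count per RIB is not stated

def add_pointer(ribs, pointer):
    """Append a retrieval pointer, starting an extended RIB when the last is full."""
    if not ribs or len(ribs[-1]) >= RIB_CAPACITY:
        ribs.append([])  # extend instead of failing the allocation
    ribs[-1].append(pointer)
    return ribs

def build(pointers):
    """Add each pointer in turn and return the resulting chain of RIBs."""
    ribs = []
    for p in pointers:
        add_pointer(ribs, p)
    return ribs
```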

[Comments]
This one has been aging since April 28, 1981 (a 7.01 SPR).  I can see
why SMW left this sucker around for so long; I've been staring at this
particular SPR for an EXTENDED (cheap pun) length of time.

[Keywords]
ALLOCATION
EXTENDED RIBS

[Related MCOs]
None

[Related SPRs]
31137

[MCO status]
None

[MCO attributes]
New development MCO

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	342	FILIO	EXTRIB
703A		FILUUO	UPDAT4,UPDAL1,UPDAL5


[End of MCO 13264]

MCO: 13266		Name: RCB		Date:  9-Feb-87:16:00:33


[Symptom]
Undefined global "M.LAMC" building monitors without networks.

[Diagnosis]
No conditionals, and no dummy globals.

[Cure]
Put code to define things for LATSER under IFN M.LATN.

[Comments]
SCCed.

[Keywords]
CXO

[Related MCOs]
None

[Related SPRs]
35618

[MCO status]
Checked

[MCO attributes]
None

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	342	COMDEV	NFRSBQ
703A	


[End of MCO 13266]

MCO: 13268		Name: RCB		Date:  9-Feb-87:21:41:07


[Symptom]
PAGE. to return WSBTAB fails with illegal instruction on KS.  IME also
possible.

[Diagnosis]
XBLT and other extended addressing things used without regard for
non-extended machines.  BLTing beyond the end of user core without an ERJMP
when clearing excess words in the argument block.

[Cure]
Conditionalize properly, with PXCT'ed BLTs where appropriate.  Add the missing
ERJMP.

[Comments]
SCCed.

[Keywords]

[Related MCOs]
None

[Related SPRs]
35611

[MCO status]
Checked
Restricted distribution

[MCO attributes]
Single-section monitors only

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	342	VMSER	GETWSB,GETWSZ
703A	


[End of MCO 13268]

MCO: 13274		Name: JMF		Date: 10-Feb-87:07:31:37


[Symptom]
STOPCD SLZ

[Diagnosis]
A random segment number can get stored in SW3LST when swapping
out the paging queues. If a job comes along and does a SETUWP on and its
high segment number just happens to be the random segment number that
got stored in SW3LST, GIVBKH can find and zap the SWPLST entry while the
swapping I/O is in progress.

[Cure]
Store -1 in the right half of SW3LST if swapping out a paging queue.
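
A Python sketch of why the -1 helps, using invented structures: an
entry whose segment field holds a leftover value can match a real high
segment and get removed, while a -1 can never match.  The dict layout
and function names are assumptions, not the SWPLST format or GIVBKH.

```python
NO_SEGMENT = -1  # sentinel that can never equal a real segment number

def make_paging_entry(leftover, store_sentinel):
    """Build a hypothetical SWPLST-style entry for a paging-queue swap."""
    segment = NO_SEGMENT if store_sentinel else leftover
    return {"segment": segment, "io_in_progress": True}

def give_back(entries, segment):
    """Drop every entry whose segment field matches (stand-in for GIVBKH)."""
    return [e for e in entries if e["segment"] != segment]
```

With a leftover value of 7 and no sentinel, give_back(..., 7) removes
the paging-queue entry while its I/O is still in progress; with the
sentinel stored, the entry survives.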

[Comments]
SCCed. Thanks to Kimo's patch.

[Keywords]
SLZ

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
None

[MCO attributes]
New development MCO
HOSS attention

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	342	VMSER	MAKLS1
703A	


[End of MCO 13274]

MCO: 13275		Name: BAH		Date: 12-Feb-87:10:07:58


[Symptom]
LOGNUM goes negative.

[Diagnosis]
At ACLDCR+11 a test is made to see if LOGNUM should be
decremented.  If it should, the job is not marked when LOGNUM is
decremented.  When the LOGOUT UUO is done, LOGNUM gets decremented
one more time.

[Cure]
If LOGNUM is decremented, mark the job.
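
A Python sketch of the double decrement and the marking that prevents
it.  The Job and Monitor classes and the counted flag are hypothetical
stand-ins for whatever per-job bit the monitor actually uses.

```python
class Job:
    def __init__(self):
        self.counted = True  # job currently contributes to LOGNUM

class Monitor:
    def __init__(self, lognum):
        self.lognum = lognum

    def uncount(self, job, mark=True):
        """Decrement LOGNUM once per job; mark the job so later calls are no-ops."""
        if job.counted:
            self.lognum -= 1
            if mark:
                job.counted = False

    def logout(self, job):
        """Model the LOGOUT path, which also tries to uncount the job."""
        self.uncount(job)

def run(mark):
    """Early decrement (with or without marking), then LOGOUT; return LOGNUM."""
    monitor, job = Monitor(1), Job()
    monitor.uncount(job, mark=mark)
    monitor.logout(job)
    return monitor.lognum
```

run(False) models the old behavior and ends at -1; run(True) models the
cure and ends at 0.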

[Comments]

[Keywords]
LOGNUM

[Related MCOs]
None

[Related QARs]
35571, 868939, 35606, 35648

[MCO status]
None

[MCO attributes]
QAR answer

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	343	UUOCON	ACLDCR
703A	16	UUOCON	ACLDCR


[End of MCO 13275]

MCO: 13277		Name: JAD		Date: 13-Feb-87:10:33:58


[Symptom]
Stopcode IME likely, others possible.

[Diagnosis]
SAB ring terminates with a zero.  Apparently CORGRS is calling
NZSCGT to allocate the same virtual address twice when a SAB
begins on a page boundary.  The second call creates a new
page in that address which is zeroed, causing the previous
SAB link to point at a zeroed page.

[Cure]
Fix up how CORGRS decides to call NZSCGT so it won't try to
allocate the same virtual address twice.
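
One way to guarantee that no page is allocated twice, sketched in
Python: work out the page numbers a request covers and skip any that
are already mapped.  This is not the CORGRS logic; the function, the
already_mapped set, and the 512-word page size used here are
illustrative assumptions.

```python
PAGE_SIZE = 512  # words per page (assumed for this sketch)

def pages_to_allocate(start, length, already_mapped):
    """Return page numbers covering [start, start+length) not yet mapped, each once."""
    first = start // PAGE_SIZE
    last = (start + length - 1) // PAGE_SIZE
    new_pages = []
    for page in range(first, last + 1):
        if page not in already_mapped:
            already_mapped.add(page)
            new_pages.append(page)
    return new_pages

def demo():
    """Allocate a block starting on a page boundary twice; second call adds nothing."""
    mapped = set()
    first = pages_to_allocate(512, 100, mapped)
    second = pages_to_allocate(512, 100, mapped)
    return first, second
```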

[Comments]
This appears to be a day 1 extended addressing bug (7.02).
Odd that no one has ever seen it before, but the odds of it happening
are 1 in 512, so maybe we were just lucky . . .

[Keywords]
SAB RINGS

[Related MCOs]
None

[Related SPRs]
None

[MCO status]
None

[MCO attributes]
New development MCO

[Validity]

Monitor	 Load	Module	 Tags
-------	------	------	------
704	343	ONCMOD	CORGRS
703A	


[End of MCO 13277]