BLISS Language Formatter PRETTY: Maintenance Information	Page 1



                          TABLE of CONTENTS


         Section                 Title

        1.0      SUMMARY
        2.0      PROJECT CONVENTIONS
        2.1      Labels And Symbols
        2.2      Subprogram Interfaces And Calling Sequences
        2.3      Data Formats And Representations
        2.4      Error And Exception Reporting
        2.5      Unusual Conditions Treatment Philosophy
        3.0      DESIGN OVERVIEW
        4.0      TABLES, QUEUES, AND BUFFERS
        5.0      OVERVIEW OF MAJOR MODULES
        5.1      Module FORMAT.xxx
        5.2      Module BLFxxx
        5.3      Module CONTRL
        5.4      Module LEX
        5.5      Module SCANNR
        5.6      Module OUTPUT
        5.7      Module PARSE1
        5.8      Module PARSE2
        5.9      Module PARSE3
        5.10     Module SYMPRP
        5.11     Module UTILIT
        6.0      KEY ALGORITHMS
        6.1      The Parsing Algorithm
        6.2      The Search Algorithm
        7.0      MAINTENANCE AND DIAGNOSTIC TOOLS AND PROCEDURES
        7.1      Adding A Language Feature
        7.1.1      Adding A New Keyword
        7.1.2      Adding A Syntactic Element
        7.1.3      Adding A User Option
        7.2      Debugging
        8.0      SYSTEM TEST DESCRIPTION


1.0  SUMMARY

The maintainability and reliability of computer  programs  are closely
related  to  their  readability.  It is common practice to enhance the
readability of complex programs by the judicious use of open space  in
the  listings to indicate the relationships between different parts of
the program:  closely related parts are bunched together so  that  the
eye  perceives  them  as  a  unit, while unrelated parts are listed on
separate pages, etc.  In a highly structured language such  as  BLISS,
it  is  useful  and customary to organize the code and listings into a
hierarchical  structure  in  which  the  highest-level   commands   or
expressions  are  aligned  near  the  left  margin and the position of
lower-level code is indicated by the  amount  of  indentation  to  the
right.   In this way, the exact conditions under which a given line of
code is executed may be determined by the reader by  simply  examining
the  points  at  which  the  indentation  level changes (which are the
decision and iteration points of the program.) At the same time, parts
of  the  code which are executed most frequently (and which become the
attack points for optimization of the program) are indicated as  local
maxima  of  the  indentation  level.   The  BLISS  Language  Formatter
("PRETTY") will transform an existing  BLISS  program  into an
equivalent form which adheres to conventional rules and standards  for
readability.




2.0  PROJECT CONVENTIONS

2.1  Labels And Symbols


Global routine names are of the form "XYZ$..." where XYZ is associated
with a function of the program, e.g.  LEX with lexical processing, SCN
with source text scanning, PRS with parsing, OUT with output, etc.



2.2  Subprogram Interfaces And Calling Sequences


Communication between modules consists of simple calling sequences (in
which  the  arguments are integers or character string pointers) and a
single global data structure called  TOKEN.   Communication  within  a
module  is  by  means  of  simple OWN variables (few arrays except for
character string VECTORS) or simple calling sequences.


2.3  Data Formats And Representations


     1.  The input and output source files  consist  of  sequences  of
         ASCII  text  records, punctuated by linefeed characters. The
         XPORT routine  library  is  used  for   record   I/O.

     2.  The  global  array  TOKEN  consists  of  a  character  string
         description  (character  pointer  and length) and a type code
         for the current  token.   Each  item  in  this  structure  is
         allocated a full word.



2.4  Error And Exception Reporting


     1.  File I/O errors are reported to the user terminal  and  cause
         abortion  of the run.  System action pertaining to open files
         will be taken.

     2.  Anomalies in  the  syntax  of  the  source  are  reported  by
         inserting  comments  into  the output file at the point where
         the anomaly occurred.  The same message is also sent  to  the
         terminal.  These messages have a distinctive format:

         "!!ERROR!!..." in the output file
         "?!ERROR!!..." at the terminal or log file 

         which is easily found by an editor.  Subsequent  reprocessing
         of   the  output  file  by  BLF  will  erase  these  comments
         automatically.   Occurrence  of  such  an  error   does   not
         necessarily  cause  termination of the run.  A command option
         is provided for suppression of error messages in  the  output
         file (but not to the terminal.)
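
The two message prefixes, and the erasure of old error comments on
reprocessing, can be illustrated by a minimal Python sketch.  It is
not taken from the BLF sources:  the function names, the sample message
text, and the assumption that each error comment occupies a line of its
own are all hypothetical.

```python
from typing import List, Tuple

FILE_PREFIX = "!!ERROR!!"      # form inserted into the output file
TERMINAL_PREFIX = "?!ERROR!!"  # form sent to the terminal or log file


def format_error(message: str) -> Tuple[str, str]:
    """Return the (output file, terminal) forms of one error message."""
    return FILE_PREFIX + message, TERMINAL_PREFIX + message


def strip_error_comments(lines: List[str]) -> List[str]:
    """Drop error comments left by an earlier run (assumed one per line)."""
    return [ln for ln in lines if not ln.lstrip().startswith(FILE_PREFIX)]


# Hypothetical message text, used only for illustration.
file_form, terminal_form = format_error(" Missing END")
previous_output = ["x = 1;", "    " + file_form, "y = 2"]
cleaned = strip_error_comments(previous_output)
```

Because the prefix is distinctive, a plain text search for "!!ERROR!!"
is enough to locate every reported anomaly in the output file.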




2.5  Unusual Conditions Treatment Philosophy


The main philosophy of the formatter is always to produce a consistent
output  file, even in the face of errors by the user, confusion of the
parsers due to hidden syntax information, etc.  The output format  may
be unacceptable, but the output file must be complete.


3.0  DESIGN OVERVIEW


     1.  PRETTY processes a single input source file  and  produces  a
         single output source file;  an optional listing file may also
         be produced.

     2.  The input file is parsed using a recursive descent  algorithm
         and syntax errors are reported in the output file as comments
         inserted into the text  (thus  the  problems  of  correlating
         separate output and error files are avoided.)

     3.  Everything which PRETTY can insert into the  output  text  is
         first  deleted  from the input text:  error comments, spaces,
         tabs, formfeeds, etc.  This guarantees that  subsequent  runs
         through  PRETTY  will  produce comparable text, but confounds
         the user who invents, e.g., vertical spacing conventions  for
         his own use.

     4.  Each lexeme found in the input  text  is  reproduced  in  the
         output,  in  the  same order.  This obvious rule leads to the
         use of remarks as line-breaks (since they must appear at  the
         end  of the line) and thus gives the user some flexibility in
         laying out the format of the program.

4.0  TABLES, QUEUES, AND BUFFERS


The major global data structure in BLF is called TOKEN.   It  consists
of three words:


     1.  [TOK_LEN]:  The number of characters in the token (0 - 120)

     2.  [TOK_CP]:  A character pointer to the first character of  the
         token in the input stream.

     3.  [TOK_TYPE]:  The value of the type of the token (0 -  approx.
         130),   as   determined   by   the   table  in  REQUIRE  file
         'TOKTYP.BLI'.

         The token  type,  set  by  SCN$GETSYM,  may  be  modified  by
         LEX$GETSYM if it is of interest to the formatting process.
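
As a rough illustration only, the three-word TOKEN structure can be
modelled in Python as shown here.  The field names mirror TOK_LEN,
TOK_CP and TOK_TYPE;  the class itself, the use of an integer index in
place of a character pointer, and the two type-code values are
assumptions made for this sketch and are not taken from TOKTYP.BLI.

```python
from dataclasses import dataclass

# Hypothetical type codes; the real values are defined in TOKTYP.BLI.
S_NAME = 1
S_BEGIN = 2


@dataclass
class Token:
    tok_len: int   # number of characters in the token
    tok_cp: int    # index of the token's first character in the input line
    tok_type: int  # type code of the token

    def text(self, line: str) -> str:
        """Return the token's characters, taken from the input line."""
        return line[self.tok_cp:self.tok_cp + self.tok_len]


line = "BEGIN x = 1 END"
token = Token(tok_len=5, tok_cp=0, tok_type=S_NAME)  # as a scanner might set it
token.tok_type = S_BEGIN  # a lexical layer may then refine the type
print(token.text(line))
```

The sketch shows why the type is a separate word:  the scanner can
classify the lexeme coarsely, and a later stage can overwrite the type
without touching the pointer or the length.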



Other tables of interest to the maintainer of BLF are as follows:

     1.  BLISS Keywords - found in module LEX.  This table contains  a
         complete sorted list of all known BLISS keywords, and a token
         type associated with each.  Keywords with type S_NAME are not
         relevant to the formatting process (for example, names of
         character string functions).  All others are relevant, and
         each is singled out by some parsing routine.

     2.  Error messages and types - found in module UTILIT.

     3.  Symbol properties table -  found  in  module  SYMPRP.   These
         properties are used in parsing, e.g.  to locate the end of an
         expression.

     4.  Synonym definition table - found in module LEX.   This  table
         defines  the  sequence of lexemes assigned to a user variable
         which is declared a synonym by means of the  SYNONYM  control
         comment.   The  table  is  initially  cleared by LEX$INIT and
         built up by user controls.


5.0  OVERVIEW OF MAJOR MODULES


5.1  Module FORMAT.xxx


This module is the central control module.  Its function is to  invoke
other  modules  to  obtain  file  specifications,  process  the  input
file(s), etc.  The execution of this module terminates on receipt of a
ctrl-c (abort) or ctrl-z (end of file) signal from the terminal.  This
module is extremely simple in structure  and  the  listing  is  wholly
self-explanatory.   FORMAT.VT1  should   be   used   on  the  VAX  and
DECsystem-10,  and FORMAT.T20 should be used on DECSYSTEM-20 machines.




5.2  Module BLFxxx


This module provides  the  BLF  Command  Language  Interface.   It  is
responsible  for  obtaining  from  the  user terminal the names of the
input and output  files.   The  XPORT library  is  used  for  all  I/O
functions, both for the terminal and the files.  BLFVMS is used on VAX
systems,  BLFT10  and BLFT20 are used on DECsystem-10 and -20 machines,
respectively.




5.3  Module CONTRL


This module processes the special control comments which are  used  to
specify the user options for BLF.  When such a comment is found in the
input by Module LEX, it is passed to CONTRL for analysis  and  action,
usually  a  matter  of  setting  the  values  of  internal  flags  and
variables.  These values are returned to the parsers, etc.   by  calls
to the function CTL$SWITCH.

Among the controls recognized is a request to read more controls  from
an  alternate  input  file  (!<BLF/require'file.ext'>).  The alternate
file may not contain another control of this type.  When this  control
is  found,  the  scanner is directed to the alternate input file until
its end-of-file or some disallowed lexeme is found;  then  input  from
the  primary  input file is resumed.  The switching is done in routine
SCN$SETINUNIT in module SCANNR.
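
The recognition of a control comment can be sketched in Python as
follows.  This is not the CONTRL implementation:  the regular
expression, the function name, and the assumptions that each comment
holds a single control and that letter case is ignored are all
hypothetical.  Only the two comment forms shown in this document are
covered.

```python
import re

# Assumed shape: "!<BLF/" keyword, optional 'quoted argument', then ">".
CONTROL_RE = re.compile(r"^!<BLF/([A-Za-z_]+)(?:'([^']*)')?>", re.IGNORECASE)


def parse_control(comment):
    """Return (KEYWORD, argument or None), or None for an ordinary comment."""
    match = CONTROL_RE.match(comment.strip())
    if match is None:
        return None
    return match.group(1).upper(), match.group(2)


require = parse_control("!<BLF/require'file.ext'>")
debug = parse_control("!<BLF/DEBUG>")
ordinary = parse_control("! an ordinary comment")
```

A caller that honours the restriction on nested require controls would
reject a REQUIRE result while input is being taken from the alternate
file.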


5.4  Module LEX


This module accepts lexemes  from  SCANNR  and  discriminates  between
syntactic elements of the language and other lexemes such as comments,
conditional compilation controls (%IF etc.) and file punctuation  (end
of  line,  end  of  file.)  Names  are  identified  as relevant to the
formatter (e.g.,  BEGIN,  MODULE,  MACRO)  or  irrelevant  (most  user
identifiers,  character  function names, etc.) and a unique token type
is assigned (in TOKEN.) Each syntactical element is  returned  to  the
parsers on request.

LEX detects the control comments (of the form "!<Blf/...>") and passes
them to CONTRL for further processing.

This module consists of three routines and a  large  table  (of  BLISS
keywords.)  Routine  LEX$GETSYM  accepts  tokens  from the Scanner and
sorts them out.  If the lexeme is a name, routine LOOKUP is called  to
do  a binary search of the table and to perform case conversion.  Case
conversion (which may be specified by user control) is performed  even
if the text is not being reformatted (either because it is in a macro
definition or because the user has requested that it not be
reformatted).  LEX may change the type
of  the token obtained, and under certain circumstances may change the
pointer to point to a copy of the lexeme which  has  been  created  to
prevent overlaying by the following token.  

The special SYNONYM control causes a user name to be associated with a
sequence of lexical tokens, which are stored in the SYN data structure
within LEX.  Whenever that user name subsequently appears in the text,
the sequence of tokens is returned to the parsers one at a time with a
token length of  zero.   At  a  designated  point  the  user  name  is
associated  with  one  of the tokens for purposes of output.  When the
last of the tokens associated with the user name is  returned  to  the
parsers by LEX$GETSYM, the normal scanning of input tokens is resumed.
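
One possible reading of the synonym mechanism is sketched below in
Python.  The table layout, the type-code names, and the sample synonym
FOREVER are invented for illustration;  the sketch also assumes that
"a token length of zero" means the token contributes no text to the
output except at the designated point, where the user name is attached.

```python
# Hypothetical synonym table: user name -> (token types, index of the token
# that carries the user's name on output).  The type codes are invented.
SYNONYMS = {
    "FOREVER": (["S_WHILE", "S_NAME", "S_DO"], 0),
}


def expand(name):
    """Yield (type, text) pairs for one name taken from the input."""
    entry = SYNONYMS.get(name)
    if entry is None:
        yield ("S_NAME", name)  # ordinary name, returned unchanged
        return
    types, carrier = entry
    for index, tok_type in enumerate(types):
        # Only the designated token carries text; the others have length zero.
        yield (tok_type, name if index == carrier else "")


tokens = list(expand("FOREVER"))
printed = "".join(text for _, text in tokens)
```

Under these assumptions the parsers see the full token sequence, while
the output still contains only the name the user wrote.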




5.5  Module SCANNR


This module controls the input of text lines and the separation of the
input  text  into  lexemes, the characteristics of which are stored in
the global data structure TOKEN.

The primary function of the routines in this module is to input a line
(READALINE)  and  provide  the  next token in sequence (SCN$GETSYM.) A
secondary function is to provide the  controls  needed  to  copy  text
verbatim  (e.g.   in  a  macro definition.) To achieve the latter, the
lines are written directly  as  they  are  read:   PRETTY  still  goes
through  the  motions of putting tokens, spaces, etc.  into the output
buffer, but that buffer is not written out in "verbatim mode."

The scanner may take its inputs from one of three sources:

     1.  The normal input file

     2.  An alternate "require" file

     3.  A point in the midst of a SYNONYM definition line from either
         of the two input files

The switching of context from one source to another is accomplished by
means  of  structure  references  through a pointer usually designated
"sp".  A short stack for values  of  this  pointer  is  maintained  by
routines SCN$PUSH and SCN$POP.
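
The context switching described above can be modelled by a short stack
of input sources, as in the following Python sketch.  The class and
method names are hypothetical stand-ins for SCN$PUSH and SCN$POP, and
Python iterators stand in for the structures reached through "sp".
The behaviour at end of an alternate source (resume the previous one)
follows the description of the require file in section 5.3.

```python
class SourceStack:
    """A current source pointer plus a short stack of suspended sources."""

    def __init__(self, primary):
        self.sp = primary   # the source currently being read
        self._saved = []    # suspended sources, innermost last

    def push(self, new_source):
        """Suspend the current source and switch to a new one."""
        self._saved.append(self.sp)
        self.sp = new_source

    def pop(self):
        """Resume the most recently suspended source."""
        self.sp = self._saved.pop()

    def next_line(self):
        """Return the next line, or None when every source is exhausted."""
        while True:
            line = next(self.sp, None)
            if line is not None:
                return line
            if not self._saved:
                return None
            self.pop()


stack = SourceStack(iter(["a", "b"]))
seen = [stack.next_line()]
stack.push(iter(["r1"]))      # e.g. a require file is opened here
seen.append(stack.next_line())
seen.append(stack.next_line())
seen.append(stack.next_line())
```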




5.6  Module OUTPUT


This  module  contains  all  the  routines  which   pertain   to   the
construction  of  the  output  line image and writing the output file.
The actual writing is  primarily  done  by  BREAK1,  which  is  called
whenever the parsing and formatting routines determine that a new line
is in order.  Writing is also done by routine  OUT$EJECT,  whose  sole
function is to put formfeeds into the file.  Formfeeds are written
independently of BREAK1 because the need for a new page may become
apparent only partway through a line:  e.g.  when GLOBAL ROUTINE is
found, the text
"GLOBAL " is already in the buffer when the decision to create  a  new
page  (based on finding "ROUTINE") is made.  Therefore the formfeed is
written before further parsing or output is done.

A major function of the output module routines is in keeping track  of
the  current  indentation  level.   The parsing routines may alter the
level incrementally by calls to OUT$INDENT;  actual generation of tabs
and  spaces  to  achieve  the  correct  indentation is done by routine
OUT$TAB.
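
The division of labour between the two routines can be sketched in
Python as follows.  The class is a hypothetical stand-in for
OUT$INDENT and OUT$TAB;  the tab width of 8 columns and the step of 4
columns per level are assumptions of the sketch, not values taken from
the OUTPUT module.

```python
TAB_WIDTH = 8    # assumed spacing of tab stops, in columns
INDENT_STEP = 4  # assumed number of columns per indentation level


class Indenter:
    def __init__(self):
        self.level = 0

    def indent(self, delta):
        """Change the level incrementally; it never goes below zero."""
        self.level = max(0, self.level + delta)

    def tab(self):
        """Return the tabs and spaces that reach the current indentation."""
        columns = self.level * INDENT_STEP
        return "\t" * (columns // TAB_WIDTH) + " " * (columns % TAB_WIDTH)


indenter = Indenter()
indenter.indent(3)
three_levels = indenter.tab()   # 12 columns: one tab and four spaces
indenter.indent(-1)
two_levels = indenter.tab()     # 8 columns: exactly one tab
indenter.indent(-5)
no_levels = indenter.tab()      # level is clamped at zero
```

Keeping the level as a single counter lets the parsers adjust it
without knowing how the whitespace is eventually generated.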

Whenever one of the parsers finds an unexpected token (because  of  an
error  or  other  anomaly),  it  usually  outputs  the token anyhow to
prevent looping.  The routine OUT$DEFAULT provides default  formatting
rules  for  tokens  found  out of context;  these rules are not always
those  that  would  have  been  used  if  the  token  were   correctly
recognized.




5.7  Module PARSE1


This module contains the main parsing  routine  (PRS$MAIN)  and  other
routines  to  parse the major syntactic structures (Modules, Routines,
Blocks, etc.) of the BLISS language.  Together with its sister modules
PARSE2   and  PARSE3,  it  contains  the  decision  process  by  which
whitespace is reinserted into the stream of lexemes to form the output
stream.   The  parsing  routines  perform  a minimal syntactic (and no
semantic)  analysis  of  the  text  and  report  errors  as  they  are
discovered.

It should be noted that the syntax which the formatter sees  and  that
which the compiler sees are not necessarily the same:


     1.  The compiler joins REQUIRE files with the source  text;   the
         formatter does not.

     2.  The  compiler  ignores  text  not  selected  by   conditional
         compilation  controls;   the  formatter  must process all the
         input text.

     3.  The compiler expands all macros;  the formatter does not.

Thus what may appear to the formatter as an error  in  syntax  may  be
correct to the compiler.

The three parsing modules are maintained separately only because, as a
single  module,  they would be unmanageably bulky.  Each routine looks
at successive tokens in the order in which they naturally occur, using
the  routine  LEX$GETSYM  to  access each token, and disposes of those
tokens, in the same order, by calls to routine OUT$TOK.   In  between,
each routine makes certain decisions as to the current syntactic state
of the input text, and on the basis of context makes decisions  as  to
what whitespace to reinsert into the output stream.
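
The fetch-and-dispose discipline described above can be illustrated by
a toy Python formatter.  It is a sketch only:  the class, its methods
(stand-ins for LEX$GETSYM, OUT$TOK and the line-break routine), the
four-space indentation, and the handling of nothing but BEGIN, END and
";" are assumptions, and tokens are simply joined by single spaces.

```python
class Formatter:
    def __init__(self, tokens):
        self._tokens = list(tokens)
        self._pos = 0
        self.current = None  # the token most recently fetched
        self.output = []     # finished output lines
        self._line = []      # tokens of the line under construction
        self.level = 0

    def getsym(self):
        """Fetch the next token (None at end of input)."""
        in_range = self._pos < len(self._tokens)
        self.current = self._tokens[self._pos] if in_range else None
        self._pos += 1

    def out_tok(self):
        """Dispose of the current token by adding it to the output line."""
        self._line.append(self.current)

    def break_line(self):
        """Write the pending line, if any, at the current indentation."""
        if self._line:
            self.output.append("    " * self.level + " ".join(self._line))
            self._line = []

    def parse_block(self):
        """Format BEGIN ... END; the current token is BEGIN on entry."""
        self.out_tok()
        self.break_line()
        self.level += 1
        self.getsym()
        while self.current not in (None, "END"):
            if self.current == "BEGIN":
                self.parse_block()       # outputs up to its matching END
            else:
                self.out_tok()
                if self.current == ";":
                    self.break_line()
            self.getsym()                # one fetch per token disposed of
        self.break_line()
        self.level -= 1
        if self.current == "END":
            self.out_tok()
            self.break_line()


source = ["BEGIN", "x", "=", "1", ";", "BEGIN", "y", "=", "2", "END", "END"]
formatter = Formatter(source)
formatter.getsym()
formatter.parse_block()
```

Splitting the finished lines back into tokens returns the original
sequence, which is the property that the pairing of fetch and output
calls is meant to guarantee.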



5.8  Module PARSE2


This module contains all the routines pertaining  to  the  parsing  of
expressions,  especially of control expressions (IF/ THEN/ ELSE, INCR/
DECR, WHILE/ UNTIL, etc.) These  control  expressions  are  the  major
causes of indentation in the output file.  




5.9  Module PARSE3


This module contains all the routines pertaining  to  the  parsing  of
declarations,  with  the  exception of Modules and Routines (which are
handled by PARSE1.) There are specific routines for declarations (e.g.
STRUCTURE)  which  have  unusual  syntax, but many declarations have a
common general format and are handled by the default declaration
parser, DO_DECL_DEF.


5.10  Module SYMPRP


This short module has the function of making  certain  discriminations
between  classes  of  lexemes,  e.g.   which  can be used to terminate
expressions ("OF", "UNTIL", etc.)




5.11  Module UTILIT


This module contains the  error-handling  routine  UTL$ERROR  and  its
associated tables of error messages and types.


6.0  KEY ALGORITHMS


6.1  The Parsing Algorithm

The parsing algorithm used in PRETTY is a recursive  descent  analysis
of  the  major  structural  elements, with a strong tendency to ignore
syntactic elements (e.g.  declaration switches) which  are  irrelevant
to  the  task  of  formatting  the  text.   In this parse, the topmost
element is taken to be a block  body  (rather  than  MODULE  or  other
declaration,  which  might seem more natural from reading the language
manuals) in order to be able  to  handle  built-in  macro  references,
declarations, or whatever with equal facility.




6.2  The Search Algorithm

The process of finding  a  BLISS  keyword  among  the  other  uses  of
identifiers  is  done  by  a  binary search algorithm which appears in
Routine LOOKUP in Module LEX.  This routine has a  complete  table  of
all   BLISS  keywords.   The  capitalization  scheme  used  by  PRETTY
distinguishes these keywords from all other  uses  of  identifiers  in
implementing the selected case conversion options.


7.0  MAINTENANCE AND DIAGNOSTIC TOOLS AND PROCEDURES


7.1  Adding A Language Feature


If and when BLISS is extended, the formatter must be updated to handle
the  new  syntax  and associated keywords.  The process goes along the
following lines:




7.1.1  Adding A New Keyword - 

The new keyword may be inserted into the keyword table in module  LEX,
routine  LOOKUP, at any time.  It must be inserted at the proper place
according to ASCII collating sequence;  the table-look-up  in  routine
LOOKUP  does  a binary search which depends for its success on correct
sequencing.
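
The dependence of the search on correct sequencing can be demonstrated
with a small Python sketch.  It uses the standard bisect module in
place of routine LOOKUP, and the table holds only a handful of BLISS
keywords;  the function name and table contents are illustrative and
are not copied from module LEX.

```python
from bisect import bisect_left, insort

# A small subset of BLISS keywords, kept in ASCII collating sequence.
KEYWORDS = sorted([
    "BEGIN", "DECR", "ELSE", "END", "IF", "INCR",
    "MACRO", "MODULE", "THEN", "UNTIL", "WHILE",
])


def lookup(table, name):
    """Binary search for a name, ignoring the case used in the source."""
    key = name.upper()
    index = bisect_left(table, key)
    return index < len(table) and table[index] == key


found = lookup(KEYWORDS, "while")
missing = lookup(KEYWORDS, "counter")

# Appending a new keyword at the end breaks the ordering, so the
# binary search fails to find it.
misplaced = KEYWORDS + ["CASE"]
lost = not lookup(misplaced, "CASE")

# Inserting it at its proper place keeps the search working.
correct = list(KEYWORDS)
insort(correct, "CASE")
located = lookup(correct, "CASE")
```

The failure in the misplaced case is silent:  the keyword is simply
treated as an ordinary name, which is why the table ordering deserves
a check whenever a keyword is added.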



7.1.2  Adding A Syntactic Element - 

Depending on the nature of the syntactic function, this may be easy or
difficult.  The first thing to check is whether it has syntax similar
to that of some other element.  If so, the two can share a routine with
little effort (just as UNTIL and WHILE are handled by the same
routine).  Otherwise, it will be necessary to write a new routine to
perform  the analysis.  The basic guidelines for writing such routines
are:

     1.  Calls to the routines OUT$TOK (or OUT$STOKS)  and  LEX$GETSYM
         must   be  paired.   Otherwise  a  lexeme  will  be  lost  or
         duplicated.

     2.  Use existing routines (PRS$EXPRESSION,  for  example)  to  do
         most of the work.





7.1.3  Adding A User Option - 

The maintainer who plans to add a new  user  option  should  begin  by
studying module CONTRL.BLI, which contains the current option-handling
routines.  In particular, the maintainer should examine carefully how
CONTRL and SCANNR cooperate, by means of routine calls, in handling a
function which may be invoked either internally or externally (namely,
turning off the formatting process).


7.2  Debugging

Insertion of the comment line

!<BLF/DEBUG>

will cause routine LEX$GETSYM to print on the terminal each  syntactic
lexeme  encountered.   This,  coupled  with the use of a debugger,
is sufficient to determine exactly when any error condition
occurs and what routine examines a particular lexeme.



8.0  SYSTEM TEST DESCRIPTION


Two areas of functionality in BLF must be tested independently:

     1.  The property that all incoming lexemes are output in the same
         order  must be verified, by compiling representative programs
         both before and after processing by  BLF  and  comparing  the
         binary  output  files  bit  by  bit  with the file comparison
         utility program.

     2.  The visual properties of the listing  must  be  examined  for
         readability.   This  is  a  completely subjective process for
         which there can be no completely automatic methods.
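
Where no file comparison utility is at hand, the bit-by-bit check of
the first test can be approximated by a few lines of Python.  The
function name and the stand-in file names are hypothetical;  in real
use the two paths would name the binary files produced by compiling
the program before and after processing by BLF.

```python
import os
import tempfile


def binaries_match(before_path, after_path):
    """Return True if the two files have identical contents."""
    with open(before_path, "rb") as before, open(after_path, "rb") as after:
        return before.read() == after.read()


# Demonstration with stand-in files in a temporary directory.
with tempfile.TemporaryDirectory() as workdir:
    before = os.path.join(workdir, "before.obj")
    after = os.path.join(workdir, "after.obj")
    for path in (before, after):
        with open(path, "wb") as handle:
            handle.write(b"\x01\x02\x03")
    same = binaries_match(before, after)

    with open(after, "wb") as handle:
        handle.write(b"\x01\x02\x04")   # a single differing byte
    differs = not binaries_match(before, after)
```

Such a comparison is meaningful only if the compiler's output does not
embed source line numbers or similar layout-dependent information;
whether that holds for a given compiler must be checked separately.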