Google
 

Trailing-Edge - PDP-10 Archives - decuslib20-03 - decus/20-0078/libsim/select.hlp
There are no other files named select.hlp in the archive.
                                 SELECT
                                 ======

SELECT is a separately compiled SIMULA  CLASS  to  enable  searching  of
texts or files applying BOOLEAN search criteria like

"SWEDEN+DENMARK&-COPENHAGEN"

meaning that you want to find all lines or records which contain  either
the word "SWEDEN" or the word "DENMARK" but not the word "COPENHAGEN".

SELECT contains procedures to

a) Convert a text containing  such  a  BOOLEAN  search  formula  into  a
formula tree suitable for fast searching.

b) Procedures for applying such a formula tree to  a  text  or  part  or
whole of a text array.

The allowed operators in the formula are +(=OR),  &(=AND),  -(=NOT)  and
the  left  and  right  parenthesises,  ( and ).  If parenthesises do not
indicate otherwise, + is assumed to have the lowest  priority,  followed
by &, - and the parenthesises.  These operator characters can be changed
by the user.

SELECT does not contain any input/output procedures.

Accessible attributes of the CLASS select:


BOOLEAN PROCEDURE BUILD_CONDITION

          Translates a  text  string  like  "SWEDEN+DENMARK&-COPENHAGEN"
          into a formula tree
                                (OR)
                           SWEDEN   (AND)
                              DENMARK   (NOT)
                                      COPENHAGEN

          This tree can later  be  used  to  scan  a  text  through  the
          procedures  LINE_SCAN  and  ARRAY_SCAN.   BUILD_CONDITION will
          return FALSE if the input formula had bad  syntax,  e.g.  with
          non-balancing  parenthesises.   In  that  case, an appropriate
          error message is returned in the text SELECT_ERRMESS.

          Parameters to BUILD_CONDITION:

          REF (OPERATOR) TREE_TOP returns the formula tree.

          TEXT SELECTOR contains the input formula text.  SELECTOR  need
          not  contain  any  operators,  in which case it simply means a
          search for that whole  text.   If  SELECTOR  ==  NOTEXT,  then
          TREE_TOP  returns  NONE,  which  means  a formula matching all
          scanned texts.

          BOOLEAN CASESHIFT:  If TRUE, upper and lower  case  characters
          will be regarded as identical.


BOOLEAN PROCEDURE LINE_SCAN

          sapplies a scanning formula to a TEXT.  Returns  TRUE  if  the
          formula  is  satisfied  by  segments of the TEXT.  The formula
          "SWEDEN+DENMARK&-COPENHAGEN" is for example TRUE for the  TEXT
          "DENMARK  IS  A  EUROPEAN  COUNTRY"  but not TRUE for the TEXT
          "COPENHAGEN IS THE CAPITAL OF DENMARK".

          Parameters to LINE_SCAN:

          REF  (OPERATOR)  TREE_TOP  is  a  formula-tree   produced   by
          BUILD_CONDITION.   If  this  parameter is NONE, LINE_SCAN will
          always return TRUE.

          TEXT INLINE is the text to which the formula is to be applied.
          If  TREE_TOP =/= NONE AND INLINE == NOTEXT, then FALSE will be
          returned.


BOOLEAN PROCEDURE ARRAY_SCAN

          is similar to LINE_SCAN, but will apply the formula to several
          lines of TEXT comprising part or the whole of a TEXT array.

          Parameters to ARRAY_SCAN:

          REF  (OPERATOR)  TREE_TOP   is   the   formula   produced   by
          BUILD_CONDITION.

          TEXT ARRAY LINES is the array  of  text  lines  to  which  the
          formula is to be applied.

          INTEGER I1, I2 are the lower and upper bound of the  lines  in
          the  TEXT ARRAY to which the formula is to be applied.  I1 may
          be larger than the absolute lower bound of the array,  and  I2
          may be less than the absolute upper bound of the array, if you
          only want to apply the formula to part of  the  lines  in  the
          whole  array.   If  TREE_TOP  ==  NONE,  TRUE  will  always be
          returned.  If TREE_TOP =/= NONE AND I2 < I1, then  FALSE  will
          always be returned.


PROCEDURE TREE_PRINT

          will print a formula tree on SYSOUT in the format
          (SWEDEN+(DENMARK&(-COPENHAGEN)))
          i.e. fully parenthesized to show any  default  assumptions  of
          operator  priorities.   Outimage  should be called immediately
          after TREE_PRINT.

          Parameters to TREE_PRINT:

          REF (OPERATOR) TREE_TOP is the formula tree to be output.


PROCEDURE SET_OPERATOR_CHARACTERS

          will tell the package which  characters  are  to  be  used  as
          delimiters  in  formulas  input to BUILD_CONDITION.  A default
          assumption is made if SET_OPERATOR_CHARACTERS  is  not  called
          from your program.

          Parameters to SET_OPERATOR_CHARACTERS:

          TEXT T:  This TEXT should always be of length  5  and  contain
          the following five characters:

                    Default Character
                       +    OR
                       &    AND
                       -    NOT
                       (    LEFT PARENTHESIS
                       )    RIGHT PARENTHESIS


TEXT SELECT_ERRMESS

          If BUILDCONDITION finds a syntax error in  the  formula,  this
          TEXT  will  return an appropriate error message.  You can thus
          combine SELECT with SAFEIO and write e.g.:

                  request("Give selection criteria:",
                  NOTEXT,textinput(line1selector,
                  buildcondition(condition,selector,caseshift)),
                  selecterrmess,myhelp("SELECT"));


CLASS OPERATOR

          Is the qualification to be used when you declare formula  tree
          variables in your program, e.g.:

                    REF (OPERATOR) selector1, selector2;


OPTIONS(/l); COMMENT demonstration example for SELECT;
          COMMENT this program will list all lines in an input file
          which satisfy a selection formula;
          BEGIN
            EXTERNAL TEXT PROCEDURE rest, upcase;
            EXTERNAL TEXT PROCEDURE scanto, from, conc;
            EXTERNAL CHARACTER PROCEDURE findtrigger;
            EXTERNAL BOOLEAN PROCEDURE frontcompare, puttext;
            EXTERNAL INTEGER PROCEDURE scanint, search;
            EXTERNAL CLASS select;
            select BEGIN
              REF (operator) formula;
              LINECOPY_BUFFER:- blanks(150);
              ask_for_formula:
              outtext("Input selection formula:"); breakoutimage;
              inimage;
              IF NOT build_condition(formula,
              sysin.image.strip,TRUE) THEN
              BEGIN outtext(select_errmess); GOTO ask_for_formula;
              END;
              tree_print(formula); outimage;
              INSPECT NEW infile("Infile *") DO
              BEGIN
                open(blanks(150)); sysout.image:- image;
                WHILE NOT endfile DO
                BEGIN
                  inimage;
                  IF line_scan(formula,image.strip) THEN outimage;
                END;
                close;
              END;
            END;
          END;


EFFICIENCY CONSIDERATIONS:

          You need not consider this section to make the select  package
          work, only if you want to make your programs more efficient.

          Scanning of non-significant texts takes time,  especially  for
          complex  formulas requiring many scans of the text.  In such a
          case you can often save time by  only  applying  LINE_SCAN  or
          ARRAY_SCAN   to   those  parts  of  your  text  which  contain
          information, e.g. by supressing non-significant  blanks.   The
          simplest  case  of this is to strip your texts before scanning
          them.

          LINE_SCAN on a long text is more efficient than ARRAY_SCAN  on
          several  shorter  texts,  especially  if  CASESHIFT  is  TRUE.
          Sometimes, you can let your array elements be  subtexts  of  a
          common main text, and apply LINE_SCAN on part or whole of this
          main text  instead  of  ARRAY_SCAN  on  the  array.   However,
          ARRAY_SCAN is faster than LINE_SCAN if the total length of the
          scanned texts becomes  smaller,  e.g.  if  use  of  ARRAY_SCAN
          allows  you to avoid scanning of blanks at the end of lines in
          the text.



     TEXT LINECOPY_BUFFER

          LINECOPY_BUFFER is a text attribute to  select  used  to  keep
          copies  of  the  texts to be scanned when caseshift = TRUE and
          when the number of array elements to be scanned by  ARRAY_SCAN
          is larger than 10.

          Sometimes you  can  improve  efficiency  by  assigning  values
          yourself  to  this  buffer.  Do not make assignments to it too
          often.


     TEXT LINE

          If you want to write your own procedures similar to  LINE_SCAN
          and  ARRAY_SCAN  you  need access to this attribute of SELECT,
          which internally refers to the lines to be scanned.

                          [END OF SELECT.HLP]