Google
 

Trailing-Edge - PDP-10 Archives - decuslib10-05 - 43,50337/23/store.rno
There is 1 other file named store.rno in the archive. Click here to see a list.
.title ^^STORE - A SIMPLE TEXT ORIENTED DATA BASE HANDLER
.spacing 1
.nojustify
.left margin 5
.paper size 55,65
.indent -5
STORE - A SIMPLE TEXT ORIENTED DATA BASE HANDLER
.skip 1
By Jacob Palme, Swedish National Defense Research Institute.
.skip 1
.indent -5
ABSTRACT
.skip 1
STORE is a simple data base handler written in SIMULA.
STORE is primarily used through two procedures,
PUTMESSAGE and GETMESSAGE. PUTMESSAGE stores a message under a
key in the data base, GETMESSAGE retrieves a message for a given
key from the data base. Both message and key can be variable
length text strings.
Any structuring of key and message into fields is up to the user,
not done by the STORE program.
.skip 1
The strings are stored in a direct access file on secondary storage.
The system uses a combination of hashing and binary trees to find
the place of a message in the data base. Hashing to get high efficiency,
binary trees to allow unlimited growth of the data base.
.skip 1
The system is designed with the goal of being easy to use for a
programmer and having as few restrictions as possible.
.skip 1
.indent -5
PUTTING AND GETTING MESSAGES
.SKIP 1
You store a message into the data base by calling the
.break
BOOLEAN PROCEDURE PUTMESSAGE(KEY, MESSAGE);
.break
TEXT KEY, MESSAGE;
.skip 1
KEY and MESSAGE should both be TEXT variables of arbitrary
contents and length.
The length of KEY should however not be larger than 400.
.skip 1
The value of PUTMESSAGE will be TRUE if the message could be stored
in the data base, FALSE if this was not possible because of a too
long key or because there already was a message in the data base
under the same key.
.skip 1
If you want to overwrite a previous message with the same key
in the data base, you should set the switch REMOVE to TRUE.
If REMOVE is TRUE, PUTMESSAGE will overwrite previous messages
with the same key, if REMOVE is FALSE, PUTMESSAGE will not
overwrite previous messages and will return FALSE if a message
with the same key was already in the data base.
.skip 1
To get a message back from the data base, you use the
.break
TEXT PROCEDURE GETMESSAGE(KEY);
.break
TEXT KEY;
.skip 1
This procedure will look for a message stored under the given
key in the data base. If no such message can be found, the
value NOTEXT is returned.
.skip 1
.indent -5
CONVERTING KEYS TO UPPER CASE
.skip 1
If you want a key to refer to the same message regardless of
upper or lower case characters in the key, then you
should apply the TEXT PROCEDURE UPCASE(KEY);
.break
TEXT(KEY);
.break
to the key. This procedure changes all lower case characters
in the key to upper case. So if you put a message
under a key "GooD" you can later get the same message
through the key "GOOD" or "good".
.skip 1
.indent -5
PRODUCING A DIRECTORY OF A FILE
.skip 1
The STORE package has a built-in procedure for producing a
directory of a file, that is for listing all the keys in the
file.
.skip 1
The procedure is 
.break
PROCEDURE DIRECT(KEYPROCESSOR);
.break
PROCEDURE KEYPROCESSOR;
.skip 1
Where KEYPROCESSOR is a procedure supplied by you with two
parameters KEY and LOCATION. Example:
.skip 1
.nofill
PROCEDURE KEYPROCESSOR(KEY,LOCATION);
TEXT KEY;
.break
INTEGER LOCATION;
BEGIN OUTTEXT(KEY);
.break
OUTINT(LOCATION,5);
.break
OUTIMAGE;
END;
.fill
.skip 1
The procedure KEYPROCESSOR will be called once for each
KEY in the data base. LOCATION is the number of the block in
the file where the MESSAGE for that KEY ends.
.skip 1
.indent -5
PUTTING STORE INTO YOUR APPLICATION PROGRAM
.skip 1
In the outermost block of your main program write:
.break
EXTERNAL CLASS STORE;
.break
Also put the same statement in front of any separately compiled
modules which are to be able to use the facilities of STORE.
.skip 1
Where you need a STORE in your data base you write the
statements:
.break
REF(STORE) MYSTORE;
.break
MYSTORE:- NEW STORE(VIRTUALSIZE, INITIALSTORESIZE);
.break
.skip 1
VIRTUALSIZE is the size of the storage area in which
lines from the data base are kept in core. A large virtualsize will
thus reduce the number of disk reads but increase core usage.
For most applications, VIRTUALSIZE=0 is best. However, if
you call GETMESSAGE to do long searches in the data base, a higher
value may be good. Try for yourself.
.skip 1
INITIALSTORESIZE is the size of the area on disk into which the
key is initially hashed. If this area is full, further messages
are stored in binary trees. A large INITIALSTORESIZE will thus
give a larger disk file in the beginning, but will reduce
disk accesses and CPU-time. In a test case,
CPU time had increased 20% and disk accesses 30% when the
actual store was allowed to grow to four times the INITIALSTORESIZE,
as compared with starting from the beginning with a four times
larger INITIALSTORESIZE.
.skip 1
A simple rule is to choose INITIALSTORESIZE as about half the
total size into which you expect your data base to grow.
.skip 1
Once a data base has been started, the INITIALSTORESIZE cannot
be changed. A new value of INITIALSTORESIZE is ignored by the
STORE program for old data bases.
.skip 1
Both VIRTUALSIZE and INITIALSTORESIZE are measured in disk
blocks, that is 128 36-bit words or 640 ASCII-7-bit characters.
.skip 1
Before you can use the data base, you must call the procedure
.break
MYSTORE.OPEN(FILENAME);
.break
where FILENAME is a text of 1-6 letters or digits.
No extension, STORE will add the extension ".DAF".
.skip
Before stopping the execution of your program, you must
call the procedure CLOSE.
If you leave your program by CTRL-C or by a system crash,
you may lose information newly stored into the data base.
You can ensure against this by calling CLOSE before each
interaction with the terminal user and calling OPEN again
when you need to use STORE. But this will
double the CPU time and cost of your run in
cases where you only make one single
call to PUTMESSAGE or GETMESSAGE between
each OPEN and CLOSE.
.skip 1
.indent -5
RESTRICTIONS
.skip 1
The length of the KEY must not exceed 400. The total size of
one data base should not be larger than 9999 disk blocks.
There is no program yet available to reorganize a data base.
Reorganisation will however
in many applications never be needed, since the
program automatically frees storage when messages are
removed.
.skip 1
The parameter texts key and message may contain any ASCII characters.
(In release 1C of DECsystem-10 SIMULA, certain device control characters
are not allowed, but this has been remedied in release 2.)
.skip 1
.indent -5
EXAMPLE OF USE
.skip 1
This very simple example only illustrates how to use the
procedures in the package in the simplest case:
.skip 1
.nofill
BEGIN
.left margin 8
EXTERNAL CLASS STORE;
REF (STORE) MYSTORE;
TEXT KEY, MESSAGE;
.skip 1
PROCEDURE KEYPROCESSOR(KEY, LOCATION);
TEXT KEY;
.break
INTEGER LOCATION;
BEGIN OUTTEXT(KEY);
.break
OUTIMAGE;
END;
.skip 1
MYSTORE:- NEW STORE(0,8);
.break
MYSTORE.OPEN("TEST");
KEY:- COPY("KEYTEXT");
.break
MESSAGE:- COPY("MESSAGETEXT");
MYSTORE.PUTMESSAGE(KEY,MESSAGE);
OUTTEXT(MYSTORE.GETMESSAGE(KEY));
.break
OUTIMAGE;
.break
COMMENT WILL PRINT "MESSAGETEXT";
MYSTORE.DIRECT(KEYPROCESSOR); COMMENT WILL PRINT "KEYTEXT";
MYSTORE.REMOVE:= TRUE; COMMENT TO ALLOW CHANGING MESSAGES FOR THE SAME KEY;
MESSAGE:= "NEWMESSAGE";
MYSTORE.PUTMESSAGE(KEY,MESSAGE); COMMENT THE NEW MESSAGE REPLACES THE OLD;
MYSTORE.CLOSE;
.left margin 5
END;
.fill
.skip 1
.indent -5
FILES DELIVERED
.skip 1
The delivery of the STORE system consists of the following files:
.nofill
STORE.SIM####Source program of the CLASS store
STOREU.SIM###Source program of a demonstration example using store
STOREU.SAV###Executable core image of demonstration example
STORE.RNO####The text you are reading now in RUNOFF format
STORE.HLP###The text you are reading now in reading format
.fill
.skip 1
The demonstration program STOREU also uses the SAFEIO package for
terminal interaction.
.skip 1
At the QZ data center in Stockholm, you will find the files
on the DEC-tape H39I04.
.skip 1
.indent -5
INTERNAL STRUCTURE OF THE DIRECT ACCESS FILE
.skip 1
The following sections are not necessary to use the STORE package, only
if you want to know how it works internally or if you need
to modify the program.
.skip 1
The data base consists of disk blocks. Each disk block is 128 words
or 640 characters. The last two of these characters are always <CR><LF>
to mark the end of the disk block.
.skip 1
The messages are stored in the data base in STORE UNITS. Each STORE UNIT
consists of
.break
.nofill
5 characters = s = length of store unit as a decimal number
3 characters = k = length of key as a decimal unit
k characters = text of key
s-k-8 characters = text of message
.skip 1
.fill
The disk block for a STORE UNIT is chosen in the following way:
.indent -2
>#Compute a hash number from the text in the KEY. See program
for hashing algoritm.
.indent -2
>#The number of the first disk block to look at is computed
as (((hash number) modulo INITIALSTORESIZE)+1). At the same
time the hash number is divided by INITIALSTORESIZE and the
remainder kept.
.indent -2
>#If that disk block is too full(while putting) or does not
contain the message(while getting) then the next disk block
is taken from a binary tree started at the first disk block.
The first four characters of each disk block contains
a pointer to the left continuation of the tree, and the next
four characters a pointer to the right continuation.
Both as decimal integers. Left is chosen if the remainder
of the hash number is even, right if the remainder is odd.
After this, the remainder is again divided by 2 and
the remainder of this kept for going further down the tree.
.skip 1
When the total length of a STORE UNIT is larger than the storage part of
a disk block(630 characters) then the first part
of the STORE UNIT is put into one disk block, the continuation
into later disk blocks. Each succesive STORE UNIT has as its
length the remainder of the total STORE UNIT. For example,
for a message which has a total STORE UNIT of
1500 characters and the key "KEYVALUE", the first disk block will contain
.break
" 1500  8KEYVALUE... first 614 characters of message..."
.break
and the second disk block will contain
.break
" #886  8KEYVALUE... the next 614 characters of message..."
.break
and the third and last disk block will contain
.break
" #272  8KEYVALUE... the last 256 characters of message..."
.skip 1
Further information about the internal organisation of
the program is given by comments in the text of the program
itself.