Trailing-Edge - PDP-10 Archives - decuslib20-05 - decus/20-0149/multhe.rnh
.LM0;.RM75;.TS72;.LC;.AP;.FLAG CAPITAL;.NO PAGING;.NO NUMBER;#
.BR;^MULTIPLE ^LINEAR ^REGRESSION ^ANALYSIS
.SK;^^THE REGRESSION MODEL\\
 ^IN A REGRESSION PROBLEM THE RESEARCHER POSTULATES A CERTAIN RELATIONSHIP 
BETWEEN A RANDOM VARIABLE Y (THE REALIZATIONS OF WHICH ARE SUBJECT 
TO SOME FORM OF DISTURBANCE) ON THE ONE SIDE AND A NUMBER OF VARIABLES 
X1,...,XP 
(WHICH ARE WITHOUT OR AT LEAST ALMOST WITHOUT DISTURBANCES) ON THE 
OTHER SIDE. ^THIS RELATIONSHIP IS EXPRESSED BY A MATHEMATICAL FORMULA, 
WHICH IS CALLED THE (LINEAR) REGRESSION MODEL, FOR INSTANCE:
.TS72;.SK;.I18;Y = A0 + A1 * X1 +#...#+ AP * XP + E	(1)
.SK;IN WHICH A0,...,AP REPRESENT UNKNOWN REGRESSION COEFFICIENTS (PARAMETERS)
WHICH ARE TO BE ESTIMATED AND E REPRESENTS THE DISTURBANCE.
^IF A CONSTANT TERM IS PRESENT IN THE MODEL FORMULA (IN (1) THE A0), THE MODEL IS 
SAID TO BE AN 'INTERCEPT#MODEL'; IF NO CONSTANT TERM IS PRESENT, THE MODEL IS 
CALLED A 'NO-INTERCEPT#MODEL'.
.BR;^THE VARIABLES X1,...,XP AND THE VARIABLE Y CAN ALSO REPRESENT 
(OTHER) TRANSFORMED VARIABLES. ^THE RESEARCHER MIGHT HAVE REASONS 
TO BELIEVE (FROM BACKGROUND INFORMATION CONCERNING THE EXPERIMENT) 
THAT TRANSFORMATIONS ARE NECESSARY, FOR INSTANCE:
.BR;1)#TO OBTAIN NORMALLY DISTRIBUTED DISTURBANCES,
.BR;2)#TO OBTAIN A GREATER HOMOGENEITY OF THE VARIANCES OF THE DISTURBANCES,
.BR;3)#TO LINEARIZE NON-LINEAR REGRESSION MODELS (IF POSSIBLE).
.BR;^THE TRANSFORMED REGRESSION MODEL CAN BE WRITTEN AS:
.SK;.I5;^G(Y) = A0 + A1 * ^F1(X1,...,XM) +#...#+ AP * ^FP(X1,...,XM) + E	(2)
.SK;IN WHICH ^G, ^F1,...,^FP REPRESENT THE TRANSFORMATIONS,
.BR;.I12;A0,...,AP REPRESENT THE PARAMETERS TO BE ESTIMATED,
.BR;.I20;Y REPRESENTS THE DEPENDENT VARIABLE,
.BR;.I12;X1,...,XM REPRESENT THE INDEPENDENT VARIABLES,
.BR;.I20;E REPRESENTS THE DISTURBANCE.
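.SK;^AS AN ILLUSTRATION OF REASON 3, THE FOLLOWING MINIMAL SKETCH (WRITTEN IN 
^PYTHON AND NOT PART OF THE PROGRAM) LINEARIZES THE MODEL Y = ALPHA * ^EXP(BETA * X) 
BY TAKING ^G(Y) = ^LOG(Y) AND ^F1(X) = X. ^THE DATA, ALL NAMES, AND THE ASSUMPTION 
THAT THE DISTURBANCE ACTS MULTIPLICATIVELY ON Y ARE MADE UP FOR THE EXAMPLE.

```python
import math

def fit_line(u, v):
    """Least squares intercept and slope of the straight line v = a0 + a1*u."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    sxx = sum((ui - mu) ** 2 for ui in u)
    sxy = sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v))
    a1 = sxy / sxx
    return mv - a1 * mu, a1

# Made-up, disturbance-free data generated from y = 2 * exp(0.5 * x).
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [2.0 * math.exp(0.5 * xi) for xi in x]

# Transformed model: log(y) = log(alpha) + beta * x, which is linear in x.
a0, a1 = fit_line(x, [math.log(yi) for yi in y])
alpha, beta = math.exp(a0), a1   # back-transform the intercept only
```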
 ^CHOOSING A TRANSFORMATION BY 'TRIAL AND ERROR' IS RATHER 
TIME CONSUMING AND COSTLY. ^MUCH OF THE DIFFICULTY LIES IN THE LOCATION PARAMETER: 
IT IS NOT UNUSUAL THAT ^LOG#(X) YIELDS NO IMPROVEMENT,
BUT THAT ^LOG#(C+X) GIVES BETTER RESULTS FOR A PARTICULAR CHOICE OF C. 
^BECAUSE THIS HOLDS FOR ALMOST ANY TRANSFORMATION OF SOME IMPORTANCE, 
WE MUST ACTUALLY SOLVE IN EACH CASE A NONLINEAR ADJUSTMENT PROBLEM. ^OFTEN 
THOUGH, A SIMPLE FORM OF THE TRANSFORMATION IS SUGGESTED BY THE RESEARCHER 
WHO IS BETTER ACQUAINTED WITH THE PECULIARITIES OF THE EXPERIMENT.
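.SK;^THE ROLE OF THE LOCATION PARAMETER C CAN BE MADE CONCRETE WITH A CRUDE GRID 
SEARCH: FOR EACH CANDIDATE C, Y IS REGRESSED ON ^LOG(C+X) AND THE RESIDUAL SUMS OF 
SQUARES ARE COMPARED. ^THIS ^PYTHON SKETCH IS ONLY AN ILLUSTRATION AND NOT A 
SUBSTITUTE FOR A PROPER NONLINEAR ADJUSTMENT; THE DATA, THE CANDIDATE VALUES AND 
ALL NAMES ARE ASSUMPTIONS.

```python
import math

def rss_simple(u, y):
    """Residual sum of squares of the straight-line fit y = a0 + a1*u."""
    n = len(u)
    mu, my = sum(u) / n, sum(y) / n
    suu = sum((ui - mu) ** 2 for ui in u)
    suy = sum((ui - mu) * (yi - my) for ui, yi in zip(u, y))
    a1 = suy / suu
    a0 = my - a1 * mu
    return sum((yi - a0 - a1 * ui) ** 2 for ui, yi in zip(u, y))

def best_shift(x, y, candidates):
    """Return the candidate c with the smallest RSS for y on log(c + x)."""
    return min(candidates,
               key=lambda c: rss_simple([math.log(c + xi) for xi in x], y))

# Made-up data that are exactly linear in log(3 + x).
x = [1.0, 2.0, 4.0, 8.0, 16.0]
y = [math.log(3.0 + xi) for xi in x]
c = best_shift(x, y, [0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
```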
.SK2;^^LEAST SQUARES\\
 ^REGRESSION ANALYSIS CONSISTS IN FACT OF FITTING A HYPERPLANE 
OF THE REQUIRED DIMENSION TO THE DATA. ^THE FITTING IS DONE WITH THE METHOD 
OF LEAST SQUARES, WHICH MEANS THAT THE SUM OF THE SQUARES OF THE DIFFERENCES 
BETWEEN THE OBSERVED VALUES OF Y AND THE ESTIMATED VALUES OF THE EXPECTATION 
OF Y IS MINIMIZED. ^THIS MINIMIZED SUM OF SQUARES IS ALSO CALLED THE RESIDUAL 
SUM OF SQUARES.
.BR;^IN MATRIX NOTATION THE REGRESSION MODEL CAN BE WRITTEN AS:
.SK;.I30;^Y = ^XA + E	(3)
.SK;IN WHICH ^Y IS A (N*1) RANDOM VECTOR OF OBSERVATIONS,
.BR;.I9;^X IS A (N*P) MATRIX OF KNOWN (FIXED) VALUES,
.BR;.I9;A IS A (P*1) VECTOR OF (UNKNOWN) PARAMETERS,
.BR;.I5;AND E IS A (N*1) RANDOM VECTOR OF DISTURBANCES.
.SK;^IT IS SUPPOSED THAT ^E(E)#=#0 AND VAR(E)#=#^ISIGMA_^2, IN WHICH ^I 
IS THE UNIT MATRIX, THUS:
.SK;.I31;^E(^Y) = ^XA	(4)
 ^THE SUM OF SQUARES OF THE DIFFERENCES BETWEEN THE OBSERVED VALUES OF ^Y AND 
THE ESTIMATED VALUES FOR THE EXPECTATION OF ^Y THUS EQUALS:
.SK;.I17;(^Y-^XA)'(^Y-^XA) = <Y'Y - 2A'<X'Y + A'^X'^XA	(5)
.SK;(FOR A'<X'Y IS A SCALAR AND THEREFORE EQUAL TO ^Y'^XA).
 ^CHOOSING AS LEAST SQUARES ESTIMATOR B THAT VALUE OF A WHICH MINIMIZES (5),
INVOLVES DIFFERENTIATING WITH RESPECT TO THE ELEMENTS OF A AND EQUATING 
THE RESULT TO ZERO:
.SK;.I19;-2<X'Y + 2^X'^XB = 0,##THUS:##<X'Y = ^X'^XB	(6)
.SK;^THIS SYSTEM IS CALLED THE NORMAL EQUATIONS.
^IF THE RANK OF ^X EQUALS P, <X'X IS NONSINGULAR AND THE INVERSE OF <X'X 
EXISTS. ^IN THAT CASE THE SOLUTION OF THE NORMAL EQUATIONS CAN BE WRITTEN AS:
.SK;.I29;B = INV(<X'X)X'Y	(7)
.SK;^OBSERVE THAT P _<= N MUST HOLD FOR THE RANK OF ^X TO BE P AT ALL. 
^THEREFORE AT LEAST AS MANY OBSERVATIONS MUST BE MADE AS THERE ARE PARAMETERS IN THE MODEL.
^ALSO OBSERVE THAT ^E(B)#=#INV(<X'X)^X'^E(^Y)#=#A, THUS B IS AN UNBIASED 
ESTIMATOR OF A.
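.SK;^THE NORMAL EQUATIONS (6) CAN BE SET UP AND SOLVED IN A FEW LINES. ^THE FOLLOWING 
^PYTHON SKETCH IS A MINIMAL ILLUSTRATION UNDER THE ASSUMPTION THAT ^X HAS FULL RANK; 
IT SOLVES THE SYSTEM BY ELIMINATION INSTEAD OF FORMING THE INVERSE EXPLICITLY, AND 
IT IS NOT CLAIMED TO BE THE METHOD USED BY THE PROGRAM. ^THE DATA AND ALL NAMES ARE 
MADE UP.

```python
def solve(a, b):
    """Solve the square system a*x = b by Gauss-Jordan elimination.

    Partial pivoting is used. No check for singularity is made.
    """
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for k in range(n):
        piv = max(range(k, n), key=lambda r: abs(m[r][k]))
        m[k], m[piv] = m[piv], m[k]
        for r in range(n):
            if r != k:
                f = m[r][k] / m[k][k]
                m[r] = [u - f * v for u, v in zip(m[r], m[k])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(x, y):
    """Least squares estimate b from the normal equations X'X b = X'y."""
    p = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(p)]
    return solve(xtx, xty)

# Intercept model y = a0 + a1*x1: the first column of X is the constant 1.
x = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]          # made-up data, exactly y = 1 + 2*x1
b = ols(x, y)
```

.SK;^FOR A NO-INTERCEPT MODEL THE COLUMN OF ONES IS SIMPLY LEFT OUT OF ^X.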
.SK;^THE LEAST SQUARES ESTIMATOR HAS THE FOLLOWING PROPERTIES:
.LM+3;.I-3;1.#^IT IS AN ESTIMATOR WHICH MINIMIZES THE SUM OF SQUARES OF 
DEVIATIONS, IRRESPECTIVE OF ANY DISTRIBUTION PROPERTIES OF THE DISTURBANCES. 
^THE ASSUMPTION THAT THE DISTURBANCES ARE NORMALLY DISTRIBUTED IS, OF COURSE, 
NECESSARY FOR TESTS WHICH DEPEND ON THIS ASSUMPTION, SUCH AS T- OR ^F- TESTS, 
OR FOR OBTAINING CONFIDENCE INTERVALS BASED ON T- OR ^F- DISTRIBUTIONS.
.BR;.I-3;2.#^ACCORDING TO THE ^GAUSS-^MARKOV THEOREM, THE ELEMENTS OF B ARE 
UNBIASED ESTIMATORS WHICH HAVE MINIMUM VARIANCE AMONG ALL LINEAR FUNCTIONS OF 
THE ^Y'S WHICH PROVIDE UNBIASED ESTIMATORS, AGAIN IRRESPECTIVE OF THE 
DISTRIBUTION PROPERTIES OF THE DISTURBANCES (PROVIDED THAT ^E(E)#=#0 AND VAR(E)#=#^ISIGMA_^2).
.BR;.I-3;3.#^IF THE DISTURBANCES ARE MUTUALLY INDEPENDENT AND NORMALLY 
DISTRIBUTED (WITH ^E(E)#=#0 AND VAR(E)#=#^ISIGMA_^2), THEN B IS ALSO THE 
MAXIMUM LIKELIHOOD ESTIMATOR.
.LM-3;.SK;^THE VARIANCE-COVARIANCE MATRIX OF B IS:
.SK;.I25;VAR(B) = INV(^X'^X)SIGMA_^2	(8)
.SK;^THE VARIANCES ARE THE DIAGONAL AND THE COVARIANCES THE 
OFF-DIAGONAL ELEMENTS.
.SK;^AN UNBIASED ESTIMATOR FOR SIGMA_^2 IS GIVEN BY:
.SK;.I23;S_^2 = (<Y'Y - B'<X'Y) / (N-P)	(9)
.SK;^THE SQUARE ROOT OF THIS ESTIMATOR IS FREQUENTLY CALLED 'STANDARD 
ERROR OF ESTIMATE'. ^IN THE PRINTED OUTPUT OF THE PROGRAM IT IS 
INDICATED MORE PROPERLY AS 'STANDARD DEVIATION OF THE ERROR TERM'.
 ^LET VIJ BE THE ELEMENT IN THE I-TH ROW AND J-TH COLUMN OF INV(<X'X), 
THEN SDI = S * ^SQRT(VII) ESTIMATES THE STANDARD DEVIATION OF BI, AND 
CIJ = VIJ / ^SQRT(VII * VJJ) GIVES THE CORRELATION COEFFICIENT BETWEEN 
BI AND BJ FOR I = 1,...,P AND J = 1,...,P. ^THUS:
.TS71;.SK;.I27;VII = (SDI / S)_^2	(10)
.BR;AND
.BR;.I10;VIJ =  CIJ * ^SQRT(VII * VJJ) = CIJ * (SDI * SDJ) / S_^2	(11)
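.SK;^THE QUANTITIES S, SDI AND CIJ FOLLOW DIRECTLY FROM INV(<X'X) AND THE RESIDUAL 
SUM OF SQUARES. ^THE ^PYTHON SKETCH BELOW SHOWS ONE POSSIBLE WAY TO COMPUTE THEM FOR 
A SMALL MADE-UP DATA SET; THE EXPLICIT INVERSE IS FORMED ONLY BECAUSE ITS ELEMENTS 
VIJ ARE NEEDED, AND ALL NAMES ARE ASSUMPTIONS.

```python
import math

def inverse(a):
    """Invert a square matrix by Gauss-Jordan elimination (no singularity check)."""
    n = len(a)
    m = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for k in range(n):
        piv = max(range(k, n), key=lambda r: abs(m[r][k]))
        m[k], m[piv] = m[piv], m[k]
        d = m[k][k]
        m[k] = [v / d for v in m[k]]
        for r in range(n):
            if r != k:
                f = m[r][k]
                m[r] = [u - f * v for u, v in zip(m[r], m[k])]
    return [row[n:] for row in m]

def coefficient_summary(x, y):
    """Return b, s, the standard deviations sd, the correlations c and inv(X'X)."""
    n, p = len(x), len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(p)]
    v = inverse(xtx)
    b = [sum(v[i][j] * xty[j] for j in range(p)) for i in range(p)]    # (7)
    yty = sum(yi * yi for yi in y)
    s2 = (yty - sum(bi * ti for bi, ti in zip(b, xty))) / (n - p)      # (9)
    s = math.sqrt(s2)
    sd = [s * math.sqrt(v[i][i]) for i in range(p)]
    c = [[v[i][j] / math.sqrt(v[i][i] * v[j][j]) for j in range(p)]
         for i in range(p)]
    return b, s, sd, c, v

# Made-up data for the intercept model y = a0 + a1*x1.
x = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.1, 2.9, 5.2, 6.8]
b, s, sd, c, v = coefficient_summary(x, y)
```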
.SK; ^A FREQUENTLY USED STATISTICAL MEASURE FOR EVALUATING REGRESSION MODELS 
IS THE MULTIPLE CORRELATION COEFFICIENT ^R WHICH IS DEFINED IN THE INTERCEPT MODEL AS THE SQUARE 
ROOT OF THE PROPORTION OF THE CORRECTED TOTAL SUM OF SQUARES ACCOUNTED FOR BY THE 
MODEL. ^IF THE CORRECTION FOR MEANS IS DENOTED BY NU_^2, WITH U = ^SUM(I,1,N,YI)/N,
THEN ^R CAN BE DEFINED AS:
.SK;.I7;^R_^2 = (B'^X'^Y-NU_^2)/(^Y'^Y-NU_^2) = 1 - (^Y'^Y-B'^X'^Y)/(^Y'^Y-NU_^2)	(12)
.SK;^HOWEVER, WE MUST DIVIDE ^Y'^Y-B'^X'^Y BY N-P, NOT BY N, TO OBTAIN AN 
UNBIASED ESTIMATOR OF SIGMA_^2; MOREOVER, IT IS CUSTOMARY TO DIVIDE ^Y'^Y-NU_^2 
BY N-1, NOT BY N. ^IF WE ADOPT BOTH MODIFICATIONS WE OBTAIN THE ADJUSTED 
MULTIPLE CORRELATION COEFFICIENT, WHICH CAN THUS BE DEFINED AS:
.SK;.I11;ADJ(^R)_^2 = 1 - (N-1)/(N-P) * (^Y'^Y-B'^X'^Y)/(^Y'^Y-NU_^2)	(13)
 ^IN THE NO-INTERCEPT MODEL THE CORRECTION FOR MEANS IS IGNORED, GIVING 
AS DEFINITION OF ^R_^2: B'^X'^Y/^Y'^Y#=#1#-#(^Y'^Y-B'^X'^Y)/^Y'^Y,# 
WHILE THE ADJ(^R)_^2 IS DEFINED CORRESPONDINGLY AS: 1#-#N/(N-P)#*#(^Y'^Y-B'^X'^Y)/^Y'^Y.
^R_^2 ITSELF IS OFTEN CALLED THE 'PROPORTION OF VARIATION EXPLAINED'.
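.SK;^THE FOUR DEFINITIONS ABOVE DIFFER ONLY IN THE TOTAL SUM OF SQUARES AND ITS 
DEGREES OF FREEDOM, WHICH THE FOLLOWING ^PYTHON SKETCH MAKES EXPLICIT. ^THE FUNCTION, 
ITS ARGUMENT NAMES AND THE NUMBERS IN THE EXAMPLE ARE ASSUMPTIONS MADE FOR 
ILLUSTRATION.

```python
def r_squared(yty, bxty, n, p, mean=None):
    """Return (R squared, adjusted R squared).

    yty  : Y'Y, the uncorrected total sum of squares
    bxty : B'X'Y, the sum of squares accounted for by the model
    n, p : number of observations and number of parameters
    mean : U, the mean of the observations, for the intercept model;
           None selects the no-intercept definitions
    """
    rss = yty - bxty                       # residual sum of squares
    if mean is None:
        total, df_total = yty, n           # no correction for means
    else:
        total, df_total = yty - n * mean ** 2, n - 1
    r2 = 1.0 - rss / total
    adj = 1.0 - df_total / (n - p) * rss / total
    return r2, adj

# Made-up sums for n = 4 observations, p = 2 parameters and mean 4.0.
r2, adj = r_squared(82.9, 82.818, 4, 2, mean=4.0)
r2_no, adj_no = r_squared(82.9, 82.818, 4, 2)   # no-intercept definitions
```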
.SK2;^^WEIGHTED LEAST SQUARES\\
 ^IT SOMETIMES HAPPENS THAT SOME OF THE OBSERVATIONS FOR THE DEPENDENT 
VARIABLE ARE 'LESS RELIABLE' THAN OTHERS. ^THIS USUALLY MEANS THAT THE 
VARIANCES OF THE OBSERVATIONS ARE NOT ALL EQUAL; IN OTHER WORDS THE MATRIX 
VAR(E) IS NOT OF THE FORM ^ISIGMA_^2, BUT IS DIAGONAL WITH UNEQUAL 
DIAGONAL ELEMENTS. ^THE BASIC IDEA FOR SOLVING THIS PROBLEM IS TO TRANSFORM 
^Y TO OTHER VARIABLES, WHICH DO APPEAR TO SATISFY THE USUAL TENTATIVE 
MODEL ASSUMPTIONS, AND THEN APPLY THE USUAL (UNWEIGHTED) ANALYSIS 
TO THE VARIABLES SO OBTAINED. ^THE ESTIMATES CAN THEN BE RE-EXPRESSED IN 
TERMS OF THE ORIGINAL VARIABLES ^Y.
 ^LET THE ORIGINAL REGRESSION MODEL BE: ^Y#=#^XA#+#E, WITH ^E(E)#=#0 AND 
VAR(E)#=#^VSIGMA_^2, WITH ^V DIAGONAL WITH UNEQUAL DIAGONAL ELEMENTS, 
AND LET ^P#=#INV(^V). ^PREMULTIPLYING THE ORIGINAL REGRESSION MODEL 
WITH ^Q#=#^SQRT(^P) GIVES AS TRANSFORMED REGRESSION MODEL:
.SK;.I29;<QY = ^Q^XA + ^QE	(14)
.SK;WITH ^E(^QE)#=#0 AND VAR(^QE)#=#^ISIGMA_^2. ^THE NORMAL EQUATIONS THEN BECOME:
.SK;.I27;<(QX)'QY = (^Q^X)'^Q^XA	(15)
.SK;GIVING AS SOLUTION IF THE INDICATED INVERSE MATRIX EXISTS:
.SK;.I16;B = INV((<QX)'QX)(QX)'QY = INV(<X'PX)X'PY	(16)
.SK;WITH VARIANCE-COVARIANCE MATRIX:
.SK;.I23;VAR(B) = INV(^X'^P^X)SIGMA_^2	(17)
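.SK;^THE EQUIVALENCE STATED IN (16) CAN BE CHECKED NUMERICALLY: SCALING EACH ROW OF 
^X AND ^Y BY THE SQUARE ROOT OF ITS WEIGHT AND APPLYING THE UNWEIGHTED ANALYSIS 
GIVES THE SAME ESTIMATE AS SOLVING THE WEIGHTED NORMAL EQUATIONS DIRECTLY. ^THE 
^PYTHON SKETCH BELOW ASSUMES A DIAGONAL ^V WITH KNOWN ELEMENTS; THE DATA, THE 
VARIANCES AND ALL NAMES ARE MADE UP.

```python
import math

def solve(a, b):
    """Solve a*x = b by Gauss-Jordan elimination (no singularity check)."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for k in range(n):
        piv = max(range(k, n), key=lambda r: abs(m[r][k]))
        m[k], m[piv] = m[piv], m[k]
        for r in range(n):
            if r != k:
                f = m[r][k] / m[k][k]
                m[r] = [u - f * v for u, v in zip(m[r], m[k])]
    return [m[i][n] / m[i][i] for i in range(n)]

def normal_equations(x, y, w=None):
    """Solve X'PX b = X'PY with P = diag(w); w=None means P = I."""
    n, p = len(x), len(x[0])
    w = w or [1.0] * n
    a = [[sum(wk * r[i] * r[j] for r, wk in zip(x, w)) for j in range(p)]
         for i in range(p)]
    c = [sum(wk * r[i] * yk for r, yk, wk in zip(x, y, w)) for i in range(p)]
    return solve(a, c)

x = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.1, 2.9, 5.2, 6.8]
var = [1.0, 1.0, 4.0, 4.0]            # diagonal of V (made up)
w = [1.0 / vk for vk in var]          # diagonal of P = inv(V)

b_direct = normal_equations(x, y, w)  # inv(X'PX) X'PY

q = [math.sqrt(wk) for wk in w]       # diagonal of Q = sqrt(P)
qx = [[qk * xij for xij in r] for r, qk in zip(x, q)]
qy = [qk * yk for yk, qk in zip(y, q)]
b_transformed = normal_equations(qx, qy)   # unweighted fit of QY on QX
```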
 ^IN PRACTICAL SITUATIONS IT IS OFTEN DIFFICULT TO OBTAIN SPECIFIC INFORMATION 
ON THE FORM OF ^V AT FIRST. ^FOR THIS REASON IT IS SOMETIMES NECESSARY TO MAKE 
THE (KNOWN TO BE ERRONEOUS) ASSUMPTION ^V#=#^I AND THEN ATTEMPT TO DISCOVER 
SOMETHING ABOUT THE FORM OF ^V BY EXAMINING THE RESIDUALS FROM THE REGRESSION 
ANALYSIS.
.SK2;^^RESIDUAL ANALYSIS\\
 ^THE VECTOR OF RESIDUALS ^D IS DEFINED AS THE DIFFERENCE BETWEEN THE VECTOR 
OF OBSERVATIONS ^Y AND THE VECTOR OF FITTED VALUES ^Z, OBTAINED BY USING THE 
REGRESSION EQUATION##^Z#=#^XB. ^SO ^D#=#^Y#-#^Z OR DI#=#YI#-#ZI FOR I#=#1,...,N. 
^IF THE MODEL IS CORRECT, THE RESIDUAL MEAN SQUARE <MSE = S_^2 ESTIMATES SIGMA_^2, AND 
THE ESTIMATED STANDARD DEVIATION OF THE FITTED VALUE ZI AT XI = (XI1,...,XIP)' IS:
.SK;.I21;SD(ZI)#=#S * ^SQRT(XI'INV(^X'^X)XI)	(18)
.SK;WHICH CAN BE USED TO CONSTRUCT A CONFIDENCE INTERVAL FOR THE EXPECTED 
VALUE OF YI:#^E(YI) AT XI = (XI1,...,XIP)', OR TO CONSTRUCT A PREDICTION 
INTERVAL FOR THE MEAN OF H NEW OBSERVATIONS AT THIS POINT. ^IN THE FIRST 
CASE THE CONFIDENCE INTERVAL IS:
.SK;.I11;ZI +- T(N-P,1-ALPHA/2) * S * ^SQRT(XI'INV(^X'^X)XI)	(19)
.SK;AND IN THE SECOND CASE THE PREDICTION INTERVAL IS:
.SK;.I8;ZI +- T(N-P,1-ALPHA/2) * S * ^SQRT(1/H + XI'INV(^X'^X)XI)	(20)
.SK;^HERE THE T DISTRIBUTION HAS N-P DEGREES OF FREEDOM, THE DIVISOR OF S_^2 IN (9).
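.SK;^BOTH INTERVALS HAVE THE SAME FORM AND DIFFER ONLY IN THE TERM 1/H UNDER THE 
SQUARE ROOT. ^THE FOLLOWING ^PYTHON SKETCH ASSUMES THAT THE FITTED VALUE, S, THE 
QUADRATIC FORM XI'INV(<X'X)XI AND THE T VALUE (LOOKED UP IN A TABLE) ARE ALREADY 
AVAILABLE; THE NUMBERS AND ALL NAMES ARE MADE UP FOR THE EXAMPLE.

```python
import math

def interval(z, s, lev, t, h=None):
    """Return (lower, upper) limits around the fitted value z.

    s   : standard deviation of the error term
    lev : the quadratic form xi' inv(X'X) xi at the point of interest
    t   : Student t quantile, supplied by the caller
    h   : None for the confidence interval of the expected value;
          otherwise the number of new observations whose mean is predicted
    """
    extra = 0.0 if h is None else 1.0 / h
    half = t * s * math.sqrt(extra + lev)
    return z - half, z + half

# Made-up values; 4.303 is the 0.975 quantile of t with 2 degrees of freedom.
conf = interval(4.97, 0.2025, 0.3, 4.303)
pred = interval(4.97, 0.2025, 0.3, 4.303, h=1)   # one new observation
```

.SK;^THE PREDICTION INTERVAL IS ALWAYS THE WIDER OF THE TWO AND APPROACHES THE 
CONFIDENCE INTERVAL AS H INCREASES.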
 ^RESEARCHERS OFTEN DIVIDE THE RESIDUALS DI BY S, RESULTING IN THE STANDARDIZED 
RESIDUALS, WHICH CAN BE EXAMINED TO SEE IF THEY MAKE IT APPEAR THAT THE ASSUMPTION 
EI/SIGMA ~ ^N(0,1) IS VIOLATED. ^IT MIGHT BE EXPECTED THAT ROUGHLY 95% OF THE 
DI/S LIE BETWEEN THE LIMITS (-2,2).
 ^HOWEVER, THE VARIANCES OF THE 
RESIDUALS ARE NOT CONSTANT BUT A FUNCTION OF THE ^X MATRIX, NAMELY 
VAR(DI)#=#SIGMA_^2#*#(1#-#XI'INV(^X'^X)XI) (COMPARE (18)), 
WHICH SUGGESTS AS STANDARDIZATION:
.SK;.I19;TI = DI / S / ^SQRT(1 - XI'INV(^X'^X)XI)	(21) 
.SK;GIVING THE STUDENTIZED RESIDUAL.
^THE MAXIMUM STUDENTIZED RESIDUAL CAN BE USED IN A TEST FOR DETECTING 
OUTLIERS, AS FOLLOWS: LET T_^2#=#MAX(TI_^2), 
THEN##MIN(1,#N#*#(1-^FISHER(1,#N-P-1,#T_^2*(N-P-1)/(N-P-T_^2)))) IS AN  
'UPPER BOUND FOR THE RIGHT TAIL PROBABILITY OF THE LARGEST 
ABSOLUTE STUDENTIZED RESIDUAL'.
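.SK;^A MINIMAL ^PYTHON SKETCH OF THE STUDENTIZED RESIDUALS (21) AND OF THE OUTLIER 
BOUND IS GIVEN BELOW. ^IT ASSUMES THAT ^FISHER DENOTES THE CUMULATIVE DISTRIBUTION 
FUNCTION OF THE ^F DISTRIBUTION; BECAUSE THE ^PYTHON STANDARD LIBRARY HAS NO SUCH 
FUNCTION, THE CALLER MUST SUPPLY ONE. ^THE DATA AND ALL NAMES ARE MADE UP.

```python
import math

def inverse(a):
    """Invert a square matrix by Gauss-Jordan elimination (no singularity check)."""
    n = len(a)
    m = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for k in range(n):
        piv = max(range(k, n), key=lambda r: abs(m[r][k]))
        m[k], m[piv] = m[piv], m[k]
        d = m[k][k]
        m[k] = [v / d for v in m[k]]
        for r in range(n):
            if r != k:
                f = m[r][k]
                m[r] = [u - f * v for u, v in zip(m[r], m[k])]
    return [row[n:] for row in m]

def studentized(x, y):
    """Return the studentized residuals ti and the forms xi' inv(X'X) xi."""
    n, p = len(x), len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(p)]
    v = inverse(xtx)
    b = [sum(v[i][j] * xty[j] for j in range(p)) for i in range(p)]
    d = [yi - sum(bj * rj for bj, rj in zip(b, r)) for r, yi in zip(x, y)]
    s = math.sqrt(sum(di * di for di in d) / (n - p))
    lev = [sum(r[i] * v[i][j] * r[j] for i in range(p) for j in range(p))
           for r in x]
    t = [di / s / math.sqrt(1.0 - hi) for di, hi in zip(d, lev)]
    return t, lev

def outlier_statistic(t, n, p):
    """Argument of the F distribution with (1, n-p-1) degrees of freedom."""
    t2 = max(ti * ti for ti in t)
    return t2 * (n - p - 1) / (n - p - t2)

def tail_bound(f, n, p, fisher_cdf):
    """min(1, n * right tail); fisher_cdf(df1, df2, value) comes from the caller."""
    return min(1.0, n * (1.0 - fisher_cdf(1, n - p - 1, f)))

# Made-up data; the last observation lies well above the line through the others.
x = [[1.0, float(k)] for k in range(6)]
y = [1.1, 2.9, 5.2, 6.8, 9.1, 14.0]
t, lev = studentized(x, y)
f = outlier_statistic(t, len(x), 2)
```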
.BR;#