public class Logistic_Regression
extends java.lang.Object
Binary logistic regression is a model commonly used to predict the probability that a dependent variable takes the value Good=1 rather than Bad=0. The general function is the following:
$$\pi(x) = \frac{e^{b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n}}{1 + e^{b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n}}$$
where $\pi(x)$ can take any value between 0 and 1, $x_1, \dots, x_n$ represent the characteristics of the model and $b_0, \dots, b_n$ their coefficients
(Samprit et al. 2000, p. 321).
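As an illustration of the general function above, the following is a minimal sketch in Java. The class and method names (`LogisticFunctionSketch`, `logistic`, `probability`) are hypothetical and are not part of `Logistic_Regression`; the sketch also assumes that `b[0]` holds the constant b0.

```java
// Illustrative sketch only: names are hypothetical, not part of Logistic_Regression.
class LogisticFunctionSketch {

    /** Logistic function e^z / (1 + e^z), written to avoid overflow for large |z|. */
    static double logistic(double z) {
        if (z >= 0) {
            return 1.0 / (1.0 + Math.exp(-z));
        }
        double e = Math.exp(z);
        return e / (1.0 + e);
    }

    /** pi(x) for coefficients b (assumed: b[0] is the constant) and characteristics x. */
    static double probability(double[] b, double[] x) {
        double z = b[0];
        for (int j = 0; j < x.length; j++) {
            z += b[j + 1] * x[j];
        }
        return logistic(z);
    }

    public static void main(String[] args) {
        System.out.println(logistic(0.0)); // prints 0.5
        // b0 = -1, b1 = 2, x1 = 0.5 gives z = 0, so this also prints 0.5
        System.out.println(probability(new double[] {-1.0, 2.0}, new double[] {0.5}));
    }
}
```

Whatever the size of the linear combination, the result stays between 0 and 1, which is what allows it to be read as a probability.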
The equation is non-linear in $b_0, \dots, b_n$; however, it can be linearized by replacing $\pi$ with
$\pi/(1-\pi)$, the probability of an event happening divided by the probability of it not happening. This quantity is called the odds (often loosely termed the odds ratio).
Taking the natural logarithm of both sides transforms the initial equation into:
$$\log\left(\frac{\pi}{1-\pi}\right) = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n$$ (Samprit et al. 2000, p. 321)
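The transformation can be checked numerically. The sketch below is only an illustration with hypothetical names (`LogitSketch`, `odds`, `logit`, `inverseLogit`); it assumes the probability passed in lies strictly between 0 and 1.

```java
// Illustrative sketch only: names are hypothetical, not part of Logistic_Regression.
class LogitSketch {

    /** Odds of an event with probability p (assumes 0 < p < 1). */
    static double odds(double p) {
        return p / (1.0 - p);
    }

    /** Natural log of the odds, i.e. the left-hand side of the transformed equation. */
    static double logit(double p) {
        return Math.log(odds(p));
    }

    /** Maps a value of the linear predictor back to a probability. */
    static double inverseLogit(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    public static void main(String[] args) {
        double p = 0.8;
        System.out.println("odds  = " + odds(p));
        System.out.println("logit = " + logit(p));
        System.out.println("back  = " + inverseLogit(logit(p)));
    }
}
```

Because the log-odds are linear in the coefficients, a one-unit change in a characteristic shifts the log-odds by the corresponding coefficient.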
The coefficients in logistic regression are most commonly estimated by the method of maximum likelihood. According to Thomas R. (1997), “the likelihood function is in general, defined as the joint probability function of the random variables whose realizations constitute the sample”. For the variables $Y_1, \dots, Y_n$, the joint probability function can be written as:
$$g(Y_1, \dots, Y_n) = \prod_{i=1}^{n} f_i(Y_i) = \prod_{i=1}^{n} \pi_i^{Y_i} (1-\pi_i)^{1-Y_i}$$ (Thomas R., 1997, p. 258)
In other words, it expresses the probability of a particular sequence of 0s and 1s. The log of this function is frequently used to assess how much the model has explained.
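To make the likelihood concrete, here is a small sketch that evaluates its logarithm for given targets and fitted probabilities. The names (`LogLikelihoodSketch`, `logLikelihood`) are hypothetical, and the sketch assumes that every target is exactly 1.0 or 0.0 and every probability lies strictly between 0 and 1.

```java
// Illustrative sketch only: names are hypothetical, not part of Logistic_Regression.
class LogLikelihoodSketch {

    /**
     * Sum over i of Y_i*log(pi_i) + (1 - Y_i)*log(1 - pi_i).
     * Assumes y[i] is 1.0 or 0.0 and 0 < p[i] < 1.
     */
    static double logLikelihood(double[] y, double[] p) {
        double ll = 0.0;
        for (int i = 0; i < y.length; i++) {
            // Only one of the two terms is non-zero for a binary target.
            ll += (y[i] == 1.0) ? Math.log(p[i]) : Math.log(1.0 - p[i]);
        }
        return ll;
    }

    public static void main(String[] args) {
        double[] y = {1.0, 0.0, 1.0};
        double[] p = {0.9, 0.2, 0.6};
        double ll = logLikelihood(y, p);
        System.out.println("log-likelihood = " + ll);
        System.out.println("likelihood     = " + Math.exp(ll));
    }
}
```

Values closer to zero indicate that the fitted probabilities agree better with the observed sequence of 0s and 1s.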
This logistic regression implementation uses the Newton-Raphson method to reach the optimum solution. You need to supply the predictor matrix in matrix[rows][columns] format. The target variable must take the value '1' or '0'.
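For readers unfamiliar with Newton-Raphson in this setting, the following is a rough, self-contained sketch of the technique. It is not the source code of `Logistic_Regression`: the class `NewtonRaphsonSketch`, its methods, the all-zero starting point and the stopping rule (largest coefficient change below `tolerance`) are all assumptions made for illustration. A constant is always added in this sketch.

```java
// Illustrative sketch only: not the implementation behind Logistic_Regression.
class NewtonRaphsonSketch {

    /** Returns {b0, b1, ..., bk}; a constant column is always prepended here. */
    static double[] fit(double[][] x, double[] y, double tolerance, int maxIterations) {
        int n = x.length;
        int k = x[0].length + 1;                  // +1 for the constant b0
        double[] beta = new double[k];            // assumed start: all zeros
        for (int iter = 0; iter < maxIterations; iter++) {
            double[] gradient = new double[k];    // X'(y - p)
            double[][] info = new double[k][k];   // X'WX with W = diag(p(1 - p))
            for (int i = 0; i < n; i++) {
                double[] row = new double[k];
                row[0] = 1.0;
                System.arraycopy(x[i], 0, row, 1, k - 1);
                double z = 0.0;
                for (int j = 0; j < k; j++) z += beta[j] * row[j];
                double p = 1.0 / (1.0 + Math.exp(-z));
                double w = p * (1.0 - p);
                for (int j = 0; j < k; j++) {
                    gradient[j] += row[j] * (y[i] - p);
                    for (int l = 0; l < k; l++) info[j][l] += row[j] * w * row[l];
                }
            }
            double[] step = solve(info, gradient); // Newton step: (X'WX)^-1 X'(y - p)
            double maxChange = 0.0;
            for (int j = 0; j < k; j++) {
                beta[j] += step[j];
                maxChange = Math.max(maxChange, Math.abs(step[j]));
            }
            if (maxChange < tolerance) break;      // coefficients have stopped moving
        }
        return beta;
    }

    /** Solves a*s = b by Gaussian elimination with partial pivoting. */
    static double[] solve(double[][] a, double[] b) {
        int k = b.length;
        double[][] m = new double[k][k + 1];
        for (int i = 0; i < k; i++) {
            System.arraycopy(a[i], 0, m[i], 0, k);
            m[i][k] = b[i];
        }
        for (int col = 0; col < k; col++) {
            int pivot = col;
            for (int r = col + 1; r < k; r++) {
                if (Math.abs(m[r][col]) > Math.abs(m[pivot][col])) pivot = r;
            }
            double[] tmp = m[col]; m[col] = m[pivot]; m[pivot] = tmp;
            for (int r = col + 1; r < k; r++) {
                double f = m[r][col] / m[col][col];
                for (int c = col; c <= k; c++) m[r][c] -= f * m[col][c];
            }
        }
        double[] s = new double[k];
        for (int i = k - 1; i >= 0; i--) {
            double sum = m[i][k];
            for (int c = i + 1; c < k; c++) sum -= m[i][c] * s[c];
            s[i] = sum / m[i][i];
        }
        return s;
    }

    public static void main(String[] args) {
        double[][] x = {{0}, {0}, {0}, {1}, {1}, {1}};
        double[] y = {0, 0, 1, 1, 1, 0};
        System.out.println(java.util.Arrays.toString(fit(x, y, 1e-10, 50)));
    }
}
```

The sketch does not guard against perfectly separable data or a singular X'WX matrix, both of which would need handling in a production routine.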
| Constructor and Description |
|---|
| Logistic_Regression() |
| Modifier and Type | Method and Description |
|---|---|
| double[] | get_odds() |
| double | getAIC() |
| double[] | getbetas() |
| double | getBIC() |
| double | getMAXIMUMlikelihood() |
| double[] | getprobabilities() |
| double[] | getresiduals() |
| double[] | getWald_P_Values() |
| double[] | getWald() |
| void | regression(double[][] matrix, double[] Target) - A shorthand overload of the logistic regression fit for faster execution. |
| void | regression(double[][] matrix, double[] Target, boolean Constant, double tolerance, int maxim_Iteration) - This is the main Newton-Raphson method for logistic regression. |
| void | regression(double[][] matrix, double[] Target, double tolerance, int iterations) - A shorthand overload of the logistic regression fit for faster execution, in which an intercept is included. |
| double[] | score(double[][] X, double[] beta, boolean Constant) - Scores a new set of variables when all of the data is provided. |
public void regression(double[][] matrix,
double[] Target,
boolean Constant,
double tolerance,
int maxim_Iteration)
This is the main Newton-Raphson method for logistic regression.
Parameters:
matrix - the matrix of double covariate values in [row][column] format. This is also called the predictors' set.
Target - the binary target double array, which takes the value 1 (e.g. event occurred) or 0.
Constant - a boolean indicating whether a constant (b0) should be added.
tolerance - the precision, that is, the minimum change in the coefficients below which the algorithm stops.
maxim_Iteration - the maximum number of rounds the algorithm may run if the tolerance criterion is not met earlier.

public void regression(double[][] matrix,
double[] Target)
Parameters:
matrix - the matrix of double covariate values in [row][column] format. This is also called the predictors' set.
Target - the binary target double array, which takes the value 1 (e.g. event occurred) or 0.

public void regression(double[][] matrix,
double[] Target,
double tolerance,
int iterations)
Parameters:
matrix - the matrix of double covariate values in [row][column] format. This is also called the predictors' set.
Target - the binary target double array, which takes the value 1 (e.g. event occurred) or 0.
tolerance - the precision, that is, the minimum change in the coefficients below which the algorithm stops.
iterations - the maximum number of rounds the algorithm may run if the tolerance criterion is not met earlier.

public double[] score(double[][] X,
double[] beta,
boolean Constant)
Scores a new set of variables when all of the data is provided.
Parameters:
X - the set of predictors (independent variables) in [rows][columns] format.
beta - the coefficients (betas) to be applied.
Constant - an indication of whether there was a constant (in location 0, i.e. beta[0]) or not.

public double[] getprobabilities()
public double[] getresiduals()
public double[] getbetas()
public double[] getWald()
public double[] getWald_P_Values()
public double[] get_odds()
public double getMAXIMUMlikelihood()
public double getAIC()
public double getBIC()
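To show what scoring with the `score(X, beta, Constant)` signature amounts to, here is a hedged sketch. It mirrors the documented parameters but is not the library's code: the class `ScoreSketch` is hypothetical, and the sketch assumes that when `constant` is true the constant sits in `beta[0]` and the remaining entries line up with the columns of `x`.

```java
// Illustrative sketch only: mirrors the documented signature, not the library's code.
class ScoreSketch {

    /** Returns one predicted probability per row of x. */
    static double[] score(double[][] x, double[] beta, boolean constant) {
        int offset = constant ? 1 : 0;           // assumed: the constant sits in beta[0]
        double[] probabilities = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            double z = constant ? beta[0] : 0.0;
            for (int j = 0; j < x[i].length; j++) {
                z += beta[j + offset] * x[i][j];
            }
            probabilities[i] = 1.0 / (1.0 + Math.exp(-z));
        }
        return probabilities;
    }

    public static void main(String[] args) {
        double[][] newData = {{0.5}, {2.0}};
        double[] beta = {-1.0, 2.0};             // hypothetical coefficients: b0 = -1, b1 = 2
        double[] p = score(newData, beta, true);
        System.out.println(p[0]);                // z = 0, prints 0.5
        System.out.println(p[1]);
    }
}
```

With the real class, the coefficients would presumably come from `getbetas()` after a call to `regression(...)`.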