Pre-fitting procedure
Search for good starting values
Usage
prefit(data, distr, method = c("mle", "mme", "qme", "mge"),
       feasible.par, memp=NULL, order=NULL,
       probs=NULL, qtype=7, gof=NULL, fix.arg=NULL, lower,
       upper, weights=NULL, silent=TRUE, ...)
Arguments
- data
A numeric vector.
- distr
A character string "name" naming a distribution for which the corresponding density function dname, the corresponding distribution function pname and the corresponding quantile function qname must be defined, or directly the density function.
- method
A character string coding for the fitting method: "mle" for 'maximum likelihood estimation', "mme" for 'moment matching estimation', "qme" for 'quantile matching estimation' and "mge" for 'maximum goodness-of-fit estimation'.
- feasible.par
A named list giving the initial values of the parameters of the named distribution, or a function of the data that computes initial values and returns a named list. This argument may be omitted (default) for some distributions for which reasonable starting values are computed (see the 'details' section of mledist). It may not be taken into account for closed-form formulas.
- order
A numeric vector for the moment order(s). The length of this vector must be equal to the number of parameters to estimate.
- memp
A function implementing empirical moments, raw or centered, which must be consistent with the distr argument (and the weights argument).
- probs
A numeric vector of the probabilities for which the quantile matching is done. The length of this vector must be equal to the number of parameters to estimate.
- qtype
The quantile type used by the R quantile function to compute the empirical quantiles (default 7 corresponds to the default quantile method in R).
- gof
A character string coding for the name of the goodness-of-fit distance used: "CvM" for Cramer-von Mises distance, "KS" for Kolmogorov-Smirnov distance, "AD" for Anderson-Darling distance, "ADR", "ADL", "AD2R", "AD2L" and "AD2" for variants of Anderson-Darling distance described by Luceno (2006).
- fix.arg
An optional named list giving the values of fixed parameters of the named distribution, or a function of the data that computes (fixed) parameter values and returns a named list. Parameters with a fixed value are thus NOT estimated by this maximum likelihood procedure. The use of this argument is not possible if method="mme" and a closed-form formula is used.
- weights
An optional vector of weights to be used in the fitting process. Should be NULL or a numeric vector. If non-NULL, weighted MLE is used, otherwise ordinary MLE.
- silent
A logical to remove or show warnings.
- lower
Lower bounds on the parameters.
- upper
Upper bounds on the parameters.
- ...
Further arguments to be passed to generic functions, or to one of the functions "mledist", "mmedist", "qmedist" or "mgedist" depending on the chosen method. See mledist, mmedist, qmedist, mgedist for details on parameter estimation.
Details
Searching for good starting values is achieved by transforming the parameters of the probability distribution from their constrained interval to the real line. Specifically,
- positive parameters in \((0,\infty)\) are transformed using the logarithm (typically the scale parameter sd of a normal distribution, see Normal),
- parameters in \((1,\infty)\) are transformed using the function \(\log(x-1)\),
- probability parameters in \((0,1)\) are transformed using the logit function \(\log(x/(1-x))\) (typically the parameter prob of a geometric distribution, see Geometric),
- negative probability parameters in \((-1,0)\) are transformed using the function \(\log(-x/(1+x))\),
- real parameters are not transformed at all (typically the mean of a normal distribution, see Normal).
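The transformations listed above, together with their inverses, can be sketched in base R as follows. This is only an illustration: the function names (to_real_pos, from_real_pos, and so on) are hypothetical and are not part of fitdistrplus, whose internal helpers may be written differently.

```r
# Hypothetical helper names, for illustration only (not the package's internals).
# Each pair maps a constrained interval to the real line and back.

to_real_pos    <- function(x) log(x)             # (0, Inf) -> real line
from_real_pos  <- function(y) exp(y)

to_real_gt1    <- function(x) log(x - 1)         # (1, Inf) -> real line
from_real_gt1  <- function(y) 1 + exp(y)

to_real_prob   <- function(x) log(x / (1 - x))   # (0, 1)   -> real line (logit)
from_real_prob <- function(y) 1 / (1 + exp(-y))

to_real_neg    <- function(x) log(-x / (1 + x))  # (-1, 0)  -> real line
from_real_neg  <- function(y) -1 / (1 + exp(-y))

# Round trip: transforming and back-transforming should recover the input
# up to floating-point error.
from_real_pos(to_real_pos(2.5))
from_real_gt1(to_real_gt1(1.7))
from_real_prob(to_real_prob(0.3))
from_real_neg(to_real_neg(-0.4))
```

The logit and its inverse are also available in base R as qlogis and plogis.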
Once the parameters are transformed, the optimization is carried out by a quasi-Newton algorithm (typically BFGS), and the resulting values are then transformed back to the original parameter scale.
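The general idea of optimizing on the transformed scale and transforming back can be sketched with base R's optim. This is a minimal illustration for a normal distribution fitted by maximum likelihood, not the code used inside prefit; the objective function nll, the simulated data, and the rough starting values are all assumptions made for the example.

```r
set.seed(1)                          # arbitrary seed, for reproducibility only
x <- rnorm(200, mean = 3, sd = 2)    # simulated data (assumption for the sketch)

# Negative log-likelihood written in terms of unconstrained parameters:
# theta[1] is the mean (real, not transformed), theta[2] is log(sd).
nll <- function(theta, data) {
  -sum(dnorm(data, mean = theta[1], sd = exp(theta[2]), log = TRUE))
}

# Rough initial guess on the original scale, then moved to the real line.
start  <- list(mean = 0, sd = 1)
theta0 <- c(start$mean, log(start$sd))

# Unconstrained quasi-Newton optimization (BFGS) on the transformed scale.
opt <- optim(theta0, nll, data = x, method = "BFGS")

# Transform back to the original parameter scale.
est <- list(mean = opt$par[1], sd = exp(opt$par[2]))
est
```

Because sd is recovered through exp(), the back-transformed estimate is positive by construction, which is why no box constraints are needed during the optimization.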
References
Delignette-Muller ML and Dutang C (2015), fitdistrplus: An R Package for Fitting Distributions. Journal of Statistical Software, 64(4), 1-34, doi:10.18637/jss.v064.i04.
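Examples
A hedged usage sketch, assuming the fitdistrplus package is installed. The simulated gamma sample and the rough feasible.par values are arbitrary choices for illustration. Since lower and upper have no default in the usage above, they are supplied explicitly, here as infinite bounds.

```r
# Runs only if fitdistrplus is available.
if (requireNamespace("fitdistrplus", quietly = TRUE)) {
  set.seed(123)                               # arbitrary seed
  x <- rgamma(1000, shape = 5/2, rate = 7/2)  # simulated data (assumption)

  # Rough initial guess for the gamma parameters, refined by prefit
  # using maximum likelihood.
  start <- fitdistrplus::prefit(x, "gamma", method = "mle",
                                feasible.par = list(shape = 3, scale = 3),
                                lower = -Inf, upper = Inf)
  str(start)
}
```

The result could then be passed as starting values to a fitting function such as mledist.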