Pre-fitting procedure
Search for good starting values.
Usage
prefit(data, distr, method = c("mle", "mme", "qme", "mge"),
feasible.par, memp=NULL, order=NULL,
probs=NULL, qtype=7, gof=NULL, fix.arg=NULL, lower,
upper, weights=NULL, silent=TRUE, ...)
Arguments
- data
A numeric vector.
- distr
A character string "name" naming a distribution for which the corresponding density function dname, the corresponding distribution function pname and the corresponding quantile function qname must be defined, or directly the density function.
- method
A character string coding for the fitting method: "mle" for 'maximum likelihood estimation', "mme" for 'moment matching estimation', "qme" for 'quantile matching estimation' and "mge" for 'maximum goodness-of-fit estimation'.
- feasible.par
A named list giving the initial values of parameters of the named distribution or a function of data computing initial values and returning a named list. This argument may be omitted (default) for some distributions for which reasonable starting values are computed (see the 'details' section of mledist). It may not be taken into account for closed-form formulas.
- order
A numeric vector for the moment order(s). The length of this vector must be equal to the number of parameters to estimate.
- memp
A function implementing empirical moments, raw or centered, which must be consistent with the distr argument (and the weights argument).
- probs
A numeric vector of the probabilities for which the quantile matching is done. The length of this vector must be equal to the number of parameters to estimate.
- qtype
The quantile type used by the R quantile function to compute the empirical quantiles (default 7 corresponds to the default quantile method in R).
- gof
A character string coding for the name of the goodness-of-fit distance used: "CvM" for Cramér-von Mises distance, "KS" for Kolmogorov-Smirnov distance, "AD" for Anderson-Darling distance, "ADR", "ADL", "AD2R", "AD2L" and "AD2" for variants of Anderson-Darling distance described by Luceno (2006).
- fix.arg
An optional named list giving the values of fixed parameters of the named distribution or a function of data computing (fixed) parameter values and returning a named list. Parameters with a fixed value are thus NOT estimated by the fitting procedure. The use of this argument is not possible if method="mme" and a closed-form formula is used.
- weights
An optional vector of weights to be used in the fitting process. Should be NULL or a numeric vector. If non-NULL, weighted MLE is used, otherwise ordinary MLE.
- silent
A logical to remove or show warnings.
- lower
Lower bounds on the parameters.
- upper
Upper bounds on the parameters.
- ...
Further arguments to be passed to generic functions, or to one of the functions "mledist", "mmedist", "qmedist" or "mgedist" depending on the chosen method. See mledist, mmedist, qmedist, mgedist for details on parameter estimation.
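As a rough illustration of how the arguments above fit together, the following sketch calls prefit on simulated gamma data. The simulated sample, the seed and the feasible.par values are assumptions made for this example only; the call itself follows the signature shown in the Usage section, with lower and upper supplied explicitly because they have no default. The block is guarded so that it only runs when the fitdistrplus package is installed.

```r
# Illustrative call only: data, seed and starting values are assumptions.
if (requireNamespace("fitdistrplus", quietly = TRUE)) {
  set.seed(1)
  x <- rgamma(1000, shape = 5/2, rate = 7/2)

  # Maximum likelihood pre-fit of a gamma distribution, starting from
  # a feasible (not necessarily good) parameter value.
  fit <- fitdistrplus::prefit(x, "gamma", "mle",
                              feasible.par = list(shape = 3, scale = 3),
                              lower = -Inf, upper = Inf)
  print(fit)
}
```

The result can then be passed as starting values to a full fitting routine such as mledist.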
Details
Searching for good starting values is achieved by transforming the parameters of the probability distribution from their constraint interval to the real line:
- positive parameters in \((0, \infty)\) are transformed using the logarithm (typically the scale parameter sd of a normal distribution, see Normal),
- parameters in \((1, \infty)\) are transformed using the function \(\log(x-1)\),
- probability parameters in \((0,1)\) are transformed using the logit function \(\log(x/(1-x))\) (typically the parameter prob of a geometric distribution, see Geometric),
- negative probability parameters in \((-1,0)\) are transformed using the function \(\log(-x/(1+x))\),
- real parameters are not transformed at all (typically the mean of a normal distribution, see Normal).
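The transformations listed above, together with their inverses, can be sketched in base R as follows. This is a minimal illustration and not the package's internal code: all helper names are hypothetical, and the inverse functions are derived here by solving each transformation for x.

```r
# Each helper pair maps a constrained parameter to the real line and back.
# All names below are hypothetical and chosen for this illustration only.
to_real_pos   <- function(x) log(x)                # (0, Inf) -> real line
from_real_pos <- function(y) exp(y)

to_real_gt1   <- function(x) log(x - 1)            # (1, Inf) -> real line
from_real_gt1 <- function(y) 1 + exp(y)

to_real_prob   <- function(x) log(x / (1 - x))     # (0, 1) -> real line (logit)
from_real_prob <- function(y) 1 / (1 + exp(-y))

to_real_negprob   <- function(x) log(-x / (1 + x)) # (-1, 0) -> real line
from_real_negprob <- function(y) -1 / (1 + exp(-y))

# Round trips recover the original values (up to floating-point error).
roundtrip <- c(
  pos     = from_real_pos(to_real_pos(2.5)),
  gt1     = from_real_gt1(to_real_gt1(3)),
  prob    = from_real_prob(to_real_prob(0.2)),
  negprob = from_real_negprob(to_real_negprob(-0.7))
)
print(roundtrip)
```

Because every transformed value lives on the whole real line, an unconstrained optimizer can move freely without ever proposing an invalid parameter.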
Once the parameters are transformed, an optimization is carried out by a quasi-Newton algorithm (typically BFGS), and the optimized values are then transformed back to the original parameter scale.
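The transform, optimize and back-transform sequence can be sketched with base R alone. The example below is an illustration under stated assumptions, not the package's implementation: it fits a normal distribution by maximum likelihood, keeps the mean untransformed, works with log(sd) so that the optimizer is unconstrained, and uses simulated data and a starting point chosen only for this example.

```r
set.seed(123)
x <- rnorm(200, mean = 5, sd = 2)  # simulated data (assumption)

# Negative log-likelihood in the transformed space:
# theta[1] is the mean (unconstrained), theta[2] is log(sd).
nll <- function(theta, obs) {
  -sum(dnorm(obs, mean = theta[1], sd = exp(theta[2]), log = TRUE))
}

# Illustrative starting point, already on the real line.
start <- c(median(x), log(mad(x)))

# Quasi-Newton optimization on the unconstrained scale.
opt <- optim(start, nll, obs = x, method = "BFGS")

# Back-transform to the original parameter scale.
est <- list(mean = opt$par[1], sd = exp(opt$par[2]))
print(est)
```

For the normal distribution the result can be checked against the closed-form maximum likelihood estimates, namely the sample mean and the square root of the mean squared deviation.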
References
Delignette-Muller ML and Dutang C (2015), fitdistrplus: An R Package for Fitting Distributions. Journal of Statistical Software, 64(4), 1-34, doi:10.18637/jss.v064.i04.