cdef.gof() - the Covariate-Space Directed
Ebrahim-Farrington test. Like def.gof() but the directed
basis lives in covariate space (polynomials and pairwise products,
natural splines, or a "combined" basis that also includes
fitted-probability bends), with the same Farrington Omega-projection
calibration. It detects omitted interactions and local/oscillatory
misfit that fitted-probability grouping can miss; rank-deficient bases
are reduced automatically.
gof.features() - the goodness-of-fit evidence vector
(one-sided z-scores from a panel of tests plus the covariate-space
directed tests), the input to a learned-ensemble GOF test.
deploy.gof() - a deployable learned-ensemble test:
given a pre-trained scorer, it calibrates the p-value by a per-dataset
parametric bootstrap from the fitted model, so it is valid on any data
set without knowing the truth.
run.all.gof() additions and improvements
McCullagh - the McCullagh (1985)
exact-conditional-moments standardization of the Pearson statistic (SAS
GOFLOGIT / Kuss 2002 algorithm). Verified to reproduce the thesis
low-birth-weight result (p = 0.937) to machine precision.
GiViTI - the GiViTI polynomial calibration test
(Nattino, Finazzi & Bertolini), wrapping givitiR run
inside an isolated callr subprocess so a crash in givitiR’s
compiled dependencies returns NA instead of aborting the
session. Verified against the thesis result (internal p = 0.586). Opt-in
and slow; select the development setting with
control = list(GiViTI = list(devel = "internal")).
Adds givitiR and callr to Suggests.
BAGofT now runs on single-predictor models: its
random-forest partitioner needs at least two predictors, so a constant
helper column is added to the data (not the formula) - the workaround
documented in Kuss (2002) / the thesis - instead of failing.
run.all.gof() returns an object of
class gof_battery (still a data.frame) with a
dedicated print method - rows grouped by test family,
p-values formatted (four decimals or scientific notation, "-" when not
available), and a significance flag. All Note messages were
rewritten as clear, human-readable phrases.
include_slow now defaults to TRUE, so the
full battery runs by default; a one-time message notes which slow tests
are included and that include_slow = FALSE gives a quick
fast-tests-only run.
calibration_plot argument: with
calibration_plot = TRUE (and GiViTI among the
tests) the GiViTI calibration belt is computed, stored on the result,
and drawn; a plot() method (plot.gof_battery)
redraws the stored belt.
F-test - the modified Hosmer-Lemeshow F-test (deviance
residuals ANOVA-F-tested across deciles), following
LogisticDx::gof.glm.
GiViTI-external - the GiViTI calibration test under the
external development assumption, so the internal and external forms now
run side by side (matching the thesis, which reported both).
def.gof() - the Directed Ebrahim-Farrington (DEF)
goodness-of-fit test. Projects grouped standardized residuals onto a
smooth calibration-shape basis ("poly2",
"poly3", "stukel") and calibrates the
statistic as a weighted sum of chi-square_1 variables (Satterthwaite by
default; Imhof via the suggested CompQuadForm).
basis = "ensemble" is a shortcut to
def.ensemble.gof().
def.ensemble.gof() - combines the three DEF bases
(optionally the omnibus EF, or extra p-values) into one decision via the
Cauchy combination test (default), with minp and
fisher offered for comparison.
ef.gof(), def.gof(), and
def.ensemble.gof() now accept either a
fitted glm or
(y, predicted_probs) as input. For def.gof,
supplying the design matrix X (with the
y/predicted_probs form) gives the exact
calibration; without it a conservative chi-square reference is used and
a warning is issued.
ef.gof() now defaults to the
chi-square reference (method = "chisq"):
the grouped statistic is referred to a chi-square_{G-2} distribution.
Use method = "normal" to reproduce the previous
(standardized-normal) p-value.
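As an illustration of the chi-square reference described above, the following base-R sketch turns a grouped statistic into an upper-tail p-value with G - 2 degrees of freedom. The values of stat and G are made-up placeholders, and the snippet shows only the reference distribution; it is not the internal code of ef.gof().

```r
# Hypothetical inputs, for illustration only (not package output).
stat <- 11.3   # grouped goodness-of-fit statistic
G    <- 10     # number of groups

# Chi-square reference: upper tail of a chi-square with G - 2 df.
p_chisq <- pchisq(stat, df = G - 2, lower.tail = FALSE)
p_chisq
```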
run.all.gof() - a one-shot runner that returns a
tidy data.frame, one row per test. Pass a fitted
glm for the whole battery, or
(y, predicted_probs) for the prediction-only tests. One
failing test never aborts the run. This build bundles Pearson, Deviance,
Osius-Rojek, Copas-RSS, Hosmer-Lemeshow (deciles and equal-width),
Pigeon-Heyse, EF, the three DEF bases, Stukel, the covariate-space tests
Tsiatis, Xie, and Pulkstenis-Robinson, and the two Cauchy-combination
ensemble rows. Osius-Rojek, Copas-RSS, Pigeon-Heyse, Tsiatis, and
Pulkstenis-Robinson were verified to match their original
implementations to ~1e-15 (Xie’s statistic also matches).
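For readers unfamiliar with the Cauchy combination behind the ensemble rows, here is a minimal base-R sketch of the standard equal-weight form, with Fisher's method alongside for comparison. The function names cauchy_combine and fisher_combine and the example p-values are hypothetical; the package's own ensemble code may weight or preprocess p-values differently.

```r
# Equal-weight Cauchy combination: average the Cauchy-transformed
# p-values, then read the upper tail of a standard Cauchy.
cauchy_combine <- function(p) {
  stopifnot(all(p > 0 & p < 1))
  t_stat <- mean(tan((0.5 - p) * pi))
  pcauchy(t_stat, lower.tail = FALSE)
}

# Fisher's method: -2 * sum(log p) is referred to chi-square with 2k df
# (this reference assumes independent p-values).
fisher_combine <- function(p) {
  stopifnot(all(p > 0 & p <= 1))
  pchisq(-2 * sum(log(p)), df = 2 * length(p), lower.tail = FALSE)
}

p_values <- c(0.04, 0.20, 0.65)  # hypothetical p-values from three tests
cauchy_combine(p_values)
fisher_combine(p_values)
```

With a single p-value, both functions return that p-value unchanged, which is a quick sanity check on the transforms.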
All run.all.gof() tests were verified to reproduce
the implementations used in the original thesis simulation. In
particular Osius-Rojek and Stukel now follow
LogisticDx::gof.glm (Stukel via
statmod::glm.scoretest; statmod added to
Suggests), matching it numerically; Copas-RSS matches
rms’s gof residual; HL matches
ResourceSelection::hoslem.test; and
HL-equalwidth, Pigeon-Heyse, Tsiatis, Xie, and
Pulkstenis-Robinson match their source scripts.
A second EF row, EF-normal, reports the omnibus EF
test with the normal reference used in the thesis simulation (the
EF row uses the chi-square default).
More opt-in slow (include_slow = TRUE) tests: the
GAM-based HL-GAM, PR-GAM, and
Xie-GAM (Xie et al. 2021; need mgcv; HL-GAM
and PR-GAM match the source gam_gof_tests exactly, Xie-GAM
uses a fixed clustering seed), and Stute-Zhu
(cumulative-residual parametric-bootstrap test; sequential, set reps via
control = list("Stute-Zhu" = list(B = ...)); statistic
matches the source exactly).
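The parametric-bootstrap idea used by tests of this kind can be sketched in base R as follows. This is a simplified illustration under stated assumptions: the data are simulated, and cum_resid_stat is a hypothetical stand-in statistic (maximum absolute cumulative residual, ordered by fitted probability), not the package's Stute-Zhu statistic.

```r
set.seed(42)
n   <- 150
x   <- rnorm(n)
dat <- data.frame(x = x, y = rbinom(n, 1, plogis(-0.5 + x)))

# Hypothetical stand-in statistic: scaled maximum absolute cumulative
# residual, with observations ordered by fitted probability.
cum_resid_stat <- function(fit) {
  p <- fitted(fit)
  r <- fit$y - p
  max(abs(cumsum(r[order(p)]))) / sqrt(length(r))
}

fit      <- glm(y ~ x, family = binomial, data = dat)
observed <- cum_resid_stat(fit)

# Parametric bootstrap: simulate responses from the fitted model,
# refit, and recompute the statistic on each replicate.
B <- 99
boot_stats <- replicate(B, {
  sim   <- dat
  sim$y <- rbinom(n, 1, fitted(fit))
  cum_resid_stat(glm(y ~ x, family = binomial, data = sim))
})

# Bootstrap p-value with the usual +1 correction.
p_boot <- (1 + sum(boot_stats >= observed)) / (B + 1)
p_boot
```

The replicates run one after another here, which is why the cost grows linearly with B.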
Lai-Liu-HL (Lai & Liu 2018, standardized-power
procedure for the Hosmer-Lemeshow test). It has no p-value: it resamples
to a target size, fits the model, estimates the HL rejection rate
(“standardized power”), and returns a randomized accept/reject decision.
The standardized power is reported as the statistic and the decision in
the Note (set n0/k via
control). Verified to match the source
lai_liu_test exactly.
Two further opt-in slow tests: eHL (the e-value
Hosmer-Lemeshow test of Henzi et al. 2024; base-R reimplementation, with
attribution, of the marius-cp/eHL code, matching it to ~1e-11; reported
as p = min(1, 1/e)), and BAGofT (the
binary-adaptive GOF test, wrapping the BAGofT package; set
nsim via
control = list(BAGofT = list(nsim = ...))).
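The e-value to p-value conversion quoted above, p = min(1, 1/e), is a one-liner in base R. The helper name e_to_p and the example e-values are hypothetical, chosen only to show the mapping.

```r
# Convert e-values to conservative p-values: p = min(1, 1 / e).
e_to_p <- function(e) pmin(1, 1 / e)

e_to_p(c(0.5, 1, 20, 400))
# 1.0000 1.0000 0.0500 0.0025
```

E-values at or below 1 carry no evidence against the model and map to p = 1.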
An opt-in slow test, le-Cessie (le Cessie-van
Houwelingen 1995, general multivariate smoothed-residual test), runs
when include_slow = TRUE. It is O(n2)-O(n3).
Adapted with attribution from the USGS smwrStats package
(public domain); verified to match it exactly.
The Xie test uses the corrected degrees of freedom
G - k/2 - 1 with k the number of predictors.
(Earlier thesis runs used df = G - 0.5, an artifact of
coef() returning NULL on a
predicted-probability list; the statistic is the same, only the p-value
differs.)
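The effect of the degrees-of-freedom correction can be seen in a short base-R sketch. It assumes, purely for illustration, that k was obtained as length(coef(fit)) - 1, which would explain the G - 0.5 artifact when coef() returns NULL; it also assumes an upper-tail chi-square reference. The values of G, k, and stat are made up.

```r
G    <- 10    # number of groups (hypothetical)
k    <- 3     # number of predictors (hypothetical)
stat <- 12.0  # hypothetical test statistic

# Corrected degrees of freedom: G - k/2 - 1.
df_corrected <- G - k / 2 - 1        # 7.5

# Assumed origin of the artifact: coef() returned NULL, so
# length(NULL) - 1 gives k = -1 and the formula collapses to G - 0.5.
k_artifact  <- length(NULL) - 1      # -1
df_artifact <- G - k_artifact / 2 - 1  # 9.5

# Same statistic, different p-values.
p_corrected <- pchisq(stat, df = df_corrected, lower.tail = FALSE)
p_artifact  <- pchisq(stat, df = df_artifact,  lower.tail = FALSE)
c(corrected = p_corrected, artifact = p_artifact)
```

The larger artifact df yields the larger p-value, so the earlier runs were less likely to reject for the same statistic.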
Added the Information-Matrix test (White 1982 / Orme
1988), the closed-form IM test; verified to match the thesis
IMtest_fast exactly.
include_slow = TRUE tests in a later build:
the GAM-based tests (HL-GAM, PR-GAM, Xie-GAM; need mgcv),
the bootstrap tests (Hosmer bootstrap, Stute-Zhu), the e-value HL
(eHL; needs isotone), and
BAGofT.
This is the first release of the ebrahim.gof package, implementing the Ebrahim-Farrington goodness-of-fit test for logistic regression models.
ef.gof() - Performs the
Ebrahim-Farrington goodness-of-fit test.
Ebrahim Khaled Ebrahim (Alexandria University). Email: ebrahimkhaled@alexu.edu.eg