Title: | An Implementation of Z-Curves |
---|---|
Description: | An implementation of z-curves - a method for estimating expected discovery and replicability rates on the basis of test statistics of published studies. The package provides functions for fitting the new density and EM version (Bartoš & Schimmack, 2020, <doi:10.31234/osf.io/urgtn>), censored observations, as well as the original density z-curve (Brunner & Schimmack, 2020, <doi:10.15626/MP.2018.874>). Furthermore, the package provides summarizing and plotting functions for the fitted z-curve objects. See the aforementioned articles for more information about the z-curves, expected discovery and replicability rates, validation studies, and limitations. |
Authors: | František Bartoš [aut, cre], Ulrich Schimmack [aut] |
Maintainer: | František Bartoš <[email protected]> |
License: | GPL-3 |
Version: | 2.4.2 |
Built: | 2024-12-26 05:44:23 UTC |
Source: | https://github.com/fbartos/zcurve |
All settings are passed to the density fitting algorithm. All unspecified settings are set to the default value. Setting model = "KD2" sets all settings to the default value irrespective of any other setting and fits z-curve as described in Bartoš and Schimmack (2020). In order to fit the z-curve 1.0 density algorithm, set model = "KD1" and see control_density_v1.
version |
Which version of z-curve should be fitted. Defaults to
|
model |
A type of model to be fitted, defaults to |
sig_level |
An alpha level of the test statistics, defaults to
|
a |
A beginning of fitting interval, defaults to
|
b |
An end of fitting interval, defaults to |
mu |
Means of the components, defaults to |
sigma |
A standard deviation of the components, "Don't touch this"
- Ulrich Schimmack, defaults to |
theta_min |
Lower limits for weights, defaults to
|
theta_max |
Upper limits for weights, defaults to
|
max_iter |
A maximum number of iterations for the nlminb
optimization for fitting mixture model, defaults to |
max_eval |
A maximum number of evaluations for the nlminb
optimization for fitting mixture model, defaults to |
criterion |
A criterion to terminate nlminb optimization,
defaults to |
bw |
A bandwidth of the kernel density estimation, defaults to |
aug |
Augment truncated kernel density, defaults to |
aug.bw |
A bandwidth of the augmentation, defaults to |
n.bars |
A resolution of density function, defaults to |
density_dbc |
Use bckden to estimate a truncated kernel density,
defaults to |
compute_FDR |
Whether to compute FDR, leads to noticeable increase in
computation, defaults to |
criterion_FDR |
A criterion for estimating the maximum FDR, defaults
to |
criterion_FDR_dbc |
A criterion for estimating the maximum FDR using
the bckden function, defaults to |
precision_FDR |
A maximum FDR precision, defaults to |
Bartoš F, Schimmack U (2020). “Z-curve. 2.0: Estimating Replication Rates and Discovery Rates.” doi:10.31219/osf.io/wr93f, submitted for publication.
zcurve(), control_density_v1, control_EM
# to decrease the criterion and increase the number of iterations
ctrl <- list(
  max_iter  = 300,
  criterion = 1e-4
)
## Not run:
zcurve(OSC.z, method = "density", control = ctrl)
All settings are passed to the density fitting algorithm. All unspecified settings are set to the default value. Setting model = "KD1" sets all settings to the default value irrespective of any other setting and fits z-curve as described in Brunner and Schimmack (2020).
version |
Set to |
model |
A type of model to be fitted, defaults to |
sig_level |
An alpha level of the test statistics, defaults to
|
a |
A beginning of fitting interval, defaults to
|
b |
An end of fitting interval, defaults to |
K |
Number of mixture components, defaults to |
max_iter |
A maximum number of iterations for the nlminb
optimization for fitting mixture model, defaults to |
max_eval |
A maximum number of evaluations for the nlminb
optimization for fitting mixture model, defaults to |
criterion |
A criterion to terminate nlminb optimization,
defaults to |
bw |
A bandwidth of the kernel density estimation, defaults to |
Brunner J, Schimmack U (2020). “Estimating population mean power under conditions of heterogeneity and selection for significance.” Meta-Psychology, 4. doi:10.15626/MP.2018.874.
zcurve(), control_density, control_EM
# to increase the number of iterations
ctrl <- list(
  version  = 1,
  max_iter = 300
)
## Not run:
zcurve(OSC.z, method = "density", control = ctrl)
All these settings are passed to the Expectation Maximization fitting algorithm. All unspecified settings are set to the default value. Setting model = "EM" sets all settings to the default value irrespective of any other setting and fits z-curve as described in Bartoš and Schimmack (2020).
model |
A type of model to be fitted, defaults to |
sig_level |
An alpha level of the test statistics, defaults to
|
a |
A beginning of fitting interval, defaults to
|
b |
An end of fitting interval, defaults to |
mu |
Means of the components, defaults to
|
sigma |
A standard deviation of the components, defaults to
|
theta_alpha |
A vector of alpha parameters of a Dirichlet distribution
for generating random starting values for the weights, defaults to
|
theta_max |
Upper limits for weights, defaults to
|
criterion |
A criterion to terminate the EM algorithm,
defaults to |
criterion_start |
A criterion to terminate the starting phase
of the EM algorithm, defaults to |
criterion_boot |
A criterion to terminate the bootstrapping phase
of the EM algorithm, defaults to |
max_iter |
A maximum number of iterations of the EM algorithm
(not including the starting iterations), defaults to |
max_iter_start |
A maximum number of iterations for the
starting phase of EM algorithm, defaults to |
max_iter_boot |
A maximum number of iterations for the
bootstrapping phase of the EM algorithm, defaults to |
fit_reps |
A number of starting fits to get the initial
position for the EM algorithm, defaults to |
Bartoš F, Schimmack U (2020). “Z-curve. 2.0: Estimating Replication Rates and Discovery Rates.” doi:10.31219/osf.io/wr93f, submitted for publication.
# to increase the number of starting fits
# and change the means of the mixture components
ctrl <- list(
  fit_reps = 50,
  mu = c(0, 1.5, 3, 4.5, 6)
)
## Not run:
zcurve(OSC.z, method = "EM", control = ctrl)
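As a rough illustration of what the mu, sigma, and weight settings parametrize, the sketch below writes out the density of a finite mixture of folded normal components truncated to the fitting interval [a, b]. The helper name zmix_density and all numeric values are assumptions made for this example; they are not the package defaults and this is not the package's internal code.

```r
# Hypothetical helper (not part of the zcurve package): density of a mixture
# of folded normal components, each truncated to the interval [a, b].
zmix_density <- function(z, mu, weights, sigma = 1, a = qnorm(0.975), b = 6) {
  stopifnot(length(mu) == length(weights), abs(sum(weights) - 1) < 1e-8)
  dens <- numeric(length(z))
  for (k in seq_along(mu)) {
    # density of |Z| for component k (folded normal)
    folded <- dnorm(z, mu[k], sigma) + dnorm(-z, mu[k], sigma)
    # probability that |Z| falls inside [a, b] for component k
    mass <- (pnorm(b, mu[k], sigma) - pnorm(a, mu[k], sigma)) +
      (pnorm(-a, mu[k], sigma) - pnorm(-b, mu[k], sigma))
    dens <- dens + weights[k] * folded / mass
  }
  dens[z < a | z > b] <- 0  # no density outside the fitting interval
  dens
}

# Illustrative values only: seven equally weighted components with means 0..6
mu_example <- 0:6
w_example  <- rep(1 / 7, 7)
zmix_density(c(2, 3, 4), mu = mu_example, weights = w_example)
```

Because each component is normalized on [a, b] separately, the mixture integrates to one over the fitting interval whenever the weights sum to one.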
Prints first few rows of a z-curve data object
## S3 method for class 'zcurve_data'
head(x, ...)
x |
z-curve data object |
... |
Additional arguments |
Reports whether x is a zcurve object
is.zcurve(x)
x |
an object to test |
The dataset contains z-scores from a subset of the original studies featured in the psychology Reproducibility Project (Open Science Collaboration 2015). Only z-scores from studies with unambiguous original outcomes are supplied (eliminating 7 studies with marginally significant results). The real replication rate for those studies is 35/90 (the whole project reports 36/97).
OSC.z
A vector with 90 observations
Open Science Collaboration (2015). “Estimating the reproducibility of psychological science.” Science, 349(6251). doi:10.1126/science.aac4716.
Plot fitted z-curve object
## S3 method for class 'zcurve'
plot(
  x,
  annotation = FALSE,
  CI = FALSE,
  extrapolate = FALSE,
  plot_type = "base",
  y.anno = c(0.95, 0.88, 0.78, 0.71, 0.61, 0.53, 0.43, 0.35),
  x.anno = 0.6,
  cex.anno = 1,
  ...
)
x |
Fitted z-curve object |
annotation |
Add annotation to the plot. Defaults
to |
CI |
Plot confidence intervals for the estimated z-curve. Defaults
to |
extrapolate |
Scale the chart to the extrapolated area. Defaults
to |
plot_type |
Type of plot to be produced. Defaults to |
y.anno |
A vector of length 8 specifying the y-positions
of the individual annotation lines relative to the figure's height.
Defaults to |
x.anno |
A number specifying the x-position of the block of annotations relative to the figure's width. |
cex.anno |
A number specifying the size of the annotation text. |
... |
Additional arguments including |
## Not run:
# simulate some z-statistics and fit a z-curve
z <- abs(rnorm(300, 3))
m.EM <- zcurve(z, method = "EM", bootstrap = 100)

# plot the z-curve
plot(m.EM)

# add annotation text and model fit CI
plot(m.EM, annotation = TRUE, CI = TRUE)

# change the location of the annotation to the left
plot(m.EM, annotation = TRUE, CI = TRUE, x.anno = 0)
## End(Not run)
A function for computing z-scores of two-sided tests corresponding to the power given in power for a significance level alpha (or the corresponding cut-off z-statistic a).
power_to_z(
  power,
  alpha = 0.05,
  a = stats::qnorm(alpha/2, lower.tail = FALSE),
  two.sided = TRUE,
  nleqslv_control = list(xtol = 1e-15, maxit = 300, stepmax = 0.5)
)
power |
A vector of powers |
alpha |
Level of significance alpha |
a |
Or, alternatively a z-score corresponding to |
two.sided |
Whether directionality of the effect size should be taken into account. |
nleqslv_control |
A named list of control parameters passed to the nleqslv function used for solving the inverse of z_to_power function. |
# z-scores corresponding to the (approximate) power of components of EM2
power_to_z(c(0.05, 0.20, 0.40, 0.60, 0.80, 0.974, 0.999), alpha = .05)
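To make the idea of inverting the power function concrete, here is a minimal base R sketch. It swaps in stats::uniroot for the nleqslv solver mentioned above, and it assumes the conventional two-sided power formula with unit standard deviation. The names power_two_sided and z_from_power are hypothetical and are not package functions.

```r
# Assumed two-sided power formula: P(Z > a) + P(Z < -a) with Z ~ N(z, 1)
power_two_sided <- function(z, a) {
  pnorm(a, mean = z, lower.tail = FALSE) + pnorm(-a, mean = z)
}

# Hypothetical inverse: find the mean z-score that yields a given power
z_from_power <- function(power, alpha = 0.05) {
  a <- qnorm(alpha / 2, lower.tail = FALSE)
  vapply(power, function(p) {
    stopifnot(p < 1)
    # power cannot fall below alpha; treat power == alpha as z = 0
    if (p - power_two_sided(0, a) < 1e-12) return(0)
    uniroot(function(z) power_two_sided(z, a) - p,
            lower = 0, upper = 10, tol = 1e-10)$root
  }, numeric(1))
}

z_from_power(c(0.05, 0.20, 0.40, 0.60, 0.80))
```

A root finder on a bracketed interval is enough here because the assumed power function increases monotonically in z for z >= 0. The upper bracket of 10 is an arbitrary choice that fails for powers extremely close to 1.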
Prints estimates from z-curve object
## S3 method for class 'zcurve'
print.estimates(x, ...)
x |
Estimate of a z-curve object |
... |
Additional arguments |
Prints summary object for z-curve method
## S3 method for class 'zcurve'
print.summary(x, ...)
x |
Summary of a z-curve object |
... |
Additional arguments |
Prints a fitted z-curve object
## S3 method for class 'zcurve'
print(x, ...)
x |
Fitted z-curve object |
... |
Additional arguments |
Prints a z-curve data object
## S3 method for class 'zcurve_data'
print(x, ...)
x |
z-curve data object |
... |
Additional arguments |
Summarize fitted z-curve object
## S3 method for class 'zcurve'
summary(
  object,
  type = "results",
  all = FALSE,
  ERR.adj = 0.03,
  EDR.adj = 0.05,
  round.coef = 3,
  ...
)
object |
A fitted z-curve object. |
type |
Whether the results |
all |
Whether additional results, such as the file drawer
ratio, expected and missing number of studies, and Soric FDR,
should be returned. Defaults to |
ERR.adj |
Confidence intervals adjustment for ERR. Defaults
to |
EDR.adj |
Confidence intervals adjustment for EDR. Defaults
to |
round.coef |
To how many decimals should the coefficient
be rounded. Defaults to |
... |
Additional arguments |
Summary of a z-curve object
A function for computing power of two-sided tests corresponding to z-scores for a given significance level alpha (or corresponding cut-off z-score a).
z_to_power(
  z,
  alpha = 0.05,
  a = stats::qnorm(alpha/2, lower.tail = FALSE),
  two.sided = TRUE
)
z |
A vector of z-scores |
alpha |
Level of significance alpha |
a |
Or, alternatively a z-score corresponding to |
two.sided |
Whether directionality of the effect size should be taken into account. |
# mean powers corresponding to the mean components of KD2
z_to_power(0:6, alpha = .05)
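For readers who want to see the arithmetic behind such a conversion, the following base R sketch computes power from a mean z-score under the usual normal model with unit standard deviation. The helper name power_from_z is hypothetical, and the handling of two.sided shown here is an assumption about what the argument means, not a copy of the package code.

```r
# Hypothetical helper (not the package function): power of a test whose
# z-statistic is distributed N(z, 1), given significance level alpha.
power_from_z <- function(z, alpha = 0.05, two.sided = TRUE) {
  a <- qnorm(alpha / 2, lower.tail = FALSE)         # cut-off z-score
  upper <- pnorm(a, mean = z, lower.tail = FALSE)   # P(Z > a)
  if (!two.sided) {
    return(upper)                                   # expected direction only
  }
  lower <- pnorm(-a, mean = z)                      # P(Z < -a)
  upper + lower
}

# With a true mean of zero, the two-sided power equals alpha
power_from_z(0)
power_from_z(0:6)
```

Under this formula, counting only the expected direction at z = 0 gives alpha/2, which is one way to read the note that two.sided controls whether directionality is taken into account.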
zcurve is used to fit z-curve models. The function takes z-statistics or two-sided p-values as input and returns an object of class "zcurve" that can be further interrogated with the summary and plot functions. It defaults to the EM model, but different versions of z-curve can be specified using the method and control arguments. See 'Examples' and 'Details' for more information.
zcurve(
  z,
  z.lb,
  z.ub,
  p,
  p.lb,
  p.ub,
  data,
  method = "EM",
  bootstrap = 1000,
  parallel = FALSE,
  control = NULL
)
z |
a vector of z-scores. |
z.lb |
a vector with start of censoring intervals of censored z-scores. |
z.ub |
a vector with end of censoring intervals of censored z-scores. |
p |
a vector of two-sided p-values, internally transformed to z-scores. |
p.lb |
a vector with start of censoring intervals of censored two-sided p-values. |
p.ub |
a vector with end of censoring intervals of censored two-sided p-values. |
data |
an object created with |
method |
the method to be used for fitting. Possible options are
Expectation Maximization |
bootstrap |
the number of bootstraps for estimating CI. To skip
bootstrap specify |
parallel |
whether the bootstrap should be performed in parallel.
Defaults to |
control |
additional options for the fitting algorithm; see control_EM or control_density for details. |
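The arguments above mention that two-sided p-values are transformed to z-scores internally. A minimal base R sketch of the standard transformation is shown below. The helper name p_to_z is hypothetical, and whether the package performs exactly this computation is an assumption.

```r
# Hypothetical helper (not exported by the package):
# two-sided p-value -> absolute z-score
p_to_z <- function(p) qnorm(p / 2, lower.tail = FALSE)

p <- c(0.05, 0.01, 0.001)
z <- p_to_z(p)

# A p-value censored on (p.lb, p.ub) corresponds to a z-score censored on
# (z.lb, z.ub); the bounds swap because smaller p-values mean larger z-scores.
p.lb <- 0.025
p.ub <- 0.035
z.lb <- p_to_z(p.ub)
z.ub <- p_to_z(p.lb)
c(z.lb = z.lb, z.ub = z.ub)
```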
The function fits the EM model by default; setting method = "density" gives the KD2 version of z-curve as outlined in Bartoš and Schimmack (2020). For the original z-curve (Brunner and Schimmack 2020), referred to as KD1, specify 'method = "density", control = list(model = "KD1")'.
The fitted z-curve object
Bartoš F, Schimmack U (2020). “Z-curve. 2.0: Estimating Replication Rates and Discovery Rates.” doi:10.31219/osf.io/wr93f, submitted for publication.
Brunner J, Schimmack U (2020). “Estimating population mean power under conditions of heterogeneity and selection for significance.” Meta-Psychology, 4. doi:10.15626/MP.2018.874.
summary.zcurve(), plot.zcurve(), control_EM, control_density
# load data from OSC 2015 reproducibility project
OSC.z

# fit an EM z-curve (with disabled bootstrap due to example time limits)
m.EM <- zcurve(OSC.z, method = "EM", bootstrap = FALSE)

# a version with 1000 bootstrapped samples would look like:
m.EM <- zcurve(OSC.z, method = "EM", bootstrap = 1000)

# or KD2 z-curve (use larger bootstrap for real inference)
m.D <- zcurve(OSC.z, method = "density", bootstrap = FALSE)

# inspect the results
summary(m.EM)
summary(m.D)
# see '?summary.zcurve' for more output options

# plot the results
plot(m.EM)
plot(m.D)
# see '?plot.zcurve' for more plotting options

# to specify more options, set the control arguments
# e.g., increase the maximum number of iterations and change the alpha level
ctr1 <- list(
  "max_iter"  = 9999,
  "sig_level" = .10
)
## Not run:
m1.EM <- zcurve(OSC.z, method = "EM", bootstrap = FALSE, control = ctr1)
# see '?control_EM' and '?control_density' for more information about different
# z-curves specifications
zcurve_clustered is used to fit z-curve models to clustered data. The function requires a data object created with the zcurve_data() function as the input (where id denotes clusters). Two different methods that account for clustering are implemented via the EM model: "w" for down weighting the likelihood of the test statistics proportionately to the number of repetitions in the clusters, and "b" for a nested bootstrap where only a single study from each bootstrap is selected for model fitting.
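The two strategies can be illustrated on a toy cluster identifier. This is only one possible reading of the description above, written in base R; the variable names and the toy id vector are assumptions, and the package's actual weighting and resampling code may differ.

```r
# Toy cluster identifiers: three clusters with 3, 1, and 2 test statistics
id <- c("s1", "s1", "s1", "s2", "s3", "s3")

# "w"-style idea: weight each test statistic by 1 / (size of its cluster),
# so that every cluster contributes a total weight of one
n_in_cluster <- ave(rep(1, length(id)), id, FUN = sum)
w <- 1 / n_in_cluster

# "b"-style idea: in one resampling step, keep a single randomly chosen
# test statistic per cluster
set.seed(1)
picked <- as.integer(tapply(seq_along(id), id, function(i) {
  i[sample.int(length(i), 1)]
}))
id[picked]
```

Indexing with sample.int() avoids the base R pitfall where sample() on a single number samples from 1 to that number.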
zcurve_clustered(
  data,
  method = "b",
  bootstrap = 1000,
  parallel = FALSE,
  control = NULL
)
data |
an object created with |
method |
the method to be used for fitting. Possible options are
down weighting |
bootstrap |
the number of bootstraps for estimating CI. To skip
bootstrap specify |
parallel |
whether the bootstrap should be performed in parallel.
Defaults to |
control |
additional options for the fitting algorithm; see control_EM for details. |
The fitted z-curve object
zcurve(), summary.zcurve(), plot.zcurve(), control_EM, control_density
zcurve_data is used to prepare data for the zcurve() function. The function transforms strings containing reported test statistics ("z", "t", "f", "chi", "p") into two-sided p-values. Test statistics reported as inequalities are considered to be censored, as are test statistics reported with low accuracy (i.e., rounded to too few decimals). See details for more information.
zcurve_data(data, id = NULL, rounded = TRUE, stat_precise = 2, p_precise = 3)
data |
a vector of strings containing the test statistics. |
id |
a vector identifying observations from the same cluster. |
rounded |
an optional argument specifying whether de-rounding should be applied.
Defaults to |
stat_precise |
an integer specifying the numerical precision of
|
p_precise |
an integer specifying the numerical precision of p-values treated as exact values. |
By default, the function extracts the type of test statistic from the input string:
"F(df1, df2)=x" for an F-statistic with df1 and df2 degrees of freedom,
"chi(df)=x" for a chi-square statistic with df degrees of freedom,
"t(df)=x" for a t-statistic with df degrees of freedom,
"z=x" for a z-statistic,
"p=x" for a p-value.
The input is not case sensitive, and empty spaces are removed automatically. Furthermore, inequalities ("<" and ">") can be used to denote censoring, i.e., that the p-value is lower than "x" or that the test statistic is larger than "x", respectively. The automatic de-rounding procedure (if rounded = TRUE) treats p-values with fewer decimal places than specified in p_precise, or test statistics with fewer decimal places than specified in stat_precise, as censored on the interval that could result in the given rounded value. For example, a "p = 0.03" input would be de-rounded as a p-value lower than 0.035 but larger than 0.025.
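The de-rounding rule for p-values can be sketched in a few lines of base R. The helper deround_p is hypothetical and only mirrors the verbal description above; the package's own parsing and edge-case handling are not reproduced here.

```r
# Hypothetical helper: turn a reported p-value string into the interval of
# values that would round to it, unless it is reported precisely enough.
deround_p <- function(p_string, p_precise = 3) {
  # number of decimal places as reported (text after the decimal point)
  digits <- nchar(sub("^[^.]*\\.?", "", p_string))
  p <- as.numeric(p_string)
  if (digits >= p_precise) {
    return(c(lower = p, upper = p))   # treated as an exact value
  }
  half <- 0.5 * 10^(-digits)          # half of the last reported unit
  c(lower = max(p - half, 0), upper = min(p + half, 1))
}

deround_p("0.03")    # interval of roughly 0.025 to 0.035
deround_p("0.003")   # precise enough with p_precise = 3, kept as exact
```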
An object of type "zcurve_data".
zcurve(), print.zcurve_data(), head.zcurve_data()
# Specify a character vector containing the test statistics
data <- c("z = 2.1", "t(34) = 2.21", "p < 0.03", "F(2,23) > 10", "p = 0.003")

# Obtain the z-curve data object
data <- zcurve_data(data)

# inspect the resulting object
data
A placeholder object and functions for the zcurve package. (adapted from the runjags R package).
zcurve.options(...)
zcurve.get_option(name)
... |
named option(s) to change - for a list of available options, see details below. |
name |
the name of the option to get the current value of - for a list of available options, see details below. |
The current value of all available zcurve options (after applying any changes specified) is returned invisibly as a named list.
The following functions extract estimates from the z-curve object.
ERR(object, round.coef = 3)
EDR(object, round.coef = 3)
ODR(object, round.coef = 3)
Soric(object, round.coef = 3)
file_drawer_ration(object, round.coef = 3)
expected_n(object, round.coef = 0)
missing_n(object, round.coef = 0)
significant_n(object)
included_n(object)
object |
the z-curve object |
round.coef |
rounding for the printed values |
Technically, ODR, significant n, and included n are not z-curve estimates but they are grouped in this category for convenience.
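Two of these quantities are conventionally derived from the expected discovery rate (EDR). The sketch below uses the usual textbook definitions of the file drawer ratio and Soric's maximum false discovery rate. The helper names are hypothetical, and whether the package computes its estimates in exactly this way is an assumption.

```r
# Hypothetical helpers based on conventional definitions (not package code)

# unpublished non-significant results expected per significant result
file_drawer_ratio_from_edr <- function(edr) (1 - edr) / edr

# Soric's upper bound on the false discovery rate at level alpha
soric_fdr_from_edr <- function(edr, alpha = 0.05) {
  (1 / edr - 1) * (alpha / (1 - alpha))
}

edr <- c(0.2, 0.5, 0.8)
file_drawer_ratio_from_edr(edr)
soric_fdr_from_edr(edr)
```

Both quantities shrink as the EDR grows, which matches the intuition that a literature with a high discovery rate leaves less room for unreported results and false positives.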