1 Data requirements

1.1 General requirements

Currently ShapleyVIC applies to binary, ordinal and continuous outcomes.
Code binary outcomes as 0/1, and ordinal outcomes as integers starting from 0.
Variable names must not contain spaces or special characters (e.g., [, ], (, ), ,). Replace them with _.
Variable centering/standardization is not required.
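
The naming and coding rules above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not part of ShapleyVIC itself: the helper names clean_variable_name and encode_ordinal, and the example variable names and severity levels, are hypothetical.

```python
import re


def clean_variable_name(name):
    """Replace each run of characters other than letters, digits and _ with a single _."""
    cleaned = re.sub(r"[^0-9A-Za-z_]+", "_", name)
    # Drop leading/trailing underscores (including any present in the original name).
    return cleaned.strip("_")


def encode_ordinal(values, ordered_levels):
    """Map ordinal labels to integers 0, 1, 2, ... following `ordered_levels`."""
    codes = {level: i for i, level in enumerate(ordered_levels)}
    return [codes[v] for v in values]


# Hypothetical raw variable names containing spaces and special characters.
names = ["age (years)", "lab[1,2]", "heart rate"]
clean_names = [clean_variable_name(n) for n in names]

# Distinct raw names can collapse to the same cleaned name, so check for duplicates.
assert len(set(clean_names)) == len(clean_names)

# Hypothetical ordinal outcome, coded as integers starting from 0.
severity = encode_ordinal(
    ["mild", "severe", "moderate"],
    ordered_levels=["mild", "moderate", "severe"],
)
```

With a data frame, the same cleaning function would be applied to every column name before the data are passed on.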

1.2 Missing values and sparsity

Handle missing entries appropriately before applying ShapleyVIC. Missing entries are not supported.
Check data distribution and handle data sparsity before applying ShapleyVIC. Data sparsity may increase run time and lead to unstable results.
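
One way to run these checks before modelling is sketched below, assuming the data are held as a dictionary mapping each variable name to its list of values. The helper names, the example data and the min_count threshold are illustrative assumptions; the text above does not prescribe a cut-off for sparsity.

```python
import math
from collections import Counter


def is_missing(value):
    """Treat None, blank strings and float NaN as missing."""
    if value is None:
        return True
    if isinstance(value, str) and value.strip() == "":
        return True
    return isinstance(value, float) and math.isnan(value)


def count_missing(columns):
    """Return {variable name: number of missing entries}."""
    return {
        name: sum(is_missing(v) for v in values)
        for name, values in columns.items()
    }


def sparse_categories(values, min_count=10):
    """Return categories seen fewer than `min_count` times (missing entries ignored).

    `min_count` is an arbitrary illustrative threshold, not a ShapleyVIC rule.
    """
    counts = Counter(v for v in values if not is_missing(v))
    return {level: n for level, n in counts.items() if n < min_count}


# Hypothetical toy data: one missing age, one rarely observed ward.
columns = {
    "age": [34, 51, None, 47, 29, 62],
    "ward": ["A", "A", "A", "A", "B", "A"],
}

missing = count_missing(columns)
rare_wards = sparse_categories(columns["ward"], min_count=2)
```

Variables with a non-zero missing count need imputation or row removal, and rare categories are candidates for merging, before ShapleyVIC is applied.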

1.3 Additional pre-processing for high-dimensional data

Although theoretically permissible, it is not advisable to apply ShapleyVIC to data with a large number of variables.
Screen out variables with low importance (e.g., based on p-values from univariable or multivariable analysis) to reduce the dimension (e.g., to <50 variables) before applying ShapleyVIC.
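
The screening step might look like the following sketch, which assumes the p-values have already been computed elsewhere (for example, from univariable regression). The function name, the alpha cut-off of 0.05 and the example p-values are assumptions for illustration; only the target of fewer than 50 variables comes from the text above.

```python
def screen_variables(p_values, alpha=0.05, max_variables=49):
    """Keep variables with p < alpha, smallest p-values first, capped at `max_variables`.

    `alpha` is an illustrative cut-off; `max_variables=49` reflects the
    "<50 variables" suggestion.
    """
    kept = sorted((p, name) for name, p in p_values.items() if p < alpha)
    return [name for _, name in kept[:max_variables]]


# Hypothetical p-values from a univariable analysis.
p_values = {"age": 0.001, "sex": 0.20, "bmi": 0.03, "heart_rate": 0.0004}

selected = screen_variables(p_values)
top_two = screen_variables(p_values, max_variables=2)
```

Sorting by p-value before capping means that, when too many variables pass the cut-off, those with the weakest evidence are dropped first.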

1.4 General suggestions on the size of explanation set

A larger number of variables generally requires a larger explanation set for stable results.
Increasing the size of the explanation set and/or the number of variables increases the time required to compute ShapleyVIC values.
Using >3500 samples in the explanation set leads to long run times and is generally not recommended.
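
As a rough illustration of the size suggestion, the sketch below picks row indices for an explanation set and caps the size at 3500 by default. It assumes the explanation set is a simple random subset of the available rows, which the text above does not specify; the function name and the seed are hypothetical.

```python
import random


def draw_explanation_indices(n_rows, requested_size, max_size=3500, seed=0):
    """Return sorted row indices for an explanation set.

    The size is limited by the number of rows and by `max_size`, which
    defaults to the 3500-sample suggestion and can be raised if longer
    run times are acceptable.
    """
    size = min(requested_size, n_rows, max_size)
    rng = random.Random(seed)  # fixed seed for a reproducible draw
    return sorted(rng.sample(range(n_rows), size))


# Requesting 5000 of 10000 rows is reduced to 3500 indices.
indices = draw_explanation_indices(n_rows=10000, requested_size=5000)

# With only 200 rows available, all 200 are used.
small = draw_explanation_indices(n_rows=200, requested_size=500)
```

The cap is a default argument so that the trade-off between stability and run time stays an explicit choice.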