eli5.xgboost
eli5 has XGBoost support: eli5.explain_weights() shows feature importances, and eli5.explain_prediction() explains predictions by showing feature weights. Both functions work for XGBClassifier and XGBRegressor.
explain_prediction_xgboost(xgb, doc, vec=None, top=None, top_targets=None, target_names=None, targets=None, feature_names=None, feature_re=None, feature_filter=None, vectorized=False, is_regression=None, missing=None)

Return an explanation of an XGBoost prediction (via the scikit-learn wrapper XGBClassifier or XGBRegressor, or via xgboost.Booster) as feature weights.

See eli5.explain_prediction() for description of top, top_targets, target_names, targets, feature_names, feature_re and feature_filter parameters.

Parameters:

- vec (vectorizer, optional) – A vectorizer instance used to transform raw features to the input of the estimator xgb (e.g. a fitted CountVectorizer instance); you can pass it instead of feature_names.
- vectorized (bool, optional) – A flag which tells eli5 if doc should be passed through vec or not. By default it is False, meaning that if vec is not None, vec.transform([doc]) is passed to the estimator. Set it to True if you’re passing vec, but doc is already vectorized.
- is_regression (bool, optional) – Pass if an xgboost.Booster is passed as the first argument. True if solving a regression problem (“objective” starts with “reg”) and False for a classification problem. If not set, regression is assumed for a single-target estimator and probabilities will not be shown.
- missing (optional) – Pass if an xgboost.Booster is passed as the first argument. Set it to the same value as the missing argument to xgboost.DMatrix. Matters only if sparse values are used. Default is np.nan.
The method for determining feature importances follows an idea from http://blog.datadive.net/interpreting-random-forests/. Feature weights are calculated by following decision paths in trees of an ensemble. Each leaf has an output score, and expected scores can also be assigned to parent nodes. The contribution of one feature on the decision path is how much the expected score changes from parent to child. Weights of all features sum to the output score of the estimator.
explain_weights_xgboost(xgb, vec=None, top=20, target_names=None, targets=None, feature_names=None, feature_re=None, feature_filter=None, importance_type='gain')

Return an explanation of an XGBoost estimator (via the scikit-learn wrapper XGBClassifier or XGBRegressor, or via xgboost.Booster) as feature importances.

See eli5.explain_weights() for description of top, feature_names, feature_re and feature_filter parameters. target_names and targets parameters are ignored.

Parameters:

- importance_type (str, optional) – A way to get feature importance. Possible values are:
  - ‘gain’ – the average gain of the feature when it is used in trees (default)
  - ‘weight’ – the number of times a feature is used to split the data across all trees
  - ‘cover’ – the average coverage of the feature when it is used in trees