Opening the Black Box of Predictive Models

Dec 4th, 2017

How to Choose a Model

`Model Selection`: driven by the `problem` and the `data type`

`Cross Validation`

Black magic!
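
Of the options on this slide, cross validation is the easiest to show in code. The snippet below is a minimal sketch that compares two candidate models by 5-fold cross-validated error. The `mtcars` data, the two formulas, and the helper name `cv_rmse` are illustrative assumptions and do not come from the original talk.

```r
# Minimal k-fold cross validation in base R (illustrative sketch).
# cv_rmse() is a hypothetical helper; mtcars and the formulas are stand-ins.
cv_rmse <- function(formula, data, k = 5, seed = 1) {
  set.seed(seed)
  # Randomly assign each row to one of k folds of near-equal size
  folds <- sample(rep(seq_len(k), length.out = nrow(data)))
  response <- all.vars(formula)[1]
  fold_rmse <- vapply(seq_len(k), function(i) {
    train <- data[folds != i, , drop = FALSE]
    test  <- data[folds == i, , drop = FALSE]
    fit   <- lm(formula, data = train)          # fit on k - 1 folds
    pred  <- predict(fit, newdata = test)       # predict the held-out fold
    sqrt(mean((test[[response]] - pred)^2))     # RMSE on the held-out fold
  }, numeric(1))
  mean(fold_rmse)
}

candidates <- list(
  small = mpg ~ wt,
  large = mpg ~ wt + hp + qsec
)
cv_scores <- vapply(candidates, cv_rmse, numeric(1), data = mtcars)
best <- names(which.min(cv_scores))
print(cv_scores)
print(best)
```

Using the same seed for every candidate keeps the fold assignment identical, so the scores differ only because of the models and not because of the split.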

Explanation

Explainable AI

Confirm that its judgments are reasonable (verification of the system)

Improve its algorithm (improvement of the system)

Learn from it (learning from the system)

Example: AlphaGo

Comply with legal requirements (compliance to legislation)

The EU's GDPR gives users a "right to explanation"

LIME

Local Interpretable Model-Agnostic Explanations
"Why Should I Trust You?": Explaining the Predictions of Any Classifier, by Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin (University of Washington, Seattle)

KDD2016 paper 573

LIME

Basic Concepts

For each prediction to explain, permute the observation n times.

Let the complex model predict the outcome of all permuted observations.

Calculate the distance from all permutations to the original observation.

Convert the distance to a similarity score.

Select m features best describing the complex model outcome from the permuted data.

Fit a simple model to the permuted data, explaining the complex model outcome with the m selected features, with each permuted observation weighted by its similarity to the original observation.

Extract the feature weights from the simple model and use these as explanations for the complex model's local behavior.
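
The steps above can be sketched from scratch in base R. This is an illustration under stated assumptions, not the `lime` package's implementation: `complex_model` is a hypothetical stand-in for the black box, `lime_sketch` is an illustrative helper, the permutations are plain Gaussian noise, the kernel is a simple exponential one, and feature selection uses an unpenalized weighted linear fit where the package uses ridge regression.

```r
# From-scratch sketch of the LIME steps for one observation (base R only).
# complex_model() is a hypothetical stand-in for the black-box model.
complex_model <- function(X) {
  # Nonlinear in x1 and x2; x3 is ignored entirely
  plogis(2 * X[, "x1"] - X[, "x2"]^2)
}

lime_sketch <- function(x, predict_fn, n = 2000, m = 2, kernel_width = 0.75) {
  p <- length(x)
  # 1. Permute the observation n times (Gaussian noise around x)
  Z <- matrix(rnorm(n * p), nrow = n, dimnames = list(NULL, names(x)))
  Z <- sweep(Z, 2, x, "+")
  # 2. Let the complex model predict the outcome of all permuted observations
  y <- predict_fn(Z)
  # 3. Euclidean distance from each permutation to the original observation
  d <- sqrt(rowSums(sweep(Z, 2, x, "-")^2))
  # 4. Convert the distance to a similarity score (exponential kernel)
  w <- exp(-d^2 / kernel_width^2)
  # 5. Select the m features with the largest absolute weight
  #    in a similarity-weighted linear fit on all features
  dat <- data.frame(Z, y = y)
  full <- lm(y ~ ., data = dat, weights = w)
  coefs <- coef(full)[-1]
  keep <- names(sort(abs(coefs), decreasing = TRUE))[seq_len(m)]
  # 6. Fit the simple model on the selected features only
  simple <- lm(reformulate(keep, response = "y"), data = dat, weights = w)
  # 7. The feature weights of the simple model are the explanation
  coef(simple)[-1]
}

set.seed(2017)
x <- c(x1 = 0.5, x2 = 1, x3 = -1)
weights <- lime_sketch(x, complex_model)
print(round(weights, 2))
```

Around this observation the stand-in model increases with `x1` and decreases with `x2`, so the sketch should pick those two features and give them opposite signs, while the irrelevant `x3` is dropped.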

LIME

Feature Selection

none: Use all features for the explanation.
forward selection: Features are added one by one based on their improvements to a ridge regression fit of the complex model outcome.
highest weights: The m features with highest absolute weight in a ridge regression fit of the complex model outcome are chosen.

lasso: The m features that are least prone to shrinkage based on the regularization path of a lasso fit of the complex model outcome are chosen.
tree: A tree is fitted with log2(m) splits, to use at most m features. It may select fewer.
auto: Uses forward selection if m <= 6 and otherwise highest weights.
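
To make the "highest weights" strategy concrete, here is a minimal sketch using a closed-form ridge fit in base R. The helper `select_highest_weights`, the penalty `lambda = 1`, and the simulated data are assumptions for illustration. The sketch also ignores the similarity weights that LIME applies to the permuted observations, so it shows the ranking idea only.

```r
# Sketch of the "highest weights" strategy with a closed-form ridge fit.
# select_highest_weights() is a hypothetical helper; lambda is an assumed
# penalty, and the simulated data stand in for permuted observations.
select_highest_weights <- function(X, y, m, lambda = 1) {
  Xs <- scale(X)       # standardize so the weights are comparable
  yc <- y - mean(y)    # center the response, so no intercept is needed
  # Ridge solution: (X'X + lambda * I)^-1 X'y
  beta <- solve(crossprod(Xs) + lambda * diag(ncol(Xs)), crossprod(Xs, yc))
  beta <- setNames(as.vector(beta), colnames(X))
  # Keep the m features with the largest absolute weight
  names(sort(abs(beta), decreasing = TRUE))[seq_len(m)]
}

set.seed(42)
X <- matrix(rnorm(500 * 5), ncol = 5, dimnames = list(NULL, paste0("f", 1:5)))
# Only f2 and f4 drive the simulated outcome
y <- 3 * X[, "f2"] - 2 * X[, "f4"] + rnorm(500, sd = 0.1)
selected <- select_highest_weights(X, y, m = 2)
print(selected)
```

Forward selection would instead add one feature at a time and refit, which costs more fits but can cope better with correlated features.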

Demo

library(lime)
library(MASS)
data(biopsy)

# First we'll clean up the data a bit
biopsy$ID <- NULL
biopsy <- na.omit(biopsy)
names(biopsy) <- c('clump thickness', 'uniformity of cell size', 
                   'uniformity of cell shape', 'marginal adhesion',
                   'single epithelial cell size', 'bare nuclei', 
                   'bland chromatin', 'normal nucleoli', 'mitoses',
                   'class')

# Now we'll fit a linear discriminant model on all but 100 held-out cases
set.seed(4)
test_set <- sample(seq_len(nrow(biopsy)), 100)
prediction <- biopsy$class
biopsy$class <- NULL
model <- lda(biopsy[-test_set, ], prediction[-test_set])

# Accuracy on the 100 held-out cases
mean(predict(model, biopsy[test_set, ])$class == prediction[test_set])

## [1] 0.96

# Train the explainer
explainer <- lime(biopsy[-test_set,], model, bin_continuous = TRUE, quantile_bins = FALSE)
# Use the explainer on new observations
explanation <- explain(biopsy[test_set[1:4], ], explainer, n_labels = 1, n_features = 4)
tibble::glimpse(explanation)

## Observations: 16
## Variables: 13
## $ model_type       <chr> "classification", "classification", "classifi...
## $ case             <chr> "416", "416", "416", "416", "7", "7", "7", "7...
## $ label            <chr> "benign", "benign", "benign", "benign", "beni...
## $ label_prob       <dbl> 0.9964864, 0.9964864, 0.9964864, 0.9964864, 0...
## $ model_r2         <dbl> 0.5659044, 0.5659044, 0.5659044, 0.5659044, 0...
## $ model_intercept  <dbl> 0.08837631, 0.08837631, 0.08837631, 0.0883763...
## $ model_prediction <dbl> 1.0244738, 1.0244738, 1.0244738, 1.0244738, 0...
## $ feature          <chr> "normal nucleoli", "bare nuclei", "uniformity...
## $ feature_value    <int> 5, 3, 3, 3, 1, 10, 1, 1, 5, 10, 10, 3, 1, 1, ...
## $ feature_weight   <dbl> -0.018041571, 0.573050022, 0.202345467, 0.178...
## $ feature_desc     <chr> "3.25 < normal nucleoli <= 5.50", "bare nucle...
## $ data             <list> [[3, 3, 2, 6, 3, 3, 3, 5, 1], [3, 3, 2, 6, 3...
## $ prediction       <list> [[0.9964864, 0.003513577], [0.9964864, 0.003...

explanation <- explain(biopsy[test_set[1:4], ], explainer, n_labels = 1, 
                       n_features = 4, kernel_width = 0.5, feature_select = "auto")
explanation[, 2:9]

##    case     label label_prob  model_r2 model_intercept model_prediction
## 1   416    benign  0.9964864 0.4804993       0.4323631        1.0029394
## 2   416    benign  0.9964864 0.4804993       0.4323631        1.0029394
## 3   416    benign  0.9964864 0.4804993       0.4323631        1.0029394
## 4   416    benign  0.9964864 0.4804993       0.4323631        1.0029394
## 5     7    benign  0.9244742 0.4680113       0.3216358        0.6370384
## 6     7    benign  0.9244742 0.4680113       0.3216358        0.6370384
## 7     7    benign  0.9244742 0.4680113       0.3216358        0.6370384
## 8     7    benign  0.9244742 0.4680113       0.3216358        0.6370384
## 9   207 malignant  0.9999911 0.6543314       0.1583423        1.0001967
## 10  207 malignant  0.9999911 0.6543314       0.1583423        1.0001967
## 11  207 malignant  0.9999911 0.6543314       0.1583423        1.0001967
## 12  207 malignant  0.9999911 0.6543314       0.1583423        1.0001967
## 13  195    benign  0.9999981 0.4631399       0.5493353        1.0047116
## 14  195    benign  0.9999981 0.4631399       0.5493353        1.0047116
## 15  195    benign  0.9999981 0.4631399       0.5493353        1.0047116
## 16  195    benign  0.9999981 0.4631399       0.5493353        1.0047116
##                     feature feature_value
## 1                   mitoses             1
## 2               bare nuclei             3
## 3           clump thickness             3
## 4   uniformity of cell size             3
## 5                   mitoses             1
## 6               bare nuclei            10
## 7           clump thickness             1
## 8   uniformity of cell size             1
## 9                   mitoses             1
## 10  uniformity of cell size            10
## 11          clump thickness            10
## 12 uniformity of cell shape             9
## 13                  mitoses             1
## 14              bare nuclei             1
## 15          clump thickness             3
## 16  uniformity of cell size             1

plot_features(explanation, ncol = 2)

Example Using an External Model

HR ANALYTICS: USING MACHINE LEARNING TO PREDICT EMPLOYEE TURNOVER
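
For a model class that `lime` does not support out of the box, the package dispatches on two S3 generics, `model_type()` and `predict_model()`, so one approach is to define methods for your own class. The sketch below assumes a hypothetical wrapper class `my_logit` around a `glm` fit on `mtcars`; neither the class name nor the data come from the article above.

```r
# Sketch: making an otherwise unsupported model usable by lime.
# The class name "my_logit" and this wrapper are hypothetical.
fit <- glm(am ~ wt + hp, data = mtcars, family = binomial)
wrapped <- structure(list(fit = fit), class = "my_logit")

# Tell lime that this model is a classifier
model_type.my_logit <- function(x, ...) "classification"

# Return a data frame with one probability column per class label
predict_model.my_logit <- function(x, newdata, type, ...) {
  # glm predicts P(am == 1), and am == 1 means a manual transmission
  p <- predict(x$fit, newdata = newdata, type = "response")
  data.frame(manual = p, automatic = 1 - p, check.names = FALSE)
}

# Call the method directly to check the shape lime would receive
probs <- predict_model.my_logit(wrapped, head(mtcars, 3), type = "prob")
print(probs)
```

With `lime` loaded and these methods defined in the session, `wrapped` could then be passed to `lime()` and `explain()` in the same way as the `lda` model in the demo.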

Further References
