To follow this tutorial you need the NLTK (3.x or later) and sklearn-crfsuite Python
packages. The tutorial uses Python 3.
4. Inspect model weights
CRFsuite CRF models use two kinds of features: state features and
transition features. Let’s check their weights using
eli5.show_weights (the IPython display wrapper around eli5.explain_weights):
eli5.show_weights(crf, top=30)
| From \ To | O | B-LOC | I-LOC | B-MISC | I-MISC | B-ORG | I-ORG | B-PER | I-PER |
|-----------|-------|--------|-------|--------|--------|--------|-------|--------|-------|
| O | 3.281 | 2.204 | 0.0 | 2.101 | 0.0 | 3.468 | 0.0 | 2.325 | 0.0 |
| B-LOC | -0.259 | -0.098 | 4.058 | 0.0 | 0.0 | 0.0 | 0.0 | -0.212 | 0.0 |
| I-LOC | -0.173 | -0.609 | 3.436 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| B-MISC | -0.673 | -0.341 | 0.0 | 0.0 | 4.069 | -0.308 | 0.0 | -0.331 | 0.0 |
| I-MISC | -0.803 | -0.998 | 0.0 | -0.519 | 4.977 | -0.817 | 0.0 | -0.611 | 0.0 |
| B-ORG | -0.096 | -0.242 | 0.0 | -0.57 | 0.0 | -1.012 | 4.739 | -0.306 | 0.0 |
| I-ORG | -0.339 | -1.758 | 0.0 | -0.841 | 0.0 | -1.382 | 5.062 | -0.472 | 0.0 |
| B-PER | -0.4 | -0.851 | 0.0 | 0.0 | 0.0 | -1.013 | 0.0 | -0.937 | 4.329 |
| I-PER | -0.676 | -0.47 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -0.659 | 3.754 |
y=O top features

| Weight | Feature |
|--------|---------|
| +4.416 | postag[:2]:Fp |
| +3.116 | BOS |
| +2.401 | bias |
| +2.297 | postag[:2]:Fc |
| +2.297 | word.lower():, |
| +2.297 | postag:Fc |
| +2.297 | word[-3:]:, |
| +2.124 | postag[:2]:CC |
| +2.124 | postag:CC |
| +1.984 | EOS |
| +1.859 | word.lower():y |
| +1.684 | postag:RG |
| +1.684 | postag[:2]:RG |
| +1.610 | word.lower():- |
| +1.610 | postag[:2]:Fg |
| +1.610 | word[-3:]:- |
| +1.610 | postag:Fg |
| +1.582 | postag:Fp |
| +1.582 | word[-3:]:. |
| +1.582 | word.lower():. |
| +1.372 | word[-3:]:y |
| +1.187 | postag:CS |
| +1.187 | postag[:2]:CS |
| +1.150 | word[-3:]:( |
| +1.150 | postag:Fpa |
| +1.150 | word.lower():( |
| … 16444 more positive … | |
| … 3771 more negative … | |
| -2.106 | postag:NP |
| -2.106 | postag[:2]:NP |
| -3.723 | word.isupper() |
| -6.166 | word.istitle() |

y=B-LOC top features

| Weight | Feature |
|--------|---------|
| +2.530 | word.istitle() |
| +2.224 | -1:word.lower():en |
| +0.906 | word[-3:]:rid |
| +0.905 | word.lower():madrid |
| +0.646 | word.lower():españa |
| +0.640 | word[-3:]:ona |
| +0.595 | word[-3:]:aña |
| +0.595 | +1:postag[:2]:Fp |
| +0.515 | word.lower():parís |
| +0.514 | word[-3:]:rís |
| +0.424 | word.lower():barcelona |
| +0.420 | -1:postag:Fg |
| +0.420 | -1:word.lower():- |
| +0.420 | -1:postag[:2]:Fg |
| +0.413 | -1:word.isupper() |
| +0.390 | -1:postag[:2]:Fp |
| +0.389 | -1:postag:Fpa |
| +0.389 | -1:word.lower():( |
| +0.388 | word.lower():san |
| +0.385 | postag:NC |
| … 2282 more positive … | |
| … 413 more negative … | |
| -0.389 | -1:word.lower():" |
| -0.389 | -1:postag:Fe |
| -0.389 | -1:postag[:2]:Fe |
| -0.406 | -1:postag[:2]:VM |
| -0.646 | word[-3:]:ión |
| -0.759 | -1:word.lower():del |
| -0.818 | bias |
| -0.986 | postag:SP |
| -0.986 | postag[:2]:SP |
| -1.354 | -1:word.istitle() |

y=I-LOC top features

| Weight | Feature |
|--------|---------|
| +0.886 | -1:word.istitle() |
| +0.664 | -1:word.lower():de |
| +0.582 | word[-3:]:de |
| +0.578 | word.lower():de |
| +0.529 | -1:word.lower():san |
| +0.444 | +1:word.istitle() |
| +0.441 | word.istitle() |
| +0.335 | -1:word.lower():la |
| +0.262 | postag:SP |
| +0.262 | postag[:2]:SP |
| +0.235 | word[-3:]:la |
| +0.228 | word[-3:]:iro |
| +0.226 | word[-3:]:oja |
| +0.218 | word[-3:]:del |
| +0.215 | word.lower():del |
| +0.213 | -1:postag:NC |
| +0.213 | -1:postag[:2]:NC |
| +0.205 | -1:word.lower():nueva |
| … 1665 more positive … | |
| … 258 more negative … | |
| -0.206 | -1:postag[:2]:Z |
| -0.206 | -1:postag:Z |
| -0.213 | -1:postag[:2]:CC |
| -0.213 | -1:postag:CC |
| -0.219 | -1:word.lower():en |
| -0.222 | +1:word.isupper() |
| -0.235 | +1:postag:VMI |
| -0.342 | word.isupper() |
| -0.366 | +1:postag[:2]:AQ |
| -0.366 | +1:postag:AQ |
| -0.392 | +1:postag[:2]:VM |
| -1.690 | BOS |

y=B-MISC top features

| Weight | Feature |
|--------|---------|
| +1.770 | word.isupper() |
| +0.693 | word.istitle() |
| +0.606 | word.lower():" |
| +0.606 | word[-3:]:" |
| +0.606 | postag:Fe |
| +0.606 | postag[:2]:Fe |
| +0.538 | +1:word.istitle() |
| +0.508 | -1:word.lower():" |
| +0.508 | -1:postag:Fe |
| +0.508 | -1:postag[:2]:Fe |
| +0.484 | -1:postag[:2]:DA |
| +0.484 | -1:postag:DA |
| +0.479 | +1:word.isupper() |
| +0.457 | postag[:2]:NC |
| +0.457 | postag:NC |
| +0.400 | word.lower():liga |
| +0.399 | word[-3:]:iga |
| +0.367 | -1:word.lower():la |
| +0.354 | postag:Z |
| +0.354 | postag[:2]:Z |
| +0.332 | -1:word.lower():del |
| +0.286 | +1:postag[:2]:Z |
| +0.286 | +1:postag:Z |
| +0.284 | +1:postag:NC |
| +0.284 | +1:postag[:2]:NC |
| … 2284 more positive … | |
| … 314 more negative … | |
| -0.308 | BOS |
| -0.377 | -1:postag[:2]:VM |
| -0.908 | postag[:2]:SP |
| -0.908 | postag:SP |
| -1.094 | -1:word.istitle() |

y=I-MISC top features

| Weight | Feature |
|--------|---------|
| +1.364 | -1:word.istitle() |
| +0.675 | -1:word.lower():de |
| +0.597 | +1:postag:Fe |
| +0.597 | +1:word.lower():" |
| +0.597 | +1:postag[:2]:Fe |
| +0.369 | -1:postag:NC |
| +0.369 | -1:postag[:2]:NC |
| +0.324 | -1:word.lower():liga |
| +0.318 | word[-3:]:de |
| +0.304 | word.lower():de |
| +0.303 | word.isdigit() |
| +0.261 | -1:postag[:2]:SP |
| +0.261 | -1:postag:SP |
| +0.258 | -1:word.lower():copa |
| +0.240 | word.lower():campeones |
| +0.235 | word[-3:]:000 |
| +0.234 | +1:postag:Z |
| +0.234 | +1:postag[:2]:Z |
| +0.229 | word.lower():2000 |
| … 3675 more positive … | |
| … 573 more negative … | |
| -0.235 | EOS |
| -0.264 | -1:word.lower():y |
| -0.265 | word.lower():y |
| -0.265 | +1:postag:VMI |
| -0.274 | postag[:2]:VM |
| -0.306 | -1:postag:CC |
| -0.306 | -1:postag[:2]:CC |
| -0.320 | postag:CC |
| -0.320 | postag[:2]:CC |
| -0.370 | +1:postag[:2]:VM |
| -0.641 | bias |

y=B-ORG top features

| Weight | Feature |
|--------|---------|
| +2.695 | word.lower():efe |
| +2.519 | word.isupper() |
| +2.084 | word[-3:]:EFE |
| +1.174 | word.lower():gobierno |
| +1.142 | word.istitle() |
| +1.018 | -1:word.lower():del |
| +0.958 | word[-3:]:rno |
| +0.671 | word[-3:]:PP |
| +0.671 | word.lower():pp |
| +0.667 | -1:word.lower():al |
| +0.555 | -1:word.lower():el |
| +0.499 | word[-3:]:eal |
| +0.413 | word.lower():real |
| +0.393 | word.lower():ayuntamiento |
| +0.391 | postag:AQ |
| +0.391 | postag[:2]:AQ |
| … 3518 more positive … | |
| … 619 more negative … | |
| -0.430 | -1:postag[:2]:AQ |
| -0.430 | -1:postag:AQ |
| -0.450 | +1:word.lower():de |
| -0.455 | postag[:2]:Z |
| -0.455 | postag:Z |
| -0.500 | -1:word.istitle() |
| -0.642 | -1:word.lower():los |
| -0.664 | -1:word.lower():de |
| -0.707 | -1:word.isupper() |
| -0.746 | -1:word.lower():en |
| -0.747 | -1:postag[:2]:VM |
| -1.100 | bias |
| -1.289 | postag[:2]:SP |
| -1.289 | postag:SP |

y=I-ORG top features

| Weight | Feature |
|--------|---------|
| +1.499 | -1:word.istitle() |
| +1.200 | -1:word.lower():de |
| +0.539 | -1:word.lower():real |
| +0.511 | word[-3:]:rid |
| +0.446 | word[-3:]:de |
| +0.433 | word.lower():de |
| +0.428 | -1:postag:SP |
| +0.428 | -1:postag[:2]:SP |
| +0.399 | word.lower():madrid |
| +0.368 | word[-3:]:la |
| +0.365 | -1:word.lower():consejo |
| +0.363 | word.istitle() |
| +0.352 | -1:word.lower():comisión |
| +0.336 | postag[:2]:AQ |
| +0.336 | postag:AQ |
| +0.332 | +1:postag:Fpa |
| +0.332 | +1:word.lower():( |
| +0.311 | -1:word.lower():estados |
| +0.306 | word.lower():unidos |
| … 3473 more positive … | |
| … 703 more negative … | |
| -0.304 | postag[:2]:NP |
| -0.304 | postag:NP |
| -0.306 | -1:word.lower():a |
| -0.384 | +1:postag[:2]:NC |
| -0.384 | +1:postag:NC |
| -0.391 | -1:word.isupper() |
| -0.507 | +1:postag:AQ |
| -0.507 | +1:postag[:2]:AQ |
| -0.535 | postag[:2]:VM |
| -0.540 | postag:VMI |
| -1.195 | bias |

y=B-PER top features

| Weight | Feature |
|--------|---------|
| +1.698 | word.istitle() |
| +0.683 | -1:postag:VMI |
| +0.601 | +1:postag[:2]:VM |
| +0.589 | postag:NP |
| +0.589 | postag[:2]:NP |
| +0.589 | +1:postag:VMI |
| +0.565 | -1:word.lower():a |
| +0.520 | word[-3:]:osé |
| +0.503 | word.lower():josé |
| +0.476 | -1:postag[:2]:VM |
| +0.472 | postag:NC |
| +0.472 | postag[:2]:NC |
| +0.452 | -1:postag[:2]:Fc |
| +0.452 | -1:word.lower():, |
| +0.452 | -1:postag:Fc |
| … 4117 more positive … | |
| … 351 more negative … | |
| -0.472 | -1:word.lower():en |
| -0.475 | -1:postag[:2]:Fe |
| -0.475 | -1:word.lower():" |
| -0.475 | -1:postag:Fe |
| -0.543 | word.lower():la |
| -0.572 | -1:word.lower():de |
| -0.693 | -1:word.istitle() |
| -0.712 | postag[:2]:SP |
| -0.712 | postag:SP |
| -0.778 | -1:word.lower():del |
| -0.818 | -1:postag[:2]:DA |
| -0.818 | -1:postag:DA |
| -0.923 | -1:word.lower():la |
| -1.319 | postag:DA |
| -1.319 | postag[:2]:DA |

y=I-PER top features

| Weight | Feature |
|--------|---------|
| +2.742 | -1:word.istitle() |
| +0.736 | word.istitle() |
| +0.660 | -1:word.lower():josé |
| +0.598 | -1:postag[:2]:AQ |
| +0.598 | -1:postag:AQ |
| +0.510 | -1:postag[:2]:VM |
| +0.487 | -1:word.lower():juan |
| +0.419 | -1:word.lower():maría |
| +0.413 | -1:postag:VMI |
| +0.345 | -1:word.lower():luis |
| +0.319 | -1:word.lower():manuel |
| +0.315 | postag[:2]:NC |
| +0.315 | postag:NC |
| +0.309 | -1:word.lower():carlos |
| … 3903 more positive … | |
| … 365 more negative … | |
| -0.301 | postag[:2]:NP |
| -0.301 | postag:NP |
| -0.301 | word[-3:]:ión |
| -0.305 | postag[:2]:Fe |
| -0.305 | word.lower():" |
| -0.305 | postag:Fe |
| -0.305 | word[-3:]:" |
| -0.305 | +1:word.lower():que |
| -0.324 | -1:word.lower():el |
| -0.377 | +1:postag[:2]:Z |
| -0.377 | +1:postag:Z |
| -0.396 | postag:VMI |
| -0.433 | +1:postag:SP |
| -0.433 | +1:postag[:2]:SP |
| -0.485 | postag[:2]:VM |
| -1.431 | bias |
Transition features make sense: at least the model learned that I-ENTITY
must follow B-ENTITY. It also learned that some transitions are
unlikely, e.g. it is not common in this dataset to have a location right
after an organization name (I-ORG -> B-LOC has a large negative weight).
The features don’t use gazetteers, so the model had to memorize some
geographic names from the training data, e.g. that España is a location.
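The state-feature names in the tables above (word.lower():…, word[-3:]:…, postag[:2]:…, BOS/EOS) come from a word2features-style function defined earlier in the tutorial. A sketch of the kind of function that produces such names (illustrative only, not necessarily the tutorial’s exact implementation):

```python
def word2features(sent, i):
    # sent is a list of (token, POS tag) pairs; returns the feature dict
    # for position i, producing names like those in the weight tables.
    word, postag = sent[i]
    features = {
        'bias': 1.0,
        'word.lower()': word.lower(),
        'word[-3:]': word[-3:],
        'word.isupper()': word.isupper(),
        'word.istitle()': word.istitle(),
        'word.isdigit()': word.isdigit(),
        'postag': postag,
        'postag[:2]': postag[:2],
    }
    if i > 0:
        prev_word, prev_postag = sent[i - 1]
        features.update({
            '-1:word.lower()': prev_word.lower(),
            '-1:word.istitle()': prev_word.istitle(),
            '-1:word.isupper()': prev_word.isupper(),
            '-1:postag': prev_postag,
            '-1:postag[:2]': prev_postag[:2],
        })
    else:
        features['BOS'] = True  # beginning of sentence
    if i < len(sent) - 1:
        next_word, next_postag = sent[i + 1]
        features.update({
            '+1:word.lower()': next_word.lower(),
            '+1:word.istitle()': next_word.istitle(),
            '+1:word.isupper()': next_word.isupper(),
            '+1:postag': next_postag,
            '+1:postag[:2]': next_postag[:2],
        })
    else:
        features['EOS'] = True  # end of sentence
    return features
```

For example, word2features([('Madrid', 'NP'), ('es', 'VMI')], 0) yields word.lower() = 'madrid', BOS = True and +1:postag[:2] = 'VM'.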
If we regularize the CRF more heavily, we can expect that only generic
features will remain, while memorized tokens will go away. With L1
regularization (the c1 parameter) the coefficients of most features
should be driven to zero. Let’s check what effect regularization has on
the CRF weights:
crf = sklearn_crfsuite.CRF(
algorithm='lbfgs',
c1=200,
c2=0.1,
max_iterations=20,
all_possible_transitions=False,
)
crf.fit(X_train, y_train)
eli5.show_weights(crf, top=30)
| From \ To | O | B-LOC | I-LOC | B-MISC | I-MISC | B-ORG | I-ORG | B-PER | I-PER |
|-----------|--------|--------|-------|--------|--------|-------|-------|-------|-------|
| O | 3.232 | 1.76 | 0.0 | 2.026 | 0.0 | 2.603 | 0.0 | 1.593 | 0.0 |
| B-LOC | 0.035 | 0.0 | 2.773 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| I-LOC | -0.02 | 0.0 | 3.099 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| B-MISC | -0.382 | 0.0 | 0.0 | 0.0 | 4.758 | 0.0 | 0.0 | 0.0 | 0.0 |
| I-MISC | -0.256 | 0.0 | 0.0 | 0.0 | 4.155 | 0.0 | 0.0 | 0.0 | 0.0 |
| B-ORG | 0.161 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 3.344 | 0.0 | 0.0 |
| I-ORG | -0.126 | -0.081 | 0.0 | 0.0 | 0.0 | 0.0 | 4.048 | 0.0 | 0.0 |
| B-PER | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 3.449 |
| I-PER | -0.085 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.254 |
y=O top features

| Weight | Feature |
|--------|---------|
| +3.363 | BOS |
| +2.842 | bias |
| +2.478 | postag[:2]:Fp |
| +0.665 | -1:word.isupper() |
| +0.439 | +1:postag[:2]:AQ |
| +0.439 | +1:postag:AQ |
| +0.400 | postag[:2]:Fc |
| +0.400 | word.lower():, |
| +0.400 | word[-3:]:, |
| +0.400 | postag:Fc |
| +0.391 | postag:CC |
| +0.391 | postag[:2]:CC |
| +0.365 | EOS |
| +0.363 | +1:postag:NC |
| +0.363 | +1:postag[:2]:NC |
| +0.315 | postag:SP |
| +0.315 | postag[:2]:SP |
| +0.302 | +1:word.isupper() |
| … 15 more positive … | |
| … 14 more negative … | |
| -0.216 | postag:AQ |
| -0.216 | postag[:2]:AQ |
| -0.334 | -1:postag:SP |
| -0.334 | -1:postag[:2]:SP |
| -0.417 | postag[:2]:NP |
| -0.417 | postag:NP |
| -0.547 | postag[:2]:NC |
| -0.547 | postag:NC |
| -0.547 | word.lower():de |
| -0.600 | word[-3:]:de |
| -3.552 | word.isupper() |
| -5.446 | word.istitle() |

y=B-LOC top features

| Weight | Feature |
|--------|---------|
| +1.417 | -1:word.lower():en |
| +1.183 | word.istitle() |
| +0.498 | +1:postag[:2]:Fp |
| +0.150 | +1:word.lower():, |
| +0.150 | +1:postag:Fc |
| +0.150 | +1:postag[:2]:Fc |
| +0.098 | -1:postag[:2]:Fp |
| +0.081 | -1:postag:Fpa |
| +0.081 | -1:word.lower():( |
| +0.080 | postag[:2]:NP |
| +0.080 | postag:NP |
| +0.056 | -1:postag:SP |
| +0.056 | -1:postag[:2]:SP |
| +0.022 | postag:NC |
| +0.022 | postag[:2]:NC |
| +0.019 | BOS |
| -0.008 | +1:word.istitle() |
| -0.028 | -1:word.lower():del |
| -0.572 | -1:word.istitle() |

y=I-LOC top features

| Weight | Feature |
|--------|---------|
| +0.788 | -1:word.istitle() |
| +0.248 | word[-3:]:de |
| +0.237 | word.lower():de |
| +0.199 | -1:word.lower():de |
| +0.190 | postag[:2]:SP |
| +0.190 | postag:SP |
| +0.060 | -1:postag:SP |
| +0.060 | -1:postag[:2]:SP |
| +0.040 | +1:word.istitle() |

y=B-MISC top features

| Weight | Feature |
|--------|---------|
| +0.349 | word.isupper() |
| +0.053 | -1:postag[:2]:DA |
| +0.053 | -1:postag:DA |
| +0.030 | word.istitle() |
| -0.009 | -1:postag:SP |
| -0.009 | -1:postag[:2]:SP |
| -0.060 | bias |
| -0.172 | -1:word.istitle() |

y=I-MISC top features

| Weight | Feature |
|--------|---------|
| +0.432 | -1:word.istitle() |
| +0.158 | -1:postag[:2]:NC |
| +0.158 | -1:postag:NC |
| +0.146 | +1:postag[:2]:Fe |
| +0.146 | +1:word.lower():" |
| +0.146 | +1:postag:Fe |
| +0.030 | postag[:2]:SP |
| +0.030 | postag:SP |
| -0.087 | word.istitle() |
| -0.094 | bias |
| -0.119 | word.isupper() |
| -0.120 | -1:word.isupper() |
| -0.121 | +1:word.isupper() |
| -0.211 | +1:word.istitle() |

y=B-ORG top features

| Weight | Feature |
|--------|---------|
| +1.681 | word.isupper() |
| +0.507 | -1:word.lower():del |
| +0.350 | -1:postag:DA |
| +0.350 | -1:postag[:2]:DA |
| +0.282 | word.lower():efe |
| +0.234 | word[-3:]:EFE |
| +0.195 | -1:word.lower():( |
| +0.195 | -1:postag:Fpa |
| +0.192 | word.istitle() |
| +0.178 | +1:postag:Fpt |
| +0.178 | +1:word.lower():) |
| +0.173 | -1:postag[:2]:Fp |
| +0.136 | -1:word.lower():el |
| +0.110 | postag[:2]:NC |
| +0.110 | postag:NC |
| -0.004 | +1:word.istitle() |
| -0.023 | +1:postag[:2]:Fp |
| -0.041 | +1:postag:NC |
| -0.041 | +1:postag[:2]:NC |
| -0.210 | -1:word.lower():de |
| -0.515 | bias |

y=I-ORG top features

| Weight | Feature |
|--------|---------|
| +1.318 | -1:word.istitle() |
| +0.762 | -1:word.lower():de |
| +0.185 | -1:postag:SP |
| +0.185 | -1:postag[:2]:SP |
| +0.185 | word[-3:]:de |
| +0.058 | word.lower():de |
| -0.043 | -1:word.isupper() |
| -0.267 | +1:word.istitle() |
| -0.536 | bias |

y=B-PER top features

| Weight | Feature |
|--------|---------|
| +0.800 | word.istitle() |
| +0.463 | -1:word.lower():, |
| +0.463 | -1:postag[:2]:Fc |
| +0.463 | -1:postag:Fc |
| +0.148 | +1:postag:VMI |
| +0.125 | +1:word.istitle() |
| +0.095 | +1:postag[:2]:VM |
| +0.007 | +1:postag:AQ |
| +0.007 | +1:postag[:2]:AQ |
| -0.039 | -1:word.istitle() |
| -0.058 | postag:DA |
| -0.058 | postag[:2]:DA |
| -0.063 | bias |
| -0.067 | -1:word.lower():de |
| -0.159 | -1:postag:SP |
| -0.159 | -1:postag[:2]:SP |
| -0.263 | -1:postag:DA |
| -0.263 | -1:postag[:2]:DA |

y=I-PER top features

| Weight | Feature |
|--------|---------|
| +2.127 | -1:word.istitle() |
| +0.331 | word.istitle() |
| +0.016 | +1:postag[:2]:Fc |
| +0.016 | +1:word.lower():, |
| +0.016 | +1:postag:Fc |
| -0.089 | +1:postag:SP |
| -0.089 | +1:postag[:2]:SP |
| -0.648 | bias |
As you can see, the memorized tokens are mostly gone and the model now
relies on word shapes and POS tags. Only a few non-zero features remain.
In our example the change probably made the quality worse, but that’s a
separate question.
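The zeroing behavior is characteristic of L1: solvers apply a soft-thresholding step that sets small coefficients exactly to zero rather than merely shrinking them. A toy sketch of that operator (illustrative only; CRFsuite itself handles the L1 term via OWL-QN inside L-BFGS):

```python
def soft_threshold(w, lam):
    # Proximal step for an L1 penalty of strength lam:
    # weights with |w| <= lam are clipped to exactly 0.0,
    # larger weights are shrunk toward zero by lam.
    if w > lam:
        return w - lam
    if w < -lam:
        return w + lam
    return 0.0

weights = [0.03, -0.8, 1.5, -0.05]
shrunk = [soft_threshold(w, 0.1) for w in weights]
# small weights vanish entirely; large ones are merely shrunk
```

With a large penalty like c1=200 above, most state features fall below the threshold, which is why so few non-zero features survive.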
Let’s focus on the transition weights. We could expect O -> I-ENTITY
transitions to have large negative weights because they are impossible.
But these transitions have zero weights, not negative weights, both in
the heavily regularized model and in our initial model. Something is
going on here.
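One way to investigate is to count which label bigrams actually occur in the training data. A minimal sketch (the toy sequences here are hypothetical, not drawn from the real dataset):

```python
from collections import Counter

def observed_transitions(y_train):
    # Count (previous label, current label) bigrams across all sequences.
    # Pairs that never occur in well-formed BIO data, like ('O', 'I-ORG'),
    # are simply absent from the counts.
    counts = Counter()
    for seq in y_train:
        for prev, cur in zip(seq, seq[1:]):
            counts[(prev, cur)] += 1
    return counts

y_toy = [['O', 'B-ORG', 'I-ORG', 'O'], ['O', 'B-PER', 'I-PER']]
counts = observed_transitions(y_toy)
# ('O', 'I-ORG') never occurs, so a trainer that only learns weights for
# observed transitions leaves it at 0.0
```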
The reason they are zero is that crfsuite hasn’t seen these transitions
in the training data and assumed there is no need to learn weights for
them, to save some computation time. This is the default behavior; it
can be turned off using the sklearn_crfsuite.CRF
all_possible_transitions option. Let’s check how it affects the result:
crf = sklearn_crfsuite.CRF(
algorithm='lbfgs',
c1=0.1,
c2=0.1,
max_iterations=20,
all_possible_transitions=True,
)
crf.fit(X_train, y_train);
eli5.show_weights(crf, top=5, show=['transition_features'])
| From \ To | O | B-LOC | I-LOC | B-MISC | I-MISC | B-ORG | I-ORG | B-PER | I-PER |
|-----------|--------|--------|--------|--------|--------|--------|--------|--------|--------|
| O | 2.732 | 1.217 | -4.675 | 1.515 | -5.785 | 1.36 | -6.19 | 0.968 | -6.236 |
| B-LOC | -0.226 | -0.091 | 3.378 | -0.433 | -1.065 | -0.861 | -1.783 | -0.295 | -1.57 |
| I-LOC | -0.184 | -0.585 | 2.404 | -0.276 | -0.485 | -0.582 | -0.749 | -0.442 | -0.647 |
| B-MISC | -0.714 | -0.353 | -0.539 | -0.278 | 3.512 | -0.412 | -1.047 | -0.336 | -0.895 |
| I-MISC | -0.697 | -0.846 | -0.587 | -0.297 | 4.252 | -0.84 | -1.206 | -0.523 | -1.001 |
| B-ORG | 0.419 | -0.187 | -1.074 | -0.567 | -1.607 | -1.13 | 5.392 | -0.223 | -2.122 |
| I-ORG | -0.117 | -1.715 | -0.863 | -0.631 | -1.221 | -1.442 | 5.141 | -0.397 | -1.908 |
| B-PER | -0.127 | -0.806 | -0.834 | -0.52 | -1.228 | -1.089 | -2.076 | -1.01 | 4.04 |
| I-PER | -0.766 | -0.242 | -0.67 | -0.418 | -0.856 | -0.903 | -1.472 | -0.692 | 2.909 |
With all_possible_transitions=True the CRF learned large negative
weights for impossible transitions like O -> I-ORG.
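Which transitions are impossible follows mechanically from the BIO scheme: an I-X tag may only continue a B-X or I-X tag of the same entity type. A small sketch of that constraint, which the CRF expresses through large negative transition weights:

```python
def is_valid_bio_transition(prev_tag, cur_tag):
    # In BIO tagging, I-X may only follow B-X or I-X of the same entity
    # type; every other (prev, cur) pair is structurally allowed.
    if cur_tag.startswith('I-'):
        entity = cur_tag[2:]
        return prev_tag in ('B-' + entity, 'I-' + entity)
    return True
```

For example, is_valid_bio_transition('O', 'I-ORG') is False, matching the large negative O -> I-ORG weight in the table above.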
5. Customization
The table above is large and somewhat hard to inspect; eli5 provides
several options to look at only a part of the features. You can check
only a subset of labels:
eli5.show_weights(crf, top=10, targets=['O', 'B-ORG', 'I-ORG'])
| From \ To | O | B-ORG | I-ORG |
|-----------|--------|--------|--------|
| O | 2.732 | 1.36 | -6.19 |
| B-ORG | 0.419 | -1.13 | 5.392 |
| I-ORG | -0.117 | -1.442 | 5.141 |

y=O top features

| Weight | Feature |
|--------|---------|
| +4.931 | BOS |
| +3.754 | postag[:2]:Fp |
| +3.539 | bias |
| +2.328 | word[-3:]:, |
| +2.328 | word.lower():, |
| +2.328 | postag[:2]:Fc |
| +2.328 | postag:Fc |
| … 15039 more positive … | |
| … 3905 more negative … | |
| -2.187 | postag[:2]:NP |
| -3.685 | word.isupper() |
| -7.025 | word.istitle() |

y=B-ORG top features

| Weight | Feature |
|--------|---------|
| +3.041 | word.isupper() |
| +2.952 | word.lower():efe |
| +1.851 | word[-3:]:EFE |
| +1.278 | word.lower():gobierno |
| +1.033 | word[-3:]:rno |
| +1.005 | word.istitle() |
| +0.864 | -1:word.lower():del |
| … 3524 more positive … | |
| … 621 more negative … | |
| -0.842 | -1:word.lower():en |
| -1.416 | postag[:2]:SP |
| -1.416 | postag:SP |

y=I-ORG top features

| Weight | Feature |
|--------|---------|
| +1.159 | -1:word.lower():de |
| +0.993 | -1:word.istitle() |
| +0.637 | -1:postag[:2]:SP |
| +0.637 | -1:postag:SP |
| +0.570 | -1:word.lower():real |
| +0.547 | word.istitle() |
| … 3517 more positive … | |
| … 676 more negative … | |
| -0.480 | postag:VMI |
| -0.508 | postag[:2]:VM |
| -0.533 | -1:word.isupper() |
| -1.290 | bias |
Another option is to check only some of the features; this helps to
verify that a feature function works as intended. For example, let’s
check how word-shape features are used by the model, using the
feature_re argument, and hide the transition table:
eli5.show_weights(crf, top=10, feature_re=r'^word\.is',
                  horizontal_layout=False, show=['targets'])
y=O top features

| Weight | Feature |
|--------|---------|
| -3.685 | word.isupper() |
| -7.025 | word.istitle() |

y=B-LOC top features

| Weight | Feature |
|--------|---------|
| +2.397 | word.istitle() |
| +0.099 | word.isupper() |
| -0.152 | word.isdigit() |

y=I-LOC top features

| Weight | Feature |
|--------|---------|
| +0.460 | word.istitle() |
| -0.018 | word.isdigit() |
| -0.345 | word.isupper() |

y=B-MISC top features

| Weight | Feature |
|--------|---------|
| +2.017 | word.isupper() |
| +0.603 | word.istitle() |
| -0.012 | word.isdigit() |

y=I-MISC top features

| Weight | Feature |
|--------|---------|
| +0.271 | word.isdigit() |
| -0.072 | word.isupper() |
| -0.106 | word.istitle() |

y=B-ORG top features

| Weight | Feature |
|--------|---------|
| +3.041 | word.isupper() |
| +1.005 | word.istitle() |
| -0.044 | word.isdigit() |

y=I-ORG top features

| Weight | Feature |
|--------|---------|
| +0.547 | word.istitle() |
| +0.014 | word.isdigit() |
| -0.012 | word.isupper() |

y=B-PER top features

| Weight | Feature |
|--------|---------|
| +1.757 | word.istitle() |
| +0.050 | word.isupper() |
| -0.123 | word.isdigit() |

y=I-PER top features

| Weight | Feature |
|--------|---------|
| +0.976 | word.istitle() |
| +0.193 | word.isupper() |
| -0.106 | word.isdigit() |
Looks fine - UPPERCASE and Titlecase words are likely to be entities of
some kind.
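The feature_re argument is an ordinary regular expression matched against feature names. A quick sketch of the selection it performs (assuming search semantics, with a hypothetical subset of feature names):

```python
import re

# Feature names as they appear in the weight tables (a hypothetical subset)
names = ['word.istitle()', 'word.isupper()', 'word.isdigit()',
         'postag:NC', 'word.lower():de', 'BOS']

# r'^word\.is' anchors at the start and escapes the dot, so only the
# word-shape predicates survive the filter
pattern = re.compile(r'^word\.is')
shape_features = [n for n in names if pattern.search(n)]
```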