Adding SPICE to Meshed-Memory Transformer
SPICE Evaluation Support
File Setup
- Download the ZIP file for the COCO evals repo
- Extract the zip folder
- Copy the `pycocoevalcap/spice/` directory into the `evaluation/` directory of the Meshed-Memory repo
- Copy the `get_stanford_models.sh` bash script into the `evaluation/` directory
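The copy steps above can also be automated. A minimal Python sketch, assuming the ZIP extracts to a single top-level folder containing `pycocoevalcap/` (the function name and paths here are hypothetical, not part of either repo):

```python
import shutil
import zipfile
from pathlib import Path

def install_spice(coco_zip: str, m2_root: str) -> None:
    """Extract the COCO evals ZIP and copy the SPICE files into the
    Meshed-Memory evaluation/ directory. Paths are assumptions based
    on the repo layouts described above."""
    extract_dir = Path("coco_evals")
    with zipfile.ZipFile(coco_zip) as zf:
        zf.extractall(extract_dir)

    # The extracted folder name depends on the repo/branch downloaded.
    repo_dir = next(extract_dir.iterdir())
    eval_dir = Path(m2_root) / "evaluation"

    # Copy pycocoevalcap/spice/ into evaluation/spice/
    shutil.copytree(repo_dir / "pycocoevalcap" / "spice", eval_dir / "spice")
    # Copy the model-download script alongside it
    shutil.copy(repo_dir / "get_stanford_models.sh", eval_dir)
```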
File Edits
- Edit `get_stanford_models.sh` so the `SPICELIB` variable reads `SPICELIB=spice/lib`
- Add `from .spice import Spice` to the top of `evaluation/__init__.py`
- Add `from .spice import Spice` to `evaluation/spice/__init__.py`
Run
- Run `get_stanford_models.sh` from within the `evaluation/` directory
Additional Edits
Pretty Print the SPICE Metric
The `compute_scores` function in `evaluation/__init__.py` uses `str()` to get each metric's name. To prevent it from printing a memory address for SPICE, edit `evaluation/spice.py` and change:
```python
def method(self):
    return "SPICE"
```

to

```python
def __str__(self):
    return "SPICE"
```
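This matters because `str()` on an object with no `__str__` falls back to the default `__repr__`, which prints the class name and a memory address. A minimal sketch (the `Spice` body here is a hypothetical stand-in, not the real metric class):

```python
class WithoutStr:
    """No __str__ defined: str() falls back to the default repr."""
    pass

class Spice:
    """Stand-in for evaluation/spice.py's Spice class with __str__ added."""
    def __str__(self) -> str:
        return "SPICE"

print(str(WithoutStr()))  # e.g. '<__main__.WithoutStr object at 0x7f...>'
print(str(Spice()))       # 'SPICE'
```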
Control Whether SPICE is Used
SPICE can be quite slow to run, so you may not want to compute it on the validation set after every training epoch, instead opting to run it only on the test set after training is complete. To do this, edit the `compute_scores` function in `evaluation/__init__.py` to be the following:
```python
def compute_scores(gts, gen, is_test=False):
    if is_test:
        metrics = (Bleu(), Meteor(), Rouge(), Cider(), Spice())
    else:
        metrics = (Bleu(), Meteor(), Rouge(), Cider())
    all_score = {}
    all_scores = {}
    for metric in metrics:
        score, scores = metric.compute_score(gts, gen)
        all_score[str(metric)] = score
        all_scores[str(metric)] = scores
    return all_score, all_scores
```
Then pass `is_test=True` in the `compute_scores` call used for your test-set evaluation.
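The branching can be verified without the heavyweight metrics. A runnable sketch of the same control flow, with a hypothetical stub class standing in for the real `Bleu`/`Meteor`/`Rouge`/`Cider`/`Spice` metrics:

```python
class _StubMetric:
    """Hypothetical stand-in mimicking the metric interface:
    a __str__ name and a compute_score(gts, gen) method."""
    def __init__(self, name):
        self.name = name
    def __str__(self):
        return self.name
    def compute_score(self, gts, gen):
        # Dummy corpus-level score and per-caption scores.
        return 0.5, [0.5] * len(gen)

def compute_scores(gts, gen, is_test=False):
    metrics = [_StubMetric("BLEU"), _StubMetric("METEOR"),
               _StubMetric("ROUGE"), _StubMetric("CIDEr")]
    if is_test:
        metrics.append(_StubMetric("SPICE"))  # only on the test set
    all_score, all_scores = {}, {}
    for metric in metrics:
        score, scores = metric.compute_score(gts, gen)
        all_score[str(metric)] = score
        all_scores[str(metric)] = scores
    return all_score, all_scores

gts = {0: ["a cat on a mat"]}
gen = {0: ["a cat sits on a mat"]}
val_scores, _ = compute_scores(gts, gen)                 # no SPICE
test_scores, _ = compute_scores(gts, gen, is_test=True)  # includes SPICE
```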