Adding SPICE to Meshed-Memory Transformer
SPICE Evaluation Support
File Setup
- Download the ZIP file for the COCO evals repo
- Extract the zip folder
- Copy the `pycocoevalcap/spice/` directory into the `evaluation/` directory of the Meshed-Memory repo
- Copy the `get_stanford_models.sh` bash script into the `evaluation/` directory
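The copy steps above can also be automated. A minimal Python sketch, assuming the ZIP extracts to a single top-level folder containing `pycocoevalcap/` (the function name and paths here are hypothetical, not part of either repo):

```python
import shutil
import zipfile
from pathlib import Path

def install_spice(coco_zip: str, m2_root: str) -> None:
    """Extract the COCO evals ZIP and copy the SPICE files into the
    Meshed-Memory evaluation/ directory. Paths are assumptions based
    on the repo layouts described above."""
    extract_dir = Path("coco_evals")
    with zipfile.ZipFile(coco_zip) as zf:
        zf.extractall(extract_dir)

    # The extracted folder name depends on the repo/branch downloaded.
    repo_dir = next(extract_dir.iterdir())
    eval_dir = Path(m2_root) / "evaluation"

    # Copy pycocoevalcap/spice/ into evaluation/spice/
    shutil.copytree(repo_dir / "pycocoevalcap" / "spice", eval_dir / "spice")
    # Copy the model-download script alongside it
    shutil.copy(repo_dir / "get_stanford_models.sh", eval_dir)
```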
File Edits
- Edit `get_stanford_models.sh` so the `SPICELIB` variable reads `SPICELIB=spice/lib`
- Add `from .spice import Spice` to the top of `evaluation/__init__.py`
- Add `from .spice import Spice` to `evaluation/spice/__init__.py`
Run
- Run `get_stanford_models.sh` from within the `evaluation/` directory
Additional Edits
Pretty Print the SPICE Metric
The `compute_scores` function in `evaluation/__init__.py` uses `str()` to get each metric's name. To prevent it from printing a memory address for SPICE, edit `evaluation/spice.py` and change:
```python
def method(self):
    return "SPICE"
```

to

```python
def __str__(self):
    return "SPICE"
```
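This matters because `str()` on an object with no `__str__` falls back to the default `__repr__`, which prints the class name and a memory address. A minimal sketch (the `Spice` body here is a hypothetical stand-in, not the real metric class):

```python
class WithoutStr:
    """No __str__ defined: str() falls back to the default repr."""
    pass

class Spice:
    """Stand-in for evaluation/spice.py's Spice class with __str__ added."""
    def __str__(self) -> str:
        return "SPICE"

print(str(WithoutStr()))  # e.g. '<__main__.WithoutStr object at 0x7f...>'
print(str(Spice()))       # 'SPICE'
```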
Control Whether SPICE is Used
SPICE can be quite slow to run, so you may not want to compute it on the validation set after every training epoch, instead opting to run it only on the test set after training is complete. To do this, edit the `compute_scores` function in `evaluation/__init__.py` to be the following:
```python
def compute_scores(gts, gen, is_test=False):
    if is_test:
        metrics = (Bleu(), Meteor(), Rouge(), Cider(), Spice())
    else:
        metrics = (Bleu(), Meteor(), Rouge(), Cider())
    all_score = {}
    all_scores = {}
    for metric in metrics:
        score, scores = metric.compute_score(gts, gen)
        all_score[str(metric)] = score
        all_scores[str(metric)] = scores
    return all_score, all_scores
```
Then pass `is_test=True` in the `compute_scores` call used for your test-set evaluation.
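The branching can be verified without the heavyweight metrics. A runnable sketch of the same control flow, with a hypothetical stub class standing in for the real `Bleu`/`Meteor`/`Rouge`/`Cider`/`Spice` metrics:

```python
class _StubMetric:
    """Hypothetical stand-in mimicking the metric interface:
    a __str__ name and a compute_score(gts, gen) method."""
    def __init__(self, name):
        self.name = name
    def __str__(self):
        return self.name
    def compute_score(self, gts, gen):
        # Dummy corpus-level score and per-caption scores.
        return 0.5, [0.5] * len(gen)

def compute_scores(gts, gen, is_test=False):
    metrics = [_StubMetric("BLEU"), _StubMetric("METEOR"),
               _StubMetric("ROUGE"), _StubMetric("CIDEr")]
    if is_test:
        metrics.append(_StubMetric("SPICE"))  # only on the test set
    all_score, all_scores = {}, {}
    for metric in metrics:
        score, scores = metric.compute_score(gts, gen)
        all_score[str(metric)] = score
        all_scores[str(metric)] = scores
    return all_score, all_scores

gts = {0: ["a cat on a mat"]}
gen = {0: ["a cat sits on a mat"]}
val_scores, _ = compute_scores(gts, gen)                 # no SPICE
test_scores, _ = compute_scores(gts, gen, is_test=True)  # includes SPICE
```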