CIDEr Score Evaluator
Evaluate how well a generated image caption matches a set of human-written reference captions using the CIDEr score.
Metric Card for CIDEr
This module implements the CIDEr metric for image-captioning evaluation.
Metric Description
CIDEr (Consensus-based Image Description Evaluation) measures the quality of a generated image caption by its similarity to a set of human-written reference captions. Each caption is represented as a vector of TF-IDF-weighted n-grams, and the score is the average cosine similarity between the candidate's vector and each reference's vector, so n-grams that the references agree on (the consensus) but that are rare across the corpus contribute the most.
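Concretely, following the CVPR 2015 paper cited below (notation theirs): g^n(.) is the TF-IDF-weighted vector of n-grams of a sentence, c_i is the candidate caption, and S_i = {s_i1, ..., s_im} are its m references:

% Per-n score: average cosine similarity between TF-IDF n-gram vectors
% of the candidate and each of the m references.
\mathrm{CIDEr}_n(c_i, S_i) = \frac{1}{m} \sum_{j=1}^{m} \frac{g^n(c_i) \cdot g^n(s_{ij})}{\lVert g^n(c_i) \rVert \, \lVert g^n(s_{ij}) \rVert}

% Final score: uniform average over n-gram sizes, w_n = 1/N with N = 4.
\mathrm{CIDEr}(c_i, S_i) = \sum_{n=1}^{N} w_n \, \mathrm{CIDEr}_n(c_i, S_i)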
How to Use
Load the module with evaluate.load and call its compute method with the following parameters:
Inputs
- predictions (list of lists of str): The generated captions to evaluate, one list per image.
- references (list of lists of str): The human reference captions for each generated caption.
Output Values
- score (dict): A dictionary containing the CIDEr score, which ranges from 0 to 1, with higher scores indicating captions that agree more closely with the reference consensus.
Examples
import evaluate

# Load the CIDEr module from the Hub and score one generated caption
# against five human references for the same image.
metric = evaluate.load("sunhill/cider")
results = metric.compute(
    predictions=[["train traveling down a track in front of a road"]],
    references=[
        [
            "a train traveling down tracks next to lights",
            "a blue and silver train next to train station and trees",
            "a blue train is next to a sidewalk on the rails",
            "a passenger train pulls into a train station",
            "a train coming down the tracks arriving at a station",
        ]
    ],
)
print(results)
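For intuition about what compute does internally, here is a minimal, self-contained sketch of the TF-IDF-weighted n-gram cosine similarity at the heart of CIDEr. It is an illustration, not this module's implementation: it computes document frequencies over the references alone and omits the stemming, length penalty, and 1- to 4-gram averaging of the full metric; the helper names (ngrams, cider_like) are hypothetical.

from collections import Counter
import math

def ngrams(tokens, n):
    # All contiguous n-grams of a token list, as tuples.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def cider_like(candidate, references, n=1):
    # Simplified stand-in: document frequency over the reference set only.
    ref_grams = [Counter(ngrams(r.split(), n)) for r in references]
    cand_grams = Counter(ngrams(candidate.split(), n))
    df = Counter()
    for grams in ref_grams:
        df.update(set(grams))
    num_docs = len(references)

    def tfidf(counts):
        # Term frequency times a smoothed inverse document frequency.
        total = sum(counts.values()) or 1
        return {g: (c / total) * math.log((num_docs + 1) / (df[g] + 1))
                for g, c in counts.items()}

    def cosine(u, v):
        dot = sum(u[g] * v.get(g, 0.0) for g in u)
        nu = math.sqrt(sum(x * x for x in u.values()))
        nv = math.sqrt(sum(x * x for x in v.values()))
        return dot / (nu * nv) if nu and nv else 0.0

    # Average cosine similarity against every reference.
    cand_vec = tfidf(cand_grams)
    return sum(cosine(cand_vec, tfidf(r)) for r in ref_grams) / num_docs

print(cider_like(
    "train traveling down a track in front of a road",
    ["a train traveling down tracks next to lights",
     "a passenger train pulls into a train station"],
))

Because IDF downweights n-grams that appear in every reference, a candidate earns most of its score from the distinctive words the references agree on, which is exactly the consensus effect the full metric formalizes.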
Citation
@InProceedings{Vedantam_2015_CVPR,
author = {Vedantam, Ramakrishna and Lawrence Zitnick, C. and Parikh, Devi},
title = {CIDEr: Consensus-Based Image Description Evaluation},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2015}
}