In this paper, we introduce a multimedia recipe dataset with
ingredient annotations at every instructional step, named MIAIS
(Multimedia recipe dataset with Ingredient Annotation at every
Instructional Step). A distinctive feature of recipe data is that it is
usually presented in a sequential, multimedia form. However, few
publicly available recipe datasets contain multimedia text-image
paired data for every cooking step. Our goal is to construct a recipe
dataset that provides sufficient multimedia data, together with the
corresponding annotations, for every cooking step; such data is
important for many research
topics, such as cooking flow graph generation, recipe text generation,
and cooking action recognition. MIAIS contains 12,000 recipes; each
recipe has 9.13 cooking instruction steps on average, and each step is
a pair of a text description and an image. The text descriptions and
images are collected from the NII Cookpad Dataset and the Cookpad Image
Dataset, respectively. We have already released our annotation data
and related information.