
Adaptive Feature Inheritance and Thresholding for Ingredient Recognition in Multimedia Cooking Instructions

by Yixing Zhang, Yoko Yamakata, Keishi Tajima

Abstract

In this paper, we propose a method for recognizing the ingredients present in each cooking step of a multimedia recipe. We first introduce and validate three hypotheses about the characteristics of cooking steps in recipes: (1) ingredients are most difficult to recognize in the intermediate and finishing stages, where they lose their original appearance; (2) a step often inherits ingredients from a preceding step, but not always from the immediately preceding one when there are parallel subtasks; and (3) the last step includes all ingredients used in the recipe. Based on these hypotheses, we introduce two features into our method: (1) each step adaptively inherits features from similar preceding steps, where ingredients are easier to recognize, and (2) the thresholds for each class and each recipe are decided adaptively by using the prediction of our method for the last step, where all ingredients appear. The experimental results show that our method outperforms the baseline methods.
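The two ideas in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formulas, so the cosine-similarity weighting, the blending factor `alpha`, the cut-offs `present`, `low`, and `high`, and all function names are assumptions chosen only for illustration. It assumes each step already has a feature vector and a per-class score, and that the recipe has at least one step.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def inherit_features(step_feats, alpha=0.5):
    """Blend each step with a similarity-weighted mean of ALL preceding steps.

    alpha (hypothetical) is the share kept from the step's own features.
    """
    out = [list(step_feats[0])]
    for t in range(1, len(step_feats)):
        cur = step_feats[t]
        # Weight every earlier step, not only step t-1, so that a step
        # belonging to a parallel subtask contributes little or nothing.
        weights = [max(cosine(cur, step_feats[s]), 0.0) for s in range(t)]
        total = sum(weights)
        if total == 0.0:
            out.append(list(cur))  # nothing similar to inherit from
            continue
        inherited = [
            sum(w * step_feats[s][d] for s, w in enumerate(weights)) / total
            for d in range(len(cur))
        ]
        out.append([alpha * c + (1 - alpha) * h for c, h in zip(cur, inherited)])
    return out

def adaptive_thresholds(last_step_scores, present=0.5, low=0.3, high=0.9):
    """Per-class thresholds for one recipe, derived from last-step scores.

    Assumption: a class scoring at least `present` in the last step is treated
    as an ingredient of this recipe and gets the lenient threshold `low`;
    every other class gets the strict threshold `high`.
    """
    return [low if s >= present else high for s in last_step_scores]

def predict(step_scores, thresholds):
    """Multi-label decision per step: class c is predicted if its score >= threshold[c]."""
    return [[s >= th for s, th in zip(scores, thresholds)] for scores in step_scores]
```

In this sketch, a step whose features match an earlier, non-adjacent step inherits from that step rather than from its immediate predecessor, and a class that scores weakly in the last step needs a much higher score to be predicted in any earlier step of the same recipe.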

Full Text: free download from ACM

Slides: pdf

BibTeX entry

Keywords

recipe, multi-modal annotation, multi-label recognition, datasets
Published in Proc. of 6th ACM Multimedia Asia, pp. 94:1-94:7, Auckland, New Zealand, 2024


tajima@i.kyoto-u.ac.jp / Fax: +81(Japan) 75-753-5978 / Office: Research Bldg. #7, room 404