Yixin Zhang gave a presentation at ACM Multimedia Asia

    2021.12.3


    Yixin Zhang presented her research findings at ACM Multimedia Asia, one of the international multimedia conferences organized by ACM.

    The title of her paper is “MIRecipe: A Recipe Dataset for Stage-Aware Recognition of Changes in Appearance of Ingredients.” In recent years, many recipe datasets have been accumulated on websites such as Cookpad. These recipe datasets describe each step of the cooking process using text or a combination of text and images. However, when images are included, certain details about the ingredients may be omitted from the text descriptions. In such cases, automatically recognizing the ingredients in the images can help supplement the missing information. Zhang’s research focuses on ingredient recognition in recipe images.

    Ingredients in recipe images are easy to recognize at first, since their initial appearance is consistent across recipes. As cooking progresses, however, their appearance changes in various ways, which makes recognition harder. For example, potatoes are usually brown and round when they first appear in a recipe. Later they may be julienned or cubed, and by the end they may have changed further and be mixed with other ingredients.




    Considering these characteristics, Zhang proposed two methods:

    • Stage-Aware Model Selection: Uses different models depending on the stage of the cooking process. For example, an image recognition model trained on early-stage recipe images is used for recognizing early-stage images, taking the ingredient’s transformation over time into account.
    • Curriculum Learning-Based Model: Starts training the model with early-stage images that are easier to recognize, and then progressively trains on mid-stage and late-stage images. This approach, known as curriculum learning, helps the model gradually adapt to more complex recognition tasks.
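
    The two ideas above can be illustrated with a minimal Python sketch. It is not the implementation from the paper: the three stage labels, the rule that splits a recipe's steps into thirds, and the names stage_of, select_model and curriculum_phases are all assumptions made for illustration, and the classifiers are stand-in functions rather than trained models.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stage labels; the article only distinguishes early, mid and late stages.
STAGES = ("early", "mid", "late")

# A classifier maps an image identifier to a predicted ingredient name (stand-in type).
Classifier = Callable[[str], str]

# A sample is (image identifier, step index within the recipe, total steps in the recipe).
Sample = Tuple[str, int, int]


def stage_of(step_index: int, total_steps: int) -> str:
    """Assign a step to a stage by splitting the recipe into thirds (an assumed rule)."""
    if total_steps <= 0 or not 0 <= step_index < total_steps:
        raise ValueError("step_index must lie in [0, total_steps)")
    return STAGES[min(step_index * 3 // total_steps, 2)]


def select_model(models: Dict[str, Classifier], step_index: int, total_steps: int) -> Classifier:
    """Stage-aware model selection: pick the classifier trained for this step's stage."""
    return models[stage_of(step_index, total_steps)]


def curriculum_phases(samples: List[Sample]) -> List[Tuple[str, List[Sample]]]:
    """Curriculum ordering: group samples into training phases, early stage first."""
    grouped: Dict[str, List[Sample]] = {stage: [] for stage in STAGES}
    for sample in samples:
        _, step_index, total_steps = sample
        grouped[stage_of(step_index, total_steps)].append(sample)
    return [(stage, grouped[stage]) for stage in STAGES]
```

    A training loop would then walk through the phases returned by curriculum_phases in order, and at inference time select_model would route each image to the classifier for its stage. Whether later phases should also revisit earlier-stage images is left open here, since the article does not say.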

    She also demonstrated through experiments that combining both approaches into a hybrid method improves recognition accuracy.

    The conference was originally planned to be held in Gold Coast, Australia, but because of COVID-19 it was held in a hybrid format (both in-person and online). Although there were reports in late November that Australia would lift travel restrictions for visitors from Japan, Zhang ultimately had to participate online.