
Home Location Leakage via Weather-Related Social Media Posts

by Akitaka Yamashita, Keishi Tajima

Abstract

We analyze the risk of home location leakage via social media posts about the current weather at the user's location. To quantify this risk, we develop a two-step location estimation method: (1) identifying posts that mention current rain or snow at the user's location, and (2) ranking locations by matching the post timestamps against nationwide precipitation data. To train a post classifier for Step (1), we collect posts including the words "rain" or "snow" from users with known home locations, and automatically label them as follows: if there was no precipitation at the user's home location at the time of posting, the post is not about the current weather at the user's home location; otherwise, it may or may not be about it. Thus, the problem corresponds to Positive-Unlabeled learning under the Selected At Random (SAR) assumption with a known labeling mechanism, where the labeling probability depends on the precipitation rate at the user's location. For Step (2), to avoid bias towards areas with higher precipitation rates, we design a probabilistic model of users' posting behavior and rank locations based on the likelihood that the observed set of posts was generated at each location. Our experiment on X data demonstrates a non-negligible privacy vulnerability: our method successfully identified the home locations of 68% of users with 20 posts about precipitation.
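As a rough illustration of the likelihood-based ranking in Step (2), the following minimal sketch scores each candidate location by how well its precipitation record explains the post timestamps. It is not the model from the paper: the function name `rank_locations`, the 0/1 precipitation flags per time slot, the `noise` parameter, and the two-component mixture are all assumptions made for this example. Dividing by the number of wet slots at each location is one simple way to avoid favoring areas where it rains or snows most of the time.

```python
import math

def rank_locations(post_times, rain, noise=0.1):
    """Rank candidate locations by log-likelihood of the observed post times.

    post_times: time-slot indices at which the user posted about precipitation.
    rain: dict mapping location name -> list of 0/1 flags, one per time slot.
    noise: assumed probability that a post is unrelated to local weather
           (hypothetical parameter, must be > 0).
    """
    scores = {}
    for loc, flags in rain.items():
        n_slots = len(flags)
        n_wet = sum(flags)
        log_lik = 0.0
        for t in post_times:
            # Noise component: the post time is uniform over all slots.
            p = noise / n_slots
            # Weather component: the post time is uniform over this
            # location's wet slots, so frequently wet locations gain
            # less probability per matching post.
            if flags[t] and n_wet > 0:
                p += (1.0 - noise) / n_wet
            log_lik += math.log(p)
        scores[loc] = log_lik
    # Highest log-likelihood first.
    return sorted(scores, key=scores.get, reverse=True)
```

In this sketch, a location where precipitation occurred only at the posting times outranks a location where it occurred in every slot, even though both are consistent with all of the posts.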

Keywords

social network analysis; user profiling; geographic information
Published in Proc. of ACM Conference on Web Science, 6 pages, Braunschweig, Germany, 2026


tajima@i.kyoto-u.ac.jp / Fax: +81(Japan) 75-753-5978 / Office: Research Bldg. #7, room 404