NCRM videos



Lisa Donlan: Identifying lexical diffusion in a large online social network

08-11-2018

Lisa Donlan presents at the methods@manchester Methods Fair 2018.

Abstract: My research utilises a corpus of 1,097,756 posts retrieved from Popheads, a Reddit-based online music community, to analyse the innovation and diffusion of words through digital social networks. However, how can one efficiently identify the words which are diffusing through a network of this magnitude? This investigation tackles this problem by employing a methodology pioneered by Grieve et al. (2017). The method involves calculating the daily relative frequencies (normalised per hundred words) of each of the 150,000 unique words in the corpus. The daily frequency of each word is correlated with the passage of time, and a Spearman rank correlation test is then used to determine whether the word has significantly increased in frequency over the lifespan of the network. Using this method, I have identified more than twenty words which are successfully diffusing through Popheads, including bop (‘a catchy song’), salty (‘rude’), and tea (‘gossip’). The method is also adapted to calculate the speed and the extent to which each word has diffused into the vocabularies of individual community members. This method is adaptable to any time-stamped dataset and could be used to identify emerging trends more generally when applied in other disciplines.
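
The frequency-and-correlation step described in the abstract can be illustrated with a minimal sketch. This is not the speaker's code: the function names (`daily_relative_frequencies`, `average_ranks`, `spearman_rho`, `word_trends`), the whitespace tokenisation, and the four toy posts are all assumptions made for illustration. It computes each word's daily frequency per hundred words and then Spearman's rho between that frequency and the day index. It does not compute a p-value; in practice a library routine such as `scipy.stats.spearmanr`, which returns both the coefficient and a p-value, would likely be used for the significance test.

```python
from collections import Counter, defaultdict
from math import sqrt


def daily_relative_frequencies(posts):
    """Map each day to {word: occurrences per hundred words on that day}.

    `posts` is an iterable of (day, text) pairs. Tokenisation by
    lower-casing and splitting on whitespace is a simplifying assumption.
    """
    counts = defaultdict(Counter)
    for day, text in posts:
        counts[day].update(text.lower().split())
    freqs = {}
    for day, counter in counts.items():
        total = sum(counter.values())
        freqs[day] = {w: 100.0 * c / total for w, c in counter.items()}
    return freqs


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant series: treated as "no trend" (an assumption)
    return cov / sqrt(var_x * var_y)


def word_trends(posts):
    """Return {word: rho between day index and daily relative frequency}."""
    freqs = daily_relative_frequencies(posts)
    days = sorted(freqs)  # assumes day labels sort chronologically (ISO dates)
    vocabulary = set().union(*(freqs[d].keys() for d in days))
    time_index = list(range(len(days)))
    return {
        w: spearman_rho(time_index, [freqs[d].get(w, 0.0) for d in days])
        for w in vocabulary
    }


# Toy, made-up posts: "bop" becomes steadily more frequent over four days.
posts = [
    ("2018-01-01", "this song is good"),
    ("2018-01-02", "this bop is good"),
    ("2018-01-03", "this bop is a bop"),
    ("2018-01-04", "bop bop bop song"),
]
trends = word_trends(posts)
for word, rho in sorted(trends.items(), key=lambda item: -item[1]):
    print(f"{word}: {rho:+.2f}")
```

Words with a rho close to +1 are candidates for diffusion, since their relative frequency rises consistently over time. On a corpus of the size described, a significance threshold and some correction for testing many words at once would also be needed; neither is shown here.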