This summer, I worked with Professor Jonah Berger in Wharton’s Marketing department to investigate how song lyrics and music taste change over time. Using songs from the last 60 years collected from the Billboard charts, I trained topic models that identified key topics present in the lyrics. I then used the topic models to investigate two hypotheses about how song lyrics may change over time. First, I analyzed songs over multiple time periods to test whether songs whose lyrics are more differentiated from their genre become more popular. Results showed that more differentiated songs were associated with higher rankings on the Billboard charts, but the strength of the relationship varied across time periods and topic models. Second, I compared lyrics over multiple periods to test whether new song lyrics imitate the lyrics of the most popular songs. Initial results indicated that there was little difference between the lyrics. Further research is needed to see whether changes to the comparison method, such as adjusting the topic model, could improve results.
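As an illustration only, one way to quantify how differentiated a song's lyrics are from its genre is to compare the song's topic proportions with the average topic proportions of its genre. The sketch below assumes that a topic model has already assigned each song a vector of topic proportions; the cosine-distance measure, the function names, and the toy data are all hypothetical and are not necessarily what was used in this project.

```python
import math


def mean_distribution(dists):
    """Average a list of equal-length topic-proportion vectors."""
    n = len(dists)
    k = len(dists[0])
    return [sum(d[i] for d in dists) / n for i in range(k)]


def cosine_distance(p, q):
    """Return 1 - cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return 1.0 - dot / (norm_p * norm_q)


def differentiation_scores(songs):
    """Score each song by its distance from its genre's average topic mix.

    `songs` is a list of dicts with hypothetical keys 'title', 'genre',
    and 'topics' (topic proportions from an already-trained topic model).
    """
    by_genre = {}
    for song in songs:
        by_genre.setdefault(song["genre"], []).append(song["topics"])
    centroids = {g: mean_distribution(d) for g, d in by_genre.items()}
    return {
        song["title"]: cosine_distance(song["topics"], centroids[song["genre"]])
        for song in songs
    }


# Toy data (made up for illustration; not from the Billboard dataset).
songs = [
    {"title": "A", "genre": "pop", "topics": [0.8, 0.1, 0.1]},
    {"title": "B", "genre": "pop", "topics": [0.8, 0.1, 0.1]},
    {"title": "C", "genre": "pop", "topics": [0.2, 0.2, 0.6]},
]
scores = differentiation_scores(songs)
```

In this toy example, song "C" receives a higher score than "A" or "B" because its topic mix sits further from the genre average. Scores of this kind could then be related to chart rank within each time period.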
From this project, I learned a lot about natural language processing and how it can be applied to study culture and human behavior. In particular, I enjoyed learning how to create and interpret topic models, which I had never done before. Over the course of the summer, I improved my data analysis and coding skills as well as my ability to communicate findings. I also gained a better understanding of what it is like to do research in this field. All in all, I found my summer research experience very rewarding.