Experts using Twitter to follow the flu

by Hannah Beausang

In this hard-hitting flu season, experts are turning to Twitter and crowdsourcing to track the virus.

This flu season, more than 2,000 people have been hospitalized and there have been more than 18 deaths from symptoms associated with the flu.

Brigham Young University conducted a study in which 24 million tweets were collected from 10 million random Twitter users. The researchers looked for recurring terms such as “flu,” “coughing” and “fever.”
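The keyword search described above can be illustrated with a short sketch. It is not the BYU researchers' code: the term list holds only the three words quoted in this article, and the function name `mentions_flu` and the sample tweets are invented for illustration.

```python
import re

# Only the three terms quoted in the article; the study's full list is not given.
FLU_TERMS = ("flu", "coughing", "fever")

# Word boundaries keep "flu" from matching inside words such as "fluent".
FLU_PATTERN = re.compile(r"\b(" + "|".join(FLU_TERMS) + r")\b", re.IGNORECASE)


def mentions_flu(tweet: str) -> bool:
    """Return True if the tweet contains any of the tracked terms."""
    return FLU_PATTERN.search(tweet) is not None


# Invented sample tweets, for illustration only.
sample_tweets = [
    "Home sick with the flu again",
    "She is fluent in three languages",
    "Been coughing all night",
]
flagged = [tweet for tweet in sample_tweets if mentions_flu(tweet)]
```

A match on a term alone does not show that the writer is sick, which is why the researchers also work on filtering out general chatter.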

Location data was collected for roughly 15 percent of the tweets, allowing researchers to get an idea about the distribution and prevalence of the virus. Further location information was gathered from user-generated profiles, which proved to be accurate 88 percent of the time.

Using the combined data, researchers were able to track incidences of the flu at the state level. This tool allows health professionals to follow the virus with more immediacy than the information provided by the Centers for Disease Control and Prevention, which can take approximately two weeks to process.

New language-processing technologies are being developed to weed out “chatter” about the flu, eliminating noise from people expressing concern about the flu and homing in on people reporting actual flu-like symptoms.

Because social media is becoming a prominent part of people’s daily lives, this type of research has the potential to thrive in the future. A 2012 Pew Research Center study indicated 69 percent of online adults use social networking sites and 16 percent of online adults use Twitter.

San Diego State School of Journalism & Media Studies assistant professor Rebecca Nee conducted a study through the Social Science Research Laboratory last fall that reported 30 percent of SDSU students use Twitter and 90 percent use Facebook. Because the technology-reliant younger generation tends to dominate the cyber world of social media, the online chatter is booming.

“College-age students are coming down with the flu and they may be more likely to be public about that,” Nee said. “They might be reporting it more to social media than to their health professionals.”

SDSU geography professor Ming-Hsiang Tsou has been working closely with Anna Nagel, a public health graduate student, since 2010 to use social media and online search engines, such as Yahoo and Bing, to track the flu.

The team has been tracking the flu via Twitter in 31 cities. Tsou says San Diego is among the cities where flu-related tweets correlate most closely with CDC data.
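One common way to measure how closely two weekly series move together is the Pearson correlation coefficient. The article does not say which measure Tsou's team uses, so the sketch below is an assumption, and the function name `pearson` and all the numbers in it are invented for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Invented weekly figures for one city, not real data.
weekly_flu_tweets = [120, 180, 260, 310, 240]
weekly_cdc_cases = [14, 21, 30, 37, 29]
r = pearson(weekly_flu_tweets, weekly_cdc_cases)
```

A value of `r` near 1 would mean the tweet counts rise and fall in step with the official figures.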

The team is creating “word clouds” that aggregate weekly tweets and track word frequency to monitor the top keywords. The word clouds help detect the sentiment of the tweets and analyze their content to cut down on error.
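The word counts behind a word cloud can be sketched in a few lines. This is an illustration under stated assumptions, not the SDSU team's tool: the stopword list, the function name `top_keywords` and the sample tweets are all invented here.

```python
import re
from collections import Counter

# A tiny, invented stopword list; a real one would be far longer.
STOPWORDS = {"the", "a", "i", "and", "to", "of", "my", "is", "in", "have", "with"}


def top_keywords(tweets, n=5):
    """Count non-stopword words across tweets and return the n most frequent."""
    counts = Counter()
    for tweet in tweets:
        words = re.findall(r"[a-z']+", tweet.lower())
        counts.update(word for word in words if word not in STOPWORDS)
    return counts.most_common(n)


# Invented sample of one week's tweets.
week_of_tweets = ["I have the flu", "flu and fever", "fever fever"]
keywords = top_keywords(week_of_tweets, n=2)
```

A word cloud then draws each keyword at a size proportional to its count.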

“We want to create a linkage between cyberspace and real space,” Tsou said. “We think all the activity, chatting and buzz will reflect some level of truth in the real world.”

In the future, the team hopes to develop methods of linguistic analysis to examine writing styles in order to categorize tweets by age group. The team also expects to differentiate gender by user names and profile information to better understand the data.

“This is much more efficient than conducting a survey asking how you are feeling about the flu,” Tsou said.

Twitter has been successful in monitoring disasters such as hurricanes Katrina and Sandy and tracking major events, such as the recent elections.

Although Twitter is a large source of data collection for this type of tracking, researchers also look at Internet search engine data.

To help monitor the flu, Google has created Google Flu Trends, a global flu-tracking system. HealthMap, another flu tracker from Boston Children’s Hospital, looks at online news about the flu to track outbreaks, and Flu Near You is an online project that collects weekly reports of illnesses.