
New Tools to Measure Our Mood and Predict the Future: Chris Hansen at TEDxMileHigh (Full Transcript)

Chris Hansen – TRANSCRIPT

Take a moment to consider the economy in which we live. The global economy, the US economy, the Colorado economy. The connections between all the different parts. It’s kind of a mind-boggling amount of complexity. The US produces about 20 trillion dollars of goods and services every year, and those are just the ones we count; we miss plenty along the way. I’ve been fortunate to work in my career on some pretty big, complicated models to try and make sense out of that: two, three, four thousand variables. But what becomes clearer the further you dive into that type of work is that it gets more and more difficult with every variable you add.

Here’s a pretty preeminent economic forecaster you may have heard of, Alan Greenspan, in charge of the US Federal Reserve for many years. He was being interviewed by one of my favorite newsmen, Jon Stewart, and he said basically, “Look, I can do all of these fancy equations, I can get all these variables, but if you could just tell me how people feel, how they are reacting to the world around them, what’s their emotional mood that day, I could really start to make sense out of the economy.” And this was kind of the challenge: I was on a team that just decided to take this on. Can we start to figure out how does the country feel today? How does Colorado feel today? And so, this is the work that we started to dive into, and I’m happy to share today with you a few of the things that we’ve started to discover.

So, the first thing that I often get asked is, “Well, if you’re using things like Twitter data, social media data, to figure out how we all feel, and come up with this index of the mood of the nation, isn’t that just polling? Why don’t you just call people on the phone and ask them how they feel, and then put it all together?” Well, there are a few problems with that. One, raise your hand if you still have a land line at home. That’s going to be highly correlated to age. I just want to tell you I got rid of my land line 15 years ago; probably never going back.

It’s a big problem for pollsters, right? They are trying to call people, trying to get in touch with the people, and get a scientific sample, and that’s great. We need scientific polling to answer lots of important questions, but it’s getting more and more difficult to do that work. The other thing that is really hard about polling is this idea of the Hawthorne effect, meaning if you know you are being observed, you change your answer. And this goes back to turn of the century time clock studies in factories in Massachusetts, for the historians in the room. If you’re being watched you’re going to work a little bit faster, right? If you’re getting a call from a pollster, and he says, “How do you feel today?” You might say, “Well, you know, I’m doing all right, I’m doing OK.” And so, the results can get a little bit skewed.

What we were asking was, “OK, what if we could passively monitor people just by the words that they are using on social media, and figure out their mood by the way they were using language?” And that’s exactly what we were trying to do with this project. Now, there are some problems on the social media side, too, right? I mean, it’s nice because it’s immediate: every millisecond, there are thousands of tweets being sent. In fact, about six hundred million tweets a day now around the globe. I’m sure there are a thousand being sent as I’m speaking, right now.

Right? Everybody is live tweeting? You’re being watched; that’s the take away from this talk. But if you add up all of these Twitter users around the globe, you can get this really instantaneous feedback on how people are feeling. But there is a problem on that side because what if you are oversampling, if you are counting too many people that do not represent the full population? Well, when Twitter started, that was a huge problem, right? It was the 25-year old white guys in San Francisco; they were the only ones tweeting. The good news is, since then, the Twitter use in the US, around the globe, and places like Western Europe, places like Saudi Arabia, the user base has increased so much that now, we have a sample that looks a lot like the rest of the population.

So, here in 2009, you can see a kind of heat map of the predominance of male use on Twitter. In the US right now, it’s actually 51% female, 49% male. So this is almost perfect. It is good we’ve got over-representation from women. That probably makes it a better sample. The other problem that comes up is around ethnicity. What if there are too many Caucasians or too many of whatever group? The other good news is that Twitter, in the US in particular, now looks basically the same as the percent in the general population. So, this data is getting better every single day. The other thing that’s happening is that we’ve got global use. These charts are a little bit tough to read, but the red line, which is US and Canadian data, as a percent of the total, is going down. It used to be 67 to 80%, now it’s less than a third of all tweets sent from Canada and the US. So it’s a democratizing user base around the globe.

So, our signal that we can tap into to make these measurements is getting better and better every day. With a good data set and some great scientists I work with, we started to dive into this idea of understanding sentiment. Now, how do you get your hands around this? Let me give you a tangible example. Let’s say I give you the word “home” versus the word “house.” Which one is warmer? Which one is more positive? You tweet out, “I can’t wait to get home” versus “Hey, I’m headed back to my house.” Home is a warmer word, right? And let’s repeat that process for tens of thousands of words in the English language, and Spanish, and Italian, and German, etc. That’s how we can start to build up this map.
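The word-by-word idea above can be sketched in a few lines of Python. This is only a minimal illustration, not the team's actual model: the `VALENCE` table, its scores, and the `score_tweet` helper are all hypothetical, invented for the example.

```python
import re

# Hypothetical valence scores on a 1 (negative) to 9 (positive) scale.
# These numbers are invented for illustration, not measured values.
VALENCE = {"home": 7.5, "house": 6.0, "happy": 8.2, "sad": 2.1}

def score_tweet(text, lexicon=VALENCE):
    """Return the mean valence of the lexicon words in `text`, or None."""
    words = re.findall(r"[a-z']+", text.lower())
    scores = [lexicon[w] for w in words if w in lexicon]
    return sum(scores) / len(scores) if scores else None

home = score_tweet("I can't wait to get home")
house = score_tweet("Hey, I'm headed back to my house")
```

With these made-up scores, the "home" tweet rates warmer than the "house" tweet, and a tweet with no known words returns `None` rather than a misleading zero.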

The other thing that we needed: it turns out, that approach only gets you about 50% of the way there. We also started using emoticons. It turns out, about 5% of tweets, give or take, depending on the country, use a frowny face, a smiley face, an emoji of some sort. And then we can start to map the words that are next to those emoticons.
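One way to picture the emoticon step is to treat each smiley or frowny face as a label for the words around it. The sketch below is an assumption about how that could work, not the method used in the project; the emoticon sets, the sample tweets, and the `emoticon_word_scores` helper are hypothetical.

```python
from collections import Counter

# Small, hypothetical emoticon sets used as positive/negative labels.
POSITIVE = {":)", ":-)"}
NEGATIVE = {":(", ":-("}

def emoticon_word_scores(tweets):
    """Score each word from -1 to +1 by the emoticons it appears with."""
    pos, neg = Counter(), Counter()
    for tweet in tweets:
        tokens = tweet.lower().split()
        has_pos = any(t in POSITIVE for t in tokens)
        has_neg = any(t in NEGATIVE for t in tokens)
        if has_pos == has_neg:
            continue  # no emoticon, or mixed signals: skip this tweet
        target = pos if has_pos else neg
        target.update(t for t in tokens if t not in POSITIVE | NEGATIVE)
    return {w: (pos[w] - neg[w]) / (pos[w] + neg[w])
            for w in set(pos) | set(neg)}

sample = [
    "heading home :)",
    "stuck in traffic :(",
    "home at last :)",
    "traffic again :(",
    "no emoticon here",
]
scores = emoticon_word_scores(sample)
```

Words that only ever appear next to smileys score +1, words that only appear next to frowns score -1, and tweets without an emoticon contribute nothing.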

And with those two techniques, plus some adjustment for things like sarcasm, and irony, and cuss words, by the way, cuss words are the toughest things to figure out. There is a lot of different ways to use cuss words in the English language. But if we start to adjust for some of these things, we can build up this really great map of how people are expressing themselves on social media. And this slide just shares a little bit about how we were testing that. We were comparing human raters to our machine raters, and the more we did that and fine-tuned the model, we basically got those two results to match up.
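A simple way to check how closely machine scores track human ratings is a correlation coefficient. The sketch below uses Pearson correlation as one plausible choice; the talk does not say which measure was used, and the ratings here are invented for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical ratings of the same five tweets (1 = negative, 9 = positive).
human = [7.0, 2.5, 5.0, 8.0, 3.0]
machine = [6.5, 3.0, 5.5, 7.5, 2.5]
agreement = pearson(human, machine)
```

A value close to 1 would mean the machine ranks tweets much as the human raters do; fine-tuning the model would aim to push that number up.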

So it gave us good confidence that our algorithms and our machine approach were headed in the right direction. Our first version, from 2010, is called “The Pulse of the Nation” – you can look this up on YouTube if you want – kind of this nice 24-hour view of how the US feels, state by state, kind of our first dry run at measuring the mood of the nation. We also tested things like, “Well, I bet people in New Orleans are happy during Mardi Gras. Let’s check that out.” So, as you can see, Louisiana is dark green, which means their mood was above average the week of Mardi Gras. What a surprise! It turns out people in Massachusetts were kind of bummed out; maybe it was one of those snow storms. I knew there was a reason I left Boston.

All right, so here it is: put it all together, 2012 to 2015. Here is a snapshot of the mood of the nation. The black line is kind of a squiggle, we’re kind of getting happier as we’re coming out of the Recession, things looked pretty good last year, a little bit of doubt creeping in now, in the early part of 2015. The good news is for everybody in the room: Colorado, it turns out – the blue line here – we were a little happier than the rest of the nation.  Surprise, surprise.

I’ll share with you in just a second a time when that was not the case, but hold that for just a moment. Oh, look at that. Oh, it’s tough, a tough day. Any Broncos fans in the room? All right, so what’s going on here? Let’s take a closer look at this. This is basically the end of 2013 to February of 2014, and I’ve zoomed in here: the black line is all of the United States, the blue line is Colorado again, and I added Washington, right? Yeah, it’s good for comparison.

So, everybody is pretty happy at Christmas, pretty happy at New Year’s, then we’re all kind of bummed out because we have to get back to work the next day. And things kind of float along on average, and then you can see the Colorado and Washington lines start to go up. We are all getting pumped up, get ready for the Super Bowl parties. And then the first quarter happens and Colorado drops like a rock. You can see the blue line falling down there.

The good news is it does not hang around for too long. There’s this idea of mean reversion in the data, meaning we kind of go back to normal. And that’s great, we weren’t bummed out too long. We went back to our jobs and got ourselves back together. And then, by Valentine’s Day everybody was happy again; see here at the end of the graph.

Why does this matter? I mean, what can you do with this? This is kind of a fun example, and paints a little upset here, but what can you actually do with this data? There are some really interesting correlations that we’ve been working on, doing a lot of econometrics to understand how the sentiment, the mood of a nation, affects the way that we spend our money. I started with this complex economy that we are in the middle of. It turns out, when we’re bummed out, we buy things that make us feel better: chocolate, beer, lipstick, nail polish. I mean there’s this whole basket of goods called hedonistic goods, that we buy more of when we are bummed out. And it turns out the opposite is true: when we are feeling good, we are more likely to buy durable goods: couches, and refrigerators, and new cars.

So, you can imagine all the ways you can start to use this, right? You want to manage your supply chain, you’re the manager at a store, and you want to change what promotions you have available. Lots of interesting case studies start to flow out of this type of approach. The other thing that this data is interesting for is public policy issues, like fracking, hydraulic fracturing, in Colorado. Really controversial issue, right? We’ve had different ballot measures at the local level, looking closely at how this is regulated. So, we were taking a look here, at some data from 2013, as the controversy was heating up.

And you can see a big volume of tweets in the spring of 2013 – the green line – and then the blue line is the sentiment associated with those tweets. And, lo and behold, as we got into the fall, all five of the local ballot measures to ban hydraulic fracturing passed. So we know that there’s a correlation between how people are discussing public policy issues, their sentiment about those issues, and things like tangible results at the ballot box. This is a piece of analysis we did following up on that, where we started to look at who were the influential people in that debate. And this is where you go the step beyond the sentiment analysis and you do something called network analysis.

We understand the degrees of Kevin Bacon, do you guys know that game? Who’s connected to whom, whose messages are spreading in a viral wave throughout social media. And as we understand that network, we can start to make sense of who the influencers are. It’s that next level of insight that we can start to bring to this type of analysis. So, step one: get all these tweets. Step two: measure the sentiment of those tweets. And then, step three: move into the network analysis to figure out who matters, who doesn’t, who I should pay attention to, and whom I might, or might not, have to worry about.
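The third step can be sketched with a very simple influence measure: how many different people retweeted each author. This is just one assumed proxy for influence, far cruder than a full network analysis; the user names, the retweet pairs, and the `rank_influencers` helper are hypothetical.

```python
from collections import defaultdict

def rank_influencers(retweets):
    """Rank authors by how many distinct users retweeted them.

    `retweets` is an iterable of (retweeter, original_author) pairs.
    """
    audience = defaultdict(set)
    for retweeter, author in retweets:
        if retweeter != author:  # ignore self-retweets
            audience[author].add(retweeter)
    # Largest audience first; ties broken alphabetically for stable output.
    return sorted(audience, key=lambda a: (-len(audience[a]), a))

# Invented retweet pairs for illustration.
retweets = [
    ("bob", "alice"), ("carol", "alice"), ("dave", "alice"),
    ("alice", "erin"), ("bob", "erin"),
    ("bob", "alice"),  # a repeat retweet adds no new audience member
]
ranking = rank_influencers(retweets)
```

Counting distinct retweeters rather than raw retweets keeps one very active fan from making an account look more influential than it is.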

The other nice thing about this is that it’s also very useful in some of the global hotspots around the globe. You may remember a couple of years ago a big protest going on in Istanbul around Taksim Square. We took a look at that data, we were using the geolocated tags that go with those tweets, and we know exactly where those tweets were sent and exactly the time they were sent. And we could start to track where are the protest tweets being sent from.

Here you can see: we zoomed in on Istanbul, and we could literally find the street corners. What a great tool if you’re trying to understand what’s going on around the globe. You can find those hotspots, and if you’re watching the Twitter signal closely, and understanding how the sentiment is changing, and understanding how the words that people are using are changing, you can also identify if the situation is going to blow up, if it’s going to take off in a protest. Here’s an example from Turkey where you can see here on a log scale that the signal just goes straight up from about a couple of hundred tweets to almost 10,000; almost 100,000 on the third day. This story did not even show up in the Turkish media until 24 hours later.

And it did not show up in the English language press for another 24 hours. So, here we’ve been able to develop an early tripwire on hotspots around the globe. And then, as I mentioned, we could take these same techniques and do the network analysis, and start to figure out who’s working together, who’s communicating, who’s important to these protests, which groups do we need to pay attention to. And some great results coming out of the Turkey example, where we actually saw some of the young up and coming college groups straight out of the university started cooperating with some of the more conservative religious groups in Turkey. And suddenly, they started cooperating to push back on government policy, in this particular case.
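The early tripwire described above could be approximated by watching for order-of-magnitude jumps in daily tweet volume. This is a minimal sketch under that assumption, not the detection method used in the project; the daily counts, the threshold, and the `find_spikes` helper are hypothetical.

```python
import math

def find_spikes(daily_counts, min_log_jump=1.0):
    """Return indices of days whose volume rose by at least `min_log_jump`
    orders of magnitude (base 10) over the previous day."""
    spikes = []
    for day in range(1, len(daily_counts)):
        prev, curr = daily_counts[day - 1], daily_counts[day]
        if prev > 0 and curr > 0:
            jump = math.log10(curr) - math.log10(prev)
            if jump >= min_log_jump:
                spikes.append(day)
    return spikes

# Invented daily tweet counts, loosely shaped like the log-scale chart.
counts = [180, 220, 9500, 98000, 60000]
spikes = find_spikes(counts)
```

Working in log space is what makes a jump from hundreds to thousands stand out as clearly as a jump from thousands to tens of thousands.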

So again, a real-time analysis of what’s happening on the ground, how coalitions are forming, how people are feeling about a particular issue. I leave you really with one final picture: here’s the map of the Twitter universe. We know what the US map looks like, and the global map.

Here is what the map of connections on Twitter looks like. You see these clusters forming in North America, English language clusters, Spanish language clusters, the tiny dots on the edges, maybe that’s Lapland in January. But we’re really interconnected in a way that is just taking off at a very rapid pace. And I love this final picture because it shows how we are connected together. And as we do a better job of listening to those messages, and those signals that we are sending each other, we can get a new lens, a new and better understanding of what’s happening in our world, what’s happening in our economy, and what’s happening on the street corner.

Thanks for sharing your afternoon with me.
