Deepak: What you see here is a small slice of my life. It’s a picture of five years of emails that I have sent to my contacts, from 2008 to today. Each bar represents six months. Clearly, something happened in the second half of 2010. Something significantly changed in my communication pattern.
A friend of mine, with whom I shared an office at the lab, happened to see this. He was both concerned and curious. He knew I had just started a job around that time, so I wasn’t on vacation. And that led him to ask a very direct question: “Hey Deepak, did you go through a very difficult personal situation?” My friend was right. That was a time when an important personal relationship had ended for me. After that, my circle of friends changed, and I went through a long period of self-reflection.
Now what’s odd is that my friend knew nothing of my personal history that would have prompted this question. And the only parameter that was used to construct this picture was the date field in my emails. The date field is part of email metadata. And metadata is what we want to talk to you about today.
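As a rough illustration of how a picture like that can come from the date field alone, here is a minimal Python sketch that counts sent emails per six-month bucket. The function names and the sample dates are hypothetical, invented for this example; this is not the code behind the chart shown in the talk.

```python
from collections import Counter
from datetime import datetime


def half_year_bucket(sent: datetime) -> str:
    """Label a timestamp with its six-month bucket, e.g. '2010-H2'."""
    half = 1 if sent.month <= 6 else 2
    return f"{sent.year}-H{half}"


def emails_per_half_year(sent_dates):
    """Count sent emails per six-month bucket using only their dates."""
    return Counter(half_year_bucket(d) for d in sent_dates)


# Hypothetical sample dates, invented for illustration only.
sample_dates = [
    datetime(2010, 3, 14),
    datetime(2010, 5, 2),
    datetime(2010, 9, 30),
    datetime(2011, 1, 8),
]

counts = emails_per_half_year(sample_dates)

# Print one text bar per bucket, in chronological order.
for bucket in sorted(counts):
    print(bucket, "#" * counts[bucket])
```

A sudden drop or spike in one bucket is the kind of change in communication pattern described above, visible without reading a single message.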
Daniel: So what exactly is metadata? Metadata is data generated from the interactions you have with other people and organizations, as you use technology. In a personal context, it’s about who you call on the phone and when. It is about the time and location where you have swiped your credit card. It is about the recipients and the time of every single email you have exchanged.
So, Deepak and I are graduate students at the MIT Media Lab. And since we work with data, a natural question for us was: what can we learn from it?
We realize that understanding and appreciating metadata is difficult for most people, because the interfaces we use to interact with our data are shallow and repetitive. Take email clients as an example.
For the last several decades, our emails have been presented to us as a time-ordered list. Every single day, we get the same view with just a new set of emails replacing old ones. But the number of emails that we’ve sent and received over many years is far larger than what that view shows. And as we send and receive these thousands of emails, we leave behind unique digital traces. So we realized what we lack are tools that can help us revisit and learn from our own digital trail.
Deepak: Speaking of trails, let’s go on a road trip together. And imagine that on this road trip, this is all that we get to see. That’s a view of the road right beneath us. Now, there are other ways of looking at the road trip too. For example, that view. But there’s another one, like this one.
Now if I asked you which one you prefer, most of us would say the second and third ones. Why? Because they provide us with perspective and context. They tell us where we come from, where we are at the moment, and where we’re headed. And a digital trail like the one Daniel was talking about is no different. Frustrated by getting to see only the road when it comes to email, we took a small step towards solving this problem. We created a tool called Immersion, that combines, analyzes and visualizes your email metadata.
You see, Immersion only looks at information above the subject line of emails. Which means it looks at only the From, To, Cc, and the timestamp, never touching the subject or the body content of emails. When we built this project, one of the philosophies we had was to center it around people and the social links that are found around people, rather than around the ordering of timestamped messages.
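To make "above the subject line" concrete, here is a small Python sketch that reads only the From, To, Cc and Date headers of a raw message using the standard library `email` package. The function name `extract_metadata` and the sample message are assumptions made for illustration; the talk does not describe how Immersion itself is implemented.

```python
from email import message_from_string
from email.utils import getaddresses, parsedate_to_datetime


def extract_metadata(raw_message: str) -> dict:
    """Return only sender, recipients and timestamp of a raw email.

    The Subject header and the body are deliberately never read.
    """
    msg = message_from_string(raw_message)

    def addresses(header: str) -> list:
        # getaddresses yields (display name, address) pairs; keep the address.
        return [addr.lower() for _, addr in getaddresses(msg.get_all(header, []))]

    return {
        "from": addresses("From"),
        "to": addresses("To"),
        "cc": addresses("Cc"),
        "timestamp": parsedate_to_datetime(msg["Date"]),
    }


# Hypothetical message, invented for illustration only.
raw = (
    "From: Daniel <daniel@example.org>\n"
    "To: Deepak <deepak@example.org>\n"
    "Cc: cesar@example.org\n"
    "Date: Mon, 01 Jul 2013 09:30:00 +0000\n"
    "Subject: never read\n"
    "\n"
    "Body, also never read.\n"
)

meta = extract_metadata(raw)
print(meta)
```

The returned dictionary has no subject or body key at all, so anything built on top of it can only ever see who wrote to whom, and when.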
And privacy was very important to us. So any user of Immersion has the freedom to delete any of their metadata that is collected by the tool. And Immersion has the power to transform the raw metadata that you see in emails into a visual form that can reveal much more than what you see today through your email clients. We’re going to show you what Immersion looks like right now.
Daniel: And we’re going to show you what it looks like using just my metadata. Immersion represents my contacts as circles. And simply by counting the number of emails exchanged with each person, it sizes the circles accordingly. Now, email conversations between multiple people are represented as lines between circles. So seeing Deepak and César connected by a line means that the three of us had conversations as a group.
Now let’s focus on my relationship with Deepak. When I click on his circle, I can immediately see the people that Deepak and I have been in contact with together. And the thicker the line between Deepak and another person, the more emails were exchanged between the three of us. The person that stands out in my relationship with Deepak is our advisor, César, for obvious reasons.
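The circles and lines just described can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: each email is reduced to the set of people on it, circle size is the number of emails shared with a contact, and line thickness is the number of emails that include both contacts. The function `build_network` and the sample data are hypothetical and are not Immersion's actual code.

```python
from collections import Counter
from itertools import combinations


def build_network(emails, owner):
    """Derive circle sizes and line thicknesses from email participants.

    emails: iterable of sets, each holding everyone on one email
            (sender plus To and Cc recipients).
    owner:  the person whose mailbox this is; they are on every email,
            so they are left out of the picture.
    """
    node_size = Counter()    # contact -> number of emails shared with owner
    edge_weight = Counter()  # (contact, contact) -> emails including both

    for participants in emails:
        contacts = sorted(set(participants) - {owner})
        node_size.update(contacts)
        # Every pair of contacts on the same email gets a (thicker) line.
        edge_weight.update(combinations(contacts, 2))

    return node_size, edge_weight


# Hypothetical mailbox, invented for illustration only.
emails = [
    {"daniel", "deepak", "cesar"},
    {"daniel", "deepak", "cesar"},
    {"daniel", "deepak"},
    {"daniel", "cesar", "alex"},
]

node_size, edge_weight = build_network(emails, owner="daniel")
print(node_size)
print(edge_weight)
```

Sorting the contacts before pairing them keeps each line stored under a single key, so `("cesar", "deepak")` and `("deepak", "cesar")` are never counted separately.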
Another thing is that Immersion can show me how my relationship with Deepak evolved over time. If you take a look at the histogram on the right, it’s very easy to see when I first met Deepak, when we started working together, but also when we became very close, when we launched the tool. It was when things got crazy: that tall bar over there.
One other thing is that, if I click on César now, I can actually see who he has introduced me to. And that’s just by metadata. And I can see that there are plenty of people in that list, so he’s definitely helped me expand my network. I learned something there. What is also apparent is that this network spatially organizes the different groups of people.
Another thing is that, if you take a deeper look, you can actually reveal the social circles in your life, and this tool represents them using different colors for each community. What you see here, the green and the orange, even though they look like one group, they’re actually two groups. My undergrad friends are tightly interconnected with my previous research group back home.
But there is another dimension. And that is time. Things change over time. And Immersion can reveal what my network looked like before and after I joined the Media Lab. And it’s interesting how much someone’s life can change. New connections appear, some old ones unfortunately decay over time, but some of them thrive.
Deepak: All of these insights that Daniel has mentioned, he learned from his own metadata. Can we push it a bit further? Can we learn something personal about other people in his network, using only his data? Well, it turns out you can. Daniel is playing with his data one day, and he comes to me with two graphs that we’re going to share with you today. The first one actually shows you the interaction pattern of our advisor César with Daniel, aggregated over every hour of a day. Clearly, he leads a very disciplined life, because between midnight and 7am, he isn’t sending many emails.
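The hourly aggregation behind a graph like this can be sketched as follows. It is a minimal Python sketch, assuming we already have the timestamps of the emails one contact sent; the function name and the sample timestamps are hypothetical, and the hours are taken as recorded, with no time zone adjustment.

```python
from collections import Counter
from datetime import datetime


def emails_by_hour(timestamps):
    """Return a 24-element list: emails sent in each hour of the day."""
    counts = Counter(ts.hour for ts in timestamps)
    return [counts.get(hour, 0) for hour in range(24)]


# Hypothetical timestamps from one contact, invented for illustration only.
sent = [
    datetime(2013, 5, 6, 9, 15),
    datetime(2013, 5, 7, 9, 40),
    datetime(2013, 5, 7, 14, 5),
    datetime(2013, 5, 9, 23, 50),
]

hourly = emails_by_hour(sent)

# Print one text bar per hour so quiet periods stand out.
for hour, count in enumerate(hourly):
    print(f"{hour:02d}h {'#' * count}")
```

A long run of empty hours overnight is exactly the kind of pattern that says something about a person's routine, even though the data comes from someone else's mailbox.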
Now compare this with my interaction pattern with Daniel. And that’s what it looks like. It’s all over the place, I clearly need to work on that. But you see, this implicit exposure of personal information of other people through our data is something we should be very mindful of. We have been working on this project for a few months, and the launch just happened to coincide with this global debate about metadata. While we can’t claim that all forms of metadata are powerful, we were definitely surprised at just how much email metadata can reveal.
We’ve had different kinds of reactions from people. Over 600,000 have used the tool, and here are some of the reactions. For example, WikiLeaks says: “Wanna see what NSA ‘metadata’ really means?” To which there was a really interesting response: “You know, trusting three MIT hackers with full access to email is probably not a good idea.” But we’ve got the whole gamut of responses.
For instance, there’s another guy who says: “8.5 years worth of love, play and work, represented as a network graph.” That’s quite fulfilling for us. You can try out the tool for yourself, and see what your Immersion network looks like, by visiting this particular link. What we’ve shown you today is only the tip of the iceberg.
Truth be told, there is a lot more data in emails that we can analyze. And there are many more methods that we can use. We showed you how metadata can be used as a tool for self-reflection, to learn more about ourselves, improve our choices, and strategize who we connect with. However, we should also be mindful that any third party who has access to our data, be it the company that holds the data or a government agency, can learn plenty about us from it.
With ownership of data comes power, and this is power that we should treat very responsibly. A Buddhist monk’s words to Richard Feynman come to mind, words that really help encapsulate this double-edged nature of metadata. “To every man is given the key to the gates of heaven; the same key opens the gates of hell.” Thank you.