
Computer Imaging: Dan Connors at TEDxMileHigh (Full Transcript)

Dan Connors – TRANSCRIPT

I’m going to start off today with a very important question, and that is what would you do with infinite computing power? I’m not asking what society or the government is going to do, I’m asking you a personal question: what would you do with infinite computing?

As a professor at the University of Colorado, this is something I’m very close to, and it’s actually how I start off each year with my students in “Computer Architecture: the Foundations of Computer Design.” I pose that question to them, and just like you, maybe they’ve already thought about what their answer is going to be.

Maybe you want to use computers to make money by somehow predicting the stock market; maybe that was your first guess. Maybe your second is that you would decrypt all encrypted material, so that there are no secrets out there. While this question isn’t meant to be philosophical for my engineering students, it quickly surfaces many parallels in real life. The first thing we point out is that this is no longer a hypothetical question, or at least we in our field should be aware that it isn’t.

So if I look at the annual report from the Top 500, the listing of the 500 fastest supercomputers in the world: N equals 1 is the fastest, N equals 500 is the slowest of the 500. What’s very interesting is how we measure supercomputers: in calculations per second, the number of math calculations per second, which is the measurement we typically use to define the fastest computer. And what’s fascinating about supercomputers is equally fascinating about where we’re at now in portable electronics: your laptop, your notebook, your iPad. If we look at 2013 and look back 20 years to 1993, the level of computing that cost millions of dollars then, you can now have for 500 dollars, all within a hand’s reach.

So this brings up right away that this question isn’t even about the future, about exascale computing, a billion billion operations per second. It’s right now that we have to answer this question. What happens in other fields, and computers aren’t often given enough credit here, is that our society is more reactive to biology and chemistry. We have already started to regulate the cloning of human tissue, which is very important in our society, but there are also dangers of biological agents where we would say no individual person should be in possession of these things.

Likewise in chemistry: there are chemical compounds that you do not want loose in society. Right now you cannot go to a drug store and buy 100 boxes of Sudafed, because there’s a chance you might go and make meth, so we prevent that. The same goes for hazardous chemical compounds; we don’t want them in the hands of individuals. But what about computers? How much computing power should an individual have? What would you do with it? Very often you think about computers as being dangerous only when you get a virus, when your computer has been hacked and your data is stolen. But while that might feel personal, you aren’t yourself affected as far as your body.

We’ll see if that’s about to change. So where is the area that is absorbing most of all this computing power that’s come about? I’m going to point you to computer vision, things you take for granted today: the cheapest of cell phones can probably draw a box around the faces of your family and friends when you take a picture. It might be a free cell phone that you didn’t even have to pay for. This is part of the overarching view of computer vision, which is how to find the objects of interest, or information of interest, in images or in video. While this is just the start, we can also extend it to other areas, for instance, to the context of a photo.
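That box-around-faces feature comes down to scanning a detector across the image and marking where it scores best. Here is a toy sketch of that scan-and-score loop, using simple template matching in Python with NumPy. The image, template, and scoring here are illustrative stand-ins, not a real face detector (real detectors use learned features), but the sliding-window idea is the same:

```python
import numpy as np

def find_best_match(image, template):
    """Slide `template` over `image`; return the top-left corner of the best match.

    Scoring is sum of squared differences (lower is better). A real face
    detector scores windows with a learned model instead, but the
    scan-and-score loop is the same idea.
    """
    ih, iw = image.shape
    th, tw = template.shape
    best_score, best_pos = float("inf"), (0, 0)
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            patch = image[y:y + th, x:x + tw]
            score = np.sum((patch - template) ** 2)
            if score < best_score:
                best_score, best_pos = score, (y, x)
    return best_pos  # draw the box at (y, x) with size (th, tw)

# Tiny demo: hide a 2x2 "face" inside an 8x8 image and find it again.
image = np.zeros((8, 8))
face = np.array([[1.0, 2.0], [3.0, 4.0]])
image[5:7, 3:5] = face
print(find_best_match(image, face))  # (5, 3)
```

A phone draws its box by running exactly this kind of search, just with a far better scoring function and a lot of hardware acceleration behind it.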

So rather than just a single object, we might want to detect what the scene contains: not just face detection and face recognition, but whether there are buildings, whether it’s a cloudy day, anything in that context of information. With this, you start to think computers are catching up to humans, that they are just going to mimic what we can already do. If I showed you this without the labels, you would be able to tag each of these by subject matter.

When you try to equate a brain and a circuit, you look for some analogy. Is a synapse the same thing as a wire? Is a neuron the same thing as a transistor? How many transistors make up a neuron? We ask because we want to know when computers are going to be capable of artificial intelligence, of doing everything that humans can do. I want to dispel this right away: computer vision is different from human vision in two ways. First, the development process. We as humans develop as infants looking up at faces, and we’re used to looking at a very small number of objects. That patterns us in a certain way.

Likewise, the math that computers do for computer vision is completely unrelated to how we ourselves do visual interpretation and perception. So don’t think that the brain and the circuit are going to merge into one, or that they’re going to be equivalent. Specifically, if you’re looking at this picture, there are four concentric rings that are not overlapping. You might get a little dizzy trying to prove that to yourself, OK? This again is a way that humans have developed their own intelligence, their own visual understanding: computers have no problem looking at this picture and saying there are four rings patterned by smaller squares or rectangles slightly shifted.

But we have issues with it, because we weren’t trained from birth to look at these types of images. At the same time, if you look at the bounds of computer vision, we also use the farthest frontier of what computers can do to help us in our daily life, and here not by enabling computers but by disabling them from doing things. All of you have probably filled out a captcha; you might not have known it was called a captcha, a challenge-response test. If you’ve lost your password or tried to sign up for tickets online, the site wants proof that you’re a human, because somewhere somebody could write a program to just log into the system, and we need some barrier.

So we are falling back on computer vision to be that barrier: we’re using it for ourselves, and we’re also using it as a barrier against ourselves. What you should have seen over the last five years is that these captchas are getting harder and harder and harder. So if computer vision is a race, computers are running as fast as they can, constantly evolving. There are some very important concepts in this.

So first of all, what will be some of the effects that we see in society very soon? Well, you’ve probably heard about a number of projects from Google, as well as Volkswagen, to do autonomous vehicles. This is the notion of not just taking a single camera, but taking cameras and computers, putting them on board a vehicle so that we can actually do safe autonomous navigation. We don’t have to worry about distracted drivers, we don’t have to worry about impaired drivers in society. This is definitely a positive, and just like all positives, we think about the negative. We think there’s going to be a large government agency that’s able to roam the skies.

We’re worried about what’s above us, and about the government being that entity, because the government clearly has billions of our tax dollars to spend on these things, and that’s what it would take. So while you’re thinking about what’s up top, I want you to understand there’s plenty to think about down at the bottom. You’ve probably also heard about Google’s recent venture into augmented reality with computer vision. This is Sergey Brin of Google demonstrating how they want to compact that idea of computer vision: it’s not out there in the clouds, it’s not in a government lab, it’s a personal element of your daily life.

This actually matches a lot of the answers my students have given over my last 13 years of teaching: they said they wanted to know everything. They wanted never to forget a name, never not know a fact about something, and this might be that avenue. And as you see, society has started to react, in that Google Glass has been banned at casinos, where, as you rightly would think, it could be used to count cards. But if you think computer vision is just the notion of counting cards, well, at the same time, you couldn’t walk into these casinos and pull out a score card to keep track of the aces and kings. The casino boss would ask you to leave.

So it’s not the case that this is just doing what you could already do, or what computer vision could lend to you. This is where I want you to understand. This is work from MIT, published at SIGGRAPH in 2012: it takes small temporal and spatial differences between the frames of a video and magnifies them, so this is doing what humans actually cannot do, but computers can. If we look at this, these are just stills on the left and the right side.

This opens up to what should be a video. Think of it as a test case: he is standing still, he is breathing, he’s not blinking. He’s acting as a case example. Now, looking at this, you might not notice any small differences; you might even think this is a still image frame right here.

But now we’re going to pass this through a computer and magnify those small differences that we can’t see. What you have here is his blood rushing in and out of his skin, something we cannot detect temporally. If we’re looking at computer vision, one of the takeaways I want you to have is that computers are doing work that cannot be done by humans. They’re not trying to mimic human understanding or vision; it’s well beyond that. And in case you think this is just one controlled view: this again is eight seconds of video.
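The magnification being described (Eulerian video magnification) works pixel by pixel: take each pixel’s intensity over time, isolate a temporal frequency band, amplify it, and add it back. A minimal one-dimensional sketch in Python with NumPy, where the pixel values and the tiny 1.2 Hz “pulse” flicker are made-up illustrative numbers (the real method also works over a spatial pyramid of the whole video):

```python
import numpy as np

def magnify(signal, alpha, lo, hi, fps):
    """Amplify one temporal frequency band of a per-pixel intensity series.

    Sketch of the Eulerian idea: bandpass the temporal signal,
    scale it by alpha, and add it back to the original.
    """
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)
    spectrum = np.fft.rfft(signal)
    band = (freqs >= lo) & (freqs <= hi)        # keep only the chosen band
    filtered = np.fft.irfft(spectrum * band, n=len(signal))
    return signal + alpha * filtered            # magnified output

# Demo: 8 seconds at 30 fps, a tiny 1.2 Hz flicker riding on a flat skin tone.
fps = 30
t = np.arange(8 * fps) / fps
pixel = 100 + 0.1 * np.sin(2 * np.pi * 1.2 * t)  # invisible 0.1-unit flicker
out = magnify(pixel, alpha=50, lo=0.8, hi=2.0, fps=fps)
print(out.std() / pixel.std())  # the flicker comes out roughly 50x stronger
```

Amplifying only a narrow band is what makes the invisible visible without blowing up noise everywhere else in the spectrum.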

His wrist is on the table, you can see small changes. But now let’s look at what happens when we magnify the changes. You can take a reading of his pulse, or at least see the frequency at which his blood is passing in and out of his wrist. This opens up a large avenue for us to talk about in terms of impact to society, because normally when we think about surveillance, we think that we have rights within the Fourth Amendment as US citizens to be able to be ourselves in our own person.
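Taking a pulse reading from such a signal amounts to finding the dominant temporal frequency within the plausible heart-rate band. A hedged sketch, with a synthetic noisy wrist signal standing in for real video data (real systems average over a skin region and denoise much more heavily):

```python
import numpy as np

def estimate_pulse_bpm(signal, fps, lo=0.7, hi=3.0):
    """Estimate heart rate from a skin-pixel intensity time series.

    Pick the strongest frequency in the 0.7-3.0 Hz band
    (42-180 beats per minute) and convert it to bpm.
    """
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)
    power = np.abs(np.fft.rfft(signal - signal.mean())) ** 2
    band = (freqs >= lo) & (freqs <= hi)
    peak = freqs[band][np.argmax(power[band])]
    return 60.0 * peak  # Hz -> beats per minute

# Demo: 8 seconds at 30 fps with a 1.25 Hz pulse (75 bpm) plus noise.
rng = np.random.default_rng(0)
t = np.arange(8 * 30) / 30
wrist = 0.2 * np.sin(2 * np.pi * 1.25 * t) + 0.05 * rng.standard_normal(t.size)
print(estimate_pulse_bpm(wrist, fps=30))  # 75.0
```

Eight seconds of video at 30 frames per second gives a frequency resolution of 30/240 = 0.125 Hz, i.e. the estimate is only precise to about 7.5 bpm, which is why longer clips give better readings.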

At the same time, we’ve used the notion that if a person was able to observe something, then you shouldn’t expect privacy protection; you should consider yourself in public. Any place a police officer could stand, you should consider yourself in public view, where your own behavior could be recorded. So with this, being able to detect something that’s invisible to us is very important.

How might this actually be used? First of all, what we do in my lab is accelerate the hell out of this. MIT has released their code online; we took that code, and my graduate students accelerated it so it’s 17 times faster. Processing those 8 seconds of video used to take 300 seconds; we can now do it in just over 16 seconds, almost real time. We’re almost to the point where you can pull out a camera and get that magnification of somebody’s blood, somebody’s pulse, in real time: one second of computing for one second of video.

This is being worked on at the University of Colorado Denver. Where are some of the areas where we’re going to see this used? Think about the last presidential debate. It’s a very controlled environment; you have the candidates standing still. How many of you would like to know their pulse when they talked about budgets? How many of you would’ve paid to know? Even after the debate there were so many people rushing to say the President’s stance was this, and Mitt Romney’s stance was this, and when they straightened up it meant this. What if we start talking about their pulse and their blood pressure?

In like fashion, think about this scene in terms of law enforcement doing profiling or trying to detect danger. This was taken just prior to the Boston bombing, again before the event, when cameras were recording the scene. Wouldn’t it have been interesting to know what pulse these two had when they potentially had 30 pounds of explosives strapped to their backs? This gets into a very interesting realm in terms of our own values as a society.

So, I’m going to leave you with three things. First, if you’re waiting for some barrier, thinking this won’t happen in your lifetime: it’s too late. We have inexpensive cameras, computing power is infinite, and the software to do these things is openly available. Second, computer vision doesn’t need to understand how human vision works to get the answer. Finally, computers are doing much more than we ever thought human vision could do.

So I’ll leave you with this: what are you going to do with infinite computing power?

Thank you.

