
How Cybercriminals Steal Money by Neil Daswani (Full Transcript)

Full Text of How Cybercriminals Steal Money by Neil Daswani at Google Tech Talks. This presentation took place on June 16, 2008.


TRANSCRIPT: 

Neil Daswani – Co-director, Stanford Advanced Security Certification Program

My name is Neil Daswani. I’m a security engineer here at Google. And today I’m going to be talking about how cyber criminals steal money. I’m going to be talking specifically about how cyber criminals use various types of web application vulnerabilities to steal money. And I’m going to start with simple examples and then I’ll go to more complicated examples.

In the course of my talk, I may refer to many resources: presentations, reports, books, certifications, courses, et cetera. Links to those are all going to be available at my site at neildaswani.com. And at the end of the talk, I’ll also be giving out a couple of free copies of a security book that I published, but you’ll have to answer some trivia questions to get those.

Before I go ahead and get started, the one additional thing that I wanted to mention is that, given that this is a security talk, if you have any Google-specific questions, I’m going to ask you to hold them until the end of the talk, until we stop taping, and then you can ask those. But if you have general questions about the presentation or some of the techniques, then I’d be happy to take those either in the middle of the talk or at the end of the talk before we stop videotaping.

So, let me go ahead and get started. One of the major shifts that occurred over the past three or four years is that the profile of the attackers has changed. So, up until about three or four years ago, when people used to write worms and viruses, they would typically just want to make names for themselves. They would release their worms out there, it would cause lots of traffic and servers would come down, some patch would get deployed and the game would be over.

But the big shift that’s occurred over the past three or four years is that as more and more commerce has started taking place on the internet and as more businesses have started making more money from that commerce activity, the bad guys want to get their share. And so, their end goal is money, actual money. And so, in many of the attacks that I’ll talk about, in the examples that I’ll present, I’ll tell you how these bad guys are working to get at money.

Now, the bad guys may have a set of intermediate goals that may help them get to that money. And so, some of the intermediate goals, for instance, are data theft: they’ll steal identity information, they’ll steal credit card information. They may also decide to conduct extortion. So they will launch a denial-of-service attack against a bank’s website at, say, 8:30 AM, to the point that it’ll take down all of the bank’s servers. They will send in a ransom note to the bank. They will say, “Please pay us X thousands of dollars or we’re going to shut down your servers. By the way, if you check your web server logs at 8:30, you’ll notice that all your web servers were down, so I’m not kidding.” And so extortion’s another goal.

Another goal to make money is to distribute malware. Once the bad guys distribute malware, they can then do all kinds of things with the compromised machines, assemble them into botnets and do what they please with those botnets. So, there’s a number of intermediate goals. To give you a concrete example of such an organized crime network: I mentioned that the attacker profile has shifted from amateurs to professionals that want to make money, and in many cases those professionals are very organized. It is their full-time job to attack sites. The bad guys, in some cases, will hire other people as mules to transfer money from one place to the other.

So, it’s an extensive organized network. We’re not fighting against amateurs anymore. One example of such an organized crime network is the Russian Business Network. You may have heard of botnets like Storm, which have compromised anywhere from a million to five million machines, depending upon who you want to believe; the real number is probably closer to a million, million and a half. They’re responsible for these types of botnets. Storm, for instance, is a peer-to-peer based botnet that can be used for denial of service, key-logging, pretty much whatever one would like. The bad guys will rent out the machines on those botnets. They’ll say, “Hey, I have a botnet. I have this many machines. I’ll rent them to you for X cents per day and you give me a binary, I’ll put whatever binary you give me on those machines and farm them out.”

Another thing that the Russian Business Network is allegedly responsible for is a piece of software called MalwareAlarm. So MalwareAlarm is this piece of software which will pop up a dialogue box on your PC and it will say, “We think your computer is infected by malware. Please click here to disinfect.” Of course, if you click here to disinfect, it will infect your computer as opposed to disinfecting it. And so, the Russian Business Network is a thoroughly organized group. For those of you that are interested in learning more about the Russian Business Network, come up to me after the talk, I can tell you some fun stories.

So, the goals of cyber criminals have changed over the years. To give you a little bit of data about various intermediate goals that the bad guys have, this is a graph that I pulled out from a Web Hacking Incidents Database report that was done by Breach Security. They basically looked at a whole bunch of organizations and what types of attacks were reported for all of 2007. And so, you can see that what the bad guys were mainly trying to do is steal sensitive information like credit card numbers and identity information. Once they steal that information, they can do various things with it. They could decide to use that sensitive information for their own gain.

For instance, if they have stolen credit card numbers, they can then burn those credit card numbers onto blank magnetic stripe cards of their own and hand those out to mules, who they then tell to go to ATMs and try to do cash advances and whatnot, or use the cards at various points of sale. So that’s one thing they can do with stolen sensitive information.

The other thing that they could decide to do is to just sell the information on the black market. So, the bad guys have a whole bunch of IRC channels as well as other ways of communicating with each other. And there’s an underground economy, there’s a market. So on the underground market, a credit card number might be worth say, $10 per credit card number and they can get bought and sold in bulk. So stealing sensitive information is one thing that the bad guys do.

The next highest category of intermediate goals for the bad guys on this slide is defacement. That’s simply when somebody changes what’s on the front page of the website to get their own, say, political messages across. Now, I should mention that this particular report queried a lot of government agencies. And so, because the number of government agencies queried in this report tends to be a little bit on the high side, I’d say defacement is probably a little bit on the high side compared with other databases that I’ve seen describing incidents.

The next highest is planting malware. So once the bad guys, say, break into a website, they will put some JavaScript and/or ActiveX object tags onto pages, such that when good users visit those pages, visit these legitimate sites, they have a binary downloaded to their machine and they get infected without even knowing it. So, that’s the next biggest category. You can imagine that if we were to look at more commercial websites as opposed to government websites, the amount of planting malware would increase and the amount of defacement would decrease, simply because if a website is highly trafficked and there’s lots of users going there, the bad guys simply want to take advantage of that traffic to deploy their malware and then decide what to do with users after that point.

There’s one final point I want to make about this graph: it is based on incidents as opposed to vulnerabilities. So for those of you that are familiar with web application vulnerabilities, you may be aware that Cross-site scripting (XSS) happens to be a major problem on the internet. A lot of vulnerabilities reported are typically Cross-site scripting vulnerabilities. And I’ll chat a little bit about some cross-site types of attacks later in the talk, but the thing to keep in mind is that those are vulnerabilities as opposed to actual incidents. This graph shows incidents. So, it shows what the attackers are actually trying versus what, say, security researchers are trying.

So, this slide summarizes some things about the intermediate goals on the part of the attacker. So, I’ve given you some high level information about trends in the space. I’ll talk a little bit more about trends, but what I’m going to do is start with a simple concrete attack, show you how it works and then I’m going to show some more complicated attacks as well. So I’m going to talk about SQL injections first. Actually, let me just get a quick show of hands, how many of you are familiar with SQL injection? Okay, good. More than half of the room which is great. So I’m going to go through this example relatively quickly.

Basic idea is that a good user might access website in a following way: They have a web browser, they need to authenticate themselves to a web server, they typically supply a username and password, the web server then uses that username and password to allow the user to log in. Of course, before allowing the user to log in, they need to figure out if this user is indeed authentic.

So, the web server might make a query to a database. The database command that gets constructed may get constructed based on the user input. So, the user supplies the username, and the web server needs to select the corresponding password from the database for the username that was entered. Of course, the bad guy is not going to enter your regular run-of-the-mill username. The bad guy is instead going to enter a username like quote, semi-colon, drop table user, semi-colon, hyphen, hyphen, and enter something for the password; it really doesn’t matter what.

But the idea is that after this input is entered and is substituted into the query, the quote will close off the string literal, the semi-colon will close off the first database command, the rest of the input will make up a second database command, these hyphens will comment out the apostrophe that the web application put in. And in old databases, these actually used to execute just fine and would end up deleting all the information about all the users in the database in one shot. So this would be an example of a denial-of-service attack that occurs based on SQL injection.
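As a rough sketch of the login query just described, the following Python snippet pastes the username straight into the SQL text. The table name `users`, its columns, and the helper `build_login_query` are assumptions made for illustration, and SQLite's `executescript()` stands in for an older database driver that runs several statements in one call.

```python
import sqlite3

def build_login_query(username: str) -> str:
    # Vulnerable on purpose: user input is pasted straight into the SQL text.
    return "SELECT password FROM users WHERE username = '" + username + "';"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'secret')")
conn.commit()

# quote, semi-colon, drop table, semi-colon, hyphen, hyphen
malicious_username = "'; DROP TABLE users; --"
query = build_login_query(malicious_username)
print(query)

# executescript() is used here only to emulate a driver that accepts
# stacked statements; sqlite3's execute() would reject this input.
conn.executescript(query)

remaining_tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' AND name = 'users'"
).fetchall()
print(remaining_tables)
```

After the call, the `users` table no longer exists, which is the denial-of-service outcome described above.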

Now, these types of attacks have become so prevalent and so popular and they’re so well-known there’s even cute cartoons about them. I’ll go ahead and let you read the cartoon.

So, even high schools are vulnerable to this, and if you guys have ever seen the movie WarGames, in which Ferris Bueller, or Matthew Broderick, had to get the password for the computer for the high school’s database, he doesn’t even need to do that anymore. He just needs to have his name be Robert, quote, drop table students, et cetera. So, there’s all kinds of other innovative SQL injection attacks that can be done. So I’ll just give you one example. For instance, when you board an airplane, right? And I’ve been to a couple of security conferences where some of these attacks have been demoed. So when you go to the gate and you give them your boarding pass, what do they do with your boarding pass these days? The boarding pass is typically just a bar code and they scan that bar code. Now that bar code is getting translated into a set of numbers and letters and most likely getting fed to a SQL database. So there’s no reason that you couldn’t come up with a bar code of your own, come up with a boarding pass of your own, put in a quote, semi-colon, whatever. Put in all these characters and construct a SQL-injected command to upgrade yourself into first class or do worse. So SQL injection isn’t just relevant to the web; it’s relevant to many other types of applications, pretty much anything that uses a database on the back end.

So you might think, “Oh, this should be pretty easy,” right? The problem occurred because there was a quote and a semi-colon; I’ll actually just filter out those characters, right? That seems like the easy way to deal with this problem. Unfortunately, that’s not enough. So let me go ahead and give you an example in which filtering out quotes and semi-colons will not solve your problem. Imagine there’s a website that allows you to buy pizzas online, and that web application may have a page which allows you to view your order history so that you could see all the pizzas that you’ve ordered in the past month or so.

And in this screen, you might imagine that there’s a pull-down menu to select the month so that I could view all my pizzas for that month, and the HTML form corresponding to this might look as follows: it has basically a select block in which there’s a couple of different options, each of the options has a month and a corresponding value from one to twelve, and when the user hits submit, that value gets transmitted to the server and becomes, say, part of the SQL command that will retrieve all of the pizzas that were ordered in that month.


Now, of course, the bad guy is not going to enter a number from one to twelve; what the bad guy is going to do is enter something else. The bad guy is going to enter something like, say, zero or one equals one. So at the top, this is what a normal SQL query for getting the order history might look like: it might get the pizza, the toppings, the quantity and the order day from the orders table for the particular user that’s logged in, and the order month is the order month that comes from the form.

On the other hand, if the bad guy types in zero or one equals one for the input, what’s going to happen is that the order month equals zero will evaluate to nothing, but the or one equals one will evaluate to true for every single row that the database evaluates this against. And so what will get returned is all of the pizzas that were ordered by anybody. And so what the attacker’s going to get is a screen that looks like this, where it will contain the pizzas, the toppings, et cetera, for all of the pizzas that were ever ordered from that website.
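To make the order history example concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table layout, sample rows, and the function name `order_history` are assumptions for illustration; only the shape of the query follows the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (userid INTEGER, pizza TEXT, toppings TEXT, "
    "quantity INTEGER, order_day INTEGER, order_month INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "Diavola", "Tomato, Mozzarella, Pepperoni", 2, 12, 10),
        (2, "Napoli", "Tomato, Mozzarella, Anchovies", 1, 17, 10),
        (3, "Margherita", "Tomato, Mozzarella, Basil", 3, 5, 9),
    ],
)

def order_history(userid: int, month: str):
    # Vulnerable on purpose: the form value is concatenated without checks.
    sql = (
        "SELECT pizza, toppings, quantity, order_day FROM orders "
        f"WHERE userid = {userid} AND order_month = {month}"
    )
    return conn.execute(sql).fetchall()

normal = order_history(1, "10")        # only user 1's orders for month 10
attack = order_history(1, "0 OR 1=1")  # OR 1=1 matches every row
print(len(normal), len(attack))
```

Because AND binds more tightly than OR, the injected condition reads as (userid = 1 AND order_month = 0) OR 1=1, so every order in the table comes back.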

Now, for pizzas, this might not be such a big deal, right? But imagine if this was a medical database? Well, now you have everybody’s medical records. So, that’s pretty darn bad.

Now, at the same time, this is a talk about how bad guys steal money. So, stealing information about pizzas may not be that interesting, but stealing information about money might be much more serious. So you could imagine that the attacker could instead type in an input such as zero and one equals zero, union, select cardholder number, expiration month, expiration year from credit cards. So what this does: the zero and one equals zero will basically evaluate to nothing, right? One never equals zero, and none of the order months should be equal to zero. So that part of the SQL statement gives me nothing.

On the other hand, what happens is that the empty result gets unioned with a whole bunch of results that come from the credit cards table. Now, the key thing here is that the columns, cardholder number, expiration month, expiration year, were chosen such that they exactly match the data types of the columns in the preceding order history query, and that’s why the SQL query will work. Once the bad guy is able to do this, what the bad guy will get is a result page that looks like this: there is a pizza order table that contains people’s names, credit card numbers, et cetera, and this is how the bad guy gets all the credit card numbers. So, this is pretty darn bad.
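The union trick can be sketched the same way. In this hedged example the `creditcards` table, its four column names, and the placeholder card data are all invented for illustration. SQLite only insists that both sides of the UNION have the same number of columns; stricter databases also require compatible types, which is the matching step described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (userid INTEGER, pizza TEXT, toppings TEXT,
                         quantity INTEGER, order_day INTEGER,
                         order_month INTEGER);
    CREATE TABLE creditcards (cardholder TEXT, number TEXT,
                              exp_month INTEGER, exp_year INTEGER);
    INSERT INTO orders VALUES (1, 'Diavola', 'Pepperoni', 2, 12, 10);
    INSERT INTO creditcards
        VALUES ('Test Holder', '0000-0000-0000-0000', 1, 2030);
""")

# Attacker-supplied "month": no quotes, no semi-colons.
month = (
    "0 AND 1=0 UNION SELECT cardholder, number, exp_month, exp_year "
    "FROM creditcards"
)

# Vulnerable on purpose: same concatenated query as the order history page.
sql = (
    "SELECT pizza, toppings, quantity, order_day FROM orders "
    f"WHERE userid = 1 AND order_month = {month}"
)
rows = conn.execute(sql).fetchall()
print(rows)
```

The first SELECT contributes no rows, so the page that was meant to list pizzas ends up listing only the contents of the credit card table.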

Note that in that query, there were no quotes. There were no semi-colons. There were no metacharacters, yet the attack worked. And so, how do you solve this kind of problem? Well, you need to use an approach based on whitelisting as opposed to blacklisting. When you try to filter out certain characters like quotes and semi-colons, it typically doesn’t work because you might forget about some characters; every database has its own metacharacters. Who knows, the next version of the database might introduce some new metacharacters, and so it becomes a very hard game to fight. On top of that, you might have some users whose legitimate input contains a metacharacter, like O-quote-Brien.

Why shouldn’t I allow O’Brien to have a quote in his name? I want the user to have a more personal connection with my website. Why, because of whatever technology I used to build the website, should the person not be able to use that quote? And so, the right way to deal with these problems is to use an approach based on whitelisting. Anytime you’re taking input from users on a website, you want to constrain and specify the exact set of safe, valid values that should be allowed. And so you could imagine that if I’m accepting some kind of alphanumeric string, I might use a regular expression like this, or if I’m accepting a month, I might use a regular expression like zero dash one to zero dash nine. You could of course write a much, much better regular expression, but the idea here is not to show you the regular expression in particular, but to make the point that you should use regular expressions to constrain the input to what you would expect it to be, and give the user an error, some kind of error which doesn’t reveal a lot of information, if the input doesn’t match what you expect.
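A whitelist check of this kind might look like the following sketch. The patterns and the names `MONTH_PATTERN`, `USERNAME_PATTERN`, and `validate_month` are assumptions for illustration, not the expressions shown on the slide.

```python
import re

# Whitelists: describe exactly what is allowed, reject everything else.
MONTH_PATTERN = re.compile(r"[1-9]|1[0-2]")          # 1 through 12
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9]{1,32}")  # short alphanumeric

def validate_month(raw: str) -> int:
    # fullmatch() requires the whole input to match, not just a prefix.
    if not MONTH_PATTERN.fullmatch(raw):
        # Deliberately vague message: reveal nothing about the query.
        raise ValueError("Invalid request")
    return int(raw)

print(validate_month("12"))
for bad_input in ("0 OR 1=1", "13", ""):
    try:
        validate_month(bad_input)
    except ValueError as err:
        print(repr(bad_input), "rejected:", err)
```

Using `fullmatch` rather than `match` matters here: an input such as `1 OR 1=1` starts with a valid month but is still rejected.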

Once you’ve come up with a set of regular expressions that properly specify the set of safe inputs, in addition to checking for them in your application, you could implement them in what’s called a web application firewall. So for those of you that might be familiar with Apache, you can write or use a whole bunch of Apache modules to extend the functionality of the web server. One of the modules that you can use is a module called mod_security, and it will let you specify a whole bunch of rules for all the arguments in your web application. So, that is a tool that I encourage you to take advantage of.

Finally, I mentioned that we want to allow usernames like O-quote-Brien, and so the way that you do that properly is you take advantage of escaping functions that your database provides you with. So, the name of the escaping function will differ depending upon what database you’re using. But in this particular example, you just call escape with the user input, and in some databases like MySQL, it will take the single quote and turn it into a doubled quote, and the doubled quote, when the database receives that in a SQL statement, will basically mean that the database interprets it as a piece of data and not as a piece of control, and that will help deal with the issue.
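To illustrate the quote-doubling idea only, here is a small sketch against SQLite. The helper `escape_sql_string` is hypothetical and hand-rolled for the demonstration; a real application should call the escaping function its own database driver provides, since the rules differ between databases (MySQL, for example, also treats the backslash as an escape character by default).

```python
import sqlite3

def escape_sql_string(value: str) -> str:
    # Hypothetical helper: doubles each single quote so the database
    # reads it as data rather than as the end of the string literal.
    return value.replace("'", "''")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT)")

name = "O'Brien"
statement = f"INSERT INTO users VALUES ('{escape_sql_string(name)}')"
print(statement)
conn.execute(statement)

stored = conn.execute("SELECT username FROM users").fetchone()[0]
print(stored)
```

The stored value contains the single quote exactly as the user typed it, and the structure of the statement is unchanged.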

Now, there’s a number of other things that you can do as well to prevent SQL Injection. One thing that I would strongly recommend is taking advantage of prepared statements and bind variables. So in this approach, what you do is, for all the queries that your application might want to do, you come up with a query template and prepare a statement with that query template. So in this particular case, we’ve taken the pizza order query and we’ve put in question marks for the places in the query where I expect data and only data to be filled in. And once the query template is specified, you can then have separate statements that will fill in these placeholders.

Placeholder number one gets the user ID, placeholder number two gets the parameter corresponding to the month. And the idea here is that, when the database driver executes this, when the database looks at this, it will always interpret this information as data. It will not allow the structure of the query to change. And so, what you could also do is create just one file in your application, or keep all of your queries in a separate place, such that in your regular code you should never really have SQL. And that will help you from a credibility standpoint to make sure that you don’t have these types of SQL Injection errors. So, I’d highly encourage the use of prepared statements and bind variables to deal with SQL Injection.
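A minimal sketch of the query template and bind variables, again with Python's `sqlite3` module, which uses `?` as its placeholder. The table layout and the names `ORDER_HISTORY_SQL` and `order_history` are assumptions for illustration.

```python
import sqlite3

# Query template kept in one place; "?" marks where data, and only data, goes.
ORDER_HISTORY_SQL = (
    "SELECT pizza, toppings, quantity, order_day FROM orders "
    "WHERE userid = ? AND order_month = ?"
)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (userid INTEGER, pizza TEXT, toppings TEXT, "
    "quantity INTEGER, order_day INTEGER, order_month INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "Diavola", "Pepperoni", 2, 12, 10),
        (2, "Napoli", "Anchovies", 1, 17, 10),
    ],
)

def order_history(userid, month):
    # Placeholder one is bound to the user ID, placeholder two to the month.
    return conn.execute(ORDER_HISTORY_SQL, (userid, month)).fetchall()

normal = order_history(1, 10)
attack = order_history(1, "0 OR 1=1")  # treated as one odd string value
print(normal, attack)
```

The injected text never becomes part of the SQL structure; it is compared against `order_month` as a single value and matches nothing.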

So, I talked a little bit about mod_security, which is a type of web application firewall, or WAF. And so the idea here is that, without a WAF, when the user makes a connection to your web server, the connection’s pretty much directly to the web server. There is nothing that can really intermediate the user’s request. If you do use a WAF that’s either linked into the web server or sitting in front of the web server, the idea is that when the user input comes in, it can basically run all the user input through various regular expressions and check for invalid types of input. So that’s how mod_security works, except that mod_security is actually linked into the web server as opposed to being a separate box. There are also web application firewalls that sit in front as a separate box but, you know, the last thing you need is another security appliance to manage. And so, while that is one option, there are others as well that are based on software.

To give you another example of how to use web application firewalls to prevent some types of attacks like SQL Injection: you can imagine that there’s going to be a whole bunch of rules, and there’s a whole bunch of public rules available. So especially if you’re using Apache for that type of a website, what you can do is take advantage of rules that are available. This particular rule is optimized, it’s a whole bunch of regular expressions compiled into one, but you can get the idea that it looks at various parameters in the HTTP request and looks for certain scary strings like table, like objects, like the word password, et cetera. And so the web application firewall can be used to provide you defense-in-depth. So the idea is that you should absolutely try to write your code as perfectly as possible so that it doesn’t have vulnerabilities, but you just never know. If there’s some class of attack that you didn’t have in mind when you built your application, the web app firewall might be able to help protect you.
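The idea of screening request parameters for scary strings can be sketched in a few lines of Python. This is not mod_security rule syntax and not the rule from the slide; the keyword list and the names `SCARY_STRINGS` and `flagged_parameters` are assumptions for illustration.

```python
import re
from urllib.parse import parse_qsl

# Hypothetical rule: several keywords compiled into one regular expression.
SCARY_STRINGS = re.compile(
    r"\b(?:union|select|drop|table|password)\b", re.IGNORECASE
)

def flagged_parameters(query_string: str) -> list:
    """Return the names of request parameters whose values match the rule."""
    return [
        name
        for name, value in parse_qsl(query_string, keep_blank_values=True)
        if SCARY_STRINGS.search(value)
    ]

clean = flagged_parameters("userid=1&month=10")
quoted_name = flagged_parameters("name=O%27Brien")
attack = flagged_parameters(
    "userid=1&month=0+AND+1%3D0+UNION+SELECT+number+FROM+creditcards"
)
print(clean, quoted_name, attack)
```

A keyword screen like this is a backstop rather than a fix: an input such as `0 OR 1=1` contains none of the listed words and would pass, which is why it only makes sense as defense-in-depth on top of validated input and bind variables.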

So, what else can you do to mitigate SQL Injection? Well, one thing you can do is limit privileges. If your application doesn’t need to change data, then there’s no reason that the user account that the web server or the web app server is logged in to the database with should have write privileges. But as you can imagine, you know, just reading can also be pretty bad. You can harden your database server, harden your host and the like. You know, some databases like Microsoft SQL Server ship by default with a whole bunch of functionality turned on, so that I could write SQL commands that could be used to initiate outbound network connections or e-mails, and you definitely want to shut all of that functionality off.
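As a small illustration of limiting privileges, the sketch below opens a SQLite file read-only, which is the closest SQLite analogue to giving the web application an account without write privileges. On a server database the same effect would normally come from the account's grants; the file name and table here are assumptions for illustration.

```python
import os
import sqlite3
import tempfile
from pathlib import Path

db_path = os.path.join(tempfile.mkdtemp(), "pizza.db")

# An administrative connection sets up the data once.
admin = sqlite3.connect(db_path)
admin.execute("CREATE TABLE orders (userid INTEGER, pizza TEXT)")
admin.execute("INSERT INTO orders VALUES (1, 'Diavola')")
admin.commit()
admin.close()

# The web application only ever gets a read-only handle.
read_only_uri = Path(db_path).as_uri() + "?mode=ro"
web_conn = sqlite3.connect(read_only_uri, uri=True)
rows = web_conn.execute("SELECT pizza FROM orders").fetchall()

try:
    web_conn.execute("DROP TABLE orders")  # what an injected command might try
    write_blocked = False
except sqlite3.OperationalError:
    write_blocked = True

print(rows, write_blocked)
```

Reads still succeed, so an injection could still leak data, but an injected DROP or UPDATE fails at the database instead of relying on the application to stop it.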

There is a lot more to learn about SQL Injection. For instance, there’s attacks like second order SQL Injection where the bad guy takes advantage of data that’s already been sanitized and is in the database and does a command to re-inject that into a query and use it to their advantage.

Another type of attack is blind SQL Injection. You notice in some of the attacks that I demoed, the bad guy needs to know a little something about the structure of the database. And so blind SQL Injection is a technique that can be used to reverse engineer information about the schema of the database. If you’re interested in more on that, I suggest Chapter 8 in my book or the web; there’s a lot of good information out there. So, that is attack type number one, SQL Injection, and that’s one of the simpler attacks. Can I answer any questions about SQL Injection before I move on to the second type of attack that I’m going to talk about, Cross-Site Request Forgery? No questions about SQL Injection.

Okay. I’m going to move on to talk about Cross-Site Request Forgery. So, this is another attack which can be leveraged by the bad guys to steal dough. And how is the bad guy going to do that? Well, I’m going to show you by way of example. Let’s assume that we have two sites. One site called bank.com which is a legitimate site that has an online banking application, and let’s assume that a user logs in to that online banking application and is authenticated with a cookie. So the idea is that the user goes to the front page of bank.com, provides a username and password. Once the user is logged in with that username and password, the user’s browser is given a cookie and on subsequent http request, the user’s browser supplies that cookie to the bank’s web server, that’s what authenticates the user. So let’s assume that’s how bank.com works.

Now, let’s say that the user is logged in to bank.com and the user happens to get lured to some evil site called evil.org. The idea here is that a user might receive an e-mail, there might be a link in it, they might end up clicking on it. There’s many, many different ways to lure the user to evil.org, but let’s say that there is this site called evil.org and it has some malicious web pages. Let’s look at how evil.org can effectively steal money right out of the user’s bank account.

So, how is this attack going to work? Alice, our good guy, or good girl in this case, is going to go to bank.com and request the log-in form. The log-in form is going to have fields for a username and password. The user, Alice, is going to fill out that form and provide bank.com with the username and password. The username and password get transmitted to bank.com; let’s say that particular script is called auth. The username happens to be victim, the password happens to be whatever, and bank.com checks that information in their back-end database. It checks out, and so bank.com sends Alice a cookie: “Here’s your session ID, you can give me back this identifier to authenticate yourself.”

Now, you can imagine that in a normal world, in a good world, once Alice is logged in to the bank’s web server, she might click on a link saying “I want to view my balance”. So what happens when Alice clicks on the view balance link? Well, a view balance script might get called and, of course, because this cookie came from bank.com and Alice is making another request to bank.com, the browser provides bank.com the cookie. And so bank.com says, “Oh, okay. I know this is Alice, she just logged in. Let me go ahead and give her the bank balance because she’s an authentic user.” So it says, “Your balance is $25,000.”


Now, of course, let’s look at what happens when Alice gets lured to the evil.org site in the middle of this interaction. So, as before, Alice makes a connection to bank.com, she requests the log-in script, she authenticates herself with the username and password, she’s given a cookie. Now, what’s going to happen at this point is, let’s say that while she is logged in to bank.com in one window, she opens up another window and is reading her e-mail and happens to click on some link, or comes upon some drive-by download page or whatever, and is served a page by evil.org. Well, what evil.org can do is serve her a webpage that has some HTML on it. So, let’s say that Alice requests evil.html, and on evil.html is the following tag. And I’ve simplified this from a technical standpoint just to make the explanation a little bit clearer, but you can use JavaScript or whatever else. But the idea is that there’s this image source tag that has a URL that says “go make your request to bank.com”. And so it specifically says, “Hey, go ahead and make this request to this pay bill script at bank.com to go ahead and get this ‘image’ and, by the way, supply these parameters. The parameters are the attacker’s address and the amount, in this case $10,000.”

And so, what does Alice’s trusty browser do? Well, Alice’s browser wants to do a good job for Alice and so it makes a request to bank.com and it calls the pay bill script as was requested. It passes the parameters as requested, and oh, by the way, since the browser was already logged in to bank.com, it sends the bank.com cookie to bank.com. So, bank.com gets this request from Alice saying that she should pay a bill to 123 Evil Street in the amount of $10,000, and looks up the cookie. The cookie is completely legitimate. So, what does bank.com do? Bank.com goes ahead and issues the payment, right?
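The flow just described can be simulated without any network code. In this sketch the session ID, the parameter names, the `pay_bill` handler, and the page markup are all invented for illustration; the point is only that the handler treats the cookie as the sole proof that Alice wanted the payment.

```python
# bank.com's server-side state (hypothetical values).
sessions = {"abc123": "alice"}   # session cookie -> logged-in user
balances = {"alice": 25_000}

def pay_bill(cookies: dict, params: dict) -> str:
    # Vulnerable on purpose: a valid cookie is the only thing checked.
    user = sessions.get(cookies.get("session_id"))
    if user is None:
        return "403 Forbidden"
    balances[user] -= int(params["amount"])
    return f"200 OK: paid {params['amount']} to {params['addr']}"

# Page served by evil.org: the "image" URL points at bank.com.
EVIL_HTML = (
    '<img src="http://bank.com/paybill?addr=attacker-address&amount=10000">'
)

# Alice's browser holds the bank.com cookie from her earlier log-in and
# attaches it to any request for bank.com, including this forged one.
browser_cookies = {"bank.com": {"session_id": "abc123"}}
response = pay_bill(
    browser_cookies["bank.com"],
    {"addr": "attacker-address", "amount": "10000"},
)
print(response, balances["alice"])
```

Nothing in the request distinguishes it from one Alice made on purpose, which is why the payment goes through.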

Now, of course, if you’re the attacker, you want to do a bunch of things to make sure that Alice doesn’t see that, so the way you do this is you send back this kind of script in a zero-size iframe so that Alice doesn’t really see anything. The other thing is that, from the attacker’s standpoint, this would generate lots of forensic information, right? The evil guy doesn’t want to go ahead and give exactly his address to the bank because at some point Alice is going to complain, is going to go back to the bank and say, “I didn’t withdraw this $10,000.” So this is where hiring mules comes in. So if you’ve ever seen these websites, what the bad guys do is they’ll put up a website that looks like it’s a legitimate company. The company would have a bunch of job descriptions on it. One of the job descriptions will say, “Oh, we’re looking for an assistant,” or a work-from-home job of some kind. And that will have a whole bunch of qualifications listed for this person like, you know, they should know how to type at a certain speed, they should know how to use the internet. And one of the interesting job qualifications is that they should have access to an internet bank account or a mail address. And so what the bad guy does is hire a bunch of mules and put the mule’s address here, and the $10,000 gets sent to the mule, and then the mule is instructed as part of their job to keep 10% of it and then transfer the rest of it to the attacker’s bank account. So that’s how the bad guys turn web-based Cross-Site Request Forgery vulnerabilities into ways to steal money. Does that make sense? Any questions? Okay. So that’s how to use Cross-Site Request Forgery to steal money.

How many of you have heard of an attack called Drive-By-Pharming? Okay. Very few of you. Okay. How many of you, when you got your home router, changed your default username and password? Why isn’t everybody? So the issue here is that because of this Cross-Site Request Forgery vulnerability, it’s possible for the attackers to completely take over all of the user’s internet browsing simply because of the fact that they didn’t change that password. So most home routers will have a web-based interface which lets you administer the router, and the bad guys know that approximately 50% of home users use a broadband router and don’t change the default username and password. So what the bad guy can do is take advantage of Cross-Site Request Forgery to mount an attack against the home router, and what they do is change the user’s DNS settings, right? So basically, I include this image source or other type of JavaScript that will make a request to the router at 192.168.0.1 or whatever predictable address. The attacker can try a whole bunch of them, and it will basically send a message to the user’s router saying, “Please change the user’s DNS settings to use the attacker’s DNS server.” Once that’s done, even if the user types in www.sitename.com, it will get resolved to the attacker’s IP address and the attacker can put up a website that looks exactly the same. Once the user enters their log-in credentials, he just fishes those out. And so this slide has a whole bunch of details about how you actually do this, but you could imagine that there’s other applications here as well.

So Pharming is an application where I change the user’s DNS settings but there are other applications. I could, for instance, build a port scanner using this kind of technique. If I’m trying to get at the people behind corporate intranet sites, I know they may have, say, a whole bunch of 10.x addresses. So what I do is I have a whole bunch of these tags with a whole bunch of 10.x or 192.168.x internal IP addresses and I can basically, you know, using handlers like onerror, figure out exactly which of my requests did get sent successfully and which did not. And so there are a lot of applications here for Cross-Site Request Forgery. Drive-By-Pharming is one. It’s some good work that was done by Sid Stamm, a PhD student who interned with me last year. Markus Jakobsson and Z. Ramzan were also on this paper.

So I’ve told you about Cross-Site Request Forgery, I’ve told you how you can use it to steal from bank accounts, how you can use it to take over the user’s DNS, how you can use it to do corporate espionage. How do you stop it? Well, there is a variety of techniques, right? Earlier I said that the request looks completely authentic to bank.com and it does, but for those of you that are familiar with HTTP, you know that there is a referrer header that gets sent along by the browser. And so one question that you may have in your mind is, why not look at what the referrer is? You can imagine that when I try to view balance or pay bill and the user is only on the bank’s website, the referrer will also be the bank’s website. But if it’s the evil.org site that’s sending back evil.html and making a request to bank.com, the referrer should in fact be evil.org’s evil.html file. So it’s a good idea. The problem is that lots of users may use HTTP proxies for whatever reason. They may use an anonymizing proxy. You might want to browse the web remotely. And so referrers don’t always necessarily get sent. So if you’re looking at using referrers as a completely fail-safe mechanism to deal with Cross-Site Request Forgery, you’re not going to get very far.
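To make the referrer check concrete, here is a minimal Python sketch of what a server-side check might look like. It is an illustration under stated assumptions, not code from the talk: the function names, the `strict` flag, and the use of a plain dictionary for headers are all hypothetical, and `bank.com` is simply the talk's running example. The header name is spelled `Referer`, as it is in HTTP.

```python
from urllib.parse import urlparse

TRUSTED_HOST = "bank.com"  # hypothetical: the talk's example site


def referer_verdict(headers):
    """Classify a request as "same-site", "cross-site", or "missing"."""
    referer = headers.get("Referer")
    if not referer:
        # Proxies and privacy tools may strip the header entirely.
        return "missing"
    host = urlparse(referer).hostname or ""
    if host == TRUSTED_HOST or host.endswith("." + TRUSTED_HOST):
        return "same-site"
    return "cross-site"


def allow_sensitive_request(headers, strict=False):
    """Reject cross-site referrers; reject missing ones only in strict mode."""
    verdict = referer_verdict(headers)
    if verdict == "cross-site":
        return False
    if verdict == "missing":
        return not strict
    return True
```

The `strict` flag captures the trade-off described above: rejecting requests with no referrer locks out legitimate users behind proxies, while accepting them means the check cannot be relied on by itself.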

On the other hand, there’s some good work that’s come out of Professor John Mitchell’s research group at Stanford, and I believe there are similar research reports elsewhere, where what you could do is look at all of your aggregate traffic on a site and look at what percentage of that traffic doesn’t have the referrer field. Is that normal, is that not? For those requests that I am getting, am I getting lots of requests to the pay bills script without a referrer field? Well, that’s a bad sign. And by the way, if I do see some referring pages that I don’t expect, well, that might be a sign that there is some kind of Cross-Site Request Forgery going on. So inspecting referrer headers wouldn’t give you a fail-safe defense, but it may give you some indication. It may allow you to detect that the problem is happening even if it doesn’t allow you to prevent it.
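The aggregate-traffic idea can be sketched in a few lines of Python. This is only an illustration: the log format (pairs of path and referrer), the function names, and the `baseline` and `tolerance` parameters are assumptions made for the example, not something described in the talk or in the research it mentions.

```python
from collections import Counter


def missing_referer_rates(request_log):
    """Return, per path, the fraction of requests that had no referrer.

    request_log is an iterable of (path, referer_or_None) pairs.
    """
    totals, missing = Counter(), Counter()
    for path, referer in request_log:
        totals[path] += 1
        if not referer:
            missing[path] += 1
    return {path: missing[path] / totals[path] for path in totals}


def suspicious_paths(request_log, baseline, tolerance=0.10):
    """List paths whose missing-referrer rate exceeds the site's usual rate."""
    rates = missing_referer_rates(request_log)
    return sorted(p for p, rate in rates.items() if rate > baseline + tolerance)
```

Here `baseline` stands for whatever missing-referrer rate is normal for the site as a whole; a sensitive script that sits well above it is worth investigating, even though this only detects a problem rather than preventing it.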

You can attempt to use a Web Application Firewall but the request pretty much looks legitimate to bank.com so it’s going to look legitimate to the Web Application Firewall. Now, at the same time I mentioned that you could look at distributions of traffic and the Web Application Firewall may help you do that, except for the fact that most Web Application Firewalls, for instance ModSecurity, don’t support that level of functionality. So that’s another potential defense. Even better defenses are validation via a user-provided secret. So the idea here is that whenever the user is going to do some sensitive operation like paying a bill, you might want them to enter their password again, right? The reason that the Cross-Site Request Forgery attack worked was because the bad guy was able to make the right request and didn’t have to know any information about the user, because of the fact that the cookie automatically got sent and automatically authenticated the user. On the other hand, if in the HTTP request to pay the bill you require that the user provides that password again, then of course the hope is that the attacker does not know the user’s password and will not be able to specify that secret in order to do the transaction. And of course if the attacker does have the user’s password, they can do a whole bunch of other bad things anyhow. So, asking the user to type in their password or some other user-provided secret, again, a PIN or whatever, is a good way to mitigate Cross-Site Request Forgery attacks.
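As a rough sketch of validation via a user-provided secret, the Python below refuses a payment unless the form itself carries the user's password. Everything here is assumed for illustration: the record layout, the form field names, the function names, and the choice of PBKDF2 for password hashing are not from the talk.

```python
import hashlib
import hmac
import os


def hash_password(password, salt, iterations=100_000):
    """Derive a password hash with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


def make_record(password):
    """Create the stored credential record for a user (hypothetical layout)."""
    salt = os.urandom(16)
    return {"salt": salt, "hash": hash_password(password, salt)}


def pay_bill(record, form):
    """Pay only if the form includes the correct password, not just a cookie."""
    supplied = form.get("password", "")
    candidate = hash_password(supplied, record["salt"])
    if not hmac.compare_digest(candidate, record["hash"]):
        return "rejected"
    return f"paid {form['amount']} to {form['payee']}"
```

A forged cross-site request still arrives with the victim's cookie, but it cannot include the password field, so it falls into the "rejected" branch.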

A fourth way that you can deal with Cross-Site Request Forgery is validation via an Action Token. So what you can do is take some secret that should only be known to the server, combine it together with things like the user’s session ID and use a cryptographic primitive like an HMAC to generate some new piece of data, some new signature that gets sent out with every page that you serve the user, and basically put that in all forms. And so the idea is that the bad guy won’t know what Action Token to provide when doing the sensitive operation. And I’ve kind of glossed over the details of how to do that here, but if you’re interested in all the details, Christoph Kern, my co-author, wrote up a great section on how to do this in the cross-domain attacks section of my book, Chapter 10, and I’ll be giving out some free copies. And for those of you at Google, we actually have that book available internally if you’re interested in a copy. So that’s how you can prevent and detect Cross-Site Request Forgery attacks. Can I take any questions about Cross-Site Request Forgery before I move on to Cross-Site Script Inclusion? No. Clear. Good.
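The Action Token idea can be sketched with Python's standard `hmac` module. This is a simplified illustration rather than the scheme from the book chapter mentioned above: the secret value, the function names, and the decision to bind the token to the session ID alone are assumptions made for the example.

```python
import hashlib
import hmac

# Hypothetical server-side secret; a real deployment would load a random key.
SERVER_SECRET = b"replace-with-a-random-server-side-key"


def action_token(session_id):
    """Compute the token the server embeds in every form for this session."""
    mac = hmac.new(SERVER_SECRET, session_id.encode("utf-8"), hashlib.sha256)
    return mac.hexdigest()


def is_valid_action_token(session_id, submitted_token):
    """Check a submitted token in constant time before a sensitive action."""
    expected = action_token(session_id).encode("ascii")
    return hmac.compare_digest(expected, submitted_token.encode("utf-8"))
```

The server would place `action_token(session_id)` in a hidden field of each form and call `is_valid_action_token` before paying a bill. A page served from another site cannot compute the token, because it knows neither the server secret nor the session ID.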

So, the next attack that I’m going to talk about is Cross-Site Script Inclusion. It’s similar to Cross-Site Scripting except it takes advantage of the fact that lots of web applications are now using Web 2.0, Ajax-type technology. And so, I’ll tell you a little bit about some of the downsides and some of the new attacks that can happen if you’re not careful about how you use Ajax objects like XMLHttpRequest, or if you use some kind of framework to automate the development of your Web 2.0 application, like Xajax or Sajax or whatnot. You should have an idea of what the limitations of that framework are. So let me go ahead and talk about Cross-Site Script Inclusion.

As you all know, when you’re building a web application, you can have script tags and you may put the JavaScript directly in the HTML file, but you may also decide to factor out your JavaScript into a separate file. And then you can use a script tag with a source attribute to include that file. And so this is called Static Script Inclusion and it allows you to share code, right? Between the various pages on your website, and it also allows you to source JavaScript from other websites. So there are a lot of advantages that come with that but it could be dangerous. For instance, if you decide to source in a third-party JavaScript file, then they could do lots of bad things to your page without you even knowing it. So that’s Static Script Inclusion.

There is also Dynamic Script Inclusion. And maybe what I think I should do is just go into an example, but you can imagine that when I make a request to an HTTP server, that HTTP server, in addition to sending back HTML, can also send back JavaScript. And it can send back different JavaScript depending upon what’s in the request. And that’s an example of Dynamic Script Inclusion. So let me make things a bit more concrete and then show you what can go wrong.

So, in an example where we have Static Script Inclusion, you can have a Web page and I might be showing the users their mail and I might like their page with their mail to have a cool menu on it. And so I can go ahead and source in a menu.js JavaScript file which will go ahead and do all the right things and render that menu. And so Static Script Inclusion itself is not a bad thing so long as you trust menusite.com.

Now, there’s also Dynamic Script Inclusion. So you can imagine I might be building a banking website and I want the user interface to have very low latency and I don’t want to re-render the whole page just to show the user their balance. And so I might build a page like this where there’s some JavaScript on view balance, and what it can do is make a request to the server to get the user’s balance and then call a RenderData function to render that in place without having to reload the entire Web page. So in this script, the first line will go ahead and create a new XMLHttpRequest object. I’m assuming everyone here is familiar with XHR. And what you can do then is you can set a handler so that when this XMLHttpRequest is complete, this handler will get called. What this particular handler does is it just takes the response that it gets from the server (the server will basically give back a JavaScript function call) and it’ll go ahead and call eval on it. So it will go ahead and execute that function call.

Once the state change handler is set, this code then opens up an HTTP POST request to this particular URL. It says, “Go to bank.com, call this getData script.” It gets back the data in JSON format, which is a popular way of giving back attribute value pairs. And it says, “Oh, when you’re finished with getting this data back from bank.com, you should go ahead and call RenderData.” It’s basically telling the server, “Give me a RenderData call as your result,” and then it will go ahead and send the HTTP request. And of course, also in this code will probably be the code of the function for RenderData, which will find the appropriate DOM object in the Web page in which to stick the balance once it arrives back from the server. So, this is what things look like in this particular application and in a normal world we wouldn’t have problems, but of course we live in the real world so we’re going to have problems. So let me demonstrate exactly what happens with regards to how this code works and then show you what can go wrong.

So Alice may decide to log into bank.com, she goes and gets the regular login script, she gets back a cookie, she might click view balance and when she clicks view balance, bank.com gives back the view balance HTML and when Alice’s browser gets that code back, it starts evaluating the JavaScript in it. So it goes ahead and makes the POST request that was in the JavaScript. It passes the appropriate callback parameter and so bank.com gets the request, bank.com gets the cookie and says, “This is Alice, I’m going to go ahead and send back her bank balance information,” except that bank.com is a cool Web 2.0 application now, and so it sends back this data in the form of a JavaScript callback with some JSON that has an attribute value pair for the account number and an attribute value pair for the balance. And that data goes back to the browser, the browser says, “Oh, okay. I’m just going to go ahead and eval this. I’m going to call RenderData with these JSON parameters to render the data on the Web page.” And so things should work just fine unless we have an attacker in the loop.

What the attacker is going to do is similar to what happened with the Cross-Site Request Forgery attack. What’s going to happen here is the user might get lured to evil.org while logged in to the bank.com website, and the evil.html that the attacker is going to send back is going to be a little bit different in this case. In this case, as in the previous case, evil.org has studied bank.com’s website, and so it sends back some JavaScript which takes advantage of this RenderData function. So evil.org knows that there is a RenderData function being used and the evil site provides its own implementation of a RenderData function. So this RenderData function, instead of rendering the data on any kind of Web page, will call another JavaScript function which will send the arguments passed to this function off to evil.org. Of course, because this page comes from the evil.org domain, it can go ahead and send data back to that domain as per the same origin policy. So that’s one part of the script.

The other part of the script that is in the evil.html that got sent back is basically a tag which says, “Go ahead and call that getData script on the bank.com website,” and of course, “call back the RenderData function.” So the user’s browser gets evil.html and wants to faithfully render evil.html for the user and makes the request to the website, passes the appropriate parameter and, of course, because the user is logged into bank.com, provides the session ID in the cookie. bank.com successfully authenticates Alice, bank.com sends back the RenderData script and the JSON parameters containing the account number and the balance just as before, except now what happens is that because this is being rendered on evil.html, it goes ahead and calls the RenderData function in evil.html which basically says, “Send these parameters off to evil.org.” And so the attacker gets the account number and gets the balance. Bad news. And then the attacker can do whatever the attacker would like with this information. So this is yet another example of a cross-domain security problem and how the attacker can, in this case, steal the account number as well as find out what’s in the balance and then sell that information or do whatever he would like.

So what happened in Cross-Site Script Inclusion is the malicious website requests a dynamic JavaScript, the browser’s authentication cookies get sent because the user’s already logged in to bank.com, the JSON fragment returned by the server is accessible and it’s going to be sent back to the malicious site, and the bad guy simply redefines the callback method to make this happen. So that’s Cross-Site Script Inclusion.
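The mechanics of Cross-Site Script Inclusion can be modeled in a few lines of Python. This is a toy simulation, not browser or server code from the talk: the `get_data` and `evaluate_in_page` functions stand in for the bank's script and the browser respectively, and the session value, account number, and balance are made-up illustrative data.

```python
import json

# Hypothetical data: session cookie value, account number, and balance are made up.
ACCOUNTS = {"alice-session": {"acct_num": "0000-1111", "balance": 10000}}


def get_data(session_cookie, callback):
    """Toy stand-in for the bank's script: returns a callback call as text."""
    account = ACCOUNTS[session_cookie]
    return f"{callback}({json.dumps(account)})"


def evaluate_in_page(script_text, page_functions):
    """Toy stand-in for the browser: run the call using the page's own functions."""
    name, _, rest = script_text.partition("(")
    argument = json.loads(rest[:-1])  # drop the closing parenthesis
    return page_functions[name](argument)


# The bank's page and the attacker's page both define RenderData.
bank_page = {"RenderData": lambda data: f"Balance: {data['balance']}"}
stolen = []
evil_page = {"RenderData": stolen.append}

# The browser attaches Alice's cookie no matter which page triggered the request.
response = get_data("alice-session", "RenderData")
rendered = evaluate_in_page(response, bank_page)
evaluate_in_page(response, evil_page)
```

The server's response is identical in both cases. What differs is which `RenderData` receives it, which is why redefining the callback on the attacker's page is enough to capture the account data.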

So at this point I hope I’ve scared everyone a little bit with all the technical details of how things can go wrong. Let me talk about a couple of trends. So I’m going to cite a couple of research reports. One is by Symantec, and Symantec’s latest Internet Security Threat Report reported that 58% of all vulnerabilities in the second half of 2007 affected Web applications. So the bad guys are shifting more and more to attacking Web applications as opposed to, say, attacking open DCOM vulnerabilities on your PC.

One other interesting thing that was mentioned in this Symantec report is that the bad guys can go ahead and look at large popular sites that have vulnerabilities and it sometimes makes sense for them to put a lot of effort exploiting those vulnerabilities on large traffic sites, instead of going for smaller sites. So that’s a particular trend that was highlighted in the Symantec report.

Now, about two weeks ago, there were also attacks, but what the bad guys did is they took advantage of kind of the opposite; they attacked a whole bunch of small websites. So what they did is they used various search engines to look for IIS servers, Microsoft’s internet information servers, and looked for SQL Injection Vulnerabilities and applications that would use that database. So basically, they would get back a whole bunch of search results and they’d know exactly what servers to exploit. So this can also go the other way around where the bad guys can attack lots of small sites very quickly. So on one hand, large sites are — this is an issue for all large sites, on the other hand, it’s an issue for small sites. It’s basically an issue for every site.

Another statistic is that Sophos noticed that 80% of sites hosting malware are legitimate sites that have been hacked. So many of the small websites that I just mentioned would all render content to their users that came from the database. So what the bad guys did is they used techniques like SQL Injection to inject malicious JavaScript or ActiveX-type objects, so that once users would end up on those web pages, they would get infected by malware. So that speaks to this statistic. So users are not necessarily safe browsing legitimate sites. We basically need a lot more AV technology on this front; we need websites to get better about their server-side security.

And statistics from Symantec and Sophos are good. I’m sometimes skeptical about statistics that come from market research firms, but even Gartner says that today over 70% of attacks are against the company’s website or Web applications on their site. So things could be a heck of a lot better. And given this knowledge of how to prevent these kinds of attacks, we can all help make things a lot better.

Let me show you one more graph, which came from this Web Hacking Incidents report. It basically shows that with regards to actual incidents, 12% of the attacks are due to cross-site scripting, 20% are due to SQL Injection, et cetera. So, actually I should mention one more thing about this pie chart. There’s an organization called OWASP, the Open Web Application Security Project, and every year they basically publish a top 10 list of web vulnerabilities. So what Breach Security did in this case was they took those top 10 web vulnerabilities, plotted them against the incidents and found that the number one web vulnerability, cross-site scripting, actually was only 12% of the attacks, whereas SQL Injection, while it was number two, actually accounted for more incidents. So there are more cross-site scripting vulnerabilities but there are more SQL Injection incidents.

So let’s see, I’m going to summarize by telling you where you can learn more and I’ll tell you about some courses, certification programs, et cetera. This list is not comprehensive by any means. You can learn more about security pretty much at every university. A lot of the good universities, in addition to having things like cryptography courses, are now introducing systems security courses. So, CS155 at Stanford, W4187 at Columbia, CS161 at Berkeley. There is a much more comprehensive list at Avi Rubin’s website. So you can check that out.

There’s also, for those of you that have full-time jobs and can’t necessarily afford to take things over the course of a semester, an advanced security certification program at Stanford that can be taken either online or on campus over the course of a week, where there are three core courses and three electives. One of the interesting things about it is, in addition to just providing lectures about all these attacks and telling you how to abstractly prevent them (and I have to, in the interest of full disclosure, mention that I helped Stanford with this program), what we do is we put people in labs and we have them actually construct the attacks on some test websites that we constructed, and then write the actual defenses. So that is useful and we do that on campus. Of course you can do that in a lab where you’re kind of isolated from the real internet and you make sure that you don’t end up impacting real sites. We also took all the software that we’re running in the lab and we put it on VMware images. So you can go ahead and benefit from those labs even if you take the courses online. This is the URL.

So let’s see, so there’s a lot more information, this is the URL to go to. Actually, this should be scpd.stanford.edu/advancedsecurity. The next course is coming up in July. So keep an eye out for it.

There are other security certification programs available. There’s a certification called the CISSP, the Certified Information Systems Security Professional. It’s much more broad, it focuses on various aspects of security beyond just software security, it focuses on physical security, telecom security, et cetera. There’s also a new certification that’s come out from a group called SANS called the Secure Programming Assessments. Both of these are actually multiple choice tests. So you learn a lot by preparing for them.

There are also lots of books. There’s my book, Foundations of Security: What Every Programmer Needs to Know. I’ll give out some copies. I’ll do some trivia questions in just a second. There are also a lot of other good books. I’d recommend Ross Anderson’s Security Engineering; it’s actually available for free online. Just enter this into Google or your favorite search engine. I’d also highly recommend Gary McGraw’s Building Secure Software book and Viega and Messier’s Secure Programming Cookbook.

Let’s see, there’s also a website that we helped make available called the code.google.com/edu. If you need to teach security course, you can just download materials off of this site and, there’s slides, there’s programming assignments. So if you’re responsible for teaching security to other folks, this is a resource that you could take away for free. There’s also free slides corresponding to every chapter in my book at learnsecurity.com/ntk. And let’s see, so I think that concludes all the information that I wanted to tell you.

The key point here is that software security is every engineer’s problem. It’s not as if there’s some group of magical security engineers that are going to take care of securing your application after it gets developed. It really needs to be every engineer’s job. I mentioned a varied set of resources; in addition to my website neildaswani.com, you can also go to learnsecurity.com. My contact information is there, so I hope that you guys have learned at least one new thing. I’d be happy to take any questions at this point. Okay. Thanks for your time.

 
