Back Seat Webdriving via Browser Automation
12:55AM Jul 27, 2020
Welcome back, everyone, to HOPE 2020, and to all of our people watching from around the world; we've had over 100 countries watching our streams so far. Welcome, welcome. And on to our next talk. There are many reasons to automate web browsing for security purposes, and this talk will share basic concepts and advanced tips and tricks from years of experience automating browsers. And now, the author of O'Reilly's Crafting the InfoSec Playbook, creator and host of Wall of Sheep's Lockpick Village, forever-curious maker and breaker: Matthew Valites.
Thank you very much appreciate that intro.
Hi everybody, nice to be here. I want to thank the organizers of HOPE for allowing me to share this presentation, and I want to thank all of you for attending. I'm going to be talking about how you can automate your web browser. So, I live in the Bay Area; I've been in San Francisco for over a decade now. I've done a little bit of everything in information security: I started doing incident response at the CSIRT here at Cisco, I've also done pre-sales over at Splunk, and now I manage a research team here at Cisco in the Talos organization. Talos provides the threat intelligence that goes into all of Cisco's security portfolio.
So why me, and why this talk?
Well, over the last few years I've been faced with a number of web-based problems that would be far easier to solve if they could be scripted, so I needed a solution for that. I started with a program called iMacros, based on a talk I saw from a man named Michael Schrenk at DEF CON 21 in 2013, where he explained how to use iMacros to do some of this browser automation. There's a ton of programs, libraries, and documentation on what we're going to chat about. What I wanted to do is pull the most relevant and useful bits from all that and share my experiences and lessons learned while attempting to learn this. Essentially, this is what I wish someone had shared with me years ago, when I was starting out.

There are a ton of ways you can use browser automation. Let's talk through a few examples to start; here are four that I came up with.

The first is really simple form submissions. Forms are everywhere on the internet, and you use them all the time: you enter some data, you click a radio button, maybe you have a drop-down box, and then you click a button and submit it. You do that all the time, and there are tons of reasons, both legitimate and illegitimate, why you might want to automate it.

Next is sniping. I think the most obvious example is eBay: auction sniping is where you come in at the last second and put in a bid just high enough to win. There are plenty of bots out there already if you do a quick search for sniping services. But there's also game sniping, which I learned about: sniping for things like FIFA Ultimate Team players. And there are a bunch of open source tools to do sniping that are based on browser automation.

There's also gaming games. One of the things I found is something called OpenPPL, the Open Poker Programming Language: an open source language that's meant to play a game of Texas Hold'em.
So what if you attach this to a web-based poker system? Then you'd be able to interact with the game in your browser by backing it with some sort of logic-driven bot that makes decisions based on what the web browser is showing it.

And then one of the most nefarious examples I can think of is credential stuffing. If you're sitting on a stack of credentials and you want to test them, you're going to need to grab one, log in to a service, grab the next one, log in, and do that over a bunch of services to see which credentials work at which service. You could do that programmatically: you have a list, you go through it, and you submit each one.

Hopefully these give you an idea of some of the ways you could use web browser automation, but hopefully they also give you an idea of some of the complexities involved once you actually start thinking these through. Say you're at the poker game: maybe you're presented with your cards in a graphical format and you have to do some OCR to read them in. Maybe you're submitting forms on such a regular basis that your IP gets rate limited, so you need to change your IPs; or perhaps they start feeding you CAPTCHAs and now you have to start solving CAPTCHAs. Different complexities come up when you start doing this in an automated way, and we're going to get to how you can solve some of those in some upcoming slides. But browser automation might not be the best way to solve these problems, right? You might be sitting there thinking, okay, well, there are some other things I might do to address these. That's totally fair.
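The credential-testing loop described above is easy to express in code. Here's a minimal sketch; `submit_login` is a hypothetical stand-in for whatever browser or HTTP logic actually performs the login, and the random pause keeps attempts from arriving at a machine-gun rate.

```python
# Sketch of the credential-testing loop described above.
# submit_login is hypothetical: in practice it would drive a browser
# or HTTP client; here it's passed in so the control flow is clear.
import random
import time

def check_credentials(creds, services, submit_login, min_delay=1.0, max_delay=4.0):
    """Try each (user, password) pair against each service, pausing a
    random, human-looking interval between attempts. Returns the list
    of (service, user) combinations that logged in successfully."""
    hits = []
    for service in services:
        for user, password in creds:
            if submit_login(service, user, password):
                hits.append((service, user))
            time.sleep(random.uniform(min_delay, max_delay))
    return hits
```

The same shape works for any of the other examples: the outer loops supply the work items, and the injected function hides whether a browser, a proxy, or a raw socket is doing the talking.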
So let's take a look at some of the other tools you can use to solve some of these problems. There are a number of ways to achieve some level of web automation without using a full browser automation framework, starting with something like wget and curl. These tools are ubiquitous, lightweight, available everywhere, super common, and a really easy way of getting data.

On the other end of the spectrum is a full commercial deployment, something like Burp Suite. This proxy is super powerful and lets you do a ton of stuff, but it comes at a cost of complexity, and license cost as well. There are also a number of commercial services available, things like Sauce Labs: geographically distributed, scalable services that let you do scraping or other types of web automation. Again, a service that you'll pay for.

Then there's something like iMacros; similar to Burp, it has a free offering and a paid offering. This is the one I mentioned earlier that I started with. It still works, it's still valid, and we'll go through some examples of it in a bit. And then there's what we'll be spending most of our time on, which is full browser automation using a browser automation framework.

There are other examples in other programming languages, too. If you've been doing web work long enough, you've probably used Perl's LWP, the libwww library for Perl, or something like WWW::Mechanize. And then there's a ton of Python libraries available that we're going to look at next; they're really great for letting you interact with static and dynamic web content. I do most of my coding in Python, it's what I'm most familiar with, so I'll share some of the Python libraries I use while doing web-based programming.
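As a tiny example of the lightweight end of that spectrum, here's a static fetch using nothing but Python's standard library. The URL and User-Agent string are arbitrary placeholders, and note that no JavaScript executes; that limitation is exactly what pushes you toward a real browser framework for dynamic content.

```python
# Minimal static fetch with the standard library: no browser, no
# JavaScript, just the raw HTML the server returns. URL and UA are
# arbitrary examples.
from urllib.request import Request, urlopen

def build_request(url, user_agent="Mozilla/5.0 (X11; Linux x86_64)"):
    """Build a GET request carrying a browser-like User-Agent header."""
    return Request(url, headers={"User-Agent": user_agent})

if __name__ == "__main__":
    req = build_request("https://example.com/")
    with urlopen(req) as resp:   # static content only: scripts never run
        html = resp.read().decode("utf-8", errors="replace")
    print(html[:200])
```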
Now, in the images on the bottom right you can see a screenshot from the browser configuration window, which many of you have probably used; it's easy to change configuration settings there. On the left is the corresponding way to do that in Selenium, in the code itself: a really simple set-preference call for the same setting. Anything you can do in the browser configuration window, you can also set in the code itself.

Similar to how we configure the browser for more effective automation, there are behavioral things we can do to make our automation not appear to be automation, and we're going to talk about that next. The general way I think about stealth is to try to look like a human, not a bot. The more bot-like you appear, the more likely you are to hit website protections like CAPTCHAs or Cloudflare rate limits, things like that. So you want to mask the fact that this activity is occurring programmatically. Here are some ways to do that.

Introduce random delays: that's what humans do as they navigate the web; it's just natural to read something, to pause, to click, to scroll, to walk away. So introduce random delays into your interactions with the page. Don't move your mouse in a linear way; if you are using mouse movements, introduce some randomness into the path of the pointer. Similarly, if you're clicking on a button, don't always click in the center: you can get the coordinates and size of the button and click off-center. You can also use residential IPs to appear as if you're coming from a user-like environment, and we'll chat about residential VPN and proxy options in a little bit. Randomize your keepalives, terminating your TCP connections and re-initiating new connections to the server.
Similarly, randomize your request rates to the server, so that you appear to be coming in at a more human-like cadence rather than a predictable rate. And lastly, vary your user agents. Sometimes just changing the user agent is enough to trick a server into serving you the content for that type of device, and you can often see in tracking response data that it identifies your device as whatever user agent you're passing in. So it's a really easy way to use one vendor's browser but change the device you appear to be coming from. It doesn't work all the time, but it's a really simple trick. These are easy ways to appear a bit more stealthy.

Another thing you might want to do when writing your web automation is run in headless environments, so that you can scale what you're trying to do; we're going to chat about that next. When you run a WebDriver normally, the browser window opens on your machine: you see the window open, navigate to a web page, and load it, and you see subsequent actions completed, like text being entered into text boxes and buttons being clicked. Once you've got a functioning bot that's programmatically browsing the web, there may be no need to have a display window, and in that case you're going to want to run headless. We've got a couple of options for doing that.

The first option uses native support in Selenium to pass the headless argument to geckodriver; if you were to run geckodriver on the command line, outside of a web framework, and pass it the headless option, you'd get the same result. This launches your web browser without actually opening a GUI, so if you're running on a system that doesn't have a GUI, like a Linux server, this is a great way to run headless. However, I have had issues where, when trying to capture screenshots using the native headless option, I'm unable to capture the entire screenshot regardless of the window size or scrolling settings I use. In those instances, rather than using the headless option in Selenium, I run Selenium without the headless option inside the X virtual framebuffer. Xvfb has been around for a long time.
It lets you run graphical windows in a non-graphical environment, and this is a great use case for it. Your mileage may vary depending on what you're trying to do, so it's worthwhile to test, but it's good to have multiple options, and both of these have worked for me in the past.

So now let's take a look at another challenge you're going to face if you browse the web for any amount of time, and that is something called a modal. What is a modal? A modal, pretty simply, is a scripted element that puts one object in front of another. An example might be a login prompt on a web application asking you to provide your credentials. It could also be a setting: say, in your Facebook application permissions, that drop-down could be a modal window. It could also be an alert or a notification: say you're in a financial application and you've timed out and they're going to log you out; the warning they pop up could be done as a modal.
On the left here we see an example of a modal window, the result of GDPR privacy settings. You'll note that the original web page is shaded, so the focus is on the modal itself, and that means I cannot access the web page without completing the actions within the modal. This is considered a true modal window. On the right we see the developer tools from one of the web browsers, I believe Firefox; all the major web browsers have this developer tools capability built in. This tool is extremely handy for understanding the DOM of the web page you're navigating, so it's a great way to understand and identify objects manually when you browse to these pages; then you can use those attributes in your code to identify and interact with the elements.

Let's take a look at another example of a modal. Here we see another privacy pop-up, this one the result of the California privacy protections. The difference is that this is a modeless window: you can see that I still have focus on the main web page in the back, and I can actually navigate and scroll around that page even with this pop-up in front. Now, if I click on this window, it's a different story. Once I click on that modeless window to set my privacy choices, I'm directed to this window, which is a true modal. Again, you can see that the focus is on the modal window and not on the background website, and I'm not able to navigate that site until I satisfy what the modal is asking me to do. On the right you can see another example of the developer tools, and in it you can see that the name of this pop-up includes some mention of a modal.

Okay, so we know what modals are. Let's take a look at how we can deal with them as we're automating our web browser.
Here are three ways you can deal with modals or pop-ups in web applications. They all follow the same basic process. First, determine whether this is something you see consistently every time you visit the page, or whether it's intermittent. Depending on that, you have to write some tests to detect whether the window is present, and this will likely use the waits we discussed earlier in the presentation, waiting for the object to appear or not appear. Once you've determined it's present, you have to switch to that window to make it the active object, so that you can then do something with it; there are a number of ways to switch to the window, as you can see in the sample code on the right. Once you've made the modal or pop-up the active window, you need to do something with it. Depending on the type of modal you're presented with, you might click on it, you might submit it, you might enter some data, you might dismiss it. Having multiple options for handling these different types of modals is super important, because you are almost guaranteed to come across them if you do any significant amount of web crawling. So those are great ways to handle clearing modals.

Let's switch gears a little now and talk about proxies and the role they play in web automation. If you want to change where it appears you're coming from, or you want to appear to be coming from somewhere specific geolocation-wise, you're going to need a proxy. All of the tools I've mentioned so far, Selenium included, support proxy usage. One service I've seen recommended before is called Luminati. They offer static residential proxies, as you can see here, and they also have mobile residential proxies.
These are network locations you can proxy through that make it appear as if you're coming from a network range reserved for mobile devices. This is a great option to have from a security research perspective: it gives you the ability to mask where you're coming from and not look like you're proxying through a data center. It's pretty easy for websites to block data center IP addresses, since public cloud providers like AWS publish lists of their IP ranges. So if a site sees web browsing coming from data centers like that, there's more of a likelihood it will be blocked than if the server sees requests coming from a network range that's typically associated with end-user mobile devices or residential networks.
You take a hash of that image and supply that to the service for them to solve. All the CAPTCHA solvers are basically using the same trick: you give the CAPTCHA solving service a view of the CAPTCHA that they can access and solve, and they provide you with the answer, which you can then submit into the form. It works quite well.
One CAPTCHA solving service also provides a browser extension. The extension is smart enough to automatically detect most CAPTCHAs, extract the site key we talked about, send it to their service to solve, and then receive and populate the response to solve the CAPTCHA for you. If it can't do that automatically, you have the option to identify the CAPTCHA manually: by right-clicking you can identify the CAPTCHA and the input field, and it will learn it for the next time it encounters the same CAPTCHA. Both of these methods are really reliable ways of getting past most CAPTCHAs you might encounter, and doing so relatively inexpensively.
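A sketch of the moving parts: extracting the site key from the page is straightforward HTML scraping, while the round trip to the solving service is left as a hypothetical stub, since every vendor's API differs.

```python
# Sketch of the CAPTCHA round trip described above. Site-key extraction
# is real parsing; solve_captcha is a hypothetical stub standing in for
# whichever commercial solver's API you use.
import re

def extract_sitekey(html):
    """Pull a reCAPTCHA-style data-sitekey attribute out of page HTML,
    or return None if no CAPTCHA widget is present."""
    match = re.search(r'data-sitekey="([^"]+)"', html)
    return match.group(1) if match else None

def solve_captcha(sitekey, page_url):
    """Hypothetical: submit the site key and page URL to the solving
    service, then poll until a worker or model returns the token you
    paste into the form."""
    raise NotImplementedError("depends on your solver's API")

if __name__ == "__main__":
    html = '<div class="g-recaptcha" data-sitekey="EXAMPLE-SITE-KEY"></div>'
    print(extract_sitekey(html))   # prints EXAMPLE-SITE-KEY
```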
We're back with Matthew Valites. This was a great presentation; thank you very much for preparing it for us.
So we do have a couple of questions from the chat. The first question we have is: what are your thoughts on tools like Fiddler and Insomnia and similar tools?
Yeah, so I've not used Insomnia, and I'm aware of Fiddler but I haven't used it. My understanding is that Fiddler is a proxy, so you can do the things you can do with other proxies like Burp Suite: you can see the requests, you can stop them, you can change them, you can replay them. But things like accessing the DOM, or accessing client-side code that would be running in the browser, that's not possible with proxy-type tools. So probably similar use cases to something like Burp Suite. And like I said, Insomnia I'm just not familiar with.
Okay. And in the chat we've also had some folks talking about Gopher and Project Gemini, going back a little bit in the day. Are those a good starting point, in your opinion?
Now, how detectable are these techniques? If you were trying to protect yourself, what approach would you take?
It depends on how programmatic you appear, how frequently you're making requests, where you're making requests from. There's a whole bunch of different attributes and factors that are going to make you more or less detectable. I mentioned using proxies to vary where you're coming from; making your interactions with the website look as human as possible, and not as bot-like as possible, is going to help. But I think you're still probably, at some point, going to face things like Cloudflare challenge pages, CAPTCHAs, and stuff like that, which is why you've got to be able to deal with those if and when you do get them. And if you're using this for security research purposes, I'd like to think that sometimes attacker infrastructure is a little less robust than, say, a Fortune 50's, so hopefully you'll have a bit more success getting past some of the protections. But you've still got to be cognizant of how you're doing it. Again, the biggest thing I try to keep in mind is: look like a human, not like a bot.
Of course. Now, speaking of those anti-CAPTCHA services: do you know, or do you have any idea, where they get their solves from? Do you think they're offloading to cheap labor, some kind of Mechanical Turk, or to machine learning, perhaps? What's the secret sauce?
I don't know. Most of the sites advertise that they employ people; they advertise it as if it's a selling point that you're providing employment for people solving these. That seems like a bit of a reach to me, but I wouldn't be surprised if they actually were having people solve them, and then when they get the responses back, comparing to see if multiple people answered the same way. Whatever they're using, it's fast, it works pretty well, and it's pretty efficient.
And what are your thoughts, again this is a question from the chat, on using Selenium for access testing?
I mean, why not, right? I don't know if it's the best use of it, but certainly; that seems like a pretty lightweight way of testing sites. It seems like an easy problem to use this for.
You mentioned the VPN thing. Of course, in the past couple of weeks we've had those revelations about the free VPNs and their open buckets full of data. In your experience, do you think there are any VPNs that would be reliable, or would you shy away from that?
You know, again, if what you need is that sort of connectivity, and, going back to the examples I gave of Hola and Luminati, you don't necessarily care about someone inspecting what you're browsing to, maybe it's not a big deal. So it depends on your threat model, I guess, and what you're trying to accomplish. But to the question of whether there's any good VPN: I think unless you roll your own, there's just no way to know, so erring on the side of caution is probably a good move.
You've also talked about using old-school tools. Are there reasons to use older browsers in this research?
Sure, why not? In some of the tests I've done, part of what I'm trying to do is understand what content I get served based on how it appears I'm accessing the site, for instance via a different browser. So varying the browser, or varying the user agent, is in some cases a super simple way to trick it, and there's a lot of value in that. Some of the malware and malvertising campaigns we've tracked were serving malicious Flash players, DMG files, malware of varying types, but only if you were coming from a Safari browser, and you could trick them just by sending that user agent. So certainly I could see why, especially if you're testing and QA-ing things, maybe not from a security researcher perspective, you might have to test coming from an older browser. Although I will say, if you're using an older browser, I don't know if you'll have a WebDriver available to automate it, so that might be one of the gotchas.
Now, another question that came up, back to the CAPTCHA services: do you think paying a CAPTCHA solver, simply streamlining browsing by sending the CAPTCHA to a third-party service, would leak what web page you're looking at?
Okay. And does HTML5 provide any advantages, disadvantages, or opportunities?
To be honest, I don't know; I really don't know. I'm not a web person. That's part of the reason I gave this talk: I don't have a web background, so I had to struggle to learn all this stuff, and I thought that if there are other folks doing the same thing without a web or web-coding background, this would be useful. But I honestly don't know whether HTML5 would be good or bad from this perspective.
Okay. I'm just looking, and I think that covers the questions that I've seen in the chat.