Further Examination into External Attack Surface

Getting from assets to exposure, or why identifying critical attack surface exposed by your assets might not be that straightforward after all.

Reducing Attack Surface Decreases Security Risk #

In my previous write-up I explained why tracking digital assets is important, and listed some methods to get started with it. I trust that once you read it, you immediately set off to gather a list of your IP and domain assets. Since then, Tuomas Haarala has further elaborated on discovery methods from a systems administrator perspective in a write-up of his own. Armed with these tools, we can now venture further into the realm of attack surface reduction.

This write-up concentrates on the process of moving from cataloguing assets to having an idea of the attack surface involved. As laid out in my previous post, the steps in this process are:

Research the attack surface, i.e. open services, related to these assets.
Determine whether there is something that needs fixing within these services.

This write-up will focus on the first step, and the second will be covered in a follow-up.

Network-Based Exposure Assessment #

The future is already here—it’s just not very evenly distributed - William Gibson

One could easily think that since modern systems are so easy to deploy, monitor and administer, there is less need to discover their attack surface from the network side. This may be so, but as the previous write-ups have demonstrated, system owners and administrators must tackle all sorts of systems as part of their normal duties. I won’t go over all the points at length, but will summarise the main ones below.

Network-based assessment is universal. You can use it for any system with a network connection, i.e. practically everywhere. You don’t need to own the system, have administrative rights to it, or any other rights, for that matter. You don’t even need to know beforehand whether you have access to the system; you can just check if you do. You can use network-based assessment for any systems that your security depends on, even those of your subsidiaries, subcontractors, or service providers.

You will need permission for many if not most of the assessments detailed in this post.

Network-based assessment is cost-effective. Most assessments are quick, scale linearly, and don’t need your constant attention. There is a myriad of good open source assessment tools and resources. You can, and should, automate them, run them continuously, and have them alert you to changes.
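As a minimal sketch of alerting on changes, assuming each scan run has already been reduced to a set of (host, port) pairs, two consecutive runs can be compared with plain set operations. The function name and the sample data are hypothetical and not tied to any particular tool.

```python
def diff_scans(previous, current):
    """Compare two scan runs given as sets of (host, port) pairs."""
    opened = current - previous  # exposed now, not seen in the last run
    closed = previous - current  # seen in the last run, gone now
    return opened, closed


# Hypothetical results from two consecutive runs (documentation addresses).
yesterday = {("192.0.2.10", 22), ("192.0.2.10", 443)}
today = {("192.0.2.10", 443), ("192.0.2.11", 8443)}

opened, closed = diff_scans(yesterday, today)
for host, port in sorted(opened):
    print(f"ALERT new open port: {host}:{port}")
for host, port in sorted(closed):
    print(f"INFO port no longer open: {host}:{port}")
```

Newly opened ports are usually the ones worth an alert, while closed ports are mostly useful as confirmation that a fix has actually landed.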

Network-based monitoring has the same viewpoint as the attackers that are searching your systems for weaknesses. This point is somewhat more subtle, but ignoring it can still hurt you. I’ve come across many cases where system administrators believe that an open network port is closed only because it is unreachable from their administrative systems. Network configurations as well as firewall rulesets have a tendency to become complex, which increases the probability of mistakes and misunderstandings. I would argue that the functionality of all nontrivial firewall rules is unknown until assessed. Hardware-based management interfaces might not be visible from the system as they are implemented on a lower layer. Many attackers try to evade detection by hiding their backdoor endpoints from system-level tools. In both of these cases, the services can still be visible from the network side.

Caveats for Network Scanning #

The approach taken by many organisations is either to scan their networks themselves using a tool or system with pretty much the default settings, or to buy the scan data from third parties. While garden-variety network-based scanning is a good start for finding exposed services, there are many caveats to this approach. They relate to both the breadth and the depth of the available exposure information.

I’ll try to summarise the issues I usually encounter when performing these kinds of assessments. Although many tools and frameworks streamline the scanning process, I’m going through the stages one by one in an attempt to describe some of the complexities involved in each. The result is by no means a comprehensive listing and your mileage might vary depending on the technical environment that you work in.

Assessment via Third-Party Services #

When you have an asset inventory, services such as Shodan, Censys, ZoomEye, FOFA, and BinaryEdge can be used to gain an initial awareness of the extent of your public exposure. You can also use these services to sanity check your inventory, as many of them have search modifiers for finding assets related to organisations. Please note that these services are prone to false positives both at the asset and service levels, so it is only prudent to verify the findings. At this point, there’s a good chance that you already have a list of priority findings to check.

Network Scanning #

Before starting any actual scanning, make sure that:

you have permission to scan.
the target environment is not so fragile that scanning would represent an undue risk. Fragile environments include, for example, most industrial control system networks (OT/ICS/SCADA) and other specialised systems.
your target lists and tool parameters are correct. It’s easy to make mistakes that result in wider scans than intended.
the ISP or owner of the system you use for scanning condones it. This is not as critical as the three points above, but you will dodge some nasty surprises.
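One way to catch target list mistakes before they turn into wider scans than intended is to check every target against the blocks you actually have permission to scan. The following is a minimal sketch using Python’s standard `ipaddress` module; the function name and the sample blocks (documentation ranges) are hypothetical.

```python
import ipaddress


def out_of_scope(targets, allowed_blocks):
    """Return the targets that are not fully inside any allowed block."""
    allowed = [ipaddress.ip_network(block) for block in allowed_blocks]
    rejected = []
    for target in targets:
        # strict=False also accepts single addresses and host bits in a CIDR.
        net = ipaddress.ip_network(target, strict=False)
        covered = any(
            net.version == block.version and net.subnet_of(block)
            for block in allowed
        )
        if not covered:
            rejected.append(target)
    return rejected


# Hypothetical permitted scope and a target list containing two mistakes.
allowed = ["192.0.2.0/24", "2001:db8::/48"]
targets = ["192.0.2.17", "192.0.2.128/25", "192.0.0.0/16", "198.51.100.7"]
print(out_of_scope(targets, allowed))
```

Here the mistyped prefix length (`/16` where `/24` was probably meant) and the stray address are reported, and a non-empty result would be a reason to stop before any packets are sent.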

With that out of the way, it’s time to start scanning. In nearly all the cases I’ve encountered, the asset list has been quite extensive, with masses of netblocks, domain names, and individual IP addresses. In that case, it’s good to take a phased scanning approach, where you scan first for open ports, then for network services, and finally for vulnerabilities. After a couple of scanning rounds, you’ll probably find a process that works best for you and your environment, but let’s start by describing all the bells and whistles first.

Scan for Open Ports #

In this stage, the main goal is to limit the scope of later, more time-consuming scans. In other words, you first expend minimum effort to discover the host-port pairs you want to examine more closely in the later steps. At this point we’re talking about IPv4 and IPv6 addresses as well as IPv4 address blocks. IPv6 blocks are too large to be scanned exhaustively, as even a standard /64 subnet has 2^64 addresses, which is literally trillions (or quintillions in American terms). If you don’t know which of your IPv6 addresses are in use, you need to go back to discovery mode.
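To put the IPv6 numbers into perspective, here is a small sketch that counts the addresses in a list of blocks with the standard `ipaddress` module and estimates how long a full sweep of a single /64 would take. The probe rate of one million packets per second is an assumption picked for illustration, and the function names are hypothetical.

```python
import ipaddress

SECONDS_PER_YEAR = 3600 * 24 * 365


def count_targets(blocks):
    """Sum the number of addresses in a list of CIDR blocks."""
    return sum(ipaddress.ip_network(block).num_addresses for block in blocks)


def years_to_sweep(block, probes_per_second):
    """Years needed to send one probe to every address in a block."""
    total = ipaddress.ip_network(block).num_addresses
    return total / probes_per_second / SECONDS_PER_YEAR


ipv4_total = count_targets(["192.0.2.0/24", "198.51.100.0/24"])
ipv6_total = count_targets(["2001:db8::/64"])
print(f"Two IPv4 /24 blocks: {ipv4_total} addresses")
print(f"One IPv6 /64 subnet: {ipv6_total} addresses")
print(f"Sweeping the /64: {years_to_sweep('2001:db8::/64', 1_000_000):,.0f} years")
```

Even at that assumed rate, a single /64 works out to hundreds of thousands of years for one port, which is why IPv6 targets have to come from discovery rather than from brute-force sweeps.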

There are many tools for scanning a great number of IP addresses for open ports. Masscan and ZMap are my favourite tools in this category. They are built for scanning the entire Internet, so they should scale for the purposes of pretty much any organisation. I regularly scan millions of IP addresses as part of my daily work and the tools above can handle scanning a network for a single port in a matter of minutes – given a decent Internet link.

Which ports to scan? #

The next question is which ports to scan. The Nmap scanner defaults to a set of 1000 popular TCP ports. There is, however, a tendency among both administrators and vendors to use non-standard ports, and you might miss those by concentrating on the popular ones. With efficient scanners, it does not take much time to scan through the entire TCP port space, 0-65535.

I know that port 0 should not be used for anything, but I always include it just in case.
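A back-of-the-envelope estimate helps when deciding between a popular-port scan and the full port space. The sketch below assumes one probe per host-port pair with no retries and a sending rate of 10,000 packets per second; both are illustrative assumptions rather than properties of any particular scanner.

```python
def scan_seconds(addresses, ports, packets_per_second):
    """Rough lower bound: one probe per host-port pair, no retries."""
    return addresses * ports / packets_per_second


RATE = 10_000  # assumed packets per second

one_port = scan_seconds(1_000_000, 1, RATE)  # a million addresses, one port
all_ports = scan_seconds(256, 65_536, RATE)  # one /24, ports 0-65535

print(f"1,000,000 addresses x 1 port: {one_port:.0f} s")
print(f"256 addresses x 65,536 ports: {all_ports / 60:.0f} min")
```

Real scans take longer because of retransmissions, rate limiting, and link capacity, but the estimate shows why scanning every port of a modest address range is affordable.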

You may find that some security devices such as firewalls have been configured to answer every packet coming their way, supposedly to fool attackers. For you, this simply means that you may need to include all of those results in subsequent tests, which will increase the total scanning time.

So that was TCP scanning 101. Optimising scanning speed and trying different packet types may be tempting, but in practice I’ve found that the simple stuff will get you surprisingly far.

Scan UDP Ports #

Scanning for open UDP ports is a different beast entirely. TCP is a connection-oriented protocol, and the simplest way to determine if a port is open is to send a SYN packet with no data to initiate the connection and determine the status from the responses or connection timeouts. UDP is a connectionless protocol, where in most cases you will need to send some sort of protocol-specific packet in order to get a response from the target service. Many tools such as Nmap and ZMap do have support for some UDP protocols, but the scope is still quite limited. UDP scanning is often overlooked due to its difficulty.
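The difference between the two protocols can be sketched with Python’s standard `socket` module. Note that the TCP check below completes the full handshake (a connect scan) instead of sending a bare SYN, since crafting SYN packets needs raw sockets and elevated privileges. The UDP probe uses a minimal DNS query as an example of a protocol-specific payload and is IPv4-only; all function names are hypothetical.

```python
import socket
import struct


def tcp_port_open(host, port, timeout=2.0):
    """Full TCP connect; a SYN scanner would stop after the SYN/ACK."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, or unreachable
        return False


def build_dns_query(name, txid=0x1234):
    """Minimal DNS query for an A record, used as a UDP probe payload."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii") for part in name.split(".")
    )
    # Question: encoded name, QTYPE=A (1), QCLASS=IN (1).
    return header + labels + b"\x00" + struct.pack("!HH", 1, 1)


def udp_probe(host, port, payload, timeout=2.0):
    """Send a payload and return the reply, or None if nothing came back.

    None is ambiguous: the port may be closed or filtered, or the service
    may simply ignore this particular payload.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(payload, (host, port))
            reply, _ = sock.recvfrom(4096)
            return reply
        except OSError:  # includes timeouts
            return None
```

A call such as `udp_probe("192.0.2.53", 53, build_dns_query("example.com"))` would only tell you something about a DNS service; any other UDP service on that port would most likely stay silent, which is exactly what makes UDP scanning laborious.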

It is good to keep in mind that your lack of knowledge about a given UDP service does not mean the attackers don’t know about its weaknesses.

Moreover, TCP and UDP are not the only protocols served over IP. The ones I’ve seen the most are related to routing, encapsulation, and IPsec services. Tools such as Nmap support scanning for available protocols by sending IP packets with different protocol numbers and checking for ICMP error messages for protocols being unavailable. Please note that these messages could be filtered by the target or any filtering devices between you and the target. Covering the details of all the other IP protocols is outside the scope of this write-up, but knowing that they exist is a good starting point.

Another variable that can affect scan results is the vantage point from which you scan. Access to some services might be restricted by IP address blocks, typically aimed at blocking users outside a certain geographical area. Although scanning from allowed networks should produce better results, it might also be worth your while to sanity check the implementations of your blocking rules.

Scan for Network Services #

Now that you have identified the open ports in your target IP addresses, you’re naturally curious to find out which services the ports provide. The naive approach is to look up the port number from the registry of the Internet Assigned Numbers Authority (IANA). This approach, however, comes with a number of caveats.

Not all services have an assigned port number. Some services use dynamic ports or even port ranges. Many of the ports may be shared by multiple services, and it’s up to you to find out which service it is in your case. If a port is already in use on a server, it’s common for administrators to pick a new port number for a conflicting service.

A common case is to host multiple web services on one server, for example on ports 443, 8443, and 9443. Many administrators deliberately change the common port numbers to something else in order to evade garden-variety scanning and brute force attacks.

So, while the port numbers might give you an initial idea, it’s best to verify the findings by communicating with the target services. The method usually implemented in a given tool is a simple banner grab scan. In many TCP protocols, the service responds to a completed handshake with a hello packet, which is commonly called a banner. It may already contain volumes of information related to service name, software version, protocol version, and supported features. Some TCP protocols, such as the ubiquitous HTTP, expect the first packets containing payload data to come from the client. In these cases, as with UDP, you will need to send protocol-dependent payloads in order to get the banner.
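A banner grab along these lines can be sketched with the standard `socket` module. The function first waits for the service to speak; if nothing arrives before the timeout, it assumes a client-first protocol and sends a generic HTTP request. The function name, the choice of probe, and the timeout are assumptions for illustration, and real tools carry many more protocol payloads.

```python
import socket

HTTP_PROBE = b"HEAD / HTTP/1.0\r\n\r\n"


def grab_banner(host, port, wait=2.0):
    """Return the first bytes a service sends, probing with HTTP if it stays silent."""
    with socket.create_connection((host, port), timeout=wait) as sock:
        try:
            # Server-first protocols (e.g. SSH, SMTP, FTP) answer right away.
            return sock.recv(1024)
        except socket.timeout:
            pass  # silent service: the client is expected to speak first
        try:
            sock.sendall(HTTP_PROBE)
            return sock.recv(1024)
        except OSError:  # timeout or reset after the probe
            return b""
```

An empty result does not mean that nothing is listening, only that neither waiting nor the HTTP probe produced an answer within the timeout.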

There are many tools that include a number of different protocol payloads that can be used for banner grabbing. Many of them can also automatically extract at least some product and version information for your perusal. There are also scans for determining the operating system and its version from a target host. I’ve always thought of the results of these scans as educated guesses that are helpful in some cases but misleading in others.
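Extracting product and version information from a banner is mostly pattern matching. The sketch below covers just two banner formats, an SSH identification string and an HTTP `Server` header; the patterns and the function name are illustrative assumptions, and since banners can be altered or stripped by administrators, the output is an educated guess at best.

```python
import re

# Illustrative patterns only; real tools ship thousands of signatures.
PATTERNS = [
    ("ssh", re.compile(r"^SSH-[\d.]+-(?P<product>[^\s_]+)_?(?P<version>\S*)")),
    ("http", re.compile(
        r"^Server:\s*(?P<product>[^/\s]+)(?:/(?P<version>\S+))?",
        re.IGNORECASE | re.MULTILINE,
    )),
]


def identify(banner):
    """Guess service, product and version from raw banner bytes."""
    text = banner.decode("latin-1")
    for service, pattern in PATTERNS:
        match = pattern.search(text)
        if match:
            return {
                "service": service,
                "product": match.group("product"),
                "version": match.group("version") or None,
            }
    return None  # nothing recognised


print(identify(b"SSH-2.0-OpenSSH_8.9p1 Ubuntu-3\r\n"))
print(identify(b"HTTP/1.1 200 OK\r\nServer: nginx/1.18.0\r\n\r\n"))
```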

How to Discover Web Services #

As a majority of services are web-based, handling them deserves a section of its own. After banner grabbing a web service port, you might know which web server software, say, Apache, Nginx, or F5 BIG-IP, is used to handle incoming HTTP requests or terminate TLS connections. That does not, however, begin to answer questions such as: which code handles customer data? Getting an in-depth answer would take us down a deep rabbit hole, but let’s take a peek.

It is common for a single web server to host multiple domains and a variety of applications. Thus, upon receiving a request, a web server needs to discern which of the contents or applications to serve. The first step is to discover the domain name or IP address the request is meant for. Depending on the protocol, this information can be gathered from the Server Name Indication (SNI) extension in the TLS ClientHello, or from the Host header of the HTTP request. In the web server configuration, virtual hosts with different content can be defined by IP address, domain name, port, or a combination thereof.

When scanning for web services, you want to get as much coverage of the virtual hosts as you can. This is why it’s a good idea to make sure to perform the web scans using the domain names associated with the IP addresses of interest. Without the indication of the virtual host, you will only discover the default one as specified in the web server configuration.
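The effect of the virtual host indication can be sketched by connecting to an IP address directly and choosing the name yourself. The function below sets the HTTP Host header and, when TLS is requested, passes the same name as the SNI value through `server_hostname`. The function name and parameters are hypothetical, and certificate verification is left at Python’s defaults, which you may need to relax for hosts with self-signed certificates.

```python
import socket
import ssl


def fetch_status_line(ip, port, hostname, use_tls=False, timeout=5.0):
    """Send HEAD / to ip:port for a given virtual host; return the status line."""
    request = (
        f"HEAD / HTTP/1.1\r\nHost: {hostname}\r\nConnection: close\r\n\r\n"
    ).encode("ascii")
    sock = socket.create_connection((ip, port), timeout=timeout)
    try:
        if use_tls:
            context = ssl.create_default_context()
            # server_hostname is sent as SNI and used for certificate checks.
            sock = context.wrap_socket(sock, server_hostname=hostname)
        sock.sendall(request)
        data = sock.recv(4096)
    finally:
        sock.close()
    return data.split(b"\r\n", 1)[0].decode("latin-1")
```

Calling this once per known domain name against the same IP address, and comparing the answers to the one you get when using the bare IP address as the host name, gives a first indication of how many distinct virtual hosts sit behind the address.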

In many cases, the root path / either hosts the most important content related to a virtual host or redirects the user to it. If you stop scanning at that point, you will remain unaware of a great deal of the available attack surface. There can be many different paths and applications defined inside a virtual host configuration. They can be either packaged applications or frameworks, their customised versions or bespoke applications. Many applications also include different kinds of management interfaces, as a previous write-up by Lari Huttunen demonstrated.

Enumerate Attack Surface with the Help of Word Lists #

In short, there are a number of different paths that can be interrogated by an attacker. How to discover all of them? A surprisingly popular approach is to simply try all kinds of paths based on different word lists with a tool such as ffuf.
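The core idea can be sketched in a few lines with the standard `http.client` module. This is not how ffuf works internally, just a bare-bones illustration of the technique: request one path per word and keep the ones that do not return 404. It handles plain HTTP only, and the function name and user agent string are hypothetical.

```python
import http.client

# Hypothetical user agent; contact details let administrators reach you.
USER_AGENT = "exposure-scan/0.1 (contact: security@example.org)"


def probe_paths(host, port, words, timeout=5.0):
    """Request /<word> for each word; return {path: status} for non-404 answers."""
    found = {}
    for word in words:
        path = "/" + word.strip().lstrip("/")
        conn = http.client.HTTPConnection(host, port, timeout=timeout)
        try:
            conn.request("GET", path, headers={"User-Agent": USER_AGENT})
            status = conn.getresponse().status
        except (OSError, http.client.HTTPException):
            continue  # unreachable or malformed answer: skip this word
        finally:
            conn.close()
        if status != 404:
            found[path] = status
    return found
```

Treating everything except 404 as a hit is deliberately naive. Servers that answer 200 or a redirect for every path will flood the results, so a request for a random, surely non-existent path makes a useful baseline to compare against.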

Wordlist-based enumeration is also a useful method for:

Discovering subdomains of existing domains.
Searching for web shells and other backdoors left behind by the attackers.
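For the name discovery case, the same word list idea can be sketched as follows: build candidate host names under a domain you already know and keep the ones that resolve. The function name is hypothetical, and the resolver is passed in as a parameter (defaulting to the standard `socket.gethostbyname`) so that it can be swapped or rate limited.

```python
import socket


def find_subdomains(domain, words, resolve=socket.gethostbyname):
    """Return {name: address} for <word>.<domain> candidates that resolve."""
    found = {}
    for word in words:
        name = f"{word.strip().lower()}.{domain}"
        try:
            found[name] = resolve(name)
        except OSError:  # socket.gaierror: the name does not resolve
            continue
    return found
```

A call such as `find_subdomains("example.org", ["www", "vpn", "mail"])` would use your system resolver. Domains with wildcard DNS records resolve every candidate, so it is worth testing a random label first to avoid a list full of false positives.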

Wordlist-based scanning can be quite noisy and prone to false positives. Many application scanning tools such as Nuclei have thousands of different application templates that are much more limited in scope. OWASP hosts an extensive listing of tools that can be used in path discovery as well as vulnerability testing. You can either opt to use them exclusively for path discovery, or to validate any findings from wordlist-based scanning. Some word lists might give you insight into plugins used in frameworks such as WordPress or Drupal, or the usage of different libraries such as jQuery by different custom applications.

Of course, nothing stops a clever administrator from using non-standard paths for the installed applications. Or more generally, there’s a great deal that can be done on the web server or load balancer front to make the life of the attacker, or in this case the scanner, more difficult. Some examples include request limiting and user agent detection to thwart scanners. It can be a good idea to include your contact details in your user agent so that worried system administrators can get in touch with you. Note that it’s usually a good sign if someone does contact you as that indicates that someone is actually looking after the system.

You will inevitably run into services that will not respond to you in any meaningful way and there are many reasons for this.

A service may:

require authentication
only respond sensibly to HTTP methods other than GET or HEAD
require magic words beyond mellon to get any responses.

These services might not be your biggest headache, but they are something to push into your enumeration backlog.

Closing Words #

Reducing our attack surface cannot be based only on what we already know and what is obvious. Enumerating all the possible paths takes a lot of time and some effort from the attackers. Many of them won’t even try to cover all the bases and instead try to exploit a small set of high-severity vulnerabilities in products with a large install base. Nevertheless, defences should not rely on the inability of attackers to discover your resources.

In the next write-up in this series, I will go on to analyse and prioritise the findings from network exposure assessment and detail the later steps in the attack surface reduction process.
