One of the fundamental inputs to any network security plan is an understanding of the network boundary. This boundary, which defines what is reachable and from where, is also referred to as the attack surface. Both defenders and attackers have a keen interest in understanding an organization’s attack surface: defenders must understand what is exposed so they can prioritize what to maintain and monitor, and attackers must understand what is reachable so they can decide which services to target.
Randori’s Automated Attack Platform includes a black box discovery component that helps our system automatically discover an organization’s attack surface as part of the attack phase known as reconnaissance. Several techniques are involved in this process, one of which is the discovery of subdomains through certificate transparency logs.
Security on the web today is rooted in trust in certificate authorities (CAs) and the certificates they issue. A web server can prove its identity to a web browser by presenting a certificate signed by a certificate authority. Many certificate authorities publish all the certificates they issue in publicly available databases known as transparency logs.
One of the more recent ways a web browser decides to trust a particular certificate is by checking for proof that the certificate was recorded in the transparency logs, which ensures the certificate presented is one the authority has publicly reported issuing. However, the public availability of these certificates also means it is possible to search, for example, for all the certificates matching a particular domain name. This is the mechanism used in this discovery technique.
There are dozens of different transparency logs, but luckily there is a service that aggregates the major logs into a single searchable database: https://crt.sh
Using this service is easy. To find the certificates issued to the randori.com domain, just search for randori.com. At the time of this writing, the search turned up between 50 and 100 records covering around five different subdomains.
However, going to a website and searching isn’t always convenient, and not every way of searching is obvious from the web interface. For more complex queries, the crt.sh service provides direct database access, so custom PostgreSQL queries are possible.
To connect to the crt.sh database, use the psql utility:
psql -P pager=off -P footer=off -U guest -d certwatch --host crt.sh
In our case, we are looking for distinct subdomains. Compiling a list of known subdomains is accomplished with the following query:
select distinct(lower(name_value))
FROM certificate_and_identities cai
WHERE plainto_tsquery('randori.com') @@ identities(cai.CERTIFICATE)
AND lower(cai.NAME_VALUE) LIKE ('%.randori.com');
In plain English, this query says: return a list of all the unique domain names from the table of certificates where ‘randori.com’ appears anywhere in the identities (subject name and alt names), and exclude any that don’t end in .randori.com. Searching over the tokenized identities first lets the query complete very quickly, whereas the wildcard domain match alone would take significantly longer and would often time out.
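To make the second clause concrete, here is a minimal local sketch of the suffix filter. It only imitates the LIKE ('%.randori.com') test in plain shell and does not contact crt.sh. The helper name matches_suffix and the sample names are assumptions made up for illustration.

```shell
#!/bin/sh
# Local stand-in for the SQL predicate: lower(name) LIKE '%.<domain>'.
# matches_suffix is a hypothetical helper, not part of crt.sh or psql.
matches_suffix() {
    # Lowercase the candidate name, as lower(...) does in the query.
    name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case "$name" in
        *".$2") return 0 ;;   # ends in ".<domain>"
        *)      return 1 ;;   # anything else is filtered out
    esac
}

# Illustrative names only: a certificate whose identities mention the domain
# can also carry unrelated names, which this filter discards.
for n in www.randori.com '*.randori.com' randori.com notrandori.com randori.com.example.org; do
    if matches_suffix "$n" randori.com; then
        echo "keep $n"
    else
        echo "drop $n"
    fi
done
```

Note that the bare domain itself is dropped, because the pattern requires a dot before the domain. Wildcard entries such as *.randori.com are kept.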
If you have Docker installed, you can use the default postgres image, which includes the psql utility, instead of installing psql locally. Putting it all together, the following snippet can be saved as an executable named crt.sh on your path (or adapted as a function in your shell profile) to make subdomain discovery from the command line as simple as:
crt.sh <domain>
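#!/bin/sh
# Usage: crt.sh <domain>
if [ "x$1" = "x" ]; then
    echo "Usage: $0 domain-name"
    exit 1
fi

# Full-text match on the certificate identities, then keep names ending in .<domain>
Q="select distinct(lower(name_value)) FROM certificate_and_identities cai WHERE plainto_tsquery('$1') @@ identities(cai.CERTIFICATE) AND lower(cai.NAME_VALUE) LIKE ('%.$1')"

# sed drops the last line and the two header lines, and strips the leading space
docker run -it --rm postgres psql -P pager=off -P footer=off -U guest -d certwatch --host crt.sh -c "$Q" | sed -e '$d' -e 's/^ //' -e '1,2d'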
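Because the snippet pastes its argument straight into the SQL string, it may be worth rejecting anything that does not look like a domain name first. The following is one possible sketch of such a check. The helper is_valid_domain is hypothetical, and its rules (letters, digits, hyphens, and dots only, with at least one dot) are an assumption that is deliberately stricter than what DNS allows.

```shell
#!/bin/sh
# is_valid_domain is a hypothetical helper: it accepts only letters, digits,
# hyphens and dots, so quotes and spaces never reach the SQL string.
is_valid_domain() {
    case "$1" in
        ''|*[!A-Za-z0-9.-]*) return 1 ;;   # empty, or contains any other character
        .*|*.|*..*)          return 1 ;;   # leading, trailing, or doubled dot
        *.*)                 return 0 ;;   # at least one dot, e.g. example.com
        *)                   return 1 ;;   # a single label with no dot
    esac
}

# Possible use near the top of the script, before the query is built:
#   is_valid_domain "$1" || { echo "Invalid domain: $1" >&2; exit 1; }
```

This is only a guard against accidental or hostile input on the command line. It does not check that the domain actually exists.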
Try it out!
$ crt.sh randori.com
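The resulting list can contain wildcard entries, and output captured through a Docker terminal may carry stray whitespace or carriage returns. As a rough sketch, a small filter like the one below could tidy the list before it is passed to other tools. The helper name normalize_names is an assumption for illustration and is not part of crt.sh.

```shell
#!/bin/sh
# normalize_names is a hypothetical helper: it reads one name per line on
# stdin, lowercases it, trims surrounding whitespace and carriage returns,
# drops a leading "*." wildcard label, removes blank lines, and de-duplicates.
normalize_names() {
    tr '[:upper:]' '[:lower:]' \
        | tr -d '\r' \
        | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e 's/^\*\.//' -e '/^$/d' \
        | sort -u
}

# Illustrative input only; in practice the input would be the script's output:
#   crt.sh randori.com | normalize_names
printf '%s\n' '*.randori.com' 'www.randori.com' 'WWW.randori.com' | normalize_names
```

Stripping the wildcard label turns *.randori.com into the bare domain, so the tidied list can include the parent domain alongside its subdomains.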