Working of a Domain Generation Algorithm (DGA)

1. What is a Domain Generation Algorithm (DGA)?

Earlier domain-based attack methods had a key limitation: attackers depended on a small, fixed set of domains or IP addresses, which made it easier for defenders to recognize patterns and disrupt command-and-control (C2) communication. Modern malware, however, employs more sophisticated techniques such as encrypted channels and DGAs, which bypass traditional blacklisting and necessitate advanced detection methods.

Among these, DGAs have become a powerful technique used by adversaries, classified under MITRE ATT&CK as sub-technique T1568.002 (Dynamic Resolution: Domain Generation Algorithms, Sub-technique T1568.002 – Enterprise | MITRE ATT&CK®, no date).

At its core, a DGA is a deterministic algorithm that generates domain names based on specific input values (What is a DGA?, no date). Here’s how it works:

  • The algorithm produces domain names using predefined seed values. Because it is deterministic, the same input will always generate the same set of domains. If the seed value changes, the resulting domains will also change.
  • Malware containing the DGA code spreads to victim systems, where it dynamically generates domain names.
  • The attacker, having an identical copy of the DGA with the same seed input, generates the same list of domains on their end.
  • These dynamically generated domains serve as endpoints for C2 servers, making it significantly harder for security teams to blacklist suspicious domains since they are constantly changing.
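The determinism described above can be sketched in a few lines of Python. This is a toy illustration only, not the algorithm of any real malware family: the seed string, the use of the current date as a second input, the SHA-256 hash, and the function name generate_domains are all assumptions made for this example.

```python
import hashlib
from datetime import date
from typing import List


def generate_domains(seed: str, day: date, count: int = 5, tld: str = ".com") -> List[str]:
    """Derive `count` pseudo-random domain names from a shared seed and a date."""
    domains = []
    for index in range(count):
        # Hash the seed, the date, and a counter so that each domain differs.
        material = f"{seed}-{day.isoformat()}-{index}".encode()
        digest = hashlib.sha256(material).hexdigest()
        # Map the first 12 hex digits onto the letters a-p to build the label.
        label = "".join(chr(ord("a") + int(ch, 16)) for ch in digest[:12])
        domains.append(label + tld)
    return domains


if __name__ == "__main__":
    today = date(2025, 1, 2)
    malware_side = generate_domains("example-seed", today)
    attacker_side = generate_domains("example-seed", today)
    # Both sides hold the same seed and date, so they derive the same list.
    print(malware_side == attacker_side)
    # A different seed yields a different list.
    print(generate_domains("other-seed", today) == malware_side)
```

Because both parties run the same function on the same inputs, no domain list ever needs to be transmitted between them. Defenders who recover the function and the seed can compute the same list, which is what makes reverse engineering worthwhile.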

By frequently rotating domains, DGAs provide a resilient mechanism for malware communication, challenging defenders to develop more sophisticated countermeasures for effective mitigation.

The figure below shows a typical workflow of malware communicating with a C2 server. Later, we will add a DGA to this workflow to understand it better.

Fig.1. Standard C2 communication.

2. Attacker Workflow

When attackers create a DGA, they keep a copy for themselves, which they use to produce the same list of generated domains (CEH, 2024; What is a DGA?, no date).

The attacker registers only one or a few of the generated domains with a domain registrar (such as GoDaddy) so that they appear legitimate. Registering a small subset makes it harder for defenders to predict which domains will be active, and it is also cheaper than registering all of them. The selection usually depends on the cost of the TLD and on whether the domain has already been registered by someone else.

If the domain is available, the attacker registers it using fake information to stay anonymous and pays with cryptocurrency or stolen credit cards so that the transaction cannot be linked back to them. They do the same for the other selected domains.

After registration, the registered domains become active, and the attacker has two options:

  1. Point all the registered domains to a single Command-and-Control (C2) server by setting each domain’s DNS records to resolve to that server’s IP address. For example, if “example-malicious.com”, “thisisfake.com”, and “securecloud.org” are the registered domains, their DNS A records all point to one IP address, 192.0.2.123, which is the IP of the attacker’s C2 server.
  2. Use multiple C2 servers, with each registered domain pointing to a different server that has its own IP address. This is the option adversaries usually choose: if one of the domains is detected and blocked by security teams, the malware can resolve the other registered domains to their respective C2 servers and keep communicating.
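The difference between the two options can be modelled with two small lookup tables. This is a simplified sketch rather than a real DNS configuration: the domains and 192.0.2.123 come from the example above, while the other two addresses (taken from the reserved documentation ranges) and the helper usable_domains are assumptions made for illustration.

```python
# Option 1: every registered domain has an A record for the same C2 server.
single_c2 = {
    "example-malicious.com": "192.0.2.123",
    "thisisfake.com": "192.0.2.123",
    "securecloud.org": "192.0.2.123",
}

# Option 2: each registered domain points to its own C2 server.
# The second and third addresses are made-up documentation-range IPs.
multi_c2 = {
    "example-malicious.com": "192.0.2.123",
    "thisisfake.com": "198.51.100.7",
    "securecloud.org": "203.0.113.45",
}


def usable_domains(records, blocked_domains=frozenset(), blocked_ips=frozenset()):
    """Return the domains that still lead to a reachable C2 server."""
    return sorted(
        domain
        for domain, ip in records.items()
        if domain not in blocked_domains and ip not in blocked_ips
    )


if __name__ == "__main__":
    # Blocking one domain leaves the others usable under either option.
    print(usable_domains(single_c2, blocked_domains={"thisisfake.com"}))
    # Blocking the single server's IP cuts off option 1 entirely...
    print(usable_domains(single_c2, blocked_ips={"192.0.2.123"}))
    # ...while option 2 still has two working domains.
    print(usable_domains(multi_c2, blocked_ips={"192.0.2.123"}))
```

With a single server, every domain shares one point of failure. Spreading the domains over several servers removes it, which is one plausible reason adversaries tend to prefer the second option.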

With their side set up, the attacker creates the malware that will be executed on the target system. This malware uses the same DGA and generates the same domain list.

The malware tries to resolve every domain in its list to an IP address. The non-registered domains do not resolve, because they are inactive (they have no IP addresses configured). When the malware queries an attacker-registered domain, however, it resolves to the IP address that the attacker set up for it, i.e., the address of a C2 server. The infected system then communicates with the attacker’s C2 server(s).

The malware contains a domain resolution module that performs these queries. It uses a standard API, e.g., gethostbyname(), to perform a DNS lookup for each domain, cycling through the list and attempting to resolve the domains one by one. Only the domains that the attacker registered resolve to an IP address.
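From a defender’s point of view, it helps to see how simple this resolution loop is, because it explains the burst of failed lookups that infected hosts produce. The sketch below uses Python’s socket.gethostbyname as the default resolver but lets a stand-in be injected, so it runs without touching the network. The function names, the offline fake_resolver, and the domain “qwpzkd.com” are hypothetical and exist only for this example.

```python
import socket
from typing import Callable, Iterable, Optional, Tuple


def first_resolving_domain(
    domains: Iterable[str],
    resolver: Callable[[str], str] = socket.gethostbyname,
) -> Optional[Tuple[str, str]]:
    """Try each domain in order and return the first (domain, ip) that resolves."""
    for domain in domains:
        try:
            return domain, resolver(domain)
        except OSError:
            # socket.gaierror is a subclass of OSError; names with no
            # DNS record (the unregistered domains) end up here.
            continue
    return None


def fake_resolver(domain: str) -> str:
    """Offline stand-in for DNS: only one made-up domain is 'registered'."""
    registered = {"thisisfake.com": "192.0.2.123"}
    if domain not in registered:
        raise socket.gaierror(f"no record for {domain}")
    return registered[domain]


if __name__ == "__main__":
    candidates = ["qwpzkd.com", "thisisfake.com", "securecloud.org"]
    print(first_resolving_domain(candidates, resolver=fake_resolver))
```

Every domain tried before the first hit shows up in DNS logs as a failed lookup, and that trail of failures is one of the signals defenders look for.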

After resolving a domain to the IP address of a C2 server, the malware uses its communication protocol module to connect to that server over network protocols such as HTTP(S), TCP, or UDP.

The attacker can scale this attack out to a botnet. This happens when multiple systems are infected with the malware and all of them communicate with the C2 server(s) as part of the botnet. This communication can take place in two ways:

  • Each infected system talks to one C2 server at a time, and the remaining domains serve as backups in case one is disrupted. This is useful for smaller botnets or targeted attacks.
  • The infected systems communicate with all of the C2 servers simultaneously, spread across the registered domains. This is useful for load balancing in large botnets.

Applying a DGA to the basic C2 workflow from Fig. 1, we get:

Fig.2. DGA-based C2 communication.

3. What can the attackers do now?

With a working C2 channel, attackers can do a great deal:

  • Data Exfiltration: Attackers use the C2 connection to transfer stolen files from the victim’s system to their own servers.
  • Remote Access and Control: Attackers can execute commands, upload/download files, and interact with the system as if they were physically present. See my other article for more on remote access and control with remote access trojans (RATs): https://kkalvani.wixsite.com/my-site/post/cybersecurity-awareness-month
  • Additional Malware Installation: Attackers may download and install other malware, such as ransomware, spyware, keyloggers, or cryptocurrency miners.

The list goes on. These capabilities enable lateral movement, credential harvesting, data encryption (ransomware infection), spying on the victim, persistence and evasion, weaponizing the bot for a DDoS attack, and more.

4. What are security teams doing about this?

Security teams defend against DGA-based attack techniques in the following ways (Team, 2024):

Reverse Engineer the DGA: Analyze the malware to understand the algorithm and predict the domains.

Use DNS Filtering and Analysis: Look for domains with high entropy (randomness) or other random-looking patterns.
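One common way to quantify “randomness” is the Shannon entropy of the characters in a domain label. The sketch below is a minimal version of that idea. The function names and the sample domain “xkqzvbnmtrwp.com” are assumptions made for illustration. Entropy on its own is only a rough signal, since short or dictionary-based DGA domains can score low and some legitimate domains score high.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy, in bits per character, of the characters in `text`."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def label_entropy(domain: str) -> float:
    """Entropy of the leftmost label, e.g. 'google' in 'google.com'."""
    return shannon_entropy(domain.lower().split(".")[0])


if __name__ == "__main__":
    for name in ["google.com", "xkqzvbnmtrwp.com"]:
        print(name, round(label_entropy(name), 2))
```

In practice a score like this would be combined with other signals rather than compared against a single fixed threshold.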

Behavioural Analysis: Identify unusual DNS activity from infected systems.

Collaborate with Registrars: Work with domain registrars to pre-emptively block or sinkhole predicted domains. Sinkholing works when security teams successfully predict the domain-generation pattern from the malware. They then either register the predicted domains or, usually by working with domain registrars, take control of those that are already registered. This makes the domains “active”, but instead of resolving to the attacker’s C2 server, they are redirected to a controlled, safe server (the “sinkhole”).

This redirection process allows security teams to:

  • Analyze the traffic: They can study the malware’s behaviour and track the infected systems.
  • Prevent communication: Malware cannot establish a connection with the attacker’s C2 server, effectively disrupting the attack.
  • Monitor infections: By observing which systems are attempting to connect to the sinkhole, defenders can identify infected machines in a network.
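The “monitor infections” step can be sketched as a simple pass over DNS query logs. The log format used here, a list of (client IP, queried domain) pairs, along with the sample domains and the function name infected_hosts, are assumptions for this example. Real sinkhole and resolver logs carry more fields and differ between products.

```python
from collections import defaultdict


def infected_hosts(dns_log, sinkholed_domains):
    """Map each client IP to the sinkholed domains it tried to resolve."""
    hits = defaultdict(set)
    for client_ip, queried_domain in dns_log:
        # Normalise case and any trailing dot before comparing.
        name = queried_domain.lower().rstrip(".")
        if name in sinkholed_domains:
            hits[client_ip].add(name)
    return dict(hits)


if __name__ == "__main__":
    sinkholed = {"abcd.com", "efgh.net"}  # hypothetical predicted DGA domains
    log = [
        ("10.0.0.5", "abcd.com"),
        ("10.0.0.5", "example.org"),
        ("10.0.0.9", "EFGH.net."),
        ("10.0.0.7", "example.org"),
    ]
    for host, domains in sorted(infected_hosts(log, sinkholed).items()):
        print(host, sorted(domains))
```

Any client that appears in the result has tried to reach a domain that only the malware should know about, which makes it a strong candidate for investigation.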

Domain monitoring and blocking

Security teams monitor domain registrations and traffic for indicators of compromise (IoCs) that suggest DGA activity, such as non-human-readable domain names or rapid bursts of domain lookups, then flag the domains and add them to blocklists.

AI and ML models

Machine learning models are trained to distinguish between legitimate and DGA-generated domains based on features such as length and character composition.
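As a rough idea of what “features” means here, the sketch below turns a domain into a handful of numbers that a classifier could consume. The particular features (digit ratio, vowel ratio, unique-character ratio) and the function name domain_features are illustrative assumptions, not the feature set of any specific product, and no model is trained in this example.

```python
def domain_features(domain: str) -> dict:
    """Compute simple lexical features from the leftmost label of a domain."""
    label = domain.lower().split(".")[0]
    length = len(label)
    if length == 0:
        return {"length": 0, "digit_ratio": 0.0, "vowel_ratio": 0.0, "unique_char_ratio": 0.0}
    digits = sum(ch.isdigit() for ch in label)
    vowels = sum(ch in "aeiou" for ch in label)
    return {
        "length": length,
        "digit_ratio": digits / length,
        "vowel_ratio": vowels / length,
        "unique_char_ratio": len(set(label)) / length,
    }


if __name__ == "__main__":
    for name in ["google.com", "x7k2q9zt.net"]:
        print(name, domain_features(name))
```

Vectors like these, labelled as legitimate or DGA-generated, would form the training data for whichever model a team chooses.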

DNS traffic analysis

Security tools analyze DNS query logs for unusual patterns, such as high volumes of queries to non-existent (non-registered) domains or to domains that do not resolve. With this, they can track which infected systems attempt to resolve DGA-generated domains.
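A minimal version of this check counts failed lookups per client and flags the hosts that exceed a threshold. The log format of (client IP, domain, response code) tuples, the threshold value, and the function name flag_nxdomain_heavy_clients are assumptions for this sketch. NXDOMAIN is the standard DNS response code for a name that does not exist.

```python
from collections import Counter


def flag_nxdomain_heavy_clients(dns_log, threshold=3):
    """Return the clients with at least `threshold` NXDOMAIN responses, sorted."""
    failures = Counter(
        client for client, _domain, rcode in dns_log if rcode == "NXDOMAIN"
    )
    return sorted(client for client, count in failures.items() if count >= threshold)


if __name__ == "__main__":
    log = [
        ("10.0.0.5", "a.com", "NXDOMAIN"),
        ("10.0.0.5", "b.com", "NXDOMAIN"),
        ("10.0.0.5", "c.com", "NXDOMAIN"),
        ("10.0.0.5", "d.com", "NOERROR"),
        ("10.0.0.8", "e.com", "NXDOMAIN"),
        ("10.0.0.8", "example.org", "NOERROR"),
    ]
    print(flag_nxdomain_heavy_clients(log, threshold=3))
```

A real deployment would count within a time window and tune the threshold to the network, since ordinary typos and misconfigured software also produce some failed lookups.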

Threat Intelligence Feeds

Security tools integrate with threat intelligence platforms to maintain up-to-date lists of known DGA patterns, algorithms, and active malicious domains.

5. References

CEH, S.B.C., CCSP, CISM, OSCP (2024) ‘Protecting Against Cyber Threats: The use of Domain Generation Algorithm (DGA) by threat actors’, Medium, 2 March. Available at: https://osintph.medium.com/protecting-against-cyber-threats-the-role-of-domain-generation-algorithm-dga-80c3ec3cda9f (Accessed: 2 January 2025).

Dynamic Resolution: Domain Generation Algorithms, Sub-technique T1568.002 – Enterprise | MITRE ATT&CK® (no date). Available at: https://attack.mitre.org/techniques/T1568/002/ (Accessed: 2 January 2025).

Team, V. (2024) Demystifying Domain Generation Algorithms, Vercara. Available at: https://vercara.com/resources/demystifying-domain-generation-algorithms (Accessed: 2 January 2025).

What is a DGA? (no date) Search Security. Available at: https://www.techtarget.com/searchsecurity/definition/domain-generation-algorithm-DGA (Accessed: 2 January 2025).

AUTHORS

Krish Kalvani


M.Tech in Cybersecurity, Batch 3
Academic Year 2024-25
