I was invited by @Ninoseki to the phishing kit workshop that he created and ran with his buddies @sepi140 and @papa_anniekey during the AVTokyo 2020 event. Thanks guys, it was really a great idea to do that for people.
Here is my humble contribution to this workshop.
Workshop slides are available here:
In real life, when you have a phishing kit, the idea is to find out as fast as possible where the stolen data is sent. You do that in order to get the takedown processed as soon as possible and bother the guy stealing creds or credit cards. Stolen credentials are reported by email most of the time, but they can also be written to a local file or even re-posted somewhere over HTTP. So, usually you may just have a dedicated server with sinkholing and monitoring on email/web to see what happens…
Look at inetsim, it is wonderful for building that. We will maybe talk about finding phishing kits in the wild in a future post.
However, sometimes it’s more complicated: the kit fetches external files, has an obfuscated admin, etc… Well.. then you have to dive into it in order to understand it.
Today the goal is to learn how obfuscation is usually performed in PHP in such phishing kits. I will try to keep this “writeup” of one of the workshop exercises suitable for beginners. The idea is to learn how obfuscation is done, with tips on de-obfuscating things using simply the Linux console. Don’t hesitate to ping me if something is not clear. We will use a standard Linux workstation; Ubuntu will do the job.
Remember there are plenty of nice websites to help you with de-obfuscation; everything is in the workshop PDF:
https://gchq.github.io/CyberChef/
http://ddecode.com/phpdecoder/
Or even this one: I once made a small contribution (PHP eval in Python… because we can), which does not work on this sample :)
https://github.com/Th4nat0s/Chall_Tools/blob/master/phpeval.py
Let’s start… We will de-obfuscate 16Shop-AMZ-V19.zip (703067054ce912f5a341c4b65b3ebbdf). First, we start by extracting the archive. I recommend using 7z, which is available in the package p7zip-full. It lets you uncompress on Linux even zip files using AES encryption, or rar or whatever, where the open source zip tool is sometimes unable to uncompress your file.
7z x 16Shop-AMZ-V19.zip
If you take a look at the index file, you will notice the inclusion of auth.php and a reference to amz2.php, which is not included in this archive.. Weird :)
And if you look at auth.php, you will already notice some trouble ;).. This is typical, that’s what we are looking for… It is a mess, squeezed into a few lines.
We will use the sed command to simply convert “;” into “;” + newline. Before that, I always use dos2unix to ensure that a file with CRLF line endings (usually edited on Windows) is converted to the simple LF character that Unix uses. That’s a good habit to have, and if the file already uses LF, it will not break anything. We will explain this strange sed command later.
cat auth.php | dos2unix | sed 's/;/;\n/g' > layer1.txt
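As a quick sanity check, the same split can be tried on a tiny made-up sample before running it on the real file. This is only a sketch: split_statements is a hypothetical helper name, tr -d '\r' stands in for dos2unix in case it is not installed, and the \n in the replacement assumes GNU sed.

```shell
# Hypothetical helper: strip Windows CRs, then break the line after every ";"
split_statements() {
    tr -d '\r' | sed 's/;/;\n/g'
}

# Made-up one-liner standing in for the packed auth.php
printf '$a="x";$b="y";eval($a);\r\n' | split_statements
```

Each statement should now sit on its own line, which makes the later grep/head/cut steps much easier.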
Looking at the file now…
Ok.. It shuts up the warnings at line 1, then it defines and opens a file which is “itself” and seeks into it (reads ahead). We notice that it jumps in this file to the constant __COMPILER_HALT_OFFSET__. But what is this constant?? If you look at the documentation, it is related to the PHP function __halt_compiler(), which stops the PHP execution when reached. The constant is an integer holding the byte offset where the data following __halt_compiler() starts.
https://www.php.net/manual/en/function.halt-compiler.php
Indeed, __halt_compiler() is used at the end of the script, with a bunch of garbage data after it.
Ok.. Let’s dig a bit further. Remember this: when de-obfuscating, we look for eval(). eval() is a dirty function that executes the code given as its parameter. The parameter is a string containing PHP code to execute. But of course it will be obfuscated. And that’s the way it is always done.. They hide the strings with the most spaghetti’ble code possible… and then they eval() it.
So for obfuscating, they usually use:
Substitution… a lot… For example:
$aNjE4NjIwNDEzd="\x62\x61\163\145\x36\64\137\144\x65\143\x6f\x64\x65"
It’s simply character encoding: \xNN is the char code in hexadecimal, \NNN is the char code in octal. Now you can understand that we have many substitutions here… It puts the value “h” in the variable $a… etc…
What we can see here are replacements ($X = “Y”). In the Linux console you may replace strings with sed, yes… well.. if you prefer your Notepad++, no problem, do as you can :) (NB: I hate nano)
The function hex2bin() is hidden here with a double substitution :) … \x68 = “h”, \x65 = “e”, \x78 = “x”… then string concatenation $a$b$c…
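To check such a string without running any PHP, the escapes can be handed to printf. This is a small sketch that assumes bash, whose builtin printf accepts both the hexadecimal and the octal form in its format string; the variable name is only for illustration.

```shell
# Assumes bash: its printf understands \xHH (hexadecimal) and \NNN (octal)
name=$(printf '\x62\x61\163\145\x36\64\137\144\x65\143\x6f\x64\x65')
echo "$name"
```

The decoded value is the function name hidden behind $aNjE4NjIwNDEzd.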
$a="\x68"; $b="\x65"; $c="\x78"; $d="\x32"; $e="\x62"; $f="\x69"; $g="\x6e"; $kMTUwNjIyNTU5s="$a$b$c$d$e$f$g";
Be careful, there are many substitutions in this file. In fact, you may see it’s “twice the same shit”.
$ cat layer1.txt | grep base
$aNjE4NjIwNDEzd = "base64_decode";
$aNzk0MjAxOTMyd = "base64_decode";
Again, you may use sed. sed has this syntax: “s/in/out/g”. s means substitute: search for “in” and replace it with “out”. g means do it for every occurrence found. We protect the $ char with \ since sed is regular-expression aware. Be careful with that, you will probably need to protect a lot of chars: “.\/[]*^$”
cat layer1.txt | sed 's/\$aNjE4NjIwNDEzd/base64_decode/g'
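Several aliases can be renamed in one pass by chaining -e expressions. The sketch below only covers the two aliases identified so far in this write-up; resolve_aliases is a hypothetical helper name and the sample line is made up.

```shell
# Hypothetical helper: rename the two aliases identified in the text.
# "$" is escaped so sed treats it as a literal dollar sign, not end-of-line.
resolve_aliases() {
    sed -e 's/\$aNjE4NjIwNDEzd/base64_decode/g' \
        -e 's/\$kMTUwNjIyNTU5s/hex2bin/g'
}

# Made-up sample line using both aliases
printf '%s\n' 'eval($kMTUwNjIyNTU5s($aNjE4NjIwNDEzd("...")));' | resolve_aliases
```

Adding one -e per alias found with grep keeps the whole renaming step in a single, repeatable command.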
Well, I think everybody got the point now… It simply substitutes the function names.
eval("?>".$kMTUwNjIyNTU5s($det($wMTQ4NjQyMzE0z($gODY5MjczNzY2f($aNjE4NjIwNDEzd("7U0Jrt...
means…
eval("?>".hex2bin(str_rot13(gzinflate(str_rot13(base64_decode("7U0Jrts...
This kind of obfuscation is used very often: a function in a function in a function. Personally, I call that “obfuscation matryoshka”. And this scheme seems to be used twice in this file. The PHP file looks like it is built like this:
- [open file and seek]
- [substitution]
- [eval1 matryoshka]
- [substitution]
- [eval2 matryoshka]
- [garbage]
To extract a “field” from this mess, you may use cut.. Try to see this line as a table in Excel. If I use the char " as the column delimiter, the line splits into the following columns, and if you select the 4th column, you have extracted the base64 blob.
eval("?>".hex2bin(str_rot13(gzinflate(str_rot13(base64_decode("7U0Jrts...
- COL 1: eval(
- COL 2: ?>
- COL 3: .hex2bin(str_rot13(gzinflate(str_rot13(base64_decode(
- COL 4: 7U0Jrts...
“grep” may help you to lock onto the right line and work on it…
In Linux, most obfuscation primitives are easily available, and you can play with them and pipe them together, which is useful.
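Put together, grep, head and cut give a small field extractor. This is a sketch on a made-up line with the same shape as the real one; extract_payload is a hypothetical helper name.

```shell
# Hypothetical helper: keep the first line containing "eval",
# then take the 4th field when '"' is used as the delimiter
extract_payload() {
    grep 'eval' | head -n 1 | cut -d '"' -f 4
}

# Made-up line with the same shape as the real one
printf '%s\n' 'eval("?>".hex2bin(base64_decode("QUJD")));' | extract_payload
```

Only the quoted blob comes out, ready to be piped into the next decoding step.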
base64_decode() is “base64 -d”
str_rot13() is “tr 'A-Za-z' 'N-ZA-Mn-za-m'”
strrev() is “rev”
hex2bin() is “xxd -r -p”
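These equivalents can be piped in the same order PHP would apply them, innermost call first. The sketch below uses a made-up string that was layered as base64, then reversed, then rot13; the function names are only illustrative wrappers.

```shell
# Console stand-ins for two of the PHP primitives listed above
rot13()    { tr 'A-Za-z' 'N-ZA-Mn-za-m'; }
unbase64() { base64 -d; }

# Made-up layered string: base64_decode first, then strrev, then str_rot13
printf '%s\n' 'Ynl5cnU=' | unbase64 | rev | rot13
```

Reading a PHP matryoshka from the inside out gives the order of the pipe from left to right.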
You may also just edit the code, replace the eval() by a print() and run it with php to reveal the next level, then put the result back into your file. Do as you want! But don’t miss any eval() :)
So, that’s how obfuscation works. It’s fun to do it by hand.. But well, you know, after tons of PHP de-obfuscation, it’s not so fun at all :) … Let’s use a de-obfuscator tool to make it more readable… It will replace hex codes (\x41 becomes “A”), help you with formatting by putting new lines where needed, and perform some evaluation. Sometimes it gets stuck in the middle and you have to help it a bit.
We will install PHPDeobfuscator from simon816.
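The eval-to-print swap can also be done with sed. This is a sketch that assumes every eval has the eval("?>". shape seen in this sample; eval_to_print is a hypothetical helper name. Work on a copy: a later layer of this kit looks for echo or print in its own file and deletes it.

```shell
# Hypothetical helper: turn eval("?>".<expr>) into print(<expr>)
eval_to_print() {
    sed 's/eval("?>"\./print(/g'
}

# Sample line taken from a later layer of the kit
printf '%s\n' 'eval("?>".$scream($cet));' | eval_to_print
```

The parentheses stay balanced, so the rewritten line is still valid PHP and prints the next layer instead of running it.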
$ sudo apt-get install php
$ git clone https://github.com/simon816/PHPDeobfuscator.git
$ cd PHPDeobfuscator/
$ composer install
Then use it to clean up your source and write the result into layer1.txt
php index.php -f ../auth.php > layer1.txt
We will look at the eval thing… Everything was already handled by PHPDeobfuscator, but it is stuck on a hex2bin. If you look at the code, eval will “run” the obfuscated code behind the hex2bin. To convert hexadecimal into binary data, you may use xxd, which can reverse a hex dump.
Let’s do it for fun with the command line.
cat layer1.txt | grep eval | head -n 1 | cut -d '"' -f4 | xxd -r -p
With grep we select the lines with “eval” in them, then with “head -n 1” we keep the first one, then with cut we use " as the column delimiter and extract the 4th column, which is the hexadecimal data, and finally xxd -r (reverse) -p (plain hex) prints it decoded. If you want the last line, use the tail command instead of head.
Run PHPDeobfuscator again and you will find another hex sequence…. same story again… xxd :)
So the idea is to put the code you have decoded back into the source file, where the obfuscated eval was. Remember you can always wipe a line and mark things with sed. Here we replace everything from “eval” up to the end of the line with “XXX PUT CODE HERE XXX”. It will help you find where to put the new code. But your Notepad++ is fine too :) You also have to fix the PHP markers (<?php) and remove the useless ones.
php ./index.php -f layer3.txt | sed 's/eval.*$/XXX PUT CODE HERE XXX/' > layer4.txt
Let’s dig into this mess.. It looks like twice the same thing, but it’s not: there are multiple passes. So after a while of decoding, replacing and printing :) you will find another “substitution” scheme.
So, first it reverses the string $cet, then it does…
$cet=strrev($cet);
$cet= str_replace("#watefuk#","g",$cet);
$cet= str_replace("#easy#","n",$cet);
$cet=str_replace("__%","z",$cet);
$cet=str_replace('..]\^_',"R",$cet);
$cet=str_replace(">^ol%%","I",$cet);
$cet=str_replace('[..%__',"d",$cet);
eval("?>".$scream($cet));
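The same steps can be replayed from the console: rev for strrev, one sed expression per str_replace in the same order, and base64 -d for $scream, which resolves to base64_decode in this sample. This is a sketch: decode_cet is a hypothetical helper name and the input is made up (the word "world", base64-encoded, with its "d" swapped for the kit's token, then reversed).

```shell
# Hypothetical helper mirroring the PHP above: strrev, the six str_replace
# calls in the same order, then base64_decode (what $scream resolves to).
# Regex-special characters in the tokens (".", "[", "\") are neutralised.
decode_cet() {
    rev | sed -e 's/#watefuk#/g/g' \
              -e 's/#easy#/n/g' \
              -e 's/__%/z/g' \
              -e 's/[.][.]]\\^_/R/g' \
              -e 's/>^ol%%/I/g' \
              -e 's/\[[.][.]%__/d/g' | base64 -d
}

# Made-up input, not data from the kit
printf '%s\n' '=QGby92__%..[' | decode_cet
```

The order of the expressions matters, since some tokens share characters with others, so it is kept identical to the PHP.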
Again, a lot of stupid string substitutions… $cet is replaced by its own reverse… In bash you could use rev.. $scream is base64_decode. So after a couple of evaluations like this…
You will finally find the final code, with the magic titidsss function.
eval("?>".titidsss(stream_get_contents(HELLO)));
It is defined within the two hex2bin blobs, which contain…
error_reporting(0);
$file = file_get_contents(LOCALFILE);
if(preg_match('/echo/',$file) or preg_match('/print/',$file)){
unlink(LOCALFILE);
die("X_X");
}else{
function replacedata($data) {
$data = str_replace("#_^wkss|#","g",$data);
$data = str_replace("#as|k#","n",$data);
return $data;
}
and….
function titidsss($data) {
$data = replacedata($data);
$data = str_replace("_%","z",$data);
$data = str_replace("..]\^",'R',$data);
$data = str_replace(">^ol%%","I",$data);
$data = str_replace("[..%",'d',$data);
return base64_decode($data);
}
And if you take a look at the beginning of the file, remember what we found there.
error_reporting(0);
define('LOCALFILE', "/var/www/html/auth.php");
define('HELLO', fopen("/var/www/html/auth.php", 'r'));
fseek(HELLO, __COMPILER_HALT_OFFSET__);
So now you know… HELLO points at the data located “after” __COMPILER_HALT_OFFSET__, that is, after the __halt_compiler() call.
This garbage at the end of the file is actually the real code, obfuscated. Let’s extract the data.. First, we create a file named hidden_data.txt containing the last line.
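If the trailing data does not sit neatly on one line, it can also be cut out by byte offset, roughly what PHP does with the fseek. This is a sketch under assumptions: after_halt is a hypothetical helper name, the data is assumed to start right after the first "__halt_compiler();", and the grep options are the GNU ones.

```shell
# Hypothetical helper: print everything that follows the first "__halt_compiler();"
after_halt() {
    # Byte offset of the marker (GNU grep: -a text mode, -b offsets, -o match only)
    off=$(grep -abo '__halt_compiler();' "$1" | head -n 1 | cut -d: -f1)
    [ -n "$off" ] || return 1
    # Skip the 18 bytes of the marker itself; tail -c +N starts at byte N (1-based)
    tail -c "+$((off + 19))" "$1"
}

# Made-up file standing in for auth.php
sample=$(mktemp)
printf '%s' '<?php echo 1; __halt_compiler();RAWDATA' > "$sample"
after_halt "$sample"
rm -f "$sample"
```

On the made-up file only the part after the marker is printed; on the real file that would be the obfuscated blob read through HELLO.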
cat layer1.txt | tail -n 1 > hidden_data.txt
Add the titidsss and replacedata functions found previously (with their $data substitutions), edit the first line to put the garbage into a variable, and finally print the decoded base64 instead of evaluating it.
Then just launch it…
php hidden_data.txt
And you will be victorious. You get the code with clear functions, and you will see that it retrieves additional data for the phishing kit from the server configured in the file server.ini. You will now understand the code in index.php, its references to the mysterious missing file, and how to eventually fetch the additional things using a POST.
I hope this helps somebody understand PHP obfuscation and makes you want to have fun hunting phishing kits.
Thanks again 🙏 to @sepi140, @papa_anniekey and @ninoseki for teaching people how to handle phishing kits. On my side, it was really fun to hear “[japanese ciphered_text]…….Thanat0s-san…..[japanese_ciphered_text]” without understanding anything of the sentence :).
Nice event