The previous post in this series analyzed an attack that relied on webshells. Such attacks show how difficult it is to secure an entire infrastructure against them. This part evaluates the available detection tools and provides prevention and mitigation recommendations.
Webshell detection tools
I evaluated the following projects that focus on webshell detection:
- NeoPI
- Shell Detector
- LOKI
These tools were tested against the files presented in part 1, with the addition of a few new ones:
- byroe.jpg - webshell hidden in an image file
- myluph.php - example of PHP webshell
- webshell.php - simple PHP webshell presented in part 1
- vero.txt - PHP webshell containing both “clean” and obfuscated PHP code
- myluphdecoded.php - decoded file myluph.php
- China Chopper - ASPX version chinachopper.aspx and PHP version chinachopper.php
- c99madshell.php - popular C99 webshell
- unknownPHP.php - shared by Bart in his blog post
The conducted tests verified the detection accuracy of all tools when faced with a combination of different webshells mixed with hundreds of valid files from GitHub repositories and other public sources:
- index.html from different popular websites
- ASPX files
- PHP files
- JavaScript files
NeoPI
First, I tested NeoPI. According to the project’s GitHub page, NeoPI is a Python script that uses a variety of statistical methods to detect obfuscated and encrypted content. The output below shows the result of running the tool against the set of files described above:
[[ Total files scanned: 4323 ]]
[[ Total files ignored: 0 ]]
[[ Scan Time: 16.773207 seconds ]]
[[ Average IC for Search ]]
0.0762022597838
[[ Top 10 lowest IC files ]]
0.0153 ../webshell_db_short/myluph.php
0.0168 ../webshell_db_short/vero.txt
0.0202 ../webshell_db_short/unknownPHP.php
0.0248 ../webshell_db_short/phpcollection/2.php
0.0262 ../webshell_db_short/myluphdecoded.php
0.0268 ../webshell_db_short/phpcollection/wkv3.php
0.0270 ../webshell_db_short/china.aspx
0.0284 ../webshell_db_short/phpcollection/agenda.ics.php
0.0285 ../webshell_db_short/phpcollection/config.xml.php
0.0289 ../webshell_db_short/phpcollection/uploads.php
[[ Top 10 entropic files for a given search ]]
6.2409 ../webshell_db_short/phpcollection/phpmailer.lang-zh.php
6.2355 ../webshell_db_short/phpcollection/phpmailer.lang-zh_cn.php
6.1932 ../webshell_db_short/unknownPHP.php
6.1622 ../webshell_db_short/phpcollection/phpmailer.lang-ch.php
6.0307 ../webshell_db_short/vero.txt
6.0258 ../webshell_db_short/myluph.php
6.0151 ../webshell_db_short/phpcollection/phpmailer.lang-ko.php
5.9169 ../webshell_db_short/phpcollection/phpmailer.lang-ja.php
5.7736 ../webshell_db_short/phpcollection/1.php
5.7393 ../webshell_db_short/phpcollection/phpmailer.lang-vi.php
[[ Top 10 longest word files ]]
554750 ../webshell_db_short/phpcollection/wkv3.php
11999 ../webshell_db_short/phpcollection/full_dump.php
11999 ../webshell_db_short/phpcollection/contentobjects.php
1774 ../webshell_db_short/myluph.php
660 ../webshell_db_short/vero.txt
641 ../webshell_db_short/c99shell.php
547 ../webshell_db_short/phpcollection/EmailAddressValidator.php
356 ../webshell_db_short/phpcollection/priv.txt
197 ../webshell_db_short/phpcollection/emission.xml (2).php
197 ../webshell_db_short/phpcollection/emission.xml.php
[[ Top 10 signature match counts ]]
85 ../webshell_db_short/c99shell.php
35 ../webshell_db_short/phpcollection/run-tests.php
27 ../webshell_db_short/phpcollection/WikiComments.aspx
24 ../webshell_db_short/phpcollection/MemberSearch.aspx
22 ../webshell_db_short/phpcollection/CustomPageManagement.aspx
22 ../webshell_db_short/phpcollection/Comments.aspx
20 ../webshell_db_short/phpcollection/phpmailerTest.php
20 ../webshell_db_short/phpcollection/ManageTerms.aspx
20 ../webshell_db_short/phpcollection/TimestampIntegrationTest.php
17 ../webshell_db_short/byroe.jpg
[[ Top cumulative ranked files ]]
56 ../webshell_db_short/myluph.php
57 ../webshell_db_short/vero.txt
176 ../webshell_db_short/c99shell.php
219 ../webshell_db_short/phpcollection/wkv3.php
225 ../webshell_db_short/phpcollection/1.php
372 ../webshell_db_short/myluphdecoded.php
444 ../webshell_db_short/phpcollection/profile.php
525 ../webshell_db_short/phpcollection/WikiComments.aspx
570 ../webshell_db_short/phpcollection/uploadpostattachment.aspx
595 ../webshell_db_short/phpcollection/Fields.aspx
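To make the IC and entropy columns in the output above more concrete, here is a minimal Python sketch of both statistics using their textbook definitions. It is not NeoPI's own code; the function names and the two sample byte strings are assumptions made purely for illustration.

```python
import base64
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte, from 0.0 (one repeated byte) to 8.0."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def index_of_coincidence(data: bytes) -> float:
    """Chance that two bytes picked at random (without replacement) are equal."""
    total = len(data)
    if total < 2:
        return 0.0
    return sum(n * (n - 1) for n in Counter(data).values()) / (total * (total - 1))


# Hypothetical samples: a readable PHP line and a base64-encoded blob.
PLAIN = b"<?php echo 'hello world'; ?>"
ENCODED = base64.b64encode(bytes(range(256)))

if __name__ == "__main__":
    for name, sample in (("plain", PLAIN), ("encoded", ENCODED)):
        print(f"{name}: entropy={shannon_entropy(sample):.4f} "
              f"IC={index_of_coincidence(sample):.4f}")
```

Encoded or encrypted payloads spread their bytes more evenly than hand-written code, so they tend to score a higher entropy and a lower IC. That is why both rankings above push obfuscated shells towards the top.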
Pros:
- detection ratio: 6 out of 9 webshell files
- successful detection of clean and obfuscated code of the same webshell
- the more complex the code structure, the better the results and detection ratio
- various methodologies to detect webshells - signatures, index of coincidence (IC), ratio, entropy, longest keyword matching
Cons:
- failed detection of simple one-line webshells (e.g. China Chopper)
- false negatives and positives in different categories, including final rankings
- manual triage and additional analysis of the highlighted files is required for some of the methodologies (e.g. entropy, keyword matching)
- signature database is outdated, as the project appears to be no longer developed
- webshells hidden inside another file format (byroe.jpg) are not detected when scanning a wide range of files - NeoPI produces a large number of false positives
I noticed that it would be really helpful to have a combined summary of the files detected by more than one heuristic. For instance, in my test byroe.jpg was visible in the top ten for signature matches, longest word and entropy, but not in the top cumulative ranked files.
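One way to get such a combined view is to count, for each file, how many of the per-heuristic top lists it appears in. The sketch below is a hypothetical post-processing helper, not a NeoPI feature; the function name, the heuristic labels and the shortened input lists (excerpted from the output above) are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def flagged_by_multiple(top_lists: Dict[str, List[str]],
                        min_hits: int = 2) -> Dict[str, List[str]]:
    """Map each file to the heuristics that listed it, keeping files with >= min_hits."""
    hits = defaultdict(list)
    for heuristic, files in top_lists.items():
        for path in set(files):  # a file counts once per heuristic
            hits[path].append(heuristic)
    return {path: sorted(names) for path, names in hits.items()
            if len(names) >= min_hits}


# Shortened excerpt of the top lists shown above (file names only).
EXCERPT = {
    "ic": ["myluph.php", "vero.txt", "unknownPHP.php"],
    "entropy": ["phpmailer.lang-zh.php", "unknownPHP.php", "vero.txt", "myluph.php"],
    "longest_word": ["wkv3.php", "myluph.php", "vero.txt", "c99shell.php"],
    "signatures": ["c99shell.php", "run-tests.php"],
}

if __name__ == "__main__":
    for path, names in sorted(flagged_by_multiple(EXCERPT).items()):
        print(f"{path}: {', '.join(names)}")
```

A file that shows up in several independent lists is a stronger candidate for manual triage than one that merely tops a single ranking.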
Although NeoPI hasn’t been updated for the last 4 years, didn’t detect all types of webshells and generated a number of false negatives, it still achieved quite impressive detection rates on relatively new webshell samples. I can recommend adding NeoPI to your webshell analysis toolbox. InfoSec Institute has a nice write-up on NeoPI with some additional details.
Shell Detector
Shell Detector was the second tool I evaluated. I really liked how the results were presented in the console.
A web version is also available.
Pros:
- detection ratio: 7 out of 9 webshell files (5 as suspicious + 2 webshell)
- successful detection of clean and obfuscated code of the same webshell.
- provided final results in clear graphical form
Cons:
- 131 false positives based on suspicious word existence
- only signature based detection
- webshell signature database out of date
- sluggish interface when the number of results is high (web version)
- signature database is stored in serialized PHP format (not scalable)
- byroe.jpg was not detected - Shell Detector does not support JPG files
To sum up, even though the signature database appears to be out of date, the tool correctly flagged almost all of the malicious files. It can provide powerful detection capability as long as the signature database is kept up to date.
LOKI
LOKI presents scan results in a terminal, coloring entries according to their severity. It also writes all matches to a single log file. The rules are written in YARA, an easy-to-use yet very powerful language for identifying and classifying malware, which appears to be a tool of choice in the security industry. According to the project’s website, the most effective rules were borrowed from the rule sets of its bigger brother, THOR APT Scanner. For me, the most interesting ones were those dedicated to webshell detection.
My first scan of the sample set with the default signature database showed a moderate detection ratio (5/9). With YARA’s growing popularity in the infosec world, it’s possible to build and maintain a powerful database to hunt malware, including webshells, and to research new obfuscation techniques and variants observed in the wild. With that in mind, I decided to improve on these results. I found a set of rules that almost perfectly matched my expectations. After a quick adjustment, the final score was close to ideal (8/9). The changes were tiny, so I’ll describe them briefly:
- Change the $php string to “<?” in a new rule created from misc_php_exploits
- Add “system($_REQUEST” to misc_php_exploits and to the newly created rule from the point above
- Remove two strings from the rule misc_shells - $s6 and $s8 (the latter was even marked with a comment saying it could generate false positives, so it was easy ;)
After these changes I saw the biggest advantage of LOKI - the number of false positives was zero!
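To show why such small string changes can matter, here is a minimal Python sketch of the matching logic. It is neither YARA nor LOKI code: only the two strings “<?” and “system($_REQUEST” come from the changes listed above, while the AND/OR condition, the function name and the sample inputs are assumptions for illustration.

```python
from typing import Iterable

# From the text: the $php string was changed to "<?".
PHP_OPEN_TAG = b"<?"
# From the text: this string was added to the rule. Any further strings
# of the real rule are not reproduced here.
SUSPICIOUS_STRINGS = (b"system($_REQUEST",)


def matches_rule(data: bytes, strings: Iterable[bytes] = SUSPICIOUS_STRINGS) -> bool:
    """Assumed condition: an open tag is present AND any suspicious string is present."""
    return PHP_OPEN_TAG in data and any(s in data for s in strings)


if __name__ == "__main__":
    # Hypothetical one-liners, not the actual test samples.
    short_tag_shell = b"<? system($_REQUEST['c']); ?>"
    harmless_page = b"<?php echo 'hello'; ?>"
    print(matches_rule(short_tag_shell))
    print(matches_rule(harmless_page))
```

Under these assumptions, a pattern that requires the full “<?php” tag would miss a shell written with the short “<?” tag, which is one plausible reason why loosening the string helps.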
Pros:
- detection ratio: 8 out of 9 webshell files
- successful detection of clean and obfuscated code of the same webshell.
- provided final results in clear log file
- zero false positives (but that really depends on the YARA rule set you use)
- easy to develop signatures based on YARA rules
- supports all extensions
Cons:
- only signature based detection for webshells
Summary
To sum up the results from all the tools, it’s really hard to develop a single tool that flags webshells as suspicious with good accuracy. That’s because a wide range of different functions, methods and encodings can be used to achieve the same effect. Attackers don’t need to use the base64_decode function to decode their base64 code. Instead, they can add their own custom function that does exactly the same. They can use a string lookup array to avoid keyword-based detection, or invoke functions by name with a string built using str_replace, and much more. Imperva did great research describing various techniques in their blog post.
The only webshell not detected by LOKI was unknownPHP.php, whose obfuscation technique is really advanced - thanks to Darryl from Kahu Security, you can follow the decoding process in a great post. As it’s not possible to detect it using general signature rules, NeoPI’s methods (entropy, index of coincidence) are an excellent solution for this kind of backdoor. Together with LOKI, it seems to be a powerful weapon for detecting webshells.
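The str_replace trick is easy to demonstrate with a naive keyword scanner. The sketch below is a hypothetical illustration in Python, not the logic of any tool reviewed here; the scanner, its keyword list and both PHP one-liners are assumptions made up for the example.

```python
from typing import List, Sequence

# Assumed keyword list for the demonstration only.
KEYWORDS = ("base64_decode",)


def keyword_hits(source: str, keywords: Sequence[str] = KEYWORDS) -> List[str]:
    """Return the keywords that literally appear in the source text."""
    return [k for k in keywords if k in source]


# Hypothetical PHP one-liners held as plain strings (never executed here).
PLAIN = '<?php eval(base64_decode($_POST["p"])); ?>'
EVASIVE = ('<?php $f = str_replace("x", "", "bxase6x4_dxecode"); '
           'eval($f($_POST["p"])); ?>')

if __name__ == "__main__":
    print(keyword_hits(PLAIN))
    print(keyword_hits(EVASIVE))
    # What PHP's str_replace would rebuild at runtime:
    print("bxase6x4_dxecode".replace("x", ""))
```

The function name only exists after the script runs, so a scanner that looks for the literal keyword never sees it. Statistical measures do not depend on specific names, which is why they complement signatures.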
Prevention and mitigation
There are a few things that can be done to protect organizations against server compromise:
- PATCH! - it sounds silly because it seems SO obvious, but last year showed that even a well-known vulnerability like Heartbleed doesn’t guarantee that administrators do their job. Two months after the public release, there were still around 300k vulnerable servers
- harden your web server - implement a least-privileges policy on the web server, limit script execution permissions in specific locations etc.
- deploy DMZ (demilitarized zone) - enable logging of allowed and blocked traffic, limit interaction between DMZ and your production environment
- deploy a reverse proxy with a WAF (Web Application Firewall) - restrict accessible URL paths to legitimate sources only, using for example the free ModSecurity or a commercial product, and consider fuzzy hash matching
- regularly test your environment - conduct virus signature checks (e.g. those used by the WAF), application fuzzing, code reviews and server network analysis
- regularly test systems and applications - check the application’s security with pentests and vulnerability scans to establish areas of risk
- versioning + backup - keep an offline “known good” backup of all critical servers, and enable change monitoring to have a clear history of changes on servers
- input validation - validate user input to limit local and remote file inclusion vulnerabilities
- scan all files coming in to the web server (if you accept file uploads from users) - as shown before, the administrator cannot trust file extensions, since they could be just a trick to hide malware
- always follow social media discussions! ;)
When #ThreatHunting try and define a narrow scope of what you are looking for. I have a thing for webshells lately so… #DFIR 1/8
— Jack Crook (@jackcr) May 10, 2016
Look at processes that are spawned by the owner of the webserver process #DFIR 4/8
— Jack Crook (@jackcr) May 10, 2016
Look at POST requests with no referrer and a 200 response code #DFIR 5/8
— Jack Crook (@jackcr) May 10, 2016
Look for POST requests to new directory paths and filenames with a 200 response code #DFIR 6/8
— Jack Crook (@jackcr) May 10, 2016
The community also has its own ideas:
@jackcr baseline the web server/ app error logs. Focus on exceptions about previously not seen file names e.g -> https://t.co/gIOFcE6wgI
— dfir_it (@dfir_it) May 10, 2016
@jackcr File: size, ext, owner, location, content. Request: UA, URI/params, internal 2 internal, interval/duration/size of requests
— Glenn (@hiddenillusion) May 10, 2016
- AV/HIDS scan of the web server…
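Two of the hunting ideas quoted above, POST requests with no referrer and a 200 response code, translate directly into a log filter. The following is a minimal sketch that assumes access logs in the Apache/nginx “combined” format; the regular expression, the function name and the sample lines (with documentation-range IP addresses) are assumptions for illustration.

```python
import re
from typing import Iterable, Iterator, Tuple

# Assumed "combined" log format: ip ident user [time] "request" status size "referrer" "agent"
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def suspicious_posts(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (ip, path) for POST requests answered with 200 and no referrer."""
    for line in lines:
        m = LOG_RE.match(line)
        if m is None:
            continue  # skip lines that are not in the assumed format
        if (m["method"] == "POST" and m["status"] == "200"
                and m["referrer"] in ("", "-")):
            yield m["ip"], m["path"]


# Hypothetical log lines.
SAMPLE = [
    '203.0.113.5 - - [10/May/2016:13:55:36 +0000] "POST /uploads/img.php HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '198.51.100.7 - - [10/May/2016:13:56:01 +0000] "POST /login.php HTTP/1.1" 200 734 "https://example.com/" "Mozilla/5.0"',
    '198.51.100.7 - - [10/May/2016:13:56:09 +0000] "GET /index.html HTTP/1.1" 200 1024 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    for ip, path in suspicious_posts(SAMPLE):
        print(ip, path)
```

Hits from a filter like this are only leads. Comparing the paths against a baseline of known application URLs, as the tweets suggest, narrows them down further.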
Let me digress a little about the last recommendation. First of all, as you know, AV is not a fail-safe mechanism, so you cannot trust it fully. AV products do not protect against all types of attack vectors, and it is relatively easy to bypass them. Still, you can at least block known malicious code (detected by signatures or heuristics) - not ideal, but still an advantage.
When you run AV on your web server (or any other machine, for that matter) you need to know that there are costs involved:
- additional risk - you are adding code to your machine that could itself be vulnerable to different types of attacks such as RCE, local privilege escalation, sandbox escape, etc. Details can be found in Joxean Koret’s presentation or in Google Project Zero posts
- performance - every AV product causes some performance loss, which is periodically measured and reported by the AV-Comparatives organization
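Returning to the recommendation to scan incoming files: since extensions cannot be trusted (byroe.jpg being a case in point), a basic upload check can compare the file’s leading bytes with what its extension claims and look for an embedded PHP tag. This is a minimal sketch under stated assumptions, not a replacement for AV or YARA scanning; the allow-list, the marker list and the function name are hypothetical.

```python
import os
from typing import List

# Assumed allow-list: extension -> accepted leading "magic" bytes.
MAGIC = {
    ".jpg": (b"\xff\xd8\xff",),
    ".jpeg": (b"\xff\xd8\xff",),
    ".png": (b"\x89PNG\r\n\x1a\n",),
    ".gif": (b"GIF87a", b"GIF89a"),
}
# Only the long open tag is checked; short tags such as "<?" occur by
# chance in binary image data and would cause false positives.
SCRIPT_MARKERS = (b"<?php",)


def inspect_upload(filename: str, data: bytes) -> List[str]:
    """Return a list of findings; an empty list means nothing suspicious was seen."""
    findings = []
    ext = os.path.splitext(filename)[1].lower()
    signatures = MAGIC.get(ext)
    if signatures is None:
        findings.append("extension not on allow-list")
    elif not data.startswith(signatures):
        findings.append("content does not match extension")
    lowered = data.lower()  # PHP open tags are case-insensitive
    for marker in SCRIPT_MARKERS:
        if marker in lowered:
            findings.append(f"embedded script marker {marker!r}")
    return findings


if __name__ == "__main__":
    # Hypothetical upload: valid JPEG header followed by PHP code.
    fake_image = b"\xff\xd8\xff\xe0" + b"<?php system($_GET['c']); ?>"
    print(inspect_upload("photo.jpg", fake_image))
```

A file can start with a valid image header and still carry code, so the header check and the content check are kept separate and both are reported.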
Conclusion
The whole series was intended to show you how popular, diverse and at the same time dangerous attacks leveraging webshells are. As the second part of this series showed, the crooks’ aim was to target specific companies, and webshells are only a small part of a bigger plan. The variety, diversity and simplicity of webshells make defending against them a very difficult task. Even following all the recommendations in the “Prevention and mitigation” section does not guarantee that your application/environment is 100% safe, but it is important to build security in a comprehensive manner and to leave as little room as possible for getting past our “entanglements” ;) Keep fighting! Keep defending!