Over the past year or so I have created two machines for the Hack The Box platform, Cereal and Intelligence. When I started, and still to this day, there has been a distinct lack of information, outside of the official submission criteria, on creating content for Hack The Box, OffSec Proving Grounds, TryHackMe, etc. The following are my thoughts and opinions on how to create a vulnerable machine and some of the lessons I have learned. If you have differing views on anything I have to say, feel free to message me on Twitter, Discord, etc., and I would be glad to discuss.
The focus should be on teaching something new.
When designing and creating a vulnerable machine, the goal should be to teach something. Sure, there may be other reasons, but I believe the main reason someone will attempt to hack the machine you are creating is to learn something new. This could be a twist on an existing attack or a new technology; the sky’s the limit. There isn’t a tutorial or PDF guide to go along with the machine, which makes designing one even harder. I think that erring on the side of as much hand holding as possible is the way to go. In practice, this means the user should know what they need to do or learn and should not be left entirely clueless. This is easier said than done, especially if you want to require any significant amount of enumeration. You also have to take into account your target audience for the machine and their existing experience. Little clues and pointers along the way can be helpful. In hindsight, I definitely see areas where I could have improved on Cereal. For example, a big leap in the user exploit chain is assuming that an admin user is visiting the site and can be attacked with a cross-site scripting payload. You could infer it from the functionality of the site. However, it would have been better to show something along the lines of a message saying that the user’s request would be reviewed shortly or in x amount of time. That would have removed unnecessary guesswork.
Why create one?
Both of the boxes I submitted were created before there was any monetary incentive to create them. So why did I do it?
- You learn, as teaching is one of the best ways to fully understand something. You really need a deep understanding of the vulnerabilities you utilize. It’s one thing to exploit something; it’s another to create a vulnerable system that doesn’t have an unintended compromise path, is stable under heavy load, and has a cohesive flow to it.
- It’s fun. Hacking, dissecting, and abusing things is enjoyable and is what I do day to day. Before my current job I spent a lot more time doing web development and server administration. Developing a vulnerable machine is a way to combine the hacking aspect with creating something. I used it as a means to learn and play around more with PowerShell, scripting, automation, CI/CD, Active Directory, and a whole host of other tools and applications I don’t use as much nowadays.
- There is an end goal that provides motivation to finish building something awesome. It’s cool going from some random thoughts about potential vulnerabilities to a machine that is live on a platform with 600k+ users. Overall, I had a great experience with the Hack The Box staff on the two machines I submitted. My main complaint was the time the review process takes. However, this is understandable given the number of submissions.
What is lacking. Before anything else, I like to determine what the theme of the box will be. To do this, I think it’s best to look at what is currently lacking. If the platform has had all Linux machines with the same type of vulnerability, change it up and create something different. One of the reasons for creating Cereal was that I had recently taken the Offensive Security Web Expert course/certificate and noticed there wasn’t much online in the way of source code analysis labs. That prompted me to create something with a heavy focus on source code analysis.
Find some vulnerabilities. Most of my ideas for vulnerabilities have come from something I was learning, my current research projects, or cool and interesting ones I have come across during my job. For example, at the time of creating Cereal I was learning more about session riding, deserialization, and the SeImpersonatePrivilege, and for this reason I wanted to utilize them in the machine. In general, most types of vulnerabilities can broadly be broken down into two categories: misconfigurations and insecure software. For example, I would classify SQL injection, XSS, SSRF, and various CVEs as insecure software. Misconfigurations, on the other hand, are not something that a patch or update will fix. These are things like insecure Active Directory configurations, exposed sensitive APIs, and passwords stored incorrectly. For insecure software there are two main routes: either build it or find it. Having some application development experience comes in handy if you want to develop a custom vulnerable application. To find existing vulnerable software you could look through recent CVEs, or even old ones for that matter, to find one that would fit. Another option for identifying both potential misconfigurations and insecure software is to look through bug bounty reports. The following links are some of the best lists I know of that can also help out:
Support multiple users. It’s important to consider that multiple people will often be attacking your machine at the same time and will be at different stages of exploitation. Exploiting the machine should not require anything that would break it for anyone else or weaken its state. Any services and exploits should also be reliable and not cause stability issues. This is one of the biggest things that can make a box unenjoyable. If you do require the user to modify, add, or upload files, then it’s good to have a scheduled task that cleans up anything left behind.
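As a rough illustration of that kind of cleanup job, here is a minimal Python sketch. It is not what I ran on my machines; the function name, the directory, and the age threshold are all assumptions made up for the example, and you would still need to wire it into cron or a Windows scheduled task yourself.

```python
import time
from pathlib import Path


def remove_stale_files(directory, max_age_seconds, now=None):
    """Delete regular files in `directory` older than `max_age_seconds`.

    Returns the sorted names of the files that were removed.
    `now` can be passed in to make the function easy to test.
    """
    now = time.time() if now is None else now
    removed = []
    for entry in Path(directory).iterdir():
        # Only touch regular files; leave subdirectories and symlinks alone.
        if entry.is_symlink() or not entry.is_file():
            continue
        # Compare the last modification time against the allowed age.
        if now - entry.stat().st_mtime > max_age_seconds:
            entry.unlink()
            removed.append(entry.name)
    return sorted(removed)


# Hypothetical usage, run every few minutes by a scheduler:
# remove_stale_files("/var/www/uploads", max_age_seconds=600)
```

Pick the age threshold with some care. If it is too short you will delete a payload while someone is still in the middle of using it, which is its own kind of stability issue.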
Make it difficult in the right ways. Depending on the intended difficulty, I suggest making the path as clear as possible and having the exploitation itself be where the difficulty arises. When you are trying to compromise a machine, the most demoralizing part is having no idea where to even start. In practice, this means giving well-chosen pointers and hints. Don’t directly say what the vulnerability is, but guide the user in the right direction.
Limit brute forcing. Tying into the last point, what you should not do is make the machine difficult by requiring brute forcing. Typically, this will annoy users more than teach them something. If you do have any brute forcing, make it so a common wordlist will work. For password cracking, use something from rockyou, and for directory discovery, use something from the raft wordlists.
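One easy sanity check before submitting is to confirm that the password or directory name you picked really is in the wordlist, and roughly how far down. The helper below is a minimal sketch of that idea; the function name and the wordlist path in the usage comment are assumptions for illustration, so point it at wherever your copy lives.

```python
def wordlist_position(candidate, wordlist_path):
    """Return the 1-based line number of `candidate` in the wordlist, or None."""
    target = candidate.encode("utf-8")
    # Read as bytes: rockyou.txt contains entries that are not valid UTF-8,
    # so decoding the whole file as text can fail partway through.
    with open(wordlist_path, "rb") as handle:
        for line_number, line in enumerate(handle, start=1):
            if line.rstrip(b"\r\n") == target:
                return line_number
    return None


# Hypothetical usage:
# wordlist_position("chosen-password", "/usr/share/wordlists/rockyou.txt")
```

If the entry sits millions of lines in, a user on a slow connection or an online attack will be waiting a long time, so something nearer the top of the list is kinder.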
Tying it all together. Once you have a main theme and several potential vulnerabilities, things start to fall into place and you can fill in the specific steps of the attack path, or at least a rough outline of them. The most planning I did before moving into implementation was a very rough draft, which made sense considering the root portion of Cereal completely changed after feedback from HTB staff.
Don’t use a public domain or IP address. One thing I learned after submitting my first machine was to not use any public domains or IP addresses. This was one of the many modifications I made to Cereal after receiving feedback from HTB, and I even missed one. The risk is that someone attacking the machine accidentally attacks a public system that is out of scope. Since the .htb top-level domain doesn’t exist publicly, a good domain to use follows the format machinename.htb.
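Since I missed one myself, a quick automated pass over your config files, source code, and scripts is worth having. The following is a minimal sketch using Python’s standard `ipaddress` module to flag publicly routable IPv4 addresses and URL hosts that do not end in `.htb`. The function name and the regular expressions are my own assumptions for this example, and they are deliberately simple.

```python
import ipaddress
import re

# Loose patterns: good enough for a first pass, not a full parser.
IPV4_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
URL_HOST_PATTERN = re.compile(r"https?://([A-Za-z0-9.-]+)", re.IGNORECASE)


def find_out_of_scope(text, allowed_suffix=".htb"):
    """Return sorted public IPv4 addresses and non-lab URL hosts found in `text`."""
    findings = set()
    for match in IPV4_PATTERN.findall(text):
        try:
            address = ipaddress.ip_address(match)
        except ValueError:
            continue  # looks like an address but is not one, e.g. 999.1.1.1
        if address.is_global:  # False for private, loopback, and similar ranges
            findings.add(match)
    for host in URL_HOST_PATTERN.findall(text):
        host = host.lower().rstrip(".")
        if IPV4_PATTERN.fullmatch(host):
            continue  # already handled by the address check above
        if host == "localhost" or host.endswith(allowed_suffix):
            continue
        findings.add(host)
    return sorted(findings)


sample = "curl http://machinename.htb/ ; ping 10.10.10.10 ; wget https://example.com/x ; nc 8.8.8.8 53"
print(find_out_of_scope(sample))  # ['8.8.8.8', 'example.com']
```

Expect some false positives, since four-part version numbers look like IP addresses, and it only sees hostnames that appear inside http or https URLs. Treat the output as a list of things to review by hand.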
Automate everything. I started out without automating or scripting anything and quickly realized that was a bad idea. There will likely be many things on the machine that you screw up, misconfigure, or decide to completely redesign. Having the ability to just recreate the machine is incredibly helpful. There have been several machines I have done where I could identify the exploit path from timestamps, system logs, and other artifacts left behind by the creator. With automation it is easier to end up with a clean machine without any of those. As a plus, you can probably reuse portions of your scripts if you create more than one machine. There are tons of options out there for automation, and they vary depending on the operating system you use, so I won’t recommend any in particular. However, a good starting point would be PowerShell for Windows and shell scripting for Linux.
Think about the resources. When creating a machine you need to consider the resources it uses. If you are creating something with Windows, I highly recommend using Server Core as it is more lightweight. Monitor the idle CPU and memory usage of the machine and identify any culprits. One problem I ran into was a user simulation program that used Chrome. I had to tweak how frequently it updated to lower its resource usage.
Don’t take feedback too personally.
What makes a good machine? You will soon find that this is highly subjective. What one person finds to be a fantastic machine could be boring to someone else. One person may love web application attacks and not Active Directory attacks, or vice versa. Do not be surprised or disappointed when you get negative feedback. Be careful to properly assess any feedback to determine if it’s subjective or something you actually could have done better. This is easier said than done. I am not saying to just disregard all feedback. Maybe the box really was hot garbage. Instead, try to sift the subjectivity out of the feedback. If someone tells you the box was bad without any context as to why, that is not helpful at all. If the feedback includes specifics, take note, as good feedback is hard to find. A lot of people don’t recognize the amount of work that can go into a machine. So even if you create a flop, you still made something and can learn from it.