Module Development Guide
This short guide aim to help developers to write and integrate new functionalities in the Cerberus-Sandbox.
In Cerberus-Sandbox, every functionnal piece of code can be describe as a module.
A module can work as a standalone tool, but can also be easily integrated in the general workflow of the sandbox.
The following will give you an insight on what's a module, how to make it works, and how to integrate it into Cerberus.
General Layout
Modules are stored as Python scripts under cerberus/core/static_engine/Modules/
.
By keeping everything in a modular architecture, we aim to encourage developers to add custom functionnalities that match their needs, without having to understand the whole code-base of the project. It is also easier for the inital development stage and for further official upgrades, as we don't need to deal with already existing works and only focus on the core of the functionnalities that we want to add.
It also gave the ability to easily write new connector with already-existing products, wich is something we want to push forward.
Orchestrator
The script cerberus/core/static_engine/main.py
is in charge of the orchestration of the modules. This script is the one that gather, parse and execute modules.
If a new Python script is simply dropped in the modules folder, without any tweaks to make it works with the so called orchestrator, it will simply be ignored at runtime.
If your module is malformated, this is the script that will tell you why and where.
Don't make modification to the core of this script in order to match your needs ! If you have to, it means that your module is not well-written.
Module layout
Each module must have a header with the following items:
PRIORITY =
INPUT = ['']
OUTPUT = ['']
RETURNS = ['']
RETURN_CODE = { 0 : "Something" }
Priority
The PRIORITY
keyword is used to tell in which order the modules must be launched.
Priority numbers don't have to be contiguous. If you want to write several modules, you can reserve an arbitray range of priority from X to Y.
Input
The INPUT
keyword is used to tell which argument your module need in order to works.
It's arguments are stored as a python array of strings.
The available arguments are: * 'file' : Absolute path to the input binary to deal with. * 'online' : Boolean that indicate if the scan is online of offline. * 'id' : Identifier number for the scan. * 'None' : No input arguments.
Their is no order for thoses arguments, but keep it aligned with your main()
declaration.
If your main() function is declared as such:
def main(input_file, is_online):
The input
tag must follow the same order:
INPUT = ['file', 'online']
Output
If your module require to store something to the filesystem as an output, you can specify it in the OUTPUT
tag.
It's arguments are stored as a python array of strings.
You can specify relative hardocded path to the output
tag, but keep in mind that the translation of the following keywords is automatically done, in order to adda layer of absraction from the module's point of view.
- 'base' : Path to the base of the scan filesystem
- 'sha1hash' : SHA1sum of the given binary
- 'sha256hash' : SHA256sum of the given binary
- 'md5hash' MD5sum of the given binary
- 'id' : ID of the scan
- 'name' : Original name of the given file
For instance, the following output tag is valid, and will be converted to the right absolute path at runtime:
OUTPUT = ['/base/sha256hash/id/md5hash.disass']
Returns
If your module returns a specific information that need to be taken in count by the next modules, you can specify it with the RETURNS
tag.
It's arguments are stored as a python array of strings.
Return code
In order to raise custom return code and errors for your module, use this tag to specify what are the available return code of your script, and what they mean.
This is stored as a Python dictionnary, with an integer key and a string value (return code id : description).
The following return code range need to be used when writting a module: * from 00 to 09 : General state (used to avoid 'None' as a return value. Keep tracks of the module's state) * from 10 to 19 : Critical errors (failure condition: will immediately stop any further modules) * from 21 to 50 : Conditionnal errors (will shortcut some specific modules)
Here is a snippet of already implemented return code:
0: 'Module ended without errors'
1: 'Module ended with errors'
2: 'Scan succefull'
10: 'File not found'
11: 'WinAPI references not found (deprecated)'
12: 'Not a valid PE file'
13: 'Network error'
14: 'Metadenfender does not have an entry for the submited sample'
21: 'Is a dot net executable'
22: 'Is malformated'
23: 'Scanned binary may be packed'
24: 'Scanned binary does not seems packed'
Module template
The following template can be used to write a new module:
# Module header
PRIORITY = 55
INPUT = ['file']
OUTPUT = ['None']
RETURNS = ['None']
RETURN_CODE = { 99 : "Template module", 100 : "Something else" }
# Main code
def main(path):
if do_something(path):
return 99
return 100
# For standalone execution of the script
if __name__ == "__main__":
parser = ArgumentParser(description='Module Name')
parser.add_argument('-i', '--input', help='PE File to scan', metavar='input-file.exe', required=True)
parser.add_argument('-l', '--log', help='Log level. Default=INFO', choices=['DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL'], required=False, default='INFO')
args = parser.parse_args()
path = args.input
main(args.online, args.input)