SyntheticSun is a defense-in-depth security automation and monitoring framework which utilizes threat intelligence, machine learning, managed AWS security services and, serverless technologies to continuously prevent, detect and respond to threats.

You sleep in fragmented glass
With reflections of you,
But are you feeling alive?
Yeah let me ask you,
Are you feeling alive?

Synopsis

  • Uses event- and time-based serverless automation (e.g. AWS CodeBuild, AWS Lambda) to collect, normalize, enrich, and correlate security telemetry in Kibana
  • Leverages threat intelligence, geolocation data, open-source intelligence, machine learning (ML) backed anomaly detection and AWS APIs to further enrich security telemetry and identify potential threats
  • Leverages Random Cut Forests (RCF) and IP Insights unsupervised ML algorithms to identify anomalies in timeseries and IP-entity pair data, respectively. Serverless, container-orchestrated resources are provided to train and deploy new IP Insights endpoints at will.
  • Dynamically updates AWS WAFv2 IP Sets and Amazon GuardDuty threat intel sets to bolster protection of your account and infrastructure against known threats

Description

SyntheticSun is built around the usage of the Malware Information Sharing Platform (MISP) and Anomali’s LIMO, which are community driven threat intelligence platforms (TIPs) that provide various types of indicators of compromise (IoC). Normalized and de-duplicated threat intel is looked up against in near-real time to quickly identify known threats in various types of network traffic. To add dynamism to the identification of potential threats IP Insights models are deployed to find anoamlies (and potential threats therein) between the pairing of IP addresses and entities (such as IAM principal ID’s, user-agents, etc.), native RCF detectors are also used in Elasticsearch to find anomalies in near real-time security telemetry as it is streamed into Kibana. To democratize the usage and fine-tuning of ML models within security teams, utilities to train IP Insights models are provided as an add-on to the core solution.

To perform the both the orchestration and automation as well as extraction, transformation, and loading (ETL) of security telemetry into Kibana, various AWS serverless technologies such as AWS Lambda, Amazon DynamoDB, and AWS CodeBuild are used. Serverless technologies such as these are used for their scalability, ease of use, relatively cheap costs versus heavy MapReduce or Glue ETL-based solutions. A majority of the solution is deployed via CloudFormation with helper scripts in Python and shell provided throughout the various Stages to promote adoption and the potential deployment in continuous integration pipelines.

To make the “guts” of the solution as lean as possible basic Python modules such as boto3requestsjsonipaddresssocket and re perform most of the extraction, transformation, and loading (ETL) into downstream services. Because all geolocation information is provided by ip-api.com, it does not require an account or paid tiers and has a great API which includes throttling information in their response headers. A majority of the Elasticsearch and Kibana dependencies are also provided in code (indices, mappings, visualizations, etc) to avoid heavy manual configuration.

Setting Up

SyntheticSun is spread across three Stages due to the size of solution and the required dependencies. All architecture and installation instructions (and FAQs where appropriate) live within their own Stage. Add-ons modules (called an Appendix) are also provided to extend the functionality, which have their own architecture and installation instructions localized.

Before you start: Considerations for Production deployments

SyntheticSun, by virtue of being something you found on GitHub, is a proof-of-concept and therefore I did not go the extra mile for the first release to absolutely harden everything. Provided you are reading this at a point in time where I have not made the necessary changes, consider the following before you deploy this solution into a production environment (or any environment with heightened security needs). I will put these items on a roadmap and update them as appropiate.

  • Train your own IP Insights models using the examples provided in Appendix A. Using your own data, and continually retraining the model, will help accurize findings.
  • Deploy your CodeBuild projects, MISP server, and Elasticsearch Service domain in a VPC in order to harden against internet-borne attacks. Consider using AWS’ Client VPN, AWS Site-to-Site VPN, DirectConnect, Amazon Workspaces, and AppStream 2.0 or (if you absolutely have to) a reverse-proxy to access the MISP console and Kibana within a VPC.
  • Consider using Cognito for AuthN into Kibana. Go a step further and federate your User Pool with your corporate IdP.
  • Consider baking your own AMI for MISP or use Fargate to host it. I would also consider pre-baking Suricata and the Amazon CloudWatch Agent into future builds to help scale deployments of agents and HIDPS across your estate.
  • Modify your Suricata configuration to suit the needs of your SecOps teams looking at the logs, since all this solution does is dump them in. You may also consider writing your own rules or importing other sources to harden your hosts against attacks.