The innovative solution designed to streamline your search processes in cloud storage. By bypassing traditional log indexing in SIEMs, CloudGrep offers a faster, cost-effective method to delve directly into your AWS S3 logs.

Whether you’re debugging applications or conducting a security investigation, discover how CloudGrep enhances your cloud-based search capabilities.

Why?

  • Directly searching cloud storage, without indexing logs into a SIEM or Log Analysis tool, can be faster and cheaper.
  • There is no need to wait for logs to be ingested, indexed, and made available for searching.
  • It searches files in parallel for speed.
  • If you run this in the same region as the S3 bucket using a VPC endpoint for S3 you can avoid data transfer costs. Do check first!
  • This may be of use when debugging applications, or investigating a security incident.

Example

Simple example:

python3 cloudgrep.py --bucket test-s3-access-logs --query 9RXXKPREHHTFQD77
python3 cloudgrep.py -b test-s3-access-logs -q 9RXXKPREHHTFQD77

More complicated example:

python3 cloudgrep.py -b test-s3-access-logs --prefix "logs/" --filename ".log" -q 9RXXKPREHHTFQD77 -s "2023-01-09 20:30:00" -e "2023-01-09 20:45:00" --file_size 10000 --debug

Saving the output to a file:

python3 cloudgrep.py -b test-s3-access-logs -q 9RXXKPREHHTFQD77 --hide_filenames > output.txt

Example output:

Bucket is in region: us-east-2 : Search from the same region to avoid egress charges.
Searching 11 files in test-s3-access-logs for 9RXXKPREHHTFQD77...
access2023-01-09-20-34-20-EAC533CB93B4ACBE: abbd82b5ad5dc5d024cd1841d19c0cf2fd7472c47a1501ececde37fe91adc510 bucket-72561-s3bucketalt-1my9piwesfim7 [09/Jan/2023:19:20:00 +0000] 1.125.222.333 arn:aws:sts::000011110470:assumed-role/bucket-72561-myResponseRole-1WP2IOKDV7B4Y/1673265251.340187 9RXXKPREHHTFQD77 REST.GET.BUCKET - "GET /?list-type=2&prefix=-collector%2Fproject-&start-after=&encoding-type=url HTTP/1.1" 200 - 946 - 33 32 "-" "Boto3/1.21.24 Python/3.9.2 Linux/5.10.0-10-cloud-amd64 Botocore/1.24.46" - aNPuHKw== SigV4 ECDHE-RSA-AES128-GCM-SHA256 AuthHeader bucket-72561-s3bucketalt-1my9piwesfim7.s3.us-east-2.amazonaws.com TLSv1.2 - -

Arguments

python3 cloudgrep.py --help
usage: cloudgrep.py [-h] -b BUCKET -q QUERY [-p PREFIX] [-f FILENAME] [-s START_DATE] [-e END_DATE] [-fs FILE_SIZE] [-d] [-hf]

CloudGrep searches is grep for cloud storage like S3.

options:
  -h, --help            show this help message and exit
  -b BUCKET, --bucket BUCKET
                        Bucket to search. E.g. my-bucket
  -q QUERY, --query QUERY
                        Text to search for. Will be parsed as a Regex. E.g. example.com
  -p PREFIX, --prefix PREFIX
                        Optionally filter on the start of the Object name. E.g. logs/
  -f FILENAME, --filename FILENAME
                        Optionally filter on Objects that match a keyword. E.g. .log.gz
  -s START_DATE, --start_date START_DATE
                        Optionally filter on Objects modified after a Date or Time. E.g. 2022-01-01
  -e END_DATE, --end_date END_DATE
                        Optionally filter on Objects modified before a Date or Time. E.g. 2022-01-01
  -fs FILE_SIZE, --file_size FILE_SIZE
                        Optionally filter on Objects smaller than a file size, in bytes. Defaults to 100 Mb.
  -d, --debug           Enable Debug logging.
  -hf, --hide_filenames
                        Dont show matching filesnames.

Deployment

Install with: pip3 install -r requirements.txt

You can run this from your local laptop, or from an EC2 instance in the same region as the S3 bucket with a VPC endpoint for S3 to avoid egress charges.

You can authenticate in a number of ways. If you are running on an EC2, an Instance Profile is likely the best choice.