GitMonitor: A GitHub Scanning System to Look for Leaked Sensitive Information Based on Rules

GitMonitor is a GitHub scanning system that looks for leaked sensitive information based on rules. There are already many very good tools for finding sensitive information leaked on GitHub, and I still use some of them myself. However, I think they lack some features:

  • Rule-based scanning.
  • A rules mechanism flexible enough to filter results by brand name, file format, and language, and to skip specific file formats and languages (Searching rules). Repositories that match these rules are cloned locally and then searched for sensitive information using regular expressions (Sensitive filtering rules). In the rules you can define keywords related to your company's brand name or projects, email prefixes, or anything else.
  • Scheduled runs and a flexible reporting mechanism.

That is why I created GitMonitor. It uses two sets of rules to find what you need. The Searching rules look for repositories that may be related to your organization, internal projects, or anything else, and the tool clones the matching repositories locally. The Sensitive filtering rules then check whether those repositories contain sensitive information. Finally, the tool reports its findings via Slack. You can run the tool from a cron job to build a monitoring system that tracks leaked sensitive information related to your organization on GitHub and delivers the results via Slack.
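
The reporting step can be pictured with a minimal sketch. It assumes a standard Slack incoming webhook, which accepts a JSON body with a `text` field. The function names (`build_report`, `build_request`, `send_to_slack`) and the message layout are hypothetical and are not taken from GitMonitor's source.

```python
import json
import urllib.request


def build_report(repos):
    """Format repository names as one Slack message (hypothetical layout)."""
    lines = ["_Detected possible repository:_"]
    lines += ["- {}".format(name) for name in repos]
    return "\n".join(lines)


def build_request(webhook_url, text):
    """Prepare the POST request for a Slack incoming webhook."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_to_slack(webhook_url, text):
    """Send the message and return the HTTP status code."""
    request = build_request(webhook_url, text)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Keeping request construction separate from sending makes the message format easy to check without touching the network; only `send_to_slack` performs I/O.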

Features

  • Search for repositories based on rules (Searching rules). You can write rules to find repositories that may be related to your company. Repositories matching the rules are cloned locally.
  • Use regular expressions (Sensitive filtering rules) to find sensitive information in the cloned repositories, for classification purposes.
  • Report via Slack.
  • Rules and regular expressions are defined separately.
  • Users can define rules and regular expressions easily and intuitively.
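
To illustrate the sensitive-filtering idea, here is a minimal sketch that walks a cloned repository and labels each file by the patterns it matches. The pattern names, the two regular expressions, and the functions `scan_text` and `scan_repo` are illustrative assumptions; GitMonitor's real patterns live in libs/regex.py and are not reproduced here.

```python
import os
import re

# Hypothetical patterns for illustration; not the contents of libs/regex.py.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(r"-----BEGIN (?:RSA |EC |DSA )?PRIVATE KEY-----"),
}


def scan_text(text):
    """Return the sorted names of all patterns that match the text."""
    return sorted(name for name, rx in PATTERNS.items() if rx.search(text))


def scan_repo(root):
    """Map relative file paths in a cloned repository to matched pattern names."""
    findings = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip Git metadata so object files are not scanned.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    matched = scan_text(handle.read())
            except OSError:
                continue  # unreadable file: skip it
            if matched:
                findings[os.path.relpath(path, root)] = matched
    return findings
```

Naming each pattern is what makes classification possible: a finding records which kind of secret was seen, not just that something matched.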

Requirements

  • Python 3 and pip (python3-pip)

Tested on Ubuntu 18.04.

Setup

  • Install requirements:

python3 -m pip install -r requirements.txt

Please make sure you have PyYAML version 5.x or higher installed.

  • Fill in the required information in the configuration file (config.ini):

[git]
user = <username_git>
pass = <password_git>
url_code = https://api.github.com/search/code?q={}+in:file&sort=indexed&order=desc
url_repos = https://api.github.com/search/repositories?q={}+size:>0+is:public&sort=indexed&order=desc
url_commit = https://api.github.com/search/commits?q={}+is:public&sort=indexed&order=desc
rpp = 50
[slack]
webhooks =<full_link_webhooks>
[path]
rule =<path to rule folder>
source =<path to folder to clone repository>
log =<filename of log>
[msg]
start = ====================*====================
    *Start scanning at {}*
    _Clone completed successfully:_
end = ====================*====================
    *Scanning Done at {}*
    _Detected possible repository:_
all = ====================**====================
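
As a rough illustration of how such a file can be consumed, the sketch below reads a trimmed copy of the configuration with Python's standard `configparser` and fills the `{}` placeholder in `url_code`. The helper names `load_config` and `code_search_url` are hypothetical, and passing `rpp` as GitHub's `per_page` parameter is an assumption about what that setting means. Note that `configparser` only treats a line as a continuation of a multi-line value when it is indented.

```python
import configparser
from urllib.parse import quote_plus

# Trimmed copy of config.ini; continuation lines under "start" are indented.
SAMPLE = """\
[git]
url_code = https://api.github.com/search/code?q={}+in:file&sort=indexed&order=desc
rpp = 50
[msg]
start = ====================*====================
    *Start scanning at {}*
    _Clone completed successfully:_
"""


def load_config(text):
    """Parse INI text; interpolation=None keeps '{}' and any '%' literal."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser


def code_search_url(config, query):
    """Fill the url_code template with a URL-encoded query and a page size."""
    url = config["git"]["url_code"].format(quote_plus(query))
    # Assumption: rpp means "results per page" and maps to per_page.
    return "{}&per_page={}".format(url, config.getint("git", "rpp"))


config = load_config(SAMPLE)
start_lines = config["msg"]["start"].splitlines()
```

With this sample, `start_lines` holds the three lines of the start message, and the second line still contains the `{}` placeholder for the scan time.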

  • Write the rules (Searching rules). Put your rules in the rules directory. For example:

 id: Project_X_Matching
 key: X
 language:
   - java
 #filename:
 #  - LICENSE
 #extension:
 #  - py
 #  - md
 ignore:
 #  language:
 #    - php
   filename:
     - LICENSE
   extension:
     - html
     - txt
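
One way to picture how a rule like this could become a search request is the sketch below, which turns the rule into a GitHub search query string. This is an assumption about the mapping, not GitMonitor's actual code: `build_query` is a hypothetical helper, and it relies on GitHub's search syntax, where `language:`, `filename:`, and `extension:` are qualifiers and a leading `-` excludes matches. The rule is written as the dict that PyYAML's `yaml.safe_load` would be expected to return for the YAML above.

```python
QUALIFIERS = ("language", "filename", "extension")


def build_query(rule):
    """Turn a searching rule (as a dict) into a GitHub search query string."""
    terms = [rule["key"]]
    for qualifier in QUALIFIERS:
        for value in rule.get(qualifier) or []:
            terms.append("{}:{}".format(qualifier, value))
    ignore = rule.get("ignore") or {}
    for qualifier in QUALIFIERS:
        for value in ignore.get(qualifier) or []:
            # A leading "-" excludes results matching the qualifier.
            terms.append("-{}:{}".format(qualifier, value))
    return " ".join(terms)


# Commented-out YAML keys are simply absent from the parsed dict.
rule = {
    "id": "Project_X_Matching",
    "key": "X",
    "language": ["java"],
    "ignore": {"filename": ["LICENSE"], "extension": ["html", "txt"]},
}
query = build_query(rule)
```

For this rule, `query` is `X language:java -filename:LICENSE -extension:html -extension:txt`, which would still need URL-encoding before being placed in a search URL.
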
  • Define the regular expressions (Sensitive filtering rules) in the libs/regex.py file.
  • Run:

python3 gitmonitor.py

  • You can schedule the tool to run automatically with a cron job.
R K
