Socid-Extractor extracts information about a user from profile webpages and API responses and saves it in a machine-readable format.

Usage

As a command-line tool:

$ socid_extractor --url https://www.deviantart.com/muse1908
country: France
created_at: 2005-06-16 18:17:41
gender: female
username: Muse1908
website: www.patreon.com/musemercier
links: ['https://www.facebook.com/musemercier', 'https://www.instagram.com/muse.mercier/', 'https://www.patreon.com/musemercier']
tagline: Nothing worth having is easy…
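If the command-line output needs to be consumed by another script, the `key: value` lines shown above can be turned into a dictionary. The following is a minimal sketch, assuming the tool prints exactly one `key: value` pair per line as in the sample; `parse_cli_output` is a hypothetical helper and not part of socid_extractor.

```python
def parse_cli_output(text):
    """Parse 'key: value' lines into a dict (hypothetical helper)."""
    info = {}
    for line in text.splitlines():
        # Split on the first ': ' only, so values such as timestamps
        # or URLs keep their own colons.
        key, sep, value = line.partition(": ")
        if sep:  # skip lines that do not look like 'key: value'
            info[key.strip()] = value.strip()
    return info


# Sample text copied from the command output shown above.
sample_output = """country: France
created_at: 2005-06-16 18:17:41
gender: female
username: Muse1908"""

info = parse_cli_output(sample_output)
print(info["created_at"])  # 2005-06-16 18:17:41
```

Splitting on the first `": "` rather than on every colon keeps the `created_at` timestamp intact.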

Without installing:

$ ./run.py --url https://www.deviantart.com/muse1908

As a Python library:

import socid_extractor, requests
r = requests.get('https://www.patreon.com/annetlovart')
socid_extractor.extract(r.text)
{'patreon_id': '33913189', 'patreon_username': 'annetlovart', 'fullname': 'Annet Lovart', 'links': "['https://www.facebook.com/322598031832479', 'https://www.instagram.com/annet_lovart', 'https://twitter.com/annet_lovart', 'https://youtube.com/channel/UClDg4ntlOW_1j73zqSJxHHQ']"}
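In the result shown above, the `links` value is a string that contains a Python list literal rather than an actual list. A small sketch of how such a value could be converted is given below, assuming the field always holds a valid list literal; `parse_links` is a hypothetical helper, and the sample dictionary is a shortened copy of the output above.

```python
import ast


def parse_links(info):
    """Return the 'links' field as a real list (hypothetical helper)."""
    raw = info.get("links", "[]")
    if isinstance(raw, list):
        return raw  # already a list, nothing to convert
    # literal_eval only evaluates Python literals, so it is safer than eval();
    # it raises ValueError or SyntaxError on malformed input.
    return ast.literal_eval(raw)


# Shortened copy of the result shown above.
sample = {
    "patreon_id": "33913189",
    "patreon_username": "annetlovart",
    "fullname": "Annet Lovart",
    "links": "['https://www.facebook.com/322598031832479', "
             "'https://www.instagram.com/annet_lovart']",
}

links = parse_links(sample)
print(links[0])  # https://www.facebook.com/322598031832479
```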

Installation

$ pip3 install socid-extractor

The latest development version can be installed directly from GitHub:

$ pip3 install -U git+https://github.com/soxoj/socid_extractor.git

Sites and Methods

More than 100 methods for different sites and platforms are supported!

  • Google (all documents pages, maps contributions), cookies required
  • Yandex (disk, albums, znatoki, music, realty, collections), cookies required to prevent captcha blocks
  • Mail.ru (my.mail.ru user mainpage, photo, video, games, communities)
  • Facebook (user & group pages)
  • VK.com (user page)
  • OK.ru (user page)
  • Instagram
  • Reddit
  • Medium
  • Flickr
  • Tumblr
  • TikTok
  • GitHub

…and many others.

You can also check the tests file for data examples and the schemes file to explore all the methods.

When it may be useful

  • Getting all available info by username and/or account UID. Examples: Week in OSINT, OSINTCurious
  • User tracking: checking that an account was previously known (by ID) even if all of its public info has changed. Examples: Aware Online
  • Searching by commonly used cross-service UIDs (GAIA ID, Facebook UID, Yandex Public ID, etc.)
    • DB leaks of forums and platforms in SQL format
    • Indexed links that contain target profile ID
  • Searching for tracking data by comparison with other IDs: how it works, how it can be used.
  • Law enforcement online requests
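The user-tracking case from the list above (recognising an account by ID after its public info has changed) can be sketched as a comparison of two extraction results. This is only an illustration: `shared_ids` is a hypothetical helper, the rule that identifier fields are named `uid` or end in `_id` is an assumption, and the second dictionary is invented sample data.

```python
def shared_ids(old_info, new_info):
    """Return ID fields present in both results with equal values.

    Assumption: identifier fields are named 'uid' or end with '_id'.
    """
    def is_id_key(key):
        return key == "uid" or key.endswith("_id")

    return {
        key: value
        for key, value in old_info.items()
        if is_id_key(key) and new_info.get(key) == value
    }


# Earlier result, taken from the Patreon example in the Usage section.
old = {"patreon_id": "33913189", "patreon_username": "annetlovart",
       "fullname": "Annet Lovart"}
# Invented later result: username and name changed, ID unchanged.
new = {"patreon_id": "33913189", "patreon_username": "renamed_user",
       "fullname": "Someone Else"}

print(shared_ids(old, new))  # {'patreon_id': '33913189'}
```

A non-empty result suggests that both extractions describe the same account, even though every human-readable field differs.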

Testing

python3 -m pytest tests/test_e2e.py -n 10 -k 'not cookies' -m 'not github_failed and not rate_limited'
