github.com/sokomishalov/skraper
Kotlin/Java library and cli tool for scraping posts and media from various sources with neither authorization nor full page rendering (Facebook, Instagram, Twitter, Youtube, Tiktok, Telegram, Twitch, Reddit, 9GAG, Pinterest, Flickr, Tumblr, Coub, Vimeo, IFunny, VK, Odnoklassniki, Pikabu)
Contributors: 4
Lines of Code: 1,771
From: 2020-01-15
To: 2020-12-22
About sokomishalov/skraper
Skraper is a Kotlin library and command-line tool for extracting posts, media, and metadata from eighteen social media and content-sharing platforms, including Facebook, Instagram, Twitter, YouTube, TikTok, Reddit, Twitch, Telegram, Pinterest, Tumblr, Flickr, VK, and others. It requires neither user authorization nor full page rendering; instead it relies on lightweight parsing with jsoup and Jackson. It supports downloading media files and scraping post metadata across all implemented sources.
The project is organized into three main components: a standalone CLI tool, a Kotlin library for integration into applications, and a Telegram bot. The library exposes a consistent interface across platforms, with a separate scraper implementation for each source, so developers can retrieve user posts, fetch channel information, and resolve direct media links in the same way regardless of the site. The HTTP client is pluggable, with built-in support for OkHttp, Spring WebFlux, Ktor, and a basic blocking Java client.
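The pattern described above, one shared interface with a per-source implementation and a swappable HTTP client, can be sketched as follows. This is a minimal illustration only: every name here (`Post`, `HttpFetcher`, `SourceScraper`, `ExampleScraper`, `demo`) is hypothetical and is not taken from Skraper's actual API, and the line-based response format is a stand-in for real HTML or JSON parsing.

```kotlin
// All names below are hypothetical; this is NOT Skraper's real API.

// A scraped post, reduced to the fields this sketch needs.
data class Post(val id: String, val text: String, val mediaUrls: List<String>)

// Pluggable transport: any HTTP stack can be adapted to this single method.
fun interface HttpFetcher {
    fun fetch(url: String): String
}

// The shared contract that every per-source scraper would implement.
interface SourceScraper {
    val baseUrl: String
    fun getPosts(path: String, limit: Int): List<Post>
}

// One implementation for an imaginary source.
class ExampleScraper(private val http: HttpFetcher) : SourceScraper {
    override val baseUrl = "https://example.com"

    override fun getPosts(path: String, limit: Int): List<Post> {
        val body = http.fetch(baseUrl + path)
        // Toy format (an assumption): one post per line as "id|text|mediaUrl".
        return body.lineSequence()
            .filter { it.isNotBlank() }
            .map { it.split("|") }
            .filter { it.size == 3 }
            .take(limit)
            .map { (id, text, media) -> Post(id, text, listOf(media)) }
            .toList()
    }
}

// Wires the scraper to a stub client that returns canned content,
// so the sketch runs without any network access.
fun demo(): List<Post> {
    val stub = HttpFetcher { _ ->
        "1|hello|https://example.com/a.jpg\n2|world|https://example.com/b.jpg"
    }
    val scraper: SourceScraper = ExampleScraper(stub)
    return scraper.getPosts("/user/demo", limit = 1)
}
```

Calling `demo()` from a `main` function returns the first canned post. Because the scraper depends only on `HttpFetcher`, the stub could be replaced by an adapter over any real HTTP client without touching the parsing code.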
The maintainer notes that the tool may break unpredictably, since websites frequently change their structure without notice. The library can be added as a dependency in Maven or Gradle builds and includes Java interoperability utilities, so it can be used from Java projects as well as through its native Kotlin API.