peteromallet/dataclaw
2,086 stars · Last commit 2026-05-30
Agent harness to publish your history from Claude Code et al. as Hugging Face datasets.
# DataClaw
> **This is a performance art project.** Anthropic built their models on the world's freely shared information, then introduced increasingly [dystopian data policies](https://www.anthropic.com/news/detecting-and-preventing-distillation-attacks) to stop anyone else from doing the same with their data, pulling up the ladder behind them. DataClaw lets you throw the ladder back down. The dataset it produces is yours to share.
Turn your Claude Code, Codex, and other coding-agent conversation history into structured data and publish it to Hugging Face with a single command. DataClaw parses session logs, redacts secrets and PII, and uploads the result as a ready-to-use dataset.
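
As a rough illustration of the parse-and-redact step described above, here is a minimal Python sketch. It is not DataClaw's actual implementation: the function names `redact` and `parse_session`, the JSONL input shape with `role` and `content` fields, and the two regex patterns are all assumptions made for this example.

```python
import json
import re

# Hypothetical patterns for illustration only; DataClaw's real
# redaction rules are not described in this README.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9_-]{20,}"),         # API-key-like tokens
    re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),  # email addresses
]


def redact(text: str) -> str:
    """Replace every match of a secret pattern with a placeholder."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


def parse_session(jsonl_text: str) -> list[dict]:
    """Turn a JSONL session log (assumed format) into redacted records."""
    records = []
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # skip blank lines between events
        event = json.loads(line)
        records.append(
            {
                "role": event.get("role", "unknown"),
                "content": redact(str(event.get("content", ""))),
            }
        )
    return records
```

Regex-based redaction like this only catches patterns it knows about, so any real export is still worth reviewing by hand before publishing.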

Every export is tagged **`dataclaw`** on Hugging Face. Together, these exports may someday form a [distributed dataset](https://huggingface.co/datasets?other=dataclaw) of real-world human-AI coding collaboration.
## Download for Mac
<p align="center">
<a href="https://github.com/peteromallet/dataclaw/releases/latest/download/DataClaw-macOS-Apple-Silicon.dmg">
<img alt="Download DataClaw for Apple Silicon Macs" src="https://img.shields.io/badge/Download%20for%20Mac-Apple%20Silicon-111111?style=for-the-badge&logo=apple&logoColor=white">
</a>
<a href="https://github.com/peteromallet/dataclaw/releases/latest">
<img alt="View GitHub Releases" src="https://img.shields.io/badge/View%20Releases-GitHub-0969da?style=for-the-badge&logo=github&logoColor=white">
</a>
</p>