peteromallet/dataclaw
2,046 stars · Last commit 2026-04-03
Agent harness to publish your history from Claude Code et al. as Hugging Face datasets.
README preview
# DataClaw

> **This is a performance art project.** Anthropic built their models on the world's freely shared information, then introduced increasingly [dystopian data policies](https://www.anthropic.com/news/detecting-and-preventing-distillation-attacks) to stop anyone else from doing the same with their data - pulling up the ladder behind them. DataClaw lets you throw the ladder back down. The dataset it produces is yours to share.

Turn your Claude Code, Codex, and other coding-agent conversation history into structured data and publish it to Hugging Face with a single command. DataClaw parses session logs, redacts secrets and PII, and uploads the result as a ready-to-use dataset.

Every export is tagged **`dataclaw`** on Hugging Face. Together, they may someday form a growing [distributed dataset](https://huggingface.co/datasets?other=dataclaw) of real-world human-AI coding collaboration.

## Give this to your agent

Paste this into Claude Code, Codex, or any coding agent:

```
Help me export my Claude Code, Codex, and other coding-agent conversation history to Hugging Face using DataClaw. Install it, then walk me through the process.

STEP 1 - INSTALL
pip install -U dataclaw