Install
openclaw skills install convert-github-repository

Use when (1) the user provides a GitHub repository URL or local repo path and asks to convert it to a different format; (2) the user asks to export a GitHub repository as Markdown documentation, JSON metadata, or CSV; (3) the user wants to transform repository structure into a different representation (e.g., folder tree to JSON, issue tracker to README).

This skill converts GitHub repository content and metadata into target formats (Markdown documentation, JSON metadata, CSV tables, folder tree JSON). It preserves structure and content fidelity across formats: the conversion is not a simple file copy but a semantic transformation that respects GitHub's data model (repos, trees, commits, issues, PRs, releases).
Key responsibilities:
- .git directory

/convert-github-repository --markdown

Repository → Markdown documentation. Converts the entire repository into a set of Markdown files:
- README.md from repository root
- README.md describing its contents
- issues/YYYY-MM-DD-{number}-{title}.md
- pull-requests/YYYY-MM-DD-{number}-{title}.md
- releases/v{semver}.md

/convert-github-repository --json

Repository → JSON metadata. Exports repository structure and content as a JSON file:
{
"repo": { "name": "...", "description": "...", "stars": N, "license": "...", "topics": [...] },
"files": [{ "path": "...", "size": N, "type": "file|dir", "sha": "..." }],
"branches": [{ "name": "...", "last_commit": "..." }],
"contributors": [{ "login": "...", "contributions": N }]
}
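The "repo" object above can be filled from the GitHub repository API response. The following is a minimal sketch, assuming the field mapping `full_name` → `name`, `stargazers_count` → `stars`, and `license.spdx_id` → `license`; the function name `to_repo_record` is hypothetical. Absent fields are kept as `None` rather than dropped.

```python
from typing import Any


def to_repo_record(api_repo: dict[str, Any]) -> dict[str, Any]:
    """Map a GET /repos/{owner}/{repo} response to the "repo" object (assumed mapping)."""
    # The API returns null for "license" when no license is detected.
    license_info = api_repo.get("license") or {}
    return {
        "name": api_repo.get("full_name"),
        "description": api_repo.get("description"),
        "stars": api_repo.get("stargazers_count"),
        "license": license_info.get("spdx_id"),
        "topics": api_repo.get("topics") or [],
    }
```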
/convert-github-repository --csv

Issues + PRs → CSV. Exports issues and pull requests as CSV rows with columns: number, title, state, author, created_at, updated_at, labels, assignees, milestone, body_preview.
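One way to produce these rows is with Python's `csv` module, which quotes fields and doubles embedded quotes on its own. This is a sketch under assumptions: the input is a simplified dict per issue (not the raw API shape), multi-valued fields are joined with `;`, and `body_preview` is the first 200 characters of the body with whitespace collapsed. `issues_to_csv` and `preview_chars` are hypothetical names.

```python
import csv
import io

COLUMNS = ["number", "title", "state", "author", "created_at", "updated_at",
           "labels", "assignees", "milestone", "body_preview"]


def issues_to_csv(issues: list[dict], preview_chars: int = 200) -> str:
    """Serialize simplified issue/PR dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for issue in issues:
        body = issue.get("body") or ""
        writer.writerow({
            "number": issue["number"],
            "title": issue["title"],
            "state": issue["state"],
            "author": issue.get("author", ""),
            "created_at": issue.get("created_at", ""),
            "updated_at": issue.get("updated_at", ""),
            # Assumption: multi-valued fields share one cell, joined with ";".
            "labels": ";".join(issue.get("labels", [])),
            "assignees": ";".join(issue.get("assignees", [])),
            "milestone": issue.get("milestone") or "",
            # Collapse newlines so each issue stays on one CSV row.
            "body_preview": " ".join(body.split())[:preview_chars],
        })
    return buffer.getvalue()
```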
/convert-github-repository --tree

Repository → folder tree JSON. Outputs a tree structure representing the repository layout:
{
"path": "/",
"type": "directory",
"children": [
{ "path": "src/index.js", "type": "file", "size": 1234, "language": "JavaScript" },
{ "path": "tests/", "type": "directory", "children": [...] }
]
}
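A nested structure of this shape can be built from a flat list of file paths. The sketch below assumes `(path, size)` tuples as input and a small extension-to-language table; `build_tree` and `LANGUAGES` are hypothetical names, and files with unknown extensions simply carry no `language` key.

```python
from pathlib import PurePosixPath

# Illustrative subset; a real table would cover more extensions.
LANGUAGES = {".js": "JavaScript", ".py": "Python", ".go": "Go", ".rs": "Rust"}


def build_tree(files: list[tuple[str, int]]) -> dict:
    """Nest (path, size) pairs into a directory tree rooted at "/"."""
    root = {"path": "/", "type": "directory", "children": []}
    dirs = {"": root}  # directory path -> node, so each directory is created once
    for path, size in sorted(files):
        parts = PurePosixPath(path).parts
        parent = root
        # Walk (and create on demand) every ancestor directory of the file.
        for depth in range(1, len(parts)):
            dir_path = "/".join(parts[:depth])
            if dir_path not in dirs:
                node = {"path": dir_path + "/", "type": "directory", "children": []}
                dirs[dir_path] = node
                parent["children"].append(node)
            parent = dirs[dir_path]
        entry = {"path": path, "type": "file", "size": size}
        language = LANGUAGES.get(PurePosixPath(path).suffix)
        if language:
            entry["language"] = language
        parent["children"].append(entry)
    return root
```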
/convert-github-repository --readme-to-json

README.md → structured JSON. Parses a README.md and extracts: title (first H1), description (first paragraph after title), installation steps, usage examples, contributing guidelines, license.
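The extraction can be approximated with a single line-by-line pass. This is a rough sketch, not a full Markdown parser: it assumes ATX headings (`#`), backtick fences, and inline links only, and it keys sections by their lowercased heading text. `readme_to_json` and the output keys `sections`, `code_examples`, and `links` are illustrative choices.

```python
import re

FENCE = "`" * 3  # a code fence marker
LINK = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def readme_to_json(markdown: str) -> dict:
    """Extract title, description, sections, code examples, and links."""
    result = {"title": None, "description": None, "sections": {},
              "code_examples": [], "links": []}
    section = None      # key of the section currently being filled
    fence = None        # lines of the open code fence, or None outside one
    intro = []          # lines of the first paragraph after the title
    intro_done = False
    for line in markdown.splitlines():
        if line.startswith(FENCE):
            if fence is None:
                fence = []
            else:
                result["code_examples"].append("\n".join(fence))
                fence = None
            continue
        if fence is not None:
            fence.append(line)
            continue
        result["links"] += [{"text": t, "url": u} for t, u in LINK.findall(line)]
        heading = HEADING.match(line)
        if heading:
            level, text = len(heading.group(1)), heading.group(2).strip()
            if level == 1 and result["title"] is None:
                result["title"] = text
            else:
                section = text.lower()
                result["sections"][section] = ""
                intro_done = True
            continue
        if section is not None:
            result["sections"][section] = (result["sections"][section] + "\n" + line).strip()
        elif result["title"] is not None and not intro_done:
            if line.strip():
                intro.append(line.strip())
            elif intro:
                intro_done = True  # a blank line ends the first paragraph
    result["description"] = " ".join(intro) or None
    return result
```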
Remote GitHub repo: Parse URL to extract owner and repo:
https://github.com/owner/repo -> {owner: "owner", repo: "repo"}
Use GitHub API with GITHUB_TOKEN from env (os.getenv("GITHUB_TOKEN")).
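A small sketch of the URL parsing step, assuming only plain `https://github.com/owner/repo` URLs (optionally ending in `.git` or a trailing slash) need to be accepted; deeper URLs such as `/tree/main` are rejected here. `parse_github_url` is a hypothetical helper name.

```python
import re

GITHUB_URL = re.compile(
    r"^https://github\.com/(?P<owner>[^/\s]+)/(?P<repo>[^/\s#?]+?)(?:\.git)?/?$"
)


def parse_github_url(url: str) -> dict:
    """Return {"owner": ..., "repo": ...} or raise ValueError for other URLs."""
    match = GITHUB_URL.match(url.strip())
    if match is None:
        raise ValueError(f"Not a GitHub repository URL: {url}")
    return {"owner": match.group("owner"), "repo": match.group("repo")}
```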
Local repository: Verify .git directory exists. Use git CLI to extract:
- git ls-files for file list
- git log --oneline for recent commits
- git remote get-url origin to confirm repo identity

If neither source is available, report: "Cannot identify repository source: provide either a GitHub URL (https://github.com/owner/repo) or a local path with a .git directory."
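The local-repository checks above could be sketched as follows. It assumes the `git` executable is on `PATH`; the helper name `describe_local_repo`, the returned keys, and the 20-commit limit are illustrative. A failing git command yields an empty result instead of an exception.

```python
import subprocess
from pathlib import Path


def describe_local_repo(path: str) -> dict:
    """Verify .git exists, then collect files, recent commits, and origin URL."""
    repo = Path(path)
    # .git is a directory in normal clones and a file in worktrees/submodules.
    if not (repo / ".git").exists():
        raise ValueError(f"Path is not a git repository: {path}")

    def git(*args: str) -> str:
        done = subprocess.run(["git", "-C", str(repo), *args],
                              capture_output=True, text=True, check=False)
        return done.stdout.strip() if done.returncode == 0 else ""

    return {
        "files": git("ls-files").splitlines(),
        "recent_commits": git("log", "--oneline", "-n", "20").splitlines(),
        "origin": git("remote", "get-url", "origin") or None,
    }
```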
For remote repos, call GitHub API:
GET https://api.github.com/repos/{owner}/{repo}
Authorization: Bearer {GITHUB_TOKEN}
Accept: application/vnd.github+json
Response contains: full_name, description, stargazers_count, forks_count, license.spdx_id, topics, default_branch, created_at, updated_at, homepage, language.
If the token is missing or rate-limited (403), try unauthenticated (lower rate limit) or report: "GitHub API unavailable — check GITHUB_TOKEN or try again later (rate limit: 60 req/hr unauthenticated)."
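The request and the 403 fallback might look like this with only the standard library. It is a sketch under assumptions: `build_repo_request` and `fetch_repo` are hypothetical names, the timeout is arbitrary, and a 403 is treated as the only signal worth retrying without the token.

```python
import json
import os
import urllib.error
import urllib.request
from typing import Optional

API_ROOT = "https://api.github.com"


def build_repo_request(owner: str, repo: str,
                       token: Optional[str] = None) -> urllib.request.Request:
    """Build the GET /repos/{owner}/{repo} request; the token is optional."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(f"{API_ROOT}/repos/{owner}/{repo}", headers=headers)


def fetch_repo(owner: str, repo: str) -> dict:
    """Fetch repository metadata, retrying once without the token on a 403."""
    token = os.getenv("GITHUB_TOKEN")
    attempts = [token, None] if token else [None]
    for candidate in attempts:
        request = build_repo_request(owner, repo, candidate)
        try:
            with urllib.request.urlopen(request, timeout=30) as response:
                return json.load(response)
        except urllib.error.HTTPError as error:
            if error.code != 403:
                raise  # only 403 falls through to the next attempt
    raise RuntimeError(
        "GitHub API unavailable: check GITHUB_TOKEN or try again later "
        "(rate limit: 60 req/hr unauthenticated)"
    )
```

The loop makes at most two requests, so a persistent 403 ends in a clear error instead of repeated retries.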
Get default branch tree:
GET https://api.github.com/repos/{owner}/{repo}/git/trees/{default_branch}?recursive=1
Returns a flat list of all files with path, type (blob/tree), size, sha.
For local repos: Run git ls-tree -r --name-only {branch} to get the file list, then git show {sha}:{path} to read file content.
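For the remote case, the flat tree listing can be turned into the `files` records of the JSON export. This sketch assumes the mapping `blob` → `file` and `tree` → `dir`, skips submodule entries (type `commit`), and refuses a listing the API marked as `truncated`; `tree_to_files` is a hypothetical name.

```python
def tree_to_files(tree_response: dict) -> list[dict]:
    """Convert a recursive git/trees response into path/size/type/sha records."""
    if tree_response.get("truncated"):
        # The API sets this flag when the listing exceeds its size limits.
        raise RuntimeError("Tree listing was truncated; fetch subtrees individually")
    kinds = {"blob": "file", "tree": "dir"}
    return [
        {
            "path": entry["path"],
            "size": entry.get("size", 0),  # tree entries carry no size
            "type": kinds[entry["type"]],
            "sha": entry["sha"],
        }
        for entry in tree_response.get("tree", [])
        if entry["type"] in kinds
    ]
```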
Filter out common non-essential paths:
- node_modules/, .git/, vendor/: skip unless specifically requested
- .gitignore-referenced files that are not committed: skip

--markdown conversion:
- .js → ```javascript, .py → ```python, .go → ```go, .rs → ```rust, etc.
- README.md in each directory with list of contained files
- {number}-{slug}.md with frontmatter:
---
number: 42
state: open
author: username
created: 2024-01-15
labels: [bug, help-wanted]
---
# Title
body text...
- merged field and review comments

--json conversion:

--csv conversion:
- number,title,state,author,created_at,updated_at,labels,assignees,milestone,body_preview
- ; delimiter
- ; delimiter
- ""

--tree conversion:
- .js → JavaScript, .py → Python, etc.)

--readme-to-json conversion:
- # Heading → title
- installation, usage, contributing, etc.)
- code_examples array
- [text](url) → collected in links array

Before delivering:
- json.loads() and confirm no errors

Return the converted output plus a manifest:
{
"format": "markdown",
"files_converted": 142,
"files_skipped": 2,
"skipped_reasons": [
{"path": "node_modules/package/index.js", "reason": "dependency directory, excluded by default"},
{"path": ".git/config", "reason": "contains sensitive data"}
],
"total_size_bytes": 2048576,
"output_path": "./{repo-name}-converted/"
}
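A manifest of this shape can be assembled so that the counts always agree with the listed entries. The sketch below assumes converted files arrive as `(path, size)` pairs and skipped files as `(path, reason)` pairs; `build_manifest` is a hypothetical helper, and the JSON round trip is a cheap check that the result serializes.

```python
import json


def build_manifest(fmt: str, converted: list[tuple[str, int]],
                   skipped: list[tuple[str, str]], output_path: str) -> dict:
    """Build the manifest; counts are derived from the lists, never hand-set."""
    manifest = {
        "format": fmt,
        "files_converted": len(converted),
        "files_skipped": len(skipped),
        "skipped_reasons": [{"path": p, "reason": r} for p, r in skipped],
        "total_size_bytes": sum(size for _, size in converted),
        "output_path": output_path,
    }
    # Round-trip through the parser; raises if the manifest is not valid JSON data.
    json.loads(json.dumps(manifest))
    return manifest
```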
- os.getenv("GITHUB_TOKEN")
- .gitignore unless user explicitly requests --include-ignored
- skipped_reasons in the manifest
- files_converted, files_skipped, skipped_reasons
- \uFFFD
- git CLI: never assume raw file access is available

| Criterion | Minimum | Ideal |
|---|---|---|
| Content preservation | 100% of text file content preserved | Binary files listed in manifest, not content-lost |
| Format validity | Output passes format parser | Strict schema validation (JSON: draft-7, CSV: consistent columns) |
| Manifest completeness | Every skipped file has a reason | Every converted file listed with size and type |
| Encoding correctness | UTF-8 for all text files | Invalid bytes replaced with U+FFFD, not dropped or garbled |
| Rate limit handling | 403 triggers re-auth attempt | Proactive pagination with token refresh on 403 |
| Large repo handling | < 500MB memory for repos up to 10K files | Streaming JSON output, chunked file writes |
A good output passes the target format parser without errors, preserves all semantic content, and includes a complete manifest of what was converted and why some files were skipped.
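The encoding criterion can be met with Python's `errors="replace"` decoding, which substitutes U+FFFD for invalid bytes. In this sketch, treating any NUL byte in the first 8 KiB as a sign of binary content is an assumed heuristic, and `decode_text` is a hypothetical name.

```python
from typing import Optional


def decode_text(data: bytes) -> Optional[str]:
    """Return decoded text, or None when the content looks binary."""
    # Assumed heuristic: text files do not contain NUL bytes.
    if b"\x00" in data[:8192]:
        return None
    # Invalid UTF-8 sequences become U+FFFD instead of being dropped.
    return data.decode("utf-8", errors="replace")
```

A `None` result is the caller's cue to list the file in `skipped_reasons` rather than convert it.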
| Scenario | Bad | Good |
|---|---|---|
| Missing token | Retries unauthenticated forever | Reports "GitHub API auth required — set GITHUB_TOKEN environment variable" |
| Binary file | Tries to read as text, garbles content | Skips binary, reports in manifest: {"path": "logo.png", "reason": "binary/image, excluded"} |
| Large repo | Loads all files into memory, crashes | Streams output, reports "Processed 8,432 of 12,000 files (70%)" |
| Rate limited | Fails silently after 3 requests | Reports "Rate limit exceeded (403) — retry after 14:32 UTC or set GITHUB_TOKEN" |
| Missing field | Skips homepage when absent | Includes "homepage": null — no field silently dropped |
| Format error | Writes malformed JSON with trailing comma | Validates with json.loads(), reports "Output invalid: unexpected token at line 847" |
| Local repo path | Assumes ~/repo exists | Checks .git directory, reports "Path is not a git repository: ./myrepo" |
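The "Rate limited" row can be backed by GitHub's `X-RateLimit-Reset` response header, which holds the reset time in UTC epoch seconds. This is a sketch with a hypothetical helper name, `rate_limit_message`; it expects a mapping with that exact header key, so a plain dict must use the same casing.

```python
from datetime import datetime, timezone


def rate_limit_message(headers) -> str:
    """Build the user-facing message for a 403 rate-limit response."""
    reset = headers.get("X-RateLimit-Reset")
    if reset is None:
        return "Rate limit exceeded (403): try again later or set GITHUB_TOKEN"
    retry_at = datetime.fromtimestamp(int(reset), tz=timezone.utc)
    return f"Rate limit exceeded (403): retry after {retry_at:%H:%M} UTC or set GITHUB_TOKEN"
```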