# Contributing to SOTA Tracker

Thank you for your interest in contributing! This project welcomes contributions from developers, researchers, and data enthusiasts.

## Quick Start

### Development Setup

```bash
# Clone the repository
git clone https://github.com/romancircus/sota-tracker-mcp.git
cd sota-tracker-mcp

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
pip install pytest  # For tests

# Initialize database
python init_db.py

# Run tests
python -m pytest tests/ -v
```

## Project Structure

```
sota-tracker-mcp/
├── scrapers/          # Web scrapers for external sources
├── fetchers/          # API-based data fetchers
├── utils/             # Shared utilities (DB, logging, etc.)
├── server.py          # MCP server
├── rest_api.py        # REST API wrapper
├── init_db.py         # Database setup
├── tests/             # Test suite
└── data/              # Generated data (tracked via git)
```

## How to Report Issues

### Bug Reports

When reporting bugs, include:

1. **Environment**: OS, Python version
2. **Steps to reproduce**: Exact commands or API calls
3. **Expected vs actual behavior**: What you expected vs what happened
4. **Logs/Error messages**: Full stack traces

### Feature Requests

For new features, include:

1. **Use case**: What problem does this solve?
2. **Proposed solution**: How should it work?
3. **Alternatives considered**: Other approaches you considered

## Adding New Models

### Manual Model Addition

Edit `init_db.py` and add to `seed_sota_models()`:

```python
{
    "id": "new-model-123",
    "name": "New Model Name",
    "category": "llm_local",
    "release_date": "2026-02-01",
    "source": "manual",
    "is_sota": True,
    "is_open_source": True,
    "sota_rank": 5,
    "sota_rank_open": 3,
    "metrics": {
        "vram": "16GB",
        "notes": "Brief description",
        "why_sota": "Why it's SOTA",
        "strengths": ["Quality", "Speed"],
        "use_cases": ["Chat", "Code"]
    }
}
```

Then run:
```bash
python init_db.py
python scrapers/run_all.py --export
```

### Adding a New Scraper

1. Create file in `scrapers/`:
```python
# scrapers/new_source.py
class NewSourceScraper:
    def scrape(self) -> dict:
        # Returns {"source": "...", "models": [...]}
        pass
```

2. Add to `scrapers/run_all.py`:
```python
from scrapers.new_source import NewSourceScraper

# In main():
scrapers = [
    LMArenaScraper(),
    ArtificialAnalysisScraper(),
    NewSourceScraper(),  # Add here
]
```

## Code Style

- Follow PEP 8 style guide
- Use type hints for function signatures
- Add docstrings for functions and classes
- Keep functions under 50 lines when possible

## Testing

### Running Tests

```bash
# Run all tests
python -m pytest tests/ -v

# Run specific test file
python -m pytest tests/test_rest_api.py -v

# Run with coverage
python -m pytest tests/ --cov=. --cov-report=html
```

### Writing Tests

Add tests in `tests/` directory following the existing pattern:

```python
def test_new_functionality():
    assert True
```

## Data Sources

### Adding a New Data Source

1. Check `robots.txt` to ensure scraping is allowed
2. Implement scraper (`scrapers/`) or fetcher (`fetchers/`)
3. Add to `scrapers/run_all.py`
4. Update README.md with the new source

### Scraping Ethics

- Respect rate limits (1 request per second minimum)
- Only scrape publicly available data
- Cache results to minimize server load
- Attribute data sources in README

## Pull Request Process

1. Fork the repository
2. Create a feature branch (`git checkout -b feat/my-feature`)
3. Make your changes with descriptive commits
4. Ensure tests pass (`python -m pytest tests/`)
5. Push and create a pull request

### PR Checklist

- [ ] Tests added/updated
- [ ] Documentation updated if needed
- [ ] Code follows style guidelines
- [ ] All tests pass
- [ ] No secret keys or credentials

## Release Process

Releases are automated via GitHub Actions:

- **Daily**: Data updates at 6 AM UTC
- **Weekly**: Tagged releases on Sundays with JSON/CSV exports

No manual release required - automation handles everything.

## Questions?

- Open an issue for bugs or questions
- Check existing issues before creating new ones
- Join discussions for feature proposals

## License

By contributing, you agree that your contributions will be licensed under the MIT License.