Install
openclaw skills install automated-soap-note-generator
Transform unstructured clinical input (dictation, transcripts, or rough notes) into standardized SOAP (Subjective, Objective, Assessment, Plan) medical documentation. Use ONLY for initial documentation draft generation; ALL output requires physician review before entering patient records. Not for complex cases requiring nuanced clinical reasoning.
AI-powered clinical documentation tool that converts unstructured clinical input into professionally formatted SOAP notes compliant with medical documentation standards.
Key Capabilities:
✅ Use this skill when:
❌ Do NOT use when:
operative-report-generator
⚠️ ALWAYS Required:
Upstream Skills:
- medical-scribe-dictation: Convert physician verbal dictation to text input
- ehr-semantic-compressor: Summarize lengthy EHR notes for SOAP generation
- dicom-anonymizer: Prepare imaging reports for SOAP inclusion
- audio-script-writer: Convert audio recordings to text format
Downstream Skills:
- medical-email-polisher: Professional communication of SOAP summaries to patients
- clinical-data-cleaner: Standardize extracted data for research databases
- hipaa-compliance-auditor: Verify de-identification before sharing documentation
- discharge-summary-writer: Generate discharge summaries from SOAP encounters
- referral-letter-generator: Create referral letters based on Assessment and Plan sections
Complete Workflow:
Medical Scribe Dictation (audio→text) →
Automated SOAP Note Generator (this skill) →
Physician Review →
EHR Entry /
Medical Email Polisher (patient communication) /
Referral Letter Generator (referrals)
Handle various input formats and prepare for NLP analysis:
from scripts.soap_generator import SOAPNoteGenerator

generator = SOAPNoteGenerator()

# Process text input
soap_note = generator.generate(
    input_text="Patient presents with 2-day history of chest pain, radiating to left arm...",
    patient_id="P12345",
    encounter_date="2026-01-15",
    provider="Dr. Smith",
)

# Process from audio transcript
soap_note = generator.generate_from_transcript(
    transcript_path="consultation_transcript.txt",
    patient_id="P12345",
)
Input Preprocessing Steps:
Parameters:
| Parameter | Type | Required | Description | Default |
|---|---|---|---|---|
| input_text | str | Yes* | Raw clinical text or dictation | None |
| transcript_path | str | Yes* | Path to transcript file | None |
| patient_id | str | No | Patient identifier (MUST be de-identified for testing) | None |
| encounter_date | str | No | Date in ISO 8601 format (YYYY-MM-DD) | Current date |
| provider | str | No | Healthcare provider name | None |
| specialty | str | No | Medical specialty context | "general" |
| verbose | bool | No | Include confidence scores | False |
*Either input_text or transcript_path required
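The footnote rule and the date format from the table could be checked before any text reaches the generator. The sketch below is an illustration only: `validate_inputs` is a hypothetical helper, not part of `SOAPNoteGenerator`, and rejecting calls that pass both sources at once is an assumption of this example.

```python
from datetime import date, datetime
from typing import Optional


def validate_inputs(input_text: Optional[str] = None,
                    transcript_path: Optional[str] = None,
                    encounter_date: Optional[str] = None) -> str:
    """Check the source arguments and return the encounter date as YYYY-MM-DD."""
    # Assumption: exactly one source is accepted. The table only says that
    # one of the two is required.
    if bool(input_text) == bool(transcript_path):
        raise ValueError("Provide exactly one of input_text or transcript_path")
    if encounter_date is None:
        return date.today().isoformat()  # documented default: current date
    # strptime raises ValueError when the string is not a year-month-day date.
    return datetime.strptime(encounter_date, "%Y-%m-%d").date().isoformat()


print(validate_inputs(input_text="chest pain for 2 days", encounter_date="2026-01-15"))
# 2026-01-15
```

Failing early on a malformed date keeps an invalid value out of the note header.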
Best Practices:
Identify and extract medical concepts from unstructured text:
# Extract entities with context
entities = generator.extract_medical_entities(
    "Patient has history of hypertension and diabetes, "
    "currently taking lisinopril 10mg daily and metformin 500mg BID"
)
# Returns structured entities:
# {
#   "diagnoses": ["hypertension", "diabetes mellitus"],
#   "medications": [
#     {"name": "lisinopril", "dose": "10mg", "frequency": "daily"},
#     {"name": "metformin", "dose": "500mg", "frequency": "BID"}
#   ]
# }
Entity Types Recognized:
| Category | Examples | Notes |
|---|---|---|
| Diagnoses | diabetes, hypertension, pneumonia | ICD-10 compatible where possible |
| Symptoms | chest pain, headache, nausea | Includes severity modifiers |
| Medications | metformin, lisinopril, aspirin | Extracts dose, route, frequency |
| Procedures | ECG, CT scan, blood draw | Includes body site |
| Anatomy | left arm, chest, abdomen | Laterality and location |
| Lab Values | glucose 120, BP 140/90 | Units and reference ranges |
| Temporal | yesterday, 3 days ago, chronic | Normalized to relative dates |
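To make the Medications row concrete, here is a minimal regex-based sketch of pulling name, dose, and frequency out of free text. It is not the skill's `entity_extractor.py`; the pattern, the unit list, the frequency keywords, and the `extract_medications` name are all assumptions chosen for illustration, and route extraction is left out.

```python
import re

# Hypothetical pattern: a drug name, a dose with a unit, then a frequency keyword.
MEDICATION_PATTERN = re.compile(
    r"\b(?P<name>[A-Za-z]+)\s+"
    r"(?P<dose>\d+(?:\.\d+)?\s?(?:mg|mcg|g|mL|units))\s+"
    r"(?P<frequency>daily|BID|TID|QID|PRN|nightly)\b",
    re.IGNORECASE,
)


def extract_medications(text):
    """Return one dict per medication mention matched by the pattern."""
    return [
        {
            "name": match.group("name").lower(),
            "dose": match.group("dose").replace(" ", ""),
            "frequency": match.group("frequency"),
        }
        for match in MEDICATION_PATTERN.finditer(text)
    ]


text = ("Patient has history of hypertension and diabetes, currently taking "
        "lisinopril 10mg daily and metformin 500mg BID")
for medication in extract_medications(text):
    print(medication)
```

A fixed pattern like this misses multi-word drug names and unlisted frequencies, so a real extractor would likely need a drug vocabulary on top of it.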
Common Issues and Solutions:
Issue: Missed medications
Issue: Ambiguous abbreviations
Issue: Misspelled drug names
Automatically categorize sentences into appropriate SOAP sections:
# Classify content into SOAP sections
classified = generator.classify_soap_sections(
    "Patient reports chest pain for 2 days. Physical exam shows BP 140/90. "
    "Likely angina. Schedule stress test and start aspirin 81mg daily."
)
# Output structure:
# {
#   "Subjective": ["Patient reports chest pain for 2 days"],
#   "Objective": ["Physical exam shows BP 140/90"],
#   "Assessment": ["Likely angina"],
#   "Plan": ["Schedule stress test", "start aspirin 81mg daily"]
# }
Classification Rules:
| Section | Content Type | Examples |
|---|---|---|
| S - Subjective | Patient-reported information | "Patient states...", "Patient reports...", "Complains of..." |
| O - Objective | Observable/measurable findings | Vital signs, physical exam, lab results, imaging |
| A - Assessment | Clinical interpretation | Diagnosis, differential, clinical impression |
| P - Plan | Actions to be taken | Medications, procedures, follow-up, patient education |
Multi-label Handling: Some sentences span multiple sections (e.g., "Patient reports chest pain [S], which was sharp and 8/10 [S], with ECG showing ST elevation [O]")
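The rules in the table can be approximated with keyword cues. The sketch below is a simplified stand-in, not the logic in `soap_classifier.py`: the cue lists, `SECTION_CUES`, and `classify_sentences` are hypothetical, and a sentence that matches cues from several sections is filed under each of them to illustrate multi-label handling.

```python
import re

# Hypothetical cue lists, loosely based on the examples in the table above.
SECTION_CUES = {
    "Subjective": ("reports", "states", "complains of", "denies"),
    "Objective": ("exam", "bp", "ecg", "lab", "vital signs"),
    "Assessment": ("likely", "diagnosis", "impression", "differential"),
    "Plan": ("schedule", "start", "follow-up", "prescribe"),
}
SECTION_PATTERNS = {
    name: re.compile(r"\b(?:" + "|".join(map(re.escape, cues)) + r")\b", re.IGNORECASE)
    for name, cues in SECTION_CUES.items()
}


def classify_sentences(text):
    """Assign each sentence to every section whose cues it contains."""
    sections = {name: [] for name in SECTION_CUES}
    sections["Unclassified"] = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        if not sentence:
            continue
        matched = [name for name, pattern in SECTION_PATTERNS.items()
                   if pattern.search(sentence)]
        for name in matched or ["Unclassified"]:
            sections[name].append(sentence.rstrip("."))
    return sections


note = ("Patient reports chest pain for 2 days. Physical exam shows BP 140/90. "
        "Likely angina. Schedule stress test and start aspirin 81mg daily.")
for section, sentences in classify_sentences(note).items():
    print(section, sentences)
```

Unlike the output structure shown earlier, this sketch does not split a Plan sentence into separate action items; it keeps whole sentences.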
Best Practices:
Parse and normalize timeline information:
# Extract temporal relationships
timeline = generator.extract_temporal_info(
    "Patient had chest pain starting 3 days ago, worsening since yesterday. "
    "Had similar episode 2 months ago that resolved with rest."
)
# Returns:
# {
#   "onset": "3 days ago",
#   "progression": "worsening",
#   "previous_episodes": [
#     {"time": "2 months ago", "resolution": "with rest"}
#   ]
# }
Temporal Elements Extracted:
Normalization: Converts relative dates to standardized format:
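One plausible way to resolve relative expressions is to anchor them to the encounter date. The sketch below is an assumption about how that could work, not the behaviour of `temporal_parser.py`; `normalize_relative_date` is a hypothetical helper that handles only "today", "yesterday", and "N days/weeks/months ago".

```python
import calendar
import re
from datetime import date, timedelta

UNIT_DAYS = {"day": 1, "week": 7}


def normalize_relative_date(expression, encounter_date):
    """Resolve a relative expression against the encounter date, or return None."""
    text = expression.strip().lower()
    if text == "today":
        return encounter_date
    if text == "yesterday":
        return encounter_date - timedelta(days=1)
    match = re.fullmatch(r"(\d+)\s+(day|week|month)s?\s+ago", text)
    if match is None:
        return None  # unresolved expressions are left for manual review
    count, unit = int(match.group(1)), match.group(2)
    if unit in UNIT_DAYS:
        return encounter_date - timedelta(days=count * UNIT_DAYS[unit])
    # Months: step back whole calendar months, clamping the day to the month length.
    total_months = encounter_date.year * 12 + (encounter_date.month - 1) - count
    year, month_index = divmod(total_months, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(encounter_date.day, last_day))


encounter = date(2026, 1, 15)
for phrase in ("3 days ago", "yesterday", "2 months ago", "chronic"):
    print(phrase, "->", normalize_relative_date(phrase, encounter))
```

Returning `None` for phrases such as "chronic" keeps vague durations visible to the reviewer instead of guessing a date.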
Critical for accurate medical documentation:
# Detect negations and uncertainties
analysis = generator.analyze_certainty(
    "Patient denies chest pain. No shortness of breath. "
    "Possibly had fever yesterday but not sure."
)
# Identifies:
# - "denies chest pain" → Negative finding (important!)
# - "No shortness of breath" → Negative finding
# - "Possibly had fever" → Uncertain finding (flag for verification)
Detection Categories:
| Type | Cues | Action |
|---|---|---|
| Negation | denies, no, without, absent | Mark as negative finding |
| Uncertainty | possibly, maybe, uncertain, ? | Flag for physician review |
| Hypothetical | if, would, could | Note as conditional |
| Family History | family history of, mother had | Separate from patient findings |
⚠️ Critical: Negation errors are high-risk (e.g., missing "denies" → documenting a symptom the patient does not have)
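The cue table can be turned into a rough sentence-level tagger. This is a deliberately simple sketch, not the implementation in `negation_detector.py`: the cue words come from the table, while `tag_certainty`, the category order, and tagging whole sentences (rather than the scope of each cue) are assumptions.

```python
import re

# Cue words are taken from the table above; checking order is an assumption.
CUES = [
    ("family_history", r"\bfamily history of\b|\bmother had\b"),
    ("negation", r"\b(?:denies|no|without|absent)\b"),
    ("uncertainty", r"\b(?:possibly|maybe|uncertain)\b|\?"),
    ("hypothetical", r"\b(?:if|would|could)\b"),
]


def tag_certainty(text):
    """Label each sentence with the first cue category it matches, else 'affirmed'."""
    results = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        if not sentence:
            continue
        label = "affirmed"
        for name, pattern in CUES:
            if re.search(pattern, sentence, re.IGNORECASE):
                label = name
                break
        results.append((sentence, label))
    return results


sample = ("Patient denies chest pain. No shortness of breath. "
          "Possibly had fever yesterday but not sure.")
for sentence, label in tag_certainty(sample):
    print(f"{label:>12}: {sentence}")
```

Sentence-level tagging cannot tell which finding a cue applies to when a sentence mixes positive and negative statements, which is one reason every flagged line still needs physician review.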
Produce final formatted output:
# Generate complete SOAP note
soap_output = generator.generate_soap_document(
    structured_data=classified,
    format="markdown",  # Options: markdown, json, hl7, text
    include_metadata=True,
)
Output Format:
# SOAP Note
**Patient ID:** P12345
**Date:** 2026-01-15
**Provider:** Dr. Smith
## Subjective
Patient reports [extracted symptoms with duration]. History of [chronic conditions].
Currently taking [medications]. Patient denies [negative findings].
## Objective
**Vital Signs:** [BP, HR, RR, Temp, O2Sat]
**Physical Examination:** [Exam findings by system]
**Laboratory/Data:** [Relevant results]
## Assessment
[Primary diagnosis/differential]
[Clinical reasoning summary]
## Plan
1. [Action item 1]
2. [Action item 2]
3. [Follow-up instructions]
---
*Generated by AI. REQUIRES PHYSICIAN REVIEW before entry into patient record.*
Export Formats:
| Format | Use Case | Notes |
|---|---|---|
| Markdown | Human review, documentation | Default, readable |
| JSON | System integration, research | Structured data |
| HL7 FHIR | EHR integration | Healthcare standard |
| Plain Text | Simple documentation | Minimal formatting |
| CSV | Data analysis, research | Tabular data export |
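As an illustration of the Markdown and JSON rows, the sketch below renders a classified-sections dictionary into the layout shown under Output Format. `render_markdown`, `render_json`, and the `[none documented]` placeholder are hypothetical names for this example and do not describe `generate_soap_document` itself.

```python
import json

DISCLAIMER = "*Generated by AI. REQUIRES PHYSICIAN REVIEW before entry into patient record.*"
EMPTY = "[none documented]"  # placeholder chosen for this sketch


def render_markdown(sections, patient_id, encounter_date, provider):
    """Lay out classified sentences following the Output Format template."""
    lines = [
        "# SOAP Note",
        f"**Patient ID:** {patient_id}",
        f"**Date:** {encounter_date}",
        f"**Provider:** {provider}",
    ]
    for name in ("Subjective", "Objective", "Assessment"):
        sentences = [s.rstrip(".") + "." for s in sections.get(name, [])]
        lines += [f"## {name}", " ".join(sentences) or EMPTY]
    lines.append("## Plan")
    plan = sections.get("Plan", [])
    lines += [f"{i}. {item}" for i, item in enumerate(plan, start=1)] or [EMPTY]
    lines += ["---", DISCLAIMER]
    return "\n".join(lines)


def render_json(sections):
    """Serialise the same data for system integration."""
    return json.dumps(sections, indent=2, ensure_ascii=False)


sections = {
    "Subjective": ["Patient reports chest pain for 2 days"],
    "Objective": ["Physical exam shows BP 140/90"],
    "Assessment": ["Likely angina"],
    "Plan": ["Schedule stress test", "start aspirin 81mg daily"],
}
print(render_markdown(sections, "P12345", "2026-01-15", "Dr. Smith"))
```

Keeping the disclaimer as the last line of every rendered draft makes it harder for unreviewed text to be pasted into a record without the warning.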
From audio dictation to reviewed SOAP note:
# Step 1: Process audio to text (using medical-scribe-dictation or external)
# Assuming you have transcript: consultation.txt
# Step 2: Generate SOAP note
python scripts/main.py \
  --input-file consultation.txt \
  --patient-id P12345 \
  --provider "Dr. Smith" \
  --specialty "cardiology" \
  --output soap_draft.md \
  --format markdown
# Step 3: Review output
# - Open soap_draft.md
# - Verify medical accuracy
# - Correct any errors
# - Add missing clinical reasoning
# Step 4: Finalize (after physician approval)
# - Copy approved content to EHR
# - Or use for patient communication
Python API Usage:
from scripts.soap_generator import SOAPNoteGenerator
from scripts.post_processor import ReviewFormatter

# Initialize
generator = SOAPNoteGenerator()
reviewer = ReviewFormatter()

# Generate draft
with open("dictation.txt", "r", encoding="utf-8") as f:
    raw_text = f.read()

draft = generator.generate(
    input_text=raw_text,
    patient_id="P12345",
    encounter_date="2026-01-15",
    provider="Dr. Smith",
    specialty="internal_medicine",
)

# Add physician review markers
marked_draft = reviewer.add_review_markers(draft)

# Save with warning header
reviewer.save_with_disclaimer(
    marked_draft,
    output_path="soap_draft_review.md",
    disclaimer="REQUIRES PHYSICIAN REVIEW - NOT FOR DIRECT ENTRY",
)
Expected Output Files:
output/
├── soap_draft.md # Generated SOAP note
├── entities_extracted.json # Structured medical entities
├── classification_report.txt # Confidence scores for each section
└── review_checklist.md # Items requiring manual verification
Pre-generation Checks:
During Generation:
Post-generation Review (PHYSICIAN MUST CHECK):
Before EHR Entry:
Input Quality Issues:
❌ Poor audio quality (background noise, mumbling) → Garbled transcription → Inaccurate SOAP
❌ Incomplete dictation (provider trails off, changes subject) → Missing information
❌ Heavy accents or fast speech → Transcription errors
Medical Accuracy Issues:
❌ Medication name confusion ("Lipitor" vs "lipid lowerer") → Wrong drug documented
❌ Missed negations ("denies chest pain" → "has chest pain") → Critical error
❌ Temporal confusion ("pain since yesterday" vs "pain until yesterday") → Wrong timeline
❌ Uncertain findings documented as certain ("possibly pneumonia" → "pneumonia")
Documentation Issues:
❌ Hallucinated information (AI adds details not in input) → False documentation
❌ Missing context ("continue meds" without specifying which ones)
❌ Generic assessments ("patient is stable" without specifics)
Compliance Issues:
❌ Entering AI-generated text without review → Legal/medical liability
❌ Including PHI in unsecured processing → HIPAA violation
Process Issues:
❌ Not saving original input → Cannot verify if questions arise
❌ No audit trail → Cannot track AI involvement
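The two process issues above could be addressed with a small audit record written alongside each draft. The sketch below is one possible shape, assuming SHA-256 fingerprints are acceptable in your environment; `build_audit_record` and its field names are hypothetical and not part of this skill. The hashes let a securely stored original be matched to a draft later; they do not replace keeping the original input.

```python
import hashlib
import json
from datetime import datetime, timezone


def build_audit_record(raw_input, draft, reviewed_by=None):
    """Summarise one generation run without copying the clinical text itself."""
    return {
        "created_utc": datetime.now(timezone.utc).isoformat(),
        "input_sha256": hashlib.sha256(raw_input.encode("utf-8")).hexdigest(),
        "draft_sha256": hashlib.sha256(draft.encode("utf-8")).hexdigest(),
        "ai_generated": True,
        "reviewed_by": reviewed_by,  # stays None until a physician signs off
    }


record = build_audit_record("Patient denies chest pain.", "# SOAP Note")
print(json.dumps(record, indent=2))
```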
Problem: Poor entity recognition
See references/medical_terminology.md for supported terms
Problem: Wrong SOAP classification
Problem: Missing temporal information
Problem: Inappropriate certainty level
Problem: Formatting errors in output
Problem: Processing fails or hangs
Available in references/ directory:
- clinical_guidelines.md - Standards for medical documentation
- sample_soap_notes.md - Example SOAP notes by specialty
- medical_terminology.md - Supported medical terms and abbreviations
- nlp_pipeline_documentation.md - Technical details of NLP processing
- hipaa_compliance_guide.md - Guidelines for safe handling of PHI
- specialty_specific_templates.md - Templates for cardiology, orthopedics, etc.
Located in scripts/ directory:
- main.py - CLI interface for SOAP generation
- soap_generator.py - Core SOAP generation logic
- entity_extractor.py - Medical NER module
- soap_classifier.py - Section classification engine
- temporal_parser.py - Timeline extraction
- negation_detector.py - Negation and uncertainty detection
- post_processor.py - Output formatting and review markers
- batch_processor.py - Process multiple encounters
- validator.py - Quality checks and compliance validation
Typical Processing Time:
System Requirements:
Supported Input Sizes:
| Parameter | Type | Default | Required | Description |
|---|---|---|---|---|
| --input, -i | string | - | No | Input clinical text directly |
| --input-file, -f | string | - | No | Path to input text file |
| --output, -o | string | - | No | Output file path |
| --patient-id, -p | string | - | No | Patient identifier |
| --provider | string | - | No | Healthcare provider name |
| --format | string | markdown | No | Output format (markdown, json) |
# Generate SOAP from text
python scripts/main.py --input "Patient reports chest pain..." --output note.md
# From file
python scripts/main.py --input-file consultation.txt --patient-id P12345 --provider "Dr. Smith"
# JSON output
python scripts/main.py --input-file notes.txt --format json --output note.json
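For readers wiring a similar interface, the parameter table maps naturally onto `argparse`. The sketch below mirrors only the flags listed in the table and is not the actual `scripts/main.py`; making `--input` and `--input-file` a required, mutually exclusive pair is an assumption, since the table marks both as optional.

```python
import argparse


def build_parser():
    """Build a parser with the flags from the parameter table."""
    parser = argparse.ArgumentParser(description="Generate a SOAP note draft")
    # Assumption: exactly one input source must be given.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--input", "-i", help="Input clinical text directly")
    source.add_argument("--input-file", "-f", help="Path to input text file")
    parser.add_argument("--output", "-o", help="Output file path")
    parser.add_argument("--patient-id", "-p", help="Patient identifier")
    parser.add_argument("--provider", help="Healthcare provider name")
    parser.add_argument("--format", choices=["markdown", "json"],
                        default="markdown", help="Output format")
    return parser


args = build_parser().parse_args(
    ["--input-file", "notes.txt", "--format", "json", "--output", "note.json"]
)
print(args.input_file, args.format, args.output)
```

Restricting `--format` with `choices` makes an unsupported format fail at the command line instead of midway through generation.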
| Risk Indicator | Assessment | Level |
|---|---|---|
| Code Execution | Python script executed locally | Medium |
| Network Access | No external API calls | Low |
| File System Access | Read input files, write output files | Low |
| Data Exposure | May process PHI (Protected Health Information) | High |
| HIPAA Compliance | Must be used in compliant environment | High |
# Python 3.7+
# No external packages required (uses standard library)
⚠️ CRITICAL REMINDER: All AI-generated SOAP notes REQUIRE physician review and approval before entry into patient records. This tool assists documentation but does not replace clinical judgment or medical decision-making.