{"skill":{"slug":"image-to-data","displayName":"Image To Data","summary":"Extract data from construction images using AI Vision. Analyze site photos, scanned documents, drawings.","description":"---\r\nname: \"image-to-data\"\r\ndescription: \"Extract data from construction images using AI Vision. Analyze site photos, scanned documents, drawings.\"\r\n---\r\n\r\n# Image To Data\r\n\r\n## Overview\r\n\r\nBased on DDC methodology (Chapter 2.4), this skill extracts structured data from construction images using computer vision, OCR, and AI models to analyze site photos, scanned documents, and drawings.\r\n\r\n**Book Reference:** \"Преобразование данных в структурированную форму\" / \"Data Transformation to Structured Form\"\r\n\r\n## Quick Start\r\n\r\n```python\r\nfrom dataclasses import dataclass, field\r\nfrom enum import Enum\r\nfrom typing import List, Dict, Optional, Any, Tuple\r\nfrom datetime import datetime\r\nimport json\r\nimport base64\r\n\r\nclass ImageType(Enum):\r\n    \"\"\"Types of construction images\"\"\"\r\n    SITE_PHOTO = \"site_photo\"\r\n    SCANNED_DOCUMENT = \"scanned_document\"\r\n    FLOOR_PLAN = \"floor_plan\"\r\n    ELEVATION = \"elevation\"\r\n    DETAIL_DRAWING = \"detail_drawing\"\r\n    PROGRESS_PHOTO = \"progress_photo\"\r\n    SAFETY_PHOTO = \"safety_photo\"\r\n    DEFECT_PHOTO = \"defect_photo\"\r\n    MATERIAL_PHOTO = \"material_photo\"\r\n    EQUIPMENT_PHOTO = \"equipment_photo\"\r\n\r\nclass ExtractionType(Enum):\r\n    \"\"\"Types of data extraction\"\"\"\r\n    OCR_TEXT = \"ocr_text\"\r\n    TABLE = \"table\"\r\n    OBJECT_DETECTION = \"object_detection\"\r\n    MEASUREMENT = \"measurement\"\r\n    CLASSIFICATION = \"classification\"\r\n    PROGRESS = \"progress\"\r\n\r\n@dataclass\r\nclass BoundingBox:\r\n    \"\"\"Bounding box for detected region\"\"\"\r\n    x: int\r\n    y: int\r\n    width: int\r\n    height: int\r\n    confidence: float = 1.0\r\n\r\n@dataclass\r\nclass TextRegion:\r\n    \"\"\"Extracted text region from 
image\"\"\"\r\n    text: str\r\n    bbox: BoundingBox\r\n    confidence: float\r\n    language: str = \"en\"\r\n\r\n@dataclass\r\nclass DetectedObject:\r\n    \"\"\"Detected object in image\"\"\"\r\n    label: str\r\n    bbox: BoundingBox\r\n    confidence: float\r\n    attributes: Dict[str, Any] = field(default_factory=dict)\r\n\r\n@dataclass\r\nclass ExtractedTable:\r\n    \"\"\"Extracted table from image\"\"\"\r\n    headers: List[str]\r\n    rows: List[List[str]]\r\n    bbox: BoundingBox\r\n    confidence: float\r\n\r\n@dataclass\r\nclass ProgressMeasurement:\r\n    \"\"\"Progress measurement from image\"\"\"\r\n    element_type: str\r\n    total_count: int\r\n    completed_count: int\r\n    percent_complete: float\r\n    area_sqft: Optional[float] = None\r\n    volume_cuft: Optional[float] = None\r\n\r\n@dataclass\r\nclass ImageAnalysisResult:\r\n    \"\"\"Complete image analysis result\"\"\"\r\n    image_id: str\r\n    image_type: ImageType\r\n    text_regions: List[TextRegion]\r\n    detected_objects: List[DetectedObject]\r\n    tables: List[ExtractedTable]\r\n    progress: Optional[ProgressMeasurement] = None\r\n    metadata: Dict[str, Any] = field(default_factory=dict)\r\n    processing_time: float = 0.0\r\n\r\n\r\nclass OCREngine:\r\n    \"\"\"OCR engine for text extraction\"\"\"\r\n\r\n    def __init__(self, engine: str = \"tesseract\"):\r\n        self.engine = engine\r\n        self.supported_languages = [\"en\", \"ru\", \"de\", \"fr\", \"es\"]\r\n\r\n    def extract_text(\r\n        self,\r\n        image_data: bytes,\r\n        language: str = \"en\"\r\n    ) -> List[TextRegion]:\r\n        \"\"\"Extract text from image\"\"\"\r\n        # Simulated OCR extraction (use actual OCR library in production)\r\n        # In production: pytesseract, EasyOCR, or cloud OCR services\r\n\r\n        regions = []\r\n\r\n        # Simulate detecting title block in drawing\r\n        regions.append(TextRegion(\r\n            text=\"PROJECT: OFFICE BUILDING\",\r\n    
        bbox=BoundingBox(x=100, y=50, width=300, height=30, confidence=0.95),\r\n            confidence=0.95,\r\n            language=language\r\n        ))\r\n\r\n        regions.append(TextRegion(\r\n            text=\"DRAWING: A-101\",\r\n            bbox=BoundingBox(x=100, y=90, width=200, height=25, confidence=0.92),\r\n            confidence=0.92,\r\n            language=language\r\n        ))\r\n\r\n        regions.append(TextRegion(\r\n            text=\"SCALE: 1:100\",\r\n            bbox=BoundingBox(x=100, y=120, width=150, height=20, confidence=0.88),\r\n            confidence=0.88,\r\n            language=language\r\n        ))\r\n\r\n        return regions\r\n\r\n    def extract_structured_text(\r\n        self,\r\n        image_data: bytes,\r\n        template: Optional[Dict] = None\r\n    ) -> Dict[str, str]:\r\n        \"\"\"Extract structured text using template matching\"\"\"\r\n        # Extract text regions\r\n        regions = self.extract_text(image_data)\r\n\r\n        # Match to template fields\r\n        structured = {}\r\n\r\n        if template:\r\n            for field_name, field_config in template.items():\r\n                # Skip fields without a keyword; match case-insensitively\r\n                keyword = (field_config.get(\"keyword\") or \"\").lower()\r\n                if not keyword:\r\n                    continue\r\n                # Find the first matching region\r\n                for region in regions:\r\n                    if keyword in region.text.lower():\r\n                        structured[field_name] = region.text\r\n                        break\r\n        else:\r\n            # Default extraction: split on the first colon only,\r\n            # so values such as \"1:100\" stay intact\r\n            for region in regions:\r\n                if \"PROJECT:\" in region.text:\r\n                    structured[\"project_name\"] = region.text.split(\":\", 1)[1].strip()\r\n                elif \"DRAWING:\" in region.text:\r\n                    structured[\"drawing_number\"] = region.text.split(\":\", 1)[1].strip()\r\n                elif \"SCALE:\" in region.text:\r\n                    structured[\"scale\"] = region.text.split(\":\", 1)[1].strip()\r\n\r\n        return structured\r\n\r\n\r\nclass ObjectDetector:\r\n  
  \"\"\"Object detection for construction images\"\"\"\r\n\r\n    def __init__(self, model: str = \"yolov8\"):\r\n        self.model = model\r\n        self.construction_classes = self._load_construction_classes()\r\n\r\n    def _load_construction_classes(self) -> Dict[str, Dict]:\r\n        \"\"\"Load construction-specific object classes\"\"\"\r\n        return {\r\n            # Equipment\r\n            \"excavator\": {\"category\": \"equipment\", \"safety_zone\": 20},\r\n            \"crane\": {\"category\": \"equipment\", \"safety_zone\": 30},\r\n            \"forklift\": {\"category\": \"equipment\", \"safety_zone\": 10},\r\n            \"concrete_mixer\": {\"category\": \"equipment\", \"safety_zone\": 5},\r\n            \"scaffolding\": {\"category\": \"equipment\", \"safety_zone\": 5},\r\n\r\n            # Safety\r\n            \"hard_hat\": {\"category\": \"ppe\", \"required\": True},\r\n            \"safety_vest\": {\"category\": \"ppe\", \"required\": True},\r\n            \"safety_glasses\": {\"category\": \"ppe\", \"required\": False},\r\n            \"harness\": {\"category\": \"ppe\", \"required\": False},\r\n\r\n            # Materials\r\n            \"rebar_bundle\": {\"category\": \"material\", \"unit\": \"bundle\"},\r\n            \"concrete_block\": {\"category\": \"material\", \"unit\": \"pallet\"},\r\n            \"lumber_stack\": {\"category\": \"material\", \"unit\": \"bundle\"},\r\n            \"pipe_stack\": {\"category\": \"material\", \"unit\": \"bundle\"},\r\n\r\n            # Workers\r\n            \"worker\": {\"category\": \"person\", \"track\": True},\r\n\r\n            # Building elements\r\n            \"column\": {\"category\": \"structure\"},\r\n            \"beam\": {\"category\": \"structure\"},\r\n            \"slab\": {\"category\": \"structure\"},\r\n            \"wall\": {\"category\": \"structure\"},\r\n        }\r\n\r\n    def detect(\r\n        self,\r\n        image_data: bytes,\r\n        confidence_threshold: float = 
0.5\r\n    ) -> List[DetectedObject]:\r\n        \"\"\"Detect objects in image\"\"\"\r\n        # Simulated detection (use actual model in production)\r\n        # In production: YOLO, Faster R-CNN, etc.\r\n\r\n        detected = []\r\n\r\n        # Simulate detected objects\r\n        sample_detections = [\r\n            (\"worker\", 0.92, BoundingBox(200, 300, 80, 180, 0.92)),\r\n            (\"hard_hat\", 0.88, BoundingBox(210, 300, 30, 25, 0.88)),\r\n            (\"safety_vest\", 0.85, BoundingBox(210, 340, 60, 80, 0.85)),\r\n            (\"scaffolding\", 0.78, BoundingBox(400, 100, 200, 400, 0.78)),\r\n            (\"concrete_block\", 0.72, BoundingBox(50, 450, 100, 50, 0.72)),\r\n        ]\r\n\r\n        for label, conf, bbox in sample_detections:\r\n            if conf >= confidence_threshold:\r\n                class_info = self.construction_classes.get(label, {})\r\n                detected.append(DetectedObject(\r\n                    label=label,\r\n                    bbox=bbox,\r\n                    confidence=conf,\r\n                    attributes=class_info\r\n                ))\r\n\r\n        return detected\r\n\r\n    def detect_safety_compliance(\r\n        self,\r\n        image_data: bytes\r\n    ) -> Dict:\r\n        \"\"\"Detect safety compliance in image\"\"\"\r\n        objects = self.detect(image_data)\r\n\r\n        workers = [o for o in objects if o.label == \"worker\"]\r\n        hard_hats = [o for o in objects if o.label == \"hard_hat\"]\r\n        vests = [o for o in objects if o.label == \"safety_vest\"]\r\n\r\n        compliance = {\r\n            \"workers_detected\": len(workers),\r\n            \"hard_hats_detected\": len(hard_hats),\r\n            \"vests_detected\": len(vests),\r\n            \"hard_hat_compliance\": len(hard_hats) / len(workers) if workers else 1.0,\r\n            \"vest_compliance\": len(vests) / len(workers) if workers else 1.0,\r\n            \"overall_compliance\": \"compliant\" if len(hard_hats) >= 
len(workers) and len(vests) >= len(workers) else \"non-compliant\",\r\n            \"violations\": []\r\n        }\r\n\r\n        if len(hard_hats) < len(workers):\r\n            compliance[\"violations\"].append({\r\n                \"type\": \"missing_hard_hat\",\r\n                \"count\": len(workers) - len(hard_hats)\r\n            })\r\n\r\n        # Safety vests are also marked as required PPE\r\n        if len(vests) < len(workers):\r\n            compliance[\"violations\"].append({\r\n                \"type\": \"missing_safety_vest\",\r\n                \"count\": len(workers) - len(vests)\r\n            })\r\n\r\n        return compliance\r\n\r\n\r\nclass TableExtractor:\r\n    \"\"\"Extract tables from images\"\"\"\r\n\r\n    def extract_tables(\r\n        self,\r\n        image_data: bytes,\r\n        detect_headers: bool = True\r\n    ) -> List[ExtractedTable]:\r\n        \"\"\"Extract tables from image\"\"\"\r\n        # Simulated table extraction\r\n        # In production: Camelot or Tabula (text-based PDFs), or a custom CNN for images\r\n\r\n        tables = []\r\n\r\n        # Simulate a schedule table\r\n        tables.append(ExtractedTable(\r\n            headers=[\"Activity\", \"Start\", \"End\", \"Duration\"],\r\n            rows=[\r\n                [\"Foundation\", \"2024-01-01\", \"2024-01-15\", \"15 days\"],\r\n                [\"Framing\", \"2024-01-16\", \"2024-02-28\", \"44 days\"],\r\n                [\"MEP Rough-in\", \"2024-03-01\", \"2024-03-31\", \"31 days\"]\r\n            ],\r\n            bbox=BoundingBox(50, 200, 500, 200, 0.85),\r\n            confidence=0.85\r\n        ))\r\n\r\n        return tables\r\n\r\n    def table_to_dataframe(self, table: ExtractedTable) -> Dict:\r\n        \"\"\"Convert table to dictionary (DataFrame-like)\"\"\"\r\n        return {\r\n            \"columns\": table.headers,\r\n            \"data\": table.rows,\r\n            \"records\": [\r\n                dict(zip(table.headers, row))\r\n                for row in table.rows\r\n            ]\r\n        }\r\n\r\n\r\nclass ProgressAnalyzer:\r\n    \"\"\"Analyze construction progress from images\"\"\"\r\n\r\n    def __init__(self):\r\n        self.reference_models = {}\r\n\r\n    def analyze_progress(\r\n        self,\r\n        current_image: bytes,\r\n        reference_image: 
Optional[bytes] = None,\r\n        element_type: str = \"general\"\r\n    ) -> ProgressMeasurement:\r\n        \"\"\"Analyze progress by comparing images\"\"\"\r\n        # Simulated progress analysis\r\n        # In production: Use semantic segmentation + comparison\r\n\r\n        # Simulate progress detection\r\n        return ProgressMeasurement(\r\n            element_type=element_type,\r\n            total_count=100,\r\n            completed_count=65,\r\n            percent_complete=65.0,\r\n            area_sqft=15000.0,\r\n            volume_cuft=None\r\n        )\r\n\r\n    def compare_with_plan(\r\n        self,\r\n        site_photo: bytes,\r\n        plan_image: bytes\r\n    ) -> Dict:\r\n        \"\"\"Compare site photo with plan\"\"\"\r\n        return {\r\n            \"match_score\": 0.78,\r\n            \"deviations\": [],\r\n            \"completion_estimate\": 65.0,\r\n            \"areas_of_concern\": []\r\n        }\r\n\r\n\r\nclass ConstructionImageAnalyzer:\r\n    \"\"\"\r\n    Main class for construction image analysis.\r\n    Based on DDC methodology Chapter 2.4.\r\n    \"\"\"\r\n\r\n    def __init__(self):\r\n        self.ocr = OCREngine()\r\n        self.detector = ObjectDetector()\r\n        self.table_extractor = TableExtractor()\r\n        self.progress_analyzer = ProgressAnalyzer()\r\n\r\n    def analyze_image(\r\n        self,\r\n        image_data: bytes,\r\n        image_type: ImageType,\r\n        image_id: str = \"img_001\",\r\n        extract_types: Optional[List[ExtractionType]] = None\r\n    ) -> ImageAnalysisResult:\r\n        \"\"\"\r\n        Analyze a construction image.\r\n\r\n        Args:\r\n            image_data: Image data as bytes\r\n            image_type: Type of image\r\n            image_id: Unique image identifier\r\n            extract_types: Types of extraction to perform\r\n\r\n        Returns:\r\n            Complete analysis result\r\n        \"\"\"\r\n        start_time = datetime.now()\r\n\r\n        if 
extract_types is None:\r\n            extract_types = [ExtractionType.OCR_TEXT, ExtractionType.OBJECT_DETECTION]\r\n\r\n        text_regions = []\r\n        detected_objects = []\r\n        tables = []\r\n        progress = None\r\n\r\n        # OCR extraction\r\n        if ExtractionType.OCR_TEXT in extract_types:\r\n            text_regions = self.ocr.extract_text(image_data)\r\n\r\n        # Object detection\r\n        if ExtractionType.OBJECT_DETECTION in extract_types:\r\n            detected_objects = self.detector.detect(image_data)\r\n\r\n        # Table extraction\r\n        if ExtractionType.TABLE in extract_types:\r\n            tables = self.table_extractor.extract_tables(image_data)\r\n\r\n        # Progress analysis\r\n        if ExtractionType.PROGRESS in extract_types:\r\n            progress = self.progress_analyzer.analyze_progress(image_data)\r\n\r\n        processing_time = (datetime.now() - start_time).total_seconds()\r\n\r\n        return ImageAnalysisResult(\r\n            image_id=image_id,\r\n            image_type=image_type,\r\n            text_regions=text_regions,\r\n            detected_objects=detected_objects,\r\n            tables=tables,\r\n            progress=progress,\r\n            metadata={\"extraction_types\": [e.value for e in extract_types]},\r\n            processing_time=processing_time\r\n        )\r\n\r\n    def analyze_site_photo(\r\n        self,\r\n        image_data: bytes,\r\n        image_id: str = \"site_001\"\r\n    ) -> Dict:\r\n        \"\"\"Analyze site photo for progress and safety\"\"\"\r\n        result = self.analyze_image(\r\n            image_data,\r\n            ImageType.SITE_PHOTO,\r\n            image_id,\r\n            [ExtractionType.OBJECT_DETECTION, ExtractionType.PROGRESS]\r\n        )\r\n\r\n        safety = self.detector.detect_safety_compliance(image_data)\r\n\r\n        return {\r\n            \"image_id\": result.image_id,\r\n            \"objects_detected\": 
len(result.detected_objects),\r\n            \"progress\": result.progress,\r\n            \"safety_compliance\": safety,\r\n            \"equipment\": [o.label for o in result.detected_objects if o.attributes.get(\"category\") == \"equipment\"],\r\n            \"materials\": [o.label for o in result.detected_objects if o.attributes.get(\"category\") == \"material\"]\r\n        }\r\n\r\n    def extract_drawing_data(\r\n        self,\r\n        image_data: bytes,\r\n        image_id: str = \"dwg_001\"\r\n    ) -> Dict:\r\n        \"\"\"Extract data from scanned drawing\"\"\"\r\n        result = self.analyze_image(\r\n            image_data,\r\n            ImageType.FLOOR_PLAN,\r\n            image_id,\r\n            [ExtractionType.OCR_TEXT, ExtractionType.TABLE]\r\n        )\r\n\r\n        # Extract title block info\r\n        title_block = self.ocr.extract_structured_text(image_data)\r\n\r\n        return {\r\n            \"image_id\": result.image_id,\r\n            \"title_block\": title_block,\r\n            \"text_regions\": len(result.text_regions),\r\n            \"tables\": [\r\n                self.table_extractor.table_to_dataframe(t)\r\n                for t in result.tables\r\n            ],\r\n            \"all_text\": [r.text for r in result.text_regions]\r\n        }\r\n\r\n    def batch_analyze(\r\n        self,\r\n        images: List[Tuple[bytes, ImageType, str]]\r\n    ) -> List[ImageAnalysisResult]:\r\n        \"\"\"Analyze multiple images\"\"\"\r\n        results = []\r\n        for image_data, image_type, image_id in images:\r\n            result = self.analyze_image(image_data, image_type, image_id)\r\n            results.append(result)\r\n        return results\r\n\r\n    def export_results(\r\n        self,\r\n        result: ImageAnalysisResult,\r\n        format: str = \"json\"\r\n    ) -> str:\r\n        \"\"\"Export analysis results\"\"\"\r\n        data = {\r\n            \"image_id\": result.image_id,\r\n            \"image_type\": 
result.image_type.value,\r\n            \"text_count\": len(result.text_regions),\r\n            \"object_count\": len(result.detected_objects),\r\n            \"table_count\": len(result.tables),\r\n            \"texts\": [\r\n                {\"text\": r.text, \"confidence\": r.confidence}\r\n                for r in result.text_regions\r\n            ],\r\n            \"objects\": [\r\n                {\"label\": o.label, \"confidence\": o.confidence}\r\n                for o in result.detected_objects\r\n            ],\r\n            \"processing_time\": result.processing_time\r\n        }\r\n\r\n        if format == \"json\":\r\n            return json.dumps(data, indent=2)\r\n        else:\r\n            raise ValueError(f\"Unsupported format: {format}\")\r\n```\r\n\r\n## Common Use Cases\r\n\r\n### Analyze Site Photo\r\n\r\n```python\r\nanalyzer = ConstructionImageAnalyzer()\r\n\r\n# Load image (in production, read from file)\r\nwith open(\"site_photo.jpg\", \"rb\") as f:\r\n    image_data = f.read()\r\n\r\nresult = analyzer.analyze_site_photo(image_data)\r\n\r\nprint(f\"Objects detected: {result['objects_detected']}\")\r\nprint(f\"Safety compliance: {result['safety_compliance']['overall_compliance']}\")\r\nprint(f\"Progress: {result['progress'].percent_complete}%\")\r\n```\r\n\r\n### Extract Drawing Data\r\n\r\n```python\r\nwith open(\"floor_plan.png\", \"rb\") as f:\r\n    drawing_data = f.read()\r\n\r\ndata = analyzer.extract_drawing_data(drawing_data)\r\n\r\nprint(f\"Drawing: {data['title_block'].get('drawing_number')}\")\r\nprint(f\"Project: {data['title_block'].get('project_name')}\")\r\nfor table in data['tables']:\r\n    print(f\"Table with {len(table['records'])} rows\")\r\n```\r\n\r\n### Detect Safety Violations\r\n\r\n```python\r\ndetector = ObjectDetector()\r\n\r\nwith open(\"site_photo.jpg\", \"rb\") as f:\r\n    image_data = f.read()\r\n\r\nsafety = detector.detect_safety_compliance(image_data)\r\n\r\nif safety['overall_compliance'] == 
'non-compliant':\r\n    for violation in safety['violations']:\r\n        print(f\"Violation: {violation['type']} - Count: {violation['count']}\")\r\n```\r\n\r\n## Quick Reference\r\n\r\n| Component | Purpose |\r\n|-----------|---------|\r\n| `ConstructionImageAnalyzer` | Main analysis engine |\r\n| `OCREngine` | Text extraction |\r\n| `ObjectDetector` | Object detection |\r\n| `TableExtractor` | Table extraction |\r\n| `ProgressAnalyzer` | Progress analysis |\r\n| `ImageAnalysisResult` | Complete analysis result |\r\n\r\n## Resources\r\n\r\n- **Book**: \"Data-Driven Construction\" by Artem Boiko, Chapter 2.4\r\n- **Website**: https://datadrivenconstruction.io\r\n\r\n## Next Steps\r\n\r\n- Use [cad-to-data](../cad-to-data/SKILL.md) for CAD/BIM extraction\r\n- Use [defect-detection-ai](../../../DDC_Innovative/defect-detection-ai/SKILL.md) for defects\r\n- Use [safety-compliance-checker](../../../DDC_Innovative/safety-compliance-checker/SKILL.md) for safety\r\n","tags":{"latest":"2.0.0"},"stats":{"comments":0,"downloads":2269,"installsAllTime":85,"installsCurrent":6,"stars":0,"versions":2},"createdAt":1770475475286,"updatedAt":1778988878354},"latestVersion":{"version":"2.0.0","createdAt":1771002339049,"changelog":"Version 2.0.0\n\n- Major redesign: now extracts structured data from construction images using vision, OCR, and AI models.\n- Supports multiple construction image types (site photos, floor plans, scanned documents, etc.).\n- Provides data extraction for text, tables, detected objects, classification, and progress measurement.\n- Introduces detailed schemas for detected objects, bounding boxes, OCR text regions, and tables.\n- Modular architecture for OCR and object detection tailored to common construction needs.\n- Enables template-based structured text extraction and construction-specific object class 
detection.","license":null},"metadata":null,"owner":{"handle":"datadrivenconstruction","userId":"s1774mv3t1cm8r1kgs9hccdnmn8852nb","displayName":"datadrivenconstruction","image":"https://avatars.githubusercontent.com/u/94158709?v=4"},"moderation":null}