#home#twitter

xclaw

Extract posts from X (Twitter) timeline or profile pages with engagement metrics, media URLs, and download local copies. Use when asked to scrape, list, or analyze X posts from home timeline, user profiles, or search results without clicking into individual posts.

Install

openclaw skills install @zyliu0/x-openclaw

XCLAW

Extract posts from X (Twitter) pages with full metadata including text, author, timestamp, engagement metrics, media URLs, and download local copies.

When to Use

List recent posts from a user's profile
Extract posts from home timeline
Capture post URLs without navigating into individual posts
Gather engagement metrics (likes, reposts, replies, views)
Download images and video thumbnails
Analyze content patterns across multiple posts

File Structure

text

intel/x/
├── {date}-{time}-X.md      # Intel file
└── media/
    └── {YYYYMMDD}-{HHMMSS}/        # Media folder (same timestamp)
        ├── image_{postId}_{index}.jpg
        └── video_{postId}_poster.jpg

File Naming

Item	Format	Example
Intel file	`{date}-{time}-X.md`	`2026-03-16-143052-X.md`
Media folder	`{YYYYMMDD}-{HHMMSS}`	`20260316-143052`
Image	`image_{postId}_{index}.jpg`	`image_2032741269689213289_0.jpg`
Video thumbnail	`video_{postId}_poster.jpg`	`video_2032741269689213289_poster.jpg`

OUTPUT FORMAT — EXACT FORMAT REQUIRED

WRITE THE INTEL FILE IN THIS EXACT FORMAT. FOLLOW EVERY COMPONENT PRECISELY.

text

# X Intelligence Report

**Scraped:** {date} {time} (Asia/Shanghai)
**Source:** X Home Timeline
**Posts Collected:** {count}

---

## Post {n}: {Topic Title}

**Author:** {Name} (@{handle})
**Posted:** {relative time}
**URL:** {post_url}

**Content:**
{post_text_content}

**Metrics:** ❤️ {likes} 🔁 {reposts} 💬 {replies} 👁 {views}

**Media:**
🖼️ media/{folder}/image_{postId}_{index}.jpg

**Signal:** {🔴 HIGH / 🟡 MEDIUM / 🟡 LOW / ⚪ SKIP} — {Brief signal assessment}

---

## Post {n+1}: ...

(MORE POSTS...)

---

## Summary

**High Signal (Tier 1):**
1. {Topic} — {One-line summary}
2. {Topic} — {One-line summary}

**Medium Signal (Tier 2):**
- {Topic} — {One-line summary}
- {Topic} — {One-line summary}

**Skip:**
- {Topic} — {Reason}

Media in Output Format (Conditional)

IF THE POST HAS MEDIA, INCLUDE THE MEDIA LINE AFTER METRICS:

text

**Media:**
🖼️ media/{folder}/image_{postId}_{index}.jpg

OR FOR VIDEO THUMBNAILS:

text

**Media:**
🎬 media/{folder}/video_{postId}_poster.jpg

IF THE POST HAS NO MEDIA, OMIT THE MEDIA SECTION ENTIRELY.

WORKFLOW

Step 1: Create Media Folder FIRST

BEFORE EXTRACTING ANY POSTS, create the media folder:

bash

# Create folder with current timestamp
# Format: YYYYMMDD-HHMMSS
mkdir -p ../../intel/x/media/{YYYYMMDD}-{HHMMSS}

# Example
mkdir -p ../../intel/x/media/20260316-143052

Step 2: Navigate to Target Page

javascript

// For home timeline
browser.navigate({ url: "https://x.com/home" })

// For a specific user profile
browser.navigate({ url: "https://x.com/username" })

Step 3: Scroll to Load More Posts

javascript

browser.act({ 
  fn: "() => { for(let i=0; i<3; i++) { window.scrollBy(0, window.innerHeight); } return 'scrolled'; }",
  kind: "evaluate"
})

Step 4: Extract Posts with Media URLs

javascript

browser.act({
  fn: "() => {
    const articles = document.querySelectorAll('article[data-testid=\"tweet\"]');
    const posts = [];
    articles.forEach(article => {
      const url = article.querySelector('a[href*=\"/status/\"]')?.href;
      if (!url || url.includes('/analytics') || url.includes('/photo')) return;
      
      const text = article.querySelector('[data-testid=\"tweetText\"]')?.innerText || '';
      const timeEl = article.querySelector('time');
      const timestamp = timeEl?.innerText || '';
      const datetime = timeEl?.dateTime || '';
      
      // Get author info
      const nameEl = article.querySelector('[data-testid=\"User-Name\"]');
      const name = nameEl?.querySelector('span')?.innerText || '';
      const handle = nameEl?.querySelector('a[href*=\"/\"]')?.innerText || '';
      
      // Get engagement metrics
      const metricButtons = article.querySelectorAll('button[role=\"button\"]');
      let replies = '0', reposts = '0', likes = '0', views = '0';
      
      metricButtons.forEach(btn => {
        const txt = btn.innerText || '';
        const label = btn.getAttribute('aria-label') || '';
        if (label && label.includes('Reply')) {
          const match = txt.match(/(\d+)/);
          if (match) replies = match[1];
        }
        if (label && label.includes('Repost')) {
          const match = txt.match(/(\d+)/);
          if (match) reposts = match[1];
        }
        if (label && label.includes('Like')) {
          const match = txt.match(/(\d+)/);
          if (match) likes = match[1];
        }
      });
      
      // Views are in an <a> link (not button), e.g. <a href="/.../analytics">71K</a>
      const viewLink = article.querySelector('a[href*=\"/analytics\"]');
      if (viewLink) {
        const v = viewLink.innerText.trim();
        views = v.replace(/[^0-9KMB.]/g, '') || '0';
      }
      
      // Get media URLs - images and video thumbnails
      const mediaUrls = [];
      
      // Image posts: pbs.twimg.com/media/
      const images = article.querySelectorAll('img[src*=\"pbs.twimg.com/media/\"]');
      images.forEach(img => {
        if (img.src && !mediaUrls.includes(img.src)) {
          mediaUrls.push(img.src);
        }
      });
      
      // Also check for image links in <a> tags
      const imageLinks = article.querySelectorAll('a[href*=\"pbs.twimg.com/media/\"]');
      imageLinks.forEach(link => {
        if (link.href && !mediaUrls.includes(link.href)) {
          mediaUrls.push(link.href);
        }
      });
      
      // Video thumbnails: pbs.twimg.com/amplify_video_thumb/
      const videoThumbs = article.querySelectorAll('img[src*=\"pbs.twimg.com/amplify_video_thumb/\"]');
      videoThumbs.forEach(img => {
        if (img.src && !mediaUrls.includes(img.src)) {
          mediaUrls.push(img.src);
        }
      });
      
      const postId = url.match(/\/status\/(\d+)/)?.[1] || '';
      
      posts.push({ 
        name,
        handle,
        text: text.substring(0, 500),
        timestamp, 
        datetime,
        url,
        postId,
        mediaUrls,
        metrics: { replies, reposts, likes, views }
      });
    });
    return posts.slice(0, 10);
  }",
  kind: "evaluate"
})

Step 5: Download Media Files

FOR EACH POST WITH MEDIA, DOWNLOAD THE FILES:

bash

# Download image - replace URL with actual media URL from extraction
curl -L -o "../../intel/x/media/{folder}/image_{postId}_{index}.jpg" "https://pbs.twimg.com/media/xxx?format=jpg&name=large"

# Download video thumbnail
curl -L -o "../../intel/x/media/{folder}/video_{postId}_poster.jpg" "https://pbs.twimg.com/amplify_video_thumb/xxx"

IMPORTANT:

USE -L FLAG TO FOLLOW REDIRECTS
ADD &name=large FOR FULL-RESOLUTION IMAGES
USE THE ACTUAL URLS EXTRACTED FROM STEP 4

Step 6: Write Intel File

WRITE THE MARKDOWN FILE TO ../../intel/x/{date}-{time}-X.md

FOLLOW THE EXACT OUTPUT FORMAT SHOWN ABOVE. INCLUDE:

Header with Scraped, Source, Posts Collected
Each post with Author, Posted, URL, Content, Signal
Media references if present
Summary section at the end

Step 7: Close Chrome Tab

javascript

browser.close()

Key Selectors

Element	Selector
Article container	`article[data-testid="tweet"]`
Post text	`[data-testid=\"tweetText\"]`
Timestamp	`time`
User name	`[data-testid=\"User-Name\"]`
Post URL	`a[href*=\"/status/\"]`
Analytics link (views)	`a[href*=\"/analytics\"]`
Metric buttons	`button[role=\"button\"]`
Images	`img[src*=\"pbs.twimg.com/media/\"]`
Video thumbnails	`img[src*=\"pbs.twimg.com/amplify_video_thumb/\"]`

Tips

Don't click into posts - Extract URLs directly from the timeline
Filter URLs - Exclude /analytics and /photo URLs
Scroll before extracting - X loads posts dynamically
Create folder FIRST - Media folder must exist before downloads
Use -L flag - curl must follow redirects for pbs.twimg.com URLs
Add &name=large - Get full-resolution images, not thumbnails
Always close the tab - Task ends when tab closes

xclaw

Install

XCLAW

When to Use

File Structure

File Naming

OUTPUT FORMAT — EXACT FORMAT REQUIRED

Media in Output Format (Conditional)

WORKFLOW

Step 1: Create Media Folder FIRST

Step 2: Navigate to Target Page

Step 3: Scroll to Load More Posts

Step 4: Extract Posts with Media URLs

Step 5: Download Media Files

Step 6: Write Intel File

Step 7: Close Chrome Tab

Key Selectors

Tips

Related skills