excel application

Compare two columns in an Excel or CSV file to find differences, unique values, and common values. This skill should be used when the user wants to upload an Excel file and compare two specified columns, find data that exists in one column but not the other, identify common or unique entries between columns, or perform any column-to-column comparison in a spreadsheet.

Install

openclaw skills install operation

Excel Column Compare

Overview

Compare two specified columns in an Excel/CSV file to identify differences and commonalities between them. The skill produces a formatted Excel report showing values unique to each column, values shared by both, and a summary of the comparison.

When to Use

  • User asks to compare two columns in an Excel file
  • User wants to find values that exist in one column but not another
  • User needs to identify differences or overlaps between two sets of data in a spreadsheet
  • User says things like "对比这两列的不同" ("show the differences between these two columns"), "比较Excel两列数据" ("compare two columns of Excel data"), "find differences between columns", or "compare column A and B"

Workflow

Step 1: Understand the User's Request

Identify from the user's message:

  • The Excel/CSV file to compare (file path)
  • The two columns to compare (by column name, letter like A/B/C, or index)
  • Optional: specific sheet name, comparison mode, or output path

If the user does not specify column identifiers, read the file first to display available columns and ask the user which two to compare.
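
The three kinds of column identifier (name, letter, index) can be resolved to a position before comparing. The following is a minimal sketch, not the bundled script's code: resolve_column is a hypothetical helper, and the resolution order (exact header name first, then 0-based index, then letter) is an assumption chosen so that a header literally named "A" still wins over the letter A.

```python
import re

def resolve_column(ref, columns):
    """Return the 0-based position of `ref` within the list `columns`.

    Assumed resolution order: exact header name, then 0-based integer
    index, then spreadsheet letter (A, B, ..., Z, AA, ...).
    """
    ref = str(ref).strip()
    if ref in columns:                            # 1. exact header name
        return columns.index(ref)
    if ref.isdecimal():                           # 2. 0-based index
        idx = int(ref)
    elif re.fullmatch(r"[A-Za-z]{1,3}", ref):     # 3. column letter
        idx = 0
        for ch in ref.upper():
            idx = idx * 26 + (ord(ch) - ord("A") + 1)
        idx -= 1                                  # A -> 0, B -> 1, AA -> 26
    else:
        raise KeyError(f"Column {ref!r} not found; available: {columns}")
    if not 0 <= idx < len(columns):
        raise IndexError(f"Column {ref!r} is out of range for {len(columns)} columns")
    return idx

headers = ["姓名", "名字", "Email"]
print(resolve_column("名字", headers))  # by name
print(resolve_column("C", headers))     # by letter
print(resolve_column("0", headers))     # by 0-based index
```

If both references resolve to the same position, ask the user to pick two different columns before running the comparison.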

Step 2: Preview the File (if needed)

If the column identifiers are unclear, read the file to show available columns:

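import pandas as pd

# Read only the first rows; CSV input needs read_csv rather than read_excel
if str(input_file).lower().endswith(".csv"):
    df = pd.read_csv(input_file, nrows=5)
else:
    df = pd.read_excel(input_file, nrows=5)
print(df.columns.tolist())  # available column names
print(df.head())            # sample rows for the user to check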

Step 3: Run the Comparison Script

Execute the bundled script scripts/compare_columns.py:

python <skill_path>/scripts/compare_columns.py <input_file> <column_a> <column_b> [--sheet <sheet>] [--output <output>] [--mode <mode>]

Parameters:

  • input_file: Path to the Excel (.xlsx, .xls) or CSV file
  • column_a: Column name (e.g., "姓名"), letter (e.g., A, B), or 0-based index
  • column_b: Column name, letter, or 0-based index for the second column
  • --sheet: Sheet name or index (default: first sheet)
  • --output: Output file path (default: <input>_comparison_result.xlsx)
  • --mode: Comparison mode:
    • full (default): Show all categories — only in A, only in B, common
    • diff: Show only differences (values found in one column but not the other)
    • unique_a: Show only items unique to column A
    • unique_b: Show only items unique to column B
    • common: Show only items common to both columns
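
Each mode corresponds to a selection of three set-based categories. The sketch below illustrates that mapping under stated assumptions: compare_values and the category keys (only_in_a, only_in_b, common) are hypothetical names, and skipping None as a blank cell is an assumption about how empty cells are treated. It is not taken from compare_columns.py.

```python
def compare_values(values_a, values_b, mode="full"):
    """Return the categories selected by `mode` as sorted lists."""
    # Assumption: None stands for a blank cell and is ignored.
    set_a = {v for v in values_a if v is not None}
    set_b = {v for v in values_b if v is not None}

    categories = {
        "only_in_a": set_a - set_b,   # in A but not in B
        "only_in_b": set_b - set_a,   # in B but not in A
        "common": set_a & set_b,      # in both columns
    }
    wanted = {
        "full": ("only_in_a", "only_in_b", "common"),
        "diff": ("only_in_a", "only_in_b"),
        "unique_a": ("only_in_a",),
        "unique_b": ("only_in_b",),
        "common": ("common",),
    }
    if mode not in wanted:
        raise ValueError(f"Unknown mode: {mode!r}")
    # Sort by string form so mixed value types do not break ordering.
    return {name: sorted(categories[name], key=str) for name in wanted[mode]}

result = compare_values(["a", "b", "c"], ["b", "c", "d"], mode="diff")
print(result)  # {'only_in_a': ['a'], 'only_in_b': ['d']}
```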

Examples:

# Compare by column names
python scripts/compare_columns.py data.xlsx "姓名" "名字"

# Compare by column letters
python scripts/compare_columns.py data.xlsx A B

# Compare specific sheet, only show differences
python scripts/compare_columns.py data.xlsx "Email" "邮箱" --sheet "Sheet2" --mode diff

# Specify output path
python scripts/compare_columns.py data.xlsx C D --output result.xlsx
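
For reference, a parser that accepts the command line shown above could look roughly like this. It is a sketch built only from the documented parameters, not the script's actual source: build_parser and default_output_path are hypothetical helpers, and deriving the default output name from the input file's stem is an assumption about what <input> means in the default path.

```python
import argparse
from pathlib import Path

MODES = ["full", "diff", "unique_a", "unique_b", "common"]

def build_parser():
    """Parser mirroring the documented arguments of compare_columns.py."""
    parser = argparse.ArgumentParser(prog="compare_columns.py")
    parser.add_argument("input_file", help="Excel (.xlsx, .xls) or CSV file")
    parser.add_argument("column_a", help="name, letter, or 0-based index")
    parser.add_argument("column_b", help="name, letter, or 0-based index")
    parser.add_argument("--sheet", default=None, help="sheet name or index")
    parser.add_argument("--output", default=None, help="output file path")
    parser.add_argument("--mode", default="full", choices=MODES)
    return parser

def default_output_path(input_file):
    """Assumed default: <input stem>_comparison_result.xlsx next to the input."""
    path = Path(input_file)
    return path.with_name(f"{path.stem}_comparison_result.xlsx")

args = build_parser().parse_args(
    ["data.xlsx", "Email", "邮箱", "--sheet", "Sheet2", "--mode", "diff"]
)
output = args.output or default_output_path(args.input_file)
print(args.mode, output)  # diff data_comparison_result.xlsx
```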

Step 4: Present the Results

After the script runs:

  1. Display the console summary (unique counts for each category)
  2. Open the generated output Excel file for the user to review
  3. Summarize key findings: how many values are unique to each column, how many are shared

Output Format

The output Excel file contains the following sheets:

Sheet | Content
Summary | Overview with column names, counts, and statistics
Only in [Column A] | Values found only in the first column, with row numbers
Only in [Column B] | Values found only in the second column, with row numbers
Common Values | Values present in both columns

Sheets are color-coded: blue for Column A, orange for Column B, green for common values, purple for summary.

Troubleshooting

  • Column not found: The script prints available column names. Suggest the correct column identifier to the user.
  • Same column selected: The script will error if both references resolve to the same column. Ask the user to choose two different columns.
  • Large files: The comparison uses set operations, so it stays efficient on files with many rows. For each unique value, only the first 10 row numbers are listed.
  • Missing dependencies: The script auto-installs pandas and openpyxl if not present.
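
The row-number cap mentioned under "Large files" can be pictured as follows. This is only an illustrative sketch: collect_row_numbers is a hypothetical helper, and numbering data rows from 2 (header in row 1) is an assumption about how the report counts rows.

```python
from collections import defaultdict

MAX_ROWS_SHOWN = 10  # cap taken from the troubleshooting note above

def collect_row_numbers(values, first_row=2, limit=MAX_ROWS_SHOWN):
    """Map each value to its first `limit` row numbers and its total count.

    Assumption: the header occupies row 1, so data starts at row 2.
    """
    rows = defaultdict(list)
    counts = defaultdict(int)
    for offset, value in enumerate(values):
        if value is None:          # treat None as a blank cell and skip it
            continue
        counts[value] += 1
        if len(rows[value]) < limit:
            rows[value].append(first_row + offset)
    return dict(rows), dict(counts)

rows, counts = collect_row_numbers(["x"] * 12 + ["y"])
print(rows["x"])    # [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
print(counts["x"])  # 12
```

Keeping the full count next to the truncated row list lets the summary stay accurate even when the row display is capped.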

Resources

scripts/

  • compare_columns.py — Main comparison script that reads an Excel/CSV file, compares two specified columns, and generates a formatted Excel report with the comparison results.