Install
openclaw skills install customer-segment-eng
Analyze uploaded bank customer data to segment and profile customers by assets, transactions, and behavior, outputting clusters, statistics, and visual charts.
Financial customer segmentation analysis: Stratify customers based on assets, transaction behaviors, activity levels, and other dimensions, outputting actionable segmentation results and visualizations.
Read user-uploaded CSV or Excel files, automatically identifying column names.
Priority fields to retain:
- customer_id / 客户ID: Unique customer identifier
- age / 年龄
- gender / 性别
- balance / 资产余额
- txn_amount / 交易金额
- txn_count / 交易次数
- last_date / 最近交易日期
- product_count / 持有产品数
- branch / 网点

Missing value handling:
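import pandas as pd
# Excel uploads go through read_excel; anything else is read as CSV
reader = pd.read_excel if str(file_path).lower().endswith(('.xlsx', '.xls')) else pd.read_csv
df = reader(file_path)
# Normalize headers: trim surrounding whitespace and lowercase
df.columns = df.columns.str.strip().str.lower()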
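The bilingual field list above suggests mapping Chinese headers onto the English names before any further processing. A minimal sketch of that step follows, assuming a hand-written alias table; `COLUMN_ALIASES` and `normalize_columns` are hypothetical names introduced for illustration and are not part of the skill itself.

```python
import pandas as pd

# Hypothetical alias table: lowercased Chinese header -> canonical English name.
COLUMN_ALIASES = {
    "客户id": "customer_id",
    "年龄": "age",
    "性别": "gender",
    "资产余额": "balance",
    "交易金额": "txn_amount",
    "交易次数": "txn_count",
    "最近交易日期": "last_date",
    "持有产品数": "product_count",
    "网点": "branch",
}


def normalize_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Clean headers, map known aliases, and keep only the priority fields."""
    out = df.copy()
    out.columns = out.columns.str.strip().str.lower()
    out = out.rename(columns=COLUMN_ALIASES)
    # Preserve the priority-field order; silently skip fields the file lacks.
    keep = [name for name in COLUMN_ALIASES.values() if name in out.columns]
    return out[keep]


# Tiny illustrative frame: a padded Chinese header, a mixed-case English one,
# and an unrelated column ("备注", i.e. "notes") that should be dropped.
raw = pd.DataFrame({" 客户ID ": [1, 2], "Balance": [100.0, 250.0], "备注": ["a", "b"]})
clean = normalize_columns(raw)
print(list(clean.columns))  # ['customer_id', 'balance']
```

Dropping unknown columns keeps later steps predictable, but a real pipeline might prefer to log them instead.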
Build RFM + extended features:
| Feature | Description |
|---|---|
| Recency | Days since last transaction (smaller = more active) |
| Frequency | Transaction frequency (number of transactions in specified period) |
| Monetary | Transaction amount (total amount in specified period) |
| Tenure | Customer duration (months) |
| Product_Depth | Number of products held |
| Age | Customer age |
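The features in the table can be derived directly from the priority fields. The sketch below is one possible construction, assuming the column names from the field list; `build_features` and its `as_of` reference date are hypothetical. Tenure is left out because the field list includes no account-opening date to compute it from.

```python
import pandas as pd


def build_features(df: pd.DataFrame, as_of: str) -> pd.DataFrame:
    """Derive RFM-style features, one row per customer."""
    as_of_ts = pd.Timestamp(as_of)
    last_txn = pd.to_datetime(df["last_date"])
    return pd.DataFrame(
        {
            # Recency: whole days between the reference date and the last transaction
            "recency": (as_of_ts - last_txn).dt.days,
            "frequency": df["txn_count"],
            "monetary": df["txn_amount"],
            "product_depth": df["product_count"],
            "age": df["age"],
        },
        index=df.index,
    )


# Illustrative data only
sample = pd.DataFrame(
    {
        "last_date": ["2024-01-01", "2024-01-21"],
        "txn_count": [3, 12],
        "txn_amount": [500.0, 4200.0],
        "product_count": [1, 4],
        "age": [52, 31],
    }
)
features = build_features(sample, as_of="2024-01-31")
print(features["recency"].tolist())  # [30, 10]
```

Passing the reference date explicitly, rather than using today's date, makes the Recency values reproducible across runs.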
Data standardization: Use StandardScaler (Z-score) to normalize all numeric features.
Use the K-Means algorithm and determine K automatically with the Elbow Method (the inflection point of the SSE curve).
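from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Standardize features so each contributes equally to the distance metric
scaler = StandardScaler()
X_scaled = scaler.fit_transform(features)

# Elbow method: fit K-Means for each candidate k and record the SSE (inertia)
sse = {}
for k in range(2, 10):
    km = KMeans(n_clusters=k, random_state=42, n_init=10)
    km.fit(X_scaled)
    sse[k] = km.inertia_

# SSE generally keeps falling as k grows, so taking the minimum SSE would just pick the largest k.
# One simple heuristic: choose the k where the curve bends most sharply (largest second difference).
ks = sorted(sse)
bend = {ks[i]: sse[ks[i - 1]] - 2 * sse[ks[i]] + sse[ks[i + 1]] for i in range(1, len(ks) - 1)}
optimal_k = max(bend, key=bend.get)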
Alternatively, K can be fixed at 5 based on business needs (high / medium-high / medium / medium-low / low value customers).
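Once K is settled, whether by the elbow heuristic or by a fixed business choice, the final model is fitted and each customer receives a cluster label. The following is a minimal sketch on synthetic data; `assign_clusters` is a hypothetical helper, and the two-feature frame is invented purely to make the example runnable.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler


def assign_clusters(features: pd.DataFrame, k: int, seed: int = 42) -> pd.Series:
    """Standardize the features, fit K-Means with k clusters, return one label per row."""
    X = StandardScaler().fit_transform(features)
    labels = KMeans(n_clusters=k, random_state=seed, n_init=10).fit_predict(X)
    return pd.Series(labels, index=features.index, name="cluster")


# Synthetic data: two well-separated groups of 20 customers each
rng = np.random.default_rng(0)
low = rng.normal(loc=[10, 1], scale=0.5, size=(20, 2))
high = rng.normal(loc=[100, 20], scale=0.5, size=(20, 2))
demo = pd.DataFrame(np.vstack([low, high]), columns=["monetary", "frequency"])

demo["cluster"] = assign_clusters(demo[["monetary", "frequency"]], k=2)
print(demo["cluster"].value_counts().to_dict())
```

K-Means label numbers are arbitrary, so a cluster numbered 0 is not necessarily the highest-value group; any value ordering has to be derived afterwards from the cluster statistics.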
Output core statistics for each cluster:
Cluster 0 (High-Value Customers): Avg. assets 850k, Avg. transaction frequency 28/month, Gender distribution 62% male
Cluster 1 (Potential Customers): Avg. assets 320k, noticeably younger age profile
...
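Statistics like those in the sample output can be computed with a grouped aggregation. The sketch below assumes a `cluster` column has already been assigned and that gender is coded as "M"/"F"; both the coding and the `summarize_clusters` helper are assumptions made for illustration, and the numbers are invented.

```python
import pandas as pd


def summarize_clusters(df: pd.DataFrame) -> pd.DataFrame:
    """Per-cluster size, average balance, average transaction count, and share of men."""
    summary = df.groupby("cluster").agg(
        customers=("customer_id", "count"),
        avg_balance=("balance", "mean"),
        avg_txn_count=("txn_count", "mean"),
        # Assumes gender is coded "M"/"F"; adjust to the actual coding in the data
        pct_male=("gender", lambda s: (s == "M").mean() * 100),
    )
    return summary.round(1)


# Illustrative data only
demo = pd.DataFrame(
    {
        "customer_id": [1, 2, 3, 4],
        "cluster": [0, 0, 1, 1],
        "balance": [800.0, 900.0, 300.0, 340.0],
        "txn_count": [30, 26, 5, 7],
        "gender": ["M", "F", "M", "M"],
    }
)
summary = summarize_clusters(demo)
print(summary)
```

A frame of this shape, restricted to numeric columns, could also serve as the per-cluster means used for a feature heatmap.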
Recommended label system (five categories):
Generate the following charts (saved as PNG):
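import matplotlib
matplotlib.use('Agg')  # Non-interactive backend; select it before importing pyplot
import matplotlib.pyplot as plt
import seaborn as sns

# Fonts with CJK glyphs, so Chinese labels render correctly
plt.rcParams['font.sans-serif'] = ['WenQuanYi Micro Hei', 'SimHei']

fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Asset distribution: one histogram series per cluster, labeled by the actual cluster ids
groups = list(df.groupby('cluster'))
axes[0].hist([g['balance'] for _, g in groups], bins=30, label=[f'C{c}' for c, _ in groups])
axes[0].set_title('Customer Balance Distribution by Cluster')
axes[0].legend()

# Heatmap of per-cluster feature means (cluster_means: one row per cluster)
sns.heatmap(cluster_means.T, annot=True, fmt='.1f', ax=axes[1])
axes[1].set_title('Cluster Feature Heatmap')

plt.tight_layout()
plt.savefig(output_path, dpi=150)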
Output content:
- segmentation_results.csv
- cluster_summary.csv
- segmentation_charts.png
- segmentation_report.md

For detailed clustering and parameter documentation:
- references/rfm-guide.md
- references/clustering-guide.md