Install
openclaw skills install alibabacloud-odps-maxframe-codingUse this skill for MaxFrame SDK development and documentation navigation on Alibaba Cloud MaxCompute (ODPS). Helps answer MaxFrame API, concept, official example, and supported pandas API questions; create data processing programs; read/write MaxCompute tables; debug jobs (remote or local); and build custom DPE runtime images. Trigger when users mention MaxFrame, MaxCompute with MaxFrame, ODPS table processing, DPE runtime, MaxFrame docs/examples, DataFrame/Tensor operations, or GPU runtime setup. Works for both English and Chinese queries about Alibaba Cloud data processing with MaxFrame.
openclaw skills install alibabacloud-odps-maxframe-codingIF A SKILL APPLIES TO YOUR TASK, YOU DO NOT HAVE A CHOICE. YOU MUST USE IT. </EXTREMELY-IMPORTANT>
This skill uses Claude Code tool names. Non-CC platforms: substitute equivalent tools.
Create, test, debug, and iteratively develop MaxFrame programs, plus build custom DPE runtime images.
Documentation-only questions may skip the implementation flow and use Scenario 0 below.
digraph maxframe_workflow {
"User Request Arrives" [shape=box];
"Detect Scenario Type" [shape=diamond];
"Scenario 1: Writing Code" [shape=box];
"Scenario 2: Remote Debug" [shape=box];
"Scenario 3: Local Debug" [shape=box];
"Scenario 4: Custom Runtime" [shape=box];
"Understand Requirements" [shape=box];
"Operator Selection Needed?" [shape=diamond];
"Use lookup_operator.py" [shape=box];
"Confirm with User" [shape=box];
"Implement Code/Config" [shape=box];
"Add Error Handling" [shape=box];
"Validate execute() Called" [shape=box];
"Validate Session Cleanup" [shape=box];
"Provide Guidance" [shape=doublecircle];
"User Request Arrives" -> "Detect Scenario Type";
"Detect Scenario Type" -> "Scenario 1: Writing Code" [label="new pipeline"];
"Detect Scenario Type" -> "Scenario 2: Remote Debug" [label="cluster testing"];
"Detect Scenario Type" -> "Scenario 3: Local Debug" [label="IDE breakpoints"];
"Detect Scenario Type" -> "Scenario 4: Custom Runtime" [label="custom image"];
"Scenario 1: Writing Code" -> "Understand Requirements";
"Scenario 2: Remote Debug" -> "Understand Requirements";
"Scenario 3: Local Debug" -> "Understand Requirements";
"Scenario 4: Custom Runtime" -> "Understand Requirements";
"Understand Requirements" -> "Operator Selection Needed?";
"Operator Selection Needed?" -> "Use lookup_operator.py" [label="yes"];
"Operator Selection Needed?" -> "Implement Code/Config" [label="no"];
"Use lookup_operator.py" -> "Confirm with User";
"Confirm with User" -> "Implement Code/Config";
"Implement Code/Config" -> "Add Error Handling";
"Add Error Handling" -> "Validate execute() Called";
"Validate execute() Called" -> "Validate Session Cleanup";
"Validate Session Cleanup" -> "Provide Guidance";
}
Scenario 0: Documentation Navigation
Scenario 1: Writing MaxFrame Code
Scenario 2: Remote Debug Mode
Scenario 3: Local Debug Mode
Scenario 4: Create Custom Runtime Image
Use APIs from: maxframe.dataframe, maxframe.tensor, maxframe.learn, maxframe.session, maxframe.udf, maxframe.config. Use canonical imports: import maxframe.dataframe as md and from maxframe.session import new_session; never use from maxframe import new_session.
Use dotenv.load_dotenv() programmatically. Never read .env files directly with Read tool.
MaxFrame is lazy: operations build a graph and run only when .execute() is called. Execute only the final result/write action; do not call .execute() on intermediate DataFrame or Series variables unless the user explicitly asks for preview/debug output.
Always create session before operations, destroy in finally block for cleanup.
Before implementing processing logic, confirm operator selection with user using scripts/lookup_operator.py. If the user already named the exact operator or asked for code only, write Operator confirmed via user prompt: <operator> before implementation.
For MaxFrame documentation questions, check the official online docs first because APIs may change. Use bundled local docs as an offline fallback, for quick cross-checks, or when the website lacks detail.
Do not finish Scenario 0 unless the last non-empty line starts with Sources: and contains a full https://maxframe.readthedocs.io/ URL; never output .... If official docs cannot be checked, include Official docs unavailable: <reason>; using local fallback. before that line.
| Thought | Reality |
|---|---|
| "This is just a simple MaxFrame question" | Questions are tasks. Invoke the skill. |
| "I already know the MaxFrame API" | Skills have latest patterns. Use them. |
| "Let me just write the code directly" | Operator selection is MANDATORY. |
| "I can skip operator confirmation" | User confirmation is REQUIRED. |
Use for API/concept/example questions that do not require a new MaxFrame program.
python scripts/lookup_operator.py search "<term>"python scripts/lookup_operator.py info "<operator>" --section signature|params|examplesrg -n "<keyword>" references/maxframe-client-docs references/practical-guides references/operators-and-modulesreferences/maxframe-client-docs/reference/references/maxframe-client-docs/getting_started/references/maxframe-client-docs/user_guide/references/maxframe-client-docs/user_guide/dataframe/supported_pd_apis.mdreferences/practical-guides/references/runtime-image-guides/Sources: <full official URL> (Primary); append | <local path> (Fallback/Cross-check) only if local docs were usedpython scripts/lookup_operator.py search "<operation>", present options, get confirmation.execute() only, session.destroy() in finally, no hardcoded credentialsimport maxframe.dataframe as md
from maxframe.session import new_session
import dotenv
dotenv.load_dotenv()
session = new_session()
try:
df = md.read_odps_table("source_table")
result = df.groupby('column').agg({'value': 'sum'})
md.to_odps_table(result, "target_table", overwrite=True).execute()
finally:
session.destroy()
See: references/common-workflow.md for complete patterns.
import maxframe.dataframe as md
from maxframe.session import new_session
session = new_session()
try:
df = md.read_odps_table("table_name")
result = df.groupby('region').agg({'sales': 'sum'})
result.execute()
except Exception as e:
print(f"Error: {e}")
print(f"Logview URL: {session.get_logview_address()}")
finally:
session.destroy()
See: references/remote-debug-guide.md for detailed solutions.
debug=True, sample data with md.DataFrame(pd.DataFrame(...))import maxframe.dataframe as md
from maxframe.session import new_session
import pandas as pd
session = new_session(debug=True)
sample_data = pd.DataFrame({
'user_id': ['u1', 'u2', 'u3'],
'level': ['gold', 'silver', 'bronze'],
'amount': [1000, 500, 100]
})
df = md.DataFrame(sample_data)
def calculate_discount(row):
# Set breakpoint here in IDE
if row['level'] == 'gold':
return row['amount'] * 0.1
return row['amount'] * 0.02
try:
result = df.apply(calculate_discount, axis=1)
result.execute()
finally:
session.destroy()
See: references/local-debug-guide.md for complete guide.
Build custom Docker images through conversational guidance using best practices from reference guides.
Create when: need Python libraries not in standard DPE runtime, GPU-enabled processing, specific Python version, custom system dependencies NOT needed when: standard packages suffice, no GPU requirements
If the user already specifies base image, Python versions, GPU requirement, packages, or output directory, restate those choices and continue. Ask only for missing, ambiguous, or incompatible choices.
references/runtime-image-guides/README.mdStep 1: Base Image Selection (ask if missing)
Present Ubuntu options with trade-offs:
Which Ubuntu version for your custom runtime?
A. Ubuntu 22.04 (Recommended for most cases)
- Stable, production-ready
- Excellent CUDA support (12.4, 12.1, 11.8)
- Widely tested ML libraries (PyTorch, TensorFlow)
- LTS until 2027
B. Ubuntu 24.04 (Modern/latest)
- Newer system packages
- Latest LTS (until 2029)
- Better for non-GPU workloads
- Python 3.12 integration
Recommendation:
- GPU/ML workloads → Ubuntu 22.04
- Modern development → Ubuntu 24.04
Step 2: Python Version Selection (ask if missing)
Which Python versions?
A. Python 3.11 only (Recommended for production)
- Best performance
- Smallest image (~1 GB)
- Excellent package support
B. Python 3.10, 3.11, 3.12 (Development)
- Good compatibility
- Medium size (~2 GB)
- Recent versions
C. All versions 3.7-3.12 (Maximum flexibility)
- Largest image (~3-5 GB)
- Maximum compatibility
- Testing across versions
Recommendation:
- Production → Single version (3.11)
- Development → Recent versions (3.10-3.12)
Step 3: GPU Configuration (ask if missing)
If user mentions GPU or ML packages:
Need GPU support?
A. Yes - GPU-enabled with CUDA 12.4 (Recommended)
- Install PyTorch 2.6.0+cu124
- CUDA toolkit 12.4
- Note: Requires Ubuntu 22.04 for best compatibility
B. No - CPU only
- Standard package installation
- Smaller image size
Recommendation: For ML/AI workloads, GPU support significantly improves performance.
Compatibility Handling: If user selected Ubuntu 24.04 earlier and now requests GPU support:
Step 4: Build Dockerfile Section-by-Section
For each section:
Sections:
Step 5: Provide Build and Test Instructions
# Build
docker build -t <image-tag> <output-dir>
# Test Python
docker run --rm <image-tag> conda run -n py311 python --version
# Test GPU (if applicable)
docker run --rm --gpus all <image-tag> python -c "import torch; print(torch.cuda.is_available())"
# Test packages
docker run --rm <image-tag> conda run -n py311 python -c "import transformers; print(transformers.__version__)"
# Push to registry
docker push <image-tag>
Step 6: MaxFrame Usage Example — must include new_session(odps=odps_connection, image="..."); never show new_session(image=...) alone.
from maxframe.session import new_session
session = new_session(odps=odps_connection, image="your-registry/your-image:v1")
# Your MaxFrame operations here
| Component | Recommendation |
|---|---|
| Base Image | Ubuntu 22.04 (production, GPU, ML) |
| Python | 3.11 (production), 3.10-3.12 (development) |
| GPU | Ubuntu 22.04 + CUDA 12.4 + PyTorch 2.6.0+cu124 |
MaxFrame SDK NOT in Runtime Image: SDK and pyodps are client-side only. Custom runtime needs user-specific packages (transformers, pandas, etc.).
MF_PYTHON_EXECUTABLE (CRITICAL): Always set: ENV MF_PYTHON_EXECUTABLE=/py-runtime/envs/<env_name>/bin/python
See: references/runtime-image-guides/ for detailed guides on base image selection, Python environment strategy, package management, GPU/CUDA configuration, Dockerfile templates, and testing/validation.
MANDATORY before implementing processing logic when user mentions specific operations, asks about efficiency/performance, or you need to find appropriate MaxFrame operator.
For documentation-only answers, user confirmation is not required; still use the lookup script to ground API claims.
If the user explicitly names the operator or asks to skip interaction, output Operator confirmed via user prompt: <operator> and implement directly.
python scripts/lookup_operator.py search "<operation>"See: references/operators-and-modules/operator-selector.md for detailed guidance.
Before finishing, validate:
.execute() called on result DataFramefrom maxframe import new_session; no intermediate .execute()finally blockdebug=True used (local debug)MF_PYTHON_EXECUTABLE set (custom runtime)references/operators-and-modules/operator-selector.mdreferences/local-debug-guide.mdreferences/remote-debug-guide.mdreferences/common-workflow.mdreferences/maxframe-client-docs/references/practical-guides/references/runtime-image-guides/assets/examples/*.pyscripts/lookup_operator.py