{"skill":{"slug":"eks-workload-best-practice-assessment","displayName":"Eks Workload Best Practice Assessment","summary":"Use when assessing or reviewing Kubernetes workloads running on Amazon EKS for best practice compliance, including pod configuration, security posture, obser...","description":"---\nname: eks-workload-best-practice-assessment\ndescription: >\n  Use when assessing or reviewing Kubernetes workloads running on Amazon EKS\n  for best practice compliance, including pod configuration, security posture, observability,\n  networking, storage, image security, and CI/CD practices. Requires kubectl and awscli access\n  to the target cluster. Triggers on \"assess my EKS workloads\", \"check k8s best practices\",\n  \"assess container workloads\", \"evaluate pod security\", \"workload compliance check\",\n  \"EKS workload assessment\", \"检查 K8s 工作负载\", \"评估容器最佳实践\",\n  \"审计 EKS 应用\", \"检查 Pod 配置\", \"容器安全评估\", \"工作负载合规检查\".\n---\n\n# EKS Workload Best Practice Assessment\n\nAssess Kubernetes workloads on Amazon EKS against best practices from K8s official documentation\nand the EKS Best Practices Guide. 
Covers 8 dimensions: workload configuration, security,\nobservability, networking, storage, EKS platform integration, CI/CD, and image security.\n\n## Prerequisites\n\nThis skill requires:\n\n- **[aws knowledge mcp server](https://github.com/awslabs/mcp/tree/main/src/aws-knowledge-mcp-server)** tools:\n  - `aws___search_documentation` — search AWS documentation\n  - `aws___read_documentation` — read full documentation pages\n  - `aws___recommend` — get related documentation\n- **[context7 MCP](https://github.com/upstash/context7)** tools:\n  - `context7_resolve-library-id` — resolve K8s library ID\n  - `context7_query-docs` — query K8s documentation\n- **AWS CLI** (`aws`) — configured with read access to the target EKS cluster and ECR\n- **kubectl** — configured to access the target EKS cluster\n- **jq** — for parsing JSON output from AWS CLI and kubectl commands\n\n## Scope Boundary\n\nThis skill focuses on **workload-level** checks — items that require `kubectl` or in-cluster\ninspection. 
It complements `aws-best-practice-research`, which covers the **infrastructure layer**\n(control plane, node groups, addons, etc.).\n\n| This Skill (Workload Layer) | aws-best-practice-research (Infra Layer) |\n|-----------------------------|------------------------------------------|\n| Pod resource requests/limits | Control plane configuration |\n| Probes (liveness/readiness/startup) | Node group sizing and AZ distribution |\n| PDB, topology constraints | Addon versions |\n| Pod security context, PSA | Secrets envelope encryption |\n| Network Policies | Cluster networking (VPC, subnets) |\n| Service Accounts, RBAC | Authentication mode, Access Entries |\n| Container image scanning | GuardDuty EKS protection |\n| HPA/VPA/Karpenter workload config | Karpenter/CA infrastructure config |\n\n## Workflow\n\n### Step 1: Confirm Assessment Scope\n\nDetermine from user input:\n- **Cluster name** and **AWS Region**\n- **Assessment scope**:\n  - **Full cluster** — assess all namespaces (excluding `kube-system`, `kube-public`, `kube-node-lease` by default)\n  - **Specific namespaces** — user-specified list\n  - **Specific workloads** — user-specified Deployments/StatefulSets\n- **Include infrastructure layer?** — whether to also invoke `aws-best-practice-research` for\n  the EKS infrastructure layer and merge results (default: yes)\n\nIf the user provides only a cluster name, default to full cluster assessment.\n\n### Step 2: Environment Detection & Version Awareness\n\nRun the following commands to detect the environment:\n\n```bash\n# Cluster info via AWS CLI\naws eks describe-cluster --name {CLUSTER} --region {REGION}\n\n# K8s version\nkubectl version --output=json\n\n# Node distribution (extra label columns show each node's AZ and instance type)\nkubectl get nodes -o wide --no-headers -L topology.kubernetes.io/zone -L node.kubernetes.io/instance-type\n```\n\nRecord:\n- **K8s server version** (e.g., 1.30) — used for version-aware filtering\n- **EKS platform version** (e.g., eks.15)\n- **Node count and AZ distribution**\n- **Node instance types**\n\n**Version-aware filtering rules** (apply in Step 
3):\n- K8s >= 1.25: Check Pod Security Admission (PSA), skip PodSecurityPolicy (PSP)\n- K8s < 1.25: Check PSP, note PSA as upgrade recommendation\n- K8s >= 1.20: Check Startup Probes\n- K8s >= 1.19: Check Topology Spread Constraints\n- K8s >= 1.29 + VPC CNI >= 1.21.1: Check Admin Network Policies\n- EKS with Pod Identity available: Prefer Pod Identity over IRSA\n\n### Step 3: Dynamic Best Practice Research\n\nResearch the latest best practices using **context7** and **aws-knowledge-mcp-server**.\nRun all queries **sequentially** (one at a time) to avoid rate limiting.\n\nFor each of the 8 assessment dimensions, execute the search queries defined in\n`references/search-queries.md`. The general flow per dimension is:\n\n1. Query **context7** (`/websites/kubernetes_io`) for K8s official best practices\n2. Query **aws-knowledge-mcp-server** for EKS-specific best practices\n3. Read key documentation pages from search results (max 2-3 pages per dimension)\n4. Extract check items with specific thresholds and conditions\n\nAfter all research is complete, merge results with the **baseline framework** in\n`references/check-dimensions.md` to ensure no critical dimension is missed.\n\nApply version-aware filtering from Step 2 to remove inapplicable items and add\nversion-specific recommendations.\n\n**Rate limit protection**: If any MCP request returns \"Too many requests\", wait 5 seconds\nand retry once. If it fails again, skip and continue. Sequential execution is mandatory.\n\n### Step 4: Infrastructure Layer Assessment (Optional)\n\nIf infrastructure layer assessment is included (default: yes):\n\n1. Invoke the `aws-best-practice-research` skill for the EKS cluster\n2. Store the infrastructure-layer checklist and assessment results\n3. These will be merged into the final report in Step 7\n\nIf the user opts out, skip this step.\n\n### Step 5: Workload Data Collection\n\nCollect workload configurations using `kubectl`. 
Independent commands **can run in parallel**\n(they are not subject to MCP rate limits).\n\nSee `references/kubectl-assessment-commands.md` for the complete command list. Key data to collect:\n\n```bash\n# Core workloads\nkubectl get deployments,statefulsets,daemonsets,jobs,cronjobs --all-namespaces -o json\n\n# Pod specifications (within workloads above)\n# Already included in the -o json output\n\n# Disruption and scaling\nkubectl get pdb,hpa --all-namespaces -o json\n\n# Networking\nkubectl get networkpolicies,services,ingresses --all-namespaces -o json\n\n# Security (RoleBindings are namespaced, so query all namespaces)\nkubectl get serviceaccounts --all-namespaces -o json\nkubectl get clusterrolebindings,rolebindings --all-namespaces -o json\n\n# Storage (PVCs are namespaced; StorageClasses are cluster-scoped)\nkubectl get pvc,storageclass --all-namespaces -o json\n\n# Namespace labels (for PSA)\nkubectl get namespaces -o json\n\n# Events (recent issues)\nkubectl get events --all-namespaces --sort-by='.lastTimestamp' -o json\n```\n\nFor **ECR image scanning** (if images are from ECR):\n```bash\n# For each unique ECR image found in workloads\naws ecr describe-image-scan-findings --repository-name {REPO} --image-id imageTag={TAG}\naws ecr describe-repositories --repository-names {REPO}\naws ecr get-lifecycle-policy --repository-name {REPO}\n```\n\nFilter collected data to the assessment scope (namespaces/workloads from Step 1).\n\n### Step 6: Per-Dimension Assessment\n\nFor each check item from the research phase (Step 3), evaluate every in-scope workload:\n\n| Status | Meaning |\n|--------|---------|\n| **PASS** | The workload configuration meets or exceeds the recommendation |\n| **FAIL** | The workload configuration does not meet the recommendation |\n| **WARN** | Cannot be fully verified, or partially meets the recommendation |\n| **N/A** | The check does not apply (e.g., storage checks for stateless workloads) |\n\nFor each finding, record:\n- Check item ID and name\n- Status (PASS/FAIL/WARN/N/A)\n- **Actual value** observed (not just \"not configured\")\n- The specific workload(s) affected\n- 
Version relevance notes (if any)\n\n### Step 7: Generate Report and Save to Local File\n\nGenerate a single comprehensive report using the template in `references/output-template.md`\nand **write it directly to a local markdown file**.\n\n**IMPORTANT — File Writing Rules**:\n- Use the **Write/file tool** (not bash heredoc/echo/cat) to create the report file\n- If the report is too large for a single write, **split into sections**: write the\n  file with the first half, then use an append/edit operation to add the remaining sections\n- Do NOT output the full report content to the terminal\n\nUse the following file naming convention:\n\n```bash\nTIMESTAMP=$(TZ=Asia/Shanghai date +%Y-%m-%d-%H-%M-%S)\nCLUSTER_SLUG=$(echo \"{CLUSTER_NAME}\" | tr '[:upper:]' '[:lower:]' | tr ' :/' '-')\n```\n\n**Assessment Report** — see `references/output-template.md`\n- Full cluster overview\n- Compliance scorecard with rating scale, top 3 priorities, and quick stats\n- Dimension-by-dimension assessment tables\n- Per-workload detail section\n- Critical issues and prioritized remediation\n- Data sources and reference links\n- **Save to:** `${TIMESTAMP}-${CLUSTER_SLUG}-assessment-report.md`\n\nIf infrastructure layer results exist from Step 4, merge them into the report.\n\nAfter saving, print a brief summary to the terminal listing only:\n- The file path of the generated report\n- Overall compliance score\n- Number of PASS / FAIL / WARN findings\n\n### Step 8: Remediation Guidance & Next Steps\n\nAfter saving the report, offer:\n- \"I can help fix specific FAIL items — which ones would you like to address?\"\n- \"I can re-run the assessment after remediation to verify improvements.\"\n\nFor Critical Issues (FAIL + High priority), provide:\n- Specific remediation commands or manifest changes\n- Whether the fix requires workload restart or is in-place\n- Impact assessment of the change\n\n## Important Guidelines\n\n- **Be comprehensive**: The value of this skill is thoroughness. 
Better to include a check\n  and mark it N/A than to miss it.\n- **Always cite sources**: Every check item must reference its source (EKS Best Practices Guide,\n  K8s official docs, etc.).\n- **Sequential MCP queries**: All context7 and aws-knowledge-mcp requests must be sequential.\n  kubectl commands can be parallel.\n- **Rate limit protection**: Wait 5s and retry once on \"Too many requests\". Skip on second failure.\n- **Version awareness**: Always filter checks by detected K8s/EKS version. Never recommend\n  features unavailable in the cluster's version.\n- **Actual values in findings**: Always report what was observed, not just \"not configured\".\n  Good: \"`resources.requests.memory: not set` — container has no memory request\"\n  Bad: \"Memory request missing\"\n- **Per-workload granularity**: Report findings at the individual Deployment/StatefulSet level,\n  not just cluster-wide summaries.\n- **Exclude system namespaces by default**: Skip `kube-system`, `kube-public`, `kube-node-lease`\n  unless the user explicitly includes them.\n- **Respect language**: Output in the same language as the user's conversation.\n- **Infrastructure vs workload boundary**: Never duplicate checks from `aws-best-practice-research`.\n  This skill handles ONLY what requires kubectl/in-cluster access.\n","tags":{"latest":"1.0.0"},"stats":{"comments":0,"downloads":280,"installsAllTime":0,"installsCurrent":0,"stars":0,"versions":1},"createdAt":1778412841337,"updatedAt":1778492892498},"latestVersion":{"version":"1.0.0","createdAt":1778412841337,"changelog":"EKS Workload Best Practice Assessment v1.0.0\n\n- Initial release providing end-to-end workload best practice assessment for EKS Kubernetes clusters.\n- Assesses workloads against 8 key dimensions: configuration, security, observability, networking, storage, platform integration, CI/CD, and image security.\n- Implements version-aware checks and dynamic best practice research from Kubernetes and AWS EKS documentation.\n- Workload 
data gathering supports cluster-wide, namespace-specific, or workload-targeted scopes.\n- Optional integration with \"aws-best-practice-research\" for infrastructure-layer best practices.\n- Generates a detailed, markdown-formatted assessment report, saved locally per workflow instructions.","license":"MIT-0"},"metadata":null,"owner":{"handle":"panlm","userId":"s170tnbkqyrrzzgez8ybwx5vcs86edhj","displayName":"panlm","image":"https://avatars.githubusercontent.com/u/1658398?v=4"},"moderation":{"isSuspicious":false,"isMalwareBlocked":false,"verdict":"clean","reasonCodes":["review.llm_review"],"summary":"Review: review.llm_review","engineVersion":"v2.4.24","updatedAt":1780090777163}}