Security audit

Alibabacloud Lindorm Vector Migrate Skill

Security checks across malware telemetry and agentic risk

Overview

This skill matches its database-migration purpose, but it deserves review because it can execute migration code with credentials and includes unsafe plaintext HTTP credential examples for sensitive database operations.

Review generated scripts before running them, use least-privilege temporary credentials, prefer TLS/HTTPS or private trusted networks, avoid pasting real passwords into command lines or chat, test on staging first, and make backups before allowing target index creation, overwrite, or deletion.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Output Handling: Unvalidated Output Injection, Cross-Context Output, Unbounded Output
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (20)

Vague Triggers

Medium

Confidence: 89%
Finding: The trigger list includes very broad terms such as '迁移', 'migration', and 'data migration', which can cause the skill to activate for generic requests unrelated to this specific Lindorm vector workflow. Because the skill is designed to generate and execute migration code, accidental invocation increases the chance of unintended system-impacting actions or unnecessary collection of sensitive connection details.

Missing User Warnings

Medium

Confidence: 93%
Finding: The skill description explicitly says it will generate and execute code to perform migration, but it does not present a prominent upfront warning that these actions can affect infrastructure, consume resources, create indexes, or transfer large volumes of data. In a skill that performs operational changes, lack of a clear high-level safety warning raises the risk of users authorizing impactful actions without understanding the consequences.

Missing User Warnings

Medium

Confidence: 89%
Finding: The document explicitly states data will be exported 'as-is' to a local CSV file, which can include sensitive source records, vectors, IDs, and serialized JSON fields. In this skill's context, the agent is designed to generate and execute migration/export code, so lack of an explicit warning, minimization guidance, or secure-handling requirements increases the chance of unintended data exposure on local disk or through subsequent OSS upload.
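One way to reduce the local-disk exposure described in this finding is to create the export file with owner-only permissions and refuse to overwrite an existing file. The following is a minimal sketch, not the skill's own export code: the function name `write_private_csv` and the column names are hypothetical, and the permission bits apply to POSIX systems only.

```python
import csv
import os
import tempfile


def write_private_csv(path, header, rows):
    """Write rows to a new CSV file readable and writable only by its owner."""
    # O_EXCL makes the call fail if the file already exists, so an earlier
    # export is never silently overwritten. Mode 0o600 is owner read/write.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)
    return path


if __name__ == "__main__":
    # mkdtemp creates a directory that only the current user can access.
    export_dir = tempfile.mkdtemp(prefix="migration_export_")
    out = write_private_csv(
        os.path.join(export_dir, "export.csv"),
        ["id", "vector"],  # hypothetical column names
        [["1", "[0.1, 0.2]"]],
    )
    print(out)
```

Deleting the file after a verified import, or after the OSS upload completes, shortens the window in which exported records sit on disk.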

Missing User Warnings

Medium

Confidence: 95%
Finding: The documentation embeds a plaintext Basic Auth username and password directly in example commands, which normalizes unsafe secret handling and can lead users to paste real credentials into shell history, logs, screenshots, or shared docs. In a migration skill that may generate and execute commands, this increases the chance of credential exposure during operational use.
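A common alternative to inline `username:password` arguments is to read credentials from the process environment so they never appear in a command line, generated script, or chat transcript. The sketch below assumes two hypothetical variable names, `LINDORM_USER` and `LINDORM_PASSWORD`; they are illustrative and are not defined by the skill.

```python
import os


def load_credentials(env=None, user_var="LINDORM_USER", password_var="LINDORM_PASSWORD"):
    """Return (username, password) read from environment variables.

    The variable names are hypothetical defaults chosen for this sketch.
    """
    env = os.environ if env is None else env
    username = env.get(user_var)
    password = env.get(password_var)
    if not username or not password:
        # Fail early instead of falling back to a hard-coded default.
        raise RuntimeError(f"Set {user_var} and {password_var} before running the migration")
    return username, password


def redact(secret):
    """Return a placeholder that is safe to write to logs."""
    return "***" if secret else ""
```

A generated migration script could pass the returned tuple as the `auth` argument of an HTTP client and log only `redact(password)`, which keeps the secret out of shell history and process listings.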

Missing User Warnings

Medium

Confidence: 95%
Finding: The documentation shows Basic Auth credentials sent over plain HTTP to both VPC and public endpoints, with no warning that credentials and migrated data may be exposed to interception. In a migration skill that can directly generate and execute code against production databases, this is materially dangerous because users may copy these examples verbatim and transmit sensitive access credentials insecurely.

Missing User Warnings

Low

Confidence: 90%
Finding: The workflow instructs the agent to connect to remote source systems and enumerate indices/collections before the user has been explicitly warned that metadata from those systems will be accessed. Even if this is expected for migration, schema and collection names can themselves be sensitive and the action crosses a trust boundary by initiating network access and collecting remote metadata.

Missing User Warnings

Low

Confidence: 92%
Finding: The workflow asks for OSS endpoint, bucket, object key, and cloud credentials and then accesses remote object metadata without an explicit warning about handling secrets and querying cloud resources. This creates risk of accidental credential disclosure in conversation, misuse of over-privileged keys, or unintended access to bucket/object metadata.

Missing User Warnings

Low

Confidence: 94%
Finding: The pre-check phase performs local environment inspection and multiple outbound connections to source and target systems without a clear up-front notice that Python version, installed packages, schemas, counts, storage stats, and sampled records may be queried. This is risky because it can reveal local runtime details and remote data characteristics that users may not expect to expose during a pre-check.

Missing User Warnings

Low

Confidence: 95%
Finding: The workflow shows a curl command with inline username and password to query Lindorm cluster version information, but does not warn about shell history, process listing, logs, or transcript exposure. Inline credentials materially increase the chance that secrets are leaked to local audit trails or copied into generated scripts and chat history.

Missing User Warnings

Medium

Confidence: 91%
Finding: The document instructs use of authenticated curl commands with inline username:password over plain HTTP, but provides no warning about credential exposure in shell history, process listings, logs, or network transit. In an ops-facing migration skill, this creates a realistic risk of accidental credential leakage during routine use.

Missing User Warnings

Medium

Confidence: 96%
Finding: The document includes a DELETE example that removes an index but does not prominently warn that this operation is irreversible and will destroy data. In a migration assistant context, users may copy commands directly, so the lack of a safety warning materially increases the risk of accidental data loss in production.
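A typed-confirmation guard is one way to address the missing safety check on index deletion. The sketch below is illustrative only: `delete_index` and the `do_delete` callable are hypothetical names, and the actual DELETE request is left to whatever client the generated script uses.

```python
def is_delete_confirmed(index_name, typed_name):
    """Return True only when the operator retyped the exact index name."""
    return bool(index_name) and typed_name.strip() == index_name


def delete_index(index_name, typed_name, do_delete):
    """Call do_delete(index_name) only after an exact-match confirmation.

    do_delete is a hypothetical callable that issues the real DELETE request.
    """
    if not is_delete_confirmed(index_name, typed_name):
        raise PermissionError(
            f"Refusing to delete {index_name!r}: confirmation text did not match"
        )
    return do_delete(index_name)
```

Requiring the operator to retype the index name, rather than answer yes or no, makes it harder to confirm a deletion against the wrong index or environment by reflex.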

Missing User Warnings

Medium

Confidence: 98%
Finding: The examples send Basic Auth credentials to Lindorm over plain HTTP, which exposes usernames and passwords to interception by anyone on the network path. Because this skill is designed to generate and execute migration code, normalizing insecure transport in examples can lead directly to credential compromise and unauthorized database access.
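The plain-HTTP exposure described in this finding can be caught before any request is sent. The following sketch rejects credentialed requests to `http://` URLs unless the caller explicitly opts in, for example on a private trusted network; `check_transport` and its `allow_plain_http` flag are hypothetical names, not part of the skill.

```python
from urllib.parse import urlsplit


def check_transport(url, has_credentials, allow_plain_http=False):
    """Return url unchanged if it is acceptable to send the request to it."""
    scheme = urlsplit(url).scheme.lower()
    if scheme == "https":
        return url
    if scheme != "http":
        raise ValueError(f"Unsupported URL scheme: {scheme!r}")
    if has_credentials and not allow_plain_http:
        # Basic Auth over plain HTTP is only base64-encoded, not encrypted.
        raise ValueError(
            "Refusing to send credentials over plain HTTP; "
            "use https:// or pass allow_plain_http=True explicitly"
        )
    return url
```

Making the insecure path an explicit opt-in means a script copied from the examples fails closed instead of transmitting credentials in the clear.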

Missing User Warnings

Medium

Confidence: 87%
Finding: The document provides concrete destructive deletion examples such as `es.delete_by_query(...)` as normal migration replacements, but does not pair them with any warning about irreversible data loss, scope validation, or the need to confirm target/source selection. In this skill’s context, the agent is intended to generate and execute migration code directly, so users may copy or run delete operations against the wrong index or environment and cause bulk data loss.

Missing User Warnings

Medium

Confidence: 78%
Finding: The guide recommends write operations like `es.index(...)` and `_bulk` as standard migration replacements without warning that they modify target data and may overwrite, duplicate, or partially import records if parameters are wrong. Because this skill is designed to collect parameters and directly generate/execute migration code, the lack of caution increases the chance of unintended writes to production Lindorm instances.

External Transmission

Medium

Category: Data Exfiltration
Content:
Written data may still be in the refresh interval and not yet visible; force refresh first:
```bash
curl -s -u <username>:<password> -X POST "http://<host>:30070/<index_name>/_refresh"
```
## Row Count Validation
Confidence: 89%
Finding: curl -s -u <username>:<password> -X POST "http://<host>:30070/<index_name>/_refresh" ``` ## Row Count Validation ```bash curl -s -u <username>:<password> "http://<host>:30070/<index_name>/_count" \

External Transmission

Medium

Category: Data Exfiltration
Content: Use `function_score` + `random_score` to randomly sample from the target, then compare against source data: ```bash curl -s -u <username>:<password> "http://<host>:30070/<index_name>/_search" \ -H "Content-Type: application/json" \ -d '{ "size": <sample_size>,
Confidence: 88%
Finding: curl -s -u <username>:<password> "http://<host>:30070/<index_name>/_search" \ -H "Content-Type: application/json" \ -d

External Transmission

Medium

Category: Data Exfiltration
Content: auth = (username, password) if username else None # Trigger build resp = requests.post( f"{base_url}/_plugins/_vector/index/build", json={"indexName": index_name, "fieldName": vector_field}, auth=auth
Confidence: 93%
Finding: requests.post( f"{base_url}/_plugins/_vector/index/build", json=

External Transmission

Medium

Category: Data Exfiltration
Content:
```bash
# Search engine port is fixed at 30070
curl -s -u <username>:<password> "http://<host>:30070/"
```
## Create Index
Confidence: 98%
Finding: curl -s -u <username>:<password> "http://<host>:30070/" ``` ## Create Index ### Check if Index Exists ```bash curl -s -u <username>:<password> -o /dev/null -w "%{http_code}" \ "http://<host>:3007

External Transmission

Medium

Category: Data Exfiltration
Content: After triggering build, **MUST** poll until `status` is `ready` or `failed`; **MUST NOT** exit polling while `status=building`. ```bash curl -s -u <username>:<password> -X GET \ "http://<host>:30070/_plugins/_vector/index/tasks" \ -H "Content-Type: application/json" \ -d '{"indexName": "<index_name>", "fieldName": "<vector_field>", "taskIds": "[]"}'
Confidence: 94%
Finding: curl -s -u <username>:<password> -X GET \ "http://<host>:30070/_plugins/_vector/index/tasks" \ -H "Content-Type: application/json" \ -d

Unvalidated Output Injection

High

Category: Output Handling
Content:
| Source | Query-back method | Notes |
|--------|-------------------|-------|
| Milvus | `client.query(collection_name, filter=f"{pk_field} == {repr(pk_val)}", output_fields=["*"], limit=1)` | Lindorm `_id` is string; Milvus primary key may be int, need `int(doc_id)` conversion. Requires PyMilvus SDK |
| Elasticsearch | `curl -s -u user:pass "http://<es_host>:9200/<source_index>/_doc/<doc_id>"` | Direct lookup by `_id` |
| Lindorm | `curl -s -u user:pass "http://<src_host>:30070/<source_index>/_doc/<doc_id>"` | Same as ES |
| Qdrant | `curl -s "http://<qd_host>:6333/collections/<col>/points/<point_id>"` | point id may be int or UUID str |
Confidence: 94%
Finding: query(collection_name, filter=f"{pk_field} == {repr(pk_val)}", output
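The flagged expression interpolates `pk_field` and `pk_val` straight into a filter string, so a crafted document ID could change the meaning of the query. A minimal sketch of one mitigation is to validate both parts against an allowlist before building the expression. The helper name `build_pk_filter` and the allowed character set are assumptions made for illustration, and the sketch assumes the Milvus filter syntax accepts double-quoted string literals.

```python
import re

# Allowlists chosen for this sketch; adjust them to the real key format.
_FIELD_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
_SAFE_STR_RE = re.compile(r"[A-Za-z0-9_.:\-]+")


def build_pk_filter(pk_field, pk_val):
    """Build a 'field == value' filter string from validated parts only."""
    if not _FIELD_RE.fullmatch(pk_field):
        raise ValueError(f"Unsafe field name: {pk_field!r}")
    if isinstance(pk_val, bool):
        # bool is a subclass of int in Python, so reject it explicitly.
        raise ValueError("Boolean primary keys are not supported")
    if isinstance(pk_val, int):
        return f"{pk_field} == {pk_val}"
    if isinstance(pk_val, str) and _SAFE_STR_RE.fullmatch(pk_val):
        # The allowlist excludes quotes, spaces, and operators.
        return f'{pk_field} == "{pk_val}"'
    raise ValueError(f"Unsafe primary key value: {pk_val!r}")
```

Rejecting unexpected characters outright avoids relying on escaping rules, at the cost of refusing keys that fall outside the allowlist.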

VirusTotal

VirusTotal findings are pending for this skill version.