爬论文与人才触达工作流 (Paper Scraping and Talent Outreach Workflow)

Security checks across malware telemetry and agentic risk

Overview

This skill is a disclosed recruiting scraper, but it enables broad personal-data harvesting, ethnicity inference, and bulk outreach workflows without enough user controls or privacy guardrails.

Install only if you are prepared to strictly limit it to authorized, lawful recruiting workflows. Before use, disable protected-attribute targeting, avoid harvesting non-public or unnecessary emails, review site terms and robots policies, keep scraped data out of shared Feishu spaces unless access and retention are controlled, and grant only the minimum Feishu/BrightData/API permissions needed.

SkillSpector

By NVIDIA

Vulnerability Patterns

Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (31)

Missing User Warnings

Medium

Confidence: 94%
Finding: The README explicitly promotes bulk collection of personal data such as names, emails, affiliations, research interests, social links, and inferred ethnicity, then exporting that data into Feishu tables and generating outreach at scale. This creates real privacy and compliance risk because it encourages large-scale processing of personal data without any guardrails around consent, lawful basis, sensitive-attribute handling, retention, notice, or jurisdiction-specific restrictions.

Missing User Warnings

Medium

Confidence: 90%
Finding: The skill recommends BrightData and anti-scraping approaches for collecting data from third-party sites, but does not disclose that retrieved content may be transmitted through an external service or discuss terms-of-service, privacy, or cross-border data handling implications. In a recruiting context involving personal profiles and emails, that omission can cause unintentional disclosure of personal data to third parties and encourage non-compliant collection practices.

Vague Triggers

High

Confidence: 97%
Finding: The activation rule says the skill should be used whenever a broad class of user tasks is inferred, even without explicit invocation. Over-broad auto-triggering is risky because it can silently route ordinary requests into a workflow that performs scraping, contact extraction, and outreach preparation, bypassing informed user consent.

Vague Triggers

Medium

Confidence: 88%
Finding: The listed task categories are expansive and lack exclusions or safety qualifiers, making it easy for the host agent to classify loosely related requests as eligible for this workflow. In context, that broadness matters because the workflow includes sensitive data extraction and mass outreach functions, so imprecise triggering can lead to unnecessary collection or processing of personal data.

Missing User Warnings

High

Confidence: 96%
Finding: This section operationalizes extraction of personal data and generation of personalized recruiting emails, but omits privacy, consent, lawful-basis, or anti-spam safeguards. In practice, that can normalize large-scale harvesting and outreach using profile, publication, and contact information without user-facing checks, creating privacy, compliance, and reputational risk.

Missing User Warnings

Medium

Confidence: 91%
Finding: The Feishu workflow describes importing scraped candidate data and writing generated email content into shared tables without warning about retention, access control, or downstream sharing risks. That is dangerous because centralizing scraped personal data in collaborative tools can amplify exposure if permissions are broad or if the data was collected without proper notice.

Missing User Warnings

High

Confidence: 97%
Finding: The prompt template explicitly directs broad scraping of websites and extraction of emails, including following links to gather missing addresses, with no mention of privacy, consent, robots/ToS, or rate-limit restrictions. This creates a ready-made playbook for aggressive personal-data harvesting and can be misused at scale.
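
The missing robots and rate-limit checks could look roughly like the following. This is a minimal sketch using only the Python standard library, not code from the skill: `build_robots_parser`, `PoliteFetchGate`, the `example-bot` user agent, and the 2-second default interval are all hypothetical names and values chosen for illustration. It does not address consent or terms-of-service review, which need a human decision.

```python
import time
import urllib.robotparser


def build_robots_parser(robots_txt: str) -> urllib.robotparser.RobotFileParser:
    """Parse robots.txt content that the caller has already retrieved."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


class PoliteFetchGate:
    """Hypothetical gate: check robots rules and space out requests."""

    def __init__(self, parser, user_agent, min_interval_s=2.0):
        self.parser = parser
        self.user_agent = user_agent
        self.min_interval_s = min_interval_s  # assumed value, tune per site
        self._last_request = None

    def allowed(self, url: str) -> bool:
        # True only if robots.txt permits this user agent to fetch the URL.
        return self.parser.can_fetch(self.user_agent, url)

    def wait_turn(self) -> None:
        # Sleep until at least min_interval_s has passed since the last call.
        now = time.monotonic()
        if self._last_request is not None:
            remaining = self.min_interval_s - (now - self._last_request)
            if remaining > 0:
                time.sleep(remaining)
        self._last_request = time.monotonic()
```

A caller would check `gate.allowed(url)` and call `gate.wait_turn()` before each request, skipping any URL the site disallows.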

Missing User Warnings

Medium

Confidence: 92%
Finding: The document provides extensive scraping and anti-detection guidance but omits guardrails about privacy, consent, rate limiting, terms-of-service, and legal/operational risk. In the context of a talent-search skill that explicitly targets harvesting researcher identities, profiles, and emails, this omission increases the likelihood of misuse for unauthorized bulk collection or evasive scraping.

Missing User Warnings

Medium

Confidence: 97%
Finding: This section recommends bypassing TLS protections via verify=False or downgrading from HTTPS to HTTP, without any warning that this disables certificate validation and exposes traffic to man-in-the-middle tampering. If copied into the skill's workflows, the agent could ingest manipulated pages, leak requests to attackers, or normalize insecure transport for data collection.
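
For contrast, a fetch path that keeps certificate validation on and refuses plain HTTP might be sketched as below. It uses only the standard library; `require_https`, `build_tls_context`, and `fetch` are hypothetical helper names, not part of the skill. With the `requests` library, the equivalent is simply leaving `verify` at its default of `True`.

```python
import ssl
import urllib.request
from urllib.parse import urlsplit


def require_https(url: str) -> str:
    """Reject any URL that would downgrade the connection to plain HTTP."""
    if urlsplit(url).scheme.lower() != "https":
        raise ValueError(f"refusing non-HTTPS URL: {url}")
    return url


def build_tls_context() -> ssl.SSLContext:
    # The default context verifies the certificate chain and the hostname.
    return ssl.create_default_context()


def fetch(url: str, timeout_s: float = 10.0) -> bytes:
    """Fetch a page over HTTPS with certificate verification enabled."""
    context = build_tls_context()
    with urllib.request.urlopen(require_https(url), context=context,
                                timeout=timeout_s) as response:
        return response.read()
```

If a site's certificate fails validation, the safer response is to skip that site or fix the local trust store instead of disabling the check.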

Missing User Warnings

Medium

Confidence: 95%
Finding: This file embeds a scoring algorithm to infer whether an author is Chinese from names, affiliations, and IDs, which constitutes sensitive ethnicity/national-origin profiling with no safeguards, consent, accuracy limits, or fairness warnings. In the context of a recruiting and researcher-mapping skill, this is especially dangerous because it can enable targeted filtering, segmentation, or outreach based on protected characteristics and can misclassify individuals at scale.

Missing User Warnings

Medium

Confidence: 94%
Finding: The document explicitly instructs users to bulk scrape and export authors' email addresses and other contact details for talent sourcing and outreach, but provides no privacy safeguards, consent checks, purpose limitations, or legal/compliance warnings. In the context of a recruiting-oriented skill, this materially increases the risk of unauthorized collection, profiling, and mass contact of individuals using personal data harvested from academic platforms and PDFs.

Natural-Language Policy Violations

High

Confidence: 98%
Finding: The document includes procedures to identify 'Chinese authors' by surname, institution, and profile patterns, which is explicit inference and classification of ethnicity/national origin from names and affiliations. In this skill's hiring and researcher-mapping context, that creates a high-risk profiling capability that could be used for discriminatory recruiting, exclusion, or targeted outreach based on a sensitive attribute.

Missing User Warnings

Medium

Confidence: 92%
Finding: The templates explicitly encourage moving outreach conversations onto WeChat and disclose a personal contact handle, but provide no guardrails around consent, lawful data use, or safe handling of personal contact information. In the context of a sourcing/recruiting automation skill that scrapes researcher identities, emails, and affiliations, this increases the risk of privacy-invasive contact, unsolicited outreach, and inappropriate transfer of personal data to external messaging platforms.

Natural-Language Policy Violations

Medium

Confidence: 84%
Finding: The document defaults to Chinese or mixed-language outreach without checking recipient language preference or opt-in, which can make automated contact more deceptive, intrusive, or non-consensual. Because this skill is designed for bulk researcher discovery, classification, and personalized email generation, the lack of language-choice safeguards amplifies the risk of unsolicited mass outreach and targeting of specific demographic groups such as Chinese researchers.

Missing User Warnings

Medium

Confidence: 90%
Finding: This section explicitly promotes bulk collection of authors' profiles, homepages, and especially email addresses, but provides no guidance on consent, lawful basis, rate limiting, or downstream use of personal data. In a recruitment-oriented skill, that omission increases the risk of privacy violations, unwanted contact, and misuse of scraped contact information.

Missing User Warnings

Medium

Confidence: 93%
Finding: The file recommends extracting email addresses from paper PDFs to supplement missing contact details, which materially expands personal-data harvesting beyond obvious page metadata. Without safeguards, this can enable mass contact harvesting for spam, profiling, or other privacy-invasive uses, especially because PDF extraction is presented as an efficiency tactic.

Missing User Warnings

Medium

Confidence: 91%
Finding: These sections describe scraping member pages and pulling emails, education histories, interests, and social links without any privacy or ethics boundary. Because the broader skill is designed for researcher discovery and personalized recruiting, the context makes silent aggregation of identity/contact data more operationally risky than a purely academic indexing use case.

Missing User Warnings

Medium

Confidence: 94%
Finding: The template explicitly instructs large-scale collection of personal data from websites, linked profiles, papers, and secondary pages to extract emails, then centralizes that data into CSV and Feishu without any consent, purpose limitation, or legal/privacy checks. In the context of a recruiting/mapping skill, this materially increases the risk of privacy violations, unauthorized profiling, mass contact enablement, and non-compliant handling of personal data.

Missing User Warnings

Medium

Confidence: 93%
Finding: The guide explicitly teaches bulk collection of personal pages, emails, social links, lab rosters, and researcher identities, but provides no privacy, consent, retention, or lawful-use guidance. In the context of a recruiting/mapping skill that is designed to discover and aggregate personal contact data at scale, this omission materially increases the risk of misuse for unsolicited outreach, profiling, or non-compliant data processing.

Missing User Warnings

Medium

Confidence: 84%
Finding: The document shows authenticated GitHub API usage with a personal token but does not warn against hardcoding tokens, logging headers, or committing credentials to source control. This creates a realistic risk of accidental credential exposure and subsequent API abuse, account impact, or unauthorized access to associated resources.
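
One way to avoid the exposure described here is to load the token from the environment and redact it before anything is logged. The sketch below is illustrative only: the `GITHUB_TOKEN` variable name and the `github_headers` and `redact_headers` helpers are assumptions, not names taken from the skill.

```python
import os


def github_headers(env=None) -> dict:
    """Build GitHub API headers from an environment variable, never a literal."""
    env = os.environ if env is None else env
    token = env.get("GITHUB_TOKEN")  # assumed variable name
    if not token:
        raise RuntimeError("GITHUB_TOKEN is not set; refusing to continue")
    return {
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
    }


def redact_headers(headers: dict) -> dict:
    """Return a copy that is safe to log: the credential value is masked."""
    return {
        name: "<redacted>" if name.lower() == "authorization" else value
        for name, value in headers.items()
    }
```

Logging `redact_headers(headers)` instead of `headers` keeps the token out of log files, and keeping the token out of source files keeps it out of version control.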

Missing User Warnings

High

Confidence: 98%
Finding: The example disables TLS certificate verification with `verify=False` and gives no warning that this bypasses server identity validation. If copied into real scraping code, it enables man-in-the-middle interception or response tampering, which is especially risky when collecting personal data or using authenticated sessions.

Missing User Warnings

Medium

Confidence: 92%
Finding: The script is explicitly designed to extract and persist authors' real email addresses and inferred institutions from PDFs, but it provides no privacy notice, consent check, purpose limitation, or policy guardrails. In the context of a recruiting/mapping skill that identifies researchers and supports personalized outreach, this creates a material risk of privacy-invasive collection and downstream mass contact or profiling.

Missing User Warnings

Medium

Confidence: 95%
Finding: The scraper automatically downloads PDFs and extracts contact information during normal execution when extract_pdf is enabled by default, with no point-of-action disclosure or confirmation. Because this behavior is embedded in an AI talent-mapping skill oriented toward finding researchers and generating outreach, the automation meaningfully lowers the barrier to bulk harvesting of personal contact data.
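
A point-of-action confirmation of the kind this finding asks for could be sketched as follows. Only the `extract_pdf` flag name comes from the finding; `ScrapeOptions`, `should_extract_pdf`, and the `confirm` callback are hypothetical, and the scraper's real option handling is not shown in this report.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ScrapeOptions:
    # Hypothetical options object; extraction is off unless explicitly enabled.
    extract_pdf: bool = False


def should_extract_pdf(options: ScrapeOptions,
                       confirm: Callable[[str], bool]) -> bool:
    """Run PDF extraction only if it is enabled AND the user confirms."""
    if not options.extract_pdf:
        return False
    return confirm(
        "This run will download PDFs and extract author contact details. "
        "Continue?"
    )
```

The `confirm` callback is whatever prompt mechanism the host provides, which keeps the decision with the user instead of the default configuration.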

Missing User Warnings

Medium

Confidence: 93%
Finding: The script is explicitly designed to harvest emails, social accounts, and profile links for large numbers of GitHub users, enrich them from README content, and export the dataset to Excel/CSV. In the context of a recruiting/mapping skill, this materially increases privacy and abuse risk because it enables bulk profiling, contact harvesting, and downstream spam or targeted outreach without notice, consent, or data-minimization controls.

Missing User Warnings

Medium

Confidence: 82%
Finding: The scraper is explicitly designed to collect personal contact data such as names, email addresses, social links, education, and research interests from public-facing lab pages, but it provides no consent notice, lawful-use guardrails, or privacy constraints. In the context of a recruiting/mapping skill that also mentions bulk email generation, this materially increases the risk of privacy abuse, unsolicited outreach, and non-compliant personal data processing.

VirusTotal

67/67 vendors flagged this skill as clean.
