Gemini Guide

v1.0.0

Google Gemini API 开发助手,精通 Gemini Pro/Flash、多模态、函数调用、上下文缓存

0· 87·0 current·0 all-time
MIT-0
Download zip
LicenseMIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
VirusTotalVirusTotal
Benign
View report →
OpenClawOpenClaw
Benign
high confidence
Purpose & Capability
Name/description match the SKILL.md content: a developer guide for Google Gemini (models, SDK usage, multimodal examples, caching). Nothing requested (no env vars, no binaries) is disproportionate to that purpose.
Instruction Scope
Runtime instructions are example code snippets for the official google-genai SDK, covering model calls, multimodal uploads, function-calling, and caching. Examples reference local files (photo.jpg, video.mp4) and an API key placeholder — all expected for this type of guide. The instructions do not direct the agent to read unrelated system files, access unrelated secrets, or POST data to unexpected endpoints.
Install Mechanism
No install spec or code files are present; the SKILL.md only suggests installing the official 'google-genai' Python package via pip, which is appropriate and low-risk for a usage guide.
Credentials
The document mentions an API key in examples (api_key="YOUR_API_KEY") but the skill declares no required env vars or credentials. Requesting a Google AI API key is appropriate for the guide's purpose and there are no unrelated credential requests.
Persistence & Privilege
Skill is instruction-only, no install, does not request persistent presence or system-level changes. Platform flags (always: false, agent invocation allowed) are standard and consistent with a normal skill.
Assessment
This skill is an example-driven guide for using Google Gemini via the google-genai Python SDK and appears coherent. Before using it: 1) Only supply your Google API key to trusted, official SDKs and endpoints; never paste keys into public chat. 2) Confirm the package 'google-genai' is the official release on PyPI and install it in an isolated environment. 3) Be mindful of costs when using high-context models or uploading large media; restrict API key permissions and set quotas in your Google account. 4) If you run the example code, ensure local files referenced (photo.jpg, video.mp4) are files you intend to upload. These checks will reduce accidental exposure or unintended charges.

Like a lobster shell, security has layers — review code before you run it.

latestvk979bfqeanfa1s7m6tx4x9en3983db6r

License

MIT-0
Free to use, modify, and redistribute. No attribution required.

SKILL.md

Gemini API - Google AI 模型接入指南

简介

Gemini 是 Google 的多模态大模型,通过 AI Studio 或 Vertex AI 提供 API。 核心优势:超长上下文(最高 200 万 token)和原生多模态(文本/图片/视频/音频)。

模型矩阵

模型上下文窗口特点适用场景
gemini-2.5-pro100 万最强推理,思维链复杂分析、代码生成
gemini-2.0-flash100 万速度快,性价比高日常对话、批量处理
gemini-2.0-flash-lite100 万最快最便宜简单任务、高并发
gemini-1.5-pro200 万超长上下文长文档分析、代码库理解

SDK 安装与基础调用

pip install google-genai   # 官方 SDK
from google import genai
client = genai.Client(api_key="YOUR_API_KEY")
response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents="用 Python 实现一个快速排序算法"
)
print(response.text)

多模态能力

from google.genai import types
import pathlib
# 图片理解
image = types.Part.from_bytes(data=pathlib.Path("photo.jpg").read_bytes(), mime_type="image/jpeg")
response = client.models.generate_content(model="gemini-2.0-flash", contents=["描述图片内容", image])
# 视频理解(直接上传文件)
video_file = client.files.upload(file="video.mp4")
response = client.models.generate_content(model="gemini-2.0-flash", contents=["总结视频内容", video_file])
# 音频理解
audio_file = client.files.upload(file="audio.mp3")
response = client.models.generate_content(model="gemini-2.0-flash", contents=["转录并翻译", audio_file])

函数调用与 JSON 模式

# 函数调用
get_weather = types.FunctionDeclaration(
    name="get_weather", description="获取城市天气",
    parameters=types.Schema(type="OBJECT",
        properties={"city": types.Schema(type="STRING", description="城市名")},
        required=["city"]))
tool = types.Tool(function_declarations=[get_weather])
response = client.models.generate_content(
    model="gemini-2.0-flash", contents="北京天气?",
    config=types.GenerateContentConfig(tools=[tool]))
# JSON 模式
response = client.models.generate_content(
    model="gemini-2.0-flash", contents="列出 3 种编程语言",
    config=types.GenerateContentConfig(response_mime_type="application/json"))

上下文缓存(Context Caching)

反复查询同一大文档时可大幅降低成本:

cache = client.caches.create(model="gemini-2.0-flash", contents=[large_document],
    config=types.CreateCachedContentConfig(display_name="my-cache", ttl="3600s"))
response = client.models.generate_content(model="gemini-2.0-flash", contents="第三章讲了什么?",
    config=types.GenerateContentConfig(cached_content=cache.name))

定价对比(每百万 token)

模型输入价格输出价格
Gemini 2.0 Flash$0.10$0.40
Gemini 2.5 Pro$1.25$10.00
Claude Sonnet 4$3.00$15.00
GPT-4o$2.50$10.00

与 OpenAI/Claude API 的差异

特性Gemini APIOpenAI APIClaude API
最大上下文200 万 token12.8 万20 万
原生多模态文本/图片/视频/音频文本/图片/音频文本/图片
免费额度有(AI Studio)
上下文缓存原生支持Prompt Caching
SDK 风格自有 + OpenAI 兼容自有自有

最佳实践

  • 默认用 gemini-2.0-flash,性价比最高
  • 长文档用上下文缓存,节省 75%+ 成本
  • 视频/音频理解是 Gemini 独特优势
  • API Key: https://aistudio.google.com/apikey

Files

1 total
Select a file
Select a file to preview.

Comments

Loading comments…