Install
openclaw skills install bookforge-lazy-load-strategy-implementer
Implement Lazy Load (deferred loading) correctly in a persistence layer to avoid N+1 queries, ripple loading, and proxy identity traps. Use when encountering slow object graph loads, N+1 query problems, out-of-memory on eager loading, ORM lazy loading misconfiguration, or deciding between eager vs lazy loading strategies. Applies to: Hibernate FetchType.LAZY / @BatchSize, SQLAlchemy lazy='select'/'selectin'/'subquery', Django select_related / prefetch_related, EF Core Include() vs Load(), TypeORM eager/lazy relations, Rails includes/preload/eager_load, hand-rolled Data Mapper with virtual proxy patterns. Covers all four implementation variants (lazy initialization, virtual proxy, value holder, ghost) with applicability rules and trade-off analysis. Identifies and fixes the ripple loading anti-pattern (N+1 on collections), the proxy identity trap (two proxies, same row, broken equality), and misuse of Lazy Load on small graphs that should just be eagerly loaded. Produces an implementation plan: chosen variant, ORM configuration or code sketch, batch loading config, eager-load overrides for hot paths, Identity Map integration, and a ripple-loading audit. Requires knowing the data-source pattern already in use (Data Mapper / ORM vs Active Record); if unknown, invoke data-source-pattern-selector first.
You have an object graph backed by a relational database. Loading some objects eagerly pulls in far more data than any single use case needs, but loading objects lazily without discipline produces N+1 queries (one database call per object in a collection) or ripple loading (a cascade of individual loads triggered across the graph).
Apply this skill when:
Do not apply Lazy Load when:
Prerequisite: know which data-source pattern the codebase uses. If unclear, invoke data-source-pattern-selector first, or ask the user. Lazy Load implementation differs significantly between hand-rolled Active Record (Lazy Initialization is simplest) and Data Mapper / ORM (Virtual Proxy or Ghost are standard).
Gather the following before proceeding:
Required:
Observable from codebase (Grep/Read):
- @OneToMany, relationship(), belongs_to, etc.)
- FetchType.LAZY, lazy='select', include: option)
Defaults if not provided:
Sufficiency check: If you know the ORM name, one or two entity classes, and the symptom (N+1 or memory pressure), that is enough to produce the implementation plan. Do not wait for full codebase access.
WHY: Applying Lazy Load blindly to every association trades an eager-loading problem for a ripple-loading problem. The classification determines which associations get Lazy Load, and which get eager loading or fetch-join overrides.
For each association in the entity/model:
Document the classification in a table: association name | direction | lazy/eager/override | reason.
WHY: Ripple loading is the primary failure mode of naively applied Lazy Load. A collection of individually-lazy objects iterated in a loop causes N database calls where one batch call would suffice. This must be identified before choosing a variant.
Search the codebase for these patterns:
- JOIN FETCH, include, prefetch_related, or batch size config
Flag each occurrence as a ripple-loading site. The fix in every case is the same: make the collection itself the lazy unit, loaded in one batch, rather than each element individually.
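To make the audit target concrete, here is a minimal sketch of a ripple-loading site next to its batched fix, using Python's standard sqlite3 module. The schema (orders, order_items), the CountingDb wrapper, and the function names are all assumptions made up for this illustration; they are not part of any ORM.

```python
import sqlite3


class CountingDb:
    """Hypothetical wrapper: runs queries and counts how many were issued."""

    def __init__(self, conn):
        self.conn = conn
        self.queries = 0

    def query(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params).fetchall()


def make_db(order_count=5, items_per_order=2):
    """Build an in-memory database with an assumed orders/order_items schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE order_items (id INTEGER PRIMARY KEY, order_id INTEGER, sku TEXT)"
    )
    for order_id in range(1, order_count + 1):
        conn.execute("INSERT INTO orders (id) VALUES (?)", (order_id,))
        for n in range(items_per_order):
            conn.execute(
                "INSERT INTO order_items (order_id, sku) VALUES (?, ?)",
                (order_id, f"sku-{order_id}-{n}"),
            )
    return CountingDb(conn)


def items_ripple(db):
    """Ripple-loading site: one child query per parent inside the loop."""
    order_ids = [row[0] for row in db.query("SELECT id FROM orders ORDER BY id")]
    result = {}
    for order_id in order_ids:
        rows = db.query(
            "SELECT sku FROM order_items WHERE order_id = ? ORDER BY id", (order_id,)
        )
        result[order_id] = [sku for (sku,) in rows]
    return result


def items_batched(db):
    """Fix: the collection is the lazy unit, loaded with a single IN query."""
    order_ids = [row[0] for row in db.query("SELECT id FROM orders ORDER BY id")]
    result = {order_id: [] for order_id in order_ids}
    placeholders = ", ".join("?" for _ in order_ids)
    rows = db.query(
        f"SELECT order_id, sku FROM order_items "
        f"WHERE order_id IN ({placeholders}) ORDER BY id",
        order_ids,
    )
    for order_id, sku in rows:
        result[order_id].append(sku)
    return result
```

Both functions return the same data; only the query count differs. With five orders, the looped version issues six queries and the batched version two.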
WHY: The four variants differ in how visible the lazy mechanism is to calling code, what identity guarantees they provide, and how much infrastructure they require. The ORM context nearly always determines which variant is practical.
Choose one variant per the decision tree:
If using a standard ORM (Hibernate, EF Core, SQLAlchemy ORM, Django ORM, TypeORM, Rails ActiveRecord):
If using hand-rolled Data Mapper or Active Record (no ORM proxy available):
- if (field == null) { field = load(); } and returns the field. Simplest, but null must not be a legitimate value; use a sentinel flag or loaded boolean if it can be.
- .getValue() instead of accessing the field directly. Useful when you want laziness to be visible in the type system.
For collections specifically: regardless of variant, always make the collection itself the lazy unit (one database call loads all elements). Never create a collection of individually-lazy domain objects.
Record: chosen variant + rationale + which associations it applies to.
WHY: Abstract variant choice produces no value without an executable implementation or ORM configuration.
Virtual Proxy via ORM (preferred for ORM stacks):
Java/Hibernate:
@OneToMany(fetch = FetchType.LAZY, mappedBy = "order")
@BatchSize(size = 50) // prevents ripple loading: loads 50 collections per SQL IN clause
private List<OrderItem> items;
Python/SQLAlchemy:
# lazy='select' = default proxy; 'selectin' = batch load (prevents ripple)
items = relationship("OrderItem", lazy="selectin") # preferred for collections
Django:
# Don't configure the model — configure the query
# select_related: SQL JOIN for FK / one-to-one (eager for the query)
Order.objects.select_related("customer")
# prefetch_related: separate batch query for reverse FK and M2M (batch lazy)
Order.objects.prefetch_related("items")
EF Core (C#):
// Enable lazy loading proxies globally (opt-in; needs the Microsoft.EntityFrameworkCore.Proxies package and virtual navigation properties)
services.AddDbContext<AppContext>(o => o.UseLazyLoadingProxies());
// Or explicit load (Value Holder style):
context.Entry(order).Collection(o => o.Items).Load();
// Or eager with Include for hot paths:
context.Orders.Include(o => o.Items).Where(...);
TypeORM:
@OneToMany(() => OrderItem, item => item.order, { lazy: true })
items: Promise<OrderItem[]>; // caller awaits to trigger load
Lazy Initialization (hand-rolled):
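class Supplier {
    private long id;                         // primary key, set by the mapper
    private List<Product> products;          // populated on first access
    private boolean productsLoaded = false;  // tracks load state so null stays a legal value

    public List<Product> getProducts() {
        if (!productsLoaded) {
            // First access: one extra query loads the whole collection.
            products = ProductMapper.findForSupplier(this.id);
            productsLoaded = true;
        }
        return products;
    }
}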
Use productsLoaded flag (not null check) when null is a legitimate value.
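The same idea can be sketched in Python with a sentinel object instead of a boolean flag, so that None remains a legitimate loaded value. This is an illustrative sketch only: the load_products callable is a hypothetical stand-in for a mapper call such as ProductMapper.findForSupplier.

```python
_NOT_LOADED = object()  # sentinel that can never collide with a real value, including None


class Supplier:
    def __init__(self, supplier_id, load_products):
        self.id = supplier_id
        # Hypothetical loader: a callable taking the supplier id and
        # returning the products (stands in for a mapper/database call).
        self._load_products = load_products
        self._products = _NOT_LOADED

    @property
    def products(self):
        # Lazy initialization: load on first access, then reuse the result.
        if self._products is _NOT_LOADED:
            self._products = self._load_products(self.id)
        return self._products
```

Because the check is against the sentinel rather than None, a loader that returns None is still called only once.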
Value Holder (hand-rolled):
class Supplier {
    private ValueHolder<List<Product>> products;

    public Supplier(long id) {
        this.products = new ValueHolder<>(() -> ProductMapper.findForSupplier(id));
    }

    public List<Product> getProducts() { return products.getValue(); }
}
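The Java above relies on a ValueHolder class that is not shown. A minimal sketch of what such a holder might look like, written in Python for brevity, follows; the class shape and the is_loaded property are assumptions for illustration, not a prescribed interface.

```python
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class ValueHolder(Generic[T]):
    """Wraps a loader and runs it at most once, on the first get_value() call."""

    def __init__(self, loader: Callable[[], T]) -> None:
        self._loader = loader
        self._loaded = False
        self._value: Optional[T] = None

    @property
    def is_loaded(self) -> bool:
        return self._loaded

    def get_value(self) -> T:
        if not self._loaded:
            self._value = self._loader()
            self._loaded = True  # a flag, so a loaded None is not reloaded
        return self._value
```

Callers have to go through get_value(), which is what makes the laziness visible at the call site.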
Ghost (hand-rolled): See references/ghost-implementation-guide.md for full implementation — requires instrumenting every field accessor in a domain supertype and using a Registry + Separated Interface to avoid domain-to-mapper dependency.
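As a rough orientation before reading the guide, the core of the Ghost idea can be sketched in a few lines of Python. This is a deliberately simplified illustration: the loader is injected as a plain callable (a hypothetical shortcut, in place of the Registry + Separated Interface wiring), and Employee with its single name field is an invented example class.

```python
from enum import Enum


class LoadState(Enum):
    GHOST = 1    # only the key is known
    LOADING = 2  # loader is filling fields; accessors must not re-trigger a load
    LOADED = 3


class DomainObject:
    """Domain supertype: every instance starts as a ghost holding just its key."""

    def __init__(self, key, loader):
        self.key = key
        self._loader = loader  # hypothetical: callable(obj) that fills obj's fields
        self.state = LoadState.GHOST

    def _ensure_loaded(self):
        if self.state is LoadState.GHOST:
            self.state = LoadState.LOADING
            self._loader(self)
            self.state = LoadState.LOADED


class Employee(DomainObject):
    def __init__(self, key, loader):
        super().__init__(key, loader)
        self._name = None

    @property
    def name(self):
        self._ensure_loaded()  # every field accessor is instrumented like this
        return self._name
```

The ghost is the real domain object from the start, so there is no separate proxy whose identity could diverge.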
WHY: A lazy collection still causes ripple loading if each element in a parent collection triggers its own individual sub-collection load. Batch loading converts N queries into ceil(N/batch_size) queries.
- Hibernate: @BatchSize(size = 50) on the collection. Hibernate issues one WHERE id IN (...) for up to 50 parents at a time.
- SQLAlchemy: lazy='selectin' on the relationship. SQLAlchemy issues one SELECT ... WHERE parent_id IN (...) for all loaded parents.
- Django: prefetch_related('items') on the QuerySet. Django issues one query per prefetched relation for the entire result set.
- EF Core: .Include(o => o.Items) on the query (adds a JOIN, not N separate SELECTs).
- Rails: includes(:items) or preload(:items) on the scope.
Batch size guidance: 50 is a safe default. Larger batches reduce round-trips but increase per-query data volume. Set based on average result set size.
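For hand-rolled mappers, the batching behaviour that these ORM options provide can be approximated by chunking parent ids. The sketch below assumes a hypothetical fetch_children callable that stands in for one SELECT ... WHERE parent_id IN (...) and returns (parent_id, child) pairs; none of these names come from a real library.

```python
def chunked(ids, batch_size):
    """Yield consecutive slices of ids, each at most batch_size long."""
    for start in range(0, len(ids), batch_size):
        yield ids[start:start + batch_size]


def batch_load(parent_ids, batch_size, fetch_children):
    """Load children for all parents using one fetch per batch of parent ids.

    fetch_children is a hypothetical callable: it takes a list of parent ids
    and returns (parent_id, child) pairs, as one IN query would.
    Returns the grouped children and the number of fetches issued.
    """
    children = {parent_id: [] for parent_id in parent_ids}
    fetches = 0
    for batch in chunked(parent_ids, batch_size):
        fetches += 1
        for parent_id, child in fetch_children(batch):
            children[parent_id].append(child)
    return children, fetches
```

With 200 parents and a batch size of 50, this issues four fetches instead of 200.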
WHY: A Virtual Proxy is a different object than the real domain object it wraps. Two proxies for the same database row have different object references (proxy1 != proxy2). Code that tests identity (==, is, ===) or uses proxied objects as hash keys will break silently.
Detection: search the codebase for:
- if (a == b) where a or b could be a lazy proxy
Fix: ensure the ORM's Identity Map (first-level cache / session cache) is active and returns the same proxy instance for the same primary key within a session. For hand-rolled Virtual Proxy, the mapper's Identity Map must return the same proxy object on repeated finds.
For equality checks: override equals()/__eq__()/Equals() to compare by primary key, not by object reference. This is mandatory for any domain class that participates in sets or maps.
Ghost variant avoids this problem entirely: the ghost IS the domain object, so identity is preserved.
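For a hand-rolled Virtual Proxy, both fixes can be sketched together: an Identity Map that hands back the same proxy per key, and equality defined on the primary key. OrderProxy, IdentityMap, and the dict-returning loader below are hypothetical names chosen for this illustration.

```python
class OrderProxy:
    """Virtual proxy: holds only the key and loads the real row on first use."""

    def __init__(self, order_id, loader):
        self.id = order_id
        self._loader = loader  # hypothetical: callable(order_id) -> dict of fields
        self._target = None

    def _real(self):
        if self._target is None:
            self._target = self._loader(self.id)
        return self._target

    @property
    def total(self):
        return self._real()["total"]

    # Equality and hashing by primary key, never by object reference.
    def __eq__(self, other):
        return isinstance(other, OrderProxy) and self.id == other.id

    def __hash__(self):
        return hash(("Order", self.id))


class IdentityMap:
    """Returns the same proxy instance for the same key within one session."""

    def __init__(self, loader):
        self._loader = loader
        self._proxies = {}

    def get(self, order_id):
        if order_id not in self._proxies:
            self._proxies[order_id] = OrderProxy(order_id, self._loader)
        return self._proxies[order_id]
```

Two proxies built independently for the same row are still distinct objects, but they compare equal and collapse to one entry in a set; proxies obtained through the map are the same object.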
WHY: A good default lazy strategy still needs targeted eager overrides for known high-traffic queries. Without them, report queries and list views generate ripple loads even with batch loading.
For each hot path identified in Step 1:
Example: an order detail screen needs customer, items, and product names. The general Order association is lazy. The detail query overrides:
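// Hibernate JPQL fetch join: overrides lazy for this query only (orderId is the id of the order being displayed)
Order order = em.createQuery(
    "SELECT o FROM Order o JOIN FETCH o.items i JOIN FETCH i.product WHERE o.id = :id", Order.class)
    .setParameter("id", orderId).getSingleResult();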
Document hot-path overrides: query location | associations eagerly fetched | reason.
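In a hand-rolled mapper, the equivalent of a query-level override is a second finder that loads the association up front. The sketch below uses Python's sqlite3 with an assumed orders/order_items schema; OrderMapper, find, and find_with_items are hypothetical names for illustration.

```python
import sqlite3


class Order:
    def __init__(self, order_id, customer, load_items, items=None):
        self.id = order_id
        self.customer = customer
        self._load_items = load_items
        # None means "not loaded yet"; a loaded collection is always a list.
        self._items = items

    @property
    def items(self):
        if self._items is None:
            self._items = self._load_items(self.id)
        return self._items


class OrderMapper:
    def __init__(self, conn):
        self.conn = conn
        self.queries = 0  # counted so the difference between finders is visible

    def _query(self, sql, params):
        self.queries += 1
        return self.conn.execute(sql, params).fetchall()

    def _items_for(self, order_id):
        rows = self._query(
            "SELECT sku FROM order_items WHERE order_id = ? ORDER BY id", (order_id,)
        )
        return [sku for (sku,) in rows]

    def find(self, order_id):
        """Default finder: items stay lazy until first accessed."""
        rows = self._query("SELECT customer FROM orders WHERE id = ?", (order_id,))
        return Order(order_id, rows[0][0], self._items_for)

    def find_with_items(self, order_id):
        """Hot-path finder: one JOIN loads the order and its items together."""
        rows = self._query(
            "SELECT o.customer, i.sku FROM orders o "
            "LEFT JOIN order_items i ON i.order_id = o.id "
            "WHERE o.id = ? ORDER BY i.id",
            (order_id,),
        )
        items = [sku for _, sku in rows if sku is not None]
        return Order(order_id, rows[0][0], self._items_for, items=items)
```

The default finder costs one query plus one more if items is touched; the hot-path finder costs one query in total.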
WHY: The plan persists decisions and gives the team a single document to review before applying changes.
Produce a Markdown document containing:
- data-source-pattern-selector)
Primary artifact: lazy-load-implementation-plan.md containing:
Secondary: inline code diffs for ORM annotation changes or hand-rolled getter implementations.
1. The collection is the lazy unit, not the elements. The ripple-loading anti-pattern arises when individual elements in a collection are each lazy. Loading the collection lazily (one batch query) is correct. Loading the collection eagerly but each element's sub-associations lazily is also acceptable if the sub-associations are genuinely optional. Never put a Lazy Load on each element of a collection that is iterated immediately.
2. Lazy Load is only worth its complexity cost when the field requires an extra database round-trip. Do not apply Lazy Load to a field stored in the same row as the main object — there is no performance benefit, only added code complexity. The value of Lazy Load is strictly about deferring extra database calls. If the field is co-located in the same SELECT, eager-load it.
3. ORM proxies need Identity Map backing or they break equality. The proxy identity trap is a silent correctness bug: two proxies for the same row compare unequal by reference. Always ensure the ORM session's Identity Map (first-level cache) returns the same proxy for the same key. Override equality to compare by primary key for all domain objects used in sets, maps, or equality comparisons.
4. Different use cases may need different fetch strategies — use query-level overrides. A single ORM mapping cannot be simultaneously optimal for a list screen (lazy, batch-prefetched) and a detail screen (eager fetch-join). Configure lazy as the default, and add fetch-join overrides at the query site for high-traffic paths. Two mapper variants (one lazy, one eager) for the same entity is a legitimate design.
5. Ghost preserves identity; Virtual Proxy does not. When identity semantics matter (domain objects used as set members, equality comparisons in business logic), Ghost is the correct variant: it IS the domain object. Virtual Proxy is a different object that impersonates the real one. For most ORM stacks the ORM manages identity via its session cache, making this a non-issue in practice — but the distinction matters for hand-rolled implementations.
6. Ripple loading cripples applications at scale; batch loading prevents it. A collection of N parent objects, each with a lazy child collection, causes N+1 queries without batch loading. At N=1000, that is 1001 queries. Batch loading reduces this to ceil(N/batch_size)+1 queries. This is not a micro-optimization — it is the difference between a feature working and not working under load.
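The arithmetic in principle 6 can be checked with a small helper. The function name and signature are made up for this illustration; it counts one query for the parents plus the child-collection queries.

```python
import math


def total_queries(parent_count, batch_size=None):
    """Queries needed to load parents plus their child collections.

    Without batching: 1 parent query + one child query per parent.
    With batching: 1 parent query + ceil(parent_count / batch_size) child queries.
    """
    if batch_size is None:
        return 1 + parent_count
    return 1 + math.ceil(parent_count / batch_size)
```

For N=1000 this gives 1001 queries unbatched and 21 with a batch size of 50.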
Trigger: A report listing 200 orders calls order.getItems() inside a loop. Query log shows 201 SELECT statements (1 for orders, 200 for items).
Process:
- items is a lazy candidate (not always needed), but this report needs it: mark as hot-path override.
- FetchType.LAZY with no @BatchSize.
- @BatchSize(size = 50): reduces the 200 item queries to 4.
- JOIN FETCH o.items.
- equals()/hashCode() on Order comparing by id.
Output: Implementation plan with @BatchSize annotation diff, JPQL fetch-join query, and equality override for Order.
Trigger: A feed endpoint loads 50 User objects. Accessing user.posts inside a Jinja template causes 50 additional SELECT statements.
Process:
- posts is a lazy candidate for profile screens; batch-loadable for feed.
- relationship("Post", lazy="select"): default per-access load.
- lazy="selectin".
- lazy="selectin" issues one SELECT ... FROM posts WHERE user_id IN (1,2,...,50) for the entire page.
Output: One-line model change (lazy="select" → lazy="selectin"), test showing query count drops from 51 to 2.
Trigger: A Document entity has a content field (a large text blob). Every SELECT of any document loads the blob even when only the title and date are needed.
Process:
- content is a lazy candidate (expensive, not needed for listings; needed for detail view).
- content field starts null; a contentLoaded boolean flag tracks state.
- getContent() checks the flag, issues a dedicated SELECT content FROM documents WHERE id=? if not loaded.
- findWithContent(id) which issues a JOIN or two-column SELECT upfront.
Output: Updated Document.java getter with lazy init pattern, updated DocumentMapper.java with findWithContent() method.
- references/ghost-implementation-guide.md: Full Ghost variant implementation (domain supertype, load states, Registry + Separated Interface wiring, ghost list for collections)
- references/lazy-load-variant-comparison.md: Side-by-side comparison of all four variants on: calling-code transparency, identity preservation, ORM support, instrumentation requirements, null-value handling
- references/orm-lazy-config-cheatsheet.md: Per-ORM Lazy Load configuration: Hibernate, EF Core, SQLAlchemy, Django ORM, TypeORM, Rails ActiveRecord
This skill is licensed under CC-BY-SA-4.0. Source: BookForge, based on Patterns of Enterprise Application Architecture by Martin Fowler, David Rice, Matthew Foemmel, Edward Hieatt, Robert Mee, Randy Stafford.
Install related skills from ClawHub:
clawhub install bookforge-unit-of-work-implementer
clawhub install bookforge-data-access-anti-pattern-auditor
Or install the full book set from GitHub: bookforge-skills