MongoDB does not fail at scale because documents are bad. It fails because the document shape stopped matching the way the product reads and writes data, and nobody noticed until the working set moved out of memory.
Node.js gets blamed for some of this too. Usually the real bug is simpler: too much data pulled into the app, too many per-document round trips, a missing compound index, or a document that grew from "convenient" to "every screen in the business."
The database is not offended by your model. It is just executing it.
Model around the hot path
The first question is not "embed or reference?" The first question is "what does the hot path need in one read?"
Embedding is great when the child data is small, bounded, and read with the parent. It is painful when the child data grows without limit, changes at a different cadence, or needs its own query path.

    // Core order shape: the fields the order screen reads together.
    type OrderDocument = {
      id: string;
      buyerId: string;
      status: 'draft' | 'paid' | 'fulfilled' | 'cancelled';
      totals: { subtotal: number; tax: number; grandTotal: number };
      lines: { sku: string; qty: number; unitPrice: number }[];
    };

That shape is fine until someone adds every shipment event, every support note, every payment attempt, and every audit trail to the same document because "it belongs to the order." "Belongs" is not a modeling rule. The access pattern is.
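One way to keep the order document bounded is to move append-only history into its own collection, keyed by the order id. The sketch below only reshapes plain objects in memory. `OrderEvent`, `BloatedOrder`, and `splitOrder` are hypothetical names introduced for illustration, and the `kind` values are assumptions based on the kinds of history listed above.

```typescript
// Only OrderDocument comes from the text; the other shapes are assumptions.
type OrderDocument = {
  id: string;
  buyerId: string;
  status: 'draft' | 'paid' | 'fulfilled' | 'cancelled';
  totals: { subtotal: number; tax: number; grandTotal: number };
  lines: { sku: string; qty: number; unitPrice: number }[];
};

// Hypothetical shape of one history entry once it lives in its own collection.
type OrderEvent = {
  orderId: string; // reference back to the parent order
  kind: 'shipment' | 'support_note' | 'payment_attempt' | 'audit';
  at: Date;
  payload: Record<string, unknown>;
};

// Hypothetical order that has accumulated unbounded history inline.
type BloatedOrder = OrderDocument & {
  history: Omit<OrderEvent, 'orderId'>[];
};

// Split a bloated order into the bounded hot document and standalone event
// documents that can get their own indexes and query paths.
function splitOrder(order: BloatedOrder): { core: OrderDocument; events: OrderEvent[] } {
  const { history, ...core } = order;
  const events = history.map((event) => ({ ...event, orderId: order.id }));
  return { core, events };
}
```

The order screen then reads `core` in one fetch, and a timeline screen can page through events by `orderId` without widening the hot document.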
Indexes are part of the feature
An endpoint is not done when it returns the right JSON on your laptop. It is done when the query shape is backed by an index that still performs at production data volumes and cardinality.
For MongoDB, that means looking at the filter, sort, and projection together:

    db.orders.createIndex({
      buyerId: 1,
      status: 1,
      createdAt: -1,
    });

If the query filters on buyerId, filters on status, and sorts by createdAt, that compound index is not an optimization. It is the feature's support structure.
The index review should happen in code review. If the PR adds a query and no one can point to the matching index, the PR is not done.
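To make "point to the matching index" concrete in review, here is a deliberately simplified sketch of the check a reviewer does in their head: the equality fields must fill the index prefix, and the sort keys must follow in order. `IndexSpec`, `QueryShape`, and `indexSupportsQuery` are hypothetical names, not a MongoDB API. The helper ignores range predicates, multikey indexes, and many planner details, so the query planner's explain output remains the real check.

```typescript
type Direction = 1 | -1;
type IndexSpec = [string, Direction][];

// Hypothetical description of a query: exact-match fields plus sort keys.
type QueryShape = {
  equality: string[];
  sort: [string, Direction][];
};

// Returns true when the equality fields fill the index prefix (in any order)
// and the sort keys follow immediately, all in index direction or all reversed.
function indexSupportsQuery(index: IndexSpec, query: QueryShape): boolean {
  const equality = new Set(query.equality);
  const prefix = index.slice(0, equality.size);
  if (prefix.length !== equality.size) return false;
  if (!prefix.every(([field]) => equality.has(field))) return false;

  const sortKeys = index.slice(equality.size, equality.size + query.sort.length);
  if (sortKeys.length !== query.sort.length) return false;

  const fieldsMatch = query.sort.every(([field], i) => sortKeys[i]?.[0] === field);
  const sameDirection = query.sort.every(([, dir], i) => sortKeys[i]?.[1] === dir);
  const reversed = query.sort.every(([, dir], i) => sortKeys[i]?.[1] === -dir);
  return fieldsMatch && (sameDirection || reversed);
}

// The compound index from the text, written as ordered pairs.
const ordersIndex: IndexSpec = [
  ['buyerId', 1],
  ['status', 1],
  ['createdAt', -1],
];

// The query shape from the text: filter on buyerId and status, sort by createdAt.
const listOrders: QueryShape = {
  equality: ['buyerId', 'status'],
  sort: [['createdAt', -1]],
};

const supported = indexSupportsQuery(ordersIndex, listOrders);
```

Dropping the `status` filter while keeping the sort makes the check fail, which mirrors why a query that skips a middle index field cannot use the index order for its sort.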
Watch memory before CPU
The nasty cliff is the working set. Everything feels fine while the hot indexes and documents fit in memory. Then the product grows, a document gets wider, a dashboard scans too much, and the database starts spending its life fetching pages.
The symptoms look like application problems:
- Node workers waiting on I/O;
- p95s drifting before p50s move;
- connection pools filling during dashboard traffic;
- one "simple" admin screen making every customer request slower.
The fix is rarely "rewrite it." The fix is usually smaller reads, better projections, a correct compound index, and moving unbounded history out of the hot document.
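As a sketch of "smaller reads, better projections," the helper below builds the filter and options for an order list screen as plain objects. `buildOrderListQuery`, `OrderListQuery`, the projected field list, and the page-size cap of 100 are all assumptions for illustration. The result is shaped so it could be passed to the Node driver's `collection.find(filter, options)`, but nothing here talks to a database.

```typescript
type OrderStatus = 'draft' | 'paid' | 'fulfilled' | 'cancelled';

// Hypothetical container for the arguments of a find() call.
type OrderListQuery = {
  filter: { buyerId: string; status: OrderStatus };
  options: {
    projection: Record<string, 0 | 1>;
    sort: Record<string, 1 | -1>;
    limit: number;
  };
};

function buildOrderListQuery(
  buyerId: string,
  status: OrderStatus,
  pageSize = 20,
): OrderListQuery {
  return {
    // Same fields, same order of concern, as the compound index in the text.
    filter: { buyerId, status },
    options: {
      // Only what the list screen renders; line items are not sent to the app.
      projection: { _id: 1, status: 1, 'totals.grandTotal': 1, createdAt: 1 },
      sort: { createdAt: -1 },
      // Clamp the page size so one caller cannot request an unbounded read.
      limit: Math.min(Math.max(1, Math.trunc(pageSize)), 100),
    },
  };
}
```

A projection trims what crosses the network and what the Node process has to allocate. It does not by itself shrink the document the server loads, which is why moving unbounded history out of the hot document still matters.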
The app should not repair the database shape
When the Node layer starts doing joins, filters, and sorts in memory, the model is already leaking.
You can get away with it early. A dozen records become a hundred. A hundred become ten thousand. Then a harmless endpoint is allocating large arrays so it can throw most of them away. That is not business logic. That is a query plan hiding in TypeScript.
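For contrast, here is a minimal sketch of the same work expressed as an aggregation pipeline, so the filter, sort, limit, and join run in the database instead of in arrays in the Node process. The `shipments` collection, its `orderId` field, and the function name are hypothetical. The function only builds the stage list, which would then be handed to the driver's `collection.aggregate(pipeline)`.

```typescript
type PipelineStage = Record<string, unknown>;

// Builds a pipeline for a buyer's most recent paid orders with their shipments.
function recentPaidOrdersWithShipments(buyerId: string, limit = 20): PipelineStage[] {
  return [
    // Filter and sort first so the compound index from the text can serve them.
    { $match: { buyerId, status: 'paid' } },
    { $sort: { createdAt: -1 } },
    // Limit before the join so only one page of orders is looked up.
    { $limit: limit },
    // Hypothetical shipments collection that references orders by orderId.
    {
      $lookup: {
        from: 'shipments',
        localField: '_id',
        foreignField: 'orderId',
        as: 'shipments',
      },
    },
  ];
}
```

The ordering is the design choice: the app never receives rows it intends to discard, and the join touches at most `limit` orders.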
I like Node for product work because it makes the path from API to UI short. That does not mean the app should compensate for a lazy database model. Keep the hot path narrow, name the indexes in the PR, and treat document growth as a production risk, not a storage detail.