Salesforce Dimensional Modeling — datamodelassistant.com

Salesforce is a CRM application, not an analytical platform. Its object model is optimized for the workflow of sales reps, support agents, and admins -- not for the questions a CRO or RevOps analyst wants to ask of the funnel. A dimensional model on top of Salesforce -- a clean star schema -- is what turns that operational object model into a system you can actually report on, forecast from, and join to finance data.

This guide assumes you already know Salesforce. We are not going to re-explain what an Opportunity is. Instead, we walk through the dimensional-modeling decisions that consistently produce a usable analytical layer on top of Salesforce data, and the few traps that catch nearly every team the first time through.

Start From the Business Process, Not the Object

The most common antipattern is to dimensionalize Salesforce object-by-object: a "fact_opportunity" table that mirrors the Opportunity object, a "fact_account" table that mirrors the Account object, and so on. This produces an analytical schema that looks suspiciously like the source -- and is just as awkward to query.

The Kimball discipline is to start from the business process. For most Salesforce-sourced warehouses, the processes worth modeling are:

Opportunity lifecycle -- creation, stage transitions, close. This is the heart of the funnel.
Account activity -- meetings, calls, emails, tasks logged against an Account.
Lead-to-Opportunity conversion -- a separate process from Opportunity lifecycle, because the grain is different (one row per Lead) and the timing is upstream.
Quote and order -- if you use CPQ or have an order object, the quote/order lifecycle is a process distinct from Opportunity.
Case / support -- a separate fact for Service Cloud customers.

Each business process gets its own fact table, with conformed dimensions across them. An analyst should be able to ask "what is meeting volume in the 30 days before Closed-Won, by segment" by joining the activity fact to the opportunity fact through conformed Account and Date dimensions.

The Opportunity Lifecycle Is an Accumulating Snapshot

The single most useful Salesforce fact table is an accumulating snapshot at the Opportunity grain, with date columns for each meaningful milestone:

Created date
Each stage-entered date (Discovery, Qualified, Proposal, Negotiation, Closed-Won/Lost)
Forecast-category-entered dates
Close date (final)

Plus measures:

Amount (current, original, and the amount at each stage entry)
Days in each stage
Stage-skipped flag (did the opportunity skip a stage on the way to close)
Days to close

This single table answers most pipeline-analysis questions: stage conversion rates, average velocity, deal-aging, slip analysis, win-rate by entry stage. It is the workhorse of any Salesforce warehouse.

The data to build it lives in OpportunityHistory (every field change is logged there for tracked fields like StageName, Amount, CloseDate). The pattern is to pivot the long history table into a wide accumulating-snapshot row. This is a meaningful ETL job -- expect to spend real time on it -- but the resulting fact table is worth it.

Account Hierarchy as a Conformed Dimension

Salesforce supports Account hierarchies via the ParentId field, and most B2B enterprises rely on this for parent-account rollups (revenue, pipeline, customers under a single global parent). The dimensional treatment is a standard recursive hierarchy in the Account dimension, flattened at ETL time into:

Ultimate parent (top-of-tree) id and name
Hierarchy depth
Path string for sorting

Be aware of two traps:

Hierarchy mutation. Account hierarchies get restructured -- mergers, re-orgs, parent reassignments. If you Type-2 the Account dimension on ParentId changes but flatten the hierarchy as Type-1, your historical reports will silently shift when the hierarchy moves. The right default is Type-2 on both the parent-id and the flattened hierarchy columns.
Account record types. Many orgs use Account record types to distinguish customers, prospects, partners, and competitors -- and the same record can flip record type over its lifecycle. Treat record type as a Type-2 attribute, not a fixed property.

The Activity History Mess

Activities -- Tasks and Events -- are notoriously hard to model in Salesforce. The native object model splits them across Task, Event, and ActivityHistory (which is a query-only meta-object that unions them). Activities can be linked to multiple records via the WhatId / WhoId polymorphic relationships, with the WhatId pointing to one of Account, Opportunity, Lead, Case, Contract, or a custom object.

For analytical purposes, the cleanest pattern is:

A single activity fact table at the Task/Event grain
A polymorphic what_object_type + what_object_id pair in the fact row
Conformed Account dimension, joined via the resolved AccountId -- which means resolving the parent-of-Task chain (Task.WhatId → Opportunity.AccountId, or Task.WhatId → Account directly, or Task.WhoId → Contact.AccountId, etc.)

This resolution is annoying but worth doing in the ETL layer rather than the report layer. Reporting against polymorphic FKs in a BI tool is uniformly painful.

Beware of EmailMessage, which lives outside Task for inbound/outbound email and has its own activity-like semantics. Most orgs unify these in the warehouse and surface a single "all activities" fact table to analysts.

Custom Fields and the Wide-Dimension Trap

A mature Salesforce org has hundreds of custom fields on Account and Opportunity, and a tempting design is to land every custom field as a column on the Account or Opportunity dimension. This works at first and then breaks: dimension tables hit hundreds of columns, the ETL gets brittle when custom fields are renamed, and BI tools start to choke on the explosion of attributes.

The pragmatic pattern:

Keep "core" attributes (the ones used in most reports) on the dimension
Move long-tail or domain-specific attributes to junk dimensions or attribute tables keyed on the Account/Opportunity surrogate key
Document which custom fields are "promoted" to first-class dimension attributes and have a deliberate process for promoting more

Resist the urge to model every custom picklist as a snowflaked sub-dimension. Picklist values are low-cardinality strings and belong inline on the dimension table.

Currency, Multi-Currency, and the Forecast Field

Salesforce supports multi-currency at the field level: an opportunity Amount might be in EUR, USD, GBP, or anything else, and the CurrencyIsoCode field on the record tells you which. The reporting expectation is usually "show me the pipeline in our corporate currency" with a currency-conversion table to apply daily rates.

Land a Currency dimension keyed by ISO code, and a daily conversion-rate fact (or accumulating-snapshot conversion-rate dimension) that converts to your reporting currency. Pre-compute a corporate_currency_amount column on the Opportunity fact so analysts do not have to remember the conversion logic in every query.

Mind the ForecastCategory field too -- it has its own taxonomy ("Pipeline", "Best Case", "Commit", "Closed") that maps to but is not identical to StageName. Treat them as separate dimensions, not duplicates.

Putting It Together

A working Salesforce warehouse usually lands at:

3–5 fact tables (opportunity accumulating snapshot, activity, lead conversion, case, optional quote/order)
6–10 conformed dimensions (date, account, opportunity, contact/lead, user, owner, currency, record type, stage, source/campaign)
A handful of bridge tables for many-to-many (campaigns to opportunities, users to territories)

The dimensional layer is much narrower than the Salesforce source -- which is the point. Salesforce has hundreds of standard objects and thousands of custom fields across a typical org. A good star schema surfaces the 20 percent that drive 80 percent of the reporting.

Going Deeper

For the underlying Kimball patterns referenced throughout this guide, see What Is a Star Schema? and Kimball Dimensional Modeling. For a comparison of dimensional vs. normalized integration-layer approaches that is especially relevant when stitching Salesforce together with finance or product-event data, see Kimball vs Inmon.

If you are setting up a Salesforce warehouse and want a starting point that is more useful than a blank canvas, paste a sample of your Salesforce object structure into the demo. The assistant will propose an initial dimensional model -- expect to iterate, especially around custom fields and your specific hierarchy choices, but you will not be starting from zero.