Skip to content

ShipMonk Insights — FAQ & Best Practices


Table of Contents

  1. What is ShipMonk Insights?
  2. Getting Access
  3. Navigating the Interface
  4. Understanding Your Data (Merchant Data Catalog)
  5. Available Tables & What They Cover
  6. Best Practices: Building Reports
  7. Best Practices: Joining Tables
  8. Best Practices: Working with Timestamps
  9. Common FAQs
  10. Troubleshooting
  11. Data Refresh & Availability
  12. SQL Access (Native Queries)

1. How is ShipMonk Insights different from Data Catalog?

ShipMonk Insights is ShipMonk's self-service analytics platform, built on top of the Merchant Data Catalog (MDC) — a curated, merchant-facing data layer that exposes your operational data in a structured, queryable format. It lets you build, customize, and share reports and dashboards covering orders, inventory, receiving, returns, billing, and more.

Insights is designed for merchants who want deeper visibility into their fulfillment operations than the standard in-app reports provide — without needing to export data to Excel, prop up data pipelines, or maintain a separate BI tool.

Key capabilities include:

  • Pre-built report templates to get started quickly
  • Custom questions using a visual query builder (no SQL required for most use cases)
  • Native SQL queries for advanced users who want full flexibility
  • Shared dashboards with filters your whole team can use
  • Direct access to your raw MDC data without having to stand up a your own analytics pipeline

2. Getting Access

ShipMonk Insights access is provisioned at the account/user level by your Merchant Success Manager (MSM) or Tech Solutions team. Once enabled:

  1. Log into your ShipMonk account as usual
  2. Navigate to Reports in the main navigation
  3. Click on the Insights tab in the left navigation
  4. You will be automatically authenticated — no separate login is required

Access is scoped to your account only. You cannot see data from other ShipMonk merchants, and they cannot see yours. This is enforced at the database level via row-level security policies.

If Insights does not appear in your navigation, contact your MSM to confirm the feature has been enabled on your account.


3. Navigating the Interface

ShipMonk Insights is powered by Metabase, embedded directly within the ShipMonk app. Here's an overview of the main areas:

Dashboard

This is the default dashboard of your account. This is the landing page all users will see when they first access ShipMonk Insights. Put whatever information here that you'd like everyone to see.

Prepared Reports

This is where you can find a repository of validated ShipMonk metrics and any additional dashboards to help you glean insight from your ShipMonk data. While in this folder, everything is read-only. To save your own copy, simply start editing any of the metrics and you will be prompted to save it to your collection for future reference.

Custom Reports

Custom Reports is the repository where all of your team's dashboards and cards live. It's important to note that any dashboard or metric that you create is visible to everyone in your organization who has access to ShipMonk Insights.

Data Dictionary

Data Dictionary contains information about all tables, fields and their descriptions. This mirrors the information found here.

New Questions

A "question" is any saved report — whether built with the visual builder or written in SQL. You can create new questions, edit existing ones, or duplicate a pre-built question to customize it for your needs.

New SQL Questions

For users comfortable with SQL, the native SQL editor gives you full access to query your MDC data directly. See Section 12 for more.

New Dashboards

All users can create their own dashboard and drop any previously created questions on various pages of the dashboard. Filters can be applied to entire dashboard pages or specific cards within a page, default values can also be set so that users always see the intended subset of information first, before filtering the dashboard for themself.


4. Understanding Your Data (Merchant Data Catalog)

All data in ShipMonk Insights comes from the Merchant Data Catalog (MDC) — a set of pre-built, curated data tables (called "marts"). A few things to know upfront:

  • MDC data refreshes daily. All marts run on a full daily refresh. Data visible in Insights reflects the previous day's operations. See Section 11 for specifics.
  • All timestamps are in UTC unless a _local suffix is explicitly noted. When building time-based reports, be mindful of your warehouse's local timezone when interpreting results. Timestamp columns follow the naming convention event_at_utc (e.g., shipped_at_utc, submitted_at_utc).
  • Column descriptions are available. When building a custom question, hover over any field name to see its definition. This documentation is generated directly from the data catalog and kept up to date.
  • Not every column appears in every table. The MDC is designed to minimize required joins, but some fields (particularly at the line and item level) require combining tables. See Section 7 for guidance.

5. Available Tables & What They Cover

The following MDC tables are available in ShipMonk Insights. Each table is prefixed with MARTS_MDC__ or MARTS_MDC_.

Table What It Contains Common Use Cases New Release
ORDER One row per order. Includes order status, processing status, timestamps, store, trading partner, gift info, custom data, ship window, and shipping cost estimates and adjustments. Order volume reporting, SLA analysis, click-to-ship timing Enhanced April 2026
ORDER_LINE One row per order line showing how ShipMonk fulfilled each order. Includes SKU, product, quantities, pricing, all order timestamps, shipping details, and trading partner information. Line-level fulfillment analysis, shipping cost per line, fulfillment status tracking
ORDER_ITEM One row per order item (SKU within a line) as imported or manually created. Source of truth for items submitted to ShipMonk. Includes item-level quantities and product references. Item-level analysis, marketplace SKU breakdowns, bundle resolution (join with BUNDLE_MAPPING) Added April 2026
ORDER_ATTRIBUTE One row per order attribute key-value pair. Custom attributes attached to orders by the merchant or integration. Multiple attributes per order, join with ORDER on order_id. Trading partner filtering, order segmentation by Shopify tag, custom metadata filtering Added April 2026
ORDER_ACTION_REQUIRED One row per action required on an order. Tracks action type, creation and resolution timestamps, and whether the action remains unresolved. Linked to order and warehouse. Monitoring unresolved order issues, action resolution time tracking, ops bottleneck identification Added April 2026
ORDER_PART One row per split-order part. Shows the original order and all parts it was divided into, including the split reason (order_source) and the original top-level order ID. Tracking split orders, understanding split reasons, reconstructing original order intent Added April 2026
ORDER_REFERENCE One row per order reference value. Includes PO numbers, external order IDs, and other reference types. Orders may have multiple references; all are included. Linked by order_id. PO number lookup, cross-system order reconciliation, external ID mapping Added April 2026
ORDER_WITH_REMOVED_ITEMS One row per item manually removed from an order (excludes split-order removals). Includes order number, key, timestamps, SKU, item name, and originally requested quantity. Auditing manual order edits, tracking removed item volume, ops intervention analysis Added April 2026
PACKAGE_LEVEL_DETAIL One row per package. Includes carrier, service, tracking number, zone, weight, estimated and actual shipping costs, and surcharges. Multi-package orders produce multiple rows. Shipping cost analysis, zone distribution reporting, carrier mix analysis
PACKAGE_LINE One row per SKU within a package. Includes package ID, tracking number, weight, packed quantity, product/SKU details, packaging name, zone, delivery timestamp, and transit days (business days). SKU-level package content analysis, delivery time tracking by zone, transit time benchmarking
INVOICE_ITEM One row per billing line item. Includes pick fees, pack fees, storage, and handling charges. Source of truth for invoice detail pages, PDF, and CSV exports. Cost-per-order breakdowns, billing reconciliation, fee category analysis
INVENTORY_LEDGER One row per inventory transaction (delta). Includes transaction type, quantity change, lot, and location. Full historical log of all inventory movements. Inventory movement auditing, shrink and adjustment tracking, lot-level traceability
INVENTORY_LEDGER_7D Rolling 7-day view of inventory ledger transactions. Same structure as INVENTORY_LEDGER, filtered to the latest 7 calendar days for faster querying of recent activity. Velocity analysis, recent movement trends, short-window inventory alerting
INVENTORY_HISTORY Daily snapshot of inventory positions. Tracks stock levels, allocations, expectations, and quarantined inventory across warehouses, accounts, and products over time. On-hand trends over time, allocation vs. available analysis, quarantine monitoring
PRODUCT One row per product. Master data mart including SKU, name, weight, dimensions, HS codes by region, barcodes, attributes, handling units, and integration custom data. Product catalog reference, bundle-to-component mapping, HS code and compliance lookups Added April 2026

Notes on Tables Not to Confuse

  • ORDER vs. ORDER_LINE: Use ORDER for order-level analysis (volume, SLA). Use ORDER_LINE when you need SKU-level or pricing detail. Most order timestamps are duplicated on ORDER_LINE to avoid requiring a join for common metrics.
  • PACKAGE_LEVEL_DETAIL vs. ORDER_LINE: Packaging data (carrier, tracking, cost) lives on PACKAGE_LEVEL_DETAIL. Order data (submitted, shipped timestamps) lives on ORDER_LINE. You need to join these for full click-to-delivery analysis.
  • INVENTORY_LEDGER vs. INVENTORY_HISTORY: LEDGER shows individual transactions (what changed and when). HISTORY shows daily snapshots (what was on hand at end of day). Use LEDGER for reconciliation; use HISTORY for trends.

6. Best Practices: Building Reports

Start with a pre-built report, then customize it

Rather than building from scratch, find the pre-built report closest to what you need, duplicate it, and modify it. Pre-built metrics can be found under Prepared Reports in the Validated Metrics folder. This gives you a working foundation with the right table and filters already in place.

Filter early and filter specifically

Insights works best when you narrow your dataset early. Apply date range filters, store filters, and order type filters before adding additional columns or groupings. This keeps queries fast and results interpretable.

One question, one purpose

Avoid building "everything" into a single question. Instead, build focused questions that answer one specific thing (e.g., "orders shipped by zone this month") and combine them on a dashboard. This makes individual questions easier to reuse and maintain.

Use the column descriptions

Every column in the visual builder has a description available on hover. When you're unsure what a field represents — especially for timestamps or status fields — read the description before using it. This prevents common mistakes like using order_submitted_at_utc when you meant order_shipped_at_utc.

Validate with a small date range first

When building a new report, start with a short date range (e.g., last 7 or 14 days) to validate the structure and output before expanding to a longer period. Large date ranges over complex tables can be slow to return.

Use order event flags to improve accuracy

Several pre-calculated boolean columns help you exclude orders that would otherwise skew your metrics:

  • order_was_backordered — the order was held due to stock unavailability
  • order_was_on_hold — the order was manually placed on hold
  • order_was_action_required — the order required manual intervention before processing

For example, when calculating click-to-ship time, filtering out order_was_backordered = true gives you a more accurate view of your core operational performance. Without this filter, backorder delays will inflate your average fulfillment time and may not reflect ShipMonk's actual throughput.

Don't use order number as a numeric value

Order numbers and similar reference fields (PO number, customer order number, release number) are strings, not numbers — even when they appear numeric. Do not sum, average, or perform math on these fields. Use them only for filtering or grouping.


7. Best Practices: Joining Tables

Use ID fields, not name fields, for joins

Every MDC table includes a standardized set of key columns to make joins unambiguous. Use these IDs (not names, keys or numbers) when joining tables:

Concept ID Field to Join On
Account account_id
Order order_id
Order Line order_line_id
Order Item order_item_id
Product product_id
Package package_id
Return return_id
Claim claim_id
Receiving receiving_id
Warehouse warehouse_id
Store store_id

Name fields (e.g., store_name, account_name) are available for display, but joining on names is fragile — names can change. Always join on IDs.

Watch for fan-out on multi-line orders

If an order has 3 lines and you join ORDER to ORDER_LINE, you will get 3 rows for that order. Summing order-level fields (like order_value) across those rows will triple-count. Always be clear about your grain (order vs. line vs. item) and aggregate accordingly.


8. Best Practices: Working with Timestamps

All timestamps are UTC

Every timestamp in the MDC follows the _at_utc suffix convention (e.g., shipped_at_utc, submitted_at_utc, ordered_at_utc). If you need to convert to your local timezone, apply a timezone offset in your formula or SQL.

Key order timestamps and what they mean

Timestamp What It Captures
ordered_at_utc When the order was first created in the system
created_at_utc When the order record was created (often same as ordered_at_utc)
submitted_at_utc When the merchant submitted the order for fulfillment
picked_at_utc When warehouse picking began
packed_at_utc When the order was packed
shipped_at_utc When the label was generated / order left the facility
picked_up_at_utc When the carrier physically picked up the package
delivered_at_utc When the carrier marked the package as delivered

Calculating common time-based metrics

Click-to-Ship (C2S) shipped_at_utcordered_at_utc Measures total elapsed time from order placement to shipment. Filter out backordered and on-hold orders for a cleaner view.

Submit-to-Ship shipped_at_utcsubmitted_at_utc Measures ShipMonk's processing speed from the moment you submitted the order. This is the most direct measure of warehouse throughput under ShipMonk's control.

Click-to-Delivery (C2D) delivered_at_utcordered_at_utc End-to-end delivery time. Filter out orders where delivered_at_utc IS NULL to exclude in-transit shipments.

Transit Time delivered_at_utcshipped_at_utc Measures carrier performance. Useful for zone-based transit analysis.


9. Common FAQs

Q: Can I build reports that combine order and shipping data in one view?

Yes, but it requires joining ORDER_LINE with PACKAGE_LEVEL_DETAIL. Both tables share order_id as a common key. In the visual builder, use the "Join data" option in the query builder. In the SQL editor, use a standard LEFT JOIN. See Section 7 for the join pattern.

For simpler metrics like click-to-delivery, the ORDER table has all the information you need. That said, if your order is large in size and has to be shipped in multiple packages, you can get a click to delivery by parcel by joining ORDER you may need up to 4 tables: ORDER (for filtering by status or type) and PACKAGE_LEVEL_DETAIL (for shipped, delivered timestamps and shipping cost at the parcel level). Native SQL makes this significantly easier than the visual builder for multi-table queries.

Q: I used to use a BI tool (like PowerBI, Tableau or Looker Studio) and I'm used to being able to pivot data. Can I do that in Insights?

Pivoting large datasets can be handled through the SQL editor as well as the visual builder. The visual builder is optimized for building charts and summaries, but does have limitations around complex aggregations, and multi-table joins. If you find yourself exporting data to Excel to pivot it, that's a strong signal you should be using native SQL queries. See Section 12.

Q: Why does my Click-to-Delivery average look higher than expected?

The most common causes are:

  1. Backordered orders — orders that sat pending for days or weeks before fulfillment inflate your C2D average. Add a filter: order_was_backordered = false.
  2. On-hold orders — same issue. Add: order_was_on_hold = false.
  3. In-transit orders included — if delivered_at_utc is null, the order hasn't been delivered yet and shouldn't be included in a C2D average. Filter: delivered_at_utc IS NOT NULL.
  4. Upstream logic — many merchants have order delays set up in their integration platform and eCommerce system. Created_at_utc is the timestamp associated with the moment a customer completed the purchase in your system. If you have submission delays upstream of ShipMonk (or within ShipMonk) that delay time will be reflected in the metric.

Q: I want a consolidated dashboard that shows all my key KPIs in one place — where do I start?

Start by identifying 3–5 metrics that matter most to your business (e.g., orders shipped per day, C2S average, inventory on hand, return rate). Build a focused question for each one, then pull them onto a single dashboard. Add a date range filter at the dashboard level so it applies to all cards at once.

ShipMonk insights is very intuitive, but does require some time to familiarize yourself with the datasets and visualization tools. To help merchants get to solutions as quickly as possible, the Tech Solutions team can help you build the initial dashboard for a small fee. If you're interested in this option, reach out to support@shipmonk.com to get started.

Q: Can I see inventory by lot number?

Yes. INVENTORY_LEDGER, INVENTORY_LEDGER_7D, and RECEIVING_LINE all include lot_number and lot_expiration_date fields. You can group or filter by lot to see inventory movement or current stock by lot. PRODUCT_WITH_ACTUAL_STOCK does not have a lot-level breakdown — for lot-level stock, use the ledger tables.

Q: I sell on multiple channels (Shopify, B2B, Amazon, etc.). How do I filter by channel?

Use the store_name or store_id field available on most order tables. Usually, each sales channel you have integrated with ShipMonk appears as a distinct store (not the case for API integrated stores, which typically have orders originating from either an ERP, or middleware). For B2B orders routed through trading partners, you can additionally filter by account_trading_partner_name.

To filter out non-standard order types from KPI calculations (e.g., special projects, back-orders, wholesale shipments), use a combination of order_type and trading partner filters. This is especially important for merchants with a mix of D2C and B2B fulfillment, where lumping both together can skew SLA and shipping metrics.

Q: Can I see order reference fields like PO number or customer order number?

Yes. The following reference fields are available on the ORDER and ORDER_LINE tables:

  • ref_order_number
  • ref_customer_order_number
  • ref_po_number
  • ref_reference_number
  • ref_b2b_routing_reference
  • ref_release_number
  • ref_job_number

These are stored as string fields. Use them for filtering or display — do not perform math on them.

Q: My report is loading slowly. What should I do?

A few things to try:

  1. Narrow your date range. Queries over large date ranges on transaction-heavy tables (like INVENTORY_LEDGER or ORDER_LINE) can take time. Start with 30–90 days and expand once you've validated the report.
  2. Reduce the number of joins. Each additional table adds query time. The MDC is designed to be relatively flat — most common fields are available without joining.
  3. Check if the question is cached. Insights uses query caching for common reports. If your question has never been run before, the first load may be slow; subsequent loads will be faster.
  4. Avoid SELECT * in SQL queries. Select only the columns you need.

Q: How do I know when my data was last updated?

Data is refreshed nightly. See Section 11 for specifics.

Q: Can ShipMonk Insights replace my current BI tool?

For merchants whose primary need is operational reporting (order fulfillment, inventory, returns, shipping cost), Insights is designed to handle the majority of those use cases — particularly through the use of SQL queries. For merchants with complex cross-platform data models (e.g., blending ShipMonk data with Shopify revenue, Amazon performance, or ad spend), Insights is best used alongside an external BI tool or data warehouse, with MDC data exported via Snowflake, SFTP, or S3 into your own environment. If you want to programmatically bring data out of insights, you may be better served with Data Catalog, but many merchants find value in having both Insights and Data Catalog subscriptions.


10. Troubleshooting

"I can't find a column I need"

  • Check the column description — the field may exist under a slightly different name. Use the search bar in the column picker.
  • Check a different table — some fields exist on ORDER_LINE but not on ORDER, or vice versa. Use the schema in this documentation as your quick-guide.
  • It may be a new addition — the MDC schema is actively being extended. If a field you need doesn't exist yet, contact your MSM or Tech Solutions team to request it.

"My numbers don't match the ShipMonk app"

  • Verify the date range and timezone you're using. In-app reports may use local warehouse time; Insights uses UTC.
  • Check whether the in-app report excludes certain order types (e.g., B2B, cancelled orders). Add equivalent filters to your Insights question.
  • Confirm your grain — Insights lets you aggregate at the order, line, or item level. Make sure you're summing at the right level.

"Query takes too long"

  • If you are trying to join many-to-many, or are leveraging multiple CTEs, there is a chance your query may time out at five minutes. While rare, we suggest always limiting the number of rows, or filtering for a smaller date range to validate your join and build visualizations before opening the query up to all records.
  • If your query continues to load slowly, or times out often, please reach out to ShipMonk and they can look into precalculating the dataset for you.

"I see duplicate rows in my result"

  • You likely joined two tables where one side has multiple rows per order (e.g., joining ORDER to ORDER_LINE on order_id). Decide whether you want order-grain or line-grain output and aggregate accordingly before or after the join.
  • In SQL, use DISTINCT or GROUP BY to collapse duplicates if appropriate.

"I'm getting null values on timestamps"

  • Not every order will have every timestamp. An order that was never delivered will have a null delivered_at_utc. An order that skipped packing (e.g., slap-a-label) may not have packed_at_utc. Filter out nulls explicitly when calculating averages or time differences.

11. Data Refresh & Availability

  • Refresh cadence: All MDC marts run on a full daily refresh, typically completing in the early morning hours UTC. Data available in Insights reflects the previous calendar day's operations.
  • Latency: Expect data to be 12–24 hours behind real-time. Insights is not designed for live operational monitoring — it is designed for daily, weekly, and monthly analytical reporting. If real-time monitoring is what you're hoping to explore, please reach out to support@shipmonk.com.
  • Historical data: All tables retain a complete history of your data associated with your account.

12. SQL Queries

Who are SQL queries for?

Native SQL queries are available to merchants who want to go beyond the visual query builder. This is ideal if you:

  • Need to join more than 2–3 tables
  • Want to pivot data (e.g., orders by week across multiple columns)
  • Are building complex metrics that require subqueries or CTEs

How to access the SQL editor

  1. From the Insights home or any collection, click New SQL Question
  2. The SQL editor will open with a connection to your MDC schema
  3. Your session is automatically scoped to your account — row-level security ensures you only see your data, even in raw SQL

Schema reference

Your MDC tables are in the MARTS.MERCHANT_DATA_CATALOG schema. You can browse available tables and columns directly in the editor's schema browser (right sidebar after clicking the book icon).

All table names follow the pattern: MARTS_MDC_<TABLE_NAME> (note that some older marts have an additional underscore in their reference)

Sample SQL: Click-to-Delivery Summary

SELECT
    DATE_TRUNC('week', ol.ordered_at_utc)              AS week,
    COUNT(DISTINCT ol.order_id)                        AS total_orders,
    AVG(DATEDIFF('hour', ol.ordered_at_utc,
                          pkg.delivered_at_utc))       AS avg_c2d_hours
FROM MARTS.MERCHANT_DATA_CATALOG.MARTS_MDC__ORDER_LINE  ol
LEFT JOIN MARTS.MERCHANT_DATA_CATALOG.MARTS_MDC__PACKAGE_LEVEL_DETAIL pkg
    ON ol.order_id = pkg.order_id
WHERE ol.ordered_at_utc >= DATEADD('day', -90, CURRENT_DATE)
  AND pkg.delivered_at_utc IS NOT NULL
  AND ol.order_was_backordered = FALSE
  AND ol.order_was_on_hold = FALSE
GROUP BY 1
ORDER BY 1 DESC;

Sample SQL: Inventory Adjustments by Type

SELECT
    DATE_TRUNC('month', transaction_at_utc)  AS month,
    transaction_type,
    SUM(quantity_change)                     AS net_adjustment,
    SUM(ABS(quantity_change))                AS gross_adjustment
FROM MARTS.MERCHANT_DATA_CATALOG.MARTS_MDC__INVENTORY_LEDGER
WHERE transaction_at_utc >= DATEADD('month', -6, CURRENT_DATE)
  AND transaction_type = 'adjustment'
GROUP BY 1, 2
ORDER BY 1 DESC;

Tips for writing efficient SQL

  • Filter on date columns early — add a WHERE clause on a timestamp column in every query to limit the scan.
  • Use LIMIT while developing — add LIMIT 100 to your query while you're testing to avoid waiting on a full result set.
  • Avoid SELECT * — name only the columns you need. Wide tables can slow down queries significantly.
  • Reference column descriptions — the schema browser in the SQL editor shows column descriptions. Use these to confirm you have the right field before writing a complex query.
  • Join on IDs, not names — see Section 7 for the recommended join keys across tables.

For questions not covered here, contact your Merchant Success Manager or reach out to the Tech Solutions team through the standard intake form.

13. Additional nuances: Timestamps & Order Lifecycle

  • packed_at_utc can be null for slap-a-label orders, where a label is generated directly without a pack step.
  • picked_at_utc can be null for orders fulfilled from a single bin location where no discrete pick scan occurs.
  • submitted_at_utc can lag ordered_at_utc by days or weeks if a merchant submits orders in batches rather than immediately on creation. This makes ordered_at_utc a poor starting point for fulfillment SLA calculations.
  • delivered_at_utc will be null for in-transit packages — always filter delivered_at_utc IS NOT NULL before computing any C2D average, or your average will be dragged down toward zero for recent dates.
  • marketplace_notified_at_utc and loaded_to_truck_at_utc are frequently null — these events are only recorded for specific carrier/integration configurations and should not be assumed to exist.
  • picked_up_at_utc vs shipped_at_utc are not interchangeable. shipped_at_utc is when the label is generated; picked_up_at_utc is when the carrier physically scans it at pickup. Dwell time between these two can be hours or even overnight for end-of-day pickups.
  • ordered_at_utc and created_at_utc are usually identical but can differ when an order is imported via an integration with a delay. Don't assume they're always the same in calculated fields.

Order Status & Filtering

  • order_was_backordered = true does not mean the order is still backordered. It means the order was at some point backordered. The flag is historical, not current state. order_was_backordered exists to help merchants understand row-level outliers which may otherwise skew calculated metrics without being able to quickly understand why. Use processing_status for real-time state.
  • Cancelled orders appear in ORDER and ORDER_LINE. They are not excluded by default. Filter order_status != 'cancelled' if you don't want them inflating order counts.
  • order_type determines D2C vs B2B routing, but an order can be order_type = 'direct_to_consumer' and still have a trading_partner assigned if the merchant uses trading partners as a distributor tagging mechanism (not for wholesale routing). Be cautious when using trading partner presence as a proxy for B2B.
  • Special project orders (e.g., kitting jobs, bulk packs) often have no delivered_at_utc — they don't ship via standard carrier flows. Filtering on order_type or store_name is the safest way to exclude them.
  • processing_status and order_status are different fields. order_status reflects the merchant-visible state. processing_status reflects the warehouse's internal workflow state. You may need both to correctly diagnose where an order is stuck.

Multi-Line & Multi-Package Orders

  • One order can produce multiple packages. A single order_id in PACKAGE_LEVEL_DETAIL may have multiple rows — one per package. Summing estimated_total_charge across all rows gives total shipping cost; don't take a single row and assume it represents the full order.
  • One order line does not equal one package. Items from different order lines may ship together, and a single line may be split across packages. Don't join ORDER_LINE to PACKAGE_LEVEL_DETAIL 1:1 expecting clean results.
  • Click-to-delivery for multi-package orders is ambiguous — do you use the first package delivered or the last? For customer experience metrics, the last delivery date is more meaningful.
  • ORDER_LINE has multiple rows per order whenever a merchant orders multiple SKUs. Summing an order-level field (e.g., order_value) across all lines will multiply it by the number of lines. Aggregate at the right grain.

Inventory & Receiving

  • inventory_on_hand in INVENTORY_HISTORY is a daily snapshot — it reflects end-of-day position, not real-time. A spike or dip within a day won't appear.
  • Inventory adjustments can be positive or negative. When calculating "gross shrink," use SUM(ABS(quantity_change)) for adjustments. SUM(quantity_change) gives you net, which can be near zero even if large offsetting adjustments occurred.
  • Lot inventory and non-lot inventory for the same SKU are tracked separately. While very rare, if a SKU has both lot-controlled and non-lot-controlled stock at any point in its lifetime, you may see separate rows in the ledger. Summing without awareness of this can give unexpected totals.
  • Receiving discrepancies of type exact still exist as rows in RECEIVING_LINE — don't assume that filtering to over/under is exhaustive if you want all receiving events.
  • A receiving line can be reopened, which creates additional ledger entries. If you're reconciling received quantities, make sure you're looking at the final state, not the first entry.
  • lot_expiration_date in INVENTORY_LEDGER reflects the lot at transaction time — if a lot's expiration date is corrected after receiving, older ledger rows won't be updated retroactively.

Returns

  • Return lines can have a null disposition if the return was received but not yet processed. Filter disposition IS NOT NULL if you want only fully processed returns.
  • The same returned unit may generate multiple RETURN_LINE rows if it was partially processed, disputed, or had a correction applied. Deduplication on return_line_id is safe; deduplication on sku + return_id alone is not.
  • dock_to_processed time is meaningless for returns that were never fully processed. Always pair this metric with a filter for completed/closed return status.

Billing & Costs

  • Invoice items include charges that are not order-specific (e.g., monthly storage fees, account-level charges). These rows will have order_id IS NULL. If you join INVOICE_ITEM to ORDER on order_id, these rows will be dropped — which may or may not be what you want.
  • Pick fees and pack fees are separate line items. A cost-per-order calculation needs to sum both.
  • Shipping surcharges in ORDER_SHIPPING_SURCHARGES are additive on top of the base rate in PACKAGE_LEVEL_DETAIL. If you use only estimated_total_charge from package level detail, you may already have surcharges included — double check before also summing from the surcharges table.
  • adjusted_total_charge reflects what was actually billed, estimated_total_charge reflects the original estimate. For true cost analysis, use adjusted. Estimated is useful for modeling, not for reconciliation.

Columns That Look Numeric But Aren't

  • order_number, po_number, customer_order_number, tracking_number are strings, even when they contain only digits. Sorting, averaging, or summing them produces nonsense.
  • warehouse_id and account_id are numeric IDs — but they should only be used for joining or filtering, never for aggregation. Adding them together is meaningless.
  • zone (shipping zone) looks like a number (1–8) but behaves categorically. Zone 7 is not "twice zone 3.5" — don't average zones and expect a meaningful result. Use zone as a grouping dimension only. Note: Zones outside of the United States and in select area codes may have zone values much higher than 1–8; be sure to look at underlying data before producing final metrics.

Joins That Seem Obvious But Aren't

  • store_name varies in casing and formatting across tables — it appears as store_name in some tables and store in others. Confirm which column applies to the table you're using before joining on it.
  • warehouse_id and warehouse_code are not always interchangeable across tables — some tables expose one, some the other. Joining on the wrong one silently drops rows.
  • Joining ORDER_LINE to PACKAGE_LEVEL_DETAIL on order_id produces a fan-out if either side has multiple rows. You often need to aggregate PACKAGE_LEVEL_DETAIL to the order level first (e.g., MIN(shipped_at_utc), SUM(adjusted_total_charge)) before joining.

Last updated: May 2026