Benchmark report

SmartPacker T1 Loading Benchmark

A deterministic 10,000-case single-SKU 40HQ benchmark with published input cases, transparent heuristic baselines, and downloadable reproduction files.

10,000 seeded 40HQ test cases across 10 carton strata
7,961 wins / 0 losses vs the transparent one-cut heuristic
+2.633% average carton-count gain vs the one-cut heuristic

Published Reproduction Package

The corpus is fixed by seed 20260619. The reproduction package consists of the CSV cases, a machine-readable summary, and a ZIP containing the Python generator. The generator uses only Python standard-library modules for case generation and the heuristic baselines.

Results

Many public container-loading calculators expose only one answer for one manually entered case. This report publishes the case generator and the full input corpus, so the result is not a single showcase example.

The benchmark uses one carton type loaded into a fixed 40HQ-style interior space of 12032 x 2352 x 2698 mm. All six axis-aligned carton orientations are allowed. SmartPacker T1 is compared with transparent baselines that a reader can inspect and rerun.
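As a rough illustration of what a single-grid baseline can look like under these rules, the sketch below tries all six axis-aligned orientations and keeps the best uniform grid. It is a minimal sketch, not the code from the published generator; the function name `single_grid_count` is hypothetical, and dimensions are assumed to be positive integers in millimetres.

```python
from itertools import permutations

# Interior loading space from the report: length, width, height in mm.
CONTAINER_MM = (12032, 2352, 2698)


def single_grid_count(container, carton):
    """Best carton count from one uniform grid (hypothetical helper).

    Every permutation of the carton edges is one axis-aligned orientation.
    For each orientation the count is the product of whole cartons that
    fit along each container axis.
    """
    best = 0
    for dims in set(permutations(carton)):
        count = 1
        for room, size in zip(container, dims):
            count *= room // size  # whole cartons along this axis
        best = max(best, count)
    return best


if __name__ == "__main__":
    print(single_grid_count(CONTAINER_MM, (600, 400, 350)))  # prints 720
```

For the 600 x 400 x 350 mm carton used in the public-app spot check in this report, this sketch yields 720 cartons.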

June 19, 2026 deterministic 10,000-case benchmark
Comparison | Wins | Ties | Losses | Average gain
T1 depth 3 vs one-cut heuristic | 7,961 | 2,039 | 0 | +2.633% cartons
T1 depth 3 vs single-grid baseline | 9,109 | 891 | 0 | +9.784% cartons
T1 depth 3 vs T1 depth 2 | 5,301 | 4,699 | 0 | +0.453% cartons
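The columns in this table can be derived from per-case carton counts roughly as follows. This is a sketch under two assumptions that the report does not confirm: that "average gain" is the mean of per-case percentage gains, and that cases where the baseline loads zero cartons are left out of that mean. The helper name `compare_counts` is hypothetical.

```python
def compare_counts(t1_counts, baseline_counts):
    """Tally wins, ties, losses and the mean percentage gain per case.

    Assumption: gain for one case is 100 * (t1 - baseline) / baseline,
    and cases with a zero baseline are skipped in the mean.
    """
    wins = ties = losses = 0
    gains = []
    for t1, base in zip(t1_counts, baseline_counts):
        if t1 > base:
            wins += 1
        elif t1 == base:
            ties += 1
        else:
            losses += 1
        if base > 0:
            gains.append(100.0 * (t1 - base) / base)
    average_gain = sum(gains) / len(gains) if gains else 0.0
    return {"wins": wins, "ties": ties, "losses": losses,
            "average_gain_percent": average_gain}


if __name__ == "__main__":
    # Toy numbers for illustration only, not taken from the benchmark CSV.
    print(compare_counts([105, 100, 50], [100, 100, 40]))
```

With the toy input, two cases are wins, one is a tie, and the per-case gains of 5%, 0%, and 25% average to 10%.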

Median fill rate

95.953% across the 10,000 depth-3 T1 results.

Best transparent baseline

The one-cut baseline splits the loading space into two slabs with a single guillotine cut and fills each slab with its own uniform grid.
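One way to read the one-cut description is sketched below. It is an illustrative interpretation, not the code from the published generator: the names `grid_count` and `one_cut_count` are hypothetical, and the choice of candidate cut positions (whole multiples of a carton edge along each axis) is an assumption.

```python
from itertools import permutations


def grid_count(space, carton):
    """Best uniform-grid count over all axis-aligned orientations."""
    best = 0
    for dims in set(permutations(carton)):
        count = 1
        for room, size in zip(space, dims):
            count *= room // size
        best = max(best, count)
    return best


def one_cut_count(container, carton):
    """Best count using at most one guillotine cut (hypothetical sketch).

    Assumption: cuts are tried on every axis at whole multiples of each
    carton edge; both resulting slabs are then gridded independently.
    """
    best = grid_count(container, carton)  # the no-cut case
    for axis in range(3):
        length = container[axis]
        cuts = {k * edge
                for edge in set(carton)
                for k in range(1, length // edge + 1)}
        for cut in cuts:
            if cut >= length:
                continue  # a cut at the wall leaves no second slab
            first = list(container)
            second = list(container)
            first[axis] = cut
            second[axis] = length - cut
            total = grid_count(first, carton) + grid_count(second, carton)
            best = max(best, total)
    return best


if __name__ == "__main__":
    print(one_cut_count((12032, 2352, 2698), (600, 400, 350)))
```

Because the no-cut case is included, this sketch never returns fewer cartons than a single grid, and because both slabs are valid layouts it never exceeds the volume bound.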

Volume bound

The CSV includes a raw volume upper bound (container volume divided by carton volume). Real layouts cannot always reach it, because cartons must fit as whole boxes along each axis.
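The gap between the volume bound and what fits geometrically is easy to see in a small example. The sketch below assumes the bound is the container volume divided by the carton volume, rounded down; the function name `volume_upper_bound` is hypothetical and the exact CSV definition may differ.

```python
def volume_upper_bound(container, carton):
    """Raw volume bound: whole cartons by volume alone (assumed definition)."""
    cx, cy, cz = container
    bx, by, bz = carton
    return (cx * cy * cz) // (bx * by * bz)


if __name__ == "__main__":
    # The 40HQ-style space and spot-check carton from this report.
    print(volume_upper_bound((12032, 2352, 2698), (600, 400, 350)))  # 908

    # Toy case: a 51 mm cube in a 100 mm cube. By volume 7 would fit,
    # but two cubes side by side need 102 mm, so only 1 fits.
    print(volume_upper_bound((100, 100, 100), (51, 51, 51)))  # 7
```

The first call reproduces the 908-carton bound quoted in the spot-check table; the toy case shows a bound of 7 where only one carton can physically be placed.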

Public App Spot Checks

We also ran a dated, no-login public-app spot check on June 19, 2026. The exact test case was a 12032 x 2352 x 2698 mm loading space with 600 x 400 x 350 mm cartons, allowing all six orientations where the public app exposed orientation controls. This is a small public verification sample, not a vendor-neutral leaderboard.

Single-case public-app results observed on June 19, 2026
Tool | Access | Dimension match | Reported result | How to read it
SmartPacker public MCP | No account | Exact | 876 cartons, 96.4% | Reference result from the public depth-2 live-layout service.
Pier2Pier 3D Load Calculator | No account | Exact | 820 cartons, 90.2% | Strict public competitor spot check. SmartPacker loaded 56 more cartons, about 6.8% more than Pier2Pier on this case.
SeaRates Load Calculator | No account | Built-in 40' High Cube, not exact custom dimensions | 770 cartons, 84% | Useful public-app datapoint, but not a strict exact-dimension comparison.
CargoesPi Container Loading Calculator | No account | Exact entered dimensions | 920 cartons, 101.2% | Not used as a valid competitor score because it exceeds the 908-carton raw volume upper bound.
EasyCargo | Sign-up / free-trial gate | Not run | No public no-login result | Requires account access before comparable calculation.
Goodloading | Account / demo gate | Not run | No public no-login result | Requires account or demo access before comparable calculation.
CargoLoader3D | Login gate | Not run | No public no-login result | Requires login before comparable calculation.
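The arithmetic behind the exact-dimension rows can be rechecked with a few lines. The sketch below recomputes fill percentages from the reported carton counts, flags any count above the volume bound, and derives the relative gain over Pier2Pier. The helper name `fill_percent` is hypothetical; the SeaRates row is left out because it did not use the exact custom dimensions.

```python
CONTAINER_MM = (12032, 2352, 2698)
CARTON_MM = (600, 400, 350)


def volume(dims):
    x, y, z = dims
    return x * y * z


def fill_percent(count, container=CONTAINER_MM, carton=CARTON_MM):
    """Share of the loading space occupied by `count` cartons, in percent."""
    return 100.0 * count * volume(carton) / volume(container)


# Carton counts as reported in the spot-check table above.
REPORTED = {
    "SmartPacker public MCP": 876,
    "Pier2Pier 3D Load Calculator": 820,
    "CargoesPi Container Loading Calculator": 920,
}

if __name__ == "__main__":
    bound = volume(CONTAINER_MM) // volume(CARTON_MM)
    for tool, count in REPORTED.items():
        status = "exceeds volume bound" if count > bound else "within bound"
        print(f"{tool}: {count} cartons, {fill_percent(count):.1f}%, {status}")

    gain = 100.0 * (876 - 820) / 820
    print(f"SmartPacker vs Pier2Pier: +{876 - 820} cartons, {gain:.1f}%")
```

A fill percentage above 100% is the same condition as a count above the volume bound, which is why the 920-carton result is excluded as a valid score.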

Depth 3 Release Policy

This report uses local count-only SmartPacker T1 depth-3 batch results. It is not a claim that the public MCP placed-layout endpoint runs depth 3.

The public MCP service remains a fast depth-2 live-layout service because it must return placed cartons and a 3D image within the public timeout. Depth 3 will be released in the downloadable T1 desktop app and a local CLI, so anyone can run the benchmark on their own PC and inspect generated loading plans locally without adding load to the public server.

Reproduce

To regenerate the same cases and heuristic baselines from the extracted script:

python container_loading_benchmark_generator.py --out-dir . --skip-packcli

After the local T1 CLI package is released, run the same generator with the CLI path to add SmartPacker T1 counts:

python container_loading_benchmark_generator.py --out-dir . --packcli path\to\PackCli.exe

The current published CSV already includes the depth-0 through depth-3 T1 count columns from the internal June 19, 2026 run.

By Stratum

Depth-3 T1 vs one-cut heuristic by generated carton class
Stratum | Cases | Wins | Ties | Losses | Average gain | Average T1 fill
small_dense | 1,000 | 964 | 36 | 0 | +1.743% | 98.947%
medium_carton | 1,000 | 874 | 126 | 0 | +2.615% | 95.908%
large_carton | 1,000 | 600 | 400 | 0 | +3.508% | 85.887%
flat_panel | 1,000 | 909 | 91 | 0 | +3.295% | 94.126%
tall_column | 1,000 | 841 | 159 | 0 | +3.224% | 93.922%
long_case | 1,000 | 483 | 517 | 0 | +1.713% | 86.458%
near_divisor | 1,000 | 773 | 227 | 0 | +2.063% | 96.413%
awkward_remainder | 1,000 | 861 | 139 | 0 | +3.053% | 94.837%
mixed_scale | 1,000 | 775 | 225 | 0 | +2.525% | 92.455%
realistic_carton | 1,000 | 881 | 119 | 0 | +2.593% | 96.045%
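The per-stratum rows can be cross-checked against the headline figures. The sketch below copies the wins, ties, losses, and average gains from the table and confirms that they add up to the overall one-cut comparison. An unweighted mean of the stratum gains is used, which is valid here only because every stratum has the same 1,000 cases.

```python
# (stratum, wins, ties, losses, average gain in percent), from the table above.
STRATA = [
    ("small_dense", 964, 36, 0, 1.743),
    ("medium_carton", 874, 126, 0, 2.615),
    ("large_carton", 600, 400, 0, 3.508),
    ("flat_panel", 909, 91, 0, 3.295),
    ("tall_column", 841, 159, 0, 3.224),
    ("long_case", 483, 517, 0, 1.713),
    ("near_divisor", 773, 227, 0, 2.063),
    ("awkward_remainder", 861, 139, 0, 3.053),
    ("mixed_scale", 775, 225, 0, 2.525),
    ("realistic_carton", 881, 119, 0, 2.593),
]

total_wins = sum(row[1] for row in STRATA)
total_ties = sum(row[2] for row in STRATA)
total_losses = sum(row[3] for row in STRATA)
mean_gain = sum(row[4] for row in STRATA) / len(STRATA)

if __name__ == "__main__":
    print(total_wins, total_ties, total_losses)  # 7961 2039 0
    print(f"{mean_gain:.3f}%")                   # 2.633%
```

The totals match the 7,961 wins, 2,039 ties, 0 losses, and +2.633% average gain reported for T1 depth 3 vs the one-cut heuristic.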

Scope and Caveats

  • This is a single-SKU benchmark. It does not test mixed-SKU loading, pallet rules, fragile cargo, axle weight, loading sequence, or manual edit workflows.
  • The transparent baselines are published for reproducibility. They are not a substitute for every commercial product's private algorithm.
  • The named public-app results above are single-case spot checks. Bulk automated testing of competitor web apps can be fragile and may violate account or rate-limit rules.
  • Future report revisions should add downloadable desktop-app and CLI verification instructions once the public local package is upgraded.

Next Step

The next product step is to upgrade the downloadable T1 desktop app and add a local CLI package. That release will make depth-3 verification self-contained: run the CLI for the benchmark count, then open the same case in the local app to inspect the loading plan.

Questions and comments: zhzx@zhihuo.com.