Case study

File-level Media Normalization: A 37,614-Photo Archive Case Study

A real-world normalization run across 311 source folders and 18 years of media, showing how a large archive can be rebuilt into a deterministic structure before any catalog or DAM.

This page documents the archive, the setup, the execution timeline and the resulting structure, and shows how file-level media normalization performs in practice on a large, real-world archive.

37,614 media files processed
311 source directories
9h 08m total processing time
0 runtime errors

The archive before normalization

The source archive contained media accumulated over many years across disconnected folder trees, inconsistent naming conventions and mixed device origins.

In total, the dataset included:

  • 37,614 media files
  • 311 source directories
  • date span from 2001 to 2019

Environment and setup

This run was performed locally on a Mac using MediaOrganizer Studio, with the archive read from an HDD.

What normalization means here

In this workflow, normalization means rebuilding the archive according to media metadata rather than preserving inherited folder logic from old exports, copied drives or ad hoc manual structures.

Country / State / City / Year / Month / Day / Timestamp.ext

The result is a structure that is deterministic, readable and rebuildable, and that can then feed Lightroom, Apple Photos or any DAM from a much cleaner foundation.
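As a minimal sketch of how the path template above could be derived from metadata alone, the Python function below builds a destination path from a capture time and optional location fields. The function name, the `Unknown` placeholder for missing location data, the `YYYYMMDD_HHMMSS` timestamp format and the sample values are all assumptions made for illustration; the source does not document how MediaOrganizer names files or places media without GPS data.

```python
from datetime import datetime
from pathlib import PurePosixPath

# Placeholder for missing location fields. This is an assumption for the
# sketch, not documented MediaOrganizer behaviour.
UNKNOWN = "Unknown"

def normalized_path(captured_at, ext, country=None, state=None, city=None):
    """Build Country/State/City/Year/Month/Day/Timestamp.ext from metadata only."""
    location = [part or UNKNOWN for part in (country, state, city)]
    date_parts = [f"{captured_at:%Y}", f"{captured_at:%m}", f"{captured_at:%d}"]
    # Hypothetical timestamp format; the extension is lower-cased for consistency.
    filename = f"{captured_at:%Y%m%d_%H%M%S}.{ext.lstrip('.').lower()}"
    return PurePosixPath(*location, *date_parts, filename)

# Invented sample metadata, for illustration only.
example = normalized_path(datetime(2014, 7, 19, 16, 42, 5), "JPG", "Brazil", "SP", "Campinas")
print(example)  # Brazil/SP/Campinas/2014/07/19/20140719_164205.jpg
```

Because the path depends only on the metadata passed in, and not on where the file was found, the same inputs always produce the same destination. That is the sense in which the structure is deterministic and rebuildable.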

Loading and discovery phase

Initial indexing of the archive took approximately 54.38 seconds.

Indexing runs as its own phase: discovery happens first, normalization second. In large archives, recursive visibility into nested folders is what makes deterministic rebuilding possible.
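A discovery pass of this kind can be sketched in a few lines of Python. This is an illustration of a recursive, read-only index, not MediaOrganizer's implementation; the `discover` function and the extension list are assumptions.

```python
import os
from pathlib import Path

# Extensions treated as media in this sketch; the real list is an assumption.
MEDIA_EXTENSIONS = {".jpg", ".jpeg", ".png", ".heic", ".mov", ".mp4"}

def discover(root):
    """Recursively index media files under root without moving or renaming anything."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # fixed traversal order, so repeated scans list files identically
        for name in sorted(filenames):
            if Path(name).suffix.lower() in MEDIA_EXTENSIONS:
                found.append(Path(dirpath) / name)
    return found
```

Keeping this step read-only means the full file list is known before anything is written, so the later rebuild works from a complete and stable view of the archive.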

Normalization pipeline in execution

Runtime:

  • Start: 11:55:16
  • End: 21:03:52
  • Total: 9h 08m 36s (approximately 9h 08m)
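The total can be checked directly from the start and end times, and combined with the file count to get an average throughput. The short Python calculation below assumes both timestamps fall on the same day, which the stated total of about 9h 08m implies. The per-file figure is an average over the whole run, not a measurement of any single file.

```python
from datetime import datetime

start = datetime.strptime("11:55:16", "%H:%M:%S")
end = datetime.strptime("21:03:52", "%H:%M:%S")
elapsed = end - start  # assumes start and end are on the same day
files = 37_614         # files processed, from the run summary

seconds = elapsed.total_seconds()
print(elapsed)                    # 9:08:36
print(round(files / seconds, 2))  # 1.14 files per second
print(round(seconds / files, 3))  # 0.875 seconds per file
```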

Final result

Final run summary:

  • Processed: 37,614
  • Duplicated: 187
  • No GPS: 13,332
  • Warnings: 13,519
  • Errors: 0
  • SQLite cache: 23,306

The most important result here is not the number of files processed on its own, but the combination of scale, determinism and zero runtime errors.
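The counters in the run summary can be cross-checked with a few lines of Python. One observation from the numbers: the warning total equals the duplicate count plus the no-GPS count. The source does not state that warnings are counted this way, so treat that as an observation about this run, not a documented rule. The `with_gps` figure likewise assumes that every processed file not listed under "No GPS" carried GPS data.

```python
summary = {
    "processed": 37_614,
    "duplicated": 187,
    "no_gps": 13_332,
    "warnings": 13_519,
    "errors": 0,
}

# Assumption: processed files not counted as "No GPS" had usable GPS metadata.
with_gps = summary["processed"] - summary["no_gps"]
no_gps_share = summary["no_gps"] / summary["processed"]

print(with_gps)               # 24282
print(f"{no_gps_share:.1%}")  # 35.4%
# Observation only: duplicates + no-GPS matches the warning total in this run.
print(summary["duplicated"] + summary["no_gps"] == summary["warnings"])  # True
```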

What this demonstrates

This run demonstrates that file-level media normalization is more than a concept: it works on large archives with mixed history, mixed folder logic and long time spans. In this run, the approach proved to be:

  • deterministic – files are rebuilt into a stable metadata-based structure.
  • scalable – tens of thousands of files can be processed in one workflow.
  • catalog-friendly – the resulting archive is easier to feed into Lightroom, Apple Photos or another DAM.
  • rebuildable – the archive no longer depends on inherited source folders.

Relationship to media normalization

The results observed in this case study (deterministic structure, stable identity and controlled handling of duplicates) are direct consequences of applying file-level media normalization as a foundational step before any catalog-based organization.

This indicates that the principles described in the core concept hold in practice, not only in theory, on a large and fragmented archive.

Conclusion

The archive shown here started as a fragmented collection of folders accumulated over nearly two decades.

After normalization, it became a deterministic archive layer that is independent of any single catalog or vendor tool.

That is the role of MediaOrganizer: not to replace catalogs, but to prepare the archive beneath them.

Want the implementation behind this workflow? See MediaOrganizer or start with the core concept page.
