Leeds City Council's digital content team is sitting on a problem that has compounded quietly for years: its online image library contains an estimated 14,000 duplicate or near-duplicate photographs, according to figures presented at a scrutiny board meeting in June 2026. The duplication spans everything from planning application documents on the council's planning portal to promotional imagery on the Visit Leeds website.
The timing matters. The council is midway through a wider digital transformation programme, having committed £4.2 million to upgrading its content management infrastructure by the end of the 2026–27 financial year. Carrying redundant image data into a new system does not merely waste storage: it inflates migration costs, slows page loads for residents on poor connections, and creates confusion when outdated photographs of regenerated sites like Kirkgate Market or the South Bank development appear alongside current planning documents.
What the Audit Numbers Reveal
The June audit, conducted internally by the council's Digital Services directorate, found that roughly 31 percent of all images stored across the council's public-facing platforms were either exact duplicates or visually near-identical copies differentiated only by file name or metadata timestamp. That figure is drawn from a sample of approximately 45,000 assets catalogued across five content repositories, including the main leeds.gov.uk content management system and the Leeds Libraries digital collections portal.
Storage costs are not trivial at this scale. Hosting an unoptimised image library of this size costs between £18,000 and £24,000 more per year than hosting a deduplicated one, based on standard AWS S3 pricing tiers for comparable data volumes. The council has not published its own breakdown, but the Digital Services team described the figure at the scrutiny session as a conservative estimate.
The duplication problem is partly a legacy of separate uploading workflows across different departments. Leeds City Council operates more than 40 discrete service microsites — covering everything from adult social care to the parks service at Temple Newsam — and until 2024 each team managed its own image uploads without a shared taxonomy or deduplication check at upload. A photograph of the Merrion Centre taken in 2019 might exist in six separate folders across three different sub-sites, each with a different filename and no cross-reference.
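To illustrate how the same photograph can be traced across folders when its filename differs each time, here is a minimal sketch in Python that groups files by a hash of their contents rather than by name. It is not the council's tool, whose internals have not been published; the function name `find_exact_duplicates` and the folder layout are assumptions made for illustration only.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_exact_duplicates(root: Path) -> dict[str, list[Path]]:
    """Group files under `root` by the SHA-256 digest of their bytes.

    Hypothetical helper: files with identical bytes share a digest,
    whatever they are called and wherever they are stored.
    """
    by_digest: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_digest[digest].append(path)
    # Keep only digests shared by two or more files.
    return {d: paths for d, paths in by_digest.items() if len(paths) > 1}
```

Run against a shared root directory, a check like this would return one group per duplicated image, listing every path where a copy sits. It reads each file fully into memory, which is acceptable for photographs but would need chunked reads for very large files.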
The Fix — and What It Will Cost
The council's Digital Services team has been rolling out an automated deduplication tool across its primary CMS since April 2026, starting with the approximately 12,000 assets flagged as exact byte-for-byte matches. That phase is expected to complete by September 2026. Near-duplicate resolution, which covers images that show the same picture but differ in crop, compression, or resolution, requires manual review or AI-assisted tagging and is scheduled to run through to March 2027.
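Near-duplicates are harder than exact matches because recompressing or resizing an image changes its bytes. One common approach, shown here purely as an illustration and not as the council's method, is a perceptual "difference hash" compared by Hamming distance. The sketch assumes each image has already been decoded and downscaled to a small greyscale grid (9 pixels wide by 8 high) by an image library, and the threshold of 5 differing bits is an arbitrary assumption. A hash of this kind tolerates compression and resolution changes reasonably well but is much less reliable for heavy crops, which is one reason manual review may still be needed.

```python
def dhash(pixels: list[list[int]]) -> int:
    """Difference hash: one bit per horizontally adjacent pixel pair.

    `pixels` is assumed to be a greyscale grid (rows of 0-255 values),
    already downscaled, e.g. 8 rows of 9 pixels for a 64-bit hash.
    """
    bits = 0
    for row in pixels:
        for left, right in zip(row, row[1:]):
            # Record whether brightness falls from left to right.
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: list[list[int]], b: list[list[int]],
                      max_distance: int = 5) -> bool:
    """Treat two grids as near-duplicates if their hashes differ in
    at most `max_distance` bits (threshold is an assumption)."""
    return hamming(dhash(a), dhash(b)) <= max_distance
```

Because the hash records only whether each pixel is brighter than its neighbour, a uniformly brightened or mildly recompressed copy tends to produce the same or a very similar hash, while an unrelated image differs in many bits.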
Leeds Beckett University's School of Computing, based on Headingley Campus, has been in discussions with the council about a research partnership to develop a locally trained image-similarity model suited to civic photography datasets. No contract has been signed as of the date of publication.
For residents and local businesses submitting planning applications through the Civic Hall planning portal on Calverley Street, the practical impact could be measurable. Deduplication reduces the risk of caseworkers referencing an outdated site photograph — a recurring complaint from applicants in the Holbeck and Beeston Hill regeneration zones, where streetscapes have changed substantially since 2020.
The council plans to publish a full asset audit report by October 2026. In the meantime, any organisation uploading documents to a Leeds City Council-managed platform is advised to check the council's updated image submission guidelines, published on leeds.gov.uk in May 2026, which now require a unique filename convention and a minimum metadata tag set before the system accepts an upload. The change is small, but it is the kind of procedural fix that, applied consistently, stops the numbers behind problems like this from growing.
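For organisations wanting to pre-check files before submission, an upload rule of this kind can be sketched in a few lines of Python. The council's actual filename convention and required tags are not reproduced in this article, so the pattern `<department>_<subject>_<YYYYMMDD>.<ext>` and the four tag names below are hypothetical placeholders; the real values should be taken from the published guidelines. The sketch also does not check that a name is unique within the existing library.

```python
import re

# Hypothetical convention: <department>_<subject>_<YYYYMMDD>.<ext>
FILENAME_PATTERN = re.compile(r"^[a-z]+_[a-z0-9-]+_\d{8}\.(jpg|jpeg|png)$")

# Hypothetical minimum tag set; not taken from the council's guidelines.
REQUIRED_TAGS = {"title", "location", "date_taken", "department"}


def validate_upload(filename: str, metadata: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the upload passes."""
    problems = []
    if not FILENAME_PATTERN.fullmatch(filename):
        problems.append(f"filename does not match convention: {filename}")
    # A tag counts as missing if it is absent or blank.
    missing = sorted(
        tag for tag in REQUIRED_TAGS if not metadata.get(tag, "").strip()
    )
    if missing:
        problems.append("missing metadata tags: " + ", ".join(missing))
    return problems
```

Returning a list of problems rather than a simple pass or fail lets an upload form tell the submitter exactly what to correct.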