Oct 5 – 9, 2026
Karlsruhe Institute of Technology (KIT)
Europe/Berlin timezone

From Barcodes to Metadata Integrity: A Journey of Data Cleaning in Koha

Oct 6, 2026, 3:45 PM
20m
Audimax (Karlsruhe Institute of Technology (KIT))

Str. am Forum 1, 76131 Karlsruhe
Presentation

Speakers

Ingrid Schiessl (Instituto Brasileiro de Informação em Ciência e Tecnologia (Brazilian Institute of Information in Science and Technology))

Rebeca Dos Santos De Moura (Instituto Brasileiro de Informação em Ciência e Tecnologia)

Description

This presentation details a large-scale data intervention within the library network of the Brazilian Institute of Museums (Ibram). What began as a straightforward request to populate the barcode field (**952$p**) for a specific set of records evolved into a comprehensive database audit. Our initial strategy of mirroring inventory numbers from `952$i` to `952$p` was immediately blocked by Koha’s uniqueness constraint on barcodes. This technical 'roadblock' revealed a legacy of duplicated inventory data across the network, forcing a deeper dive into the catalog’s health. Using **Python scripts for MARC XML manipulation**, we moved beyond simple item-level fixes to address critical inconsistencies in the 245 (Title Statement) and 008 (Fixed-Length Data) fields. This session will show how we audited thousands of records outside the ILS to ensure metadata precision before re-importing them via Koha’s batch tools.

We will demonstrate the workflow used to:

- Identify and Resolve Duplicates: Using Python scripts to audit the entire database, across a multi-library network, for duplicate `952$i` (Inventory) values before migrating them to the unique `952$p` (Barcode) field.
- XML Manipulation: Processing MARC XML records to clean and standardize metadata with a level of precision that manual editing couldn't achieve.
- Cross-Field Cleaning: Addressing critical errors and duplicates found during the process in the `245` (Title) and `008` (Fixed-Length Data) fields.
- Safe Batch Loading: Strategies for re-importing corrected records into Koha while maintaining database integrity.
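
As a rough illustration of the audit step, the sketch below scans a MARCXML export for `952$i` values shared by more than one record and flags records whose `008` is not 40 characters long or whose `245$a` is missing. It is not the speakers' script: the function names are hypothetical, records are keyed on the `001` control field as an assumption (a Koha export may be better keyed on whichever field holds the biblionumber), and the 40-character check assumes MARC 21 bibliographic records.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Standard MARCXML namespace.
MARC_NS = "http://www.loc.gov/MARC21/slim"
NS = {"marc": MARC_NS}


def subfield_values(record, tag, code):
    """Yield the non-empty text of each subfield `code` in each datafield `tag`."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    for subfield in record.findall(path, NS):
        if subfield.text and subfield.text.strip():
            yield subfield.text.strip()


def control_number(record):
    """Return the 001 value (assumed record key), or None if it is absent."""
    field = record.find("marc:controlfield[@tag='001']", NS)
    return field.text.strip() if field is not None and field.text else None


def audit(marcxml):
    """Return (duplicate inventory numbers, per-record field problems)."""
    root = ET.fromstring(marcxml)
    seen = defaultdict(list)      # inventory number -> record keys using it
    problems = defaultdict(list)  # record key -> list of problem descriptions

    for record in root.iter(f"{{{MARC_NS}}}record"):
        rec_id = control_number(record)

        # Collect every 952$i so that clashes can be reported before the
        # values are copied into the unique 952$p barcode subfield.
        for inventory in subfield_values(record, "952", "i"):
            seen[inventory].append(rec_id)

        # 008 is a fixed-length field: 40 characters in MARC 21 bibliographic.
        f008 = record.find("marc:controlfield[@tag='008']", NS)
        if f008 is None or len(f008.text or "") != 40:
            problems[rec_id].append("008 missing or not 40 characters")

        if not any(subfield_values(record, "245", "a")):
            problems[rec_id].append("245$a missing or empty")

    duplicates = {inv: ids for inv, ids in seen.items() if len(ids) > 1}
    return duplicates, dict(problems)
```

Running the audit outside the ILS, on an exported file, means nothing in the live catalog changes until the duplicates have been resolved and the corrected records are re-imported.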

The session aims to provide practical lessons for librarians and system administrators on how a simple task of updating item-level data can—and should—lead to a more robust and standardized catalog.


Authors

Ingrid Schiessl (Instituto Brasileiro de Informação em Ciência e Tecnologia (Brazilian Institute of Information in Science and Technology))

Rebeca Dos Santos De Moura (Instituto Brasileiro de Informação em Ciência e Tecnologia)
