Graeme Winter

Renaming HDF5 (Diamond NXS) files

HDF5 / NeXus files are annoying to rename because they contain external links to other files, and those links are stored by file name. Simply

mv foo_000001.h5 bar_000001.h5
mv foo.nxs bar.nxs

won’t work, because the renamed file still contains links which point to foo_000001.h5, and that file no longer exists. So, you need to fix the links as you go. This is a script to do just that:

import os
import sys

import h5py

# Links in the .nxs file which point into <root>_meta.h5
META_PATHS = (
    "/entry/instrument/detector/detectorSpecific/pixel_mask",
    "/entry/instrument/detector/pixel_mask",
    "/entry/instrument/detector/saturation_value",
    "/entry/instrument/detector/serial_number",
)


def relink(fin, path, new_file):
    """Point the external link at path to new_file, keeping its internal path"""
    link = fin.get(path, getlink=True)
    del fin[path]
    fin[path] = h5py.ExternalLink(new_file, link.path)


def fixit(filename):
    """Fix the references inside filename"""

    if not filename.endswith(".nxs"):
        sys.exit(f"{filename} is not a .nxs file")

    # Drop any directory part: the data and meta files are expected to sit
    # next to the .nxs file, so the links hold bare file names
    root = os.path.basename(filename)[:-4]

    with h5py.File(filename, "r+") as fin:
        # Only the first data file is handled here
        relink(fin, "/entry/data/data_000001", f"{root}_000001.h5")

        for path in META_PATHS:
            relink(fin, path, f"{root}_meta.h5")


if __name__ == "__main__":
    fixit(sys.argv[1])
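
The script above only rewrites the link to the first data file, data_000001. For data sets split over several files, something along the following lines could rewrite every external link under /entry/data instead. This is a minimal sketch rather than a tested tool: it assumes every link target ends in _NNNNNN.h5 and that the renamed data files sit next to the renamed .nxs file. The function name relink_all_data and the group_path argument are hypothetical, chosen for illustration.

```python
import os
import re

import h5py


def relink_all_data(filename, group_path="/entry/data"):
    """Point every external link in group_path at files named after filename.

    Assumption: each link target ends in _NNNNNN.h5 (e.g. foo_000001.h5)
    and the renamed data files sit next to the renamed .nxs file.
    """
    if not filename.endswith(".nxs"):
        raise ValueError(f"{filename} is not a .nxs file")

    root = os.path.basename(filename)[: -len(".nxs")]
    changed = {}

    with h5py.File(filename, "r+") as fin:
        group = fin[group_path]
        # list() so the group is not modified while it is being iterated
        for name in list(group):
            link = group.get(name, getlink=True)
            if not isinstance(link, h5py.ExternalLink):
                continue  # leave hard and soft links alone
            match = re.search(r"_\d+\.h5$", link.filename)
            if match is None:
                continue  # not a numbered data file, leave untouched
            new_file = root + match.group(0)
            del group[name]
            group[name] = h5py.ExternalLink(new_file, link.path)
            changed[name] = new_file

    # Map of link name -> new target file, handy for a quick sanity check
    return changed
```

Calling relink_all_data("renamed.nxs") returns a dictionary of the links it changed, so you can print it and check that the targets look right before processing.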

You can use this with

dials.python fixit.py renamed.nxs

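where renamed.nxs is the .nxs file after renaming, so its name without the .nxs extension is the new file root. The script expects the matching renamed_000001.h5 and renamed_meta.h5 files to have been renamed to the same root. I found this useful, maybe others will too.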
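
The renaming step itself can be scripted too. The following is a minimal sketch, assuming all the files of a data set live in one directory and are named <root>.nxs, <root>_meta.h5 and <root>_NNNNNN.h5. The rename_set helper and that naming pattern are assumptions for illustration, so check them against your own files first.

```python
import re
from pathlib import Path


def rename_set(directory, old_root, new_root):
    """Rename <old_root>.nxs, <old_root>_meta.h5 and <old_root>_NNNNNN.h5.

    Returns the new paths. The links inside the .nxs file are NOT touched,
    so the fix-up script still has to be run afterwards.
    """
    # Assumed naming pattern; anything else in the directory is ignored
    pattern = re.compile(re.escape(old_root) + r"(\.nxs|_meta\.h5|_\d+\.h5)")
    renamed = []
    # sorted() takes a snapshot, so renaming does not disturb the iteration
    for path in sorted(Path(directory).iterdir()):
        match = pattern.fullmatch(path.name)
        if match is None:
            continue  # belongs to some other data set
        target = path.with_name(new_root + match.group(1))
        if target.exists():
            raise FileExistsError(f"refusing to overwrite {target}")
        path.rename(target)
        renamed.append(target)
    return renamed
```

For example, rename_set(".", "foo", "renamed") followed by dials.python fixit.py renamed.nxs would rename the files and then repair the links.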