NeXus/HDF5 developments

NeXus is a common data format for neutron, X-ray, and muon science. It is developed as an international standard by scientists and programmers representing major scientific facilities in Europe, Asia, Australia, and North America, in order to facilitate greater cooperation in the analysis and visualization of neutron, X-ray, and muon data. NeXus builds on HDF5, which is itself a widely adopted, standardized data format and has been proposed by the European Commission as an ISO standard for all binary data. Hence, any NeXus file is a fully valid HDF5 file and can be read by a large number of applications without further modification.

PaNdata ODI developments

In addition, two open source software packages have been developed in collaboration with the PNI-HDRI project and with support from the FP7/EU funding of PaNdata ODI: a complete rewrite of the NeXus API in C++, and a NeXus data collector that interfaces between the instrument controls and the NeXus libraries to automatically aggregate all relevant information into NeXus files.

The NeXus PNI libraries

The PNI libraries are a stack of related C++ libraries intended to simplify the development of scientific software in the field of photon, neutron, and ion scattering. Development started within the High Data Rate Initiative (HDRI) at DESY and is a joint project of PNI-HDRI and PaNdata. Originally, only a strictly object-oriented API for the NeXus file format was planned, to be implemented in C++ because of high performance requirements. However, shortly after development began it turned out that the major problem with C++ was not the NeXus API but rather the fact that C++ provides none of the data structures required for writing scientific software. In particular, the missing array types for numerical calculations became a serious problem. Additionally, working with raw pointers would make the resulting code error-prone and susceptible to all kinds of memory issues (in particular memory leaks). A utility library was therefore developed to provide the missing data types and structures. This library recently became the core of the PNI library stack.

Most of the code is written in C++, but Python bindings to the C++ libraries exist. The following libraries are currently available:

  • libpnicore - the central library, providing data types and fundamental data structures such as arrays and buffers.
  • libpniio - provides functionality for data I/O, including readers for several data types and the original NeXus API.
  • python-pniio - a Python binding to libpniio.

Documentation and open source code can be obtained from the project website at http://code.google.com/p/pni-libraries/wiki/Introduction. Debian packages can be found in the PNI-HDRI repository.

NexDaTaS - The NeXus data collector

All operations carried out on a beamline are orchestrated by the control client (CC), a piece of software operated by the beamline scientist and/or a user. Although the term "client" suggests a minor component alongside the hardware control servers, databases, and whatever other software is running on a beamline, the CC is responsible for all the other components and tells them what to do at which point in time. In orchestra terms, the CC is the conductor, who tells each group of instruments or individual artist what to do at a given point in time.

It is important to understand the role of the CC in the entire software system on a beamline, as it determines who is responsible for certain operations. The CC might be a simple single script running on the control PC, configured by the user before the start, or it might be a whole application of its own, like SPEC or ONLINE. Historically, it has been the job of the CC to write the data recorded during the experiment (this is true at least for low-rate data sources). However, with the appearance of complex data formats like NeXus, the I/O code becomes more complex. To cope with this complexity, NexDaTaS has been developed jointly by PNI-HDRI and PaNdata to provide an easy-to-use interface between NeXus data integration and the control system. NexDaTaS is realized as a Tango server that stores NeXus data in HDF5 files. The server can store data from other Tango devices and from various databases, as well as data passed by a user client via JSON strings. A detailed description can be found on the project page.
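
To illustrate the hand-over of client data as JSON strings, the sketch below shows how a control client might serialize the values of one scan point and how a receiving side might decode them. It uses only Python's standard json module. The record layout (the "point" and "data" keys), the channel names, and both function names are hypothetical; they are not the schema NexDaTaS actually expects, which is described on the project page.

```python
import json


def build_record(point_number, values):
    """Serialize one scan point into a JSON string.

    The layout ({"point": ..., "data": {...}}) is a hypothetical example,
    not the schema NexDaTaS actually expects.
    """
    record = {"point": point_number, "data": dict(values)}
    # sort_keys gives a stable string, which keeps logs and tests reproducible.
    return json.dumps(record, sort_keys=True)


def parse_record(text):
    """Decode a JSON string produced by build_record and check its shape."""
    record = json.loads(text)
    if not isinstance(record, dict) or not isinstance(record.get("data"), dict):
        raise ValueError("record has no 'data' mapping")
    return record


if __name__ == "__main__":
    # Hypothetical channel names for a single scan point.
    message = build_record(3, {"motor_x": 1.25, "counter_01": 4821})
    print(message)
    print(parse_record(message)["data"]["counter_01"])
```

Passing plain strings keeps the control client independent of the NeXus libraries: it only needs to produce text, while the server decides where each value ends up in the file.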

The NexDaTaS repository contains:

  • Nexus Data Writer implemented as a Tango server
  • Configuration Tool written in PyQt (NDTS Component Designer)
  • Configuration Server implemented as a Tango server on a MySQL database
  • Simple examples of configuration files and control clients

Further PaNdata involvement

PaNdata is contributing to the NeXus standard as well as to HDF5 developments in various ways. PaNdata has proposed a number of enhancements to the NeXus application definitions, which have been adopted by the NeXus International Advisory Committee (NIAC). The NIAC acts as a kind of standardization authority for NeXus, and PaNdata partners are active members of the NIAC.

PaNdata driven developments

The upcoming generation of X-ray detectors poses a number of harsh challenges for data processing and analysis. The resulting PaNdata requirements could only be satisfied with significant enhancements of the HDF5 format. As a consequence, PaNdata partners have been funding HDF5 developments (independent of the PaNdata ODI project), which have already found their way into the current HDF5 release. One notable extension is the implementation of a filter API, which makes it possible, for example, to plug arbitrary compression algorithms into the HDF5 pipeline and to parallelize compression, which was not possible before. Another enhancement is the Single-Writer/Multiple-Reader (SWMR) module, which is currently under discussion.

The PaNdata activities have also promoted the support of NeXus by other communities. Substantial efforts went into interoperability between NeXus and CIF, a standard format in macromolecular crystallography, and an increasing number of detectors (Pilatus, Lambda) are starting to support NeXus/HDF5 natively.
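
The reason a pluggable filter permits parallel compression is that HDF5 filters operate on individual chunks of a dataset, so each chunk can be compressed independently of the others. The sketch below illustrates that principle only. It does not use the HDF5 filter API; it uses zlib and a thread pool from the Python standard library as stand-ins, and the function names, chunk size, and synthetic frame are assumptions made for the example.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(data, chunk_size):
    """Cut a byte string into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def compress_chunks(chunks, workers=4):
    """Compress every chunk independently, using a pool of threads.

    zlib releases the interpreter lock while compressing, so threads can
    work on different chunks at the same time.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(zlib.compress, chunks))


def decompress_chunks(compressed):
    """Restore the original byte string from independently compressed chunks."""
    return b"".join(zlib.decompress(block) for block in compressed)


if __name__ == "__main__":
    # Synthetic stand-in for a detector frame: 1 MiB of repetitive bytes.
    frame = bytes(range(256)) * 4096
    blocks = compress_chunks(split_into_chunks(frame, 64 * 1024))
    assert decompress_chunks(blocks) == frame
    print(len(blocks), "chunks,", sum(map(len, blocks)), "compressed bytes")
```

Since no chunk depends on another, the same layout also lets a reader decompress only the chunks it needs.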

Resources