STFC is the UK public sector research organisation providing access to large-scale scientific facilities. It has an annual expenditure of £500 million and 2500 staff based at seven locations, including the Rutherford Appleton Laboratory (RAL), where this project is centred. Two departments of STFC will be involved in this project.
ISIS is the world's leading pulsed spallation neutron source. It runs 700 experiments per year, performed by 1600 users on its 22 instruments. These experiments generate 1TB of data in 700,000 files. All data ever measured at ISIS over twenty years is stored at the Facility, some 2.2 million files in all. ISIS use is predominantly from the UK but includes most European countries through bilateral agreements and EU-funded access. There are nearly 10,000 people registered on the ISIS user database, of whom 4000 are from non-UK EU countries. The user base is expanding significantly with the arrival of the Second Target Station.
e-Science provides the STFC facilities with an advanced IT infrastructure including massive data storage, high-end supercomputing, vast network bandwidth, and interoperability with other IT infrastructure in the UK and internationally. It operates the UK National Grid Service and the EGEE Regional Operation Centre for the UK and Ireland. It undertakes collaborative IT research at UK, European and global levels. In this project, e-Science will provide overall coordination and a bridge to e-Science activities such as the EGI, NGIs and eIRG. Since 2001, e-Science has been developing a common e-Infrastructure supporting a single user experience across the STFC facilities. Much of this is now in place at ISIS and Diamond as well as the STFC Central Laser Facility. Components are also being adopted by ILL, the Australian National Synchrotron and Oak Ridge National Laboratory in the US.
At ISIS today, instrument computers are closely coupled to the data acquisition electronics and the main neutron beam control. Data is produced in the ISIS-specific RAW format and accessed at the instrument level, indexed by experiment run number. Beyond this, data management comprises a series of discrete steps. RAW files are copied to intermediate and long-term data stores for preservation. Reduction of RAW files, analysis of intermediate data and generation of data for publication are largely decoupled from the handling of the RAW data. Some connections in the chain between experiment and publication are not currently preserved.
Future data management will focus on the development of loosely coupled components with standardised interfaces, allowing more flexible interactions between components. The RAW format is being replaced by NeXus.
The ICAT metadata catalogue sits at the heart of this new strategy. It implements the policy controlling access to files and metadata and, using single authentication, allows data to be linked from beamline counts through to publications. It also supports web-based searching across facilities.
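The idea of a catalogue that links raw runs through to publications while enforcing access policy can be illustrated with a minimal sketch. This is not the ICAT schema or API: the `Record` and `Catalogue` classes, the field names, and the identifiers such as `run-1001` are all hypothetical, chosen only to show how preserved links let a publication be traced back to the runs it rests on.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """One catalogued item (hypothetical model, not the ICAT schema)."""
    record_id: str
    kind: str  # e.g. "raw", "reduced", "publication"
    derived_from: list = field(default_factory=list)  # ids of parent records
    readers: set = field(default_factory=set)         # users allowed to read it


class Catalogue:
    """Holds records and answers provenance queries subject to access policy."""

    def __init__(self):
        self._records = {}

    def add(self, record):
        self._records[record.record_id] = record

    def provenance(self, record_id, user):
        """Walk back from a record through everything it was derived from,
        returning only the ids the given user is allowed to read."""
        seen, stack, visible = set(), [record_id], []
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            record = self._records[current]
            if user in record.readers:
                visible.append(current)
            stack.extend(record.derived_from)
        return sorted(visible)


# Illustrative data: one raw run, its reduced form, and a resulting paper.
catalogue = Catalogue()
catalogue.add(Record("run-1001", "raw", readers={"alice"}))
catalogue.add(Record("reduced-1001", "reduced", ["run-1001"], {"alice", "bob"}))
catalogue.add(Record("paper-A", "publication", ["reduced-1001"], {"alice", "bob"}))

alice_view = catalogue.provenance("paper-A", "alice")
bob_view = catalogue.provenance("paper-A", "bob")
```

In this sketch the experimenter `alice` can trace `paper-A` back to the raw run, while `bob` sees only the reduced data and the publication. Whether restricted records should also hide the links that pass through them is a policy decision the sketch does not attempt to settle.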