Read contents of h5 and hdf5 files and write them in Excel workbook
Read contents of h5 and hdf5 files and write them in Excel workbook
When we are handling .h5 and .hdf5 files, which are useful for storing large amounts of data in any format, we may want to output all results into a more common file anyone can use (without any required knowledge).
In this tutorial I’ll show how all the contents from an .h5 file can be extracted and written into an Excel workbook in an orderly way. The input and output files as well as the script are added as attachments.
The first step for extracting data from a .h5 file is to have a look at its structure, as we need to know how to reach the relevant data within. The HDF5 Viewer Utility is very useful for this:
With this utility, we see that data is arranged in two main groups (Bottom and Top), each of these, in turn, contains relevant data about each element: Damage and another subgroup called Event_2 which contains the attributes NoOfCycles, RainflowData and RandomData. With this structure in mind, we can query all data from the file using loops. Let’s have a look at the script:
- The first step is to retrieve all groups from the .h5 file (using readhdf5toc) and loop through each of them for extracting data from all elements. As an Excel workbook will be created (with xlsopen) for each of the main groups, assign a name to each of the output files.
close all, clear, clc
filename = 'ElementRainflow.h5';
% Get all available groups
groups = readhdf5toc(filename)(:,2);
% Extract data from each group
for i = 1:numel(groups)
% Get all elements for this group
group = groups{i};
elements = readhdf5toc(filename, group)(:,2);
% Open .xlsx file to write outputs for this group
outputName = strcat(filename(1:end-3), '_', group(2:end), '.xlsx');
file = xlsopen(outputName, 1);
- Using the list of all available elements, we can look at the data present within each of them. From the HDF5 Viewer, we saw that there is a field called Damage and another subgroup called Event_2. We can read data in Damage using readhdf5 and the data contained in Event_2 same as we did with the top groups.
% Extract data from each element
for j = 1:numel(elements)
% Get all datatypes for this element
element = strcat(group, '/ElementId_', num2str(j), '_2D');
datatypes = readhdf5toc(filename, element)(:,2);
% Get total damage for this element
totalDamage = readhdf5(filename, datatypes{1}){1}.Damage;
% Get all datatypes inside Event_2
subdatatypes = readhdf5toc(filename, datatypes{2})(:,2);
- From Event_2, we can access the attributes RainflowData and RandomData, which will both be retrieved as a Structure, an OML data type which is somewhat like Python dictionaries (its entries are indexed by strings).
% Extract rainflow and random data
RainflowData = readhdf5(filename, subdatatypes{2}){1};
RandomData = readhdf5(filename, subdatatypes{3}){1};
% Relevant data for writing output
Stress = RainflowData.('2_CycleId') + 1;
NumberOfCycles = RandomData.NoOfCycles;
Probability = RandomData.Probability;
StressAmplitude = RainflowData.('4_StressAmp');
MeanStress = RainflowData.('5_MeanStress');
Damage = RainflowData.('6_Damage');
CorrectedStressAmp = RainflowData.('7_CorrectedAmp');
- After having extracted all relevant data, we use xlswrite to output data into the workbook in an orderly way. That is, we create the corresponding worksheet for each element ID and we include headers along with the data. Once we are done, xlsclose must be used to release the Excel file.
% Format data for output
worksheet = ['Element ' num2str(j)];
headers = {'Stress', 'NumberOfCycles', 'Probability', 'Stress Amplitude', ...
'Mean Stress', 'Damage', 'CorrectedStressAmp'};
data = [Stress, NumberOfCycles, Probability, StressAmplitude, ...
MeanStress, Damage, CorrectedStressAmp];
% Write data
xlswrite(file, headers, worksheet, 'A1:G1');
xlswrite(file, data, worksheet, 'A2:G100');
end
% Close .xlsx file after writing
xlsclose(file);
end
Now we can share these Excel workbooks with anyone, doesn’t matter whether they know what an .h5 file is or not!