Read contents of h5 and hdf5 files and write them in Excel workbook


Read contents of h5 and hdf5 files and write them in Excel workbook

 

When we are handling .h5 and .hdf5 files, which are useful for storing large amounts of data in any format, we may want to output all results into a more common file anyone can use (without any required knowledge).

 

In this tutorial I’ll show how all the contents from an .h5 file can be extracted and written into an Excel workbook in an orderly way. The input and output files as well as the script are added as attachments.

 

The first step for extracting data from a .h5 file is to have a look at its structure, as we need to know how to reach the relevant data within. The HDF5 Viewer Utility is very useful for this:

 

 

With this utility, we see that data is arranged in two main groups (Bottom and Top), each of these, in turn, contains relevant data about each element: Damage and another subgroup called Event_2 which contains the attributes NoOfCycles, RainflowData and RandomData. With this structure in mind, we can query all data from the file using loops. Let’s have a look at the script:

 

  1. The first step is to retrieve all groups from the .h5 file (using readhdf5toc) and loop through each of them for extracting data from all elements. As an Excel workbook will be created (with xlsopen) for each of the main groups, assign a name to each of the output files.

 

close all, clear, clc

 

filename = 'ElementRainflow.h5';

 

% Get all available groups

groups = readhdf5toc(filename)(:,2);

 

% Extract data from each group

for i = 1:numel(groups)

     

      % Get all elements for this group

      group = groups{i};

      elements = readhdf5toc(filename, group)(:,2);

     

      % Open .xlsx file to write outputs for this group

      outputName = strcat(filename(1:end-3), '_', group(2:end), '.xlsx');

      file = xlsopen(outputName, 1);

 

  1. Using the list of all available elements, we can look at the data present within each of them. From the HDF5 Viewer, we saw that there is a field called Damage and another subgroup called Event_2. We can read data in Damage using readhdf5 and the data contained in Event_2 same as we did with the top groups.

     

      % Extract data from each element

      for j = 1:numel(elements)

           

            % Get all datatypes for this element

            element = strcat(group, '/ElementId_', num2str(j), '_2D');

            datatypes = readhdf5toc(filename, element)(:,2);

           

            % Get total damage for this element

            totalDamage = readhdf5(filename, datatypes{1}){1}.Damage;

           

            % Get all datatypes inside Event_2

            subdatatypes = readhdf5toc(filename, datatypes{2})(:,2);

 

  1. From Event_2, we can access the attributes RainflowData and RandomData, which will both be retrieved as a Structure, an OML data type which is somewhat like Python dictionaries (its entries are indexed by strings).

           

            % Extract rainflow and random data

            RainflowData = readhdf5(filename, subdatatypes{2}){1};

            RandomData = readhdf5(filename, subdatatypes{3}){1};

           

            % Relevant data for writing output

            Stress = RainflowData.('2_CycleId') + 1;

            NumberOfCycles = RandomData.NoOfCycles;

            Probability = RandomData.Probability;

            StressAmplitude = RainflowData.('4_StressAmp');

            MeanStress = RainflowData.('5_MeanStress');

            Damage = RainflowData.('6_Damage');

            CorrectedStressAmp = RainflowData.('7_CorrectedAmp');

 

  1. After having extracted all relevant data, we use xlswrite to output data into the workbook in an orderly way. That is, we create the corresponding worksheet for each element ID and we include headers along with the data. Once we are done, xlsclose must be used to release the Excel file.

           

            % Format data for output

            worksheet = ['Element ' num2str(j)];

            headers = {'Stress', 'NumberOfCycles', 'Probability', 'Stress Amplitude', ...

                  'Mean Stress', 'Damage', 'CorrectedStressAmp'};      

            data = [Stress, NumberOfCycles, Probability, StressAmplitude, ...

                  MeanStress, Damage, CorrectedStressAmp];

           

            % Write data

            xlswrite(file, headers, worksheet, 'A1:G1');

            xlswrite(file, data, worksheet, 'A2:G100');

           

      end

     

      % Close .xlsx file after writing

      xlsclose(file);

     

end

 

Now we can share these Excel workbooks with anyone, doesn’t matter whether they know what an .h5 file is or not!