HDF is a self-describing data format similar to netCDF. The National Aeronautics and Space Administration (NASA) uses HDF for many of its data sets. HDF files can be written with a number of different "data models"; for example, HDF is apparently a popular way to store images. I am describing the "sd" (scientific data set) data model. The most basic information on how Matlab reads HDF can be obtained by typing
    help hdf
in a Matlab session.
Open a buffer for the input file.
    sd_id = hdfsd('start', 'filename.hdf', 'rdonly')
Obtain information on the number of data sets and attributes in the file.
    [ndatasets, nglobal_attr, status] = hdfsd('fileinfo', sd_id)
The number of data sets could be 1 if there is only the data, 3 if information on the longitudes and latitudes of the data is also provided, and more if more variables are provided, for example, zonal and meridional winds.
You next need to read the "attributes" of the file. The attributes can include information on the units of the data, how to unpack the data, the organization of the data set (the central longitude and latitude of the first grid box), and so on. An example of how data is packed is sea-level pressure, where a value of 1020.5 mb might be stored as 2050, and you will need to divide by 100 and add 1000 to get the data value.
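The unpacking arithmetic in the sea-level pressure example can be sketched as follows. This is only an illustration: the variable names are made up, and the divisor and offset are the ones quoted in the example above, not values read from a real file. In practice they would come from the file's attributes.

```matlab
% Illustrative values only: the divisor and offset are taken from the
% sea-level pressure example in the text, not read from a real file.
packed_value  = 2050;   % value as stored in the file
scale_divisor = 100;    % hypothetical divisor, normally given by an attribute
add_offset    = 1000;   % hypothetical offset in mb, normally given by an attribute

% Unpack: divide by the divisor, then add the offset.
pressure_mb = packed_value / scale_divisor + add_offset;
disp(pressure_mb)
```

The same expression works unchanged when packed_value is a whole array of packed data, since Matlab applies the division and addition to each element.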
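The following code will read the attribute names and their values.
    for icnt = 0:nglobal_attr-1  % Note that HDF counts from 0, not 1.
        % 'attrinfo' returns the attribute's name, data type, and element count.
        [attribute_name, data_type, count, status] = hdfsd('attrinfo', sd_id, icnt)
        % 'findattr' looks an attribute up by name; 'readattr' returns its value.
        attribute_value = hdfsd('readattr', sd_id, hdfsd('findattr', sd_id, attribute_name))
    end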
For the data I was reading (SeaWiFS chlorophyll), the attributes were written as single-precision numbers, and I needed to convert them to double precision in order to do math on them. If you find that Matlab complains when you try even the simplest mathematical manipulation, this is most likely the problem.
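A minimal sketch of that conversion is shown below. The value is a made-up stand-in for an attribute returned by 'readattr', and the variable names are hypothetical; the only point is the call to double before doing any arithmetic.

```matlab
% Hypothetical attribute value, stored as single precision the way the
% attributes in the file described above were.
attr_value = single(2.5);
disp(class(attr_value))            % shows the class before conversion

% Convert to double precision before doing any math on it.
attr_double = double(attr_value);
disp(class(attr_double))           % shows the class after conversion

% Arithmetic on the converted value now yields a double result.
result = attr_double * 2;
```

Checking class on a value that Matlab refuses to do arithmetic with is a quick way to confirm whether precision is the cause.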
To read the icnt'th data set:
    icnt = 0;  % icnt begins at zero because HDF counts from 0, not 1.
    sds_id = hdfsd('select', sd_id, icnt)
    [ds_name, ds_ndims, ds_dims, ds_type, ds_atts, stat] = hdfsd('getinfo', sds_id);
    ds_start = zeros(1, ds_ndims);  % start at the first element of every dimension
    ds_stride = [];                 % an empty stride reads every element
    ds_edges = ds_dims;             % read the full extent of every dimension
    [ds_data, status] = hdfsd('readdata', sds_id, ds_start, ds_stride, ds_edges);  % "ds_data" is the data.
    status = hdfsd('endaccess', sds_id);  % release this data set
Close the file buffer when you are finished.
    hdfsd('end', sd_id);