HDF5 for Python

Version 2.1.3


About the project

The h5py package is a Pythonic interface to the HDF5 binary data format.

It lets you store huge amounts of numerical data, and easily manipulate that data from NumPy. For example, you can slice into multi-terabyte datasets stored on disk, as if they were real NumPy arrays. Thousands of datasets can be stored in a single file, categorized and tagged however you want.
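As a rough illustration of slicing a dataset on disk, here is a minimal sketch. The file name, dataset name, and array size are made up for this example, and a small array stands in for a multi-terabyte one.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file and dataset names, chosen only for this example.
path = os.path.join(tempfile.mkdtemp(), "example.h5")

with h5py.File(path, "w") as f:
    # Create a 1000 x 1000 dataset on disk and fill it from a NumPy array.
    dset = f.create_dataset("measurements", shape=(1000, 1000), dtype="f8")
    dset[...] = np.arange(1000 * 1000, dtype="f8").reshape(1000, 1000)

with h5py.File(path, "r") as f:
    dset = f["measurements"]
    # Slicing reads just the requested 2 x 3 selection and returns
    # an ordinary NumPy array.
    block = dset[10:12, 0:3]

print(block.shape, block.dtype)
```

The slice syntax is the same as for an in-memory NumPy array; the difference is that the data stays in the file until a selection is requested.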

H5py uses straightforward NumPy and Python metaphors, like dictionary and NumPy array syntax. For example, you can iterate over datasets in a file, or check out the .shape or .dtype attributes of datasets. You don't need to know anything special about HDF5 to get started.
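The dictionary-style access described above might look like the following sketch. The dataset names and the "units" tag are invented for illustration.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file name, dataset names, and attribute, for illustration only.
path = os.path.join(tempfile.mkdtemp(), "catalog.h5")

with h5py.File(path, "w") as f:
    f.create_dataset("temperature", data=np.zeros((4, 5), dtype="f4"))
    f.create_dataset("counts", data=np.arange(10, dtype="i8"))
    # Attributes let you tag a dataset with small pieces of metadata.
    f["temperature"].attrs["units"] = "kelvin"

summary = {}
with h5py.File(path, "r") as f:
    # Iterating over a file yields the names of its members, like a dict.
    for name in f:
        dset = f[name]
        summary[name] = (dset.shape, str(dset.dtype))
    units = f["temperature"].attrs["units"]

print(summary)
print(units)
```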

In addition to the easy-to-use high-level interface, h5py rests on an object-oriented Cython wrapping of the HDF5 C API. Almost anything you can do from C in HDF5, you can do from h5py.
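For a sense of how the two layers relate, the sketch below reaches the low-level wrappers through the .id attribute of a high-level object and through the h5py.h5d module. The file and dataset names are hypothetical, and this shows only a tiny corner of the low-level API.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file and dataset names, for illustration only.
path = os.path.join(tempfile.mkdtemp(), "lowlevel.h5")

with h5py.File(path, "w") as f:
    f.create_dataset("grid", data=np.ones((3, 4), dtype="f8"))

with h5py.File(path, "r") as f:
    # Each high-level object carries a low-level identifier object as .id.
    dsid = f["grid"].id

    # Roughly H5Dget_space followed by H5Sget_simple_extent_dims in C.
    dims = dsid.get_space().get_simple_extent_dims()

    # The low-level modules can also open objects directly; names are bytes.
    dsid2 = h5py.h5d.open(f.id, b"grid")
    rank = dsid2.rank

print(dims, rank)
```

Most programs never need this layer; it is there for the cases the high-level interface does not cover.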

Best of all, the files you create are in a widely-used standard binary format, which you can exchange with other people, including those who use programs like IDL and MATLAB.

Downloads

On Linux and Mac OS X, you can install via easy_install or pip. The install guide is also helpful.

Older versions of h5py are also available.

Documentation

The h5py user manual is a great place to start; you may also want to check out the FAQ.

We have a mailing list at Google Groups for user questions and development.

The lead developer of h5py also maintains a blog at alfven.org.