HDF5 for Python

Version 2.2.1


About the project

The h5py package is a Pythonic interface to the HDF5 binary data format.

It lets you store huge amounts of numerical data, and easily manipulate that data from NumPy. For example, you can slice into multi-terabyte datasets stored on disk, as if they were real NumPy arrays. Thousands of datasets can be stored in a single file, categorized and tagged however you want.
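As a rough illustration of the slicing described above, the sketch below writes a small dataset to a temporary file and reads back one region of it. The file name, the group and dataset names, and the "units" attribute are hypothetical and chosen only for this example; a real dataset could be far larger than available memory.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file location, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "example.h5")

with h5py.File(path, "w") as f:
    # Intermediate groups ("run1") are created automatically from the path.
    dset = f.create_dataset("run1/measurements", shape=(1000, 100), dtype="f8")
    # Write only the first ten rows; the rest keep the default fill value.
    dset[0:10, :] = np.arange(1000, dtype="f8").reshape(10, 100)
    # Attributes are one way to tag a dataset with metadata (assumed name).
    dset.attrs["units"] = "volts"

with h5py.File(path, "r") as f:
    dset = f["run1/measurements"]
    # Slicing reads just the requested region from disk into a NumPy array.
    block = dset[0:10, 5:8]
    block_shape = block.shape
    first_value = float(block[0, 0])
```

Only the sliced region is loaded into memory; the rest of the dataset stays on disk.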

H5py uses straightforward NumPy and Python metaphors, like dictionary and NumPy array syntax. For example, you can iterate over datasets in a file, or check out the .shape or .dtype attributes of datasets. You don't need to know anything special about HDF5 to get started.
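A minimal sketch of those metaphors, assuming two made-up dataset names ("temperature" and "pressure") in a temporary file: the file object is indexed and iterated like a dictionary, and each dataset exposes .shape and .dtype like a NumPy array.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file location and dataset names, for illustration only.
path = os.path.join(tempfile.mkdtemp(), "metaphors.h5")

with h5py.File(path, "w") as f:
    # Dictionary-style assignment of an array creates a dataset.
    f["temperature"] = np.zeros((4, 3), dtype="f4")
    f["pressure"] = np.ones(5, dtype="i8")

with h5py.File(path, "r") as f:
    # Iterating over the file yields the names of its members.
    names = sorted(f)
    # Each dataset carries NumPy-style shape and dtype attributes.
    info = {name: (f[name].shape, str(f[name].dtype)) for name in names}
```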

In addition to the easy-to-use high-level interface, h5py rests on an object-oriented Cython wrapping of the HDF5 C API. Almost anything you can do from C in HDF5, you can do from h5py.
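One way to reach the lower-level layer is through the .id attribute of a high-level object, which holds the wrapped HDF5 identifier. The sketch below is only an illustration under that assumption, with a hypothetical dataset name ("grid"); the comments name the C routines that each call is understood to correspond to.

```python
import os
import tempfile

import h5py

# Hypothetical file location and dataset name, for illustration only.
path = os.path.join(tempfile.mkdtemp(), "lowlevel.h5")

with h5py.File(path, "w") as f:
    dset = f.create_dataset("grid", shape=(4, 3), dtype="f8")

    # The low-level identifier object behind the high-level Dataset.
    dsid = dset.id

    # Roughly H5Dget_space in the C API: fetch the dataspace.
    space = dsid.get_space()

    # Roughly H5Sget_simple_extent_dims / H5Sget_simple_extent_ndims.
    dims = space.get_simple_extent_dims()
    rank = space.get_simple_extent_ndims()
```

For everyday work the high-level dset.shape gives the same information; the low-level calls matter mostly when a feature has no high-level equivalent.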

Best of all, the files you create are in a widely-used standard binary format, which you can exchange with other people, including those who use programs like IDL and MATLAB.

Stable Downloads

On Linux and Mac OS X, you can install via easy_install or pip.

A full list of downloads is available at PyPI. Really old versions are at the old Google Code site.

Documentation

The h5py user manual is a great place to start; you may also want to check out the FAQ.

There's an O'Reilly book, Python and HDF5, written by the lead author of h5py, Andrew Collette.

We also have a mailing list at Google Groups for user questions and development.