pyEELSMODEL.io_tools package

Submodules

pyEELSMODEL.io_tools.dm_ncempy module

This code is from ncempy (https://openncem.readthedocs.io/en/latest/#). It is released under the GPLv3 and MIT licenses.

A module to load data and meta data from DM3 and DM4 files, as written by Gatan’s Digital Micrograph program, into Python.

Note

General users:

Use the simplified dm.dmReader() function to load the data and meta data as a python dictionary.

Advanced users and developers:

Access the file internals through the dm.fileDM() class.

On Memory mode:

The fileDM class supports an “on memory” mode that pre-loads the file data into memory, so that read operations during header parsing are performed against memory. This can significantly improve performance when the file resides in a parallel file system (PFS), because the latency of seek operations on a PFS is very high.
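The idea behind the on-memory mode can be illustrated with the standard library alone. The sketch below is not the fileDM implementation. It assumes a made-up binary layout (a 4-byte big-endian count followed by that many 4-byte integers) and a hypothetical parse_header function, and it shows the same parser running against a file on disk and against a copy of the file held in memory.

```python
import io
import os
import struct
import tempfile


def parse_header(stream):
    """Parse a toy header: a big-endian uint32 count, then `count` uint32 values.

    This layout is hypothetical and is NOT the DM3/DM4 format.
    """
    stream.seek(0)
    (count,) = struct.unpack(">I", stream.read(4))
    return list(struct.unpack(f">{count}I", stream.read(4 * count)))


# Build a small example file to parse.
values = [10, 20, 30]
payload = struct.pack(">I", len(values)) + struct.pack(">3I", *values)
fd, path = tempfile.mkstemp(suffix=".bin")
with os.fdopen(fd, "wb") as handle:
    handle.write(payload)

# "On disk" mode: every seek/read goes to the file system.
with open(path, "rb") as on_disk:
    from_disk = parse_header(on_disk)

# "On memory" mode: read the file once, then seek/read against RAM.
with open(path, "rb") as handle:
    in_memory = io.BytesIO(handle.read())
from_memory = parse_header(in_memory)

os.remove(path)
```

Both code paths return the same values; only the cost of each seek and read differs, which is where the benefit on high-latency file systems comes from.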

pyEELSMODEL.io_tools.dm_ncempy.dmReader(filename, dSetNum=0, verbose=False, on_memory=True)

A simple function to parse the file and read the requested dataset. Most users will want to use this function to simplify reading data directly into memory and to retrieve the spatial axes (e.g. the energy axis).

Parameters:
  • filename (str) – The filename to open and read into memory

  • dSetNum (int) – The number of the data set to read. This should almost always be 0. Default = 0

  • verbose (bool) – Allow extra printing to see file internals. Default = False

  • on_memory (bool, default True) – Whether to use the on_memory option of fileDM. Usually provides much faster data reading.

Notes

Use the coords key for spatial axes (e.g. energy loss for spectra).

Returns:

A dictionary in which the data is stored under the ‘data’ key. Other metadata is contained in other named keys such as ‘pixelSize’. The ‘coords’ key contains the coordinate axes with the proper origin and scale applied (e.g. the energy loss axis for EELS data).

Return type:

dict

Example

Load data from a single image dm3 file and display it:

>> from ncempy.io import dm
>> import matplotlib.pyplot as plt
>> im0 = dm.dmReader('filename.dm3')
>> plt.imshow(im0['data'])  # show the single image from the data file
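As a hedged illustration of how the returned dictionary might be consumed, the sketch below summarises each coordinate axis. It assumes that ‘coords’ holds one sequence of axis values per data dimension, which the description above implies but does not state. The function name summarize_axes and the stand-in dictionary are hypothetical; in practice the dictionary would come from dmReader().

```python
def summarize_axes(result):
    """Return (first, last, length) for every axis in result['coords'].

    Assumes 'coords' is a sequence with one axis per data dimension.
    """
    return [(axis[0], axis[-1], len(axis)) for axis in result["coords"]]


# Stand-in for a dmReader() result: a 4-channel spectrum whose energy-loss
# axis starts at 100 eV with 0.5 eV per channel (made-up numbers).
fake_result = {
    "data": [5.0, 7.0, 6.0, 4.0],
    "pixelSize": [0.5],
    "coords": [[100.0, 100.5, 101.0, 101.5]],
}

axes = summarize_axes(fake_result)
```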

class pyEELSMODEL.io_tools.dm_ncempy.fileDM(filename, verbose=False, on_memory=True)

Bases: object

Opens the file and reads in the header. Data is loaded using the getDataset method.

Attributes:
  • xSize, ySize, zSize (list) – The shape of the data for each data set in the file. Each value is the same as the shape attribute for an ndarray.

  • zSize2 (list) – The shape of the 4th dimension for a 4D file (e.g. 4D-STEM data).

  • dataSize (list) – The total number of bytes in each dataset. Similar to numpy’s nbytes attribute for an ndarray.

  • dataOffset (list) – The starting byte number for each dataset. This can provide fast access directly to the data if needed by seeking to this byte number.

  • dataShape (list) – The total number of dimensions in each dataset. Similar to numpy’s ndim attribute for an ndarray.

  • file_name (str) – The name of the file.

  • file_path (pathlib.Path) – A pathlib.Path object for the open file.

  • fid (file) – The file handle.

  • numObjects (list) – The number of datasets in the file. Often (but not always) the file contains a thumbnail and the raw data. The thumbnail will always be the first object. See the thumbnail attribute.

  • thumbnail (bool) – Tells whether the first object or dataset in the file is a thumbnail. If true, this object is always skipped in methods, on the assumption that the user wants to skip it. The thumbnail can be retrieved using a built-in method if desired.

  • scale (list) – The real size of the pixel. Real and reciprocal space are supported.

  • scaleUnit (list) – The unit name as a string for each dimension of each dataset.

  • origin (list) – The origin for the real or reciprocal space scaling for each dimension. Be careful: this value is actually meant to be scaled by the scale before being used. See ncempy.io.dmReader() for proper handling of this, especially for spectroscopy data.

  • allTags (dictionary) – Contains all tags in the DM file as key value pairs.

Examples

Read data from a file containing a single image into memory:

>> from ncempy.io import dm
>> with dm.fileDM('filename.dm4') as dmFile1:
>>     dataSet = dmFile1.getDataset(0)

Example of reading a full multi-image DM3 file into memory:

>> with dm.fileDM('imageSeries.dm3') as dmFile2:
>>     series = dmFile2.getDataset(0)

allTags
dataOffset
dataShape
dataSize
dataType
fid
fileSize
file_name
file_path
fromfile(*args, **kwargs)

Reads data from a file or memory map. Calls np.fromfile or np.frombuffer depending on the on_memory mode of the fileDM.

Note

This is essentially a passthrough function to Numpy’s frombuffer and fromfile depending on the class variable on_memory.

Parameters:
  • *args – dtype and count are required

  • **kwargs

Returns:

Data read from the file as a 1d ndarray.

Return type:

ndarray
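The two NumPy calls that fromfile dispatches between can be compared directly. The sketch below is not the fileDM code. It writes a small temporary file with made-up contents and reads the same values once with np.fromfile from an open file and once with np.frombuffer from bytes already held in memory.

```python
import os
import tempfile

import numpy as np

# Write four little-endian 16-bit integers to a temporary file.
expected = np.array([1, 2, 3, 4], dtype="<i2")
fd, path = tempfile.mkstemp(suffix=".bin")
with os.fdopen(fd, "wb") as handle:
    handle.write(expected.tobytes())

# File-backed read: np.fromfile reads from the current file position.
with open(path, "rb") as handle:
    from_file = np.fromfile(handle, dtype="<i2", count=4)

# Memory-backed read: np.frombuffer takes an explicit byte offset instead.
with open(path, "rb") as handle:
    buffer = handle.read()
from_buffer = np.frombuffer(buffer, dtype="<i2", count=4, offset=0)

os.remove(path)
```

In both cases the result is a 1d array, and dtype and count determine how many bytes are consumed.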

getDataset(index)

Retrieve a dataset from the DM file.

Notes

Most DM3 and DM4 files contain a small “thumbnail” as the first dataset written as RGB data. This function ignores that dataset if it exists. To retrieve the thumbnail use the getThumbnail() function.

The pixelOrigin returned is not actually the start of the coordinates. The start of the energy axis for EELS, for example, is pixelSize * pixelOrigin. Be careful about the sign, because some datasets appear to use -pixelOrigin in this expression. dmReader() returns the correct coordinates.
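A worked example of the origin correction described above, using made-up numbers. The function name energy_axis and its sign argument are illustrative assumptions and are not part of the module; sign exists only because the note warns that some datasets appear to use -pixelOrigin.

```python
def energy_axis(n_channels, pixel_size, pixel_origin, sign=1):
    """Build an axis whose first value is sign * pixel_size * pixel_origin."""
    start = sign * pixel_size * pixel_origin
    return [start + i * pixel_size for i in range(n_channels)]


# Hypothetical spectrum: 0.5 eV per channel, stored pixelOrigin of 200.
axis_plus = energy_axis(4, pixel_size=0.5, pixel_origin=200, sign=1)
axis_minus = energy_axis(4, pixel_size=0.5, pixel_origin=200, sign=-1)
```

With these numbers the axis starts at 100 eV for sign=1 and at -100 eV for sign=-1, so a quick comparison against a known edge position is a reasonable way to check which convention a given file uses.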

Parameters:

index (int) – The number of the data set to retrieve, ignoring the thumbnail. If a thumbnail exists then index = 0 actually corresponds to the second data set in a DM file.

Returns:

A dictionary of the data and meta data. The data is associated with the ‘data’ key in the dictionary.

Return type:

dict

getMemmap(index)

Return a numpy memmap object (read-only) for the dataset requested. This is very useful for very large datasets to avoid loading the entire data set into memory. No meta data is returned.

Parameters:

index (int) – The number of the dataset in the DM file.

Returns:

A read-only numpy memmap object with access to the data. The file will continue to be open as long as the memmap is open. Delete the memmap to close the file.

Return type:

numpy.memmap
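The behaviour of a read-only memory map can be tried out with NumPy directly. The sketch below does not call getMemmap. It assumes a hypothetical file layout (a 16-byte header followed by a 3 x 4 float32 array), whereas in a DM file the offset and shape come from the parsed header.

```python
import os
import tempfile

import numpy as np

# Create a file with a 16-byte header followed by a 3 x 4 float32 array.
# The layout is hypothetical; real offsets come from the DM header.
header = b"\x00" * 16
frames = np.arange(12, dtype=np.float32).reshape(3, 4)
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "example.bin")
with open(path, "wb") as handle:
    handle.write(header + frames.tobytes())

# Map the data read-only; pages are fetched only when they are indexed.
mm = np.memmap(path, dtype=np.float32, mode="r", offset=16, shape=(3, 4))
second_row = np.array(mm[1])        # copy one row out of the map
is_writeable = mm.flags.writeable   # False for mode="r"

# Deleting the memmap releases the underlying file.
del mm
os.remove(path)
os.rmdir(tmp_dir)
```

Copying the required slice into a regular array before deleting the memmap keeps the data available after the file has been released.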

getMetadata(index)

Get the useful metadata in the file. This parses the allTags dictionary and retrieves only the useful information about the experimental parameters. This is a (useful) subset of the information contained in the allTags attribute.

Note: some DM files contain extra information called the Tecnai Microscope Info. This is added to the metadata dictionary as a string.

Parameters:

index (int) – The number of the dataset to get the metadata from.

getSlice(index, sliceZ, sliceZ2=0)

Retrieve a slice of a dataset from the DM file. The data set will have a shape according to 3D: [sliceZ, Y, X] or 4D: [sliceZ2, sliceZ, Y, X].

Note: Most DM3 and DM4 files contain a small “thumbnail” as the first dataset written as RGB data. This function ignores that dataset if it exists. To retrieve the thumbnail use the getThumbnail() function.

Warning: DM4 files with 4D data sets are written as [X,Y,Z1,Z2]. This code currently gets the [X,Y] slice. Getting the [Z1,Z2] slice is not yet implemented. Use the getMemmap() function to retrieve arbitrary slices of large data sets.

Parameters:
  • index (int) – The number of the dataset in the DM file.

  • sliceZ (int) – The slice to get along the first dimension (C-ordering) for 3D datasets or 4D datasets.

  • sliceZ2 (int) – For 4D datasets only. Default = 0

Returns:

A dictionary containing meta data and the data.

Return type:

dict
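The index convention for slices can be pictured with plain NumPy arrays. This only illustrates the [sliceZ, Y, X] and [sliceZ2, sliceZ, Y, X] ordering stated above using stand-in arrays. It does not read a DM file and does not reproduce how getSlice locates the data on disk.

```python
import numpy as np

# Stand-in 3D stack: 5 frames of 2 x 3 pixels, ordered [Z, Y, X].
stack3d = np.arange(5 * 2 * 3).reshape(5, 2, 3)
frame = stack3d[2]         # analogous to sliceZ=2: one [Y, X] image

# Stand-in 4D stack ordered [Z2, Z, Y, X].
stack4d = np.arange(4 * 5 * 2 * 3).reshape(4, 5, 2, 3)
frame4d = stack4d[1, 2]    # analogous to sliceZ2=1, sliceZ=2
```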

getThumbnail()

Read the thumbnail saved as the first dataset in the DM file as an RGB array. This is not fully tested. Be careful using this.

Returns:

Numpy array of size [3,Y,X] which is an RGB thumbnail.

Return type:

ndarray
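Because the thumbnail is described as a [3, Y, X] array, it usually has to be rearranged before display: matplotlib’s imshow, for example, expects colour images as [Y, X, 3]. A minimal sketch with a stand-in array (not a real thumbnail), assuming the first plane is the red channel:

```python
import numpy as np

# Stand-in for a getThumbnail() result: 3 colour planes of 4 x 6 pixels.
thumb = np.zeros((3, 4, 6), dtype=np.uint8)
thumb[0] = 255  # fill the first plane (assumed here to be red)

# Move the colour axis to the end: [3, Y, X] -> [Y, X, 3].
thumb_yx3 = np.moveaxis(thumb, 0, -1)
```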

metadata
numObjects
on_memory
origin
parseHeader()

Parse the header by reading the root tag group. This ensures the file pointer is in the correct place.

scale
scaleOrigin
scaleUnit
seek(fid, offset, from_what=0)

Positions the reading head for fid. fid can be a file or memory map. Follows the same convention as file.seek

Parameters:
  • fid (file id) – File or memory map.

  • offset (int) – Number of bytes to move the head forward (positive value) or backwards (negative value).

  • from_what (int) – Reference point to use in the head movement. 0: for beginning of the file (default behavior), 1: from the current head position, and 2: from the end of the file.
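The from_what values match the whence argument of Python’s own file.seek. The sketch below demonstrates the three reference points on an in-memory stream from the standard library; it does not call fileDM.seek itself.

```python
import io
import os

stream = io.BytesIO(b"0123456789")  # 10-byte in-memory "file"

stream.seek(4, os.SEEK_SET)         # from_what=0: absolute position 4
pos_from_start = stream.tell()

stream.seek(3, os.SEEK_CUR)         # from_what=1: 3 bytes forward from 4
pos_from_current = stream.tell()

stream.seek(-2, os.SEEK_END)        # from_what=2: 2 bytes before the end
pos_from_end = stream.tell()
```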

tell()

Return the current position in the file. Switches mode based on on_memory mode.

Returns:

pos – The current position in the file.

Return type:

int

thumbnail
verbose
writeTags(new_folder_path_for_tags=None)

Write out all tags as human-readable text to a text file in the same directory (or a user-definable directory) and with the same name as the DM file.

Parameters:

new_folder_path_for_tags (str or pathlib.Path, Optional) – Allow user to define a different path than the directory of the current file.

xSize
ySize
zSize
zSize2

pyEELSMODEL.io_tools.hdf5_io module

pyEELSMODEL.io_tools.hdf5_io.load_elements_and_edges(filename)
pyEELSMODEL.io_tools.hdf5_io.load_h5py(filename)
pyEELSMODEL.io_tools.hdf5_io.load_hspy(filename)

Module contents