pyEELSMODEL.io_tools package
Submodules
pyEELSMODEL.io_tools.dm_ncempy module
This code is from ncempy (https://openncem.readthedocs.io/en/latest/#). It is licensed under GPLv3 and MIT.
A module to load data and metadata from DM3 and DM4 files, as written by Gatan’s Digital Micrograph program, into Python.
Note
- General users:
Use the simplified dm.dmReader() function to load the data and meta data as a python dictionary.
- Advanced users and developers:
Access the file internals through the dm.fileDM() class.
- On Memory mode:
The fileDM class supports an “on memory” mode that pre-loads the file data into memory, so that read operations during header parsing are performed against memory. This can significantly improve performance when the file resides on a parallel file system (PFS), because the latency of seek operations on a PFS is very high.
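The idea behind the on-memory mode can be sketched with the standard library alone. This is not the fileDM implementation: the helper name `read_header_fields` and the toy layout (big-endian 4-byte integers read at scattered offsets) are assumptions made purely for illustration. The point is that after one bulk read, every seek is a cheap move inside an in-memory buffer rather than a request to the file system.

```python
import io
import os
import struct
import tempfile


def read_header_fields(fid, offsets):
    """Seek to each byte offset and read one big-endian 4-byte integer.

    Hypothetical helper: the layout is a toy stand-in, not the DM format.
    """
    values = []
    for offset in offsets:
        fid.seek(offset)
        (value,) = struct.unpack(">i", fid.read(4))
        values.append(value)
    return values


# Build a small stand-in file holding the integers 0..15.
payload = struct.pack(">16i", *range(16))
with tempfile.NamedTemporaryFile(delete=False, suffix=".bin") as tmp:
    tmp.write(payload)
    path = tmp.name

offsets = [40, 0, 12]  # scattered reads, similar in spirit to tag parsing

# On-disk mode: every seek and read goes through the file system.
with open(path, "rb") as fid:
    from_disk = read_header_fields(fid, offsets)

# On-memory mode: one bulk read, then all seeks hit an in-memory buffer.
with open(path, "rb") as fid:
    buffer = io.BytesIO(fid.read())
from_memory = read_header_fields(buffer, offsets)

os.remove(path)
print(from_disk, from_memory)
```

Both code paths use the same parsing function, which is why a reader can switch between the two modes with a single flag.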
- pyEELSMODEL.io_tools.dm_ncempy.dmReader(filename, dSetNum=0, verbose=False, on_memory=True)
A simple function to parse the file and read the requested dataset. Most users will want to use this function to simplify reading data directly into memory and to retrieve the spatial axes (e.g. the energy axis).
- Parameters:
filename (str) – The filename to open and read into memory
dSetNum (int) – The number of the data set to read. This should almost always be 0. Default = 0
verbose (bool) – Allow extra printing to see file internals. Default = False
on_memory (bool, default True) – Whether to use the on_memory option of fileDM. Usually provides much faster data reading.
Notes
Use the coords key for spatial axes (e.g. energy loss for spectra).
- Returns:
A dictionary in which the data is stored under the ‘data’ key. Other metadata is contained in other named keys such as ‘pixelSize’. The ‘coords’ key contains the coordinate axes with the proper origin and scale applied (e.g. the energy loss axis for EELS data).
- Return type:
dict
Example
Load data from a single image dm3 file and display it:
>>> import matplotlib.pyplot as plt
>>> from ncempy.io import dm
>>> im0 = dm.dmReader('filename.dm3')
>>> plt.imshow(im0['data'])  # show the single image from the data file
- class pyEELSMODEL.io_tools.dm_ncempy.fileDM(filename, verbose=False, on_memory=True)
Bases: object
Opens the file and reads in the header. Data is loaded using the getDataset method.
Attributes
xSize, ySize, zSize (list) – The shape of the data for each data set in the file. Each value is the same as the shape attribute for an ndarray.
zSize2 (list) – The shape of the 4th dimension for a 4D file (e.g. 4D-STEM data).
dataSize (list) – The total number of bytes in each dataset. Similar to numpy’s nbytes attribute for an ndarray.
dataOffset (list) – The starting byte number for each dataset. This can provide fast access directly to the data if needed by seeking to this byte number.
dataShape (list) – The total number of dimensions in each dataset. Similar to numpy’s ndim attribute for an ndarray.
file_name (str) – The name of the file.
file_path (pathlib.Path) – A pathlib.Path object for the open file.
fid (file) – The file handle.
numObjects (list) – The number of datasets in the file. Often (but not always) the file contains a thumbnail and the raw data. The thumbnail will always be the first object. See the thumbnail attribute.
thumbnail (bool) – Tells whether the first object or dataset in the file is a thumbnail. If true then this object is always skipped in methods, on the assumption that the user wants to skip it. The thumbnail can be retrieved using a built-in method if desired.
scale (list) – The real size of the pixel. Real and reciprocal space are supported.
scaleUnit (list) – The unit name as a string for each dimension of each dataset.
origin (list) – The origin for the real or reciprocal space scaling for each dimension. Be careful, this value is actually meant to be scaled by the scale before being used. See ncempy.io.dmReader() for proper handling of this, especially for spectroscopy data.
allTags (dictionary) – Contains all tags in the DM file as key value pairs.
Examples
Read data from a file containing a single image into memory:
>>> from ncempy.io import dm
>>> with dm.fileDM('filename.dm4') as dmFile1:
...     dataSet = dmFile1.getDataset(0)
Example of reading a full multi-image DM3 file into memory:
>>> with dm.fileDM('imageSeries.dm3') as dmFile2:
...     series = dmFile2.getDataset(0)
- allTags
- dataOffset
- dataShape
- dataSize
- dataType
- fid
- fileSize
- file_name
- file_path
- fromfile(*args, **kwargs)
Reads data from a file or memory map. Calls np.fromfile or np.frombuffer depending on the on_memory mode of the fileDM.
Note
This is essentially a passthrough function to Numpy’s frombuffer and fromfile depending on the class variable on_memory.
- Parameters:
*args – dtype and count are required
**kwargs
- Returns:
Data read from the file as a 1d ndarray.
- Return type:
ndarray
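The passthrough described for fromfile() can be pictured with a small stand-alone function. This is a sketch under stated assumptions, not the method's actual code: the helper `read_values`, its `offset` argument and the test data are hypothetical. Only np.fromfile and np.frombuffer are real NumPy calls.

```python
import os
import tempfile

import numpy as np


def read_values(source, dtype, count, offset=0, on_memory=False):
    """Read `count` items of `dtype` starting at byte `offset`.

    Hypothetical helper. `source` is a bytes-like object when on_memory
    is True, otherwise an open binary file.
    """
    if on_memory:
        # In-memory path: interpret the buffer directly, no file access.
        return np.frombuffer(source, dtype=dtype, count=count, offset=offset)
    # On-disk path: position the file, then let NumPy read from it.
    source.seek(offset)
    return np.fromfile(source, dtype=dtype, count=count)


# Ten little-endian unsigned 16-bit integers, 0..9.
raw = np.arange(10, dtype="<u2").tobytes()
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(raw)
    path = tmp.name

with open(path, "rb") as fid:
    from_file = read_values(fid, "<u2", count=3, offset=4)
from_buffer = read_values(raw, "<u2", count=3, offset=4, on_memory=True)

os.remove(path)
print(from_file, from_buffer)
```

In both branches the result is a 1D array, which matches the return type documented above.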
- getDataset(index)
Retrieve a dataset from the DM file.
Notes
Most DM3 and DM4 files contain a small “thumbnail” as the first dataset written as RGB data. This function ignores that dataset if it exists. To retrieve the thumbnail use the getThumbnail() function.
The pixelOrigin returned is not actually the start of the coordinates. The start of the energy axis for EELS (for example) will be pixelSize * pixelOrigin. Be careful about the sign, as some datasets appear to use -pixelOrigin in this expression. dmReader() returns the correct coordinates.
- Parameters:
index (int) – The number of the data set to retrieve, ignoring the thumbnail. If a thumbnail exists then index = 0 actually corresponds to the second data set in the DM file.
- Returns:
A dictionary of the data and meta data. The data is associated with the ‘data’ key in the dictionary.
- Return type:
dict
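The relation between pixelSize, pixelOrigin and the physical axis described in the notes for getDataset() can be written out as a short calculation. The helper `axis_from_calibration`, its `sign` argument and the calibration numbers are all hypothetical. The sign is kept as an explicit parameter because, as noted, it may differ between datasets. This is not the code used by dmReader().

```python
def axis_from_calibration(n_pixels, pixel_size, pixel_origin, sign=1):
    """Return calibrated coordinates for one axis.

    Hypothetical helper: the axis starts at sign * pixel_size * pixel_origin
    and advances by pixel_size per pixel.
    """
    start = sign * pixel_size * pixel_origin
    return [start + i * pixel_size for i in range(n_pixels)]


# Assumed EELS calibration: 0.5 eV per channel, pixelOrigin of 200.
energy_axis = axis_from_calibration(4, 0.5, 200)

# The same calibration with the opposite sign convention.
energy_axis_flipped = axis_from_calibration(4, 0.5, 200, sign=-1)

print(energy_axis)
print(energy_axis_flipped)
```

Comparing the two results against a known feature in the spectrum (such as the zero-loss peak) is one way to decide which sign applies to a given file.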
- getMemmap(index)
Return a numpy memmap object (read-only) for the dataset requested. This is very useful for very large datasets to avoid loading the entire data set into memory. No meta data is returned.
- Parameters:
index (int) – The number of the dataset in the DM file.
- Returns:
A read-only numpy memmap object with access to the data. The file will continue to be open as long as the memmap is open. Delete the memmap to close the file.
- Return type:
numpy.memmap
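To illustrate why a memmap helps with large datasets, the sketch below uses NumPy's np.memmap directly on a stand-in file. The 16-byte header, the array shape and the data type are assumptions for the example. In a real DM file the starting byte and shape would come from the parsed header, which getMemmap() handles internally.

```python
import os
import tempfile

import numpy as np

# Stand-in for a large file: a 16-byte header followed by a 3D stack
# of uint16 images with shape [Z, Y, X].
shape = (4, 8, 8)
stack = np.arange(np.prod(shape), dtype=np.uint16).reshape(shape)
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x00" * 16)
    tmp.write(stack.tobytes())
    path = tmp.name

# Read-only view of the data; `offset` plays the role of the
# dataset's starting byte in the file.
view = np.memmap(path, dtype=np.uint16, mode="r", offset=16, shape=shape)

# Copy a single frame into memory; the rest of the stack is not copied.
frame = np.array(view[2])

del view  # deleting the memmap releases the file
os.remove(path)
print(frame.shape)
```

Copying the slice with np.array() before deleting the memmap keeps the frame usable after the file is closed.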
- getMetadata(index)
Get the useful metadata in the file. This parses the allTags dictionary and retrieves only the useful information about the experimental parameters. This is a (useful) subset of the information contained in the allTags attribute.
Note: some DM files contain extra information called the Tecnai Microscope Info. This is added to the metadata dictionary as a string.
- Parameters:
index (int) – The number of the dataset to get the metadata from.
- getSlice(index, sliceZ, sliceZ2=0)
Retrieve a slice of a dataset from the DM file. The data set will have a shape according to 3D: [sliceZ,Y,X] or 4D: [sliceZ2,sliceZ,Y,X].
Note: Most DM3 and DM4 files contain a small “thumbnail” as the first dataset written as RGB data. This function ignores that dataset if it exists. To retrieve the thumbnail use the getThumbnail() function.
Warning: DM4 files with 4D data sets are written as [X,Y,Z1,Z2]. This code currently gets the [X,Y] slice. Getting the [Z1,Z2] slice is not yet implemented. Use the getMemmap() function to retrieve arbitrary slices of large data sets.
- Parameters:
index (int) – The number of the dataset in the DM file.
sliceZ (int) – The slice to get along the first dimension (C-ordering) for 3D datasets or 4D datasets.
sliceZ2 (int) – Only used for 4D datasets. Default = 0
- Returns:
A dictionary containing meta data and the data.
- Return type:
dict
- getThumbnail()
Read the thumbnail saved as the first dataset in the DM file as an RGB array. This is not fully tested. Be careful using this.
- Returns:
Numpy array of size [3,Y,X] which is an RGB thumbnail.
- Return type:
ndarray
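Because the thumbnail is returned channel-first as [3,Y,X], it usually needs its axes rearranged before display, since image plotting functions such as matplotlib's imshow expect colour images as [Y,X,3]. The sketch below uses a synthetic array in place of a real thumbnail; the array contents are an assumption for illustration.

```python
import numpy as np

# Synthetic stand-in for a thumbnail: channel-first RGB of size [3, Y, X].
thumb = np.zeros((3, 4, 6), dtype=np.uint8)
thumb[0] = 255  # fill the red channel

# Move the channel axis to the end to obtain [Y, X, 3].
thumb_yx3 = np.moveaxis(thumb, 0, -1)

print(thumb_yx3.shape)
```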
- metadata
- numObjects
- on_memory
- origin
- parseHeader()
Parse the header by reading the root tag group. This ensures the file pointer is in the correct place.
- scale
- scaleOrigin
- scaleUnit
- seek(fid, offset, from_what=0)
Positions the reading head for fid. fid can be a file or memory map. Follows the same convention as file.seek.
- Parameters:
fid (file id) – File or memory map.
offset (int) – Number of bytes to move the head forward (positive value) or backwards (negative value).
from_what (int) – Reference point to use in the head movement. 0: for beginning of the file (default behavior), 1: from the current head position, and 2: from the end of the file.
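The three from_what values follow the standard Python file convention, which can be checked on an in-memory buffer. The sketch below uses io.BytesIO from the standard library rather than fileDM itself; the buffer contents are arbitrary.

```python
import io

# Ten bytes with values 0..9.
buf = io.BytesIO(bytes(range(10)))

buf.seek(4)         # from_what=0 (default): absolute position from the start
pos_from_start = buf.tell()

buf.seek(3, 1)      # from_what=1: relative to the current position
pos_from_current = buf.tell()

buf.seek(-2, 2)     # from_what=2: relative to the end of the buffer
pos_from_end = buf.tell()

last_two = list(buf.read())  # read the remaining bytes

print(pos_from_start, pos_from_current, pos_from_end, last_two)
```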
- tell()
Return the current position in the file. Switches mode based on on_memory mode.
- Returns:
pos – The current position in the file.
- Return type:
int
- thumbnail
- verbose
- writeTags(new_folder_path_for_tags=None)
Write out all tags as human-readable text to a text file in the same directory (or a user-definable directory) and with the same name as the DM file.
- Parameters:
new_folder_path_for_tags (str or pathlib.Path, Optional) – Allow user to define a different path than the directory of the current file.
- xSize
- ySize
- zSize
- zSize2
pyEELSMODEL.io_tools.hdf5_io module
- pyEELSMODEL.io_tools.hdf5_io.load_elements_and_edges(filename)
- pyEELSMODEL.io_tools.hdf5_io.load_h5py(filename)
- pyEELSMODEL.io_tools.hdf5_io.load_hspy(filename)