Data Processing

Creating a processing context

In order to process data with fluxEngine, a processing context must be created. A processing context knows about:

  • The model that is to be processed

  • How processing will be parallelized (it takes that information from the current handle) – changing parallelization settings will invalidate a given context

  • What kind of data is to be processed each time (full HSI cubes or individual PushBroom frames)

  • The size of the data that is to be processed. fluxTrainer models are designed to be camera-independent (to an extent), and thus do not know about the actual spatial dimensions of the data that is to be processed. But once processing is to occur, the spatial dimensions have to be known

  • The input wavelengths of the data being processed. While a model built in fluxTrainer specifies the wavelengths that will be used during processing, cameras of the same model do not map exactly the same wavelengths onto the same pixels, due to production tolerances. For this reason, cameras come with calibration information that tells the user what the precise wavelengths of the camera are. The user must specify the actual wavelengths of the input data, so that fluxEngine can interpolate them onto the wavelength range given in the model

  • Any white (and dark) reference data that is applicable to processing

There are two types of processing contexts that can be created: one for HSI cubes, one for PushBroom frames.

HSI Cube Processing Contexts

To process entire HSI cubes the user must use the ProcessingContext constructor with a ProcessingContext.HSICube argument. It has the following parameters:

  • The model that is to be processed

  • The storage order of the cube (BSQ, BIL, or BIP)

  • The scalar data type of the cube (e.g. 8 bit unsigned integer)

  • The spatial dimensions of the cube that is to be processed

  • The wavelengths of the cube

  • Whether the input data is in intensities or reflectances

  • An optional set of white reference measurements

  • An optional set of dark reference measurements

There are two ways to specify the spatial dimensions of a given cube. The first is to fix them at this point, only allowing the user to process cubes that have exactly this size with the processing context. The alternative is to leave them variable, but specify a maximum size. This has the advantage that the user can process differently sized cubes with the same context, but has the major disadvantage that any white reference will be averaged along all variable axes, which means that any spatial information of the reference data will be averaged out. (It is also possible to fix only one of the spatial dimensions.)

For referencing it is typically useful to average multiple measurements to reduce the effect of noise. For this reason, any references that are provided have to be tensors of 4th order, with an additional dimension at the beginning for the measurements that will be averaged. For example, a cube in BSQ storage order has the shape (λ, y, x), so the references must have the shape (N, λ, y, x), where N may be any positive number, indicating the number of measurements to average. A cube in BIP storage order would have a shape of (y, x, λ), leading to a reference shape of (N, y, x, λ).
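If multiple reference cubes have been measured separately, numpy.stack can be used to combine them into the required tensor of 4th order. This is only a sketch; whiteCubes stands in for a hypothetical list of cubes measured by the user:

# whiteCubes: hypothetical list of reference cubes, each with shape
# (λ, y, x) in BSQ storage order; stacking along a new first axis
# yields the required shape (N, λ, y, x)
whiteReference = numpy.stack(whiteCubes, axis=0)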

Note

It is possible to supply only a single cube as a reference measurement; N is then simply 1. The structure of the data is then effectively only a tensor of third order – but the additional dimension still has to be specified. The function numpy.expand_dims may be used to add that dimension:

referenceData = numpy.expand_dims(cube, axis=0)

Reference cubes must always have the same storage order as the cubes that are to be processed.

The first example here shows how to create a processing context without any references, assuming that the input data is already in reflectances, with a 32-bit floating point data type, and fixed spatial dimensions:

width = 1024
height = 2150
wavelengths = [900, 901.5, 903, ...]
referenceInfo = fluxEngine.ReferenceInfo(fluxEngine.ValueType.Reflectance)
context = fluxEngine.ProcessingContext(model, fluxEngine.ProcessingContext.HSICube,
                                       storageOrder=fluxEngine.HSICube_StorageOrder.BSQ,
                                       dataType=numpy.float32,
                                       maxHeight=height, height=height,
                                       maxWidth=width, width=width,
                                       wavelengths=wavelengths, referenceInfo=referenceInfo)

Alternatively, to create a processing context that uses a white reference cube, and where the y dimension has a variable size, the following code could be used:

width = 1024
height = 2150
wavelengths = [900, 901.5, 903, ...]
referenceInfo = fluxEngine.ReferenceInfo(fluxEngine.ValueType.Intensity)
# this is just an example, the real reference data
# would come from somewhere
referenceInfo.whiteReference = numpy.ones((1, len(wavelengths), 40, width), numpy.uint8)

context = fluxEngine.ProcessingContext(model, fluxEngine.ProcessingContext.HSICube,
                                       storageOrder=fluxEngine.HSICube_StorageOrder.BSQ,
                                       dataType=numpy.uint8,
                                       maxHeight=height, height=-1,
                                       maxWidth=width, width=width,
                                       wavelengths=wavelengths, referenceInfo=referenceInfo)

Note

The maximum size specified here also determines how much RAM is allocated in fluxEngine internally. Specifying an absurdly large number will cause fluxEngine to exhaust system memory.

Note

The white and dark references may have different spatial dimensions if those dimensions are specified as variable. In the above example, if a dark reference were to be specified, it would have to have the same width and number of bands (because those are both fixed), but it could have a different height.
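Continuing the example above, a dark reference with a different height could look as follows (again just a sketch; the all-ones array merely stands in for real measurement data):

# 3 dark measurements with a height of 25 lines; the width and the
# number of bands must match the fixed dimensions of the context
referenceInfo.darkReference = numpy.ones((3, len(wavelengths), 25, width), numpy.uint8)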

PushBroom Frame Processing Contexts

To process individual PushBroom frames the user must use the ProcessingContext constructor with a ProcessingContext.PushBroomFrame argument. It has the following parameters:

  • The model that is to be processed

  • The storage order of the PushBroom frame

  • The scalar data type of each PushBroom frame

  • The spatial width of each PushBroom frame (which will be the actual width of each image if LambdaY storage order is used, or the height of each image if LambdaX storage order is used)

  • The wavelengths of each PushBroom frame

  • Whether the input data is in intensities or reflectances

  • An optional set of white reference measurements

  • An optional set of dark reference measurements

As PushBroom processing can be thought of as a means to incrementally build up an entire cube (but process data on each line individually), the spatial width must be fixed and cannot be variable. (The number of frames processed, i.e. the number of calls to ProcessingContext.processNext() is variable though.)

As it is often useful to average multiple reference measurements to reduce noise, the white and dark references must be supplied as tensors of third order, with a dimension structure of (N, x, λ) or (N, λ, x), depending on the storage order.
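For example, individual reference frames acquired from the camera could be combined into such a tensor with numpy.stack. This is only a sketch; whiteFrames stands in for a hypothetical list of 2D frames:

# whiteFrames: hypothetical list of reference frames, each with shape
# (λ, x) in LambdaY storage order; stacking along a new first axis
# yields the required shape (N, λ, x)
referenceInfo.whiteReference = numpy.stack(whiteFrames, axis=0)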

The following example shows how to set up a processing context without any references, assuming the input data is already in reflectances, stored as 32-bit floating point numbers:

width = 320
wavelengths = [900, 901.5, 903, ...]
referenceInfo = fluxEngine.ReferenceInfo(fluxEngine.ValueType.Reflectance)
context = fluxEngine.ProcessingContext(model, fluxEngine.ProcessingContext.PushBroomFrame,
                                       storageOrder=fluxEngine.PushBroomFrame_StorageOrder.LambdaY,
                                       dataType=numpy.float32, width=width,
                                       wavelengths=wavelengths, referenceInfo=referenceInfo)

Alternatively, if both white and dark reference measurements are to be supplied, and the data consists of unsigned 8-bit integers, one could use the following code:

width = 640
wavelengths = [900, 901.5, 903, ...]
referenceInfo = fluxEngine.ReferenceInfo(fluxEngine.ValueType.Intensity)
# Replace this with actually obtaining the reference data.
# In this example the white reference contains 5 measurements,
# and the dark reference contains 10. Since this uses LambdaY
# storage order, these references are effectively HSI cubes
# in BIL storage order.
referenceInfo.whiteReference = numpy.ones((5, len(wavelengths), width), numpy.uint8)
referenceInfo.darkReference = numpy.ones((10, len(wavelengths), width), numpy.uint8)
context = fluxEngine.ProcessingContext(model, fluxEngine.ProcessingContext.PushBroomFrame,
                                       storageOrder=fluxEngine.PushBroomFrame_StorageOrder.LambdaY,
                                       dataType=numpy.uint8, width=width,
                                       wavelengths=wavelengths, referenceInfo=referenceInfo)

Processing Data

Once a processing context has been set up, the user may use it to process data. This happens in two steps:

  • Set the data pointer for the source data that is to be processed

  • Process the data

The first step has to be performed each time new data is to be processed. This is different from the C/C++ API.

Simply provide the source data in the form of a NumPy array with the right dimensions.

For HSI cubes the numpy array must have three dimensions in the right storage order, and the number of wavelengths must be fixed. The width and height might be fixed, depending on how the processing context was created.

For PushBroom frames the numpy array must have two dimensions in the right storage order, and both must have the right size, depending on the parameters with which the processing context was created.

fluxEngine will check that the data type and dimensions of the input data match the processing context, and will throw an exception if they do not.

Note

At the moment fluxEngine will create a copy of the data provided here, as the Python/NumPy memory ownership model doesn't match the memory model within fluxEngine.

Once the data pointer has been set, the user may process the data with the ProcessingContext.processNext() method.

Therefore, to process data with fluxEngine in Python, the following two method calls should be used:

# fetch the source data as a numpy array
sourceData = ...
context.setSourceData(sourceData)
context.processNext()

After processing has completed, the user may obtain the results (see the next section).

PushBroom Resets

As PushBroom cameras can be thought of as incrementally building up a cube line by line, at some point the user may want to indicate that the current cube is considered complete and a new cube starts. In that case the processing context has to be reset, so that all stateful operations, such as object detection, but also kernel-based operations, are reset as well.

To achieve this the method ProcessingContext.resetState() exists. Its usage is simple:

context.resetState()

Note

There is no requirement to actually perform such a reset. If fluxEngine is used to process PushBroom frames that are obtained from a camera above a conveyor belt in a continuous process, for example, it is possible to simply process all incoming frames in a loop and never call the reset method. In that situation the reset method would be called, though, if the conveyor belt is stopped and has to be started up again. Depending on the specific application, that could also mean that a processing context would have to be created again, for example because references have to be measured again, and a simple state reset is not sufficient.
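A minimal sketch of such a continuous loop might look as follows; acquireFrame() and beltStopped() are hypothetical stand-ins for the actual camera acquisition and machine state logic:

while True:
    frame = acquireFrame()    # hypothetical: obtain the next frame from the camera
    context.setSourceData(frame)
    context.processNext()
    # ... obtain results from the output sinks here ...
    if beltStopped():         # hypothetical: check whether the belt has stopped
        context.resetState()  # the next frame starts a new logical cube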

Obtaining Results

After processing has completed via the ProcessingContext.processNext() method, fluxEngine provides a means for the user to obtain the results of that operation.

When designing the model in fluxTrainer that is to be used here, Output Sinks should be added to the model wherever processing results are to be obtained later.

Note

If a model contains no output sinks, it can be processed with fluxEngine, but the user will have no possibility of extracting any kind of result from it.

fluxEngine provides a means to introspect a model to obtain information about the output sinks that it contains. The following two identifiers for output sinks have to be distinguished:

  • The output sink index: this is just a number starting at 0 and ending at one below the number of output sinks. It is used to specify the output sink for the purposes of the fluxEngine API.

    The ordering of output sinks according to this index is non-obvious. Loading the same .fluxmdl file will lead to the same order, but saving models with the same configuration (but constructed separately) can lead to different orders of output sinks.

  • The output id, which is a user-assignable id in fluxTrainer that can be used to mark output sinks for a specific purpose. The output id need not be unique (but should be), and is purely there for informational purposes.

    For each output sink in the model the user will be able to obtain the output id of that sink. There is also a method ProcessingContext.findOutputSink() that can locate an output sink if the output id of that sink is unique. It will return the index of the sink with that output id, as shown in the sketch below.
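For example, if an output sink was assigned the (unique) output id 42 in fluxTrainer, its index could be obtained as follows:

sinkIndex = context.findOutputSink(42)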

To obtain information about all output sinks in the context, the method ProcessingContext.outputSinkInfos() exists, which returns a list of OutputSinkInfo objects that contain information about each output sink. The order of the list also defines the output sink index.

sinkInfos = context.outputSinkInfos()
for i, sinkInfo in enumerate(sinkInfos):
    print("Sink with index {0} has name {1}".format(i, sinkInfo.name))

The OutputSinkInfo structure contains the following information:

  • The output id of the output sink

  • The name of the output sink as a string (this is the name the user specified when creating the model in fluxTrainer)

  • The output delay of the output sink (only relevant in the case when PushBroom data is being processed; see Output Delay for PushBroom Cameras for a more detailed discussion of this)

  • Data structure: what kind of data the output sink will return

Output sinks store either tensor data or detected objects, depending on the configuration of the output sink, and where it sits in the processing chain.
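Which of the two cases applies can be determined from the structure attribute of each OutputSinkInfo object. A sketch, assuming the attribute is an instance of one of the two structure classes described below:

for i, sinkInfo in enumerate(context.outputSinkInfos()):
    # distinguish the two kinds of output sinks by their structure type
    if isinstance(sinkInfo.structure, fluxEngine.OutputSinkTensorStructure):
        print("Sink {0} returns tensor data".format(i))
    elif isinstance(sinkInfo.structure, fluxEngine.OutputSinkObjectListStructure):
        print("Sink {0} returns a list of detected objects".format(i))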

Tensor Data

Tensor data is always returned as a NumPy array. Tensor data within fluxEngine always has a well-defined storage order, as most algorithms that work on hyperspectral data are at their most efficient in this memory layout. While fluxEngine supports input data of arbitrary storage order, it will be converted to the internal storage order at the beginning of processing. The output data will always have the following structure:

  • When processing entire HSI cubes it will effectively return data in BIP storage order, which means that the dimension structure will be (y, x, λ) (for data that still has wavelength information) or (y, x, P), where P is a generic dimension, if the data has already passed through dimensionality reduction filters such as PCA.

  • When processing PushBroom frames it will effectively return data in LambdaX storage order, with an additional dimension of size 1 at the beginning. In that case it will either be (1, x, λ) or (1, x, P).

  • Pure per-object data always has a tensor structure of order 2, in the form of (N, λ) or (N, P), where N describes the number of objects detected in this processing iteration. Important: objects themselves are returned as a structure (see below); per-object data is the output of filters such as the Per-Object Averaging or Per-Object Counting filter. Also note that output sinks can combine objects and per-object data, in which case the per-object data will be returned as part of the object structure.

    For PushBroom data it is recommended to always combine per-object data with objects before interpreting it, as the output delay of both nodes may be different, and when combining the data the user does not have to keep track of the relative delays on their own.

The output structure of a tensor data output sink is stored in an OutputSinkTensorStructure object assigned to the OutputSinkInfo.structure attribute of the OutputSinkInfo object that characterizes the output sink. It contains the following information:

  • The scalar data type of the tensor data (this is the same as the data type configured in the output sink)

  • The order of the tensor, which will be 2 or 3 (see above)

  • The maximum sizes of the tensor that can be returned here

  • The fixed sizes of the tensor that will be returned here. If the tensors returned here are always of the same size, the values here will be the same as the maximum sizes. Any dimension that is not always the same will have a value of -1 instead. If all of the sizes of the tensor returned here are fixed, the tensor returned will always be of the same size. (There is one notable exception: if the output sink has a non-zero output delay of m, the first m processing iterations will produce a tensor that does not have any data.)

Please refer to the documentation of the OutputSinkTensorStructure class for more information on how this information is returned.

Using ProcessingContext.outputSinkData() it is possible to obtain that tensor data after a successful processing step.

For example, if we know that a given sink with index sinkIndex has signed 16-bit integer data that spans the entire cube that is being processed (when processing HSI cubes), the following code could be used to obtain the results:

# Obtain sink index from somewhere
# e.g. sinkIndex = context.findOutputSink(42)
sinkIndex = ...
data = context.outputSinkData(sinkIndex)
# data is now a numpy.ndarray with shape (cubeHeight, cubeWidth, 1)
# and of type numpy.int16
print(data)

Object Data

Objects will be returned as a list of OutputObject instances, which contain information about the objects that were detected in the model.

The output structure of an object list output sink is stored in an OutputSinkObjectListStructure object assigned to the OutputSinkInfo.structure attribute of the OutputSinkInfo object that characterizes the output sink. It contains the following information:

  • The maximum number of objects that can be returned in a single iteration.

  • Whether per-object data was output using the output sink in the model, and if so, how large it is (per-object data is always a vector, i.e. a tensor of order 1)

  • The scalar type of per-object data (if any)

Please refer to the documentation of OutputSinkObjectListStructure for more information on how this information is returned.

Using ProcessingContext.outputSinkData() it is possible to obtain that object data after a successful processing step. It may be called in the following manner, assuming sinkIndex is a sink that is known to return objects:

# Obtain sink index from somewhere
# e.g. sinkIndex = context.findOutputSink(42)
sinkIndex = ...
objectList = context.outputSinkData(sinkIndex)
for obj in objectList:
    x, y, width, height = obj.boundingBox
    f = "Object: bbox topleft [{0}, {1}] -- bottomright [{2}, {3}], area {4}"
    print(f.format(x, y, x + width - 1, y + height - 1, obj.area))

The mask and additionalData entries of each object are NumPy arrays.
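For example, building on the loop above, these entries could be inspected as follows (a sketch; what additionalData contains depends on whether the output sink combines objects with per-object data):

for obj in objectList:
    print(obj.mask.shape)      # the object's mask (a NumPy array)
    print(obj.additionalData)  # per-object data, if the sink combines
                               # objects with per-object data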

Note

When comparing this to the C/C++ APIs: the Python API will use only the extended object logic internally, and always provide all possible information for each object. Also, the NumPy arrays that are fields of the fluxEngine.OutputObject structure are always copied by the Python API before returning this information to the user, so the user need not care about the lifetime of that data, as they would have to in C/C++.