Data Processing

In order to process data with fluxEngine, a so-called processing context must be created. A processing context knows about:

How processing will be parallelized (it takes that information from the current handle) – changing parallelization settings will invalidate a given context
The instrument device that supplies the data that is being processed. This is currently either a PushBroom camera or a spectrometer.
- It is also possible to process data loaded from files on disk, see the section Processing an ENVI cube, and to process data that is supplied manually, see the section Manual Data Processing.
What kind of processing should occur.
- Should a preview of the data be generated?
- Should the data be preprocessed so that it may be recorded to disk?
- Should a fluxEngine model be used to process the data from the device?
Any white and dark reference data that should be used here.

Processing contexts are represented by the fluxEngine::ProcessingContext class. Special static methods are available to create processing contexts for various different purposes.

After a context has been created data may be processed. The basic logic is the following:

Set the source (input) data of the context to a buffer from the instrument device via the correct overload of fluxEngine::ProcessingContext::setSourceData(). Note that this only sets a pointer in the processing context and the buffer must not be returned to the instrument device until processing has completed.

Call ctx.processNext() to process the data supplied to the processing context.

Obtain the result of the processing via the outputSinkData() method. This will return only a pointer to the processing result that is stored internally in the processing context. This is described in detail further down in Obtaining Results.
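
The requirement in the first step, that a buffer must not be returned to the device before processing has completed, can be enforced with a small scope guard. The following is only a sketch: ScopeExit is a hypothetical helper that is not part of fluxEngine, and the usage comment assumes the variable names from the examples in this section.

```cpp
#include <utility>

// Hypothetical helper (not part of fluxEngine): runs a callable when the
// guard goes out of scope. The callable must not throw, as it is invoked
// from a destructor.
template <class Fn>
class ScopeExit {
public:
    explicit ScopeExit(Fn fn) : fn_(std::move(fn)) {}
    ~ScopeExit() { fn_(); }
    ScopeExit(ScopeExit const&) = delete;
    ScopeExit& operator=(ScopeExit const&) = delete;

private:
    Fn fn_;
};

// Usage sketch inside an acquisition loop (assumed names):
//
//     auto giveBack = [&] { instrumentDevice->returnBuffer(buffer.id); };
//     ScopeExit<decltype(giveBack)> guard(giveBack);
//     ctx.setSourceData(buffer);
//     ctx.processNext();
//     // ... read the result via ctx.outputSinkData(0) ...
//     // the buffer is handed back here, at the end of the scope
```

With such a guard the buffer is also returned when processing leaves the scope early, for example because an exception is thrown.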

The following sections will describe different types of processing contexts that may be created for processing data from instrument devices. Alternatively, one may create processing contexts to process data from measurements that have been loaded from disk; XXX describes the specifics for these contexts. Finally, it is possible to manually supply fluxEngine with data that it should process. This is the most complicated setup, as the user must provide fluxEngine with a lot of details on how the data is laid out in memory. This is described in the section Manual Data Processing.

Instrument Preview

Sometimes it is useful to obtain a preview image that may be shown to the user. The following device types will generate the following preview data:

PushBroom cameras: the preview data from a PushBroom camera will consist of data that is averaged over all wavelengths, so that only an intensity value is returned. The resulting tensor structure will always be (1, width, 1) for each buffer that is being processed.
Spectrometers: the preview data will be a single spectrum (tensor structure (bands)) that may be plotted.
HSI imager cameras: the preview data will be a single grayscale image that consists of the intensities averaged over all wavelengths. The tensor structure of the preview data will always be (height, width, 1) for each buffer that is being processed.
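
To illustrate what "averaged over all wavelengths" means for a single PushBroom line, the following sketch reduces a line of width * bands values to width intensity values. It is only an illustration: fluxEngine computes the preview internally, and whether it applies corrections or weights before averaging is not covered here. The function name and the plain std::vector layout are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Averages the spectrum of every pixel of one PushBroom line.
// `line` holds width * bands values in BIP order (all bands of pixel 0,
// then all bands of pixel 1, and so on). The result has `width` entries,
// which corresponds to the payload of a (1, width, 1) preview tensor.
std::vector<float> averageOverBands(std::vector<float> const& line,
                                    std::size_t width, std::size_t bands)
{
    assert(bands > 0);
    assert(line.size() == width * bands);
    std::vector<float> preview(width, 0.0f);
    for (std::size_t x = 0; x < width; ++x) {
        double sum = 0.0;
        for (std::size_t b = 0; b < bands; ++b)
            sum += line[x * bands + b];
        preview[x] = static_cast<float>(sum / static_cast<double>(bands));
    }
    return preview;
}
```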

To create a processing context for preview data, one may use the fluxEngine::ProcessingContext::createInstrumentPreviewContext() static method. There are two overloads for this method: one that just takes a fluxEngine::InstrumentDevice pointer that will create the processing context associated with the main processing queue set of the fluxEngine handle, and one that takes an additional reference to a fluxEngine::ProcessingQueueSet.

To use the processing context to create the preview, one may call the overload of fluxEngine::ProcessingContext::setSourceData() that takes a fluxEngine::InstrumentDevice::BufferInfo argument. After the source data has been set, a call to fluxEngine::ProcessingContext::processNext() will perform the data processing.

The following example code will endlessly loop to create preview images from data that is acquired from a PushBroom camera:

 try {
     fluxEngine::ProcessingContext ctx;
     ctx = fluxEngine::ProcessingContext::createInstrumentPreviewContext(instrumentDevice);

     fluxEngine::InstrumentDevice::AcquisitionParameters parameters;
     instrumentDevice->startAcquisition(parameters);
     while (true) {
         fluxEngine::BufferInfo buffer = instrumentDevice->retrieveBuffer(std::chrono::seconds{1});
         if (!buffer.ok)
             continue;
         ctx.setSourceData(buffer);
         ctx.processNext();
         // See below for details on how the output sink logic
         // works.
         auto outputData = ctx.outputSinkData(0);
         // Get a tensor view on the output data
         fluxEngine::TensorData view{outputData};
         assert(view.dimension(0) == 1);
         assert(view.dimension(2) == 1);
         int64_t const width = view.dimension(1);
         for (int64_t x = 0; x < width; ++x) {
             float pixelValue{};
             if (view.dataType() == fluxEngine::DataType::Float32)
                 pixelValue = view.at<float>(0, x, 0);
             else if (view.dataType() == fluxEngine::DataType::Float64)
                 pixelValue = static_cast<float>(view.at<double>(0, x, 0));
             // (etc. handle the other cases)

             // Do something with the pixel value here
         }
         instrumentDevice->returnBuffer(buffer.id);
     }
     instrumentDevice->stopAcquisition();
 } catch (std::exception& e) {
     std::cerr << "An error occurred: " << e.what() << std::endl;
     exit(1);
 }

The context that has been created in this manner may be reused for multiple acquisitions. Note, however, that the structure of the input data at the time of context creation determines what kind of data the context expects. If parameters are changed, especially things such as the ROI, the context may no longer be compatible with the buffers that the device provides.

Recording HSI Data

fluxEngine also has the capability of preprocessing data in order for it to be stored as HSI data cubes.

There are certain preprocessing steps that are always done, such as applying corrections to the data obtained from the camera (if the camera has corrections that are to be applied in software), as well as normalizing the storage order. (fluxEngine always uses the BIP data layout internally, with wavelengths in ascending order.)

Additionally, the user may request further normalizations:

The user may select whether they want to store the recorded data as intensities or as reflectances. If the data is stored as intensities the user has the option to include white and dark references (if measured).

The user may select a wavelength grid to use instead of the raw wavelengths from the camera. By default hyperspectral cameras will have slight variations in the wavelengths that each pixel corresponds to due to manufacturing tolerances. For this reason fluxEngine provides the user with the ability to normalize the wavelengths onto a regularized grid.

(When fluxEngine processes data in models, wavelengths are always normalized, but this option allows the user to already perform the normalization during recording.)
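
The idea of mapping a spectrum onto a regular wavelength grid can be sketched as follows. This section does not state which interpolation scheme fluxEngine uses, so the linear interpolation shown here is an assumption, as are the function name and the clamping at the ends of the source range.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Linearly interpolates a spectrum sampled at ascending wavelengths `srcWl`
// onto the wavelengths in `targetWl`. Targets outside the source range are
// clamped to the first or last source value (a choice made for this sketch).
std::vector<double> resampleLinear(std::vector<double> const& srcWl,
                                   std::vector<double> const& srcValues,
                                   std::vector<double> const& targetWl)
{
    assert(!srcWl.empty());
    assert(srcWl.size() == srcValues.size());
    std::vector<double> result;
    result.reserve(targetWl.size());
    for (double wl : targetWl) {
        if (wl <= srcWl.front()) {
            result.push_back(srcValues.front());
            continue;
        }
        if (wl >= srcWl.back()) {
            result.push_back(srcValues.back());
            continue;
        }
        std::size_t hi = 1;
        while (srcWl[hi] < wl)
            ++hi; // first source sample at or above wl
        std::size_t const lo = hi - 1;
        double const t = (wl - srcWl[lo]) / (srcWl[hi] - srcWl[lo]);
        result.push_back(srcValues[lo] + t * (srcValues[hi] - srcValues[lo]));
    }
    return result;
}
```

For example, a spectrum with values 1, 3, 2 at 900, 910 and 920 would be resampled to 2, 3 and 2.5 at the target wavelengths 905, 910 and 915.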

The processing context that is created for the recording of HSI data requires more input than just the preview context. In addition to the device itself (in order to automatically obtain the structure of the input data) the context requires additional information.

The output tensor structure will always be (1, width, bands) for each PushBroom line that is being processed.

The white and dark references may be provided via the fluxEngine::ProcessingContext::InstrumentParameters structure. It allows the user to conveniently provide buffer containers for this purpose (see Measuring References).

The static method used to create a processing context for this purpose is fluxEngine::ProcessingContext::createInstrumentHSIRecordingContext().

The following example shows how the white and dark reference buffer containers that were recorded in Measuring References can be used for creating a processing context for recording HSI data:

 // Declare variables for later use
 fluxEngine::ProcessingContext::HSIRecordingResult ctxAndInfo;
 fluxEngine::ProcessingContext ctx;
 // For storing the recording result
 fluxEngine::BufferContainer recordedData;
 // The following were measured previously:
 fluxEngine::BufferContainer whiteReference, darkReference;
 try {
     fluxEngine::ProcessingContext::InstrumentParameters parameters;
     parameters.whiteReference = &whiteReference;
     parameters.darkReference = &darkReference;
     // Measure raw intensities
     fluxEngine::ValueType valueType = fluxEngine::ValueType::Intensity;
     // Empty vector -> don't normalize wavelength grid
     std::vector<double> targetWavelengths = {};

     ctxAndInfo = fluxEngine::ProcessingContext::createInstrumentHSIRecordingContext(instrumentDevice,
                     valueType, parameters, targetWavelengths);
     ctx = std::move(ctxAndInfo.context);

     // ctxAndInfo.wavelengths contains the actual wavelengths
     //     of the HSI data
     // ctxAndInfo.whiteReference contains the white reference
     //     data after it has been normalized in the same manner
     //     as the original data (or NULL to indicate it's not
     //     present)
     // ctxAndInfo.darkReference contains the dark reference
     //     data after it has been normalized in the same manner
     //     as the original data (or NULL to indicate it's not
     //     present)
     // Other fields contain further information

     // Store up to 1000 lines
     recordedData = fluxEngine::createBufferContainer(ctx, 1000);

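     // Named differently from the InstrumentParameters variable above,
     // which lives in the same scope
     fluxEngine::InstrumentDevice::AcquisitionParameters acquisitionParameters;
     instrumentDevice->startAcquisition(acquisitionParameters);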
     // Record exactly 1000 lines
     while (recordedData.count() < 1000) {
         fluxEngine::BufferInfo buffer = instrumentDevice->retrieveBuffer(std::chrono::seconds{1});
         if (!buffer.ok)
             continue;
         ctx.setSourceData(buffer);
         ctx.processNext();
         recordedData.addLastResult(ctx);
         instrumentDevice->returnBuffer(buffer.id);
     }
     instrumentDevice->stopAcquisition();

     // The data was stored in recordedData
 } catch (std::exception& e) {
     std::cerr << "An error occurred: " << e.what() << std::endl;
     exit(1);
 }

The previous example used a fluxEngine::BufferContainer to also store the result of the recording. That is the simplest way to do this, but it is also possible to access the data of each processed line directly via the output sink of the context:

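 // Process exactly 1000 lines; they are counted here, as nothing is
 // added to recordedData in this variant
 int linesProcessed = 0;
 while (linesProcessed < 1000) {
     fluxEngine::BufferInfo buffer = instrumentDevice->retrieveBuffer(std::chrono::seconds{1});
     if (!buffer.ok)
         continue;
     ++linesProcessed;
     ctx.setSourceData(buffer);
     ctx.processNext();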
     // See below for details on how the output sink logic
     // works.
     auto outputData = ctx.outputSinkData(0);
     // Get a tensor view on the output data
     fluxEngine::TensorData view{outputData};
     // For HSI data:
     //   view.order() == 3
     //   view.dimension(0) == 1 (1 line)
     //   view.dimension(1) == width
     //   view.dimension(2) == bands (# wavelengths)
     //   view.dataType() will differ, depending on
     //       the device, and with what options the
     //       context was created
     // Here user code could do something with the data
     instrumentDevice->returnBuffer(buffer.id);
 }

The previous example selected fluxEngine::ValueType::Intensity and provided a white reference when creating a processing context. The following table illustrates the various possible combinations:

Value Type	White reference supplied	Allowed	White reference included in result
Intensity	no	yes	no
Intensity	yes	yes	yes
Reflectance	no	no (see footnote 1)	n/a
Reflectance	yes	yes	yes

It is also possible to specify the white reference manually in the form of raw tensor data, instead of specifying it in the form of buffer containers. For details, please take a look at the reference documentation of the fluxEngine::ProcessingContext::InstrumentParametersEx structure, which may be passed to fluxEngine::ProcessingContext::createInstrumentHSIRecordingContext() instead.

Footnotes

1: With the virtual PushBroom camera it is possible to use image cubes that have already been stored in reflectances. In that specific corner case it will be possible to create such a processing context, but that will not be true for real devices.

Note

As with all context creation functions, it is also possible to specify a processing queue set as an optional second parameter to the method to associate the context with a different processing queue set.

Model Processing

Finally it is possible to use data from an instrument as the input of models that have been loaded. The user must have first loaded a fluxEngine model from disk using the functions described in Models.

Creating a processing context for model processing takes the following inputs:

The instrument device to create the context for

The model to use

Optionally a white & dark reference, again in the form of a fluxEngine::ProcessingContext::InstrumentParameters structure supplied by the user

The selected model must be compatible with the input data the connected instrument device generates, otherwise processing context creation will fail.

The static method fluxEngine::ProcessingContext::createInstrumentProcessingContext() is used to create a context for instrument data processing. If the model doesn’t require reflectance data as its input, it is not necessary to specify a white reference. Most models, however, will require a white reference, as most models will require input data in reflectances.
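
To make clear why a white reference is needed for reflectances, the following sketch shows the usual textbook definition, reflectance = (intensity - dark) / (white - dark), applied element by element. This section does not spell out the exact formula fluxEngine uses, so this is an assumption, as are the function name, the handling of a missing dark reference as zero, and the result of 0 for a zero denominator.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Converts raw intensities to reflectances using white and dark references.
// An empty `dark` vector is treated as "no dark reference" (all zeros).
std::vector<double> toReflectance(std::vector<double> const& intensity,
                                  std::vector<double> const& white,
                                  std::vector<double> const& dark)
{
    assert(intensity.size() == white.size());
    assert(dark.empty() || dark.size() == white.size());
    std::vector<double> reflectance(intensity.size());
    for (std::size_t i = 0; i < intensity.size(); ++i) {
        double const d = dark.empty() ? 0.0 : dark[i];
        double const denominator = white[i] - d;
        // Avoid a division by zero where white and dark coincide
        reflectance[i] = denominator != 0.0 ? (intensity[i] - d) / denominator : 0.0;
    }
    return reflectance;
}
```

Without a white reference the denominator is unknown, which is why a model that expects reflectances cannot be used with raw intensities alone.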

The following example code shows how to create a processing context that processes data obtained directly from an instrument:

 // These were measured previously
 fluxEngine::BufferContainer whiteReference, darkReference;
 // This was loaded previously
 fluxEngine::Model model;
 try {
     fluxEngine::ProcessingContext ctx;

     fluxEngine::ProcessingContext::InstrumentParameters parameters;
     parameters.whiteReference = &whiteReference;
     parameters.darkReference = &darkReference;

     ctx = fluxEngine::ProcessingContext::createInstrumentProcessingContext(instrumentDevice,
                     model, parameters);

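     // Named differently from the InstrumentParameters variable above,
     // which lives in the same scope
     fluxEngine::InstrumentDevice::AcquisitionParameters acquisitionParameters;
     instrumentDevice->startAcquisition(acquisitionParameters);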
     while (true) {
         fluxEngine::BufferInfo buffer = instrumentDevice->retrieveBuffer(std::chrono::seconds{1});
         if (!buffer.ok)
             continue;
         ctx.setSourceData(buffer);
         ctx.processNext();
         // TODO: obtain result data from the context
         instrumentDevice->returnBuffer(buffer.id);
     }
     instrumentDevice->stopAcquisition();
 } catch (std::exception& e) {
     std::cerr << "An error occurred: " << e.what() << std::endl;
     exit(1);
 }

Sequence Ids

When processing data in models the buffer number (frame number) is used as a so-called sequence id. For imager cameras and spectrometers this is mostly irrelevant, but for PushBroom cameras fluxEngine uses it to modify its behavior slightly if individual frames have been lost. PushBroom cameras only provide a single line each time a buffer is returned, and an image is constructed by concatenating lines one after another. A missing buffer will mean that a line is missing, and if the data is naively concatenated, the missing line will cause distortions.

What fluxEngine does to mitigate this is the following:

For any model that outputs on a per-line basis still only the line in question will be processed. (It will not generate additional output for missing lines.)

For any model that uses algorithms that put together the current line with previous lines (such as object detection), if one or more buffers are missing between the last invocation and the current one, the algorithm will behave as if the current line had been repeated as often as there were buffers missing.

For example, if a single buffer is missing, the line after the missing buffer will be repeated once when performing any 2D reconstruction, i.e. it will occur twice.
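
The repetition rule described above can be sketched as follows. This is only an illustration of the rule, not fluxEngine's internal implementation; the function name and the representation of an image as a vector of lines are assumptions. The same idea could be used by a user who wants comparable behavior when collecting lines themselves.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Line = std::vector<float>;

// Appends `line` to `image`, repeating it once for every buffer that is
// missing between `lastSeqId` and `seqId`. Returns the number of copies
// that were appended (1 if no buffer was lost).
std::int64_t appendWithGapFill(std::vector<Line>& image, Line const& line,
                               std::int64_t lastSeqId, std::int64_t seqId)
{
    assert(seqId > lastSeqId);
    std::int64_t const missing = seqId - lastSeqId - 1;
    std::int64_t const copies = 1 + missing;
    for (std::int64_t i = 0; i < copies; ++i)
        image.push_back(line);
    return copies;
}
```

If the buffers with sequence ids 1 and 3 arrive but 2 is lost, the line of buffer 3 is appended twice, so the reconstructed image keeps its geometry.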

For data processed from the device directly, the buffer number will be used as the sequence id for this. However, when storing a buffer in a buffer container, the sequence id will not be saved – and when extracting a buffer from a buffer container, the user has the ability to select a sequence id to use, instead. (By default it would use the index within the buffer container as the sequence id.)

Note

Also note that the behavior that a missing buffer is repeated only applies to processing within a fluxEngine model – adding a buffer to a buffer container completely ignores the sequence id; if the user wants a similar behavior here, it is up to them to implement this.

PushBroom Resets

As PushBroom cameras can be thought of as incrementally building up a cube line by line, at some point the user may want to indicate that the current cube is considered complete and a new cube starts. In that case the processing context has to be reset, so that all stateful operations are reset as well, such as object detection, but also kernel-based operations.

To achieve this the method ProcessingContext::resetState() exists. Its usage is simple:

 try {
     context.resetState();
 } catch (std::exception& e) {
     std::cerr << "An error occurred: " << e.what() << std::endl;
     exit(1);
 }

It is recommended that the user performs such a reset for any model they use whenever acquisition is stopped and then restarted.

Note

There is no requirement to actually perform such a reset. If fluxEngine is used to process PushBroom frames that are obtained from a camera above a conveyor belt in a continuous process, for example, it is possible to simply process all incoming frames in a loop and never call the reset method. In that situation the reset method would only be called if the conveyor belt is stopped and has to be started up again. Depending on the specific application, however, a restart could also mean that the processing context has to be created again, for example because references have to be measured again, in which case a simple state reset is not sufficient.

Obtaining Results

After processing has completed via the ProcessingContext::processNext() method, fluxEngine provides a means for the user to obtain the results of that operation.

When designing the model in fluxTrainer that is to be used here, Output Sinks should be added to the model wherever processing results are to be obtained later.

For instrument preview and instrument recording processing contexts, fluxEngine automatically creates an output sink with sink index 0, so that the user may extract the preview or recording data. Additionally, for instrument recording contexts, the BufferContainer::addLastResult() method may be used to add the last output data to a buffer container. (That method only works for recording contexts.)

Note

If a model contains no output sinks, it can be processed with fluxEngine, but the user will have no possibility of extracting any kind of result from it.

fluxEngine provides a means to introspect a model to obtain information about the output sinks that it contains. The following two identifiers for output sinks have to be distinguished:

The output sink index: this is just a number, starting at 0 and ending at one below the number of output sinks, that is used to specify the output sink for the purposes of the fluxEngine API.

The ordering of output sinks according to this index is non-obvious. Loading the same .fluxmdl file will lead to the same order, but saving models with the same configuration (but constructed separately) can lead to different orders of output sinks.
The output id, which is a user-assignable id in fluxTrainer that can be used to mark output sinks for a specific purpose. The output id may not be unique (but should be), and is purely there for informational purposes.

For each output sink in the model the user will be able to obtain the output id of that sink. There is also a method ProcessingContext::findOutputSink() that can locate an output sink if the output id of that sink is unique. It will return the index of the sink with that output id.
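
The lookup by output id can be pictured with the following sketch. SinkInfo is a hypothetical stand-in for the per-sink meta information, and returning -1 for a missing or ambiguous id is a choice made for this example only; how ProcessingContext::findOutputSink() reports these cases is described in the reference documentation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the meta information of one output sink.
struct SinkInfo {
    int outputId;
};

// Returns the index of the only sink that carries `outputId`. Returns -1
// if no sink has that id, or if the id is used by more than one sink.
int findUniqueSinkIndex(std::vector<SinkInfo> const& sinks, int outputId)
{
    int found = -1;
    for (std::size_t i = 0; i < sinks.size(); ++i) {
        if (sinks[i].outputId != outputId)
            continue;
        if (found != -1)
            return -1; // the id is not unique
        found = static_cast<int>(i);
    }
    return found;
}
```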

To obtain information about all output sinks in the context, the method ProcessingContext::outputSinkMetaInfos() exists, which returns a vector of simple structs that contain information about each output sink. The index of the vector is also the output sink index.

 try {
     auto sinkMetaInfos = context.outputSinkMetaInfos();
     for (std::size_t i = 0; i < sinkMetaInfos.size(); ++i) {
         int sinkIndex = static_cast<int>(i);
         std::cout << "Output sink with index " << sinkIndex << " has name "
                   << sinkMetaInfos[i].name << std::endl;
     }
 } catch (std::exception& e) {
     std::cerr << "An error occurred: " << e.what() << std::endl;
     exit(1);
 }

The ProcessingContext::OutputSinkMetaInfo structure contains the following information:

The output id of the output sink
The name of the output sink as a UTF-8 string (this is the name the user specified when creating the model in fluxTrainer)
Storage type: what kind of data the output sink will return (the current options are either tensor data or detected objects)
The output delay of the output sink (only relevant when PushBroom data is being processed, see Output Delay for PushBroom Cameras for a more detailed discussion of this)

Output sinks store either tensor data or detected objects, depending on the configuration of the output sink, and where it sits in the processing chain.

Tensor Data

Tensor data within fluxEngine always has a well-defined storage order, as most algorithms that work on hyperspectral data are at their most efficient in this memory layout. While fluxEngine supports input data of arbitrary storage order, it will be converted to the internal storage order at the beginning of processing. The output data will always have the following structure:

When processing entire HSI cubes it will effectively return data in BIP storage order, that means that the dimension structure will be (y, x, λ) (for data that still has wavelength information) or (y, x, P), where P is a generic dimension, if the data has already passed through dimensionality reduction filters such as PCA.
When processing PushBroom frames it will effectively return data in LambdaX storage order, with an additional dimension of size 1 at the beginning. In that case it will either be (1, x, λ) or (1, x, P).
Pure per-object data always has a tensor structure of order 2, in the form of (N, λ) or (N, P), where N describes the number of objects detected in this processing iteration. Important: objects themselves are returned as a structure (see below), per-object data is data that is the output of filters such as the Per-Object Averaging or Per-Object Counting filter. Also note that output sinks can combine objects and per-object data, in which case the per-object data will be returned as part of the object structure.

For PushBroom data it is recommended to always combine per-object data with objects before interpreting it, as the output delay of both nodes may be different, and when combining the data the user does not have to keep track of the relative delays themselves.

To obtain the tensor structure of a given output sink, the method ProcessingContext::outputSinkTensorStructure() is available. It returns the following information:

The scalar data type of the tensor data (this is the same as the data type configured in the output sink)
The order of the tensor, which will be 2 or 3 (see above)
The maximum sizes of the tensor that can be returned here
The fixed sizes of the tensor that will be returned here. If the tensors returned here are always of the same size, the values here will be the same as the maximum sizes. Any dimension that is not always the same will have a value of -1 instead. If all of the sizes of the tensor returned here are fixed, the tensor returned will always be of the same size. (There is one notable exception: if the output sink has a non-zero output delay of m, the first m processing iterations will produce a tensor that does not have any data.)

Please refer to the documentation of ProcessingContext::OutputSinkTensorStructure for more information on how this information is returned.
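
The relationship between the fixed and the maximum sizes can be sketched with two small helpers. They operate on plain vectors of sizes; how these values are actually stored in ProcessingContext::OutputSinkTensorStructure is not shown here, and the function names are assumptions made for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// True if every dimension has a fixed size, i.e. no entry is -1.
bool hasFixedShape(std::vector<std::int64_t> const& fixedSizes)
{
    for (std::int64_t size : fixedSizes) {
        if (size < 0)
            return false;
    }
    return true;
}

// Number of scalar elements required to hold the largest tensor that the
// sink can return, e.g. to size a buffer that results are copied into.
std::int64_t maxElementCount(std::vector<std::int64_t> const& maxSizes)
{
    std::int64_t count = 1;
    for (std::int64_t size : maxSizes) {
        assert(size >= 0);
        count *= size;
    }
    return count;
}
```

A per-object sink with maximum sizes (100, 5) and fixed sizes (-1, 5), for example, does not have a fixed shape, and a buffer for its results needs room for 500 elements.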

Using ProcessingContext::outputSinkData() it is possible to obtain that tensor data after a successful processing step. It will also return information about the scalar data type, the order, and the stride structure of the resulting tensor, even if that information is in principle reconstructible from the data obtained via the ProcessingContext::outputSinkTensorStructure() method.

Tensor data retrieved from an output sink may be cast into the convenient fluxEngine::TensorData wrapper that allows easy access to tensor elements.

For example, if we know that a given sink with index sinkIndex has signed 16-bit integer data, such as a classification result, for the PushBroom line that is being processed, the following code could be used to obtain the results:

 /* obtained from previous introspection */
 int sinkIndex = ...;
 std::int64_t cube_width = ...;
 /* from current buffer */
 int64_t bufferNumber = ...;
 try {
     // outputSinkData() is called on the context object
     auto data = context.outputSinkData(sinkIndex);
     fluxEngine::TensorData view{data};
     // PushBroom data
     assert(view.dimension(0) == 1);
     assert(view.dimension(1) == cube_width);
     assert(view.dimension(2) == 1);

     for (std::int64_t x = 0; x < cube_width; ++x) {
         std::cout << "Classification result for pixel (" << x << ", " << bufferNumber << ") = "
                     << view.at<int16_t>(0, x, 0) << "\n";
     }
     std::cout.flush();
 } catch (std::exception& e) {
     std::cerr << "An error occurred: " << e.what() << std::endl;
     exit(1);
 }

Alternatively, if fluxEngine::TensorData is not used and access happens manually, the following code will give the same output:

 /* obtained from previous introspection */
 int sinkIndex = ...;
 std::int64_t cube_width = ...;
 /* from current buffer */
 int64_t bufferNumber = ...;
 try {
     // outputSinkData() is called on the context object; no TensorData
     // wrapper is needed for manual access
     auto data = context.outputSinkData(sinkIndex);
     auto classificationData = static_cast<int16_t const*>(data.data);
     /* Classification results have an inner dimension of 1, so the
     * actual sizes should be (1, cube_width, 1) for PushBroom data.
     */
     assert(data.sizes[0] == 1);
     assert(data.sizes[1] == cube_width);
     assert(data.sizes[2] == 1);

     int64_t strideY = data.strides[0];
     int64_t strideX = data.strides[1];

     for (std::int64_t x = 0; x < cube_width; ++x) {
         std::int64_t index = 0 * strideY + x * strideX;
         std::cout << "Classification result for pixel (" << x << ", " << bufferNumber << ") = "
                     << classificationData[index] << "\n";
     }
     std::cout.flush();
 } catch (std::exception& e) {
     std::cerr << "An error occurred: " << e.what() << std::endl;
     exit(1);
 }
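
The manual access above only needs two strides, because the innermost dimension has size 1. For tensors with more than one band the third stride has to be taken into account as well. The following sketch shows the general index calculation for an order 3 tensor. It assumes that strides are counted in elements rather than bytes, as suggested by the example above, which indexes a typed pointer directly; the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Element offset of (y, x, band) in an order 3 tensor. The strides are
// assumed to be given in elements, not in bytes.
inline std::int64_t elementOffset(std::int64_t const strides[3],
                                  std::int64_t y, std::int64_t x,
                                  std::int64_t band)
{
    return y * strides[0] + x * strides[1] + band * strides[2];
}

// Bounds-checked read of a single element from a raw tensor pointer.
template <class T>
T const& elementAt(T const* data, std::int64_t const sizes[3],
                   std::int64_t const strides[3],
                   std::int64_t y, std::int64_t x, std::int64_t band)
{
    assert(0 <= y && y < sizes[0]);
    assert(0 <= x && x < sizes[1]);
    assert(0 <= band && band < sizes[2]);
    return data[elementOffset(strides, y, x, band)];
}
```

For a contiguous (1, width, bands) line the strides would be (width * bands, bands, 1), so that the bands of one pixel are adjacent in memory.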