In the following image, each spectrum represents a different octane number.
Linear modeling is used in industry to define the relationship between an acquired spectrum and the lab value used as its reference. Once established, this relationship makes it possible to predict the expected value from an unknown spectrum.
Several modeling software tools are available for model development.
These tools require the user to prepare the training set: a matching list of spectra and lab data, with each pair synchronized to the same time.
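As a rough illustration of the synchronization step, the sketch below pairs each lab value with the spectrum acquired closest in time. The function name `pair_by_time`, the tuple layout and the 5-minute tolerance are assumptions made for this example; they are not part of any specific modeling tool.

```python
from bisect import bisect_left
from datetime import datetime, timedelta

def pair_by_time(spectra, lab_values, tolerance=timedelta(minutes=5)):
    """Pair each lab value with the spectrum acquired closest in time.

    spectra:    list of (timestamp, spectrum) tuples, sorted by timestamp
    lab_values: list of (timestamp, value) tuples
    Lab values with no spectrum within `tolerance` are dropped.
    (Hypothetical helper; the tolerance is an assumed default.)
    """
    times = [t for t, _ in spectra]
    pairs = []
    for lab_time, value in lab_values:
        i = bisect_left(times, lab_time)
        # Candidates: the spectrum just before and just after the lab time.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(times)]
        if not candidates:
            continue
        best = min(candidates, key=lambda j: abs(times[j] - lab_time))
        if abs(times[best] - lab_time) <= tolerance:
            pairs.append((spectra[best][1], value))
    return pairs
```

A lab value whose time stamp is wrong will either be dropped here or be paired with the wrong spectrum, which is one way outliers enter the training set.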
Once the training set is loaded into the software, the user has to do the following:
1. Select the desired number of factors.
2. Exclude bad spectra / lab values from the training set. This is an iterative process: while R-Squared is too low and outliers exist:
- Plot the lab / predictions graph to evaluate the model accuracy / robustness.
- Exclude one or more points from the training set.
3. Export the model.
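The manual procedure above can be sketched in a few lines of NumPy. The source does not name the regression method, so principal component regression (PCR) is used here as one common factor-based linear model. Dropping the point with the largest residual stands in for the user's visual judgment of the lab / predictions plot. The names `fit_pcr`, `manual_cleanup`, `r2_target` and `max_removed` are hypothetical.

```python
import numpy as np

def fit_pcr(X, y, n_factors):
    """PCR (assumed method): regress y on the first n_factors PCA scores of X."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    # Rows of Vt are the principal directions (loadings) of the centered spectra.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_factors].T                      # (n_wavelengths, n_factors)
    T = Xc @ P                                # factor scores
    b, *_ = np.linalg.lstsq(T, y - y_mean, rcond=None)
    return P @ b, x_mean, y_mean              # coefficients per wavelength

def predict(model, X):
    coef, x_mean, y_mean = model
    return (X - x_mean) @ coef + y_mean

def r_squared(y, y_hat):
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def manual_cleanup(X, y, n_factors, r2_target=0.98, max_removed=5):
    """Refit, check R-Squared, drop the worst point, repeat (assumed stop rules)."""
    keep = np.arange(len(y))
    while True:
        model = fit_pcr(X[keep], y[keep], n_factors)
        y_hat = predict(model, X[keep])
        r2 = r_squared(y[keep], y_hat)
        if r2 >= r2_target or len(y) - len(keep) >= max_removed:
            return model, keep, r2
        # Exclude the training point with the largest absolute residual.
        worst = np.argmax(np.abs(y[keep] - y_hat))
        keep = np.delete(keep, worst)
```

Removing one point at a time and refitting is slow and depends on the order of removal, which is the limitation of the manual approach.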
Manual Model Development - the traditional way
Training set and outliers
Outliers in the training set can be caused by:
1. Typos (wrong lab values are received from the lab).
2. Wrong time stamps.
3. Analyzer issues - noise / NIR lamp / dirty tube etc.
4. Temporary Process Changes - mixing samples, dirty samples etc.
5. Any other reason that can cause a change in the acquired spectrum.
The ModelGateway solution
ModelGateway uses an advanced algorithm to navigate through the training set.
Different training set combinations are evaluated, and a search algorithm is used to find the set that gives the best value of the chosen criterion (R-Squared, M-Distance and other criteria are used). The model tested on each iteration is a standard linear model; what makes it possible to find the best set is the optimized evaluation of thousands and millions of candidate models.
The best training set is then saved and used for further predictions on acquired spectra.
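To make one of the criteria concrete, the sketch below computes the M-Distance (Mahalanobis distance) of spectra from a training set, measured in the space of the first few principal component scores. This is a generic textbook formulation and only an assumption about how such a criterion could be evaluated; the source does not describe ModelGateway's internal computation. The function name `m_distance` is hypothetical.

```python
import numpy as np

def m_distance(X_train, X_new, n_factors):
    """Mahalanobis distance of each row of X_new from the training set,
    measured on the first n_factors principal component scores."""
    x_mean = X_train.mean(axis=0)
    _, _, Vt = np.linalg.svd(X_train - x_mean, full_matrices=False)
    P = Vt[:n_factors].T                      # loadings
    T_train = (X_train - x_mean) @ P          # training scores (zero mean)
    T_new = (X_new - x_mean) @ P              # scores of the spectra to check
    cov = np.atleast_2d(np.cov(T_train, rowvar=False))
    cov_inv = np.linalg.inv(cov)
    # d_i = sqrt(t_i^T C^-1 t_i) for each score vector t_i.
    return np.sqrt(np.einsum("ij,jk,ik->i", T_new, cov_inv, T_new))
```

A spectrum with a large distance lies far from the bulk of the training set in factor space, so a search over training set combinations could use this value, alongside R-Squared, when scoring a candidate set.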
The exported model can now be used by the analyzer to predict values from newly acquired spectra and report the predictions to the DCS.