Continuing in my series of “I’m serious – Inference is useful!” blog postings, yesterday I put together an example of how you can document process modeling using Inference and some additional numerical libraries. In this example, I used the MtxVec Numerical Library collection provided by Dew Research. The particular example I documented was an experimental study from the National Institute of Standards and Technology (NIST) involving the thermal expansion of copper.
What is process modeling?
Process modeling entails using computational methods to describe, represent, and simulate the functioning of real-world scientific and engineering processes. Computer modeling complements laboratory experimentation. The purposes of modeling include:
- analysis and understanding of observed phenomena;
- testing hypotheses and theories;
- predicting process outputs under conditions which are too difficult or too expensive to measure directly;
- calibration of measurement systems; and
- process optimization.
Process modeling describes the variation in one observed quantity (response) as a function of a deterministic component, typically a mathematical function of one or more quantities (factors), and a random component (noise), typically described by a probability distribution. For example, the variation in the measured expansion of a metal like copper can be described by partitioning the variability into a deterministic part, which is a function of temperature, and some left-over random error.
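In symbols, this partition is often written as follows. The notation is an assumption chosen for illustration: y is the response, x stands for the factors, β for the model parameters, and ε for the random error.

```latex
y = f(x;\, \beta) + \varepsilon
```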
How does one build process models?
Process models are built by fitting experimental data to mathematical functions. Linear least squares regression is the most widely used modeling method. When people talk of using "regression", "linear regression" or "least squares," they are fitting their data to empirical models.
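To make the idea concrete, here is a minimal sketch of a linear least squares fit of a straight line, written in plain Python with no libraries. The function name `fit_line` and the data values are made up for illustration; they are not part of Inference or MtxVec.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross products about the means
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept
    return a, b


# Synthetic, noise-free points on the line y = 1 + 2x (illustrative only)
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0 + 2.0 * x for x in xs]
a, b = fit_line(xs, ys)
```

Because the model is linear in its parameters, the estimates come out of a closed-form calculation with no starting values and no iteration.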
Nonlinear least squares regression extends linear least squares regression for use with a much larger and more general class of empirical models. Almost any closed form mathematical function can be incorporated in a nonlinear regression model. And, unlike linear regression, there are very few limitations on the way parameters can be used in the functional part of a nonlinear regression model.
Polynomial models are among the most frequently used empirical models for fitting functions. A rational function model is a generalization of the polynomial model. Rational function models contain polynomial models as a subset. For example, the function below is a quadratic/quadratic rational function model that can be used to describe the coefficient of thermal expansion of copper metal as a function of temperature,

where Y corresponds to the coefficient of thermal expansion, X corresponds to the temperature in kelvins, and the B coefficients are the parameters estimated by nonlinear least squares. Nonlinear least squares parameter estimation is an iterative method that begins from estimated starting values.
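A quadratic/quadratic rational function is a ratio of two quadratic polynomials. One common way to write it, with the constant term of the denominator fixed at 1, is sketched here; the numbering of the parameters B1 through B5 is an assumption made for illustration.

```latex
Y = \frac{B_1 + B_2 X + B_3 X^2}{1 + B_4 X + B_5 X^2}
```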
The data
The data result from a National Institute of Standards and Technology (NIST) experimental study involving the thermal expansion of copper. The response variable is the coefficient of thermal expansion, and the predictor variable is temperature in kelvins. Let us look at the data in a simple plot.
The plot exhibits a steep initial slope that levels off to a more gradual slope. This type of response curve can often be modeled with a rational function model, like the quadratic/quadratic rational function model described earlier. The plot further indicates that there appear to be no gross outliers in the data.
Fitting the data using Dew Research MtxVec Numerical Libraries
Fitting the thermal expansion data to the quadratic/quadratic rational function model requires nonlinear fitting. This involves a three-step process: (1) define the fit function, (2) obtain initial estimates of the beta coefficients, and (3) refine the beta coefficients iteratively with the optimization method used in nonlinear regression. With a great deal of time and effort, one could write code that implemented and executed the algorithms of this three-step process. But there is a much easier way. Using specialized numerical libraries like those supplied by Dew Research in the Inference for .NET environment allows for rapid development and deployment of the solution using some simple IronPython glue code.
Using IronPython, we can define the quadratic/quadratic rational function to which the data will be fit. Next, we need initial estimates for the regression coefficients. As suggested on the NIST pages, we'll use a linear fit on selected points to get initial estimates for the regression coefficients. For this we can use the MulLinRegress routine available in the MtxVec library, and then define a function that produces an initial estimate for the rational model by performing a linear fit.
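The two steps can be sketched in plain Python. This is a library-free stand-in and does not show the MtxVec MulLinRegress API; the names `rational_qq`, `solve_linear`, and `initial_estimate`, the five-parameter form with the denominator constant fixed at 1, and the data values are all assumptions for illustration. The trick is that multiplying the model through by its denominator gives y = b1 + b2·x + b3·x² − b4·x·y − b5·x²·y, which is linear in the parameters, so five selected points yield a 5×5 linear system.

```python
def rational_qq(x, b):
    """Quadratic/quadratic rational function, b = [b1, b2, b3, b4, b5]."""
    return (b[0] + b[1] * x + b[2] * x * x) / (1.0 + b[3] * x + b[4] * x * x)


def solve_linear(a, rhs):
    """Solve a square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(a, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def initial_estimate(points):
    """Starting values from five (x, y) points via the linearized model."""
    rows = [[1.0, x, x * x, -x * y, -x * x * y] for x, y in points]
    ys = [y for _, y in points]
    return solve_linear(rows, ys)


# Synthetic check: points generated from known parameters (not the NIST data)
true_b = [1.0, 2.0, 3.0, 0.5, 0.25]
points = [(x, rational_qq(x, true_b)) for x in (0.5, 1.0, 2.0, 3.0, 4.0)]
b0 = initial_estimate(points)
```

With noise-free synthetic points the linear solve reproduces the generating parameters up to rounding. With real measurements it only gives rough starting values, which is all the nonlinear fit needs.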
Perform non-linear least squares fit
Now all we must do is put all this together and perform the nonlinear fit. Towards this end, we will use the Dew Stats tMtxNonLinReg control. The call to the FitModel() routine first loads the data, then estimates the initial regression coefficients, and then uses those initial estimates in the actual nonlinear regression.
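For readers without the library at hand, the iteration can be illustrated with a small damped Gauss-Newton loop (Levenberg-Marquardt style) in plain Python. This is a swapped-in technique for illustration, not the algorithm or API of tMtxNonLinReg; the function names, the starting values, and the synthetic data are all assumptions.

```python
def rational_qq(x, b):
    """Quadratic/quadratic rational function, b = [b1, b2, b3, b4, b5]."""
    return (b[0] + b[1] * x + b[2] * x * x) / (1.0 + b[3] * x + b[4] * x * x)


def jacobian_row(x, b):
    """Partial derivatives of rational_qq with respect to b1..b5 at x."""
    den = 1.0 + b[3] * x + b[4] * x * x
    y = rational_qq(x, b)
    return [1.0 / den, x / den, x * x / den, -x * y / den, -x * x * y / den]


def solve_linear(a, rhs):
    """Solve a square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(a, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def sum_sq(xs, ys, b):
    """Sum of squared residuals for parameters b."""
    total = 0.0
    for x, y in zip(xs, ys):
        r = y - rational_qq(x, b)
        total += r * r
    return total


def fit_nonlinear(xs, ys, b0, max_iter=200, tol=1e-20):
    """Damped Gauss-Newton iteration; returns (parameters, final SSE)."""
    b, lam = list(b0), 1e-3
    sse = sum_sq(xs, ys, b)
    n = len(b)
    for _ in range(max_iter):
        if sse < tol:
            break
        rows = [jacobian_row(x, b) for x in xs]
        res = [y - rational_qq(x, b) for x, y in zip(xs, ys)]
        # Normal equations: (J^T J + lam * diag(J^T J)) * step = J^T r
        jtj = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
        jtr = [sum(r[i] * e for r, e in zip(rows, res)) for i in range(n)]
        for i in range(n):
            jtj[i][i] *= 1.0 + lam
        step = solve_linear(jtj, jtr)
        trial = [bi + si for bi, si in zip(b, step)]
        trial_sse = sum_sq(xs, ys, trial)
        if trial_sse < sse:   # accept the step and reduce the damping
            b, sse, lam = trial, trial_sse, lam / 10.0
        else:                 # reject the step and damp more heavily
            lam *= 10.0
    return b, sse


# Synthetic, noise-free data from known parameters (not the NIST data)
true_b = [1.0, 2.0, 3.0, 0.5, 0.25]
xs = [0.5 * i for i in range(1, 21)]
ys = [rational_qq(x, true_b) for x in xs]
start = [0.9, 2.2, 2.8, 0.45, 0.27]   # deliberately perturbed starting values
initial_sse = sum_sq(xs, ys, start)
fitted, final_sse = fit_nonlinear(xs, ys, start)
```

A step is kept only when it lowers the sum of squared residuals, so the fit can never end up worse than its starting values. That is also why reasonable initial estimates matter: they determine which minimum the iteration settles into.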
Plot the fit on top of the data
Now let us take a look at the results. We create a plot that compares the experimental thermal expansion coefficient values (symbols) with the corresponding calculated values (line) based on the nonlinear fit.
Conclusion
We conclude that the quadratic/quadratic rational function model does in fact provide a reasonable empirical model for this data set. Further evaluation would reveal that a cubic/cubic rational function model fits slightly better.