3/3/2009
One of the features that we added to the most recent release of Inference is the ability to print Expression output to the header and footer regions of Microsoft Word documents. I recently created a sample How-To Guide that illustrates dynamic header and footer output in an Inference in Word document. If you have Inference for .NET installed, you can download the How-To Guide, entitled "Header and Footer Output in Word," from Inference Online:
To print output to headers and footers:
- Create an Expression that returns a value.
- Enter a Label for the Expression – for example, MyExpression.
- In a header or footer, type the Label of the Expression enclosed in angle brackets (the less-than and greater-than signs). Example: <MyExpression>.
When the document is executed, the bracketed Expression label placeholders will be replaced with the actual code output from the Expression.
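To make the substitution step concrete, here is a minimal Python sketch of the idea. It is not how Inference itself is implemented; the function name `fill_placeholders`, the sample header text, and the sample labels are all made up for illustration.

```python
import re

def fill_placeholders(text, outputs):
    """Replace each <Label> in text with the matching Expression output."""
    def substitute(match):
        label = match.group(1)
        # Leave unknown labels untouched so a missing Expression is easy to spot.
        return str(outputs.get(label, match.group(0)))
    return re.sub(r"<(\w+)>", substitute, text)

# Hypothetical header text and Expression outputs.
header = "Report: <Title> (version <Version>) <Unknown>"
outputs = {"Title": "Cord Reliability", "Version": 3}
filled = fill_placeholders(header, outputs)
print(filled)
```

Leaving unmatched labels in place is a design choice for this sketch only: a stray `<Unknown>` in the output points straight at the Expression that is missing.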
Here are some additional tips:
- You can use the Output Result property of the Expression to specify whether the Expression gets evaluated in the Word canvas during document execution. This lets you restrict the Expression output to the header or footer only.
- You can use the Alias Code Text property of the Expression to specify whether the code, or the label, is shown in the Word canvas for an Expression. This makes the Word canvas easier to read.
- If your document includes a DataFrame, you can use an Expression to access its attributes. For example, if your DataFrame includes a title or version number, you can include that in the header/footer of your results document.
- Header/footer output will also be included in PDF results documents.
- This same functionality also works in Excel.
In the actual How-To Guide, we create two Expressions that will generate the header and footer output, and then add the placeholders into the header and footer areas of the document:
When the document is executed to a Microsoft Word View, the Expressions are evaluated, and their output replaces those placeholders in the header and footer sections:
That's pretty much it. You can download this How-To Guide from Inference Online and try it out yourself!
1/29/2009
Designing a product for reliability is a critical element of the manufacturing lifecycle. To start, one must understand the durability of a product over time or when subjected to a particular strain. The following examples illustrate how survival/failure/reliability data can be analyzed and documented using the NAG Numerical C Library and Inference for .NET. The first example examines the reliability of a fiber product, measuring a braided cord’s strength under tension, while the second example computes the survival curve for carbon fibers subjected to strain.
Scenario
Manufacturers face intense global competition, pressure for shorter product-life cycles, stringent cost constraints, and higher customer expectations of quality and reliability. In response to these needs, manufacturing industries have gone through a revolution in the use of statistical methods for product quality. A central element of this approach is a focus on product reliability, typically defined as “quality over time” centered on how long a component can be used until it fails.
Design for reliability requires a core understanding of anticipated and unexpected product failure modes. Important product areas include structural fibers (e.g., polymer and carbon fibers) used in cords (e.g., tire cord) and composites (e.g., carbon fiber composites). Although failure as a function of time (survival: time to failure) is commonly analyzed, failure as a function of some applied strain (survival: tension to failure) is equally appropriate. The only requirement is that the applied strain be measurable and always positive.
Design for reliability often involves analyses where a treated or modified product is evaluated in comparison to a control group to address the following questions:
- What level of strain on average can the product tolerate before it fails?
- Does a particular treatment or modification result in higher reliability under strain?
- What are the risk factors that affect reliability under strain?
I will use Inference for .NET and the NAG C Library in two scenarios to answer these questions.
Example #1 – Survival of Braided Cord Subject to Strain
The objective of this study is to estimate the reliability of a fiber product in terms of the probability that a braided cord survives (does not break) beyond a certain level of applied strain (tension). Specifically, the CordData dataset contains measurements of 48 sections of braided cord, which were placed under increasing tension. During the course of our measurements, seven (7) of the braided cords became damaged and the corresponding measurements were stopped. These are censored measurements (Censor=1): the last value recorded before the experiment was stopped tells us only that the breaking tension is at least that large. By contrast, the other measurements (Censor=0) correspond to the tension at which the braided cord actually broke.
To get started, I opened a new Word document, added an Inference Parts Container, and imported the relevant data (labeled CordData). Subsequently, I embedded the NAG C# imports assembly of NAG C functions to the dynamic Inference document (labeled NAGwrapper):
I decided to start by generating a survival curve for the braided cord from the Kaplan-Meier estimates of the survival probabilities. To do this, I added a code block with supporting code that calls the g12aac routine of the embedded .NET library and plots the results.
The following chart was the result of executing this Inference document.
Note that the median survival tension is estimated to be 54.8 MPa. In other words, a braided cord that has been subjected to weathering has a 50% probability of surviving (that is, not breaking) up to an applied tension of at least 54.8 MPa. The plot shows the survival curve together with the 95% confidence intervals based on Greenwood’s formula.
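For readers who want to see what the Kaplan-Meier calculation involves, here is a minimal pure-Python sketch of the product-limit estimate with Greenwood standard errors. It is not the NAG g12aac interface, the function names are my own, and the six data points are made up (they are not the CordData measurements). The censoring flag follows the convention above: 1 means censored, 0 means the cord broke. The confidence band is the plain "estimate plus or minus 1.96 standard errors" interval, clipped to the range 0 to 1.

```python
import math

def kaplan_meier(values, censored, z=1.96):
    """Return [(value, survival, lower, upper)] at each distinct failure value.

    values:   tension (or time) at which each unit failed or was censored
    censored: 1 if the observation was censored, 0 if the unit failed
    """
    data = sorted(zip(values, censored))
    at_risk = len(data)
    survival = 1.0
    greenwood_sum = 0.0
    curve = []
    i = 0
    while i < len(data):
        value = data[i][0]
        tied = [c for v, c in data if v == value]
        failures = tied.count(0)
        if failures:
            # Product-limit update: multiply by the fraction surviving this value.
            survival *= 1.0 - failures / at_risk
            if at_risk > failures:
                greenwood_sum += failures / (at_risk * (at_risk - failures))
            std_err = survival * math.sqrt(greenwood_sum)
            curve.append((value, survival,
                          max(0.0, survival - z * std_err),
                          min(1.0, survival + z * std_err)))
        at_risk -= len(tied)
        i += len(tied)
    return curve

def median_survival(curve):
    """Smallest value at which the estimated survival drops to 0.5 or below."""
    for value, survival, _, _ in curve:
        if survival <= 0.5:
            return value
    return None

# Hypothetical breaking tensions; 1 marks a censored measurement.
tension = [3, 5, 5, 8, 10, 12]
censor = [0, 0, 1, 0, 1, 0]
curve = kaplan_meier(tension, censor)
median = median_survival(curve)
```

Censored cords never lower the survival estimate directly; they only shrink the number of cords still at risk, which is why they cannot simply be dropped from the data.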
Example #2 – Survival of a Single Carbon Fiber Subject to Strain
High-tech composites, like carbon fiber composites, exploit the qualities of fiber products to achieve strength with the least possible mass. Towards that end, we need to understand clearly the factors that affect strength and how to manipulate them to design the material response to strain. Accordingly, the objective of this study was to compare reliability of single carbon fibers subject to tension across four (4) groups corresponding to fibers of different lengths.
To achieve this goal, I first imported the representative data into the Inference document (labeled CarbonfiberData), and then added two code blocks. The first code block contained code that calls the NAG library g12aac function, as done for the braided cord data; this time, however, the analysis was performed separately for the four groups of fiber lengths.
The second code block used the g12bac function to fit a Cox proportional hazards model to the carbon fiber data. Both code blocks plotted the results using Inference graphing functions.
Using these coefficients, we can construct a "customized" survival curve for any particular carbon fiber length. More importantly, the method provides a measure of the sampling error associated with the predictor's coefficient. This lets us assess whether the carbon fiber length coefficient is significantly different from zero—that is, whether the carbon fiber length is significantly related to carbon fiber survival when subjected to tension.
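Both ideas in the paragraph above fit in a few lines of Python. This is only a sketch under stated assumptions: it takes a fitted coefficient and its standard error as given (it does not fit the Cox model, and it is not the g12bac interface), it uses a normal-approximation Wald test for significance, and every number in it is made up for illustration.

```python
import math

def wald_test(coefficient, std_error):
    """Two-sided Wald test of H0: coefficient == 0, using a normal approximation."""
    z = coefficient / std_error
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

def customized_survival(baseline_survival, coefficient, covariate):
    """Cox model survival curve: S(t | x) = S0(t) ** exp(coefficient * x)."""
    hazard_ratio = math.exp(coefficient * covariate)
    return [s0 ** hazard_ratio for s0 in baseline_survival]

# Hypothetical fitted coefficient for fiber length and its standard error.
beta, beta_se = 0.03, 0.01
z, p_value = wald_test(beta, beta_se)

# Hypothetical baseline survival probabilities (covariate = 0) at increasing tension.
baseline = [1.0, 0.9, 0.7, 0.4]
curve_for_50mm = customized_survival(baseline, beta, 50.0)
```

A positive coefficient means a hazard ratio above 1, so longer fibers get a survival curve that lies below the baseline at every tension.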
Conclusion
Combining the Numerical Algorithms Group’s C Library and Inference for .NET eliminates the need for clunky reliability analysis software or inflexible add-ins. Instead, it is possible to test the reliability of a product using trusted methods in a user-friendly environment like Word and Excel. And, with on-the-fly scripting via IronPython, the development process moves much faster. Best of all, anybody can easily pick up where I left off. By using Inference, all pertinent code, objects, data, and explanatory text are kept neatly in a single Word document, making it easy for other scientists to validate my methods or re-run the analysis using their own data.
12/24/2008
Visual Numerics’ computational analysis libraries offer robust, accurate, and reliable algorithms for use in science, technical, and business environments. Inference for .NET enables computational analysis libraries to be easily encapsulated in Word documents, along with high level scripting commands, data, and explanatory text. With these capabilities at hand, I can rapidly create tailored solutions aimed at solving specific data analysis problems.
In this example, I’m going to show you how I can use Inference for .NET and Visual Numerics’ IMSL C# Numerical Library to perform an analysis of manufacturing data and document the results.
The Scenario
A pharmaceutical manufacturing facility has been producing a single-dose tablet drug product for several years. Its current measurement systems are based on storing finished material and performing offline quality tests to ensure that the finished product meets performance specifications.
Our manufacturing team is assigned to investigate the manufacturing process and improve its consistency (sigma capability) by using Quality by Design tools.
What is Quality by Design?
The principles of QbD in pharmaceutical development are fairly straightforward. QbD requires the achievement of two levels of understanding:
- Clinical Understanding, which establishes a link between the attributes of the drug product and safety and efficacy in humans; and
- Process Understanding, which establishes a link between the attributes of the drug product and process parameters, process attributes and material attributes of the active pharmaceutical ingredient (API) and excipients that go into the drug product.
By implementing Quality by Design practices, the pharmaceutical manufacturer can increase drug quality and safety while reducing production costs. In this step of the investigation, we use Quality-by-Design tools from the Statistical Process Control arena to establish the current state of the process based on an analysis of historical manufacturing data.
A Tailored Solution
We’ve been tasked with using process control charts to analyze a key tablet performance metric and determine whether the manufacturing process is in a state of statistical control. We’ll collaborate remotely with our team, using a regulatory-compliant workflow enabled by Microsoft Office SharePoint Server, to create a dynamic Inference document.
To start, we open a new Word document and add an Inference Parts Container. The Visual Numerics IMSL C# Numerical Library contains a plethora of process control functions that members of our team have used successfully in past projects. So we place this .NET assembly (labeled ImslCL) into the Word document. Then, team members import manufacturing data (labeled ManufacturingData) from an Excel spreadsheet:
The first question we need to answer is, "What is the problem with the manufacturing process?" To address this, we decide to generate a Shewhart Control Chart by adding a code block and supporting IronPython code that uses the ShewhartControlChart class of the embedded .NET library:
At this point, we’ve successfully created a dynamic Inference document that can be executed to create an Inference results document, producing the Shewhart Control Chart for analysis:
This chart shows us that 14 out of 90 batches failed to meet the 60-minute dissolution requirement. Now, we present these findings to our team as an Inference results document, concluding that even though the manufacturing process was not on target for process capability, the spread was small enough (within 3 sigmas) to indicate a process under statistical control.
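The distinction between "meets the specification" and "is in statistical control" is easier to see with the control limits written out. Below is a minimal Python sketch of an individuals (X) chart with 3-sigma limits. It does not use the IMSL ShewhartControlChart class, the function names are my own, and the batch values are made up. Sigma is estimated from the average moving range divided by 1.128, the standard d2 constant for ranges of two consecutive points.

```python
def individuals_chart_limits(measurements):
    """Center line and 3-sigma limits for an individuals (X) control chart."""
    center = sum(measurements) / len(measurements)
    moving_ranges = [abs(b - a) for a, b in zip(measurements, measurements[1:])]
    # d2 = 1.128 converts the average moving range of size 2 into a sigma estimate.
    sigma = (sum(moving_ranges) / len(moving_ranges)) / 1.128
    return center - 3 * sigma, center, center + 3 * sigma

def out_of_control(measurements):
    """Indices of points that fall outside the 3-sigma control limits."""
    lower, _, upper = individuals_chart_limits(measurements)
    return [i for i, x in enumerate(measurements) if x < lower or x > upper]

# Hypothetical dissolution results for twelve batches.
batches = [58, 61, 59, 62, 60, 61, 59, 60, 75, 60, 61, 59]
lower, center, upper = individuals_chart_limits(batches)
flagged = out_of_control(batches)
```

The control limits come from the process data alone; the 60-minute requirement is a specification and never enters this calculation, which is why a process can be in control and still off target.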
Other members of our team perform further analysis in our dynamic Inference document using additional functions called from the VNI IMSL C# Numerical Libraries. Upon reviewing their results, our team determines that the manufacturer’s objective in development should be to move the process capability distribution toward the center while maintaining the same spread.
10/28/2008
Continuing in my series of “I’m serious – Inference is useful!” blog postings, yesterday I put together an example of how you can document process modeling using Inference and some additional numerical libraries. In this example, I used the MtxVec Numerical Library collection provided by Dew Research. The particular example I documented was an experimental study from the National Institute of Standards and Technology (NIST) involving the thermal expansion of copper.
What is process modeling?
Process modeling entails using computational methods to describe, represent and simulate the functioning of real-world scientific and engineering processes. Computer modeling is a complementary method to laboratory experimentation. The purposes of modeling include:
- analysis and understanding of observed phenomena;
- testing hypotheses and theories;
- predicting process outputs under conditions which are too difficult or too expensive to measure directly;
- calibration of measurement systems; and
- process optimization.
Process modeling describes the variation in one observed quantity (response) as a function of a deterministic component, typically a mathematical function of one or more quantities (factors), and a random component (noise), typically described by a probability distribution. For example, the variation in the measured expansion of a metal like copper can be described by partitioning the variability into a deterministic part, which is a function of temperature, and some left-over random error.
How does one build process models?
Process models are built by fitting experimental data to mathematical functions. Linear least squares regression is the most widely used modeling method. When people talk of using "regression", "linear regression" or "least squares," they are fitting their data to empirical models.
Nonlinear least squares regression extends linear least squares regression for use with a much larger and more general class of empirical models. Almost any closed form mathematical function can be incorporated in a nonlinear regression model. And, unlike linear regression, there are very few limitations on the way parameters can be used in the functional part of a nonlinear regression model.
Polynomial models are among the most frequently used empirical models for fitting functions. A rational function model generalizes the polynomial model: it is a ratio of two polynomials, so polynomial models are the special case in which the denominator is constant. For example, the function below is a quadratic/quadratic rational function model that can be used to describe the coefficient of thermal expansion of copper metal as a function of temperature,

$$Y = \frac{B_0 + B_1 X + B_2 X^2}{1 + B_3 X + B_4 X^2}$$

where Y corresponds to the coefficient of thermal expansion, X corresponds to the temperature in degrees Kelvin, and B0 through B4 are the parameters estimated by nonlinear least squares. Nonlinear least squares parameter estimation is an iterative method that requires estimated starting values.
The data
The data come from a National Institute of Standards and Technology (NIST) experimental study involving the thermal expansion of copper. The response variable is the coefficient of thermal expansion, and the predictor variable is temperature in degrees Kelvin. Let us look at the data in a simple plot.
The plot exhibits a steep initial slope that levels off to a more gradual slope. This type of response curve can often be modeled with a rational function model, like the quadratic/quadratic rational function model described earlier. The plot further indicates that there appear to be no gross outliers in the data.
Fitting the data using Dew Research MtxVec Numerical Libraries
Fitting the thermal expansion data to the quadratic/quadratic function model requires non-linear fitting. This involves a three-step process: (1) define the fit function, (2) obtain initial estimates of the beta coefficients, and (3) iterate the beta coefficients using the optimization method used in nonlinear regression. With a great deal of time and effort, one could write code that implemented and executed the algorithms of this three-step process. But there is a much easier way. Using specialized numerical libraries like those supplied by Dew Research in the Inference for .NET environment allows for rapid development and deployment of the solution using some simple IronPython glue code.
Using IronPython, we can define the quadratic/quadratic rational function to which the data will be fit. Next, we need initial estimates for the regression coefficients. As suggested on the NIST pages, we'll use a linear fit on selected points to get initial estimates for the regression coefficients. For this we can use the MulLinRegress routine available in the MtxVec library, and then define a function that produces an initial estimate for the rational model by performing a linear fit:
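Here is a library-free Python sketch of the same two steps. It assumes the usual five-parameter quadratic/quadratic form Y = (B0 + B1·X + B2·X²) / (1 + B3·X + B4·X²), with parameters I index as b[0] through b[4]. Multiplying through by the denominator gives Y = B0 + B1·X + B2·X² − B3·X·Y − B4·X²·Y, which is linear in the parameters, so five selected points determine them exactly. It does not call MulLinRegress; the small Gaussian elimination helper stands in for the linear fit, and the data points are synthetic.

```python
def rational_qq(x, b):
    """Quadratic/quadratic rational function with parameters b[0]..b[4]."""
    return (b[0] + b[1] * x + b[2] * x * x) / (1.0 + b[3] * x + b[4] * x * x)

def solve_linear(a, rhs):
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(a, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (m[r][n] - tail) / m[r][r]
    return solution

def initial_estimates(points):
    """Estimate b from five (x, y) points using the linearized model."""
    rows = [[1.0, x, x * x, -x * y, -x * x * y] for x, y in points]
    return solve_linear(rows, [y for _, y in points])

# Synthetic check: generate five points from known parameters and recover them.
true_b = [1.0, 2.0, 0.5, 1.0, 1.0]
points = [(x, rational_qq(x, true_b)) for x in (0.0, 0.5, 1.0, 1.5, 2.0)]
estimated_b = initial_estimates(points)
```

With noisy measurements the five-point solution will not reproduce the parameters exactly, which is fine: it only needs to land close enough for the nonlinear iteration to start from.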
Perform non-linear least squares fit
Now all we must do is put all this together and perform the non-linear fit. Towards this end, we will use the Dew Stats tMtxNonLinReg control. The call to the FitModel() routine first loads the data, then computes initial estimates of the regression coefficients, and then uses those estimates in the actual non-linear regression:
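To show what the iteration is doing under the hood, here is a pure-Python sketch of a damped Gauss-Newton refinement for the same five-parameter quadratic/quadratic form, Y = (B0 + B1·X + B2·X²) / (1 + B3·X + B4·X²). This is my own stand-in, not the algorithm or interface of tMtxNonLinReg, and the data are synthetic values generated from known parameters rather than the NIST measurements.

```python
def rational_qq(x, b):
    """Quadratic/quadratic rational function with parameters b[0]..b[4]."""
    return (b[0] + b[1] * x + b[2] * x * x) / (1.0 + b[3] * x + b[4] * x * x)

def solve_linear(a, rhs):
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(a, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (m[r][n] - tail) / m[r][r]
    return solution

def sum_squares(xs, ys, b):
    """Sum of squared residuals between the data and the model."""
    return sum((y - rational_qq(x, b)) ** 2 for x, y in zip(xs, ys))

def gauss_newton(xs, ys, start, iterations=50):
    """Refine the starting values by damped Gauss-Newton iterations."""
    b = list(start)
    for _ in range(iterations):
        jtj = [[0.0] * 5 for _ in range(5)]
        jtr = [0.0] * 5
        for x, y in zip(xs, ys):
            q = 1.0 + b[3] * x + b[4] * x * x
            f = rational_qq(x, b)
            # Partial derivatives of the model with respect to b[0]..b[4].
            grad = [1.0 / q, x / q, x * x / q, -f * x / q, -f * x * x / q]
            for i in range(5):
                jtr[i] += grad[i] * (y - f)
                for j in range(5):
                    jtj[i][j] += grad[i] * grad[j]
        step = solve_linear(jtj, jtr)
        current = sum_squares(xs, ys, b)
        scale = 1.0
        trial = [bi + si for bi, si in zip(b, step)]
        # Halve the step until the fit stops getting worse.
        while sum_squares(xs, ys, trial) > current and scale > 1e-8:
            scale /= 2.0
            trial = [bi + scale * si for bi, si in zip(b, step)]
        if sum_squares(xs, ys, trial) > current:
            break  # no improving step found; keep the current estimate
        b = trial
    return b

# Synthetic data from known parameters, and deliberately perturbed starting values.
true_b = [1.0, 2.0, 0.5, 1.0, 1.0]
xs = [0.25 * k for k in range(13)]
ys = [rational_qq(x, true_b) for x in xs]
start = [1.05, 1.95, 0.55, 0.95, 1.05]
fit = gauss_newton(xs, ys, start)
```

The step halving is the reason starting values matter less than they otherwise would: a full Gauss-Newton step from a poor start can overshoot, and shrinking it keeps the residual sum of squares from increasing.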
Plot the fit on top of the data
Now let's take a look at the results. We create a plot that compares the experimental thermal coefficient values (symbols) with the corresponding calculated values (line) based on the non-linear fit.
Conclusion
We conclude that the quadratic/quadratic rational function model does in fact provide a reasonable empirical model for this data set. Further evaluation would reveal that a cubic/cubic rational function model does even a bit better.
10/17/2008
In a previous blog entry, I illustrated how Inference could be combined with computational libraries, such as CenterSpace’s NMath, to quickly assemble document-based statistical analysis solutions. Along those same lines, I recently downloaded a trial of WebCab Components’ Options and Futures .NET library, which contains a collection of complex financial calculation implementations. I used this component to create a sample Inference for Word document that calculates the Present Value of European Call Options for Time to Maturity, measured in years. Say what??? Yeah, OK, on the off chance you didn't major in economics in college, let’s cover some quick background first.
European Call Options Defined
A call option is a financial contract between two parties (a buyer and a seller), whereby the buyer of the option has the right, but not the obligation, to buy an agreed quantity of a particular commodity or financial instrument from the seller of the option:
- at a certain time (expiration date)
- for a certain price (strike price)
If the buyer decides to do so (exercise the option), the seller of the option is obligated to sell the particular commodity or financial instrument at the strike price. For this right, the buyer pays a fee (a premium).
Call options have different styles. Specifically, a European call option allows the option holder to exercise the option only on the expiration date. Call options can be purchased on most financial instruments including stocks and stock indices (e.g., S&P 500, FTSE 100).
Factors Determining Present Value of a European Call Option on a Stock Index
Call options are traded on exchanges. The present value of a call option varies depending on the time to maturity and underlying market conditions. Accordingly, traders in call options need a means of assessing the value of a call option at any time. The present value of a European option can be evaluated using the Black-Scholes model by considering the following factors:
- yield: the average dividend yield of the stocks in the index
- indexValue: the value of the index on which the option is based
- strike: the value of the index which is taken as the price at which the trade will occur
- riskFreeRate: the theoretical rate of return of an investment with zero risk, continuously compounded, typically the return on 3-month US treasuries
- volatility: the volatility of the index, which typically corresponds to the standard deviation of the continuously compounded returns of the stocks on the index
- timeToMaturity: the time until the option matures (in years)
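The six factors above are exactly the inputs of the closed-form Black-Scholes value for a European call on an asset that pays a continuous dividend yield. Here is a minimal Python sketch of that formula using only the standard library. The function name `call_on_index_value` and its argument order are my own and are not the WebCab API; the numbers in the example are made up.

```python
import math

def normal_cdf(x):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def call_on_index_value(index_value, strike, risk_free_rate, volatility,
                        time_to_maturity, dividend_yield=0.0):
    """Black-Scholes present value of a European call with a continuous dividend yield."""
    sqrt_t = math.sqrt(time_to_maturity)
    d1 = (math.log(index_value / strike)
          + (risk_free_rate - dividend_yield + 0.5 * volatility ** 2) * time_to_maturity
          ) / (volatility * sqrt_t)
    d2 = d1 - volatility * sqrt_t
    return (index_value * math.exp(-dividend_yield * time_to_maturity) * normal_cdf(d1)
            - strike * math.exp(-risk_free_rate * time_to_maturity) * normal_cdf(d2))

# Hypothetical inputs: index at 100, strike 100, 5% rate, 20% volatility, no dividends.
maturities = [0.25, 0.5, 1.0, 2.0]
values = [call_on_index_value(100.0, 100.0, 0.05, 0.20, t) for t in maturities]
```

Evaluating the function over a range of maturities, as in the last two lines, gives the kind of curve that can then be plotted against time to maturity.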
Using Inference and WebCab Libraries to Build an Estimation Application
Estimating the present value using an analytical solution of the Black-Scholes model requires evaluation of a complex mathematical function, which is typically not available in standard software packages. You can build the function in Excel, but it requires a bit of know-how and several hours of your time. However, specialized libraries of functions, like the WebCab Options and Futures Library for .NET, provide a function, CallOnIndex, that evaluates the present value of a call option on a stock index. Using such .NET libraries typically requires a skilled developer to build a custom application using C# in Microsoft Visual Studio. However, mere mortals can achieve the same end in a matter of minutes using the same library in conjunction with Inference for .NET and some IronPython glue code.
To begin, I created a new Word document, added an Inference Parts Container, and then embedded the WebCab Options and Futures .NET assembly:
Next, I added a series of Code Blocks with code to handle initialization and variable definition (pertaining to the Present Value scenario), along with explanatory text:
Then, I added another Code Block and added the IronPython code to perform the actual present value calculation and plot the results:
When exported, this yields a results document that plots the Present Value of European Call Options for Time to Maturity, where each line represents a different Time to Maturity Value:
Again, this example demonstrates how easily Inference can be combined with powerful .NET libraries, such as WebCab’s financial components, to quickly create document-based business applications.
10/8/2008
Performing data analysis is one of the core applications of dynamic scripting. The combination of dynamic scripting and computational analysis libraries allows users to rapidly prototype and implement data analysis solutions. To illustrate this, I recently downloaded a trial of NMath, a collection of software components from CenterSpace Software that provides “numerical components for mathematical, financial, engineering, and scientific applications on the .NET platform,” and created a sample Inference in Word document that illustrates using these libraries. The specific sample I created is based on CenterSpace’s own DataFrameExample, which demonstrates how to load sample data into a vector-based data storage object (called a “data frame”) and perform some simple data manipulations and statistical calculations.
When I re-created this example in Inference, I opted to store the data in an Inference in Excel DataFrame Document (complete with explicit data-typing), which I embedded in Inference in Word:
Then, I embedded the NMath assemblies into the Inference in Word document:
Next, I was able to add code blocks (comprising the data manipulation and statistical calculation code) and explanatory text to form a literate document:
When exported to a Microsoft Word Results Document, the data from the Excel DataFrame is loaded, and the code is executed against that DataFrame. The resulting Word document combines the code and text of the original document with the code output to form a Results Document. The screenshot below shows the permuted DataFrame columns:
This screenshot illustrates the generation of a column dictionary and calculation of some basic statistics:
This sample illustrates the ease with which data analysis can be performed and documented in Microsoft Office using Inference and available data analysis libraries, such as CenterSpace’s NMath.
10/1/2008
One of the more novel features of Inference for .NET is that it allows you to encapsulate .NET assemblies that you want to call from an Inference document directly into the document itself. By embedding an assembly into an Inference document, anyone that you give or send this document to only needs to have Inference for .NET installed to execute the document -- the embedded assemblies get extracted at runtime. This makes deployment of your document a cinch -- all your target users need is Office and Inference -- there’s no need to deploy anything other than the specific document.
Let’s take a look at an actual example of this. I’m going to share an Inference for Word sample that illustrates embedding and accessing an assembly that encapsulates a web service in a Word document. Specifically, this sample uses a web service provided by the National Center for Biotechnology Information (NCBI), available from:
http://www.ncbi.nlm.nih.gov
The NCBI provides public databases useful in computational biology and genome research. You can download this sample (called "NCBI ELink Web Service in Word Sample.docx") from Inference Online.
To start, I created a .NET assembly that encapsulates the NCBI web service, which I called NCBIWebService:

Then I created a new Word document, added an Inference Parts Container, and embedded the assembly DLL created above into the document:

Next, I added the requisite IronPython code that accesses the web service (specifically, the code queries for a list of all available publication indices) and prints out the results:
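If you want to try a similar query without the wrapper assembly, here is a rough Python sketch. Note the swap: instead of the web service wrapped by NCBIWebService, it assumes NCBI's REST-style E-utilities EInfo endpoint, which returns an XML list of the available Entrez databases when called without parameters. The function names are my own, and the sample XML is a shortened, hand-written stand-in for a real response.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Assumed endpoint: the E-utilities EInfo service, called with no parameters.
EINFO_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/einfo.fcgi"

def parse_database_names(xml_text):
    """Extract the database names from an EInfo XML response."""
    root = ET.fromstring(xml_text)
    return [node.text for node in root.iter("DbName")]

def fetch_database_names(url=EINFO_URL, timeout=30):
    """Download the EInfo response and return the database names (needs network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_database_names(response.read().decode("utf-8"))

# Shortened, hand-written sample in the shape of an EInfo response.
sample = ("<eInfoResult><DbList>"
          "<DbName>pubmed</DbName><DbName>protein</DbName>"
          "</DbList></eInfoResult>")
names = parse_database_names(sample)
print(names)
```

Keeping the parsing separate from the download means the parsing can be checked offline; call `fetch_database_names()` only when a network connection is available.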

And that’s all I need to do. I can then export the document to a "Webpage View":

...which when executed is shown in a web browser:

That’s all there is to it. This sample shows how easy it is to embed and access .NET assemblies in Inference documents.
© 2008 Blue Reference, Inc.
All rights reserved. | 3052 NW Merchant Way, Suite 100, Bend, Oregon 97701 | 541-316-2343 |