tag:blogger.com,1999:blog-56529242024-03-14T02:25:06.979-07:00Applied Data Science and Machine LearningTips, tricks, and comments related to topics in data science and machine learning. Used to be called "data mining and predictive analytics" but updated the title to reflect the language of the day!
<br>
Hosted by Dean Abbott, Abbott AnalyticsDean Abbotthttp://www.blogger.com/profile/16818000233889520746noreply@blogger.comBlogger196125tag:blogger.com,1999:blog-5652924.post-14713955456120860352023-11-24T12:09:00.000-08:002023-11-24T12:39:18.872-08:00How Confident Are We of Machine Learning Model Predictions?<p>When we build binary classification models using algorithms like Neural Networks, XGBoost, Random Forests, etc., we get as an output of the models a prediction that ranges from 0 to 1. But how sure are we that this is a stable prediction? Does a score of 0.8 really mean 0.8? There is a difference between 0.8 +/- 0.05 and 0.8 +/- 0.4 after all! </p><p>One reason we love models grounded in statistics is that because of the strong assumptions they have, we can compute many metrics to provide insight into how sure we are the the coefficients are correct and what confidence intervals exist for model predictions. For example, see "Calculating Confidence Intervals for Logistic Regression" (<a href="https://stats.stackexchange.com/questions/354098/calculating-confidence-intervals-for-a-logistic-regression">https://stats.stackexchange.com/questions/354098/calculating-confidence-intervals-for-a-logistic-regression</a>) or books like "Applied Linear Statistical Models" (<a href="https://a.co/d/a7BR3pa">https://a.co/d/a7BR3pa</a>). </p><p>However, for other model types (non-parametric for example), we don't have the benefit of these kinds of measures. Over the past decade or more, when I've needed this kind of information, I've used bootstrap sample this way:</p><p></p><ol style="text-align: left;"><li>Build my model. This is the baseline. For the testing data set, each record gets a score. </li><li>Create 100 bootstrap samples of the training data. </li><li>Build 100 models (one for each bootstrap sample) using same the protocol as the baseline model</li><li>Run each model through the testing set. We now have 100 scores for every record in the test set...a distribution of scores</li><li>compute the 90% confidence interval equivalent by identifying the probabilities (or model scores) at 5th and 95th percentiles (ranks 5 and 95 of the 100 scores). For the 95% confidence interval, one would need to interpolate between 2nd and 3rd, and also the 97th and 98th ranked scores</li></ol><div>This works fine for any algorithm. However, I'd never seen a formal treatment of this topic; this is really to my discredit as I had never really done a signfiicant search to find any theory related to this topic.</div><div><br /></div><div>At this year's Machine Learning Week Europe (<a href="https://machinelearningweek.eu/">https://machinelearningweek.eu/</a>), there was a talk on this subject given by Dr. Michael (Naatz) Allgöwer (<a href="https://www.linkedin.com/in/allgoewer/">https://www.linkedin.com/in/allgoewer/</a>) entitled "NFORMAL PREDICTION: A UNIVERSAL METHOD FOR UNCERTAINTY QUANTIFICATION" that introduced another way of accomplsihing this objective. A Wikipedia summary of the approach is here (<a href="https://en.wikipedia.org/wiki/Conformal_prediction">https://en.wikipedia.org/wiki/Conformal_prediction</a>). I like what I've heard from Dr. Allgöwer at the conference and would like to experiment with this approach to learn how it works and what limitations might exist for the approach.</div><div><br /></div><div>I hope to compare the approaches, with pros and cons, in the coming weeks. Stay tuned!</div><div><br /></div><div><br /></div><p></p><p><br /></p>Dean Abbotthttp://www.blogger.com/profile/16818000233889520746noreply@blogger.com0tag:blogger.com,1999:blog-5652924.post-15257977360759300812023-11-02T20:38:00.001-07:002023-11-02T20:41:41.701-07:00What if Generative AI Turns out to be a Dud?<p> I follow posts on twitter from different sides of the generative AI debates, including Yann LeCun (whom I've followed for decades) and Gary Marcus (whom I discovered just in the past few years). I'll post at some other time about my views, but found this post by Marcus to be intriguing. I first published my comments here on <a href="https://www.linkedin.com/feed/update/urn:li:activity:7110664013620908032/" target="_blank">LinkedIn</a> </p><p><br /></p><p><span color="rgba(0, 0, 0, 0.9)" face="-apple-system, system-ui, BlinkMacSystemFont, "Segoe UI", Roboto, "Helvetica Neue", "Fira Sans", Ubuntu, Oxygen, "Oxygen Sans", Cantarell, "Droid Sans", "Apple Color Emoji", "Segoe UI Emoji", "Segoe UI Emoji", "Segoe UI Symbol", "Lucida Grande", Helvetica, Arial, sans-serif" style="background-color: white; caret-color: rgba(0, 0, 0, 0.9); font-size: 14px;">Key quotes at the end of the article,</span><span color="rgba(0, 0, 0, 0.9)" face="-apple-system, system-ui, BlinkMacSystemFont, "Segoe UI", Roboto, "Helvetica Neue", "Fira Sans", Ubuntu, Oxygen, "Oxygen Sans", Cantarell, "Droid Sans", "Apple Color Emoji", "Segoe UI Emoji", "Segoe UI Emoji", "Segoe UI Symbol", "Lucida Grande", Helvetica, Arial, sans-serif" style="background-color: white; caret-color: rgba(0, 0, 0, 0.9); font-size: 14px;"> </span></p><br style="box-sizing: inherit; caret-color: rgba(0, 0, 0, 0.9); color: rgba(0, 0, 0, 0.9); font-family: -apple-system, system-ui, BlinkMacSystemFont, "Segoe UI", Roboto, "Helvetica Neue", "Fira Sans", Ubuntu, Oxygen, "Oxygen Sans", Cantarell, "Droid Sans", "Apple Color Emoji", "Segoe UI Emoji", "Segoe UI Emoji", "Segoe UI Symbol", "Lucida Grande", Helvetica, Arial, sans-serif; font-size: 14px; line-height: inherit;" /><span color="rgba(0, 0, 0, 0.9)" face="-apple-system, system-ui, BlinkMacSystemFont, "Segoe UI", Roboto, "Helvetica Neue", "Fira Sans", Ubuntu, Oxygen, "Oxygen Sans", Cantarell, "Droid Sans", "Apple Color Emoji", "Segoe UI Emoji", "Segoe UI Emoji", "Segoe UI Symbol", "Lucida Grande", Helvetica, Arial, sans-serif" style="background-color: white; caret-color: rgba(0, 0, 0, 0.9); font-size: 14px;">"Everybody in industry would probably like you to believe that AGI is imminent. It stokes their narrative of inevitability, and it drives their stock prices and startup valuations. Dario Amodei, CEO of Anthropic, recently projected that we will have AGI in 2-3 years. Demis Hassabis, CEO of Google DeepMind has also made projections of near-term AGI.</span><br style="box-sizing: inherit; caret-color: rgba(0, 0, 0, 0.9); color: rgba(0, 0, 0, 0.9); font-family: -apple-system, system-ui, BlinkMacSystemFont, "Segoe UI", Roboto, "Helvetica Neue", "Fira Sans", Ubuntu, Oxygen, "Oxygen Sans", Cantarell, "Droid Sans", "Apple Color Emoji", "Segoe UI Emoji", "Segoe UI Emoji", "Segoe UI Symbol", "Lucida Grande", Helvetica, Arial, sans-serif; font-size: 14px; line-height: inherit;" /><br style="box-sizing: inherit; caret-color: rgba(0, 0, 0, 0.9); color: rgba(0, 0, 0, 0.9); font-family: -apple-system, system-ui, BlinkMacSystemFont, "Segoe UI", Roboto, "Helvetica Neue", "Fira Sans", Ubuntu, Oxygen, "Oxygen Sans", Cantarell, "Droid Sans", "Apple Color Emoji", "Segoe UI Emoji", "Segoe UI Emoji", "Segoe UI Symbol", "Lucida Grande", Helvetica, Arial, sans-serif; font-size: 14px; line-height: inherit;" /><span color="rgba(0, 0, 0, 0.9)" face="-apple-system, system-ui, BlinkMacSystemFont, "Segoe UI", Roboto, "Helvetica Neue", "Fira Sans", Ubuntu, Oxygen, "Oxygen Sans", Cantarell, "Droid Sans", "Apple Color Emoji", "Segoe UI Emoji", "Segoe UI Emoji", "Segoe UI Symbol", "Lucida Grande", Helvetica, Arial, sans-serif" style="background-color: white; caret-color: rgba(0, 0, 0, 0.9); font-size: 14px;">I seriously doubt it. We have not one, but many, serious, unsolved problems at the core of generative AI — ranging from their tendency to confabulate (hallucinate) false information, to their inability to reliably interface with external tools like Wolfram Alpha, to the instability from month to month (which makes them poor candidates for engineering use in larger systems)."</span><br style="box-sizing: inherit; caret-color: rgba(0, 0, 0, 0.9); color: rgba(0, 0, 0, 0.9); font-family: -apple-system, system-ui, BlinkMacSystemFont, "Segoe UI", Roboto, "Helvetica Neue", "Fira Sans", Ubuntu, Oxygen, "Oxygen Sans", Cantarell, "Droid Sans", "Apple Color Emoji", "Segoe UI Emoji", "Segoe UI Emoji", "Segoe UI Symbol", "Lucida Grande", Helvetica, Arial, sans-serif; font-size: 14px; line-height: inherit;" /><br style="box-sizing: inherit; caret-color: rgba(0, 0, 0, 0.9); color: rgba(0, 0, 0, 0.9); font-family: -apple-system, system-ui, BlinkMacSystemFont, "Segoe UI", Roboto, "Helvetica Neue", "Fira Sans", Ubuntu, Oxygen, "Oxygen Sans", Cantarell, "Droid Sans", "Apple Color Emoji", "Segoe UI Emoji", "Segoe UI Emoji", "Segoe UI Symbol", "Lucida Grande", Helvetica, Arial, sans-serif; font-size: 14px; line-height: inherit;" /><span color="rgba(0, 0, 0, 0.9)" face="-apple-system, system-ui, BlinkMacSystemFont, "Segoe UI", Roboto, "Helvetica Neue", "Fira Sans", Ubuntu, Oxygen, "Oxygen Sans", Cantarell, "Droid Sans", "Apple Color Emoji", "Segoe UI Emoji", "Segoe UI Emoji", "Segoe UI Symbol", "Lucida Grande", Helvetica, Arial, sans-serif" style="background-color: white; caret-color: rgba(0, 0, 0, 0.9); font-size: 14px;">This is exactly how it comes across to me and is consistent with what I've experienced myself and what my closest colleagues who have used generative AI have also experienced.</span><div><span color="rgba(0, 0, 0, 0.9)" face="-apple-system, system-ui, BlinkMacSystemFont, "Segoe UI", Roboto, "Helvetica Neue", "Fira Sans", Ubuntu, Oxygen, "Oxygen Sans", Cantarell, "Droid Sans", "Apple Color Emoji", "Segoe UI Emoji", "Segoe UI Emoji", "Segoe UI Symbol", "Lucida Grande", Helvetica, Arial, sans-serif" style="background-color: white; caret-color: rgba(0, 0, 0, 0.9); font-size: 14px;"><br /></span></div><div>The Marcus article: <a href="https://garymarcus.substack.com/p/what-if-generative-ai-turned-out" target="_blank">https://garymarcus.substack.com/p/what-if-generative-ai-turned-out</a></div><div><br /></div>Dean Abbotthttp://www.blogger.com/profile/16818000233889520746noreply@blogger.com0tag:blogger.com,1999:blog-5652924.post-1663310629740185732017-05-22T20:00:00.001-07:002024-01-13T16:04:49.595-08:00<div class="MsoNormal">
<b style="mso-bidi-font-weight: normal;"><span style="font-size: 14pt; mso-bidi-font-size: 12.0pt;">What Programming do
Predictive Modelers Need to Know?<o:p></o:p></span></b></div>
<div class="MsoNormal">
Dean Abbott<o:p></o:p></div>
<div class="MsoNormal">
<a href="http://www.smarterhq.com/">SmarterHQ</a> and <a href="http://www.abbottanalytics.com/">Abbott Analytics</a></div><div class="MsoNormal"><span class="MsoHyperlink"><br /></span></div><div class="MsoNormal"><span class="MsoHyperlink">Note: this post, almost 7 years after posting 5/22/2017), has been flagged for copyright or some kind of digital rights issue. Of course, the specific issue wasn't identified so I'm left guessing. I'm guessing it was a screen capture of one of the software products. (??). So to be safe I took them all out. The irony is that the images are all from the vendor websites. I dont think any of them are current (I certainly hope not!)</span></div><div class="MsoNormal"><span class="MsoHyperlink"><br /></span></div><div class="MsoNormal"><span class="MsoHyperlink">I hope the article continues to provide the value it was intended to provide.</span></div><div class="MsoNormal"><span class="MsoHyperlink"><br /></span></div><div class="MsoNormal"><span class="MsoHyperlink">-- Dean, 1/13/2024<br style="mso-special-character: line-break;" />
<!--[if !supportLineBreakNewLine]--><br style="mso-special-character: line-break;" />
<!--[endif]--></span><o:p></o:p></div>
<div class="MsoNormal">
(First published in Predictive Analytics Times, <a href="http://www.predictiveanalyticsworld.com/patimes/what-programming-do-predictive-modelers-need-to-know-0408152-2/5129/">http://www.predictiveanalyticsworld.com/patimes/what-programming-do-predictive-modelers-need-to-know-0408152-2/5129/</a>,
2014. Updated and edited here.)<o:p></o:p></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
In most lists of the most popular software for doing data
analysis, statistics, and predictive modeling, the top software tools are
Python and R—command line languages rather than GUI-based modeling packages.
There are several reasons for this, perhaps most importantly that they are
free, they are robust programming languages supported by a very broad user community,
and they have extensive sets of algorithms. <o:p></o:p></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
One recent survey of software was published at r4stats.com (<a href="http://r4stats.com/articles/popularity/">http://r4stats.com/articles/popularity/</a>)
and contains additional metrics not usually found in software comparisons,
including scholarly articles that include software, google scholar hits, and
job trends in addition to the more typical summaries by user-identified use of
software such as with the Rexer Analytics surveys (<a href="http://www.rexeranalytics.com/Data-Miner-Survey-Results-2013.html">http://www.rexeranalytics.com/Data-Miner-Survey-Results-2013.html</a>)
and polls on kdnuggets.com and reviews of software by technology research
companies such as Gartner (<a href="http://pages.alteryx.com/GartnerMQAdvancedAnalyticsNowAvailable-T.html">http://pages.alteryx.com/GartnerMQAdvancedAnalyticsNowAvailable-T.html</a>),
Forrester (<a href="http://global.sap.com/campaign/na/usa/CRM-XU13-BIP-PATDWS/index.html?urlid=CRM-XU13-BIP-PATDWS">http://global.sap.com/campaign/na/usa/CRM-XU13-BIP-PATDWS/index.html?urlid=CRM-XU13-BIP-PATDWS</a>),
and Hurwitz & Associates (<a href="http://www.sas.com/content/dam/SAS/en_us/doc/analystreport/hurwitz-advanced-analytics-107212.pdf">http://www.sas.com/content/dam/SAS/en_us/doc/analystreport/hurwitz-advanced-analytics-107212.pdf</a>).
Thanks to r4stats for providing these links in their article.<o:p></o:p></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
For those interested in getting a job in analytics, the
article provides very useful rankings of software by the number of job
postings, led by Java, SAS, Python, C/C++/C#, R, SPSS, and Matlab. They also
provide a few examples of the trending in the job postings of the tools over
the past 7 years—important trending to consider as well. This reveals for
example a nearly identical increase in Python and R compared with a decrease in
SAS over the past few years (SAS is still #2 overall in job postings though
because of the huge SAS install base). <o:p></o:p></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
The user interface that appears to have won the day in commercial
software for predictive modeling is the workflow-style interface where a user
connects icons that represent functions or tasks into a flow of functions. This
kind of interface has been in use for decades, and one that I was first
introduce to in the software package Khoros / Cantata in the early 90s (<a href="http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.22.9854&rep=rep1&type=pdf">http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.22.9854&rep=rep1&type=pdf</a>).
Clementine was an early commercial tool using this paradigm (now IBM SPSS Modeler),
and now most tools, including those that have historically used drop-down
“Windows-style” menus, are embracing a workflow interface. And it’s not just
commercial tools that are built from the ground up using a workflow-style
interface: many open source software like KNIME and RapidMiner embraced this
style from their beginnings. <o:p></o:p></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
Even though workflow interfaces have won the day, there are
several excellent software tools that still use a typical drop-down file menu
interface for a variety of reasons: some legacy and some functional. I still use
several of them myself. <o:p></o:p></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
There are several reasons I like the workflow interface.
First, it is self-documenting, much like command-line interfaces. You see
exactly what you did in the analysis, at least at a high level. To be fair,
some of these nodes have considerable critical customization options set inside
the nodes, but the function is self-evident nevertheless. Command line
functions have the same issue: there are often numerous options one has to
specify to call a function successfully.<o:p></o:p></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
Second, you can reuse the workflow easily. For example, if
you want to run your validation data through the exact same data preparation
steps you used in building your models, you merely connect a new data source to
the workflow. Third, you can explain what you did to your manager very easily
and visually without the manager needing to understand code. <o:p></o:p></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
Another way to think of the workflow interface is as a <i style="mso-bidi-font-style: normal;">visual programming</i> interface. You string
together functional blocks from a list of functions (nodes) made available to
you by the software. So whether you build an analysis in a visual workflow or a
command line programming language, you still do the same thing: string together
a sequence of commands to manipute and model the data. For example, you may
want to load a csv file, replace missing values with the mean, transform your
positively-skewed variables with a log transform, split you data into training
and testing subsets, then build a decision tree. Each of these steps can be
done with a node (visual programming) or a function (programming).<o:p></o:p></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
From this perspective, the biggest difference between visual
programming and command line programming is that R and Python have a larger set
of functionas available to you. From an algorithm standpoint, this difference is
primarily manifested in obscure or new, cutting-edge algorithms. This is one
important reason why most visual programming interface tools have added R and
Python integration into their software, typically through a node that will run
the external code within the workflow itself. The intent isn’t to replace the
software, but to enhance it with functions not yet added to the software
itself. This is especially the case with leading-edge algorithms that have
support in R or Python already because of their ties with the academic
community. <o:p></o:p></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
Personally, I used to create code regularly for building
models, primarily in C and FORTRAN (3<sup>rd</sup> generation languages) though
also in other scripting (4<sup>th</sup> generation) languages like unix shell
programming (sh, csh, ksh, bash), Matlab, Mathematica, and others. But
eventually I used commercial software tools because my consulting clients used
them, and they contained most of what I needed to do to solve data mining
problems. Since they all have limited sets of functions, I would have to
sometimes make creative use of the existing functions to accomplish what I
needed to do, but it didn’t stop me from being successful with them. And I
didn’t have to write and re-write code for each of these clients. <o:p></o:p></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
Each of these tools have their own way of performing an
analysis. Command line tools also have their own way of performing an analysis.
Much of what separates novices and experts in a software tool is not an
awareness of the particular functions or building blocks, but an understanding
of how best to use the existing building blocks. This is why I recommend
analysts learn a tool and learn it well, becoming an expert in the tool so that
the tool is used to its fullest potential. <o:p></o:p></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<br /></div>
<!--[if gte mso 9]><xml>
<o:OfficeDocumentSettings>
<o:AllowPNG/>
</o:OfficeDocumentSettings>
</xml><![endif]-->
<!--[if gte mso 9]><xml>
<w:WordDocument>
<w:View>Normal</w:View>
<w:Zoom>0</w:Zoom>
<w:TrackMoves>false</w:TrackMoves>
<w:TrackFormatting/>
<w:PunctuationKerning/>
<w:ValidateAgainstSchemas/>
<w:SaveIfXMLInvalid>false</w:SaveIfXMLInvalid>
<w:IgnoreMixedContent>false</w:IgnoreMixedContent>
<w:AlwaysShowPlaceholderText>false</w:AlwaysShowPlaceholderText>
<w:DoNotPromoteQF/>
<w:LidThemeOther>EN-US</w:LidThemeOther>
<w:LidThemeAsian>JA</w:LidThemeAsian>
<w:LidThemeComplexScript>X-NONE</w:LidThemeComplexScript>
<w:Compatibility>
<w:BreakWrappedTables/>
<w:SnapToGridInCell/>
<w:WrapTextWithPunct/>
<w:UseAsianBreakRules/>
<w:DontGrowAutofit/>
<w:SplitPgBreakAndParaMark/>
<w:EnableOpenTypeKerning/>
<w:DontFlipMirrorIndents/>
<w:OverrideTableStyleHps/>
<w:UseFELayout/>
</w:Compatibility>
<m:mathPr>
<m:mathFont m:val="Cambria Math"/>
<m:brkBin m:val="before"/>
<m:brkBinSub m:val="--"/>
<m:smallFrac m:val="off"/>
<m:dispDef/>
<m:lMargin m:val="0"/>
<m:rMargin m:val="0"/>
<m:defJc m:val="centerGroup"/>
<m:wrapIndent m:val="1440"/>
<m:intLim m:val="subSup"/>
<m:naryLim m:val="undOvr"/>
</m:mathPr></w:WordDocument>
</xml><![endif]--><!--[if gte mso 9]><xml>
<w:LatentStyles DefLockedState="false" DefUnhideWhenUsed="true"
DefSemiHidden="true" DefQFormat="false" DefPriority="99"
LatentStyleCount="276">
<w:LsdException Locked="false" Priority="0" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Normal"/>
<w:LsdException Locked="false" Priority="9" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="heading 1"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 2"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 3"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 4"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 5"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 6"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 7"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 8"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 9"/>
<w:LsdException Locked="false" Priority="39" Name="toc 1"/>
<w:LsdException Locked="false" Priority="39" Name="toc 2"/>
<w:LsdException Locked="false" Priority="39" Name="toc 3"/>
<w:LsdException Locked="false" Priority="39" Name="toc 4"/>
<w:LsdException Locked="false" Priority="39" Name="toc 5"/>
<w:LsdException Locked="false" Priority="39" Name="toc 6"/>
<w:LsdException Locked="false" Priority="39" Name="toc 7"/>
<w:LsdException Locked="false" Priority="39" Name="toc 8"/>
<w:LsdException Locked="false" Priority="39" Name="toc 9"/>
<w:LsdException Locked="false" Priority="35" QFormat="true" Name="caption"/>
<w:LsdException Locked="false" Priority="10" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Title"/>
<w:LsdException Locked="false" Priority="1" Name="Default Paragraph Font"/>
<w:LsdException Locked="false" Priority="11" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtitle"/>
<w:LsdException Locked="false" Priority="22" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Strong"/>
<w:LsdException Locked="false" Priority="20" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Emphasis"/>
<w:LsdException Locked="false" Priority="59" SemiHidden="false"
UnhideWhenUsed="false" Name="Table Grid"/>
<w:LsdException Locked="false" UnhideWhenUsed="false" Name="Placeholder Text"/>
<w:LsdException Locked="false" Priority="1" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="No Spacing"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 1"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 1"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 1"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 1"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 1"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 1"/>
<w:LsdException Locked="false" UnhideWhenUsed="false" Name="Revision"/>
<w:LsdException Locked="false" Priority="34" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="List Paragraph"/>
<w:LsdException Locked="false" Priority="29" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Quote"/>
<w:LsdException Locked="false" Priority="30" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Quote"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 1"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 1"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 1"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 1"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 1"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 1"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 1"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 1"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 2"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 2"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 2"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 2"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 2"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 2"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 2"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 2"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 2"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 2"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 2"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 2"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 2"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 2"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 3"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 3"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 3"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 3"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 3"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 3"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 3"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 3"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 3"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 3"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 3"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 3"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 3"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 3"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 4"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 4"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 4"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 4"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 4"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 4"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 4"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 4"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 4"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 4"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 4"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 4"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 4"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 4"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 5"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 5"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 5"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 5"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 5"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 5"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 5"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 5"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 5"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 5"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 5"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 5"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 5"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 5"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 6"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 6"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 6"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 6"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 6"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 6"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 6"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 6"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 6"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 6"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 6"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 6"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 6"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 6"/>
<w:LsdException Locked="false" Priority="19" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtle Emphasis"/>
<w:LsdException Locked="false" Priority="21" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Emphasis"/>
<w:LsdException Locked="false" Priority="31" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtle Reference"/>
<w:LsdException Locked="false" Priority="32" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Reference"/>
<w:LsdException Locked="false" Priority="33" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Book Title"/>
<w:LsdException Locked="false" Priority="37" Name="Bibliography"/>
<w:LsdException Locked="false" Priority="39" QFormat="true" Name="TOC Heading"/>
</w:LatentStyles>
</xml><![endif]-->
<!--[if gte mso 10]>
<style>
/* Style Definitions */
table.MsoNormalTable
{mso-style-name:"Table Normal";
mso-tstyle-rowband-size:0;
mso-tstyle-colband-size:0;
mso-style-noshow:yes;
mso-style-priority:99;
mso-style-parent:"";
mso-padding-alt:0in 5.4pt 0in 5.4pt;
mso-para-margin:0in;
mso-para-margin-bottom:.0001pt;
mso-pagination:widow-orphan;
font-size:12.0pt;
font-family:Cambria;
mso-ascii-font-family:Cambria;
mso-ascii-theme-font:minor-latin;
mso-hansi-font-family:Cambria;
mso-hansi-theme-font:minor-latin;}
</style>
<![endif]-->
<!--StartFragment-->
<!--EndFragment--><br />
<div class="MsoNormal">
Examples of workflows in some of the most popular and
acclaimed advanced analytics software packages are shown below. Note that the style that has dominated these top tools is the visual programming interface, and they are very similar in how the user builds these workflows.<o:p></o:p></div>
<div class="MsoNormal">
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><br /></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><!--[if gte mso 9]><xml>
<o:OfficeDocumentSettings>
<o:AllowPNG/>
</o:OfficeDocumentSettings>
</xml><![endif]-->
<!--[if gte mso 9]><xml>
<w:WordDocument>
<w:View>Normal</w:View>
<w:Zoom>0</w:Zoom>
<w:TrackMoves>false</w:TrackMoves>
<w:TrackFormatting/>
<w:PunctuationKerning/>
<w:ValidateAgainstSchemas/>
<w:SaveIfXMLInvalid>false</w:SaveIfXMLInvalid>
<w:IgnoreMixedContent>false</w:IgnoreMixedContent>
<w:AlwaysShowPlaceholderText>false</w:AlwaysShowPlaceholderText>
<w:DoNotPromoteQF/>
<w:LidThemeOther>EN-US</w:LidThemeOther>
<w:LidThemeAsian>JA</w:LidThemeAsian>
<w:LidThemeComplexScript>X-NONE</w:LidThemeComplexScript>
<w:Compatibility>
<w:BreakWrappedTables/>
<w:SnapToGridInCell/>
<w:WrapTextWithPunct/>
<w:UseAsianBreakRules/>
<w:DontGrowAutofit/>
<w:SplitPgBreakAndParaMark/>
<w:EnableOpenTypeKerning/>
<w:DontFlipMirrorIndents/>
<w:OverrideTableStyleHps/>
<w:UseFELayout/>
</w:Compatibility>
<m:mathPr>
<m:mathFont m:val="Cambria Math"/>
<m:brkBin m:val="before"/>
<m:brkBinSub m:val="--"/>
<m:smallFrac m:val="off"/>
<m:dispDef/>
<m:lMargin m:val="0"/>
<m:rMargin m:val="0"/>
<m:defJc m:val="centerGroup"/>
<m:wrapIndent m:val="1440"/>
<m:intLim m:val="subSup"/>
<m:naryLim m:val="undOvr"/>
</m:mathPr></w:WordDocument>
</xml><![endif]--><!--[if gte mso 9]><xml>
<w:LatentStyles DefLockedState="false" DefUnhideWhenUsed="true"
DefSemiHidden="true" DefQFormat="false" DefPriority="99"
LatentStyleCount="276">
<w:LsdException Locked="false" Priority="0" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Normal"/>
<w:LsdException Locked="false" Priority="9" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="heading 1"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 2"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 3"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 4"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 5"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 6"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 7"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 8"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 9"/>
<w:LsdException Locked="false" Priority="39" Name="toc 1"/>
<w:LsdException Locked="false" Priority="39" Name="toc 2"/>
<w:LsdException Locked="false" Priority="39" Name="toc 3"/>
<w:LsdException Locked="false" Priority="39" Name="toc 4"/>
<w:LsdException Locked="false" Priority="39" Name="toc 5"/>
<w:LsdException Locked="false" Priority="39" Name="toc 6"/>
<w:LsdException Locked="false" Priority="39" Name="toc 7"/>
<w:LsdException Locked="false" Priority="39" Name="toc 8"/>
<w:LsdException Locked="false" Priority="39" Name="toc 9"/>
<w:LsdException Locked="false" Priority="35" QFormat="true" Name="caption"/>
<w:LsdException Locked="false" Priority="10" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Title"/>
<w:LsdException Locked="false" Priority="1" Name="Default Paragraph Font"/>
<w:LsdException Locked="false" Priority="11" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtitle"/>
<w:LsdException Locked="false" Priority="22" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Strong"/>
<w:LsdException Locked="false" Priority="20" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Emphasis"/>
<w:LsdException Locked="false" Priority="59" SemiHidden="false"
UnhideWhenUsed="false" Name="Table Grid"/>
<w:LsdException Locked="false" UnhideWhenUsed="false" Name="Placeholder Text"/>
<w:LsdException Locked="false" Priority="1" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="No Spacing"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 1"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 1"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 1"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 1"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 1"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 1"/>
<w:LsdException Locked="false" UnhideWhenUsed="false" Name="Revision"/>
<w:LsdException Locked="false" Priority="34" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="List Paragraph"/>
<w:LsdException Locked="false" Priority="29" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Quote"/>
<w:LsdException Locked="false" Priority="30" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Quote"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 1"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 1"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 1"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 1"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 1"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 1"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 1"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 1"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 2"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 2"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 2"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 2"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 2"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 2"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 2"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 2"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 2"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 2"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 2"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 2"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 2"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 2"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 3"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 3"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 3"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 3"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 3"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 3"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 3"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 3"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 3"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 3"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 3"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 3"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 3"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 3"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 4"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 4"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 4"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 4"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 4"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 4"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 4"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 4"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 4"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 4"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 4"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 4"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 4"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 4"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 5"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 5"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 5"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 5"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 5"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 5"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 5"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 5"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 5"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 5"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 5"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 5"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 5"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 5"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 6"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 6"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 6"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 6"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 6"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 6"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 6"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 6"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 6"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 6"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 6"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 6"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 6"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 6"/>
<w:LsdException Locked="false" Priority="19" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtle Emphasis"/>
<w:LsdException Locked="false" Priority="21" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Emphasis"/>
<w:LsdException Locked="false" Priority="31" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtle Reference"/>
<w:LsdException Locked="false" Priority="32" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Reference"/>
<w:LsdException Locked="false" Priority="33" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Book Title"/>
<w:LsdException Locked="false" Priority="37" Name="Bibliography"/>
<w:LsdException Locked="false" Priority="39" QFormat="true" Name="TOC Heading"/>
</w:LatentStyles>
</xml><![endif]-->
<!--[if gte mso 10]>
<style>
/* Style Definitions */
table.MsoNormalTable
{mso-style-name:"Table Normal";
mso-tstyle-rowband-size:0;
mso-tstyle-colband-size:0;
mso-style-noshow:yes;
mso-style-priority:99;
mso-style-parent:"";
mso-padding-alt:0in 5.4pt 0in 5.4pt;
mso-para-margin:0in;
mso-para-margin-bottom:.0001pt;
mso-pagination:widow-orphan;
font-size:12.0pt;
font-family:Cambria;
mso-ascii-font-family:Cambria;
mso-ascii-theme-font:minor-latin;
mso-hansi-font-family:Cambria;
mso-hansi-theme-font:minor-latin;}
</style>
<![endif]-->
<!--StartFragment-->
<br />
<div class="MsoCaption">
Figure <!--[if supportFields]><span style='mso-element:
field-begin'></span><span style="mso-spacerun:yes"> </span>SEQ Figure \* ARABIC
<span style='mso-element:field-separator'></span><![endif]-->2<!--[if supportFields]><span style='mso-element:
field-end'></span><![endif]-->: Statistica workflow from my Predictive
Analytics World workshop, Advanced Methods Hands-On<o:p></o:p></div>
<div class="MsoCaption">
<br /></div>
<!--EndFragment--></td></tr>
</tbody></table>
<div class="MsoNormal">
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><br /></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><!--[if gte mso 9]><xml>
<o:OfficeDocumentSettings>
<o:AllowPNG/>
</o:OfficeDocumentSettings>
</xml><![endif]-->
<!--[if gte mso 9]><xml>
<w:WordDocument>
<w:View>Normal</w:View>
<w:Zoom>0</w:Zoom>
<w:TrackMoves>false</w:TrackMoves>
<w:TrackFormatting/>
<w:PunctuationKerning/>
<w:ValidateAgainstSchemas/>
<w:SaveIfXMLInvalid>false</w:SaveIfXMLInvalid>
<w:IgnoreMixedContent>false</w:IgnoreMixedContent>
<w:AlwaysShowPlaceholderText>false</w:AlwaysShowPlaceholderText>
<w:DoNotPromoteQF/>
<w:LidThemeOther>EN-US</w:LidThemeOther>
<w:LidThemeAsian>JA</w:LidThemeAsian>
<w:LidThemeComplexScript>X-NONE</w:LidThemeComplexScript>
<w:Compatibility>
<w:BreakWrappedTables/>
<w:SnapToGridInCell/>
<w:WrapTextWithPunct/>
<w:UseAsianBreakRules/>
<w:DontGrowAutofit/>
<w:SplitPgBreakAndParaMark/>
<w:EnableOpenTypeKerning/>
<w:DontFlipMirrorIndents/>
<w:OverrideTableStyleHps/>
<w:UseFELayout/>
</w:Compatibility>
<m:mathPr>
<m:mathFont m:val="Cambria Math"/>
<m:brkBin m:val="before"/>
<m:brkBinSub m:val="--"/>
<m:smallFrac m:val="off"/>
<m:dispDef/>
<m:lMargin m:val="0"/>
<m:rMargin m:val="0"/>
<m:defJc m:val="centerGroup"/>
<m:wrapIndent m:val="1440"/>
<m:intLim m:val="subSup"/>
<m:naryLim m:val="undOvr"/>
</m:mathPr></w:WordDocument>
</xml><![endif]--><!--[if gte mso 9]><xml>
<w:LatentStyles DefLockedState="false" DefUnhideWhenUsed="true"
DefSemiHidden="true" DefQFormat="false" DefPriority="99"
LatentStyleCount="276">
<w:LsdException Locked="false" Priority="0" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Normal"/>
<w:LsdException Locked="false" Priority="9" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="heading 1"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 2"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 3"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 4"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 5"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 6"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 7"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 8"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 9"/>
<w:LsdException Locked="false" Priority="39" Name="toc 1"/>
<w:LsdException Locked="false" Priority="39" Name="toc 2"/>
<w:LsdException Locked="false" Priority="39" Name="toc 3"/>
<w:LsdException Locked="false" Priority="39" Name="toc 4"/>
<w:LsdException Locked="false" Priority="39" Name="toc 5"/>
<w:LsdException Locked="false" Priority="39" Name="toc 6"/>
<w:LsdException Locked="false" Priority="39" Name="toc 7"/>
<w:LsdException Locked="false" Priority="39" Name="toc 8"/>
<w:LsdException Locked="false" Priority="39" Name="toc 9"/>
<w:LsdException Locked="false" Priority="35" QFormat="true" Name="caption"/>
<w:LsdException Locked="false" Priority="10" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Title"/>
<w:LsdException Locked="false" Priority="1" Name="Default Paragraph Font"/>
<w:LsdException Locked="false" Priority="11" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtitle"/>
<w:LsdException Locked="false" Priority="22" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Strong"/>
<w:LsdException Locked="false" Priority="20" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Emphasis"/>
<w:LsdException Locked="false" Priority="59" SemiHidden="false"
UnhideWhenUsed="false" Name="Table Grid"/>
<w:LsdException Locked="false" UnhideWhenUsed="false" Name="Placeholder Text"/>
<w:LsdException Locked="false" Priority="1" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="No Spacing"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 1"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 1"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 1"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 1"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 1"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 1"/>
<w:LsdException Locked="false" UnhideWhenUsed="false" Name="Revision"/>
<w:LsdException Locked="false" Priority="34" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="List Paragraph"/>
<w:LsdException Locked="false" Priority="29" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Quote"/>
<w:LsdException Locked="false" Priority="30" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Quote"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 1"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 1"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 1"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 1"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 1"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 1"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 1"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 1"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 2"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 2"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 2"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 2"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 2"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 2"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 2"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 2"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 2"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 2"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 2"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 2"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 2"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 2"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 3"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 3"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 3"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 3"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 3"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 3"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 3"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 3"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 3"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 3"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 3"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 3"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 3"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 3"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 4"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 4"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 4"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 4"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 4"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 4"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 4"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 4"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 4"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 4"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 4"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 4"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 4"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 4"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 5"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 5"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 5"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 5"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 5"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 5"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 5"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 5"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 5"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 5"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 5"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 5"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 5"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 5"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 6"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 6"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 6"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 6"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 6"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 6"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 6"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 6"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 6"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 6"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 6"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 6"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 6"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 6"/>
<w:LsdException Locked="false" Priority="19" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtle Emphasis"/>
<w:LsdException Locked="false" Priority="21" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Emphasis"/>
<w:LsdException Locked="false" Priority="31" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtle Reference"/>
<w:LsdException Locked="false" Priority="32" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Reference"/>
<w:LsdException Locked="false" Priority="33" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Book Title"/>
<w:LsdException Locked="false" Priority="37" Name="Bibliography"/>
<w:LsdException Locked="false" Priority="39" QFormat="true" Name="TOC Heading"/>
</w:LatentStyles>
</xml><![endif]-->
<!--[if gte mso 10]>
<style>
/* Style Definitions */
table.MsoNormalTable
{mso-style-name:"Table Normal";
mso-tstyle-rowband-size:0;
mso-tstyle-colband-size:0;
mso-style-noshow:yes;
mso-style-priority:99;
mso-style-parent:"";
mso-padding-alt:0in 5.4pt 0in 5.4pt;
mso-para-margin:0in;
mso-para-margin-bottom:.0001pt;
mso-pagination:widow-orphan;
font-size:12.0pt;
font-family:Cambria;
mso-ascii-font-family:Cambria;
mso-ascii-theme-font:minor-latin;
mso-hansi-font-family:Cambria;
mso-hansi-theme-font:minor-latin;}
</style>
<![endif]-->
<!--StartFragment-->
<br />
<div class="MsoCaption">
Figure <!--[if supportFields]><span style='mso-element:
field-begin'></span><span style="mso-spacerun:yes"> </span>SEQ Figure \* ARABIC
<span style='mso-element:field-separator'></span><![endif]-->3<!--[if supportFields]><span style='mso-element:
field-end'></span><![endif]-->: IBM SPSS Modeler Stream, from https://34f2c.https.cdn.softlayer.net/8034F2C/dal05/v1/AUTH_db1cfc7b-a055-460b-9274-1fd3f11fe689/5b0fd91ef0e1a6f21f6e983ccc775a37/offering_3d4b6acc-2e09-4451-b4d0-3385f4385aa8.png<o:p></o:p></div>
<!--EndFragment--></td></tr>
</tbody></table>
<div class="MsoNormal">
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><br /></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><!--[if gte mso 9]><xml>
<o:OfficeDocumentSettings>
<o:AllowPNG/>
</o:OfficeDocumentSettings>
</xml><![endif]-->
<!--[if gte mso 9]><xml>
<w:WordDocument>
<w:View>Normal</w:View>
<w:Zoom>0</w:Zoom>
<w:TrackMoves>false</w:TrackMoves>
<w:TrackFormatting/>
<w:PunctuationKerning/>
<w:ValidateAgainstSchemas/>
<w:SaveIfXMLInvalid>false</w:SaveIfXMLInvalid>
<w:IgnoreMixedContent>false</w:IgnoreMixedContent>
<w:AlwaysShowPlaceholderText>false</w:AlwaysShowPlaceholderText>
<w:DoNotPromoteQF/>
<w:LidThemeOther>EN-US</w:LidThemeOther>
<w:LidThemeAsian>JA</w:LidThemeAsian>
<w:LidThemeComplexScript>X-NONE</w:LidThemeComplexScript>
<w:Compatibility>
<w:BreakWrappedTables/>
<w:SnapToGridInCell/>
<w:WrapTextWithPunct/>
<w:UseAsianBreakRules/>
<w:DontGrowAutofit/>
<w:SplitPgBreakAndParaMark/>
<w:EnableOpenTypeKerning/>
<w:DontFlipMirrorIndents/>
<w:OverrideTableStyleHps/>
<w:UseFELayout/>
</w:Compatibility>
<m:mathPr>
<m:mathFont m:val="Cambria Math"/>
<m:brkBin m:val="before"/>
<m:brkBinSub m:val="--"/>
<m:smallFrac m:val="off"/>
<m:dispDef/>
<m:lMargin m:val="0"/>
<m:rMargin m:val="0"/>
<m:defJc m:val="centerGroup"/>
<m:wrapIndent m:val="1440"/>
<m:intLim m:val="subSup"/>
<m:naryLim m:val="undOvr"/>
</m:mathPr></w:WordDocument>
</xml><![endif]--><!--[if gte mso 9]><xml>
<w:LatentStyles DefLockedState="false" DefUnhideWhenUsed="true"
DefSemiHidden="true" DefQFormat="false" DefPriority="99"
LatentStyleCount="276">
<w:LsdException Locked="false" Priority="0" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Normal"/>
<w:LsdException Locked="false" Priority="9" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="heading 1"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 2"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 3"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 4"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 5"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 6"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 7"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 8"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 9"/>
<w:LsdException Locked="false" Priority="39" Name="toc 1"/>
<w:LsdException Locked="false" Priority="39" Name="toc 2"/>
<w:LsdException Locked="false" Priority="39" Name="toc 3"/>
<w:LsdException Locked="false" Priority="39" Name="toc 4"/>
<w:LsdException Locked="false" Priority="39" Name="toc 5"/>
<w:LsdException Locked="false" Priority="39" Name="toc 6"/>
<w:LsdException Locked="false" Priority="39" Name="toc 7"/>
<w:LsdException Locked="false" Priority="39" Name="toc 8"/>
<w:LsdException Locked="false" Priority="39" Name="toc 9"/>
<w:LsdException Locked="false" Priority="35" QFormat="true" Name="caption"/>
<w:LsdException Locked="false" Priority="10" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Title"/>
<w:LsdException Locked="false" Priority="1" Name="Default Paragraph Font"/>
<w:LsdException Locked="false" Priority="11" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtitle"/>
<w:LsdException Locked="false" Priority="22" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Strong"/>
<w:LsdException Locked="false" Priority="20" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Emphasis"/>
<w:LsdException Locked="false" Priority="59" SemiHidden="false"
UnhideWhenUsed="false" Name="Table Grid"/>
<w:LsdException Locked="false" UnhideWhenUsed="false" Name="Placeholder Text"/>
<w:LsdException Locked="false" Priority="1" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="No Spacing"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 1"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 1"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 1"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 1"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 1"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 1"/>
<w:LsdException Locked="false" UnhideWhenUsed="false" Name="Revision"/>
<w:LsdException Locked="false" Priority="34" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="List Paragraph"/>
<w:LsdException Locked="false" Priority="29" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Quote"/>
<w:LsdException Locked="false" Priority="30" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Quote"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 1"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 1"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 1"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 1"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 1"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 1"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 1"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 1"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 2"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 2"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 2"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 2"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 2"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 2"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 2"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 2"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 2"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 2"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 2"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 2"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 2"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 2"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 3"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 3"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 3"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 3"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 3"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 3"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 3"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 3"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 3"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 3"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 3"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 3"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 3"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 3"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 4"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 4"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 4"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 4"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 4"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 4"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 4"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 4"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 4"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 4"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 4"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 4"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 4"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 4"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 5"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 5"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 5"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 5"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 5"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 5"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 5"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 5"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 5"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 5"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 5"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 5"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 5"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 5"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 6"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 6"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 6"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 6"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 6"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 6"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 6"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 6"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 6"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 6"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 6"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 6"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 6"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 6"/>
<w:LsdException Locked="false" Priority="19" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtle Emphasis"/>
<w:LsdException Locked="false" Priority="21" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Emphasis"/>
<w:LsdException Locked="false" Priority="31" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtle Reference"/>
<w:LsdException Locked="false" Priority="32" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Reference"/>
<w:LsdException Locked="false" Priority="33" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Book Title"/>
<w:LsdException Locked="false" Priority="37" Name="Bibliography"/>
<w:LsdException Locked="false" Priority="39" QFormat="true" Name="TOC Heading"/>
</w:LatentStyles>
</xml><![endif]-->
<!--[if gte mso 10]>
<style>
/* Style Definitions */
table.MsoNormalTable
{mso-style-name:"Table Normal";
mso-tstyle-rowband-size:0;
mso-tstyle-colband-size:0;
mso-style-noshow:yes;
mso-style-priority:99;
mso-style-parent:"";
mso-padding-alt:0in 5.4pt 0in 5.4pt;
mso-para-margin:0in;
mso-para-margin-bottom:.0001pt;
mso-pagination:widow-orphan;
font-size:12.0pt;
font-family:Cambria;
mso-ascii-font-family:Cambria;
mso-ascii-theme-font:minor-latin;
mso-hansi-font-family:Cambria;
mso-hansi-theme-font:minor-latin;}
</style>
<![endif]-->
<!--StartFragment-->
<br />
<div class="MsoCaption">
Figure <!--[if supportFields]><span style='mso-element:
field-begin'></span><span style="mso-spacerun:yes"> </span>SEQ Figure \* ARABIC
<span style='mso-element:field-separator'></span><![endif]-->4<!--[if supportFields]><span style='mso-element:
field-end'></span><![endif]-->: SAS Enterprise Miner, from <a href="https://media.licdn.com/mpr/mpr/AAEAAQAAAAAAAAKWAAAAJGVjMGE3YzE4LTRhZTEtNGE4Zi05ZWZiLTc3ZDgwY2VjNzZiZg.png">https://media.licdn.com/mpr/mpr/AAEAAQAAAAAAAAKWAAAAJGVjMGE3YzE4LTRhZTEtNGE4Zi05ZWZiLTc3ZDgwY2VjNzZiZg.png</a><o:p></o:p></div>
<div class="MsoCaption">
<br /></div>
<!--EndFragment--></td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><br /></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><!--[if gte mso 9]><xml>
<o:OfficeDocumentSettings>
<o:AllowPNG/>
</o:OfficeDocumentSettings>
</xml><![endif]-->
<!--[if gte mso 9]><xml>
<w:WordDocument>
<w:View>Normal</w:View>
<w:Zoom>0</w:Zoom>
<w:TrackMoves>false</w:TrackMoves>
<w:TrackFormatting/>
<w:PunctuationKerning/>
<w:ValidateAgainstSchemas/>
<w:SaveIfXMLInvalid>false</w:SaveIfXMLInvalid>
<w:IgnoreMixedContent>false</w:IgnoreMixedContent>
<w:AlwaysShowPlaceholderText>false</w:AlwaysShowPlaceholderText>
<w:DoNotPromoteQF/>
<w:LidThemeOther>EN-US</w:LidThemeOther>
<w:LidThemeAsian>JA</w:LidThemeAsian>
<w:LidThemeComplexScript>X-NONE</w:LidThemeComplexScript>
<w:Compatibility>
<w:BreakWrappedTables/>
<w:SnapToGridInCell/>
<w:WrapTextWithPunct/>
<w:UseAsianBreakRules/>
<w:DontGrowAutofit/>
<w:SplitPgBreakAndParaMark/>
<w:EnableOpenTypeKerning/>
<w:DontFlipMirrorIndents/>
<w:OverrideTableStyleHps/>
<w:UseFELayout/>
</w:Compatibility>
<m:mathPr>
<m:mathFont m:val="Cambria Math"/>
<m:brkBin m:val="before"/>
<m:brkBinSub m:val="--"/>
<m:smallFrac m:val="off"/>
<m:dispDef/>
<m:lMargin m:val="0"/>
<m:rMargin m:val="0"/>
<m:defJc m:val="centerGroup"/>
<m:wrapIndent m:val="1440"/>
<m:intLim m:val="subSup"/>
<m:naryLim m:val="undOvr"/>
</m:mathPr></w:WordDocument>
</xml><![endif]--><!--[if gte mso 9]><xml>
<w:LatentStyles DefLockedState="false" DefUnhideWhenUsed="true"
DefSemiHidden="true" DefQFormat="false" DefPriority="99"
LatentStyleCount="276">
<w:LsdException Locked="false" Priority="0" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Normal"/>
<w:LsdException Locked="false" Priority="9" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="heading 1"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 2"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 3"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 4"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 5"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 6"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 7"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 8"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 9"/>
<w:LsdException Locked="false" Priority="39" Name="toc 1"/>
<w:LsdException Locked="false" Priority="39" Name="toc 2"/>
<w:LsdException Locked="false" Priority="39" Name="toc 3"/>
<w:LsdException Locked="false" Priority="39" Name="toc 4"/>
<w:LsdException Locked="false" Priority="39" Name="toc 5"/>
<w:LsdException Locked="false" Priority="39" Name="toc 6"/>
<w:LsdException Locked="false" Priority="39" Name="toc 7"/>
<w:LsdException Locked="false" Priority="39" Name="toc 8"/>
<w:LsdException Locked="false" Priority="39" Name="toc 9"/>
<w:LsdException Locked="false" Priority="35" QFormat="true" Name="caption"/>
<w:LsdException Locked="false" Priority="10" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Title"/>
<w:LsdException Locked="false" Priority="1" Name="Default Paragraph Font"/>
<w:LsdException Locked="false" Priority="11" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtitle"/>
<w:LsdException Locked="false" Priority="22" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Strong"/>
<w:LsdException Locked="false" Priority="20" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Emphasis"/>
<w:LsdException Locked="false" Priority="59" SemiHidden="false"
UnhideWhenUsed="false" Name="Table Grid"/>
<w:LsdException Locked="false" UnhideWhenUsed="false" Name="Placeholder Text"/>
<w:LsdException Locked="false" Priority="1" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="No Spacing"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 1"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 1"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 1"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 1"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 1"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 1"/>
<w:LsdException Locked="false" UnhideWhenUsed="false" Name="Revision"/>
<w:LsdException Locked="false" Priority="34" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="List Paragraph"/>
<w:LsdException Locked="false" Priority="29" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Quote"/>
<w:LsdException Locked="false" Priority="30" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Quote"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 1"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 1"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 1"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 1"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 1"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 1"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 1"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 1"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 2"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 2"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 2"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 2"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 2"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 2"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 2"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 2"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 2"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 2"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 2"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 2"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 2"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 2"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 3"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 3"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 3"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 3"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 3"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 3"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 3"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 3"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 3"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 3"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 3"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 3"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 3"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 3"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 4"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 4"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 4"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 4"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 4"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 4"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 4"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 4"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 4"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 4"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 4"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 4"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 4"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 4"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 5"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 5"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 5"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 5"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 5"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 5"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 5"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 5"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 5"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 5"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 5"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 5"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 5"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 5"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 6"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 6"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 6"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 6"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 6"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 6"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 6"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 6"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 6"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 6"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 6"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 6"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 6"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 6"/>
<w:LsdException Locked="false" Priority="19" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtle Emphasis"/>
<w:LsdException Locked="false" Priority="21" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Emphasis"/>
<w:LsdException Locked="false" Priority="31" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtle Reference"/>
<w:LsdException Locked="false" Priority="32" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Reference"/>
<w:LsdException Locked="false" Priority="33" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Book Title"/>
<w:LsdException Locked="false" Priority="37" Name="Bibliography"/>
<w:LsdException Locked="false" Priority="39" QFormat="true" Name="TOC Heading"/>
</w:LatentStyles>
</xml><![endif]-->
<!--[if gte mso 10]>
<style>
/* Style Definitions */
table.MsoNormalTable
{mso-style-name:"Table Normal";
mso-tstyle-rowband-size:0;
mso-tstyle-colband-size:0;
mso-style-noshow:yes;
mso-style-priority:99;
mso-style-parent:"";
mso-padding-alt:0in 5.4pt 0in 5.4pt;
mso-para-margin:0in;
mso-para-margin-bottom:.0001pt;
mso-pagination:widow-orphan;
font-size:12.0pt;
font-family:Cambria;
mso-ascii-font-family:Cambria;
mso-ascii-theme-font:minor-latin;
mso-hansi-font-family:Cambria;
mso-hansi-theme-font:minor-latin;}
</style>
<![endif]-->
<!--StartFragment-->
<br />
<div class="MsoCaption">
Figure <!--[if supportFields]><span style='mso-element:
field-begin'></span><span style="mso-spacerun:yes"> </span>SEQ Figure \* ARABIC
<span style='mso-element:field-separator'></span><![endif]-->5<!--[if supportFields]><span style='mso-element:
field-end'></span><![endif]-->: RapidMiner processing flow from <a href="http://www.xomnia.com/wp-content/uploads/2015/03/rm_studio-1024x551.png">http://www.xomnia.com/wp-content/uploads/2015/03/rm_studio-1024x551.png</a></div>
<div class="MsoCaption">
<br /></div>
<!--EndFragment--></td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><br /></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><!--[if gte mso 9]><xml>
<o:OfficeDocumentSettings>
<o:AllowPNG/>
</o:OfficeDocumentSettings>
</xml><![endif]-->
<!--[if gte mso 9]><xml>
<w:WordDocument>
<w:View>Normal</w:View>
<w:Zoom>0</w:Zoom>
<w:TrackMoves>false</w:TrackMoves>
<w:TrackFormatting/>
<w:PunctuationKerning/>
<w:ValidateAgainstSchemas/>
<w:SaveIfXMLInvalid>false</w:SaveIfXMLInvalid>
<w:IgnoreMixedContent>false</w:IgnoreMixedContent>
<w:AlwaysShowPlaceholderText>false</w:AlwaysShowPlaceholderText>
<w:DoNotPromoteQF/>
<w:LidThemeOther>EN-US</w:LidThemeOther>
<w:LidThemeAsian>JA</w:LidThemeAsian>
<w:LidThemeComplexScript>X-NONE</w:LidThemeComplexScript>
<w:Compatibility>
<w:BreakWrappedTables/>
<w:SnapToGridInCell/>
<w:WrapTextWithPunct/>
<w:UseAsianBreakRules/>
<w:DontGrowAutofit/>
<w:SplitPgBreakAndParaMark/>
<w:EnableOpenTypeKerning/>
<w:DontFlipMirrorIndents/>
<w:OverrideTableStyleHps/>
<w:UseFELayout/>
</w:Compatibility>
<m:mathPr>
<m:mathFont m:val="Cambria Math"/>
<m:brkBin m:val="before"/>
<m:brkBinSub m:val="--"/>
<m:smallFrac m:val="off"/>
<m:dispDef/>
<m:lMargin m:val="0"/>
<m:rMargin m:val="0"/>
<m:defJc m:val="centerGroup"/>
<m:wrapIndent m:val="1440"/>
<m:intLim m:val="subSup"/>
<m:naryLim m:val="undOvr"/>
</m:mathPr></w:WordDocument>
</xml><![endif]--><!--[if gte mso 9]><xml>
<w:LatentStyles DefLockedState="false" DefUnhideWhenUsed="true"
DefSemiHidden="true" DefQFormat="false" DefPriority="99"
LatentStyleCount="276">
<w:LsdException Locked="false" Priority="0" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Normal"/>
<w:LsdException Locked="false" Priority="9" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="heading 1"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 2"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 3"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 4"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 5"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 6"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 7"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 8"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 9"/>
<w:LsdException Locked="false" Priority="39" Name="toc 1"/>
<w:LsdException Locked="false" Priority="39" Name="toc 2"/>
<w:LsdException Locked="false" Priority="39" Name="toc 3"/>
<w:LsdException Locked="false" Priority="39" Name="toc 4"/>
<w:LsdException Locked="false" Priority="39" Name="toc 5"/>
<w:LsdException Locked="false" Priority="39" Name="toc 6"/>
<w:LsdException Locked="false" Priority="39" Name="toc 7"/>
<w:LsdException Locked="false" Priority="39" Name="toc 8"/>
<w:LsdException Locked="false" Priority="39" Name="toc 9"/>
<w:LsdException Locked="false" Priority="35" QFormat="true" Name="caption"/>
<w:LsdException Locked="false" Priority="10" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Title"/>
<w:LsdException Locked="false" Priority="1" Name="Default Paragraph Font"/>
<w:LsdException Locked="false" Priority="11" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtitle"/>
<w:LsdException Locked="false" Priority="22" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Strong"/>
<w:LsdException Locked="false" Priority="20" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Emphasis"/>
<w:LsdException Locked="false" Priority="59" SemiHidden="false"
UnhideWhenUsed="false" Name="Table Grid"/>
<w:LsdException Locked="false" UnhideWhenUsed="false" Name="Placeholder Text"/>
<w:LsdException Locked="false" Priority="1" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="No Spacing"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 1"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 1"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 1"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 1"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 1"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 1"/>
<w:LsdException Locked="false" UnhideWhenUsed="false" Name="Revision"/>
<w:LsdException Locked="false" Priority="34" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="List Paragraph"/>
<w:LsdException Locked="false" Priority="29" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Quote"/>
<w:LsdException Locked="false" Priority="30" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Quote"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 1"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 1"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 1"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 1"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 1"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 1"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 1"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 1"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 2"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 2"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 2"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 2"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 2"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 2"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 2"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 2"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 2"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 2"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 2"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 2"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 2"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 2"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 3"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 3"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 3"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 3"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 3"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 3"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 3"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 3"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 3"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 3"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 3"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 3"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 3"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 3"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 4"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 4"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 4"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 4"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 4"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 4"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 4"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 4"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 4"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 4"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 4"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 4"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 4"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 4"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 5"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 5"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 5"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 5"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 5"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 5"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 5"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 5"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 5"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 5"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 5"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 5"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 5"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 5"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 6"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 6"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 6"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 6"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 6"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 6"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 6"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 6"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 6"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 6"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 6"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 6"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 6"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 6"/>
<w:LsdException Locked="false" Priority="19" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtle Emphasis"/>
<w:LsdException Locked="false" Priority="21" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Emphasis"/>
<w:LsdException Locked="false" Priority="31" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtle Reference"/>
<w:LsdException Locked="false" Priority="32" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Reference"/>
<w:LsdException Locked="false" Priority="33" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Book Title"/>
<w:LsdException Locked="false" Priority="37" Name="Bibliography"/>
<w:LsdException Locked="false" Priority="39" QFormat="true" Name="TOC Heading"/>
</w:LatentStyles>
</xml><![endif]-->
<!--[if gte mso 10]>
<style>
/* Style Definitions */
table.MsoNormalTable
{mso-style-name:"Table Normal";
mso-tstyle-rowband-size:0;
mso-tstyle-colband-size:0;
mso-style-noshow:yes;
mso-style-priority:99;
mso-style-parent:"";
mso-padding-alt:0in 5.4pt 0in 5.4pt;
mso-para-margin:0in;
mso-para-margin-bottom:.0001pt;
mso-pagination:widow-orphan;
font-size:12.0pt;
font-family:Cambria;
mso-ascii-font-family:Cambria;
mso-ascii-theme-font:minor-latin;
mso-hansi-font-family:Cambria;
mso-hansi-theme-font:minor-latin;}
</style>
<![endif]-->
<!--StartFragment-->
<br />
<div class="MsoCaption">
Figure <!--[if supportFields]><span style='mso-element:
field-begin'></span><span style="mso-spacerun:yes"> </span>SEQ Figure \* ARABIC
<span style='mso-element:field-separator'></span><![endif]-->6<!--[if supportFields]><span style='mso-element:
field-end'></span><![endif]-->: Alteryx workflow from <a href="https://www.alteryx.com/sites/default/files/uploads/valprop/valprop-product-designer_3.png">https://www.alteryx.com/sites/default/files/uploads/valprop/valprop-product-designer_3.png</a><o:p></o:p></div>
<!--EndFragment--></td></tr>
</tbody></table>
<div class="MsoNormal">
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><br /></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><!--[if gte mso 9]><xml>
<o:OfficeDocumentSettings>
<o:AllowPNG/>
</o:OfficeDocumentSettings>
</xml><![endif]-->
<!--[if gte mso 9]><xml>
<w:WordDocument>
<w:View>Normal</w:View>
<w:Zoom>0</w:Zoom>
<w:TrackMoves>false</w:TrackMoves>
<w:TrackFormatting/>
<w:PunctuationKerning/>
<w:ValidateAgainstSchemas/>
<w:SaveIfXMLInvalid>false</w:SaveIfXMLInvalid>
<w:IgnoreMixedContent>false</w:IgnoreMixedContent>
<w:AlwaysShowPlaceholderText>false</w:AlwaysShowPlaceholderText>
<w:DoNotPromoteQF/>
<w:LidThemeOther>EN-US</w:LidThemeOther>
<w:LidThemeAsian>JA</w:LidThemeAsian>
<w:LidThemeComplexScript>X-NONE</w:LidThemeComplexScript>
<w:Compatibility>
<w:BreakWrappedTables/>
<w:SnapToGridInCell/>
<w:WrapTextWithPunct/>
<w:UseAsianBreakRules/>
<w:DontGrowAutofit/>
<w:SplitPgBreakAndParaMark/>
<w:EnableOpenTypeKerning/>
<w:DontFlipMirrorIndents/>
<w:OverrideTableStyleHps/>
<w:UseFELayout/>
</w:Compatibility>
<m:mathPr>
<m:mathFont m:val="Cambria Math"/>
<m:brkBin m:val="before"/>
<m:brkBinSub m:val="--"/>
<m:smallFrac m:val="off"/>
<m:dispDef/>
<m:lMargin m:val="0"/>
<m:rMargin m:val="0"/>
<m:defJc m:val="centerGroup"/>
<m:wrapIndent m:val="1440"/>
<m:intLim m:val="subSup"/>
<m:naryLim m:val="undOvr"/>
</m:mathPr></w:WordDocument>
</xml><![endif]--><!--[if gte mso 9]><xml>
<w:LatentStyles DefLockedState="false" DefUnhideWhenUsed="true"
DefSemiHidden="true" DefQFormat="false" DefPriority="99"
LatentStyleCount="276">
<w:LsdException Locked="false" Priority="0" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Normal"/>
<w:LsdException Locked="false" Priority="9" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="heading 1"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 2"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 3"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 4"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 5"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 6"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 7"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 8"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 9"/>
<w:LsdException Locked="false" Priority="39" Name="toc 1"/>
<w:LsdException Locked="false" Priority="39" Name="toc 2"/>
<w:LsdException Locked="false" Priority="39" Name="toc 3"/>
<w:LsdException Locked="false" Priority="39" Name="toc 4"/>
<w:LsdException Locked="false" Priority="39" Name="toc 5"/>
<w:LsdException Locked="false" Priority="39" Name="toc 6"/>
<w:LsdException Locked="false" Priority="39" Name="toc 7"/>
<w:LsdException Locked="false" Priority="39" Name="toc 8"/>
<w:LsdException Locked="false" Priority="39" Name="toc 9"/>
<w:LsdException Locked="false" Priority="35" QFormat="true" Name="caption"/>
<w:LsdException Locked="false" Priority="10" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Title"/>
<w:LsdException Locked="false" Priority="1" Name="Default Paragraph Font"/>
<w:LsdException Locked="false" Priority="11" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtitle"/>
<w:LsdException Locked="false" Priority="22" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Strong"/>
<w:LsdException Locked="false" Priority="20" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Emphasis"/>
<w:LsdException Locked="false" Priority="59" SemiHidden="false"
UnhideWhenUsed="false" Name="Table Grid"/>
<w:LsdException Locked="false" UnhideWhenUsed="false" Name="Placeholder Text"/>
<w:LsdException Locked="false" Priority="1" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="No Spacing"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 1"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 1"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 1"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 1"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 1"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 1"/>
<w:LsdException Locked="false" UnhideWhenUsed="false" Name="Revision"/>
<w:LsdException Locked="false" Priority="34" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="List Paragraph"/>
<w:LsdException Locked="false" Priority="29" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Quote"/>
<w:LsdException Locked="false" Priority="30" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Quote"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 1"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 1"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 1"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 1"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 1"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 1"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 1"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 1"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 2"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 2"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 2"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 2"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 2"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 2"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 2"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 2"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 2"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 2"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 2"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 2"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 2"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 2"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 3"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 3"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 3"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 3"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 3"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 3"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 3"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 3"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 3"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 3"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 3"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 3"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 3"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 3"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 4"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 4"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 4"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 4"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 4"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 4"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 4"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 4"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 4"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 4"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 4"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 4"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 4"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 4"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 5"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 5"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 5"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 5"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 5"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 5"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 5"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 5"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 5"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 5"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 5"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 5"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 5"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 5"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 6"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 6"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 6"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 6"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 6"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 6"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 6"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 6"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 6"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 6"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 6"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 6"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 6"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 6"/>
<w:LsdException Locked="false" Priority="19" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtle Emphasis"/>
<w:LsdException Locked="false" Priority="21" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Emphasis"/>
<w:LsdException Locked="false" Priority="31" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtle Reference"/>
<w:LsdException Locked="false" Priority="32" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Reference"/>
<w:LsdException Locked="false" Priority="33" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Book Title"/>
<w:LsdException Locked="false" Priority="37" Name="Bibliography"/>
<w:LsdException Locked="false" Priority="39" QFormat="true" Name="TOC Heading"/>
</w:LatentStyles>
</xml><![endif]-->
<!--[if gte mso 10]>
<style>
/* Style Definitions */
table.MsoNormalTable
{mso-style-name:"Table Normal";
mso-tstyle-rowband-size:0;
mso-tstyle-colband-size:0;
mso-style-noshow:yes;
mso-style-priority:99;
mso-style-parent:"";
mso-padding-alt:0in 5.4pt 0in 5.4pt;
mso-para-margin:0in;
mso-para-margin-bottom:.0001pt;
mso-pagination:widow-orphan;
font-size:12.0pt;
font-family:Cambria;
mso-ascii-font-family:Cambria;
mso-ascii-theme-font:minor-latin;
mso-hansi-font-family:Cambria;
mso-hansi-theme-font:minor-latin;}
</style>
<![endif]-->
<!--StartFragment-->
<br />
<div class="MsoCaption">
Figure <!--[if supportFields]><span style='mso-element:
field-begin'></span><span style="mso-spacerun:yes"> </span>SEQ Figure \* ARABIC
<span style='mso-element:field-separator'></span><![endif]-->7<!--[if supportFields]><span style='mso-element:
field-end'></span><![endif]-->: Microsoft Azure workflow from <a href="https://azurecomcdn.azureedge.net/cvt-f50521786d7494d1129069f701d8751293420aaad404b4aa8260a68606055807/images/page/services/machine-learning/simple-scalable-cutting-edge.jpg">https://azurecomcdn.azureedge.net/cvt-f50521786d7494d1129069f701d8751293420aaad404b4aa8260a68606055807/images/page/services/machine-learning/simple-scalable-cutting-edge.jpg</a><o:p></o:p></div>
<div class="MsoCaption">
<br /></div>
<!--EndFragment--></td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><br /></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><!--[if gte mso 9]><xml>
<o:OfficeDocumentSettings>
<o:AllowPNG/>
</o:OfficeDocumentSettings>
</xml><![endif]-->
<!--[if gte mso 9]><xml>
<w:WordDocument>
<w:View>Normal</w:View>
<w:Zoom>0</w:Zoom>
<w:TrackMoves>false</w:TrackMoves>
<w:TrackFormatting/>
<w:PunctuationKerning/>
<w:ValidateAgainstSchemas/>
<w:SaveIfXMLInvalid>false</w:SaveIfXMLInvalid>
<w:IgnoreMixedContent>false</w:IgnoreMixedContent>
<w:AlwaysShowPlaceholderText>false</w:AlwaysShowPlaceholderText>
<w:DoNotPromoteQF/>
<w:LidThemeOther>EN-US</w:LidThemeOther>
<w:LidThemeAsian>JA</w:LidThemeAsian>
<w:LidThemeComplexScript>X-NONE</w:LidThemeComplexScript>
<w:Compatibility>
<w:BreakWrappedTables/>
<w:SnapToGridInCell/>
<w:WrapTextWithPunct/>
<w:UseAsianBreakRules/>
<w:DontGrowAutofit/>
<w:SplitPgBreakAndParaMark/>
<w:EnableOpenTypeKerning/>
<w:DontFlipMirrorIndents/>
<w:OverrideTableStyleHps/>
<w:UseFELayout/>
</w:Compatibility>
<m:mathPr>
<m:mathFont m:val="Cambria Math"/>
<m:brkBin m:val="before"/>
<m:brkBinSub m:val="--"/>
<m:smallFrac m:val="off"/>
<m:dispDef/>
<m:lMargin m:val="0"/>
<m:rMargin m:val="0"/>
<m:defJc m:val="centerGroup"/>
<m:wrapIndent m:val="1440"/>
<m:intLim m:val="subSup"/>
<m:naryLim m:val="undOvr"/>
</m:mathPr></w:WordDocument>
</xml><![endif]--><!--[if gte mso 9]><xml>
<w:LatentStyles DefLockedState="false" DefUnhideWhenUsed="true"
DefSemiHidden="true" DefQFormat="false" DefPriority="99"
LatentStyleCount="276">
<w:LsdException Locked="false" Priority="0" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Normal"/>
<w:LsdException Locked="false" Priority="9" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="heading 1"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 2"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 3"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 4"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 5"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 6"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 7"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 8"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 9"/>
<w:LsdException Locked="false" Priority="39" Name="toc 1"/>
<w:LsdException Locked="false" Priority="39" Name="toc 2"/>
<w:LsdException Locked="false" Priority="39" Name="toc 3"/>
<w:LsdException Locked="false" Priority="39" Name="toc 4"/>
<w:LsdException Locked="false" Priority="39" Name="toc 5"/>
<w:LsdException Locked="false" Priority="39" Name="toc 6"/>
<w:LsdException Locked="false" Priority="39" Name="toc 7"/>
<w:LsdException Locked="false" Priority="39" Name="toc 8"/>
<w:LsdException Locked="false" Priority="39" Name="toc 9"/>
<w:LsdException Locked="false" Priority="35" QFormat="true" Name="caption"/>
<w:LsdException Locked="false" Priority="10" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Title"/>
<w:LsdException Locked="false" Priority="1" Name="Default Paragraph Font"/>
<w:LsdException Locked="false" Priority="11" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtitle"/>
<w:LsdException Locked="false" Priority="22" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Strong"/>
<w:LsdException Locked="false" Priority="20" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Emphasis"/>
<w:LsdException Locked="false" Priority="59" SemiHidden="false"
UnhideWhenUsed="false" Name="Table Grid"/>
<w:LsdException Locked="false" UnhideWhenUsed="false" Name="Placeholder Text"/>
<w:LsdException Locked="false" Priority="1" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="No Spacing"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 1"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 1"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 1"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 1"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 1"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 1"/>
<w:LsdException Locked="false" UnhideWhenUsed="false" Name="Revision"/>
<w:LsdException Locked="false" Priority="34" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="List Paragraph"/>
<w:LsdException Locked="false" Priority="29" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Quote"/>
<w:LsdException Locked="false" Priority="30" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Quote"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 1"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 1"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 1"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 1"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 1"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 1"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 1"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 1"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 2"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 2"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 2"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 2"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 2"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 2"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 2"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 2"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 2"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 2"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 2"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 2"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 2"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 2"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 3"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 3"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 3"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 3"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 3"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 3"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 3"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 3"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 3"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 3"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 3"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 3"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 3"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 3"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 4"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 4"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 4"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 4"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 4"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 4"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 4"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 4"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 4"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 4"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 4"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 4"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 4"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 4"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 5"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 5"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 5"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 5"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 5"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 5"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 5"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 5"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 5"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 5"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 5"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 5"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 5"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 5"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 6"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 6"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 6"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 6"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 6"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 6"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 6"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 6"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 6"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 6"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 6"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 6"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 6"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 6"/>
<w:LsdException Locked="false" Priority="19" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtle Emphasis"/>
<w:LsdException Locked="false" Priority="21" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Emphasis"/>
<w:LsdException Locked="false" Priority="31" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtle Reference"/>
<w:LsdException Locked="false" Priority="32" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Reference"/>
<w:LsdException Locked="false" Priority="33" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Book Title"/>
<w:LsdException Locked="false" Priority="37" Name="Bibliography"/>
<w:LsdException Locked="false" Priority="39" QFormat="true" Name="TOC Heading"/>
</w:LatentStyles>
</xml><![endif]-->
<!--[if gte mso 10]>
<style>
/* Style Definitions */
table.MsoNormalTable
{mso-style-name:"Table Normal";
mso-tstyle-rowband-size:0;
mso-tstyle-colband-size:0;
mso-style-noshow:yes;
mso-style-priority:99;
mso-style-parent:"";
mso-padding-alt:0in 5.4pt 0in 5.4pt;
mso-para-margin:0in;
mso-para-margin-bottom:.0001pt;
mso-pagination:widow-orphan;
font-size:12.0pt;
font-family:Cambria;
mso-ascii-font-family:Cambria;
mso-ascii-theme-font:minor-latin;
mso-hansi-font-family:Cambria;
mso-hansi-theme-font:minor-latin;}
</style>
<![endif]-->
<!--StartFragment--><span style="font-family: "cambria"; font-size: 12pt;">Figure </span><!--[if supportFields]><span
style='font-size:12.0pt;font-family:Cambria;mso-ascii-theme-font:minor-latin;
mso-fareast-font-family:"MS 明朝";mso-fareast-theme-font:minor-fareast;
mso-hansi-theme-font:minor-latin;mso-bidi-font-family:"Times New Roman";
mso-bidi-theme-font:minor-bidi;mso-ansi-language:EN-US;mso-fareast-language:
EN-US;mso-bidi-language:AR-SA'><span style='mso-element:field-begin'></span><span
style="mso-spacerun:yes"> </span>SEQ Figure \* ARABIC <span style='mso-element:
field-separator'></span></span><![endif]--><span style="font-family: "cambria"; font-size: 12pt;">8</span><!--[if supportFields]><span
style='font-size:12.0pt;font-family:Cambria;mso-ascii-theme-font:minor-latin;
mso-fareast-font-family:"MS 明朝";mso-fareast-theme-font:minor-fareast;
mso-hansi-theme-font:minor-latin;mso-bidi-font-family:"Times New Roman";
mso-bidi-theme-font:minor-bidi;mso-ansi-language:EN-US;mso-fareast-language:
EN-US;mso-bidi-language:AR-SA'><span style='mso-element:field-end'></span></span><![endif]--><span style="font-family: "cambria"; font-size: 12pt;">: Orange Data Mining from <a href="http://1.bp.blogspot.com/-4h6g-5eoJpU/UeiJ1l_9mtI/AAAAAAAAAEc/sCiyxXKP48o/s1600/orangepython.png">http://1.bp.blogspot.com/-4h6g-5eoJpU/UeiJ1l_9mtI/AAAAAAAAAEc/sCiyxXKP48o/s1600/orangepython.png</a></span><!--EndFragment-->
</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><br /></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><!--[if gte mso 9]><xml>
<o:OfficeDocumentSettings>
<o:AllowPNG/>
</o:OfficeDocumentSettings>
</xml><![endif]-->
<!--[if gte mso 9]><xml>
<w:WordDocument>
<w:View>Normal</w:View>
<w:Zoom>0</w:Zoom>
<w:TrackMoves>false</w:TrackMoves>
<w:TrackFormatting/>
<w:PunctuationKerning/>
<w:ValidateAgainstSchemas/>
<w:SaveIfXMLInvalid>false</w:SaveIfXMLInvalid>
<w:IgnoreMixedContent>false</w:IgnoreMixedContent>
<w:AlwaysShowPlaceholderText>false</w:AlwaysShowPlaceholderText>
<w:DoNotPromoteQF/>
<w:LidThemeOther>EN-US</w:LidThemeOther>
<w:LidThemeAsian>JA</w:LidThemeAsian>
<w:LidThemeComplexScript>X-NONE</w:LidThemeComplexScript>
<w:Compatibility>
<w:BreakWrappedTables/>
<w:SnapToGridInCell/>
<w:WrapTextWithPunct/>
<w:UseAsianBreakRules/>
<w:DontGrowAutofit/>
<w:SplitPgBreakAndParaMark/>
<w:EnableOpenTypeKerning/>
<w:DontFlipMirrorIndents/>
<w:OverrideTableStyleHps/>
<w:UseFELayout/>
</w:Compatibility>
<m:mathPr>
<m:mathFont m:val="Cambria Math"/>
<m:brkBin m:val="before"/>
<m:brkBinSub m:val="--"/>
<m:smallFrac m:val="off"/>
<m:dispDef/>
<m:lMargin m:val="0"/>
<m:rMargin m:val="0"/>
<m:defJc m:val="centerGroup"/>
<m:wrapIndent m:val="1440"/>
<m:intLim m:val="subSup"/>
<m:naryLim m:val="undOvr"/>
</m:mathPr></w:WordDocument>
</xml><![endif]--><!--[if gte mso 9]><xml>
<w:LatentStyles DefLockedState="false" DefUnhideWhenUsed="true"
DefSemiHidden="true" DefQFormat="false" DefPriority="99"
LatentStyleCount="276">
<w:LsdException Locked="false" Priority="0" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Normal"/>
<w:LsdException Locked="false" Priority="9" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="heading 1"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 2"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 3"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 4"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 5"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 6"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 7"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 8"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 9"/>
<w:LsdException Locked="false" Priority="39" Name="toc 1"/>
<w:LsdException Locked="false" Priority="39" Name="toc 2"/>
<w:LsdException Locked="false" Priority="39" Name="toc 3"/>
<w:LsdException Locked="false" Priority="39" Name="toc 4"/>
<w:LsdException Locked="false" Priority="39" Name="toc 5"/>
<w:LsdException Locked="false" Priority="39" Name="toc 6"/>
<w:LsdException Locked="false" Priority="39" Name="toc 7"/>
<w:LsdException Locked="false" Priority="39" Name="toc 8"/>
<w:LsdException Locked="false" Priority="39" Name="toc 9"/>
<w:LsdException Locked="false" Priority="35" QFormat="true" Name="caption"/>
<w:LsdException Locked="false" Priority="10" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Title"/>
<w:LsdException Locked="false" Priority="1" Name="Default Paragraph Font"/>
<w:LsdException Locked="false" Priority="11" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtitle"/>
<w:LsdException Locked="false" Priority="22" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Strong"/>
<w:LsdException Locked="false" Priority="20" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Emphasis"/>
<w:LsdException Locked="false" Priority="59" SemiHidden="false"
UnhideWhenUsed="false" Name="Table Grid"/>
<w:LsdException Locked="false" UnhideWhenUsed="false" Name="Placeholder Text"/>
<w:LsdException Locked="false" Priority="1" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="No Spacing"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 1"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 1"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 1"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 1"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 1"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 1"/>
<w:LsdException Locked="false" UnhideWhenUsed="false" Name="Revision"/>
<w:LsdException Locked="false" Priority="34" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="List Paragraph"/>
<w:LsdException Locked="false" Priority="29" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Quote"/>
<w:LsdException Locked="false" Priority="30" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Quote"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 1"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 1"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 1"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 1"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 1"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 1"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 1"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 1"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 2"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 2"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 2"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 2"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 2"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 2"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 2"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 2"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 2"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 2"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 2"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 2"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 2"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 2"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 3"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 3"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 3"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 3"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 3"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 3"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 3"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 3"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 3"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 3"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 3"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 3"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 3"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 3"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 4"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 4"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 4"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 4"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 4"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 4"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 4"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 4"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 4"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 4"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 4"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 4"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 4"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 4"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 5"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 5"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 5"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 5"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 5"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 5"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 5"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 5"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 5"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 5"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 5"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 5"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 5"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 5"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 6"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 6"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 6"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 6"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 6"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 6"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 6"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 6"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 6"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 6"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 6"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 6"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 6"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 6"/>
<w:LsdException Locked="false" Priority="19" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtle Emphasis"/>
<w:LsdException Locked="false" Priority="21" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Emphasis"/>
<w:LsdException Locked="false" Priority="31" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtle Reference"/>
<w:LsdException Locked="false" Priority="32" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Reference"/>
<w:LsdException Locked="false" Priority="33" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Book Title"/>
<w:LsdException Locked="false" Priority="37" Name="Bibliography"/>
<w:LsdException Locked="false" Priority="39" QFormat="true" Name="TOC Heading"/>
</w:LatentStyles>
</xml><![endif]-->
<!--[if gte mso 10]>
<style>
/* Style Definitions */
table.MsoNormalTable
{mso-style-name:"Table Normal";
mso-tstyle-rowband-size:0;
mso-tstyle-colband-size:0;
mso-style-noshow:yes;
mso-style-priority:99;
mso-style-parent:"";
mso-padding-alt:0in 5.4pt 0in 5.4pt;
mso-para-margin:0in;
mso-para-margin-bottom:.0001pt;
mso-pagination:widow-orphan;
font-size:12.0pt;
font-family:Cambria;
mso-ascii-font-family:Cambria;
mso-ascii-theme-font:minor-latin;
mso-hansi-font-family:Cambria;
mso-hansi-theme-font:minor-latin;}
</style>
<![endif]-->
<!--StartFragment-->
<br />
<div class="MsoFootnoteText">
Figure <!--[if supportFields]><span style='mso-element:
field-begin'></span><span style="mso-spacerun:yes"> </span>SEQ Figure \* ARABIC
<span style='mso-element:field-separator'></span><![endif]-->9<!--[if supportFields]><span style='mso-element:
field-end'></span><![endif]-->: Weka Knowledge Flow from <a href="http://www.siliconafrica.com/wp-content/themes/directorypress/thumbs//weka.png">http://www.siliconafrica.com/wp-content/themes/directorypress/thumbs//weka.png</a><o:p></o:p></div>
<div class="MsoFootnoteText">
<br /></div>
<!--EndFragment--></td></tr>
</tbody></table>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<br /></div>
Dean Abbotthttp://www.blogger.com/profile/16818000233889520746noreply@blogger.com0tag:blogger.com,1999:blog-5652924.post-69714785907046030472017-03-20T08:29:00.000-07:002017-03-20T08:34:19.947-07:00A Question of Resource AllocationOf the resources consumed in data mining projects, the most precious
(read: "expensive") is time, especially the time of the human analyst.
Hence, a significant question for the analyst is how best to allocate
his or her time.<br />
<br />
Long and continuing experience
indicates clearly that the most productive use of time in such work is
that dedicated to data preparation. I apologize if this seems like an
old topic to the reader, but it is an important lesson which seems to be
forgotten annually, as each new technical development presents itself. A
surprising number of authors- particularly the on-line variety- come to
the conclusion that "the latest thing"* will spare us from needing to
prepare and enhance the data.<br />
<br />
I offer, as <u>yet another</u>
data point in favor of this perspective a recent conversation I had
with a colleague. He and a small team conducted parallel modeling
efforts for a shared client. Using the same base data, they constructed
separate predictive models. His model and theirs achieved similar test
performance. The team used a random forest, while he used
logistic regression, one of the simplest modeling techniques. The team
was perplexed at the similarity in model performance. My associate asked
them how they had handled missing values. They responded that they
filled them in. He asked exactly how they had filled the missing values.
The response was that they set them all to zeros (!). By not taking the
time and effort to comprehensively address this issue, they had forced
their model to do the significant extra work of filling in these gaps
itself. Consider that some fraction of their data budget was spent on fixing this mistake, rather than being used to create a better model. Note, too, that it is far easier (less code, less input
variables to monitor, less to go wrong) to deploy a modestly-sized
logistic regression than any random forest.<br />
<br />
Given this
context, it is curious to note that so much of what is published (again,
especially on-line; think of titles such as: "The 10 Learning
Algorithms Every Data Scientist Must Know") and so many job listings
emphasize- almost to the point of exclusivity- learning algorithms, as
opposed to practical questions of data sampling, data preparation and
enhancement, variable reduction, solving the business problem (instead of the technical one) or ability to deploy the final product.<br />
<br />
<br />
*
For "the latest thing", you may fill in, variously, neural networks,
decision trees, SVM, random forests, GPUs, deep learning or whatever
comes out as next year's "next big thing".<br />
<br />
<br />Will Dwinnellhttp://www.blogger.com/profile/03379859054257561952noreply@blogger.com3tag:blogger.com,1999:blog-5652924.post-27361905570594914172016-04-25T08:01:00.000-07:002016-04-25T08:09:03.529-07:00Tracking Model Performance Over Time<b>Context</b><br />
<br />
Most introductory data mining texts include substantial coverage of model testing. Various methods of assessing true model performance (holdout testing, k-fold cross validation, etc.) are usually explained, perhaps with some important variants, such as stratification of the testing samples.<br />
<br />
Generally, all of this exposition is aimed at in-time analysis: Model development data may span multiple time periods, but the testing is more or less blind to this: all periods are treated as fair game and mixed together. This is fine for model development. Once predictive models are deployed, however, it is desirable to continue testing to track model performance over time. Models which degrade over time need to be adjusted or replaced.<br />
<br />
<br />
<b>Subtleties of Testing Over Time</b><br />
<br />
Nearly all production model evaluation is performed with new out-of-time data. As new periods of observed outcomes become available, they are used to calculate running performance measures. As far it goes, focusing on the actual performance metric makes sense. In my experience, though, some clients become distracted by movement in the independent variables or in the predicted or actual outcome distributions, in isolation. It is important to understand the dynamic of these changes to fully understand model performance over time.<br />
<br />
For the sake of a thought experiment, consider a very simply problem with one independent variable, and one target variable, both real numbers. Historically, the distribution of each of these variables has been confined to specific ranges. A predictive model has been constructed as a linear regression which attempts to anticipate the target variable, using only the input of the single independent variable (and a constant). Assume that errors observed in the development data have been small and otherwise unremarkable (they are distributed normally, their magnitude is relatively constant across the range of the independent variable, there is no obvious pattern to them and so forth).<br />
<br />
Once this model is deployed, it is executed on all future cases drawn from the relevant statistical universe, and predictions are saved for further analysis. Likewise, actual outcomes are recorded as they become available. At the conclusion of each future time period, model performance within that period is examined.<br />
<br />
Consider the simplest change to well-developed model: the distribution of the independent variable remains the same, but the actual outcomes begin to depart the regression line. Any number of changes could be taking place in the output distribution, but the predicted distribution (the regression line) cannot move since it is entirely defined by the independent variable, which in this case is stable. By definition, model performance is degrading. This circumstance is easy to diagnose: the dynamic linking the target and independent variables is changing, hence a new model is necessary to restore performance.<br />
<br />
What happens, though, when the independent variable begins to migrate? There are two possible effects (in reality, some combination of these extremes is likely): 1. The distribution of actual outcomes will either shift to appropriately match the change ("the dots march along the regression line"), or 2. The distribution of actual outcomes does not shift to match the change. In the first case, the model continues to correctly identify the relationship between the target and the independent variable, and model performance will more-or-less endure. In the second case, reality begins to wander from the model and performance deteriorates. Notice that, in the second case, the actual outcome distribution may or may not change noticeably- either way, the model no longer correctly anticipates reality and needs to be updated.<br />
<br />
<br />
<b>Conclusion</b><br />
<br />
The example used here was deliberately chosen to be simple, for illustrations' sake. Qualitatively, though, the same basic behaviors are exhibited by much more complex models. Models featuring multiple independent variables or employing complex transformations (neural networks, decision trees, etc.) obey the same fundamental dynamic. Given the sensitivity of nonlinear models to each of their independent variables, a migration in even one of them may provoke the changes described above. Consideration of the components of this interplay in isolation only serves to confuse: Changes over time can only be understood as part of the larger whole.<br />
<br />Will Dwinnellhttp://www.blogger.com/profile/03379859054257561952noreply@blogger.com1tag:blogger.com,1999:blog-5652924.post-40959177576530284432015-12-06T20:04:00.000-08:002015-12-06T20:05:58.660-08:00Predictive Modeling Skills: Expect to be Surprised<div class="MsoNormal" style="margin-bottom: 12.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<i><span style="font-family: "times";">Excerpted
from Chapter 1 of my book </span></i><a href="http://www.amazon.com/Applied-Predictive-Analytics-Principles-Professional/dp/1118727967/ref=sr_1_1?s=books&ie=UTF8&qid=1420039995&sr=1-1&keywords=applied+predictive+analytics"><i><span style="font-family: "times";">Applied Predictive Analytics</span></i></a><i><span style="font-family: "times";">, Wiley 2014</span></i></div>
<div class="MsoNormal" style="margin-bottom: 12.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<span style="font-family: "times";">Conventional wisdom says that predictive modelers
need to have an academic background in statistics, mathematics, computer
science, or engineering. A degree in one of these fields is best, but without a
degree, at a minimum, one should at least have taken statistics or mathematics
courses. Historically, one could not get a degree in predictive analytics, data
mining, or machine learning. <o:p></o:p></span></div>
<div class="MsoNormal" style="margin-bottom: 12.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<span style="font-family: "times";">This has changed, however, and dozens of
universities now offer master’s degrees in predictive analytics. Additionally,
there are many variants of analytics degrees, including master’s degrees in
data mining, marketing analytics, business analytics, or machine learning. Some
programs even include a practicum so that students can learn to apply textbook
science to real-world problems. <o:p></o:p></span></div>
<div class="MsoNormal" style="margin-bottom: 12.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<span style="font-family: "times";">One reason the real-world experience is so critical
for predictive modeling is that the science has tremendous limitations. Most
real-world problems have data problems never encountered in the textbooks. The
ways in which data can go wrong are seemingly endless; building the same
customer acquisition models even within the same domain requires different
approaches to data preparation, missing value imputation, feature creation, and
even modeling methods. <o:p></o:p></span></div>
<div class="MsoNormal" style="margin-bottom: 12.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<span style="font-family: "times";">However, the <i>principles </i>of how one can solve
data problems are not endless; the experience of building models for several
years will prepare modelers to at least be able to identify when potential
problems may arise. <o:p></o:p></span></div>
<div class="MsoNormal" style="margin-bottom: 12.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<span style="font-family: "times";">Surveys of top-notch predictive modelers reveal a
mixed story, however. While many have a science, statistics, or mathematics
background, many do not. Many have backgrounds in social science or humanities.
How can this be? <o:p></o:p></span></div>
<div class="MsoNormal" style="margin-bottom: 12.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<span style="font-family: "times";">Consider a retail example. The retailer Target was
building predictive models to identify likely purchase behavior and to incentivize
future behavior with relevant offers. Andrew Pole<b>, </b>a Senior Manager of
Media and Database Marketing described how the company went about building
systems of predictive models at the Predictive Analytics World Conference in
2010. Pole described the importance of a combination of domain knowledge,
knowledge of predictive modeling, and most of all, a forensic mindset in
successful modeling of what he calls a “guest portrait.” <o:p></o:p></span></div>
<div class="MsoNormal" style="margin-bottom: 12.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<span style="font-family: "times";">They developed a model to predict if a female
customer was pregnant. They noticed patterns of purchase behavior, what he
called “nesting” behavior. For example, women were purchasing cribs on average
90 days before the due date. Pole also observed that some products were
purchased at regular intervals prior to a woman’s due date. The company also
observed that if they were able to acquire these women as purchasers of other
products during the time before the birth of their baby, Target was able to
increase significantly the customer value; these women would continue to
purchase from Target after the baby was born based on their purchase behavior
before. <o:p></o:p></span></div>
<div class="MsoNormal" style="margin-bottom: 12.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<span style="font-family: "times";">The key descriptive terms are “<i>observed</i>” and
“<i>noticed.</i>” This means the models were not built as black boxes. The
analysts asked, “does this make sense?” and leveraged insights gained from the
patterns found in the data to produce better predictive models. It undoubtedly
was iterative; as they “noticed” pat- terns, they were prompted to consider
other patterns they had not explicitly considered before (and maybe had not
even occurred to them before). This forensic mindset of analysts, noticing
interesting patterns and making connections between those patterns and how the
models could be used, is critical to successful modeling. It is rare that
predictive models can be fully defined before a project and modelers can anticipate
all of the most important patterns the model will find. So we shouldn’t be
surprised that we <i>will </i>be surprised, or put another way, we should <i>expect
</i>to be surprised. <o:p></o:p></span></div>
<!--[if gte mso 9]><xml>
<o:DocumentProperties>
<o:Revision>0</o:Revision>
<o:TotalTime>0</o:TotalTime>
<o:Pages>1</o:Pages>
<o:Words>662</o:Words>
<o:Characters>3776</o:Characters>
<o:Company>Smarter Remarketer</o:Company>
<o:Lines>31</o:Lines>
<o:Paragraphs>8</o:Paragraphs>
<o:CharactersWithSpaces>4430</o:CharactersWithSpaces>
<o:Version>14.0</o:Version>
</o:DocumentProperties>
<o:OfficeDocumentSettings>
<o:AllowPNG/>
</o:OfficeDocumentSettings>
</xml><![endif]-->
<!--[if gte mso 9]><xml>
<w:WordDocument>
<w:View>Normal</w:View>
<w:Zoom>0</w:Zoom>
<w:TrackMoves/>
<w:TrackFormatting/>
<w:PunctuationKerning/>
<w:ValidateAgainstSchemas/>
<w:SaveIfXMLInvalid>false</w:SaveIfXMLInvalid>
<w:IgnoreMixedContent>false</w:IgnoreMixedContent>
<w:AlwaysShowPlaceholderText>false</w:AlwaysShowPlaceholderText>
<w:DoNotPromoteQF/>
<w:LidThemeOther>EN-US</w:LidThemeOther>
<w:LidThemeAsian>JA</w:LidThemeAsian>
<w:LidThemeComplexScript>X-NONE</w:LidThemeComplexScript>
<w:Compatibility>
<w:BreakWrappedTables/>
<w:SnapToGridInCell/>
<w:WrapTextWithPunct/>
<w:UseAsianBreakRules/>
<w:DontGrowAutofit/>
<w:SplitPgBreakAndParaMark/>
<w:EnableOpenTypeKerning/>
<w:DontFlipMirrorIndents/>
<w:OverrideTableStyleHps/>
<w:UseFELayout/>
</w:Compatibility>
<m:mathPr>
<m:mathFont m:val="Cambria Math"/>
<m:brkBin m:val="before"/>
<m:brkBinSub m:val="--"/>
<m:smallFrac m:val="off"/>
<m:dispDef/>
<m:lMargin m:val="0"/>
<m:rMargin m:val="0"/>
<m:defJc m:val="centerGroup"/>
<m:wrapIndent m:val="1440"/>
<m:intLim m:val="subSup"/>
<m:naryLim m:val="undOvr"/>
</m:mathPr></w:WordDocument>
</xml><![endif]--><!--[if gte mso 9]><xml>
<w:LatentStyles DefLockedState="false" DefUnhideWhenUsed="true"
DefSemiHidden="true" DefQFormat="false" DefPriority="99"
LatentStyleCount="276">
<w:LsdException Locked="false" Priority="0" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Normal"/>
<w:LsdException Locked="false" Priority="9" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="heading 1"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 2"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 3"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 4"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 5"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 6"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 7"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 8"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 9"/>
<w:LsdException Locked="false" Priority="39" Name="toc 1"/>
<w:LsdException Locked="false" Priority="39" Name="toc 2"/>
<w:LsdException Locked="false" Priority="39" Name="toc 3"/>
<w:LsdException Locked="false" Priority="39" Name="toc 4"/>
<w:LsdException Locked="false" Priority="39" Name="toc 5"/>
<w:LsdException Locked="false" Priority="39" Name="toc 6"/>
<w:LsdException Locked="false" Priority="39" Name="toc 7"/>
<w:LsdException Locked="false" Priority="39" Name="toc 8"/>
<w:LsdException Locked="false" Priority="39" Name="toc 9"/>
<w:LsdException Locked="false" Priority="35" QFormat="true" Name="caption"/>
<w:LsdException Locked="false" Priority="10" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Title"/>
<w:LsdException Locked="false" Priority="1" Name="Default Paragraph Font"/>
<w:LsdException Locked="false" Priority="11" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtitle"/>
<w:LsdException Locked="false" Priority="22" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Strong"/>
<w:LsdException Locked="false" Priority="20" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Emphasis"/>
<w:LsdException Locked="false" Priority="59" SemiHidden="false"
UnhideWhenUsed="false" Name="Table Grid"/>
<w:LsdException Locked="false" UnhideWhenUsed="false" Name="Placeholder Text"/>
<w:LsdException Locked="false" Priority="1" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="No Spacing"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 1"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 1"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 1"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 1"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 1"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 1"/>
<w:LsdException Locked="false" UnhideWhenUsed="false" Name="Revision"/>
<w:LsdException Locked="false" Priority="34" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="List Paragraph"/>
<w:LsdException Locked="false" Priority="29" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Quote"/>
<w:LsdException Locked="false" Priority="30" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Quote"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 1"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 1"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 1"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 1"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 1"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 1"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 1"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 1"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 2"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 2"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 2"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 2"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 2"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 2"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 2"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 2"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 2"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 2"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 2"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 2"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 2"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 2"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 3"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 3"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 3"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 3"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 3"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 3"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 3"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 3"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 3"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 3"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 3"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 3"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 3"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 3"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 4"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 4"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 4"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 4"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 4"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 4"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 4"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 4"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 4"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 4"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 4"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 4"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 4"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 4"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 5"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 5"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 5"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 5"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 5"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 5"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 5"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 5"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 5"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 5"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 5"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 5"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 5"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 5"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 6"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 6"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 6"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 6"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 6"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 6"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 6"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 6"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 6"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 6"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 6"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 6"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 6"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 6"/>
<w:LsdException Locked="false" Priority="19" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtle Emphasis"/>
<w:LsdException Locked="false" Priority="21" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Emphasis"/>
<w:LsdException Locked="false" Priority="31" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtle Reference"/>
<w:LsdException Locked="false" Priority="32" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Reference"/>
<w:LsdException Locked="false" Priority="33" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Book Title"/>
<w:LsdException Locked="false" Priority="37" Name="Bibliography"/>
<w:LsdException Locked="false" Priority="39" QFormat="true" Name="TOC Heading"/>
</w:LatentStyles>
</xml><![endif]-->
<!--[if gte mso 10]>
<style>
/* Style Definitions */
table.MsoNormalTable
{mso-style-name:"Table Normal";
mso-tstyle-rowband-size:0;
mso-tstyle-colband-size:0;
mso-style-noshow:yes;
mso-style-priority:99;
mso-style-parent:"";
mso-padding-alt:0in 5.4pt 0in 5.4pt;
mso-para-margin:0in;
mso-para-margin-bottom:.0001pt;
mso-pagination:widow-orphan;
font-size:12.0pt;
font-family:Cambria;
mso-ascii-font-family:Cambria;
mso-ascii-theme-font:minor-latin;
mso-hansi-font-family:Cambria;
mso-hansi-theme-font:minor-latin;}
</style>
<![endif]-->
<!--StartFragment-->
<!--EndFragment--><br />
<div class="MsoNormal" style="margin-bottom: 12.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<span style="font-family: "times";">This kind of mindset is not learned in a university
program; it is part of the personality of the individual. Good predictive
modelers need to have a forensic mindset and intellectual curiosity, whether or
not they understand the mathematics enough to derive the equations for linear
regression. <o:p></o:p></span></div>
<div class="MsoNormal" style="margin-bottom: 12.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<span style="font-family: "times";">(This post first appeared in the <a href="http://www.predictiveanalyticsworld.com/patimes/" target="_blank">Predictive Analytics Times</a></span>)</div>
Dean Abbotthttp://www.blogger.com/profile/16818000233889520746noreply@blogger.com1tag:blogger.com,1999:blog-5652924.post-33367070195642003292015-07-17T20:15:00.002-07:002015-07-18T04:32:50.846-07:00Data Mining's Forgotten Step-ChildrenDepending on whose definition one reads, the list of activities which comprise data mining will vary, but the first two items are always the same...<br />
<br />
<br />
<b>Number 1: Prediction</b> <br />
<br />
The most common data mining function, by far, is <i>prediction</i> (or, more esoterically, <i>supervised learning</i>), which is sometimes listed twice, depending on the type of variable being predicted: <i>classification</i> (when the target is categorical) vs. <i>regression</i> (when the target is numerical). Predictive models learned by machines from historical examples easily occupy the most of almost any measure of data mining: time, money, technical papers published, software packages, etc. The hyperbole of marketers and the fears of data mining critics, also, are most often associated with prediction.<br />
<br />
<br />
<b>Number 2: Clustering</b><br />
<br />
The second most common data mining function in practice is <i>clustering</i> (sometimes known by the alias <i>unsupervised learning</i>). Gathering things into "natural" groupings has a long history in some fields (cladistics in biology, for instance), though clustering's "no right or wrong answer" quality likely will cement its continuing spot in second place. Despite being second banana to prediction, clustering enjoys widespread application and is well understood even in non-technical circles. What marketer doesn't like a good segmentation? <br />
<br />
<br />
<b>"... and all the rest!"</b><br />
<br />
What else is in the data mining toolbox? Definitions vary, but the next two most commonly mentioned tasks are <i>anomaly detection</i> and <i>association rule discovery</i>. Other tasks have been included, such as <i>data visualization,</i> though that field dates back well over a hundred years and clearly enjoys a healthy existence outside of the data mining field.<br />
<br />
Anomaly detection (a superset of statistical <i>outlier detection</i>) searches for observations which violate patterns in data. Generally, these patterns are discovered (explicitly or not) using prediction or clustering. Given that a wide array of prediction or clustering techniques might be applied, the patterns concluded to exist within a single data set will vary, implying that observations flagged as anomalous will vary. This leaves anomaly detection somewhat in the company of clustering in the sense of having "no right or wrong answers". Still, anomaly detection can be immensely useful, with two common applications being fraud detection and data cleansing. This author has used a simple anomaly detection process to help find errors in predictive model implementation code.<br />
<br />
Association rule discovery attempts to identify patterns among data items which exhibit associations with one another. The classic example is individual items of merchandise in a retail setting (<i>market basket analysis</i>): Each purchase represents an association of a variety of distinct items with one another. After enough purchases, relationships among items can be inferred, such as the frequent purchase of coffee with sugar. Relationships among people, as evidenced by instances of telephone or electronic contact, have also been explored, both for marketing purposes and in law enforcement.<br />
<br />
<br />
<b>Further Reading</b><br />
<br />
Neither anomaly detection nor association rule discovery receive nearly the press that the first two members of the data mining club do, but it is worth learning something about them. Some problems fall more naturally into their purview. To get started with these techniques, the standard references will do, such as Witten and Frank, or Han and Kamber. Also consider material on outliers in the traditional statistical literature.<br />
<br />
<br />
<br />
<br />Will Dwinnellhttp://www.blogger.com/profile/03379859054257561952noreply@blogger.com0tag:blogger.com,1999:blog-5652924.post-17183176723795081822014-07-24T22:56:00.000-07:002014-08-09T22:06:48.628-07:00Similarities and Differences Between Predictive Analytics and Business Intelligence<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<span style="font-family: Helvetica; mso-bidi-font-family: Helvetica;">I’ve been reminded recently of the overlap
between business intelligence and predictive analytics. Of course any reader of
this blog (or at least the title of the blog) knows I live in the world of data
mining (DM) and predictive analytics (PA), not the world of business
intelligence (BI). In general, I don’t make comments about BI because I am an
outsider looking in. Nevertheless, I view BI as a sibling to PA because we
share so much in common: we use the same data, often use similar metrics and
even sometimes use the same tools in our analyses. <o:p></o:p></span></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<br /></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<span style="font-family: Helvetica; mso-bidi-font-family: Helvetica;">I was interviewed by <a href="https://www.linkedin.com/pub/victoria-garment/3b/443/662" target="_blank">Victoria Garment</a> of
<a href="http://www.softwareadvice.com/" target="_blank">Software Advice</a> on the topic of testing accuracy of predictive models in
January, 2014 (I think I was first contacted about the interview in December,
2013). What I didn’t know was that John Elder and Karl Rexer, two good friends
and colleagues in this space, were interviewed as well. The resulting article, "</span><span style="font-family: Helvetica;"><a href="http://plotting-success.softwareadvice.com/3-predictive-model-accuracy-tests-0114/" target="_blank">3 Ways to Test the Accuracy of Your Predictive Models</a>" </span><span style="font-family: Helvetica;">posted on their <a href="http://plotting-success.softwareadvice.com/" target="_blank">Plotting Success blog</a></span><span style="font-family: Helvetica;"> was well written and generated quite a bit of buzz on twitter after it was
posted.</span></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<br /></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<span style="font-family: Helvetica; mso-bidi-font-family: Helvetica;">Prior to the interview, I had no knowledge of
Software Advice and after looking at their blog, I understand why: they are
clearly a BI blog. But after reading maybe a dozen posts, it is clear that we
are siblings, in particular sharing concepts and approaches in big data, data
science, staffing and talent acquisition. I've enjoyed going back to the blog. <o:p></o:p></span></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<br /></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<span style="font-family: Helvetica; mso-bidi-font-family: Helvetica;">The similarities of BI and PA are points I’ve tried to make in
talks I’ve given at eMetrics and performance management conferences. After
making suitable translations of terms, these two fields can understand each
other well. Two sample differences in terminology are described here. <o:p></o:p></span></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<br /></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<span style="font-family: Helvetica; mso-bidi-font-family: Helvetica;">First, one rarely hears the term KPI at a PA
conference, but will often hear it at BI conferences. If we use google as an
indicator of popularity of the term KPI,</span></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
</div>
<ul>
<li><span style="font-family: Helvetica;">' “predictive analytics” KPI' yielded a mere
103,000 hits on google, whereas</span></li>
<li><span style="font-family: Helvetica;">' “business intelligence” KPI' yielded 1,510,000
hits.</span></li>
</ul>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<span style="font-family: Helvetica;">In PA, one is more likely to hear these ideas described as metrics or even features or derived variables that can be used as inputs to models are as a target variable.</span></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<span style="font-family: Helvetica; mso-bidi-font-family: Helvetica;"><br /></span></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<span style="font-family: Helvetica; mso-bidi-font-family: Helvetica;">As a second example, a “use case” is frequently
presented in BI conferences to explain a reason for creating a particular KPI or analysis. “Use
Cases” are rarely described in PA conferences; in PA we say “case studies”.
Back to google, we find</span></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
</div>
<ul>
<li><span style="font-family: Helvetica;">' "business intelligence" "use
case" ' – 306,000 hits on google</span></li>
<li><span style="font-family: Helvetica;">' “predictive analytics” ”use case” ' – 58,800 hits
on google</span></li>
<li><span style="font-family: Helvetica;">' “predictive analytics “case study” ' – 217,000
hits on google</span></li>
</ul>
<br />
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<br /></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<span style="font-family: Helvetica; mso-bidi-font-family: Helvetica;">Interestingly, the top two links for
“predictive analytics” “use case” from the search weren’t even predictive
analytics use cases or case studies. The second link of the two actually described how
predictive analytics is a use case for cloud computing. <o:p></o:p></span></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<br /></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<span style="font-family: Helvetica; mso-bidi-font-family: Helvetica;">The BI community, however, seems to embrace PA
and even consider it part of BI (much to the chagrin of the PA community, I
would think). According to the Wikipedia entry on BI, the following chart shows
topics that are a part of BI:<o:p></o:p></span></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="http://1.bp.blogspot.com/-Xhv8ATtkQiU/U9L4-GCBtkI/AAAAAAAAAfA/lIOvKyMvTVg/s1600/BI-wikipedia.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" src="http://1.bp.blogspot.com/-Xhv8ATtkQiU/U9L4-GCBtkI/AAAAAAAAAfA/lIOvKyMvTVg/s1600/BI-wikipedia.png" height="123" width="400" /></a></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<span style="font-family: Helvetica; mso-bidi-font-family: Helvetica; mso-no-proof: yes;"><!--[if gte vml 1]><v:shapetype
id="_x0000_t75" coordsize="21600,21600" o:spt="75" o:preferrelative="t"
path="m@4@5l@4@11@9@11@9@5xe" filled="f" stroked="f">
<v:stroke joinstyle="miter"/>
<v:formulas>
<v:f eqn="if lineDrawn pixelLineWidth 0"/>
<v:f eqn="sum @0 1 0"/>
<v:f eqn="sum 0 0 @1"/>
<v:f eqn="prod @2 1 2"/>
<v:f eqn="prod @3 21600 pixelWidth"/>
<v:f eqn="prod @3 21600 pixelHeight"/>
<v:f eqn="sum @0 0 1"/>
<v:f eqn="prod @6 1 2"/>
<v:f eqn="prod @7 21600 pixelWidth"/>
<v:f eqn="sum @8 21600 0"/>
<v:f eqn="prod @7 21600 pixelHeight"/>
<v:f eqn="sum @10 21600 0"/>
</v:formulas>
<v:path o:extrusionok="f" gradientshapeok="t" o:connecttype="rect"/>
<o:lock v:ext="edit" aspectratio="t"/>
</v:shapetype><v:shape id="Picture_x0020_16" o:spid="_x0000_i1026" type="#_x0000_t75"
style='width:6in;height:134pt;visibility:visible;mso-wrap-style:square'>
<v:imagedata src="file://localhost/Users/jfordham/Library/Caches/TemporaryItems/msoclip/0/clip_image001.png"
o:title=""/>
</v:shape><![endif]--><!--[if !vml]--><!--[endif]--></span><span style="font-family: Helvetica; mso-bidi-font-family: Helvetica;"><o:p></o:p></span></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<br /></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<span style="font-family: Helvetica; mso-bidi-font-family: Helvetica;">Interestingly, DM, PA, and even Prescriptive
Analytics are considered a part of BI. I must admit, at all the DM and PA
conferences I’ve attended, I’ve never heard attendees describe themselves as BI
practitioners. I have heard more cross-branding of BI and PA at other
conferences that include BI-specific material, like Performance Management and
Web Analytics conferences. <o:p></o:p></span></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<br /></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<span style="font-family: Helvetica;">Contrast this with the PA Wikipedia page. This
taxonomy of fields related to PA is typical. I would personally include dashed
lines to Text Mining and maybe even Link Analysis or Social Networks as they are related
though not directly under PA. Interestingly, statistics falls under PA here,
I’m sure to the chagrin of statisticians! And, I would guess that at a
statistics conference, the attendees would not refer to themselves as
predictive modelers. But maybe they would consider themselves data scientists! Alas, that’s another topic
altogether. But that is the way these kinds of lists go; they are difficult to perfect and usually generate discussion over where the dividing lines occur. <o:p></o:p></span></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<span style="font-family: Helvetica; mso-bidi-font-family: Helvetica; mso-no-proof: yes;"><!--[if gte vml 1]><v:shape
id="Picture_x0020_17" o:spid="_x0000_i1025" type="#_x0000_t75" style='width:351pt;
height:148pt;visibility:visible;mso-wrap-style:square'>
<v:imagedata src="file://localhost/Users/jfordham/Library/Caches/TemporaryItems/msoclip/0/clip_image003.png"
o:title=""/>
</v:shape><![endif]--><!--[if !vml]--><!--[endif]--></span><span style="font-family: Helvetica; mso-bidi-font-family: Helvetica;"><o:p></o:p></span></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="http://4.bp.blogspot.com/-xdtpMcQ89N0/U9L5DEQ8H5I/AAAAAAAAAfU/Y_pyLOwMKUo/s1600/PA-wikipedia.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" src="http://4.bp.blogspot.com/-xdtpMcQ89N0/U9L5DEQ8H5I/AAAAAAAAAfU/Y_pyLOwMKUo/s1600/PA-wikipedia.png" height="168" width="400" /></a></div>
<br />
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<span style="font-family: Helvetica; mso-bidi-font-family: Helvetica;">This tendency to include fields are part of “our
own” is a trap most of us fall into: we tend to be myopic in our views of the
fields of study. It frankly reminds me of a map I remember hanging in my house
growing up in Natick, MA: <a href="http://www.universalhub.com/node/11572" target="_blank">“A Bostonian’s Idea of The United States of America”</a>. Clearly, Cape Cod is far more important than Florida or even California!<o:p></o:p></span></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<br /></div>
<!--[if !mso]>
<style>
v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style>
<![endif]--><!--[if gte mso 9]><xml>
<o:DocumentProperties>
<o:Revision>0</o:Revision>
<o:TotalTime>0</o:TotalTime>
<o:Pages>1</o:Pages>
<o:Words>700</o:Words>
<o:Characters>3996</o:Characters>
<o:Company>Smarter Remarketer</o:Company>
<o:Lines>33</o:Lines>
<o:Paragraphs>9</o:Paragraphs>
<o:CharactersWithSpaces>4687</o:CharactersWithSpaces>
<o:Version>14.0</o:Version>
</o:DocumentProperties>
<o:OfficeDocumentSettings>
<o:AllowPNG/>
</o:OfficeDocumentSettings>
</xml><![endif]-->
<!--[if gte mso 9]><xml>
<w:WordDocument>
<w:View>Normal</w:View>
<w:Zoom>0</w:Zoom>
<w:TrackMoves>false</w:TrackMoves>
<w:TrackFormatting/>
<w:PunctuationKerning/>
<w:ValidateAgainstSchemas/>
<w:SaveIfXMLInvalid>false</w:SaveIfXMLInvalid>
<w:IgnoreMixedContent>false</w:IgnoreMixedContent>
<w:AlwaysShowPlaceholderText>false</w:AlwaysShowPlaceholderText>
<w:DoNotPromoteQF/>
<w:LidThemeOther>EN-US</w:LidThemeOther>
<w:LidThemeAsian>JA</w:LidThemeAsian>
<w:LidThemeComplexScript>X-NONE</w:LidThemeComplexScript>
<w:Compatibility>
<w:BreakWrappedTables/>
<w:SnapToGridInCell/>
<w:WrapTextWithPunct/>
<w:UseAsianBreakRules/>
<w:DontGrowAutofit/>
<w:SplitPgBreakAndParaMark/>
<w:EnableOpenTypeKerning/>
<w:DontFlipMirrorIndents/>
<w:OverrideTableStyleHps/>
<w:UseFELayout/>
</w:Compatibility>
<m:mathPr>
<m:mathFont m:val="Cambria Math"/>
<m:brkBin m:val="before"/>
<m:brkBinSub m:val="--"/>
<m:smallFrac m:val="off"/>
<m:dispDef/>
<m:lMargin m:val="0"/>
<m:rMargin m:val="0"/>
<m:defJc m:val="centerGroup"/>
<m:wrapIndent m:val="1440"/>
<m:intLim m:val="subSup"/>
<m:naryLim m:val="undOvr"/>
</m:mathPr></w:WordDocument>
</xml><![endif]--><!--[if gte mso 9]><xml>
<w:LatentStyles DefLockedState="false" DefUnhideWhenUsed="true"
DefSemiHidden="true" DefQFormat="false" DefPriority="99"
LatentStyleCount="276">
<w:LsdException Locked="false" Priority="0" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Normal"/>
<w:LsdException Locked="false" Priority="9" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="heading 1"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 2"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 3"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 4"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 5"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 6"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 7"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 8"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 9"/>
<w:LsdException Locked="false" Priority="39" Name="toc 1"/>
<w:LsdException Locked="false" Priority="39" Name="toc 2"/>
<w:LsdException Locked="false" Priority="39" Name="toc 3"/>
<w:LsdException Locked="false" Priority="39" Name="toc 4"/>
<w:LsdException Locked="false" Priority="39" Name="toc 5"/>
<w:LsdException Locked="false" Priority="39" Name="toc 6"/>
<w:LsdException Locked="false" Priority="39" Name="toc 7"/>
<w:LsdException Locked="false" Priority="39" Name="toc 8"/>
<w:LsdException Locked="false" Priority="39" Name="toc 9"/>
<w:LsdException Locked="false" Priority="35" QFormat="true" Name="caption"/>
<w:LsdException Locked="false" Priority="10" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Title"/>
<w:LsdException Locked="false" Priority="1" Name="Default Paragraph Font"/>
<w:LsdException Locked="false" Priority="11" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtitle"/>
<w:LsdException Locked="false" Priority="22" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Strong"/>
<w:LsdException Locked="false" Priority="20" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Emphasis"/>
<w:LsdException Locked="false" Priority="59" SemiHidden="false"
UnhideWhenUsed="false" Name="Table Grid"/>
<w:LsdException Locked="false" UnhideWhenUsed="false" Name="Placeholder Text"/>
<w:LsdException Locked="false" Priority="1" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="No Spacing"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 1"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 1"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 1"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 1"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 1"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 1"/>
<w:LsdException Locked="false" UnhideWhenUsed="false" Name="Revision"/>
<w:LsdException Locked="false" Priority="34" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="List Paragraph"/>
<w:LsdException Locked="false" Priority="29" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Quote"/>
<w:LsdException Locked="false" Priority="30" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Quote"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 1"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 1"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 1"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 1"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 1"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 1"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 1"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 1"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 2"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 2"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 2"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 2"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 2"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 2"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 2"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 2"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 2"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 2"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 2"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 2"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 2"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 2"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 3"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 3"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 3"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 3"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 3"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 3"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 3"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 3"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 3"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 3"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 3"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 3"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 3"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 3"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 4"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 4"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 4"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 4"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 4"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 4"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 4"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 4"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 4"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 4"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 4"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 4"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 4"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 4"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 5"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 5"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 5"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 5"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 5"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 5"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 5"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 5"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 5"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 5"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 5"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 5"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 5"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 5"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 6"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 6"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 6"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 6"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 6"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 6"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 6"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 6"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 6"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 6"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 6"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 6"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 6"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 6"/>
<w:LsdException Locked="false" Priority="19" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtle Emphasis"/>
<w:LsdException Locked="false" Priority="21" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Emphasis"/>
<w:LsdException Locked="false" Priority="31" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtle Reference"/>
<w:LsdException Locked="false" Priority="32" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Reference"/>
<w:LsdException Locked="false" Priority="33" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Book Title"/>
<w:LsdException Locked="false" Priority="37" Name="Bibliography"/>
<w:LsdException Locked="false" Priority="39" QFormat="true" Name="TOC Heading"/>
</w:LatentStyles>
</xml><![endif]-->
<!--[if gte mso 10]>
<style>
/* Style Definitions */
table.MsoNormalTable
{mso-style-name:"Table Normal";
mso-tstyle-rowband-size:0;
mso-tstyle-colband-size:0;
mso-style-noshow:yes;
mso-style-priority:99;
mso-style-parent:"";
mso-padding-alt:0in 5.4pt 0in 5.4pt;
mso-para-margin:0in;
mso-para-margin-bottom:.0001pt;
mso-pagination:widow-orphan;
font-size:12.0pt;
font-family:Cambria;
mso-ascii-font-family:Cambria;
mso-ascii-theme-font:minor-latin;
mso-hansi-font-family:Cambria;
mso-hansi-theme-font:minor-latin;}
</style>
<![endif]-->
<!--StartFragment-->
<!--EndFragment--><br />
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<span style="font-family: Helvetica; mso-bidi-font-family: Helvetica;">Be that as it may, my final point is that BI and PA
are important but complementary disciplines. BI is a much larger field and
understandably so. PA is more of a specialty, but a specialty that is gaining
visibility and recognition as an important skill set to have in any
organization. Here’s to further collaboration in the future!<o:p></o:p></span></div>
Dean Abbotthttp://www.blogger.com/profile/16818000233889520746noreply@blogger.com4tag:blogger.com,1999:blog-5652924.post-71903637420262400432014-05-26T09:07:00.002-07:002014-05-26T09:07:41.911-07:00Why Overfitting is More Dangerous than Just Poor Accuracy, Part II<div style="border: 0px; color: #444444; font-family: 'Source Sans Pro', HelveticaNeue-Light, Arial, 'Helvetica Neue', sans-serif; font-size: 16px; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
<a href="http://abbottanalytics.blogspot.com/2014/05/why-overfitting-is-more-dangerous-than.html" target="_blank">In Part I, </a>I explained one problem with overfitting the data: estimates of the target variable in regions without any training data can be unstable, whether those regions require the model to interpolate or extrapolate. Accuracy is a problem, but more precisely, the problems in interpolation and extrapolation are not revealed using any accuracy metrics and only arise when new data points are encountered after the model is deployed.</div>
<div style="border: 0px; color: #444444; font-family: 'Source Sans Pro', HelveticaNeue-Light, Arial, 'Helvetica Neue', sans-serif; font-size: 16px; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
This month, a second problem with overfitting is described: unreliable model interpretation. Predictive modeling algorithms find variables that associate or correlate with the target variable. When models are overfit, the algorithm has latched onto variables that it finds to be strongly associated with the target variable, but these relationships are not repeatable. The problem is that these variables that appear to be strongly associated with the target are not necessarily related at all to the target. When we interpret what the model is telling us, we therefore glean the wrong insights, and these insights can be difficult to shed once we rebuild models to simplify them and avoid overfitting.</div>
<div style="border: 0px; color: #444444; font-family: 'Source Sans Pro', HelveticaNeue-Light, Arial, 'Helvetica Neue', sans-serif; font-size: 16px; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
Consider an example from the 1998 KDD Cup data. One variable, RFA_3, has 70 levels (71 if we include the missing values), a case of a high-cardinality input variable. A decision tree may try to group all levels with the highest association with the target variable, TARGET_B, a categorical variable with labels 0 for non-responders and 1 for responders to a mailing campaign.</div>
<div style="border: 0px; color: #444444; font-family: 'Source Sans Pro', HelveticaNeue-Light, Arial, 'Helvetica Neue', sans-serif; font-size: 16px; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
RFA_3 turns out to be one of the top predictors when building decision trees. The decision tree may try to group all levels with high average rates of TARGET_B equal to 1. The table below shows the 10 highest rates along with the counts for how many records match each value of RFA_3. The question is this: when a value like L4G matches only 10 records, one of which is a responder (10% response rate), do we believe it? How sure are we that the measured 10% rate in our sample is reproducible for the next 10 values of L4G?</div>
<div style="border: 0px; color: #444444; font-family: 'Source Sans Pro', HelveticaNeue-Light, Arial, 'Helvetica Neue', sans-serif; font-size: 16px; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
We can gain some insight by applying a simple statistical test, like a binomial distribution test you can find online. The upper and lower bounds of the measured rate given the sample size is shown in the table as well. For L4G, we are 95% sure from the statistical test that L4G will have a rate between 0% and 28.6%. This means the 10% rate we measured in the small sample size could really in the long run be 1%. Or, it could be 20%. We just don’t know.</div>
<div style="border: 0px; color: #444444; font-family: 'Source Sans Pro', HelveticaNeue-Light, Arial, 'Helvetica Neue', sans-serif; font-size: 16px; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
</div>
<div class="clear" style="-webkit-text-stroke-width: 0px; background-color: transparent; background-position: initial initial; background-repeat: initial initial; border: 0px; clear: both; color: #444444; display: block; font-family: 'Source Sans Pro', HelveticaNeue-Light, Arial, 'Helvetica Neue', sans-serif; font-size: 16px; font-style: normal; font-variant: normal; font-weight: normal; height: 0px; letter-spacing: normal; line-height: 21px; margin: 0px; orphans: auto; outline: 0px; overflow: hidden; padding: 0px; text-align: start; text-indent: 0px; text-transform: none; vertical-align: baseline; visibility: hidden; white-space: normal; widows: auto; width: 0px; word-spacing: 0px;">
</div>
<div class="clear" style="-webkit-text-stroke-width: 0px; background-color: transparent; background-position: initial initial; background-repeat: initial initial; border: 0px; clear: both; color: #444444; display: block; font-family: 'Source Sans Pro', HelveticaNeue-Light, Arial, 'Helvetica Neue', sans-serif; font-size: 16px; font-style: normal; font-variant: normal; font-weight: normal; height: 0px; letter-spacing: normal; line-height: 21px; margin: 0px; orphans: auto; outline: 0px; overflow: hidden; padding: 0px; text-align: start; text-indent: 0px; text-transform: none; vertical-align: baseline; visibility: hidden; white-space: normal; widows: auto; width: 0px; word-spacing: 0px;">
</div>
<div class="bottom-single-related" id="related_posts" style="-webkit-text-stroke-width: 0px; background-color: transparent; background-position: initial initial; background-repeat: initial initial; border: 0px; color: #444444; font-family: 'Source Sans Pro', HelveticaNeue-Light, Arial, 'Helvetica Neue', sans-serif; font-size: 16px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: 21px; margin: 0px; orphans: auto; outline: 0px; padding: 0px; text-align: start; text-indent: 0px; text-transform: none; vertical-align: baseline; white-space: normal; widows: auto; word-spacing: 0px;">
</div>
<table border="0" cellpadding="0" cellspacing="0" style="border-collapse: collapse; border-spacing: 0px; border: 0px; color: #444444; font-family: 'Source Sans Pro', HelveticaNeue-Light, Arial, 'Helvetica Neue', sans-serif; font-size: 16px; line-height: 21px; margin: 0px; padding: 0px; vertical-align: baseline; width: 400px;"><tbody style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;">
<tr bgcolor="#4f81bd" style="border: 0px; color: white; font-family: inherit; font-size: 12px; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;"><td style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" width="65"><div style="background-color: transparent; border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; text-align: center; vertical-align: baseline;">
<strong style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;">RFA_3</strong></div>
</td><td style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" width="65"><div style="background-color: transparent; border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; text-align: center; vertical-align: baseline;">
<strong style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;"> Count</strong></div>
</td><td style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" width="78"><div style="background-color: transparent; border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; text-align: center; vertical-align: baseline;">
<strong style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;"> TARGET_B = 1 Percent</strong></div>
</td><td style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" width="81"><div style="background-color: transparent; border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; text-align: center; vertical-align: baseline;">
<strong style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;">Confidence Interval Lower Bound</strong></div>
</td><td style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" width="82"><div style="background-color: transparent; border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; text-align: center; vertical-align: baseline;">
<strong style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;">Confidence Interval Upper Bound</strong></div>
</td><td style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" width="84"><div style="background-color: transparent; border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; text-align: center; vertical-align: baseline;">
<strong style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;"> 95% Confidence above average</strong></div>
</td></tr>
<tr style="border: 0px; font-family: inherit; font-size: 13px; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;"><td style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" width="65"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
A2C</div>
</td><td style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" width="65"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
1</div>
</td><td style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" width="78"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
100.0%</div>
</td><td nowrap="nowrap" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" valign="bottom" width="81"></td><td nowrap="nowrap" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" valign="bottom" width="82"></td><td nowrap="nowrap" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" valign="bottom" width="84"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
No</div>
</td></tr>
<tr style="border: 0px; font-family: inherit; font-size: 13px; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;"><td style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" width="65"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
S4B</div>
</td><td style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" width="65"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
2</div>
</td><td style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" width="78"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
50.0%</div>
</td><td nowrap="nowrap" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" valign="bottom" width="81"></td><td nowrap="nowrap" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" valign="bottom" width="82"></td><td nowrap="nowrap" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" valign="bottom" width="84"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
No</div>
</td></tr>
<tr style="border: 0px; font-family: inherit; font-size: 13px; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;"><td style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" width="65"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
S4C</div>
</td><td style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" width="65"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
9</div>
</td><td style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" width="78"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
11.1%</div>
</td><td nowrap="nowrap" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" valign="bottom" width="81"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
0.0%</div>
</td><td nowrap="nowrap" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" valign="bottom" width="82"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
31.6%</div>
</td><td nowrap="nowrap" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" valign="bottom" width="84"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
No</div>
</td></tr>
<tr style="border: 0px; font-family: inherit; font-size: 13px; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;"><td style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" width="65"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
L4G</div>
</td><td style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" width="65"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
10</div>
</td><td style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" width="78"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
10.0%</div>
</td><td nowrap="nowrap" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" valign="bottom" width="81"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
0.0%</div>
</td><td nowrap="nowrap" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" valign="bottom" width="82"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
28.6%</div>
</td><td nowrap="nowrap" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" valign="bottom" width="84"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
No</div>
</td></tr>
<tr style="border: 0px; font-family: inherit; font-size: 13px; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;"><td style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" width="65"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
N1E</div>
</td><td style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" width="65"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
46</div>
</td><td style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" width="78"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
8.7%</div>
</td><td nowrap="nowrap" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" valign="bottom" width="81"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
0.6%</div>
</td><td nowrap="nowrap" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" valign="bottom" width="82"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
16.8%</div>
</td><td nowrap="nowrap" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" valign="bottom" width="84"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
No</div>
</td></tr>
<tr style="border: 0px; font-family: inherit; font-size: 13px; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;"><td style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" width="65"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
S2D</div>
</td><td style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" width="65"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
200</div>
</td><td style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" width="78"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
11.0%</div>
</td><td nowrap="nowrap" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" valign="bottom" width="81"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
6.7%</div>
</td><td nowrap="nowrap" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" valign="bottom" width="82"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
15.3%</div>
</td><td nowrap="nowrap" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" valign="bottom" width="84"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
Yes</div>
</td></tr>
<tr style="border: 0px; font-family: inherit; font-size: 13px; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;"><td style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" width="65"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
A4D</div>
</td><td style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" width="65"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
1,867</div>
</td><td style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" width="78"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
9.1%</div>
</td><td nowrap="nowrap" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" valign="bottom" width="81"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
7.8%</div>
</td><td nowrap="nowrap" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" valign="bottom" width="82"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
10.4%</div>
</td><td nowrap="nowrap" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" valign="bottom" width="84"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
Yes</div>
</td></tr>
<tr style="border: 0px; font-family: inherit; font-size: 13px; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;"><td style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" width="65"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
S3D</div>
</td><td style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" width="65"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
1,989</div>
</td><td style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" width="78"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
9.4%</div>
</td><td nowrap="nowrap" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" valign="bottom" width="81"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
8.1%</div>
</td><td nowrap="nowrap" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" valign="bottom" width="82"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
10.7%</div>
</td><td nowrap="nowrap" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" valign="bottom" width="84"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
Yes</div>
</td></tr>
<tr style="border: 0px; font-family: inherit; font-size: 13px; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;"><td style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" width="65"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
S3E</div>
</td><td style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" width="65"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
2,262</div>
</td><td style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" width="78"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
8.7%</div>
</td><td nowrap="nowrap" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" valign="bottom" width="81"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
7.5%</div>
</td><td nowrap="nowrap" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" valign="bottom" width="82"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
9.9%</div>
</td><td nowrap="nowrap" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" valign="bottom" width="84"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
Yes</div>
</td></tr>
<tr style="border: 0px; font-family: inherit; font-size: 13px; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;"><td style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" width="65"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
S4D</div>
</td><td style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" width="65"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
2,675</div>
</td><td style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" width="78"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
9.6%</div>
</td><td nowrap="nowrap" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" valign="bottom" width="81"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
8.5%</div>
</td><td nowrap="nowrap" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" valign="bottom" width="82"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
10.7%</div>
</td><td nowrap="nowrap" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline;" valign="bottom" width="84"><div align="center" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
Yes</div>
</td></tr>
</tbody></table>
<br />
<div style="border: 0px; color: #444444; font-family: 'Source Sans Pro', HelveticaNeue-Light, Arial, 'Helvetica Neue', sans-serif; font-size: 16px; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
For the 1998 KDD Cup data, it turns out that RFA_3 isn’t one of the better predictors of TARGET_B; it only showed up as a significant predictor when overfitting reared it’s ugly head.</div>
<div style="border: 0px; color: #444444; font-family: 'Source Sans Pro', HelveticaNeue-Light, Arial, 'Helvetica Neue', sans-serif; font-size: 16px; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
The solution? Beware of overfitting. For high-cardinality variables, apply a complexity penalty to reduce the likelihood of finding these low-count associations. For continuous variables, the problem exists as well and can be just as deceptive. For all problems you are solving, resample the data to assess models on held-out data (testing data), cross-validation, or bootstrap sampling.</div>
<div>
<br /></div>
note: this post first appeared in <a href="http://www.predictiveanalyticsworld.com/patimes/overfitting-dangerous-just-poor-accuracy-part-ii/" target="_blank">Predictive Analytics Times</a> (with minor edits added here)<br />
<br />
<div class="post-3399 post type-post status-publish format-standard hentry category-leading-stories" id="post-3399" style="-webkit-text-stroke-width: 0px; background-color: transparent; background-position: initial initial; background-repeat: initial initial; border: 0px; color: #444444; font-family: 'Source Sans Pro', HelveticaNeue-Light, Arial, 'Helvetica Neue', sans-serif; font-size: 16px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: 21px; margin: 0px; orphans: auto; outline: 0px; padding: 0px; text-align: start; text-indent: 0px; text-transform: none; vertical-align: baseline; white-space: normal; widows: auto; word-spacing: 0px;">
<div class="right-content" style="background-color: transparent; background-position: initial initial; background-repeat: initial initial; border: 0px; float: right; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; font-weight: inherit; line-height: inherit; margin: 0px; outline: 0px; padding: 0px; vertical-align: baseline; width: 425px;">
<div class="content-wrap" style="background-color: transparent; background-position: initial initial; background-repeat: initial initial; border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; font-weight: inherit; line-height: inherit; margin: 0px; outline: 0px; padding: 0px; vertical-align: baseline;">
<div addthis:title="Why Overfitting is More Dangerous than Just Poor Accuracy, Part II " addthis:url="http://www.predictiveanalyticsworld.com/patimes/overfitting-dangerous-just-poor-accuracy-part-ii/" class="addthis_toolbox addthis_default_style " style="background-color: transparent; background-position: initial initial; background-repeat: initial initial; border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; font-weight: inherit; line-height: inherit; margin: 0px; outline: 0px; padding: 0px; vertical-align: baseline;">
<div class="fb-like fb_iframe_widget" data-action="like" data-font="arial" data-href="http://www.predictiveanalyticsworld.com/patimes/overfitting-dangerous-just-poor-accuracy-part-ii/" data-layout="button_count" data-ref="" data-send="false" data-show_faces="false" data-width="90" fb-iframe-plugin-query="action=like&app_id=172525162793917&font=arial&href=http%3A%2F%2Fwww.predictiveanalyticsworld.com%2Fpatimes%2Foverfitting-dangerous-just-poor-accuracy-part-ii%2F&layout=button_count&locale=en_US&sdk=joey&send=false&show_faces=false&width=90" fb-xfbml-state="rendered" style="background-color: transparent; background-position: initial initial; background-repeat: initial initial; border: 0px; display: inline-block; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; font-weight: inherit; line-height: 16px; margin: 0px; outline: 0px; padding: 0px; position: relative; vertical-align: baseline;">
<a class="addthis_button_facebook_like at300b" fb:like:layout="button_count" href="https://www.blogger.com/blogger.g?blogID=5652924" style="border: 0px; color: #3b44a1; cursor: pointer; float: left; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; font-weight: inherit; line-height: inherit; margin: 0px; outline: 0px; padding: 0px 2px; text-decoration: none; vertical-align: baseline;"><span style="background-color: transparent; background-position: initial initial; background-repeat: initial initial; border: 0px; display: inline-block; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; font-weight: inherit; height: 20px; line-height: inherit; margin: 0px; outline: 0px; padding: 0px; position: relative; text-align: justify; vertical-align: bottom; width: 77px;"><br /><iframe allowtransparency="true" class="" frameborder="0" height="1000px" name="f93b4176" scrolling="no" src="http://www.facebook.com/plugins/like.php?action=like&app_id=172525162793917&channel=http%3A%2F%2Fstatic.ak.facebook.com%2Fconnect%2Fxd_arbiter%2FdgdTycPTSRj.js%3Fversion%3D41%23cb%3Df2ed25daa%26domain%3Dwww.predictiveanalyticsworld.com%26origin%3Dhttp%253A%252F%252Fwww.predictiveanalyticsworld.com%252Ff109e93e9c%26relation%3Dparent.parent&font=arial&href=http%3A%2F%2Fwww.predictiveanalyticsworld.com%2Fpatimes%2Foverfitting-dangerous-just-poor-accuracy-part-ii%2F&layout=button_count&locale=en_US&sdk=joey&send=false&show_faces=false&width=90" style="border: none; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; font-weight: inherit; height: 20px; line-height: inherit; margin: 0px; padding: 0px; position: absolute; text-align: center; vertical-align: baseline; visibility: visible; width: 77px;" title="fb:like Facebook Social Plugin" width="90px"></iframe></span></a></div>
<a class="addthis_button_facebook_like at300b" fb:like:layout="button_count" href="https://www.blogger.com/blogger.g?blogID=5652924" style="border: 0px; color: #3b44a1; cursor: pointer; float: left; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; font-weight: inherit; line-height: inherit; margin: 0px; outline: 0px; padding: 0px 2px; text-decoration: none; vertical-align: baseline;">
</a><a class="addthis_button_tweet at300b" href="https://www.blogger.com/blogger.g?blogID=5652924" style="border: 0px; color: #3b44a1; cursor: pointer; float: left; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; font-weight: inherit; line-height: inherit; margin: 0px; min-width: 83px; outline: 0px; padding: 0px 2px; text-align: center; text-decoration: none; vertical-align: baseline;"><iframe allowtransparency="true" class="twitter-share-button twitter-tweet-button twitter-share-button twitter-count-horizontal" data-twttr-rendered="true" frameborder="0" id="twitter-widget-0" scrolling="no" src="http://platform.twitter.com/widgets/tweet_button.1400006231.html#_=1401119612648&count=horizontal&counturl=http%3A%2F%2Fwww.predictiveanalyticsworld.com%2Fpatimes%2Foverfitting-dangerous-just-poor-accuracy-part-ii%2F&id=twitter-widget-0&lang=en&original_referer=http%3A%2F%2Fwww.predictiveanalyticsworld.com%2Fpatimes%2Foverfitting-dangerous-just-poor-accuracy-part-ii%2F&size=m&text=Why%20Overfitting%20is%20More%20Dangerous%20than%20Just%20Poor%20Accuracy%2C%20Part%20II%20&url=http%3A%2F%2Fwww.predictiveanalyticsworld.com%2Fpatimes%2Foverfitting-dangerous-just-poor-accuracy-part-ii%2F" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; font-weight: inherit; height: 20px; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline; width: 80px !important;" title="Twitter Tweet Button"></iframe></a><a class="addthis_button_linkedin_counter at300b" href="https://www.blogger.com/blogger.g?blogID=5652924" style="border: 0px; color: #3b44a1; cursor: pointer; float: left; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; font-weight: inherit; line-height: inherit; margin: 0px; outline: 0px; padding: 0px 2px; text-align: center; text-decoration: none; vertical-align: baseline;"><iframe allowtransparency="true" frameborder="0" role="presentation" scrollbars="no" scrolling="no" src="http://ct1.addthis.com/static/r07/linkedin023.html#href=http%3A%2F%2Fwww.predictiveanalyticsworld.com%2Fpatimes%2Foverfitting-dangerous-just-poor-accuracy-part-ii%2F&dr=http%3A%2F%2Fwww.predictiveanalyticsworld.com%2Fpatimes%2Flog-in%2F&conf=url%3Dhttp%253A%252F%252Fwww.predictiveanalyticsworld.com%252Fpatimes%252Foverfitting-dangerous-just-poor-accuracy-part-ii%252F%26title%3DWhy%2520Overfitting%2520is%2520More%2520Dangerous%2520than%2520Just%2520Poor%2520Accuracy%252C%2520Part%2520II%26product%3Dtbx-300%26data_track_clickback%3Dfalse%26data_track_addressbar%3Dfalse%26data_track_textcopy%3Dfalse%26ui_atversion%3D300%26username%3Dra-51d42c415148d3ab%26pubid%3Dra-51d42c415148d3ab&share=url%3Dhttp%253A%252F%252Fwww.predictiveanalyticsworld.com%252Fpatimes%252Foverfitting-dangerous-just-poor-accuracy-part-ii%252F%26title%3DWhy%2520Overfitting%2520is%2520More%2520Dangerous%2520than%2520Just%2520Poor%2520Accuracy%252C%2520Part%2520II%26imp_url%3D0%26passthrough%3Dpinterest_share%253Dmedia%25253Dhttp%2525253A%2525252F%2525252Fwww.predictiveanalyticsworld.com%2525252Fog%2525252Fpat_logo1.png%252526url%25253Dhttp%2525253A%2525252F%2525252Fwww.predictiveanalyticsworld.com%2525252Fpatimes%2525252Foverfitting-dangerous-just-poor-accuracy-part-ii%2525252F%252526description%25253DWhy%25252520Overfitting%25252520is%25252520More%25252520Dangerous%25252520than%25252520Just%25252520Poor%25252520Accuracy%2525252C%25252520Part%25252520II%25252520-%25252520Predictive%25252520Analytics%25252520Times%2526linkedin%253Dcounter%25253Dhorizontal%26smd%3Drsi%253D%2526rxi%253Dundefined%2526gen%253D0%2526rsc%253D%2526dr%253Dhttp%25253A%25252F%25252Fwww.predictiveanalyticsworld.com%25252Fpatimes%25252Flog-in%25252F%2526sta%253DAT-ra-51d42c415148d3ab%25252F-%25252F-%25252F5383637810e615cc%25252F1&li=counter%3Dhorizontal" style="border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; font-weight: inherit; height: 18px; line-height: inherit; margin: 0px; padding: 0px; vertical-align: baseline; width: 100px;"></iframe></a><a class="addthis_button_google_plusone at300b" g:plusone:size="medium" href="https://www.blogger.com/blogger.g?blogID=5652924" style="border: 0px; color: #3b44a1; cursor: pointer; float: left; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; font-weight: inherit; line-height: inherit; margin: 0px; outline: 0px; padding: 0px 2px; text-align: center; text-decoration: none; vertical-align: baseline;"></a>
<div id="___plusone_0" style="background-color: transparent; background-position: initial initial; background-repeat: initial initial; border: 0px none; display: inline-block; float: none; font-family: inherit; font-size: 1px; font-style: inherit; font-variant: inherit; font-weight: inherit; height: 20px; line-height: normal; margin: 0px; outline: 0px; padding: 0px; text-indent: 0px; vertical-align: baseline; width: 90px;">
<iframe data-gapiattached="true" frameborder="0" hspace="0" id="I0_1401119612654" marginheight="0" marginwidth="0" name="I0_1401119612654" scrolling="no" src="https://apis.google.com/u/0/_/+1/fastbutton?usegapi=1&size=medium&hl=en-US&origin=http%3A%2F%2Fwww.predictiveanalyticsworld.com&url=http%3A%2F%2Fwww.predictiveanalyticsworld.com%2Fpatimes%2Foverfitting-dangerous-just-poor-accuracy-part-ii%2F&gsrc=3p&ic=1&jsh=m%3B%2F_%2Fscs%2Fapps-static%2F_%2Fjs%2Fk%3Doz.gapi.en.FtehYG75y6Y.O%2Fm%3D__features__%2Fam%3DAQ%2Frt%3Dj%2Fd%3D1%2Fz%3Dzcms%2Frs%3DAItRSTPFFgeA7RdSw91rKA5xagxeFsQVmA#_methods=onPlusOne%2C_ready%2C_close%2C_open%2C_resizeMe%2C_renderstart%2Concircled%2Cdrefresh%2Cerefresh&id=I0_1401119612654&parent=http%3A%2F%2Fwww.predictiveanalyticsworld.com&pfname=&rpctoken=42787212" style="border: 0px none; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; font-weight: inherit; height: 20px; left: 0px; line-height: inherit; margin: 0px; padding: 0px; position: static; top: 0px; vertical-align: baseline; visibility: visible; width: 90px;" tabindex="0" title="+1" vspace="0" width="100%"></iframe></div>
<a class="addthis_button_google_plusone at300b" g:plusone:size="medium" href="https://www.blogger.com/blogger.g?blogID=5652924" style="border: 0px; color: #3b44a1; cursor: pointer; float: left; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; font-weight: inherit; line-height: inherit; margin: 0px; outline: 0px; padding: 0px 2px; text-align: center; text-decoration: none; vertical-align: baseline;">
</a><a class="addthis_counter addthis_pill_style addthis_nonzero" href="http://www.predictiveanalyticsworld.com/patimes/overfitting-dangerous-just-poor-accuracy-part-ii/#" style="border: 0px; color: #3b44a1; cursor: pointer; display: inline-block; float: left; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; font-weight: bold; height: 20px; line-height: inherit; margin: 0px; outline: 0px; overflow: hidden; padding: 0px; text-align: center; text-decoration: none !important; vertical-align: baseline;"></a><a class="atc_s addthis_button_compact" href="https://www.blogger.com/blogger.g?blogID=5652924" style="-webkit-transition: none; background-image: url(data:image/png; background-position: 0px 0px; background-repeat: no-repeat no-repeat; border: 0px; color: black; cursor: pointer; display: block; float: left; font-family: arial, helvetica, sans-serif !important; font-size: inherit; font-style: inherit; font-variant: inherit; font-weight: inherit; height: 20px; line-height: 20px; margin: 0px; outline: 0px; overflow: hidden; padding: 0px; text-align: center; text-decoration: none !important; transition: none; vertical-align: baseline; width: 50px;"><span style="background-color: transparent; background-position: initial initial; background-repeat: initial initial; border: 0px; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; font-weight: inherit; line-height: inherit; margin: 0px; outline: 0px; padding: 0px; vertical-align: baseline;"></span></a><a class="addthis_button_expanded" href="http://www.predictiveanalyticsworld.com/patimes/overfitting-dangerous-just-poor-accuracy-part-ii/#" style="-webkit-transition: none; background-image: url(data:image/png; background-position: 0px -114px; background-repeat: no-repeat no-repeat; border: 0px; box-sizing: content-box; color: #333333; display: block !important; float: left; font-family: arial, helvetica, sans-serif !important; font-size: 11px; font-style: inherit; font-variant: inherit; font-weight: bold; height: 20px; line-height: 20px; margin: 0px 0px 0px 3px; outline: 0px; padding: 0px 0px 0px 4px; text-align: center; text-decoration: none !important; transition: none; vertical-align: baseline; width: 34px !important;" tabindex="1000" target="_blank" title="View more services">1</a><br />
<div class="atclear" style="background-color: transparent; background-position: initial initial; background-repeat: initial initial; border: 0px; clear: both; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; font-weight: inherit; line-height: inherit; margin: 0px; outline: 0px; padding: 0px; vertical-align: baseline;">
</div>
</div>
</div>
</div>
</div>
<div style="border: 0px; color: #444444; font-family: 'Source Sans Pro', HelveticaNeue-Light, Arial, 'Helvetica Neue', sans-serif; font-size: 16px; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; text-align: center; vertical-align: baseline;">
<br /></div>
Dean Abbotthttp://www.blogger.com/profile/16818000233889520746noreply@blogger.com1tag:blogger.com,1999:blog-5652924.post-7550542402152754292014-05-01T13:20:00.003-07:002014-05-01T13:20:33.597-07:00Why Overfitting is More Dangerous than Just Poor Accuracy, Part I<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<span style="font-family: Helvetica; mso-bidi-font-family: Helvetica; mso-fareast-language: JA;">Arguably, the most
important safeguard in building predictive models is complexity regularization
to avoid overfitting the data. When models are overfit, their accuracy is lower
on new data that wasn’t seen during training, and therefore when these models
are deployed, they will disappoint, sometimes even leading decision makers to
believe that predictive modeling “doesn’t work”. <o:p></o:p></span></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<br /></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<span style="font-family: Helvetica; mso-bidi-font-family: Helvetica; mso-fareast-language: JA;">Overfit, however, is thankfully
a well-known problem and every algorithm has ways to avoid it. CART® and C5
trees use pruning to remove branches that are prone to overfitting, CHAID trees
require splits are statistically significant to add complexity to the trees.
Neural networks use held-out data to stop training when accuracy on held-out
data becomes worse. Stepwise regression uses information theoretic criteria
like the Akaike Information Criterion (AIC), Minimum Description Length (MDL),
or the Bayesian Information Criterion (BIC) to add terms only when the
additional complexity is offset by enough reduction of error. <o:p></o:p></span></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<br /></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<span style="font-family: Helvetica; mso-bidi-font-family: Helvetica; mso-fareast-language: JA;">But overfitting has
more problems than merely misclassification cases in holdout data or incurring
large errors for regression problems. Without loss of generality, this
discussion will only describe overfilling in classification problems, but the
same principles apply in regression problems as well. <o:p></o:p></span></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<br /></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<span style="font-family: Helvetica; mso-bidi-font-family: Helvetica; mso-fareast-language: JA;">One way modelers reduce
the likelihood of overfit is to apply the principle of Occam’s Razor, where if
two models exhibit the same accuracy, we will prefer the simpler model because
it is more likely to generalize well. By simpler, we must keep in mind that we
prefer models that <i>behave</i> more simply rather than models that just
appear to be simpler because they have fewer terms. John Elder (a regular
contributor to the PA Times) has a fantastic discussion of that topic in the
book by Seni and Elder, </span><a href="http://books.google.com/books?id=CdyZpAWt6YcC&pg=PA82&lpg=PA82&dq=John+Elder+generalized+degrees+of+freedom&source=bl&ots=-PoFJy5n3w&sig=hsoM36IZAnM-FITGxldlNjiQz58&hl=en&sa=X&ei=qpDQUvCKO8j1oATxv4LoBQ&ved=0CFMQ6AEwBA#v=onepage&q=John%20Elder%20generalized%20degrees%20of%20freedom&f=false"><span style="font-family: Helvetica; mso-bidi-font-family: Helvetica; mso-fareast-language: JA;">Ensemble Methods in Data Mining</span></a><span style="font-family: Helvetica; mso-bidi-font-family: Helvetica; mso-fareast-language: JA;">.<o:p></o:p></span></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<br /></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<span style="font-family: Helvetica; mso-bidi-font-family: Helvetica; mso-fareast-language: JA;">Consider this example
contrasting linear and nonlinear models.<span style="mso-spacerun: yes;">
</span>The figure below shows decision boundaries for two models separates two
classes of the famous Iris Data (http://archive.ics.uci.edu/ml/datasets/Iris).
On the left is the decision boundary from a linear model built using linear
discriminant analysis (like LDA or the Fisher Discriminant) and on the right, a
decision boundary built by a model using quadratic discriminant analysis (like
the Bayes Rule). The image can be found at </span><a href="http://scikit-learn.org/0.5/auto_examples/plot_lda_vs_qda.html"><span style="color: windowtext; font-family: Helvetica; mso-bidi-font-family: Helvetica; mso-fareast-language: JA; text-decoration: none; text-underline: none;">http://scikit-learn.org/0.5/auto_examples/plot_lda_vs_qda.html</span></a><span style="font-family: Helvetica; mso-bidi-font-family: Helvetica; mso-fareast-language: JA;">. <o:p></o:p></span></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<br /></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<span style="font-family: Helvetica; mso-bidi-font-family: Helvetica; mso-fareast-language: JA;">It appears that the
accuracy of both models is the same (let’s assume that it is), yet the behavior
of the models is very different. If there is new data to be classified that
appears in the upper left of the plot, the LDA model will call the data point
versicolor whereas the QDA model will call it virginica. Which is correct? We
don’t know which would be correct from the training data, but we do know this:
there is no justification in the data to increase the complexity of the model
from linear to quadratic. We probably would prefer the linear model here. <o:p></o:p></span></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="http://4.bp.blogspot.com/-7h6COmx2o4g/U2KsBa-c5WI/AAAAAAAAAcM/FBsvNSFmyvU/s1600/January+2014+-+Why+Overfitting+is+More+Dangerous+than+Just+Poor+Accuracy+Fig1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://4.bp.blogspot.com/-7h6COmx2o4g/U2KsBa-c5WI/AAAAAAAAAcM/FBsvNSFmyvU/s1600/January+2014+-+Why+Overfitting+is+More+Dangerous+than+Just+Poor+Accuracy+Fig1.png" height="240" width="320" /></a></div>
<br />
<div align="center" class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: 28.0pt 56.0pt 84.0pt 112.0pt 140.0pt 168.0pt 196.0pt 224.0pt 3.5in 280.0pt 308.0pt 336.0pt; text-align: center; text-autospace: none;">
<span style="font-family: Helvetica; mso-bidi-font-family: Helvetica; mso-no-proof: yes;"><!--[if gte vml 1]><v:shapetype
id="_x0000_t75" coordsize="21600,21600" o:spt="75" o:preferrelative="t"
path="m@4@5l@4@11@9@11@9@5xe" filled="f" stroked="f">
<v:stroke joinstyle="miter"/>
<v:formulas>
<v:f eqn="if lineDrawn pixelLineWidth 0"/>
<v:f eqn="sum @0 1 0"/>
<v:f eqn="sum 0 0 @1"/>
<v:f eqn="prod @2 1 2"/>
<v:f eqn="prod @3 21600 pixelWidth"/>
<v:f eqn="prod @3 21600 pixelHeight"/>
<v:f eqn="sum @0 0 1"/>
<v:f eqn="prod @6 1 2"/>
<v:f eqn="prod @7 21600 pixelWidth"/>
<v:f eqn="sum @8 21600 0"/>
<v:f eqn="prod @7 21600 pixelHeight"/>
<v:f eqn="sum @10 21600 0"/>
</v:formulas>
<v:path o:extrusionok="f" gradientshapeok="t" o:connecttype="rect"/>
<o:lock v:ext="edit" aspectratio="t"/>
</v:shapetype><v:shape id="Picture_x0020_1" o:spid="_x0000_i1026" type="#_x0000_t75"
style='width:321pt;height:241pt;visibility:visible;mso-wrap-style:square'>
<v:imagedata src="file://localhost/Users/deanabb/Library/Caches/TemporaryItems/msoclip/0/clip_image001.emz"
o:title=""/>
</v:shape><![endif]--><!--[if !vml]--><!--[endif]--></span><span style="font-family: Helvetica; mso-bidi-font-family: Helvetica; mso-fareast-language: JA;"><o:p></o:p></span></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<br /></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<span style="font-family: Helvetica; mso-bidi-font-family: Helvetica; mso-fareast-language: JA;">Apply models to regions
in the data without data is the entire reason for avoiding overfit. The issue
with the figure above was with model behavior when doing <i>extrapolation</i>,
where we want to make sure that the models behave in a reasonable way for
values outside (larger than or smaller than) the data used in training. But
models also need to behave well when they <i>interpolate</i>, meaning we want
models to behave reasonably for data in between data that exists in the
training data. <o:p></o:p></span></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<br /></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<span style="font-family: Helvetica; mso-bidi-font-family: Helvetica; mso-fareast-language: JA;">Consider the second
figure below showing decision boundaries for two models built from a data set
derived from the famous KDD Cup data from 1998. The two dimensions in this plot
are Average Donation Amount (Y) and Recent Donation Amount (X). This data tells
the story that higher values of average and recent donation amounts are related
to higher likelihoods of donors responding; note that for the smallest values
of both average and recent donation amount, at the very bottom left of the
data, the regions are colored cyan. <o:p></o:p></span></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="http://2.bp.blogspot.com/-5VhEWKYLcbQ/U2KsFca4QhI/AAAAAAAAAcU/QeXcPX6YhHw/s1600/January+2014+-+Why+Overfitting+is+More+Dangerous+than+Just+Poor+Accuracy+Fig2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://2.bp.blogspot.com/-5VhEWKYLcbQ/U2KsFca4QhI/AAAAAAAAAcU/QeXcPX6YhHw/s1600/January+2014+-+Why+Overfitting+is+More+Dangerous+than+Just+Poor+Accuracy+Fig2.png" height="153" width="320" /></a></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<br /></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<br /></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<span style="font-family: Helvetica; mso-bidi-font-family: Helvetica; mso-fareast-language: JA;">Both models are built
using the Support Vector Machines (SVM) algorithm, but with different values of
the complexity constant, C. Obviously, the model at the left is more complex
than the model on the right. The magenta regions represent responders and the
cyan regions represent non-responders.<o:p></o:p></span></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<br /></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<span style="font-family: Helvetica; mso-bidi-font-family: Helvetica; mso-fareast-language: JA;">In the effort to be
more accurate on training data, the model on the left creates closed-decision
boundaries around any and all groupings of responders. The model at the right
joins these smaller blobs together into a larger blob where the model classifies
data as responders. The complexity constant for the model at the right gives up
accuracy to gain simplicity. <o:p></o:p></span></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<br /></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<span style="font-family: Helvetica; mso-bidi-font-family: Helvetica; mso-fareast-language: JA;">Which model is more
believable? The one on the left will exhibit strange interpolation properties;
data in between the magenta blobs will be called non-responders, sometimes in
very thin regions between magenta regions; this behavior isn’t smooth or
believable. The model at the right creates a single region of data to be
classified as a responder and is clearly better than the model at the left.<o:p></o:p></span></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<br /></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<span style="font-family: Helvetica; mso-bidi-font-family: Helvetica; mso-fareast-language: JA;">Beware of overfitting
the data and test models not just on testing or validation data, but if
possible, on values not in the data to ensure its behavior, whether
interpolation or extrapolation, is believable.<o:p></o:p></span></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<br /></div>
<!--[if !mso]>
<style>
v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style>
<![endif]--><!--[if gte mso 9]><xml>
<o:DocumentProperties>
<o:Revision>0</o:Revision>
<o:TotalTime>0</o:TotalTime>
<o:Pages>1</o:Pages>
<o:Words>858</o:Words>
<o:Characters>4896</o:Characters>
<o:Company>Abbott Analytics</o:Company>
<o:Lines>40</o:Lines>
<o:Paragraphs>11</o:Paragraphs>
<o:CharactersWithSpaces>5743</o:CharactersWithSpaces>
<o:Version>14.0</o:Version>
</o:DocumentProperties>
<o:OfficeDocumentSettings>
<o:AllowPNG/>
</o:OfficeDocumentSettings>
</xml><![endif]-->
<!--[if gte mso 9]><xml>
<w:WordDocument>
<w:View>Normal</w:View>
<w:Zoom>0</w:Zoom>
<w:TrackMoves>false</w:TrackMoves>
<w:TrackFormatting/>
<w:PunctuationKerning/>
<w:ValidateAgainstSchemas/>
<w:SaveIfXMLInvalid>false</w:SaveIfXMLInvalid>
<w:IgnoreMixedContent>false</w:IgnoreMixedContent>
<w:AlwaysShowPlaceholderText>false</w:AlwaysShowPlaceholderText>
<w:DoNotPromoteQF/>
<w:LidThemeOther>EN-US</w:LidThemeOther>
<w:LidThemeAsian>JA</w:LidThemeAsian>
<w:LidThemeComplexScript>X-NONE</w:LidThemeComplexScript>
<w:Compatibility>
<w:BreakWrappedTables/>
<w:SnapToGridInCell/>
<w:WrapTextWithPunct/>
<w:UseAsianBreakRules/>
<w:DontGrowAutofit/>
<w:SplitPgBreakAndParaMark/>
<w:EnableOpenTypeKerning/>
<w:DontFlipMirrorIndents/>
<w:OverrideTableStyleHps/>
<w:UseFELayout/>
</w:Compatibility>
<m:mathPr>
<m:mathFont m:val="Cambria Math"/>
<m:brkBin m:val="before"/>
<m:brkBinSub m:val="--"/>
<m:smallFrac m:val="off"/>
<m:dispDef/>
<m:lMargin m:val="0"/>
<m:rMargin m:val="0"/>
<m:defJc m:val="centerGroup"/>
<m:wrapIndent m:val="1440"/>
<m:intLim m:val="subSup"/>
<m:naryLim m:val="undOvr"/>
</m:mathPr></w:WordDocument>
</xml><![endif]--><!--[if gte mso 9]><xml>
<w:LatentStyles DefLockedState="false" DefUnhideWhenUsed="true"
DefSemiHidden="true" DefQFormat="false" DefPriority="99"
LatentStyleCount="276">
<w:LsdException Locked="false" Priority="0" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Normal"/>
<w:LsdException Locked="false" Priority="9" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="heading 1"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 2"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 3"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 4"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 5"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 6"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 7"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 8"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 9"/>
<w:LsdException Locked="false" Priority="39" Name="toc 1"/>
<w:LsdException Locked="false" Priority="39" Name="toc 2"/>
<w:LsdException Locked="false" Priority="39" Name="toc 3"/>
<w:LsdException Locked="false" Priority="39" Name="toc 4"/>
<w:LsdException Locked="false" Priority="39" Name="toc 5"/>
<w:LsdException Locked="false" Priority="39" Name="toc 6"/>
<w:LsdException Locked="false" Priority="39" Name="toc 7"/>
<w:LsdException Locked="false" Priority="39" Name="toc 8"/>
<w:LsdException Locked="false" Priority="39" Name="toc 9"/>
<w:LsdException Locked="false" Priority="35" QFormat="true" Name="caption"/>
<w:LsdException Locked="false" Priority="10" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Title"/>
<w:LsdException Locked="false" Priority="1" Name="Default Paragraph Font"/>
<w:LsdException Locked="false" Priority="11" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtitle"/>
<w:LsdException Locked="false" Priority="22" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Strong"/>
<w:LsdException Locked="false" Priority="20" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Emphasis"/>
<w:LsdException Locked="false" Priority="59" SemiHidden="false"
UnhideWhenUsed="false" Name="Table Grid"/>
<w:LsdException Locked="false" UnhideWhenUsed="false" Name="Placeholder Text"/>
<w:LsdException Locked="false" Priority="1" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="No Spacing"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 1"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 1"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 1"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 1"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 1"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 1"/>
<w:LsdException Locked="false" UnhideWhenUsed="false" Name="Revision"/>
<w:LsdException Locked="false" Priority="34" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="List Paragraph"/>
<w:LsdException Locked="false" Priority="29" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Quote"/>
<w:LsdException Locked="false" Priority="30" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Quote"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 1"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 1"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 1"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 1"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 1"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 1"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 1"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 1"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 2"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 2"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 2"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 2"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 2"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 2"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 2"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 2"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 2"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 2"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 2"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 2"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 2"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 2"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 3"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 3"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 3"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 3"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 3"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 3"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 3"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 3"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 3"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 3"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 3"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 3"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 3"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 3"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 4"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 4"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 4"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 4"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 4"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 4"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 4"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 4"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 4"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 4"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 4"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 4"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 4"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 4"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 5"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 5"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 5"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 5"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 5"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 5"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 5"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 5"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 5"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 5"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 5"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 5"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 5"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 5"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 6"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 6"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 6"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 6"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 6"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 6"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 6"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 6"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 6"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 6"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 6"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 6"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 6"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 6"/>
<w:LsdException Locked="false" Priority="19" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtle Emphasis"/>
<w:LsdException Locked="false" Priority="21" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Emphasis"/>
<w:LsdException Locked="false" Priority="31" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtle Reference"/>
<w:LsdException Locked="false" Priority="32" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Reference"/>
<w:LsdException Locked="false" Priority="33" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Book Title"/>
<w:LsdException Locked="false" Priority="37" Name="Bibliography"/>
<w:LsdException Locked="false" Priority="39" QFormat="true" Name="TOC Heading"/>
</w:LatentStyles>
</xml><![endif]-->
<!--[if gte mso 10]>
<style>
/* Style Definitions */
table.MsoNormalTable
{mso-style-name:"Table Normal";
mso-tstyle-rowband-size:0;
mso-tstyle-colband-size:0;
mso-style-noshow:yes;
mso-style-priority:99;
mso-style-parent:"";
mso-padding-alt:0in 5.4pt 0in 5.4pt;
mso-para-margin:0in;
mso-para-margin-bottom:.0001pt;
mso-pagination:widow-orphan;
font-size:10.0pt;
font-family:"Times New Roman";
mso-fareast-language:JA;}
</style>
<![endif]-->
<!--StartFragment-->
<!--EndFragment--><br />
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<span style="font-family: Helvetica; mso-bidi-font-family: Helvetica; mso-fareast-language: JA;">In part II, the problem
overfitting causes for model interpretation will be addressed. <o:p></o:p></span></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<span style="font-family: Helvetica; mso-bidi-font-family: Helvetica; mso-fareast-language: JA;"><br /></span></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; tab-stops: .5in 1.0in 1.5in 2.0in 2.5in 3.0in 3.5in 4.0in 4.5in 5.0in 5.5in 6.0in; text-autospace: none;">
<span style="font-family: Helvetica; mso-bidi-font-family: Helvetica; mso-fareast-language: JA;">This article first appeared at the Predictive Analytics Times, <a href="http://www.predictiveanalyticsworld.com/patimes/why-overfitting-is-more-dangerous-than-just-poor-accuracy-part-i/" target="_blank">http://www.predictiveanalyticsworld.com/patimes/why-overfitting-is-more-dangerous-than-just-poor-accuracy-part-i/</a></span></div>
Dean Abbotthttp://www.blogger.com/profile/16818000233889520746noreply@blogger.com4tag:blogger.com,1999:blog-5652924.post-47114983368344692922014-01-16T07:20:00.001-08:002014-01-17T06:52:22.524-08:00Data Science and Big Data Search Trends<div style="font-family: Helvetica; font-size: 12px;">
These are from google trends.</div>
<div style="font-family: Helvetica; font-size: 12px;">
<br /></div>
<div style="font-family: Helvetica; font-size: 12px;">
Data science is growing, but still way behind other traditional terms for our field such as data mining, predictive analytics and machine learning. Big data on the other hand is growing rapidly and outpacing other fields.</div>
<div style="font-family: Helvetica; font-size: 12px;">
<br /></div>
<div style="font-family: Helvetica; font-size: 12px;">
This page for now is just an FYI that I may refer to in today's DM Radio show, "Blinded by Data Science" (<a href="http://t.co/59xvGh1gTz">http://bit.ly/KfK74l </a>). I hope to turn it into a more cogent blog post soon (but no promises!)</div>
<div style="font-family: Helvetica; font-size: 12px;">
<br /></div>
<div style="font-family: Helvetica; font-size: 12px;">
<br /></div>
<div style="font-family: Helvetica; font-size: 12px;">
<h3>
<b>“data science”</b></h3>
</div>
<div style="font-family: Helvetica; font-size: 12px;">
<script src="//www.google.com/trends/embed.js?hl=en-US&q=%22data+science%22&geo=US&cmpt=q&content=1&cid=TIMESERIES_GRAPH_0&export=5&w=500&h=330" type="text/javascript"></script></div>
<div style="font-family: Helvetica; font-size: 12px; min-height: 14px;">
<br /></div>
<div style="font-family: Helvetica; font-size: 12px;">
<h3>
<b>“data science” vs. “data mining”</b></h3>
</div>
<div style="font-family: Helvetica; font-size: 12px;">
<script src="//www.google.com/trends/embed.js?hl=en-US&q=%22data+science%22,+%22data+mining%22&geo=US&cmpt=q&content=1&cid=TIMESERIES_GRAPH_0&export=5&w=500&h=330" type="text/javascript"></script></div>
<div style="font-family: Helvetica; font-size: 12px; min-height: 14px;">
<br /></div>
<div style="font-family: Helvetica; font-size: 12px;">
<h3>
<b>“data science vs. predictive analytics”</b></h3>
</div>
<div style="font-family: Helvetica; font-size: 12px;">
<script src="//www.google.com/trends/embed.js?hl=en-US&q=%22data+science%22,+%22predictive+analytics%22&geo=US&cmpt=q&content=1&cid=TIMESERIES_GRAPH_0&export=5&w=500&h=330" type="text/javascript"></script></div>
<div style="font-family: Helvetica; font-size: 12px; min-height: 14px;">
<br /></div>
<div style="font-family: Helvetica; font-size: 12px;">
<h3>
<b>“data science” vs. “machine learning”</b></h3>
</div>
<div style="font-family: Helvetica; font-size: 12px;">
<script src="//www.google.com/trends/embed.js?hl=en-US&q=%22data+science%22,+%22machine+learning%22&cmpt=q&content=1&cid=TIMESERIES_GRAPH_0&export=5&w=500&h=330" type="text/javascript"></script></div>
<div style="font-family: Helvetica; font-size: 12px; min-height: 14px;">
<br /></div>
<div style="font-family: Helvetica; font-size: 12px;">
<h3>
<b>“data science city”</b></h3>
</div>
<div style="font-family: Helvetica; font-size: 12px;">
<script src="//www.google.com/trends/embed.js?hl=en-US&q=%22data+science%22&geo=US&cmpt=q&content=1&cid=GEO_TABLE_0_2&export=5&w=500&h=330" type="text/javascript"></script></div>
<div style="font-family: Helvetica; font-size: 12px; min-height: 14px;">
<br /></div>
<div style="font-family: Helvetica; font-size: 12px;">
<h3>
</h3>
</div>
<div style="font-family: Helvetica; font-size: 12px;">
<h3>
<b>“big data”</b></h3>
</div>
<div style="font-family: Helvetica; font-size: 12px;">
<script src="//www.google.com/trends/embed.js?hl=en-US&q=%22big+data%22&geo=US&cmpt=q&content=1&cid=TIMESERIES_GRAPH_0&export=5&w=500&h=330" type="text/javascript"></script></div>
<div style="font-family: Helvetica; font-size: 12px; min-height: 14px;">
<br /></div>
<div style="font-family: Helvetica; font-size: 12px;">
<h3>
<b>“big data” city</b></h3>
</div>
<div style="font-family: Helvetica; font-size: 12px;">
<script src="//www.google.com/trends/embed.js?hl=en-US&q=%22big+data%22&geo=US&cmpt=q&content=1&cid=TIMESERIES_GRAPH_0&export=5&w=500&h=330" type="text/javascript"></script></div>
<div style="font-family: Helvetica; font-size: 12px; min-height: 14px;">
<br /></div>
<div style="font-family: Helvetica; font-size: 12px;">
<h3>
“big data” vs. “data science”</h3>
</div>
<div style="font-family: Helvetica; font-size: 12px;">
<script src="//www.google.com/trends/embed.js?hl=en-US&q=%22big+data%22,+%22data+science%22&geo=US&cmpt=q&content=1&cid=TIMESERIES_GRAPH_0&export=5&w=500&h=330" type="text/javascript"></script></div>
<div style="font-family: Helvetica; font-size: 12px; min-height: 14px;">
<br /></div>
<div style="font-family: Helvetica; font-size: 12px;">
<br /></div>
<div style="font-family: Helvetica; font-size: 12px;">
<h3>
“big data” vs. “data mining”</h3>
</div>
<div style="font-family: Helvetica; font-size: 12px;">
<script src="//www.google.com/trends/embed.js?hl=en-US&q=%22big+data%22,+%22data+mining%22&geo=US&cmpt=q&content=1&cid=TIMESERIES_GRAPH_0&export=5&w=500&h=330" type="text/javascript"></script></div>
<div style="font-family: Helvetica; font-size: 12px; min-height: 14px;">
<br /></div>
<div style="font-family: Helvetica; font-size: 12px;">
<h3>
“big data” vs. “predictive analytics”</h3>
</div>
<div style="font-family: Helvetica; font-size: 12px;">
<script src="//www.google.com/trends/embed.js?hl=en-US&q=%22big+data%22,+%22predictive+analytics%22&geo=US&cmpt=q&content=1&cid=TIMESERIES_GRAPH_0&export=5&w=500&h=330" type="text/javascript"></script></div>
<div style="font-family: Helvetica; font-size: 12px; min-height: 14px;">
<br /></div>
<div style="font-family: Helvetica; font-size: 12px;">
<h3>
“big data” vs. “machine learning”</h3>
</div>
<div style="font-family: Helvetica; font-size: 12px;">
<script src="//www.google.com/trends/embed.js?hl=en-US&q=%22big+data%22,+%22machine+learning%22&geo=US&cmpt=q&content=1&cid=TIMESERIES_GRAPH_0&export=5&w=500&h=330" type="text/javascript"></script></div>
<div style="font-family: Helvetica; font-size: 12px; min-height: 14px;">
<br /></div>
<div style="font-family: Helvetica; font-size: 12px;">
<h3>
“predictive analytics”</h3>
</div>
<div style="font-family: Helvetica; font-size: 12px;">
<script src="//www.google.com/trends/embed.js?hl=en-US&q=%22predictive+analytics%22&geo=US&cmpt=q&content=1&cid=TIMESERIES_GRAPH_0&export=5&w=500&h=330" type="text/javascript"></script></div>
<div style="font-family: Helvetica; font-size: 12px; min-height: 14px;">
<br /></div>
<div style="font-family: Helvetica; font-size: 12px;">
<h3>
“big data” vs. “data science” city</h3>
</div>
<div style="font-family: Helvetica; font-size: 12px;">
<script src="//www.google.com/trends/embed.js?hl=en-US&q=%22big+data%22,+%22data+science%22&geo=US&cmpt=q&content=1&cid=GEO_TABLE_0_2&export=5&w=500&h=330" type="text/javascript"></script></div>
<div style="font-family: Helvetica; font-size: 12px; min-height: 14px;">
<br /></div>
<div style="font-family: Helvetica; font-size: 12px;">
<h3>
“big data” vs. “data mining” city</h3>
</div>
<div style="font-family: Helvetica; font-size: 12px;">
<script src="//www.google.com/trends/embed.js?hl=en-US&q=%22big+data%22,+%22data+mining%22&geo=US&cmpt=q&content=1&cid=GEO_TABLE_0_2&export=5&w=500&h=330" type="text/javascript"></script></div>
<div style="font-family: Helvetica; font-size: 12px; min-height: 14px;">
<br /></div>
<div style="font-family: Helvetica; font-size: 12px; min-height: 14px;">
<br /></div>
<div style="font-family: Helvetica; font-size: 12px;">
<h3>
“predictive analytics” city</h3>
</div>
<div style="font-family: Helvetica; font-size: 12px;">
<script src="//www.google.com/trends/embed.js?hl=en-US&q=%22predictive+analytics%22&geo=US&cmpt=q&content=1&cid=GEO_TABLE_0_2&export=5&w=500&h=330" type="text/javascript"></script></div>
<div style="font-family: Helvetica; font-size: 12px; min-height: 14px;">
<br /></div>
<div style="font-family: Helvetica; font-size: 12px; min-height: 14px;">
<br /></div>
<div style="font-family: Helvetica; font-size: 12px;">
</div>
<div style="font-family: Helvetica; font-size: 12px; min-height: 14px;">
<br /></div>
Dean Abbotthttp://www.blogger.com/profile/16818000233889520746noreply@blogger.com0tag:blogger.com,1999:blog-5652924.post-20094256568228488892014-01-09T08:37:00.000-08:002014-01-17T06:52:34.002-08:00Speaking Engagements First Quarter 2014I'll be speaking at several events this quarter<br />
<br />
<b>1) EITA Global Webinar: </b> <a href="https://www.eitaglobal.com/control/w_product/~product_id=300058REC/~sel=LIVE/~Dean_Abbott/~Key_Steps_in_Starting_Your_First_Predictive_Analytics_Project" target="_blank">Key Steps in Starting Your First Predictive Analytics Project</a><br />
Tuesday, January 14, 2014, 1:00 PM EST<br />
<br />
This 90 minute webinar will walk through a predictive analytics project from start to finish using the CRISP-DM process model, including<br />
<br />
<ul>
<li>Data Needed for Predictive Modeling</li>
<li>Data Preparation</li>
<li>Top Modeling Algorithms: Decision Tree, Neural Networks, Regression, Clustereing</li>
<li>How to Assess Models</li>
<li>Model Deployment</li>
</ul>
<br />
Webinar cost: for viewing the webinar live, the cost is $239. There are volume discounts available.<br />
<br />
<br />
<b>2) Quebit Webinar: </b><a href="http://www.quebit.com/ai1ec_event/ibm-spss-modeler-seminar-series-techniques-unbalanced-data/?instance_id=122" target="_blank">IBM SPSS Modeler Seminar Series: Techniques for Unbalanced Data</a><br />
Dates Offered: January 22, 1-4 PM EST (lecture)<br />
January 23, 1-2 PM EST (Q&A)<br />
<br />
This webinar is divided into two days. On the first day, Keith McCormick and I will be discussing different strategies for handling unbalanced data. On the second day will cover questions from Day 1.<br />
<br />
We will be using examples used in our recent book, <a href="http://www.packtpub.com/ibm-spss-modeler-cookbook/book" target="_blank">The IBM SPSS Modeler Cookbook</a>.<br />
<br />
Use the code SPSSMODDA100 to get a $100 discount off of the webinar. It will also get you $300 off the annual subscription (which I strongly recommend--Keith and the Quebit team are the best Modeler trainers in the business).<br />
<br />
<br />
<b>3) UC / Irvine Predictive Analytics Certificate Program: </b><a href="http://unex.uci.edu/courses/sectiondetail.aspx?year=2014&term=Winter&sid=00391" target="_blank">Algorithms, Modeling Methods, Verification & Validation</a><br />
January 27, 2014 to March 16, 2014 (7 weeks, online)<br />
<br />
Learn how to use the basics of predictive analytics and modeling data to determine which algorithms to use. Understand the similarities and differences and which options affect the models most. Discover how to verify and validate your model. Topics covered include predictive analytics algorithms for supervised learning, including decision trees, linear and logistic regression, neural networks, k-nearest neighbor, support vector machines, and model ensembles. Gain a deeper understanding of how algorithms work qualitatively by reviewing best practices and the influence of various options on predictive models.<br />
<br />
Cost: $695<br />
<br />
This is the algorithms course, part of the UC Irvine Predictive Analytics Certificate program. There are prerequisites for the course, but if you don't have them yet, take them and take the algorithms course in a future quarter.<br />
<br />
4) <a href="http://www.knime.org/ugm2014" target="_blank"><b>7th KNIME User Group Meeting (UGM) and Workshops</b></a><br />
Feb 12-14, Zurich, Switzerland<br />
<br />
Wed Feb 12: Real-Time Customer Intelligence, Dean Abbott (Abbott Analytics, Inc.)<br />
Friday, Feb 14: Strategies for Building Predictive Models in KNIME, Dean Abbott (Abbott Analytics)<br />
<br />
KNIME User Group Meeting and Free Workshops, February 12-14, have a fee of 200 €<br />
<br />
5) <a href="http://www.meetup.com/San-Diego-KNIME-Users/events/157277972/" target="_blank"><b>The first San Diego KNIME Meetup</b></a><br />
<br />
Date: Wednesday, February 26, 2014, 6:00 PM to 8:30 PM (free)<br />
Location: DART Neuroscience, 12278 Scripps Summit Drive, San Diego, CA<br />
<br />
6) <a href="http://www.predictiveanalyticsworld.com/sanfrancisco/2014/speakers.php" target="_blank"><b>Predictive Analytics World, San Francisco</b></a><br />
<br />
Talk, March 18: <a href="http://www.predictiveanalyticsworld.com/sanfrancisco/2014/agenda.php#day2-1000b" target="_blank">Data Preparation from the Trenches: 4 Approaches to Deriving Attributes</a><br />
My talk at PAW/SF this year is on data preparation, focusing on strategies for building derived attributes manually and automatically.<br />
<br />
Workshop, March 19: <a href="http://www.predictiveanalyticsworld.com/sanfrancisco/2014/ensembles_handson.php" target="_blank">Supercharging Prediction: Hands-On with Ensemble Models</a><br />
This workshop discusses how to build model ensembles using a state-of-the-art software package. You will have a license for the software for use in the workshop and for a period of time subsequent to the workshop.<br />
<br />
<div>
Workshop, March 20: <a href="http://www.predictiveanalyticsworld.com/sanfrancisco/2014/handson_predictive_analytics.php" target="_blank">Advanced Methods Hands-on, Predictive Modeling Techniques</a></div>
This workshop discusses the predictive analytics modeling process using a state-of-the-art software package. We will go from beginning to end (Business Understanding through Deployment) for one use case. You will have a license for the software for use in the workshop and for a period of time subsequent to the workshop.<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />Dean Abbotthttp://www.blogger.com/profile/16818000233889520746noreply@blogger.com0tag:blogger.com,1999:blog-5652924.post-65936313497251362242013-11-18T14:55:00.000-08:002014-01-17T06:53:28.505-08:00A Good Business Objective Beats a Good Algorithm<div style="border: 0px; color: #444444; font-family: 'Source Sans Pro', HelveticaNeue-Light, Arial, 'Helvetica Neue', sans-serif; font-size: 16px; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
Predictive Modeling competitions, once the arena for a few data mining conferences, has now become big business. Kaggle (<a href="http://www.kaggle.com/" style="border: 0px; color: #3b44a1; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; font-weight: inherit; line-height: inherit; margin: 0px; outline: 0px; padding: 0px; vertical-align: baseline;" target="_blank">kaggle.com</a>) is perhaps the most well-known forum for modeling competitions, using a crowd-sourcing mentality: if more people try to solve a problem, the likelihood that someone will create an excellent solution to that problem increases.</div>
<div style="border: 0px; color: #444444; font-family: 'Source Sans Pro', HelveticaNeue-Light, Arial, 'Helvetica Neue', sans-serif; font-size: 16px; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
The participants, and there have been 10s of thousands of participants since their 2011 beginning, sometimes have no predictive modeling background and sometimes an extensive data science background. Some very clever algorithms and solutions have been developed with, on some occasions, ground-breaking results</div>
<div style="border: 0px; color: #444444; font-family: 'Source Sans Pro', HelveticaNeue-Light, Arial, 'Helvetica Neue', sans-serif; font-size: 16px; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
One conclusion to draw from these competitions is that what we need in the predictive analytics space is more data scientists with different, innovative ideas for solving problems, and perhaps more in-depth training of data scientists so they can create these innovative solutions. After all, the Netflix prize winner created a solution that was an ensemble of model ensembles, comprised of hundreds of models (not a Kaggle competition, but one created by and for <a href="http://techblog.netflix.com/2012/04/netflix-recommendations-beyond-5-stars.html" style="border: 0px; color: #3b44a1; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; font-weight: inherit; line-height: inherit; margin: 0px; outline: 0px; padding: 0px; vertical-align: baseline;" target="_blank">Netflix</a>).</div>
<div style="border: 0px; color: #444444; font-family: 'Source Sans Pro', HelveticaNeue-Light, Arial, 'Helvetica Neue', sans-serif; font-size: 16px; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
This idea of the importance of machine learning expertise was the topic of a Strata conference debate in 2012, tackling the question, “which is more important, domain expertise or machine learning expertise”, or the way it was phrased for the debate, “who should your first hire be: a domain expert or data scientist?”</div>
<div style="border: 0px; color: #444444; font-family: 'Source Sans Pro', HelveticaNeue-Light, Arial, 'Helvetica Neue', sans-serif; font-size: 16px; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
The conclusion of the majority at the Strata conference was the machine learning is more important, but even the moderator, Mark Driscoll, concluded the following,<br style="margin: 0px; padding: 0px;" />“Could you currently prepare your data for a Kaggle competition? If so, then hire a machine learner. If not, hire a data scientist who has the domain expertise and the data hacking skills to get you there.” (<a href="http://medriscoll.com/post/18784448854/the-data-science-debate-domain-expertise-or-machine" style="border: 0px; color: #3b44a1; font-family: inherit; font-size: inherit; font-style: inherit; font-variant: inherit; font-weight: inherit; line-height: inherit; margin: 0px; outline: 0px; padding: 0px; vertical-align: baseline;" target="_blank">http://medriscoll.com/post/18784448854/the-data-science-debate-domain-expertise-or-machine</a>)</div>
<div style="border: 0px; color: #444444; font-family: 'Source Sans Pro', HelveticaNeue-Light, Arial, 'Helvetica Neue', sans-serif; font-size: 16px; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
The point is that defining the competition objectives and the data needed to solve the problem is critically important. Non-domain experts, the data scientists, can not ever hope to understand the domain well enough to determine what the most effective question to answer would be, where to find the data to build a modeling data set, what the target variable should be, and how one should assess which model is best. These are business domain specific.</div>
<div style="border: 0px; color: #444444; font-family: 'Source Sans Pro', HelveticaNeue-Light, Arial, 'Helvetica Neue', sans-serif; font-size: 16px; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
Even companies building the same kinds of models, let’s say customer retention or churn, will approach them differently depending on the kind of business, the lead time needed to act on potential churners, and the metrics for churn that relate to ROI for that company. I’ve build models for companies in the same domain area that took very different approaches; even though I had some domain experience from customer 1, that didn’t translate into developing business objectives well for company 2.</div>
<div style="border: 0px; color: #444444; font-family: 'Source Sans Pro', HelveticaNeue-Light, Arial, 'Helvetica Neue', sans-serif; font-size: 16px; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
It’s the partnership that matters. I often think of these partnerships within an organization as the three-legged stool, all of which are needed for the modeling project to succeed: a business stakeholder who understands what business objectives matter to the company and how to articulate them, IT staff who know where the data is, what it means, and how to access it, and the analysts who know how to take the data and the business objectives and translate them into modeling objectives that address the business problem. Without all three, projects fail. We modelers could build the best models in the world that solve the wrong problem exceedingly well!</div>
<div style="border: 0px; color: #444444; font-family: 'Source Sans Pro', HelveticaNeue-Light, Arial, 'Helvetica Neue', sans-serif; font-size: 16px; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
(first posted at http://www.predictiveanalyticsworld.com/patimes/a-good-business-objective-beats-a-good-algorithm/)</div>
Dean Abbotthttp://www.blogger.com/profile/16818000233889520746noreply@blogger.com4tag:blogger.com,1999:blog-5652924.post-68069396371988317212013-09-07T11:57:00.000-07:002013-09-08T05:17:03.485-07:00On Data Mining ContestsData mining contests have grown in popularity over the years, from the annual competitions at technical conferences to the continuous stream of events at sites like Kaggle. This has yielded several benefits, allowing many experts to work on difficult problems, giving novices a chance to work on real data and showcasing successful solutions. These competitions have even garnered the attention of the mainstream press. While believing that the spread of these technical contests has been largely positive, this author feels that it's worth noting the limitations of these contests.<br />
<br />
Despite using real data, the problems, as formulated, are somewhat artificial. Questions of sampling and initial variable selection have already been decided, as have the evaluation function and the model's part in the ultimate solution. To some extent, these are necessary constraints, but they are constraints nonetheless. In real world data mining, all of these questions are the responsibility of the data miner and his or her clients, and they are not trivial considerations. In most larger organizations, the data is large enough that there is always "one more" table in the database which could be tapped for candidate predictors. Likewise, how the model might best be positioned as part of the total solution is not always obvious, especially in more complex problems. A minority of contests permit the use of outside data, but even this is somewhat unrealistic since real organizations have budgets for the purchase of outside data, such as demographic data to be appended to a customer population. I've yet to learn of anyone paying for outside variables to append to competition data, though.<br />
<br />
Another issue is the large number of competitors which these contests attract. Though it is good to have many analysts take a crack at a problem, one must wonder about the statistical significance of having hundreds of statisticians test God-only-knows how many hypotheses against the same data. Further, the number of competitors and the similarity of top contestants' performance figures make selection of a single "winner" a dubious proposition.<br />
<br />
Finally, it has become rather common for winners of the contests to construct solutions of vast proportions- typically ensembles of gigantic number of base models. While such models may be feasible to deploy in some circumstances, they far too computationally demanding to execute on many real databases quickly enough to be practical.<br />
<br />
Some of these criticisms are probably unavoidable, especially the ones regarding the pre-selected, pre-digested contest data. Still, it'd be interesting to see future data mining competitions address at least some of these issues. For one thing, it might be interesting to see solution sizes (lines of SQL or C++ or something similar) limited to something which ordinary IT departments would be capable of executing during a typical overnight run. Averaging across an increased number of tasks might begin to improve the significance of differences among contestants' performances.<br />
<br />Will Dwinnellhttp://www.blogger.com/profile/03379859054257561952noreply@blogger.com4tag:blogger.com,1999:blog-5652924.post-21926803372054203412013-08-21T19:07:00.003-07:002013-08-21T19:12:19.219-07:00Beware Phantom DataOne of the perennial challenges facing the data analyst is missing values. A great deal has been written about the importance of identifying the source of missing values, the danger of overly simplistic solutions and, of course, the many and varied mechanisms for "filling them in" with synthetic data ("imputation").<br />
<br />
Of the tremendous volume of material written on this subject, nearly all assumes that the analyst knows precisely which items are missing from the data. In reality, this is sometimes not the case. Relational databases and statistical software files, as a rule, have a special value to indicate "missing", though that does not mean that it is always used. Some file formats offer only indirect provision for missings, if any at all, and how software reacts to such missings varies.<br />
<br />
Consider, too, the popular practice of using special values (such as -9999) to represent missing values. What could possibly go wrong? For one thing, the person writing the data may not consider whether the flag value might represent a legitimate value. Is it possible, for instance, to have an account balance of -9999 dollars (euros, etc.)? In my career, I have seen databases which used different flag values for each field (-99, -9999, -99999, etc.), making the writing of code against such data extremely tedious (and error-prone). I have also seen -9999 used to indicate one type of missing value, and -9998 to indicate another type of missing. When the hand-off of information from one person (system, process, etc.) to another is confused or incomplete, interpretation of the data becomes incorrect.<br />
<br />
Another aspect of this problem is the precise definition given to fields, and their possible misinterpretation by data consumers (such as data miners). Imagine that a particular integer field is being used to record the number of times each customer has made a payment on their loan, within the past 6 months. As customers begin their tenure, this variable starts with a value of zero. Suppose our model included this field as an independent variable. Presumably low risk customers have higher values, while higher risk customers have lower values. Without missing any payments, early lifecycle customer are penalized arbitrarily by the model. One could make the argument that this variable should be undefined (recorded as a database missing value flag) until a customer has a full 6-month track record, but this is exactly the sort of conversation which very often fails to materialize in real organizations.<br />
<br />
These are all instances of "phantom data": Items in the database which are missing values, but mistaken for real data. It shouldn't take much imagination on the reader's part to conjure similar problematic situations in his or her own field. The lesson is to look beyond the known missings for more subtle gaps in the data. Time spent investigating the nature of database systems, company procedures and so forth which generate data is insurance against being burned by serious misunderstanding of the data.<br />
<br />Will Dwinnellhttp://www.blogger.com/profile/03379859054257561952noreply@blogger.com0tag:blogger.com,1999:blog-5652924.post-61881480256273235912013-07-23T16:49:00.001-07:002014-01-17T06:53:40.244-08:00The NSA, Link Analysis and Fraud Detection<div style="border: 0px; color: #444444; font-family: 'Source Sans Pro', HelveticaNeue-Light, Arial, 'Helvetica Neue', sans-serif; font-size: 16px; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
The recent leaks about the NSA’s use data mining and predictive analytics has certainly raised awareness of our field and has resulted in hours of discussions with family, relatives, friends and reporters about what predictive analytics can (and can’t) do with phone records, emails, chat messages, and other structured and unstructured data. Eric Siegel and I have been interviewed on multiple occasions to address this issue from a Predictive Analytics perspective, and in case, in the same article: “<a href="http://fcw.com/articles/2013/06/12/nsa-risk-assessment.aspx" target="_blank">What the NSA can’t do with your data (probably)</a>”. Part of my goal in these conversations has been to bring back to reality many of the inflated expectations of what can be done with predictive analytics: predictive analytics is a powerful approach to finding patterns in data, but it isn’t magic, nor is it fool-proof.</div>
<div style="border: 0px; color: #444444; font-family: 'Source Sans Pro', HelveticaNeue-Light, Arial, 'Helvetica Neue', sans-serif; font-size: 16px; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
First, let me be clear: I have no direct knowledge of the analytics the NSA is doing. I have worked on many fraud detection projects for the U.S. Government and private sector, some including what I would describe as a “social networking” component to them where the connections between parties is an important part of the risk factors.</div>
<div style="border: 0px; color: #444444; font-family: 'Source Sans Pro', HelveticaNeue-Light, Arial, 'Helvetica Neue', sans-serif; font-size: 16px; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
The phone call meta data shows simple information about each phone call: origination, destination, date of the call, duration, and perhaps some geographic information about the origination and destination. One of the valuable aspects of the data is that connections can be made between origination and destination numbers, and as a results, one can build social networks of every origination phone number in the data. The U.S. has more than 326.4 millions cell phones subscriptions as of December 2012 according to CTIA. The Pew Research survey found that individual cell phone users had on average 664 social connections (not all of which are cell connections). The number of links needed to build a U.S.-wide social map of phone call connections easily outstrips any possible visualization method, and therefore, without filtering connections and networks, these social maps would be useless. One of the factors working in our favor, if we are concerned with privacy issues related to this meta data, is therefore the sheer size of the network.</div>
<div style="border: 0px; color: #444444; font-family: 'Source Sans Pro', HelveticaNeue-Light, Arial, 'Helvetica Neue', sans-serif; font-size: 16px; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
The networks of phone calls, I believe, are particularly useful in connecting high-risk individuals with others whom the NSA may not know beforehand are connected to the person of interest. In other words, a starting point is needed first and the social network is built from this starting point. If one has multiple starting points, one can also find linkages between networks even if the networks themselves don’t overlap significantly.</div>
<div style="border: 0px; color: #444444; font-family: 'Source Sans Pro', HelveticaNeue-Light, Arial, 'Helvetica Neue', sans-serif; font-size: 16px; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
The strength of a link can include information such as number of calls, duration of calls, regularity of calls, most recent call, oldest call, and more. Think of these as a cell-phone version of RFM analysis. The networks can be pruned easily based on thresholds for these key features, simplifying the networks considerably.</div>
<div style="border: 0px; color: #444444; font-family: 'Source Sans Pro', HelveticaNeue-Light, Arial, 'Helvetica Neue', sans-serif; font-size: 16px; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
But even if the connections are made, this data is woefully incomplete on its own. First, there is no connection to the person who actually made the call, only the phone number and who it is registered to. Finding who made the calls requires more investigation. Second, it doesn’t necessarily connect all the phones an individual might use. If a person uses 5 or 6 cell phones, one doesn’t know that the same person is behind these phone numbers. Third, one certainly doesn’t know the intent or content of the call.</div>
<div style="border: 0px; color: #444444; font-family: 'Source Sans Pro', HelveticaNeue-Light, Arial, 'Helvetica Neue', sans-serif; font-size: 16px; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
Given these limitations, what value is there to the network of calls? These networks are usually best used as lead-generation engines. Which other phone numbers in a network are connected to multiple high-risk individuals (but weren’t here-to-fore considered high risk)? Is the timeline of calls temporally correlated with other known events?</div>
<div style="border: 0px; color: #444444; font-family: 'Source Sans Pro', HelveticaNeue-Light, Arial, 'Helvetica Neue', sans-serif; font-size: 16px; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
Analytics, and link analysis in particular, provide tremendously powerful techniques to identify new leads and remove unfruitful leads by finding connections unlikely to occur randomly.</div>
<div style="border: 0px; color: #444444; font-family: 'Source Sans Pro', HelveticaNeue-Light, Arial, 'Helvetica Neue', sans-serif; font-size: 16px; line-height: 22px !important; margin-bottom: 20px; outline: 0px; padding: 0px; vertical-align: baseline;">
NOTE: this article first appeared as an article in the PATimes: http://www.predictiveanalyticsworld.com/patimes/the-nsa-link-analysis-and-fraud-detection/</div>
Dean Abbotthttp://www.blogger.com/profile/16818000233889520746noreply@blogger.com0tag:blogger.com,1999:blog-5652924.post-58539377335148791682013-06-18T07:19:00.001-07:002014-01-17T06:53:51.281-08:00Big Data is Not EnoughBig data is the big buzz word in the world of analytics today. According to google trends, shown in the figure, searches for "big data" have been growing exponentially since 2010 though perhaps is beginning to level off. Or take a look on amazon.com for books with Big Data in the title sometime: the publication dates, for the most part, are in 2012 or 2013.
<br />
<script src="//www.google.com/trends/embed.js?hl=en-US&q=big+data&cmpt=q&content=1&cid=TIMESERIES_GRAPH_0&export=5&w=400&h=330" type="text/javascript"></script>
<br />
<br />
But what's the key to unlock the big data door? In his interview with Eric Siegel on April 12, Ned Smith of Business News Daily (http://www.businessnewsdaily.com/4326-predictive-analytics-unlocks-big-data.html) starts with this apt insight: "Predictive Analytics is the 'Open Sesame' for the world of Big Data." Big data is what we have; predictive analytics (PA) is what we do with it.
<br />
<br />
Why is the data so big? Where does it come from? We who do PA usually think of doing predictive modeling on structured data pulled from a database, probably flattened into a single modeling table by a query so that the data is loadable into a software tool. We then clean the data, create features, and away we go with predictive modeling.
<br />
<br />
But according to a 2012 IBM study, <a href="http://www.ibmbigdatahub.com/infographic/where-does-big-data-come">"Analytics: The real-world use of big data"</a>, 88% of big data comes from transactions, 73% from log data, and significant proportions of data come from audio and video (still and motion). These are not structured data. Log files are often unstructured data containing nothing more than notes, sometimes freehand, sometimes machine-created, and therefore cannot be used without first preprocessing the data using text mining techniques. For all of us who have built models augmented with log files or other text data, we know how much work is involved in transforming text into useful attributes that can then be used in predictive models
<br />
<br />
Even the most structured of the big data sources, transactional data, often are nothing more than dates, IDs and very simple information about the nature of the transaction (an amount, time period, and perhaps a label about the nature of the transaction).
<br />
<br />
Transactional data is rarely used directly; it is usually transformed into a form more useful for predictive modeling. For example, rather than building models where each row is a web page transaction, we transform the data so that each row is a person (the ID) and the fields are aggregations of that person’s history for as long as their cookie has persisted; the individual transactions have to be linked together and aggregated to be useful.
<br />
<br />
The big data wave we are experiencing is therefore not helpful directly for improving predictive models, we need to first determine the level of analysis needed to build useful models, i.e., what a record in the model represents. The unit of analysis is determined by the question the model is intended to answer, or put another way, the decision the model is intended to improve within the organization. This is determined by defining the business objectives of the models, normally by a program manager or other domain expert in the organization, and not by the modeler.
<br />
<br />
The second step in building data for predictive modeling is creating the features to include as predictors for the models. How do we determine the features? I see three ways:
<br />
<ol>
<li>the analyst can define the features based on his / her experience in the field, or do research to find what others have done in the field through google searching and academic articles. This assumes the analyst is, to some degree, a domain expert.</li>
<li>the key features can be determined by other domain experts either handed down to the analyst or through interviews of domain experts by the analyst. This is better than a google search because the answers are focused on the organization’s perspective on solving the problem.</li>
<li>the analyst can rely on algorithm-based features creation. In this approach, the analyst merely provides the raw input fields and allows the algorithms to find the appropriate transformations of individual fields (easy) or multivariate combinations (more complex). Some algorithms and implementations of algorithms in software can do this quite effectively. This third approach I see advocated implicitly by data scientists in particular.</li>
</ol>
In reality, a combination of all three is usually used and I recommend all three. But features based on domain expertise almost always provides the largest gains in model performance compared with algorithm-based (automatic) feature creation.
<br />
<br />
This is the new thee-legged stool of predictive modeling: big data provides the information, augmenting what we have used in the past, domain experts provide the structure for how to set up the data for modeling, including what a record means and the key attributes that reflect information expected to be helpful to solve the problem, and predictive analytics provides the muscle to open the doors to what is hidden in the data. Those who take advantage of all three will be the winners in operationalizing analytics.
<br />
<br />
First posted at <a href="http://www.predictiveanalyticsworld.com/patimes/">The Predictive Analytics Times</a>Dean Abbotthttp://www.blogger.com/profile/16818000233889520746noreply@blogger.com0tag:blogger.com,1999:blog-5652924.post-3839453216385063042013-06-07T19:16:00.001-07:002013-06-08T07:13:10.855-07:00Dean Abbott Featured in "Popular Mechanics" On-Line ArticleOur own Dean Abbott has been consulted for an on-line <i>Popular Mechanics</i> article, <a href="http://www.popularmechanics.com/technology/military/news/why-the-nsa-wants-all-that-verizon-metadata-15562288" target="_blank">"Why the NSA Wants All That Verizon Metadata"</a> (Jun-06-2013), by Glenn Derene. Since the initial report connecting the NSA with Verizon, details have emerged suggesting similar large-scale information-gathering by the American government from other telecommunication and Internet companies.<br />
<br />
Some applications of data mining to law enforcement and anti-terrorism problems have clearly been fruitful (for detection of money laundering, for instance, which is one source of funding for criminal and terrorist organizations). On the other hand, direct application of these techniques to plucking out the bad guys from large numbers of innocents strikes this author as dubious, and has long been criticized by experts, such as Bruce Schneier. What's plain is that people in democratic societies must remain vigilant of the balance of information and power granted to their governments, lest the medicine become worse than the disease.<br />
<br />Will Dwinnellhttp://www.blogger.com/profile/03379859054257561952noreply@blogger.com0tag:blogger.com,1999:blog-5652924.post-28632279022358040742013-04-26T09:55:00.001-07:002013-04-26T09:55:22.669-07:00Math and Predictive Analytics - A Personal AccountLast week I taught a workshop at Predictive Analytics World entitled <a href="http://www.predictiveanalyticsworld.com/sanfrancisco/2013/ensembles_handson.php" target="_blank">Supercharging Prediction: Hands-On with Ensemble Models</a>. The workshop was intended to introduce predictive modelers to the concept of ensembles through a combination of lecture to provide an overview of model ensembles and hands-on to gain experience building ensembles using <a href="http://www.salford-systems.com/">Salford Systems SPM v7.0</a> (Salford Systems sponsored the workshop).
<br />
<br />
This morning, Heather Hinman, a Marketing Communications Manager at Salford Systems, <a href="http://1.salford-systems.com/blog/bid/285726/Supercharging-Predictive-Modeling-A-Beginners-Perspective" target="_blank">posted comments on attending that workshop at the Salford Systems blog</a>. Two comments were particularly interesting, especially their implications vis a vis my last blog post on math and predictive analytics:<br />
<br />
<blockquote class="tr_bq">
<span style="font-family: Arial, Helvetica, sans-serif; font-size: 13px; line-height: 18px;">I will admit I was intimidated at first to be participating in a predictive modeling workshop as I do not have a background in statistics, and only have basic training on decision tree tools by Salford Systems' team of in-house experts. Despite my basic knowledge of decision trees, I was thrilled that I was able to follow along with ease and understanding when learning about tree ensembles and modern hybrid modeling approaches. Marketing folk building predictive models? Yes, we can!</span></blockquote>
and<br />
<blockquote class="tr_bq">
Now back at the office in San Diego, along with my usual responsibilities, I feel confident in my ability to build predictive models and gain insights into the data at hand to achieve the email marketing and online campaign goals for our communication efforts! </blockquote>
In the post, Heather also outlines some of the principles she learned and how she used them to build the predictive models in the workshop.<br />
<br />
The point is this: if one uses good software that uses solid principles for building predictive models, and one understands key principles of building predictive models, someone without a mathematics background can build good, profitable models.<br />
<br />
<br />
<br />Dean Abbotthttp://www.blogger.com/profile/16818000233889520746noreply@blogger.com0tag:blogger.com,1999:blog-5652924.post-64196828303665474192013-04-01T16:49:00.000-07:002013-04-01T18:49:00.606-07:00Do Predictive Modelers Need to Know Math?(Note: this post was first published in the <a href="http://www.predictiveanalyticsworld.com/patimes/march13/" target="_blank">March 2013 Edition of the Predictive Analytics Times</a>)<br>
Predictive analytics is just a bunch of math, isn’t it? After all, algorithms in the form of matrix algebra, summations, integrals, multiplies and adds are the core of what predictive modeling algorithms do. Even rule-based approaches need math to compute how good the if-then-else rules are.
<br><br>
I was participating in a predictive analytics course recently and the question a participant asked at the end of two days of instruction was this: “it’s been a long time since I’ve had to do this kind of math and I’m a bit rusty. Is there a book that would help me learn the techniques without the math?”
<br><br>
The question about math was interesting. But do we need to know the math to build models well? Anyone can build a bad model, but to build a good model, don’t we need to know what the algorithms are doing? The answer, of course, depends on the role of the analyst. I contend, however, that for most predictive analytics projects, the answer is “no”.
<br><br>
Let’s consider building decision tree models. What options does one need to set to build good trees? Here is a short list of common knobs that can be set by most predictive analytics software packages:
1. Splitting metric (CART style trees, C5 style trees, CHAID style trees, etc.)
2. Terminal node minimum size
3. Parent node minimum size
4. Maximum tree depth
5. Pruning options (standard error, Chi-square test p-value threshold, etc.)
<br><br>
The most mathematical of these knobs is the splitting metric. CART-styled trees use the Gini Index, C5 trees use Entropy (information gain), and CHAID style trees use the chi-square test as the splitting criterion. A book I consider the best technical book on data mining and statistical learning methods, “The Elements of Statistical Learning”, has this description of the splitting criteria for decision trees, including the Gini Index and Entropy:
<br><br>
<a href="http://1.bp.blogspot.com/-Fd3pF9d51mI/UVWxcnn7mLI/AAAAAAAAAPw/nSOjGqwEwWo/s1600/gini+entropy+and+misclassification+formulas.png" imageanchor="1"><img border="0" src="http://1.bp.blogspot.com/-Fd3pF9d51mI/UVWxcnn7mLI/AAAAAAAAAPw/nSOjGqwEwWo/s320/gini+entropy+and+misclassification+formulas.png" /></a>
<br><br>
To a mathematician, these make sense. But without a mathematics background, these equations will be at best opaque and at worst incomprehensible. (And these are not very complicated. Technical textbooks and papers describing machine learning algorithms can be quite difficult even for more seasoned, but out-of-practice mathematicians to understand).
<br><br>
As someone with a mathematics background and a predictive modeler, I must say that the actual splitting equations almost never matter to me. Gini and Entropy often produce the same splits or at least similar splits. CHAID differs more, especially in how it creates multi-way splits. But even here, the biggest difference for me is not the math, but just that they use different tests for determining "good" splits
<br><br>
There are, however, very important reasons for someone on the team to understand the mathematics or at least the way these algorithms work qualitatively. First and foremost, understanding the algorithms helps us uncover why models go wrong. Models can be biased toward splitting on particular variables or even particular records. In some cases, it may appear that the models are performing well but in actuality they are brittle. Understanding the math can help remind us that this may happen and why.
<br><br>
The fact that linear regression uses a quadratic cost function tells us that outliers affect overall error disproportionately. Understanding how decision trees measure differences between the parent population and sub-populations informs us why a high-cardinality variable may be showing up at the top of our tree, and why additional penalties may be in order to reduce this bias. Seeing the computation of information gain (derived from Entropy) tells us that binary classification with a small target value proportion (such as having 5% 1s) often won't generate any splits at all.
<br><br>
The answer to the question if predictive modelers need to know math is this: no they don’t need to understand the mathematical notation, but neither should they ignore the mathematics. Instead, we all need to understand the effects of the mathematics on the algorithms we use. “Those who ignore statistics are condemned to reinvent it,” warns Bradley Efron of Stanford University. The same applies to mathematics.
<br><br>Dean Abbotthttp://www.blogger.com/profile/16818000233889520746noreply@blogger.com11tag:blogger.com,1999:blog-5652924.post-76208669152654969702013-02-14T11:09:00.000-08:002013-02-16T14:48:51.489-08:00What To Take Home from Your Next Predictive Analytics ConferenceWhy should one go to a predictive analytics conference? What should one take home from a conference like <a href="http://www.predictiveanalyticsworld.com" target="_blank">Predictive Analytics World</a> (PAW)? There are many reasons conferences are valuable including interacting with thought leaders and practitioners, seeing software and hardware tools (the exhibit hall), and learning principles of predictive analytics from talks and workshops. This post focuses on the talks, and in particular, case studies. <br><br>
There is no quicker way to upgrade our capabilities than having someone else who has "been there" tell us how they succeeded in their development and implementation of predictive models. When I go to conferences, this is at the top of my list. In the best case studies I am able to see different way of looking at a problem than I had considered before, how the practitioner overcame obstacles, how their target variable was defined, what data was used in building the models, how the data was prepared, what figure of merit they used to judge a model's effectiveness, and much more. <br><br>
Almost all case studies we see at conferences are success stories; we all love winners. Yes, we all know that we learn from mistakes and many case studies actually enumerate mistakes. But success sells and given time limitations in a 20-50 minute talk, few mistakes and dead-ends are usually described in the talks. And, as we used to say in when I was doing government contracting, one works like crazy on the research and then when the money runs out, one declares victory. Putting a more positive spin on the process, we do as well as we can with the resources we have, and if the final solution improves the current system, we are indeed successful. <br><br>
But once we observe the successful approach, what can we really take home with us? There are three reasons we should be skeptical taking case studies and applying them directly to our own problems.<br><br>
The first two reasons are straightforward. First, our data is different from the data used in the talk. Obviously. But it is likely to be different enough that one cannot not take the exact same approach to data preparation or target variable creation that one sees at a conference.<br><br>
Second, our business is different. The way the question was framed and the way predictions can be used are likely to differ in our organization. If we are building models to predict Medicare fraud, they way the “suspicious” claim is processed and which data elements are available vary significantly for each provider (codes being just one example).<br><br>
The third reason is more subtle and more difficult to overcome. In a fascinating New Yorker article entitled, <a href="http://www.newyorker.com/reporting/2010/12/13/101213fa_fact_lehrer" target="_blank">"The Truth Wears Off: Is there something wrong with the scientific method?"</a>, author Jonah Lehrer describes an effect seen by many researchers over the past few decades. Findings in major studies, published in reputable journals, and showing statistically significant results have been difficult to replicate by the original researcher and by others. This is a huge problem because replicating results is what we do as predictive modeler: we assume that behavior in the past can and will be replicated in the future.<br><br>
In one example, researcher Jonathan Schooler (who was originally at the University of Washington as a graduate student) “demonstrated that subjects shown a face and asked to describe it were much less likely to recognize the face when shown it later than those who had simply looked at it. Schooler called the phenomenon ‘verbal overshadowing’. The study turned him into an academic star."<br><br>
A few years later, he tried to replicate the study didn’t succeed. In fact, he tried many times over the years and never succeeded. The effect he found at first waned each time he tried to replicate the study with additional data. "This was profoundly frustrating. It was as if nature gave me this great result and then tried to take it back.”
There have been a variety of potential explanations for the effect, including “regression to the mean”. This might very well be the case because even when we show statistically significant results defined by having a p value less than 0.05, there is still a chance that the effect found was not really there at all. Over thousands of studies, dozens find effects therefore that aren't really there. <br><br>
Let's assume we are building models and there is actually no significant difference between responders and non-responders (but we don't know that). However, we work very hard to identify an effect, and eventually we find the effect on training and testing data. We publish. But the effect isn't there; we happened upon the effect just had good luck (which in the long run is actually bad luck!). Even if the chance of finding the effect by chance is 1 in 100, or 1 in 1000, if we experiment enough and search through enough variables, we may happen upon a seemingly good effect eventually. This process, called "over searching" by Jensen and Cohen (see "<a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.32.7008" target="_blank">Multiple Comparisons in Induction Algorithms</a>"), is a real danger.<br><br>
So what do we do at conferences? We should take home ideas, principles, and approaches rather than recipes. It should spur us to try ideas we either hadn't yet tried or even thought about before. <br><br>
(An earlier version of this post was first published in the <a href="http://www.predictiveanalyticsworld.com/patimes/february13/" target="_blank">Predictive Analytics Times February 2013</a> issue)Dean Abbotthttp://www.blogger.com/profile/16818000233889520746noreply@blogger.com3tag:blogger.com,1999:blog-5652924.post-48985528830597318362013-02-10T12:18:00.000-08:002017-05-08T02:11:54.301-07:00Using Geographic DataMost organizations collect and maintain some type of geographic data, yet many ignore this data during analysis. Any business has some record of customer addresses, for instance, but this data is usually formatted in an awkward, non-numeric form. Geographic data can be very predictive, though, since behaviors being predicted often have some correlation to location.<br />
<br />
So, how might one use geographic data? Possible answers depend on several factors, most importantly the volume and type of such data. A company serving a national market in the United States, for instance, will have customer shipping and billing addresses (not necessarily the same thing) for each customer (possibly for each transaction). These addresses normally come with a range of spatial granularities: street address, town, state, and associated ZIP Code (a 5-digit postal code).<br />
<br />
Even at the largest level of aggregation, the state level, there may be over 50 distinct values (besides the 50 states, American addresses may be in Washington D.C. [technically not part of any state], or any of a number of other American territories, the most common of which is probably Puerto Rico). With 50 or so distinct values, significant data volume is needed to amass the observations needed to draw conclusions about each value. Given the best case scenario, in which all states exhibit equal observation counts, 1,000 observations breaks out into 50 categories of merely 20 observations each- not even enough to satisfy the old statistician's 30 observation rule of thumb. In data mining circles, we are accustomed to having much larger observation counts, but consider that the distribution of state values is never uniform in real data.<br />
<br />
Using individual dummy variables to represent each state <i>may</i> be possible with especially large volumes. Possibly an "other" category covering the least frequent so many states will be needed. Another technique which I have found to work well is to replace the categorical state variable with a numeric variable representing a summary of the target variable, conditioned by state. In other words, all instances of "Virginia" are replaced by the average of the target variable for all Virginia cases, all instances of "New Jersey" are replaced by the average of the target variable for all New Jersey cases, and so on. This solution concentrates information about the target which comes from the state in a single variable, but makes interactions with other predictors more opaque. Ideally, such summaries are calculated on special hold-out set of data, used just for this purpose, so as to avoid over-fitting. Again, it may be necessary to lump the smallest so many states together as "other". While I have used American states in my example, it should not be hard for the reader to extend this idea to Canadian provinces, French départements, etc.<br />
<br />
Most American states are large enough to provide robust summaries, but as a group they may not provide enough differentiation in the target variable. Changing the spatial scale implies a trade-off: Smaller geographic units exhibit worse summary variance, but improved geographic differentiation. American town names are not necessarily unique within a given state and similar names may be confused (Newtown, Pennsylvania is quite a distance from Newtown Square, Pennsylvania, for instance). In the United States, county names are unambiguous, and present finer spatial detail than states. County names do not, however, normally appear in addresses, but they are easily attached using ZIP Code/County tables easily found on-line. Another possible aggregation is the Section Code Facility, or "SCF", which is the first 3 digits of the ZIP Code.<br />
<br />
In the American market, other types of spatial definitions which can be used include: Census Bureau definitions, telephone area codes and Metropolitan Statistical Areas ("MSAs") and related groupings defined by the U.S. Office of Management and Budget. The Census Bureau is a government agency which divides the entire country in to spatial units which vary in scale, down to very small areas (much smaller than ZIP Codes). MSAs are very popular with marketers. There are 366 MSAs at present, and they do not cover the entire land area of the United States, though they do cover about 85% of its population.<br />
<br />
It is important to note that nearly all geographic entities change in size, shape and character over time. While existing American state and county boundaries almost never change any more, ZIP code boundaries and Census Bureau definitions, for instance, do change. Changing boundaries obviously complicates analysis, even though historic boundary definitions are often available. Even among entities whose boundaries do not change, radical changes in behavior may happen in geographically distinct ways. Consider that a model built before hurricane Katrina may no longer perform well in areas affected by the storm.<br />
<br />
Also note that some geographic units, by definition, "respect" other definitions. American counties, for instance, only contain land from a single state. Others don't: the third-most populous MSA, Chicago-Joliet-Naperville, IL-IN-WI, for example, overlaps three different states.<br />
<br />
Being creative when defining model inputs can be as helpful with geographic data as it is with more conventional data. In addition to the billing address itself, consider transformations such as: Has the billing address ever changed (1) or not (0)? How many times has the billing address changed? How often has the billing address changed (number of times changed divided by number of months the account has been open)? How far is the shipping address from the billing address? And so on...<br />
<br />
Much more sophisticated use may be made of geographic data than has been described in this short posting. Software is available commercially which will determine drive time contours about locations, which would be useful, for instance when modeling retail store location revenue models. Additionally, there is an entire branch of statistics, called <i>spatial statistics</i>, which defines an entire class of analysis procedures specific to this sort of thing.<br />
<br />
I encourage readers who have avoided geographic data to consider even simple mechanisms to include it in model construction. Opening up a new dimension in your analysis may provide significant returns.<br />
<br />
<br />
<br />
<br />
<br />Will Dwinnellhttp://www.blogger.com/profile/03379859054257561952noreply@blogger.com1tag:blogger.com,1999:blog-5652924.post-19061026438383029612013-02-02T07:23:00.004-08:002013-02-02T07:25:27.006-08:00When Analysis Isn't the AnswerData mining is an important tool whose benefits have been demonstrated in diverse fields, among business, government and non-profit organizations. Its application areas continue to grow, especially given the ever-shrinking cost of gathering and organizing data. Yet, there are problems for which data mining is wholly unsuited as a solution.<br />
<br />
To understand when data mining is not applicable, it will be helpful to define precisely when it <b>is</b> applicable. Data mining (inferential statistics, predictive analytics, etc.) requires data stored in a machine format of sufficient volume, quality and relevance so as to permit the construction of predictive models which assist in real-world decision making.<br />
<br />
Most of our time as data miners is spent worrying over the quality of the data and the process of turning data into models, however it is important to realize the usual context of data mining. Most organizations can perform basic decision making competently, and they have done so for thousands of years. Whether the base decision process is human judgment, a simple set of rules or a spreadsheet, much performance potential is already realized before data mining is applied. Consultants' marketing notwithstanding, data mining typically inhabits the margin of performance, where it tries to bring an extra "edge".<br />
<br />
So, if the above two paragraphs describe conditions conducive to data mining success, what sorts of real-world situations defy data mining? The most obvious would be problems featuring data that is too small, too narrow, too noisy or of too little relevance to allow effective modeling. Organizations which have not maintained good records, which still rely on non-computer procedures and those with too little history are good examples. Even within very large organizations which collect and store enormous databases, there may be no relevant data for the problem at hand (for instance, when a new line of business is being opened, or new products introduced). It is surprising how often business people expect to extract value from a situation when they have failed to invest in appropriate data gathering.<br />
<br />
Another large area with minimal data mining potential is organizations whose basic business process is so fundamentally broken that the usual decision making procedures have failed to do the usual "heavy lifting". Any of us can easily recall experiences in retail establishments whose operation was so flawed that it was obvious that the profit potential was not nearly being exploited. Data mining cannot fine tune a process which is so far gone. No amount of quantitative analysis will fix unkept shelves, weak product offering or poor employee behavior.Will Dwinnellhttp://www.blogger.com/profile/03379859054257561952noreply@blogger.com0tag:blogger.com,1999:blog-5652924.post-22802091280749618042013-01-16T07:57:00.002-08:002013-01-16T07:57:34.324-08:00Three Ways to Get Your Predictive Models Deployed
<!--[if gte mso 9]><xml>
<o:DocumentProperties>
<o:Revision>0</o:Revision>
<o:TotalTime>0</o:TotalTime>
<o:Pages>1</o:Pages>
<o:Words>1195</o:Words>
<o:Characters>6812</o:Characters>
<o:Company>Abbott Analytics</o:Company>
<o:Lines>56</o:Lines>
<o:Paragraphs>15</o:Paragraphs>
<o:CharactersWithSpaces>7992</o:CharactersWithSpaces>
<o:Version>14.0</o:Version>
</o:DocumentProperties>
<o:OfficeDocumentSettings>
<o:AllowPNG/>
</o:OfficeDocumentSettings>
</xml><![endif]-->
<!--[if gte mso 9]><xml>
<w:WordDocument>
<w:View>Normal</w:View>
<w:Zoom>0</w:Zoom>
<w:TrackMoves/>
<w:TrackFormatting/>
<w:PunctuationKerning/>
<w:ValidateAgainstSchemas/>
<w:SaveIfXMLInvalid>false</w:SaveIfXMLInvalid>
<w:IgnoreMixedContent>false</w:IgnoreMixedContent>
<w:AlwaysShowPlaceholderText>false</w:AlwaysShowPlaceholderText>
<w:DoNotPromoteQF/>
<w:LidThemeOther>EN-US</w:LidThemeOther>
<w:LidThemeAsian>JA</w:LidThemeAsian>
<w:LidThemeComplexScript>X-NONE</w:LidThemeComplexScript>
<w:Compatibility>
<w:BreakWrappedTables/>
<w:SnapToGridInCell/>
<w:WrapTextWithPunct/>
<w:UseAsianBreakRules/>
<w:DontGrowAutofit/>
<w:SplitPgBreakAndParaMark/>
<w:EnableOpenTypeKerning/>
<w:DontFlipMirrorIndents/>
<w:OverrideTableStyleHps/>
<w:UseFELayout/>
</w:Compatibility>
<m:mathPr>
<m:mathFont m:val="Cambria Math"/>
<m:brkBin m:val="before"/>
<m:brkBinSub m:val="--"/>
<m:smallFrac m:val="off"/>
<m:dispDef/>
<m:lMargin m:val="0"/>
<m:rMargin m:val="0"/>
<m:defJc m:val="centerGroup"/>
<m:wrapIndent m:val="1440"/>
<m:intLim m:val="subSup"/>
<m:naryLim m:val="undOvr"/>
</m:mathPr></w:WordDocument>
</xml><![endif]--><!--[if gte mso 9]><xml>
<w:LatentStyles DefLockedState="false" DefUnhideWhenUsed="true"
DefSemiHidden="true" DefQFormat="false" DefPriority="99"
LatentStyleCount="276">
<w:LsdException Locked="false" Priority="0" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Normal"/>
<w:LsdException Locked="false" Priority="9" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="heading 1"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 2"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 3"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 4"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 5"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 6"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 7"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 8"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 9"/>
<w:LsdException Locked="false" Priority="39" Name="toc 1"/>
<w:LsdException Locked="false" Priority="39" Name="toc 2"/>
<w:LsdException Locked="false" Priority="39" Name="toc 3"/>
<w:LsdException Locked="false" Priority="39" Name="toc 4"/>
<w:LsdException Locked="false" Priority="39" Name="toc 5"/>
<w:LsdException Locked="false" Priority="39" Name="toc 6"/>
<w:LsdException Locked="false" Priority="39" Name="toc 7"/>
<w:LsdException Locked="false" Priority="39" Name="toc 8"/>
<w:LsdException Locked="false" Priority="39" Name="toc 9"/>
<w:LsdException Locked="false" Priority="35" QFormat="true" Name="caption"/>
<w:LsdException Locked="false" Priority="10" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Title"/>
<w:LsdException Locked="false" Priority="1" Name="Default Paragraph Font"/>
<w:LsdException Locked="false" Priority="11" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtitle"/>
<w:LsdException Locked="false" Priority="22" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Strong"/>
<w:LsdException Locked="false" Priority="20" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Emphasis"/>
<w:LsdException Locked="false" Priority="59" SemiHidden="false"
UnhideWhenUsed="false" Name="Table Grid"/>
<w:LsdException Locked="false" UnhideWhenUsed="false" Name="Placeholder Text"/>
<w:LsdException Locked="false" Priority="1" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="No Spacing"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 1"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 1"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 1"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 1"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 1"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 1"/>
<w:LsdException Locked="false" UnhideWhenUsed="false" Name="Revision"/>
<w:LsdException Locked="false" Priority="34" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="List Paragraph"/>
<w:LsdException Locked="false" Priority="29" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Quote"/>
<w:LsdException Locked="false" Priority="30" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Quote"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 1"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 1"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 1"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 1"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 1"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 1"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 1"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 1"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 2"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 2"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 2"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 2"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 2"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 2"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 2"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 2"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 2"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 2"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 2"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 2"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 2"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 2"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 3"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 3"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 3"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 3"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 3"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 3"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 3"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 3"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 3"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 3"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 3"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 3"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 3"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 3"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 4"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 4"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 4"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 4"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 4"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 4"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 4"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 4"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 4"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 4"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 4"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 4"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 4"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 4"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 5"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 5"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 5"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 5"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 5"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 5"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 5"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 5"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 5"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 5"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 5"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 5"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 5"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 5"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 6"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 6"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 6"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 6"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 6"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 6"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 6"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 6"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 6"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 6"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 6"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 6"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 6"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 6"/>
<w:LsdException Locked="false" Priority="19" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtle Emphasis"/>
<w:LsdException Locked="false" Priority="21" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Emphasis"/>
<w:LsdException Locked="false" Priority="31" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtle Reference"/>
<w:LsdException Locked="false" Priority="32" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Reference"/>
<w:LsdException Locked="false" Priority="33" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Book Title"/>
<w:LsdException Locked="false" Priority="37" Name="Bibliography"/>
<w:LsdException Locked="false" Priority="39" QFormat="true" Name="TOC Heading"/>
</w:LatentStyles>
</xml><![endif]-->
<!--[if gte mso 10]>
<style>
/* Style Definitions */
table.MsoNormalTable
{mso-style-name:"Table Normal";
mso-tstyle-rowband-size:0;
mso-tstyle-colband-size:0;
mso-style-noshow:yes;
mso-style-priority:99;
mso-style-parent:"";
mso-padding-alt:0in 5.4pt 0in 5.4pt;
mso-para-margin:0in;
mso-para-margin-bottom:.0001pt;
mso-pagination:widow-orphan;
font-size:10.0pt;
font-family:"Times New Roman";
mso-fareast-language:JA;}
</style>
<![endif]-->
<!--StartFragment-->
<br />
<div class="MsoNormal" style="line-height: 16.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<span style="color: #262626; font-family: Georgia; font-size: 10.0pt; mso-bidi-font-family: Georgia; mso-fareast-language: JA;">We all know
that given reasonable data, a good predictive modeler can build a model that
works well and helps make makes better decisions than what is currently used in
your organization (at least in our own minds). Newer data, sophisticated
algorithms, and a seasoned analyst are all working in our favor when we build
these models, and if success were measured by accuracy (as they are in most
data mining competitions), we're in great shape. Yes, there are always gotchas
and glitches along the way. But when my deliverable is only slideware, even of
the modeling is hard, I'm confident of being able to declare victory at the
end.<o:p></o:p></span></div>
<div class="MsoNormal" style="line-height: 16.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<br /></div>
<div class="MsoNormal" style="line-height: 16.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<span style="color: #262626; font-family: Georgia; font-size: 10.0pt; mso-bidi-font-family: Georgia; mso-fareast-language: JA;">However,
the reality is that there is much more to the transition from cool model to
actual deployment than a nice slide deck and paper accepted at one's favorite
predictive analytics, data mining or big data conference. In these venues, the
winning models are those that are "accurate" (more on that later) and
have used creative analysis techniques to find the solution; we won't submit a
paper when we only had to press the "go" button and have the data
mining software give us a great solution!<o:p></o:p></span></div>
<div class="MsoNormal" style="line-height: 16.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<br /></div>
<div class="MsoNormal" style="line-height: 16.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<span style="color: #262626; font-family: Georgia; font-size: 10.0pt; mso-bidi-font-family: Georgia; mso-fareast-language: JA;">For me, the
gold standard is deployment. If the model gets used and improves the decisions
an organization makes, I've succeeded. Three ways to increase the likelihood
your models are deployed are: <o:p></o:p></span></div>
<div class="MsoNormal" style="line-height: 16.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<br /></div>
<div class="MsoNormal" style="line-height: 16.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<b><span style="color: #262626; font-family: Georgia; font-size: 10.0pt; mso-bidi-font-family: Georgia; mso-fareast-language: JA;">1) Make
sure the model stakeholder designs deployment into the project from the
beginning</span></b><span style="color: #262626; font-family: Georgia; font-size: 10.0pt; mso-bidi-font-family: Georgia; mso-fareast-language: JA;"><o:p></o:p></span></div>
<div class="MsoNormal" style="line-height: 16.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<br /></div>
<div class="MsoNormal" style="line-height: 16.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<span style="color: #262626; font-family: Georgia; font-size: 10.0pt; mso-bidi-font-family: Georgia; mso-fareast-language: JA;">The model
stakeholder is the individual, usually a manager, who is the advocate of
predictive models to decision-makers. It is possible that a senior-level
modeler can do this task, but that person must be able to switch hit: he or she
must be able to speak the language of management and be able to talk technical
detail to analytics. This may require more than one trusted person: the
manager, who is responsible and makes the ultimate decisions about the models,
and the lead modeler, who is responsible for the technical aspects of the
model. It is more than "talking the talk" and knowing buzz-words in both
realms; the person or persons must truly be "one of" both groups. <o:p></o:p></span></div>
<div class="MsoNormal" style="line-height: 16.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<br /></div>
<div class="MsoNormal" style="line-height: 16.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<span style="color: #262626; font-family: Georgia; font-size: 10.0pt; mso-bidi-font-family: Georgia; mso-fareast-language: JA;">For those
who have followed my blog posts and conference talks, you know I am a big
advocate of the CRISP-DM process model (or equivalent methodologies, which seem
to be endless). I've referred to CRISP-DM often, including on topics related to
</span><a href="http://abbottanalytics.blogspot.com/2011/06/what-do-data-miners-need-to-learn.html"><span style="color: #878787; font-family: Georgia; font-size: 10.0pt; mso-bidi-font-family: Georgia; mso-fareast-language: JA; text-decoration: none; text-underline: none;">what
data miners need to learn</span></a><span style="color: #262626; font-family: Georgia; font-size: 10.0pt; mso-bidi-font-family: Georgia; mso-fareast-language: JA;">
and </span><a href="http://abbottanalytics.blogspot.com/2012/04/why-defining-target-variable-in.html"><span style="color: #878787; font-family: Georgia; font-size: 10.0pt; mso-bidi-font-family: Georgia; mso-fareast-language: JA; text-decoration: none; text-underline: none;">Defining
the Target Variable</span></a><span style="color: #262626; font-family: Georgia; font-size: 10.0pt; mso-bidi-font-family: Georgia; mso-fareast-language: JA;">, just as
two examples. <o:p></o:p></span></div>
<div class="MsoNormal" style="line-height: 16.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<br /></div>
<div class="MsoNormal" style="line-height: 16.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<span style="color: #262626; font-family: Georgia; font-size: 10.0pt; mso-bidi-font-family: Georgia; mso-fareast-language: JA;">The
stakeholder must not only understand the business of objectives of the model
(Business Understanding in CRISP-DM), but must be present during discussions
take place related to which models will be built. It is essential that
reasonable expectations are put into place from the beginning, including what a
good model will "look like" (accuracy and interpretability) and how
the final model will be deployed. <o:p></o:p></span></div>
<div class="MsoNormal" style="line-height: 16.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<br /></div>
<div class="MsoNormal" style="line-height: 16.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<span style="color: #262626; font-family: Georgia; font-size: 10.0pt; mso-bidi-font-family: Georgia; mso-fareast-language: JA;">I've seen
far too many projects die or become inconsequential because either the wrong
objectives were used in building the models, meaning the models were
operationally useless, or because the deployment of the models was not
considered, meaning again that the models were operationally useless. As an
example, on one project, the model was assumed to be able to be run within a
rules engine, but the models that were built were not rules at all, but were
complex non-linear models that could not be translated into rules. The problem
obviously could have been avoided had this disconnect been verbalized early in
the modeling process. <o:p></o:p></span></div>
<div class="MsoNormal" style="line-height: 16.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<br /></div>
<div class="MsoNormal" style="line-height: 16.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<b><span style="color: #262626; font-family: Georgia; font-size: 10.0pt; mso-bidi-font-family: Georgia; mso-fareast-language: JA;">2) Make
sure modelers understand the purpose of the models</span></b><span style="color: #262626; font-family: Georgia; font-size: 10.0pt; mso-bidi-font-family: Georgia; mso-fareast-language: JA;"><o:p></o:p></span></div>
<div class="MsoNormal" style="line-height: 16.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<br /></div>
<div class="MsoNormal" style="line-height: 16.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<span style="color: #262626; font-family: Georgia; font-size: 10.0pt; mso-bidi-font-family: Georgia; mso-fareast-language: JA;">The
modelers must know how the models will be used and what metrics should be used
to judge model performance. A good summary of typical error metrics used by
modelers is found </span><a href="http://abbottanalytics.blogspot.com/2006/11/error-measures.html%20target="><span style="color: #878787; font-family: Georgia; font-size: 10.0pt; mso-bidi-font-family: Georgia; mso-fareast-language: JA; text-decoration: none; text-underline: none;">here</span></a><span style="color: #262626; font-family: Georgia; font-size: 10.0pt; mso-bidi-font-family: Georgia; mso-fareast-language: JA;">. However, for most of the models I have
deployed in customer acquisition, retention, and risk modeling, the treatment
based on the model is never applied to the <i>entire</i> population (we don't
mail everyone, just a subset). So the metrics that make the most sense are
often ones like "lift after the top decile", maximum cumulative net
revenue, top 1000 scores to be investigated, etc. I've actually seen negative
correlations between the ranking of models based on global metrics (like
classification error or R^2) vs. the ranking based on subset selection ranking,
such as top 1000 scores; very different models may be deployed depending on the
metric one uses to assess them. If modelers aren't aware of the metric to be
used, the wrong model can be selected, even one that does worse than the
current approach.<o:p></o:p></span></div>
<div class="MsoNormal" style="line-height: 16.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<br /></div>
<div class="MsoNormal" style="line-height: 16.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<span style="color: #262626; font-family: Georgia; font-size: 10.0pt; mso-bidi-font-family: Georgia; mso-fareast-language: JA;">Second, if
the modelers don't understand how the models will be deployed operationally,
they may find a fantastic model, one that maximizes the right metric, but is
useless. The Neflix Prize is a </span><a href="http://techblog.netflix.com/2012/04/%20netflix-recommendations-beyond-5-stars.html"><span style="color: #878787; font-family: Georgia; font-size: 10.0pt; mso-bidi-font-family: Georgia; mso-fareast-language: JA; text-decoration: none; text-underline: none;">great
example</span></a><span style="color: #262626; font-family: Georgia; font-size: 10.0pt; mso-bidi-font-family: Georgia; mso-fareast-language: JA;">: the final winning model was
accurate but far too complex to be used. Netflix extracted key pieces to the
models to operationalize instead. I've had customers stipulate to me that
"no more than 10 variables can be included in the final model". If
modelers aren't aware of specific timelines or implementation constraints, a
great but useless model can be the result.<o:p></o:p></span></div>
<div class="MsoNormal" style="line-height: 16.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<br /></div>
<div class="MsoNormal" style="line-height: 16.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<b><span style="color: #262626; font-family: Georgia; font-size: 10.0pt; mso-bidi-font-family: Georgia; mso-fareast-language: JA;">3) Make
sure the model stakeholder understands what the models can and can't do</span></b><span style="color: #262626; font-family: Georgia; font-size: 10.0pt; mso-bidi-font-family: Georgia; mso-fareast-language: JA;"><o:p></o:p></span></div>
<div class="MsoNormal" style="line-height: 16.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<br /></div>
<div class="MsoNormal" style="line-height: 16.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<span style="color: #262626; font-family: Georgia; font-size: 10.0pt; mso-bidi-font-family: Georgia; mso-fareast-language: JA;">In the
effort to get models deployed, I've seen models elevated to a status they don't
deserve, most often by exaggerating their accuracy and expected performance
once in operation. I understand why modelers may do this: they have a direct
stake in what they did. But the manager must be more skeptical and
conservative.<o:p></o:p></span></div>
<div class="MsoNormal" style="line-height: 16.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<br /></div>
<div class="MsoNormal" style="line-height: 16.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<span style="color: #262626; font-family: Georgia; font-size: 10.0pt; mso-bidi-font-family: Georgia; mso-fareast-language: JA;">One of the
most successful colleagues I've ever worked with used to assess model
performance on held-out data using the metric we had been given (maximum depth
one could mail to and still achieve the pre-determined response rate). But then
he always backed off what was reported to his managers by about 10% to give
some wiggle room. Why? Because even in our best efforts, there is still a
danger that the data environment after the model is deployed will differ from
that used in building the models, thus reducing the effectiveness of the
models.<o:p></o:p></span></div>
<div class="MsoNormal" style="line-height: 16.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<br /></div>
<div class="MsoNormal" style="line-height: 16.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<span style="color: #262626; font-family: Georgia; font-size: 10.0pt; mso-bidi-font-family: Georgia; mso-fareast-language: JA;">A second
problem for the model stakeholder is communicating an interpretation of the
models to decision-makers. I've had to do this exercise several times in the
past few months and it is always eye-opening when I try to explain the patterns
a model is finding when the model is itself complex. We can describe overall
trends ("on average", more of X increases the model score) and we can
also describe specific patterns (when observable fields X and Y are both high,
the model score is high). Both are needed to communicate what the models do,
but have to connect with what a decision-maker understands about the problem.
If it doesn't make sense, the model won't be used. If it is too obvious, the
model isn't worth being used. <o:p></o:p></span></div>
<div class="MsoNormal" style="line-height: 16.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<br /></div>
<div class="MsoNormal" style="line-height: 16.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<span style="color: #262626; font-family: Georgia; font-size: 10.0pt; mso-bidi-font-family: Georgia; mso-fareast-language: JA;">The ideal
model for me is one where the decision-maker nods knowingly at the "on
average" effects (these should usually be obvious). Then, once you throw
in some specific patterns, he or she should scrunch his/her eyes, think a bit,
then smile as the implications of the pattern dawns on them as that pattern
really does make sense (but was previously not considered). <o:p></o:p></span></div>
<div class="MsoNormal" style="line-height: 16.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<br /></div>
<div class="MsoNormal" style="line-height: 16.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<span style="color: #262626; font-family: Georgia; font-size: 10.0pt; mso-bidi-font-family: Georgia; mso-fareast-language: JA;">As
predictive modelers, we know that absolutes are hard to come by, so even if
these three principles are adhered to, other factors can sabotage the
deployment of a model. Nevertheless, in general, these steps will increase the
likelihood that models are deployed. In all three steps, communication is the
key to ensuring the model built addresses the right business objective, the
right scoring metric, and can be deployed operationally.</span></div>
<div class="MsoNormal" style="line-height: 16.0pt; mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<span style="color: #262626; font-family: Georgia; font-size: 10.0pt; mso-bidi-font-family: Georgia; mso-fareast-language: JA;"><br /></span></div>
<div class="MsoNormal">
<span style="color: #262626; font-family: Georgia; font-size: 10pt; line-height: 16pt;">NOTE: this post was originally posted for the Predictive Analytics Times at </span><span style="color: #262626; font-family: Georgia; font-size: x-small;"><span style="line-height: 21.328125px;">http://www.predictiveanalyticsworld.com/patimes/january13/</span></span><span style="color: #262626; font-family: Georgia; font-size: 10pt; line-height: 16pt;"> </span></div>
<!--EndFragment-->Dean Abbotthttp://www.blogger.com/profile/16818000233889520746noreply@blogger.com1tag:blogger.com,1999:blog-5652924.post-70724328216402493522013-01-04T08:52:00.002-08:002013-01-04T08:52:35.456-08:00Top Posts in 2012For the second consecutive year, a quick look back at posts from the prior year.<br />
<br />
For posted in 2012, in order of popularity:<br />
<ol>
<li><a href="http://abbottanalytics.blogspot.com/2012/02/target-pregnancy-and-predictive.html" target="_blank">Target, Pregnancy, and Predictive Analytics, Part I</a></li>
<li><a href="http://abbottanalytics.blogspot.com/2012/02/target-pregnancy-and-predictive_21.html" target="_blank">Target, Pregnancy, and Predictive Analytics, Part II</a></li>
<li><a href="http://abbottanalytics.blogspot.com/2012/05/predictive-analytics-world-had-target.html" target="_blank">Predictive Analytics World Had the Target Story First</a></li>
<li><a href="http://abbottanalytics.blogspot.com/2012/04/why-defining-target-variable-in.html" target="_blank">Why Defining the Target Variable in Predictive Analytics is Critical</a></li>
<li><a href="http://abbottanalytics.blogspot.com/2012/04/dilbert-database-marketing-and-spam.html" target="_blank">Dilbert, Database Marketing, and Spam</a></li>
</ol>
I’m also adding #6 because Will post in December did very well, but of course has had only one month to accumulate views.<br />
<ol start="6">
<li><a href="http://abbottanalytics.blogspot.com/2012/12/6-reasons-you-hired-wrong-data-miner.html" target="_blank">6 Reasons You Hired the Wrong Data Miner</a></li>
</ol>
From posts prior to 2012, in order of popularity for 2012:<br />
<ol>
<li><a href="http://abbottanalytics.blogspot.com/2011/06/what-do-data-miners-need-to-learn.html" target="_blank">What Do Data Miners Need to Learn</a> (June 2011)</li>
<li><a href="http://abbottanalytics.blogspot.com/2006/11/free-and-inexpensive-data-mining.html" target="_blank">Free and Inexpensive Data Mining Software</a> (November 2006) This post needs to be updated!</li>
<li><a href="http://abbottanalytics.blogspot.com/2009/04/why-normalization-matters-with-k-means.html" target="_blank">Why Normalization Matters for K Means</a> (April 2009) It always amazes me why this post persists as one of the most popular, but nearly ¼ of the visits used the search term “K Means Noisy Data”</li>
<li><a href="http://abbottanalytics.blogspot.com/2008/04/data-mining-data-sets.html" target="_blank">Data Mining Data Sets</a> (April 2008) This post also needs to be updated</li>
<li><a href="http://abbottanalytics.blogspot.com/2009/12/business-analytics-vs-business.html" target="_blank">Business Analytics vs. Business Intelligence</a> (December 2009)</li>
</ol>
One final note: When I look back at visits since the start of this blog, 4 of the top 5 posts are the top 4 “prior to 2012” above. The #5 most popular post over all the years I’ve had the blog is one by Will from 2007, “<a href="http://abbottanalytics.blogspot.com/2007/03/missing-values-and-special-values.html" target="_blank">Missing Values and Special Values: The Plague of Data Analyis</a>”, one that I have always liked very much.<br />
<br />
Best to all of you in 2013!
Dean Abbotthttp://www.blogger.com/profile/16818000233889520746noreply@blogger.com0