
{
"metadata": {
"language": "Julia",
"name": ""
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a name='Sections'/>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- [Decision Trees](#Decision Trees)\n",
" - [Using Iris Dataset as a Sample](#Iris)\n",
" - [Decision Stumps](#Stumps)\n",
" - [Verifying a Split](#Verifying)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>\n",
"<a name='Decision Trees'/>"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"Decision Trees\n",
"=====\n",
"[[back to top]](#Sections)\n",
"\n",
"- Decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the item's target value.\n",
"\n",
"- Decision tree learning is a method commonly used in data mining.[\n",
"\n",
"- The goal is to create a model that predicts the value of a target variable based on several input variables.\n",
"\n",
"- Decision tree learning is one of the most successful techniques for supervised classification learning.\n",
"\n",
"- Decision trees generate white-box classification and regression models which can be used for *feature selection* and *sample prediction*.\n",
" - (If a given situation is observable in a model the explanation for the condition is easily explained by boolean logic.)\n",
"\n",
"- Decision trees require almost no data preparation (i.e. normalization) and can handle both numerical and nominal/categorical data. Decision trees can also be pruned or bundled into ensembles of trees (i.e. random forests) in order to remedy over-fitting and improve prediction accuracy.\n",
"\n",
"- A decision tree is a predictive model which maps observations (features) about an item to conclusions about the item\u2019s type or class.\n",
"\n",
"- Large amounts of data can be analysed using standard computing resources in reasonable time.\n",
"\n",
"This library provides an implementation of the [**ID3**](https://en.wikipedia.org/wiki/ID3_algorithm) decision tree classifier algorithm along with:\n",
"\n",
"- pruning, parallelized random forest generation \n",
"- cross validation functionality\n",
"- support for mixed numerical and nominal data."
]
},
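{
"cell_type": "markdown",
"metadata": {},
"source": [
"The mapping from observations to a class can be sketched as a small hand-written tree. This is a minimal illustration written for Julia 1.x, not the data structure used by the `DecisionTree` package: the `Leaf` and `Node` types, the `classify` function, and both thresholds are hypothetical and chosen only for the example.\n",
"\n",
"```julia\n",
"# Illustrative types (not part of DecisionTree): a leaf stores a class label,\n",
"# an internal node tests one feature against a threshold.\n",
"struct Leaf\n",
"    label::String\n",
"end\n",
"\n",
"struct Node\n",
"    feature::Int                 # index of the feature to test\n",
"    threshold::Float64           # go left when x[feature] < threshold\n",
"    left::Union{Leaf, Node}\n",
"    right::Union{Leaf, Node}\n",
"end\n",
"\n",
"# Walk from the root to a leaf, applying one boolean test per level.\n",
"classify(leaf::Leaf, x) = leaf.label\n",
"classify(node::Node, x) =\n",
"    x[node.feature] < node.threshold ? classify(node.left, x) : classify(node.right, x)\n",
"\n",
"# Hand-written tree over (sepal length, sepal width, petal length, petal width).\n",
"# The thresholds are illustrative assumptions, not fitted values.\n",
"tree = Node(3, 3.0, Leaf(\"setosa\"),\n",
"            Node(4, 1.75, Leaf(\"versicolor\"), Leaf(\"virginica\")))\n",
"\n",
"println(classify(tree, [5.1, 3.5, 1.4, 0.2]))   # prints setosa\n",
"```\n",
"\n",
"Because every prediction is a chain of threshold tests, the path taken through the tree doubles as the explanation for the result."
]
},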
{
"cell_type": "code",
"collapsed": false,
"input": [
"using DecisionTree\n",
"using RDatasets"
],
"language": "python",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [],
"prompt_number": 2
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>\n",
"<a name='Iris'/>"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"####Using Iris Dataset as a Sample\n",
"[[back to top]](#Sections)\n",
"\n",
"The Iris dataset consists of 150 samples of [iris flowers](https://en.wikipedia.org/wiki/Iris_flower_data_set), along with their: \n",
"\n",
"- sepal lengths and widths \n",
"- petal lengths and widths\n",
"- specie names. \n",
"\n",
"We\u2019ll use the sepal and petal measurements as the features, \n",
"and the specie names (Setosa, Versicolor, Virginica) as the labels."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"iris = dataset(\"datasets\", \"iris\");"
],
"language": "python",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [],
"prompt_number": 3
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# helper function to convert PooledDataArray to Array\n",
"function array{T, R}(da::PooledDataArray{T, R})\n",
" n = length(da)\n",
" res = Array(T, size(da))\n",
" for i in 1:n\n",
" if da.refs[i] == zero(R)\n",
" error(NAException())\n",
" else\n",
" res[i] = da.pool[da.refs[i]]\n",
" end\n",
" end\n",
" return res\n",
"end\n"
],
"language": "python",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 4,
"text": [
"array (generic function with 1 method)"
]
}
],
"prompt_number": 4
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"using Gadfly\n",
"using RDatasets"
],
"language": "python",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [],
"prompt_number": 2
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"plot(dataset(\"datasets\", \"iris\"),x=\"SepalLength\", y=\"SepalWidth\", Geom.point)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"<div id=\"gadflyplot-Knps0f6uCWlUj3MaHnZq\"></div>\n",
"<script>\n",
"(function (module) {\n",
"function draw_with_data(data, parent_id) {\n",
" var g = d3.select(parent_id)\n",
" .append(\"svg\")\n",
" .attr(\"width\", \"120mm\")\n",
" .attr(\"height\", \"80mm\")\n",
" .attr(\"viewBox\", \"0 0 120 80\")\n",
" .attr(\"stroke-width\", \"0.5\")\n",
" .attr(\"style\", \"stroke:black;fill:black\");\n",
" g.append(\"defs\");\n",
" var ctx = {\n",
" \"scale\": 1.0,\n",
" \"tx\": 0.0,\n",
" \"ty\": 0.0\n",
" };\n",
"(function (g) {\n",
" g.attr(\"stroke\", \"none\")\n",
" .attr(\"fill\", \"#000000\")\n",
" .attr(\"stroke-width\", 0.3)\n",
" .attr(\"font-family\", \"Helvetic,Arial,sans\")\n",
" .style(\"font-size\", \"3.88px\");\n",
" (function (g) {\n",
" g.attr(\"class\", \"plotroot xscalable yscalable\");\n",
" (function (g) {\n",
" g.attr(\"stroke\", \"none\")\n",
" .attr(\"fill\", \"#4C404B\")\n",
" .attr(\"font-family\", \"'PT Sans','Helvetica Neue','Helvetica',sans-serif\")\n",
" .style(\"font-size\", \"3.18px\")\n",
" .attr(\"class\", \"guide ylabels\");\n",
" (function (g) {\n",
" g.attr(\"visibility\", \"hidden\");\n",
" g.append(\"svg:text\")\n",
" .attr(\"x\", 22.82)\n",
" .attr(\"y\", 75.63)\n",
" .attr(\"text-anchor\", \"end\")\n",
" .style(\"dominant-baseline\", \"central\")\n",
" .call(function(text) {\n",
" text.text(\"1.0\");\n",
" })\n",
";\n",
" }(g.append(\"g\")));\n",
" g.append(\"svg:text\")\n",
" .attr(\"x\", 22.82)\n",
" .attr(\"y\", 46)\n",
" .attr(\"text-anchor\", \"end\")\n",
" .style(\"dominant-baseline\", \"central\")\n",
" .call(function(text) {\n",
" text.text(\"2.5\");\n",
" })\n",
";\n",
" (function (g) {\n",
" g.attr(\"visibility\", \"hidden\");\n",
" g.append(\"svg:text\")\n",
" .attr(\"x\", 22.82)\n",
" .attr(\"y\", -52.79)\n",
" .attr(\"text-anchor\", \"end\")\n",
" .style(\"dominant-baseline\", \"central\")\n",
" .call(function(text) {\n",
" text.text(\"7.5\");\n",
" })\n",
";\n",
" }(g.append(\"g\")));\n",
" (function (g) {\n",
" g.attr(\"visibility\", \"hidden\");\n",
" g.append(\"svg:text\")\n",
" .attr(\"x\", 22.82)\n",
" .attr(\"y\", -3.4)\n",
" .attr(\"text-anchor\", \"end\")\n",
" .style(\"dominant-baseline\", \"central\")\n",
" .call(function(text) {\n",
" text.text(\"5.0\");\n",
" })\n",
";\n",
" }(g.append(\"g\")));\n",
" g.append(\"svg:text\")\n",
" .attr(\"x\", 22.82)\n",
" .attr(\"y\", 36.12)\n",
" .attr(\"text-anchor\", \"end\")\n",
" .style(\"dominant-baseline\", \"central\")\n",
" .call(function(text) {\n",
" text.text(\"3.0\");\n",
" })\n",
";\n",
" (function (g) {\n",
" g.attr(\"visibility\", \"hidden\");\n",
" g.append(\"svg:text\")\n",
" .attr(\"x\", 22.82)\n",
" .attr(\"y\", 85.51)\n",
" .attr(\"text-anchor\", \"end\")\n",
" .style(\"dominant-baseline\", \"central\")\n",
" .call(function(text) {\n",
" text.text(\"0.5\");\n",
" })\n",
";\n",
" }(g.append(\"g\")));\n",
" (function (g) {\n",
" g.attr(\"visibility\", \"hidden\");\n",
" g.append(\"svg:text\")\n",
" .attr(\"x\", 22.82)\n",
" .attr(\"y\", 115.15)\n",
" .attr(\"text-anchor\", \"end\")\n",
" .style(\"dominant-baseline\", \"central\")\n",
" .call(function(text) {\n",
" text.text(\"-1.0\");\n",
" })\n",
";\n",
" }(g.append(\"g\")));\n",
" g.append(\"svg:text\")\n",
" .attr(\"x\", 22.82)\n",
" .attr(\"y\", 16.36)\n",
" .attr(\"text-anchor\", \"end\")\n",
" .style(\"dominant-baseline\", \"central\")\n",
" .call(function(text) {\n",
" text.text(\"4.0\");\n",
" })\n",
";\n",
" (function (g) {\n",
" g.attr(\"visibility\", \"hidden\");\n",
" g.append(\"svg:text\")\n",
" .attr(\"x\", 22.82)\n",
" .attr(\"y\", -23.15)\n",
" .attr(\"text-anchor\", \"end\")\n",
" .style(\"dominant-baseline\", \"central\")\n",
" .call(function(text) {\n",
" text.text(\"6.0\");\n",
" })\n",
";\n",
" }(g.append(\"g\")));\n",
" (function (g) {\n",
" g.attr(\"visibility\", \"hidden\");\n",
" g.append(\"svg:text\")\n",
" .attr(\"x\", 22.82)\n",
" .attr(\"y\", 105.27)\n",
" .attr(\"text-anchor\", \"end\")\n",
" .style(\"dominant-baseline\", \"central\")\n",
" .call(function(text) {\n",
" text.text(\"-0.5\");\n",
" })\n",
";\n",
" }(g.append(\"g\")));\n",
" g.append(\"svg:text\")\n",
" .attr(\"x\", 22.82)\n",
" .attr(\"y\", 26.24)\n",
" .attr(\"text-anchor\", \"end\")\n",
" .style(\"dominant-baseline\", \"central\")\n",
" .call(function(text) {\n",
" text.text(\"3.5\");\n",
" })\n",
";\n",
" (function (g) {\n",
" g.attr(\"visibility\", \"hidden\");\n",
" g.append(\"svg:text\")\n",
" .attr(\"x\", 22.82)\n",
" .attr(\"y\", -42.91)\n",
" .attr(\"text-anchor\", \"end\")\n",
" .style(\"dominant-baseline\", \"central\")\n",
" .call(function(text) {\n",
" text.text(\"7.0\");\n",
" })\n",
";\n",
" }(g.append(\"g\")));\n",
" (function (g) {\n",
" g.attr(\"visibility\", \"hidden\");\n",
" g.append(\"svg:text\")\n",
" .attr(\"x\", 22.82)\n",
" .attr(\"y\", 95.39)\n",
" .attr(\"text-anchor\", \"end\")\n",
" .style(\"dominant-baseline\", \"central\")\n",
" .call(function(text) {\n",
" text.text(\"0.0\");\n",
" })\n",
";\n",
" }(g.append(\"g\")));\n",
" g.append(\"svg:text\")\n",
" .attr(\"x\", 22.82)\n",
" .attr(\"y\", 6.48)\n",
" .attr(\"text-anchor\", \"end\")\n",
" .style(\"dominant-baseline\", \"central\")\n",
" .call(function(text) {\n",
" text.text(\"4.5\");\n",
" })\n",
";\n",
" (function (g) {\n",
" g.attr(\"visibility\", \"hidden\");\n",
" g.append(\"svg:text\")\n",
" .attr(\"x\", 22.82)\n",
" .attr(\"y\", 65.76)\n",
" .attr(\"text-anchor\", \"end\")\n",
" .style(\"dominant-baseline\", \"central\")\n",
" .call(function(text) {\n",
" text.text(\"1.5\");\n",
" })\n",
";\n",
" }(g.append(\"g\")));\n",
" g.append(\"svg:text\")\n",
" .attr(\"x\", 22.82)\n",
" .attr(\"y\", 55.88)\n",
" .attr(\"text-anchor\", \"end\")\n",
" .style(\"dominant-baseline\", \"central\")\n",
" .call(function(text) {\n",
" text.text(\"2.0\");\n",
" })\n",
";\n",
" (function (g) {\n",
" g.attr(\"visibility\", \"hidden\");\n",
" g.append(\"svg:text\")\n",
" .attr(\"x\", 22.82)\n",
" .attr(\"y\", -33.03)\n",
" .attr(\"text-anchor\", \"end\")\n",
" .style(\"dominant-baseline\", \"central\")\n",
" .call(function(text) {\n",
" text.text(\"6.5\");\n",
" })\n",
";\n",
" }(g.append(\"g\")));\n",
" (function (g) {\n",
" g.attr(\"visibility\", \"hidden\");\n",
" g.append(\"svg:text\")\n",
" .attr(\"x\", 22.82)\n",
" .attr(\"y\", -13.28)\n",
" .attr(\"text-anchor\", \"end\")\n",
" .style(\"dominant-baseline\", \"central\")\n",
" .call(function(text) {\n",
" text.text(\"5.5\");\n",
" })\n",
";\n",
" }(g.append(\"g\")));\n",
" }(g.append(\"g\")));\n",
" (function (g) {\n",
" g.attr(\"stroke\", \"none\")\n",
" .attr(\"fill\", \"#362A35\")\n",
" .attr(\"font-family\", \"'PT Sans','Helvetica Neue','Helvetica',sans-serif\")\n",
" .style(\"font-size\", \"3.88px\");\n",
" g.append(\"svg:text\")\n",
" .attr(\"x\", 9.18)\n",
" .attr(\"y\", 31.18)\n",
" .attr(\"text-anchor\", \"middle\")\n",
" .style(\"dominant-baseline\", \"central\")\n",
" .attr(\"transform\", \"rotate(-90, 9.18, 31.18)\")\n",
" .call(function(text) {\n",
" text.text(\"SepalWidth\");\n",
" })\n",
";\n",
" }(g.append(\"g\")));\n",
" (function (g) {\n",
" g.attr(\"stroke\", \"none\")\n",
" .attr(\"fill\", \"#4C404B\")\n",
" .attr(\"font-family\", \"'PT Sans','Helvetica Neue','Helvetica',sans-serif\")\n",
" .style(\"font-size\", \"3.18px\")\n",
" .attr(\"class\", \"guide xlabels\");\n",
" (function (g) {\n",
" g.attr(\"visibility\", \"hidden\");\n",
" g.append(\"svg:text\")\n",
" .attr(\"x\", -38.12)\n",
" .attr(\"y\", 63.65)\n",
" .attr(\"text-anchor\", \"middle\")\n",
" .call(function(text) {\n",
" text.text(\"1\");\n",
" })\n",
";\n",
" }(g.append(\"g\")));\n",
" (function (g) {\n",
" g.attr(\"visibility\", \"hidden\");\n",
" g.append(\"svg:text\")\n",
" .attr(\"x\", 198.44)\n",
" .attr(\"y\", 63.65)\n",
" .attr(\"text-anchor\", \"middle\")\n",
" .call(function(text) {\n",
" text.text(\"12\");\n",
" })\n",
";\n",
" }(g.append(\"g\")));\n",
" (function (g) {\n",
" g.attr(\"visibility\", \"hidden\");\n",
" g.append(\"svg:text\")\n",
" .attr(\"x\", 219.95)\n",
" .attr(\"y\", 63.65)\n",
" .attr(\"text-anchor\", \"middle\")\n",
" .call(function(text) {\n",
" text.text(\"13\");\n",
" })\n",
";\n",
" }(g.append(\"g\")));\n",
" g.append(\"svg:text\")\n",
" .attr(\"x\", 47.9)\n",
" .attr(\"y\", 63.65)\n",
" .attr(\"text-anchor\", \"middle\")\n",
" .call(function(text) {\n",
" text.text(\"5\");\n",
" })\n",
";\n",
" (function (g) {\n",
" g.attr(\"visibility\", \"hidden\");\n",
" g.append(\"svg:text\")\n",
" .attr(\"x\", 4.89)\n",
" .attr(\"y\", 63.65)\n",
" .attr(\"text-anchor\", \"middle\")\n",
" .call(function(text) {\n",
" text.text(\"3\");\n",
" })\n",
";\n",
" }(g.append(\"g\")));\n",
" (function (g) {\n",
" g.attr(\"visibility\", \"hidden\");\n",
" g.append(\"svg:text\")\n",
" .attr(\"x\", -81.13)\n",
" .attr(\"y\", 63.65)\n",
" .attr(\"text-anchor\", \"middle\")\n",
" .call(function(text) {\n",
" text.text(\"-1\");\n",
" })\n",
";\n",
" }(g.append(\"g\")));\n",
" g.append(\"svg:text\")\n",
" .attr(\"x\", 26.4)\n",
" .attr(\"y\", 63.65)\n",
" .attr(\"text-anchor\", \"middle\")\n",
" .call(function(text) {\n",
" text.text(\"4\");\n",
" })\n",
";\n",
" g.append(\"svg:text\")\n",
" .attr(\"x\", 69.41)\n",
" .attr(\"y\", 63.65)\n",
" .attr(\"text-anchor\", \"middle\")\n",
" .call(function(text) {\n",
" text.text(\"6\");\n",
" })\n",
";\n",
" g.append(\"svg:text\")\n",
" .attr(\"x\", 112.42)\n",
" .attr(\"y\", 63.65)\n",
" .attr(\"text-anchor\", \"middle\")\n",
" .call(function(text) {\n",
" text.text(\"8\");\n",
" })\n",
";\n",
" (function (g) {\n",
" g.attr(\"visibility\", \"hidden\");\n",
" g.append(\"svg:text\")\n",
" .attr(\"x\", 155.43)\n",
" .attr(\"y\", 63.65)\n",
" .attr(\"text-anchor\", \"middle\")\n",
" .call(function(text) {\n",
" text.text(\"10\");\n",
" })\n",
";\n",
" }(g.append(\"g\")));\n",
" (function (g) {\n",
" g.attr(\"visibility\", \"hidden\");\n",
" g.append(\"svg:text\")\n",
" .attr(\"x\", 133.92)\n",
" .attr(\"y\", 63.65)\n",
" .attr(\"text-anchor\", \"middle\")\n",
" .call(function(text) {\n",
" text.text(\"9\");\n",
" })\n",
";\n",
" }(g.append(\"g\")));\n",
" g.append(\"svg:text\")\n",
" .attr(\"x\", 90.91)\n",
" .attr(\"y\", 63.65)\n",
" .attr(\"text-anchor\", \"middle\")\n",
" .call(function(text) {\n",
" text.text(\"7\");\n",
" })\n",
";\n",
" (function (g) {\n",
" g.attr(\"visibility\", \"hidden\");\n",
" g.append(\"svg:text\")\n",
" .attr(\"x\", -59.62)\n",
" .attr(\"y\", 63.65)\n",
" .attr(\"text-anchor\", \"middle\")\n",
" .call(function(text) {\n",
" text.text(\"0\");\n",
" })\n",
";\n",
" }(g.append(\"g\")));\n",
" (function (g) {\n",
" g.attr(\"visibility\", \"hidden\");\n",
" g.append(\"svg:text\")\n",
" .attr(\"x\", 176.94)\n",
" .attr(\"y\", 63.65)\n",
" .attr(\"text-anchor\", \"middle\")\n",
" .call(function(text) {\n",
" text.text(\"11\");\n",
" })\n",
";\n",
" }(g.append(\"g\")));\n",
" (function (g) {\n",
" g.attr(\"visibility\", \"hidden\");\n",
" g.append(\"svg:text\")\n",
" .attr(\"x\", -16.61)\n",
" .attr(\"y\", 63.65)\n",
" .attr(\"text-anchor\", \"middle\")\n",
" .call(function(text) {\n",
" text.text(\"2\");\n",
" })\n",
";\n",
" }(g.append(\"g\")));\n",
" }(g.append(\"g\")));\n",
" (function (g) {\n",
" g.attr(\"stroke\", \"none\")\n",
" .attr(\"fill\", \"#362A35\")\n",
" .attr(\"font-family\", \"'PT Sans','Helvetica Neue','Helvetica',sans-serif\")\n",
" .style(\"font-size\", \"3.88px\");\n",
" g.append(\"svg:text\")\n",
" .attr(\"x\", 69.41)\n",
" .attr(\"y\", 73)\n",
" .attr(\"text-anchor\", \"middle\")\n",
" .call(function(text) {\n",
" text.text(\"SepalLength\");\n",
" })\n",
";\n",
" }(g.append(\"g\")));\n",
" (function (g) {\n",
" g.on(\"mouseover\", guide_background_mouseover(\"#C6C6C9\"))\n",
" .on(\"mouseout\", guide_background_mouseout(\"#F0F0F3\"))\n",
" .call(zoom_behavior(ctx))\n",
";\n",
" (function (g) {\n",
" d3.select(\"defs\")\n",
" .append(\"svg:clipPath\")\n",
" .attr(\"id\", parent_id + \"_clippath0\")\n",
" .append(\"svg:path\")\n",
" .attr(\"d\", \" M23.82,5 L 115 5 115 57.36 23.82 57.36 z\");g.attr(\"clip-path\", \"url(#\" + parent_id + \"_clippath0)\");\n",
" (function (g) {\n",
" g.attr(\"class\", \"guide background\")\n",
" .attr(\"stroke\", \"#F1F1F5\")\n",
" .attr(\"fill\", \"#FAFAFA\")\n",
" .attr(\"opacity\", 1.00);\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M23.82,5 L 115 5 115 57.36 23.82 57.36 z\");\n",
" }(g.append(\"g\")));\n",
" (function (g) {\n",
" g.attr(\"stroke\", \"#F0F0F3\")\n",
" .attr(\"stroke-width\", 0.2)\n",
" .attr(\"class\", \"guide ygridlines xfixed\");\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M23.82,46 L 115 46\");\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M23.82,-3.4 L 115 -3.4\");\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M23.82,85.51 L 115 85.51\");\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M23.82,16.36 L 115 16.36\");\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M23.82,105.27 L 115 105.27\");\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M23.82,-42.91 L 115 -42.91\");\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M23.82,6.48 L 115 6.48\");\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M23.82,55.88 L 115 55.88\");\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M23.82,-13.28 L 115 -13.28\");\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M23.82,-33.03 L 115 -33.03\");\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M23.82,65.76 L 115 65.76\");\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M23.82,95.39 L 115 95.39\");\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M23.82,26.24 L 115 26.24\");\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M23.82,-23.15 L 115 -23.15\");\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M23.82,115.15 L 115 115.15\");\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M23.82,36.12 L 115 36.12\");\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M23.82,-52.79 L 115 -52.79\");\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M23.82,75.63 L 115 75.63\");\n",
" }(g.append(\"g\")));\n",
" (function (g) {\n",
" g.attr(\"stroke\", \"#F0F0F3\")\n",
" .attr(\"stroke-width\", 0.2)\n",
" .attr(\"class\", \"guide xgridlines yfixed\");\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M198.44,5 L 198.44 57.36\");\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M47.9,5 L 47.9 57.36\");\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M-81.13,5 L -81.13 57.36\");\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M69.41,5 L 69.41 57.36\");\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M155.43,5 L 155.43 57.36\");\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M90.91,5 L 90.91 57.36\");\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M176.94,5 L 176.94 57.36\");\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M-16.61,5 L -16.61 57.36\");\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M-59.62,5 L -59.62 57.36\");\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M133.92,5 L 133.92 57.36\");\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M112.42,5 L 112.42 57.36\");\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M26.4,5 L 26.4 57.36\");\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M4.89,5 L 4.89 57.36\");\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M219.95,5 L 219.95 57.36\");\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M-38.12,5 L -38.12 57.36\");\n",
" }(g.append(\"g\")));\n",
" }(g.append(\"g\")));\n",
" (function (g) {\n",
" d3.select(\"defs\")\n",
" .append(\"svg:clipPath\")\n",
" .attr(\"id\", parent_id + \"_clippath1\")\n",
" .append(\"svg:path\")\n",
" .attr(\"d\", \" M23.82,5 L 115 5 115 57.36 23.82 57.36 z\");g.attr(\"clip-path\", \"url(#\" + parent_id + \"_clippath1)\");\n",
" (function (g) {\n",
" g.attr(\"class\", \"plotpanel\");\n",
" (function (g) {\n",
" g.attr(\"stroke-width\", 0.3);\n",
" (function (g) {\n",
" g.attr(\"stroke-width\", 0.3);\n",
"g.selectAll(\"form0\")\n",
" .data(d3.zip(data[0],data[1]))\n",
" .enter()\n",
" .append(\"circle\")\n",
".attr(\"cx\", function(d) { return d[0]; })\n",
".attr(\"cy\", function(d) { return d[1]; })\n",
".attr(\"r\", 0.6)\n",
".attr(\"class\", \"geometry color_LCHab(70.0,60.0,240.0)\")\n",
".on(\"mouseout\", geom_point_mouseout(10.00, 0.50), false)\n",
".on(\"mouseover\", geom_point_mouseover(10.00, 0.50), false)\n",
".attr(\"stroke\", \"#0096DD\")\n",
".attr(\"fill\", \"#00BFFF\")\n",
";\n",
" }(g.append(\"g\")));\n",
" }(g.append(\"g\")));\n",
" }(g.append(\"g\")));\n",
" }(g.append(\"g\")));\n",
" (function (g) {\n",
" d3.select(\"defs\")\n",
" .append(\"svg:clipPath\")\n",
" .attr(\"id\", parent_id + \"_clippath2\")\n",
" .append(\"svg:path\")\n",
" .attr(\"d\", \" M23.82,5 L 115 5 115 57.36 23.82 57.36 z\");g.attr(\"clip-path\", \"url(#\" + parent_id + \"_clippath2)\");\n",
" (function (g) {\n",
" g.attr(\"stroke\", \"none\")\n",
" .attr(\"class\", \"guide zoomslider\")\n",
" .attr(\"opacity\", 0.00);\n",
" (function (g) {\n",
" g.attr(\"stroke\", \"#6A6A6A\")\n",
" .attr(\"stroke-opacity\", 0.00)\n",
" .attr(\"stroke-width\", 0.3)\n",
" .attr(\"fill\", \"#EAEAEA\")\n",
" .on(\"click\", zoomin_behavior(ctx))\n",
".on(\"dblclick\", function() { d3.event.stopPropagation(); })\n",
".on(\"mouseover\", zoomslider_button_mouseover(\"#cd5c5c\"))\n",
".on(\"mouseout\", zoomslider_button_mouseover(\"#6a6a6a\"))\n",
";\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M108,8 L 112 8 112 12 108 12 z\");\n",
" (function (g) {\n",
" g.attr(\"fill\", \"#6A6A6A\")\n",
" .attr(\"class\", \"button_logo\");\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M108.8,9.6 L 109.6 9.6 109.6 8.8 110.4 8.8 110.4 9.6 111.2 9.6 111.2 10.4 110.4 10.4 110.4 11.2 109.6 11.2 109.6 10.4 108.8 10.4 z\");\n",
" }(g.append(\"g\")));\n",
" }(g.append(\"g\")));\n",
" (function (g) {\n",
" g.attr(\"fill\", \"#EAEAEA\")\n",
" .on(\"click\", zoomslider_track_behavior(ctx, 82, 99));\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M88.5,8 L 107.5 8 107.5 12 88.5 12 z\");\n",
" }(g.append(\"g\")));\n",
" (function (g) {\n",
" g.attr(\"fill\", \"#6A6A6A\")\n",
" .attr(\"class\", \"zoomslider_thumb\")\n",
" .call(zoomslider_behavior(ctx, 82, 99))\n",
".on(\"mouseover\", zoomslider_thumb_mouseover(\"#cd5c5c\"))\n",
".on(\"mouseout\", zoomslider_thumb_mouseover(\"#6a6a6a\"))\n",
";\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M97,8 L 99 8 99 12 97 12 z\");\n",
" }(g.append(\"g\")));\n",
" (function (g) {\n",
" g.attr(\"stroke\", \"#6A6A6A\")\n",
" .attr(\"stroke-opacity\", 0.00)\n",
" .attr(\"stroke-width\", 0.3)\n",
" .attr(\"fill\", \"#EAEAEA\")\n",
" .on(\"click\", zoomout_behavior(ctx))\n",
".on(\"dblclick\", function() { d3.event.stopPropagation(); })\n",
".on(\"mouseover\", zoomslider_button_mouseover(\"#cd5c5c\"))\n",
".on(\"mouseout\", zoomslider_button_mouseover(\"#6a6a6a\"))\n",
";\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M84,8 L 88 8 88 12 84 12 z\");\n",
" (function (g) {\n",
" g.attr(\"fill\", \"#6A6A6A\")\n",
" .attr(\"class\", \"button_logo\");\n",
" g.append(\"svg:path\")\n",
" .attr(\"d\", \"M84.8,9.6 L 87.2 9.6 87.2 10.4 84.8 10.4 z\");\n",
" }(g.append(\"g\")));\n",
" }(g.append(\"g\")));\n",
" }(g.append(\"g\")));\n",
" }(g.append(\"g\")));\n",
" }(g.append(\"g\")));\n",
" }(g.append(\"g\")));\n",
"}(g.append(\"g\")));\n",
" d3.select(parent_id)\n",
" .selectAll(\"path\")\n",
" .each(function() {\n",
" var sw = parseFloat(window.getComputedStyle(this).getPropertyValue(\"stroke-width\"));\n",
" d3.select(this)\n",
" .attr(\"vector-effect\", \"non-scaling-stroke\")\n",
" .style(\"stroke-width\", sw + \"mm\");\n",
" });\n",
"}\n",
"\n",
"var data = [\n",
" [50.053380503144645,45.75227987421384,41.45117924528302,39.3006289308176,47.902830188679246,56.50503144654088,39.3006289308176,47.902830188679246,34.9995283018868,45.75227987421384,56.50503144654088,43.60172955974842,43.60172955974842,32.84897798742138,65.10723270440252,62.956682389937114,56.50503144654088,50.053380503144645,62.956682389937114,50.053380503144645,56.50503144654088,50.053380503144645,39.3006289308176,50.053380503144645,43.60172955974842,47.902830188679246,47.902830188679246,52.203930817610065,52.203930817610065,41.45117924528302,43.60172955974842,56.50503144654088,52.203930817610065,58.65558176100629,45.75227987421384,47.902830188679246,58.65558176100629,45.75227987421384,34.9995283018868,50.053380503144645,47.902830188679246,37.150078616352204,34.9995283018868,47.902830188679246,50.053380503144645,43.60172955974842,50.053380503144645,39.3006289308176,54.35448113207546,47.902830188679246,90.91383647798742,78.01053459119498,88.76328616352203,58.65558176100629,80.16108490566037,62.956682389937114,75.85998427672955,45.75227987421384,82.31163522012577,52.203930817610065,47.902830188679246,67.25778301886793,69.40833333333333,71.55888364779874,60.80613207547169,84.46218553459119,60.80613207547169,65.10723270440252,73.70943396226414,60.80613207547169,67.25778301886793,71.55888364779874,75.85998427672955,71.55888364779874,78.01053459119498,82.31163522012577,86.6127358490566,84.46218553459119,69.40833333333333,62.956682389937114,58.65558176100629,58.65558176100629,65.10723270440252,69.40833333333333,56.50503144654088,69.40833333333333,84.46218553459119,75.85998427672955,60.80613207547169,58.65558176100629,58.65558176100629,71.55888364779874,65.10723270440252,47.902830188679246,60.80613207547169,62.956682389937114,62.956682389937114,73.70943396226414,50.053380503144645,62.956682389937114,75.85998427672955,65.10723270440252,93.06438679245282,75.85998427672955,80.16108490566037,103.81713836477986,45.75227987421384,97.36548742138365,84.46218553459119,95.21493710
691824,80.16108490566037,78.01053459119498,86.6127358490566,62.956682389937114,65.10723270440252,78.01053459119498,80.16108490566037,105.96768867924528,105.96768867924528,69.40833333333333,88.76328616352203,60.80613207547169,105.96768867924528,75.85998427672955,84.46218553459119,95.21493710691824,73.70943396226414,71.55888364779874,78.01053459119498,95.21493710691824,99.51603773584905,110.26878930817611,78.01053459119498,75.85998427672955,71.55888364779874,105.96768867924528,75.85998427672955,78.01053459119498,69.40833333333333,88.76328616352203,84.46218553459119,88.76328616352203,65.10723270440252,86.6127358490566,84.46218553459119,84.46218553459119,75.85998427672955,80.16108490566037,73.70943396226414,67.25778301886793],\n",
" [26.23970125786164,36.1186320754717,32.167059748427675,34.14284591194969,24.263915094339623,18.33655660377359,28.215487421383656,28.215487421383656,38.09441823899372,34.14284591194969,22.288128930817614,28.215487421383656,36.1186320754717,36.1186320754717,16.360770440251578,8.457625786163518,18.33655660377359,26.23970125786164,20.312342767295604,20.312342767295604,28.215487421383656,22.288128930817614,24.263915094339623,30.19127358490567,28.215487421383656,36.1186320754717,28.215487421383656,26.23970125786164,28.215487421383656,32.167059748427675,34.14284591194969,28.215487421383656,14.38498427672957,12.409198113207548,34.14284591194969,32.167059748427675,26.23970125786164,24.263915094339623,36.1186320754717,28.215487421383656,26.23970125786164,49.9491352201258,32.167059748427675,26.23970125786164,20.312342767295604,36.1186320754717,20.312342767295604,32.167059748427675,22.288128930817614,30.19127358490567,32.167059748427675,32.167059748427675,34.14284591194969,49.9491352201258,40.070204402515735,40.070204402515735,30.19127358490567,47.97334905660378,38.09441823899372,42.04599056603774,55.87649371069183,36.1186320754717,51.9249213836478,38.09441823899372,38.09441823899372,34.14284591194969,36.1186320754717,42.04599056603774,51.9249213836478,45.99756289308176,32.167059748427675,40.070204402515735,45.99756289308176,40.070204402515735,38.09441823899372,36.1186320754717,40.070204402515735,36.1186320754717,38.09441823899372,44.021776729559754,47.97334905660378,47.97334905660378,42.04599056603774,42.04599056603774,36.1186320754717,28.215487421383656,34.14284591194969,49.9491352201258,36.1186320754717,45.99756289308176,44.021776729559754,36.1186320754717,44.021776729559754,49.9491352201258,42.04599056603774,36.1186320754717,38.09441823899372,38.09441823899372,45.99756289308176,40.070204402515735,30.19127358490567,42.04599056603774,36.1186320754717,38.09441823899372,36.1186320754717,36.1186320754717,45.99756289308176,38.09441823899372,45.99756289308176,24.263915094339623,
32.167059748427675,42.04599056603774,36.1186320754717,45.99756289308176,40.070204402515735,32.167059748427675,36.1186320754717,20.312342767295604,44.021776729559754,51.9249213836478,32.167059748427675,40.070204402515735,40.070204402515735,42.04599056603774,30.19127358490567,32.167059748427675,40.070204402515735,36.1186320754717,40.070204402515735,36.1186320754717,40.070204402515735,20.312342767295604,40.070204402515735,40.070204402515735,44.021776729559754,36.1186320754717,28.215487421383656,34.14284591194969,36.1186320754717,34.14284591194969,34.14284591194969,34.14284591194969,42.04599056603774,32.167059748427675,30.19127358490567,36.1186320754717,45.99756289308176,36.1186320754717,28.215487421383656,36.1186320754717]];\n",
"\n",
"var draw = function(parent_id) {\n",
" draw_with_data(data, parent_id);\n",
"};\n",
"\n",
"if ('undefined' !== typeof module) {\n",
" module.exports = draw;\n",
"} else if ('undefined' !== typeof window) {\n",
" window.draw = draw\n",
"}\n",
"\n",
"return module;\n",
"})({}).exports(\"#gadflyplot-Knps0f6uCWlUj3MaHnZq\");\n",
"//@ sourceURL=gadflyplot-Knps0f6uCWlUj3MaHnZq.js\n",
"</script>\n"
],
"metadata": {},
"output_type": "display_data",
"text": [
"D3(120.0,80.0,IOBuffer(Uint8[0x66,0x75,0x6e,0x63,0x74,0x69,0x6f,0x6e,0x20,0x64 \u2026 0x20,0x3d,0x20,0x64,0x72,0x61,0x77,0x0a,0x7d,0x0a],true,true,true,false,24977,9223372036854775807,24978),0,String[],Bool[],1,3,[0xa7004af4b326ce7b=>([50.0534,45.7523,41.4512,39.3006,47.9028,56.505,39.3006,47.9028,34.9995,45.7523 \u2026 84.4622,88.7633,65.1072,86.6127,84.4622,84.4622,75.86,80.1611,73.7094,67.2578],0),0xe717497a33540045=>([26.2397,36.1186,32.1671,34.1428,24.2639,18.3366,28.2155,28.2155,38.0944,34.1428 \u2026 34.1428,34.1428,42.046,32.1671,30.1913,36.1186,45.9976,36.1186,28.2155,36.1186],1)],true,false,nothing,true)"
]
},
{
"html": [],
"metadata": {},
"output_type": "pyout",
"prompt_number": 3,
"text": [
"Plot(...)"
]
}
],
"prompt_number": 3
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>\n",
"<a name='Stumps'/>"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"###Decision Stumps\n",
"[[back to top]](#Sections)\n",
"\n",
"A [decision stump](https://en.wikipedia.org/wiki/Decision_stump) is the simplest decision tree, consisting of only one split and two classification bins (leaves). \n",
"\n",
"The split it generates is the one with the most predictive power. \n",
"This is done by traversing all the features and splitting the dataset into two subsets for every unique value per feature, \n",
"then returning the split with the highest information gain (lowest dispersion of binned labels across the leaves)."
]
},
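{
"cell_type": "markdown",
"metadata": {},
"source": [
"The search described above can be sketched in plain Julia. This is a simplified illustration for Julia 1.x under stated assumptions, not the code behind `build_stump`: the names `label_entropy` and `best_split` are hypothetical, entropy is used as the dispersion measure, and samples go left when their feature value is strictly below the threshold.\n",
"\n",
"```julia\n",
"# Shannon entropy (in bits) of a vector of class labels.\n",
"function label_entropy(labels)\n",
"    n = length(labels)\n",
"    n == 0 && return 0.0\n",
"    h = 0.0\n",
"    for c in unique(labels)\n",
"        p = count(l -> l == c, labels) / n\n",
"        h -= p * log2(p)\n",
"    end\n",
"    return h\n",
"end\n",
"\n",
"# Exhaustive stump search: try every unique value of every feature as a\n",
"# threshold and keep the split with the highest information gain.\n",
"function best_split(labels, features)\n",
"    n = length(labels)\n",
"    base = label_entropy(labels)\n",
"    best = (gain = -Inf, feature = 0, threshold = NaN)\n",
"    for j in 1:size(features, 2)\n",
"        for t in unique(features[:, j])\n",
"            mask = features[:, j] .< t\n",
"            left, right = labels[mask], labels[.!mask]\n",
"            # A split that leaves one side empty separates nothing.\n",
"            (isempty(left) || isempty(right)) && continue\n",
"            gain = base - (length(left) / n) * label_entropy(left) -\n",
"                          (length(right) / n) * label_entropy(right)\n",
"            if gain > best.gain\n",
"                best = (gain = gain, feature = j, threshold = t)\n",
"            end\n",
"        end\n",
"    end\n",
"    return best\n",
"end\n",
"\n",
"# Toy data, made up for illustration: feature 2 separates the classes perfectly.\n",
"X = [1.0 0.1; 2.0 0.2; 1.5 0.9; 2.5 1.1]\n",
"y = [\"a\", \"a\", \"b\", \"b\"]\n",
"result = best_split(y, X)\n",
"println(result)\n",
"```\n",
"\n",
"On this toy data the search selects feature 2 with threshold 0.9, the only split that yields two pure leaves. The loop costs one pass over the labels per candidate threshold, which is fine for a sketch but slower than a sorted, incremental scan."
]
},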
{
"cell_type": "code",
"collapsed": false,
"input": [
"iris[1:4,:]"
],
"language": "python",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 33,
"text": [
"4x5 DataFrame:\n",
" SepalLength SepalWidth PetalLength PetalWidth Species\n",
"[1,] 5.1 3.5 1.4 0.2 \"setosa\"\n",
"[2,] 4.9 3.0 1.4 0.2 \"setosa\"\n",
"[3,] 4.7 3.2 1.3 0.2 \"setosa\"\n",
"[4,] 4.6 3.1 1.5 0.2 \"setosa\"\n"
]
}
],
"prompt_number": 33
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"features = matrix(iris[:, 1:4]);\n",
"labels = array(iris[:, \"Species\"]);"
],
"language": "python",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"prompt_number": 5
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"stump = build_stump(labels, features)"
],
"language": "python",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 7,
"text": [
"Decision Tree\n",
"Leaves: 2\n",
"Depth: 1"
]
}
],
"prompt_number": 7
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"print_tree(stump)"
],
"language": "python",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"Feature 3, Threshold 3.0\n"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"L-> setosa : 50/50\n",
"R-> versicolor : 50/100\n"
]
}
],
"prompt_number": 8
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"The resulting split was made on the 3rd feature (petal length) with a threshold of 3.0 cm. Iris samples with a petal length less than 3.0 cm are binned into the Setosa leaf (Left), and those with a petal length greater than or equal to 3.0 cm are binned into the Versicolor leaf (Right)."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"*Note that the Left leaf consists of 50 samples, all of which have the Setosa label, while the Right leaf contains 100 samples, half of which have the Versicolor label. Since there are a total of 50 samples of each species, we can see that this initial split separated all the Setosa samples into one leaf, but bundled the Versicolor and Virginica samples into another.*"
]
},
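{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"As a rough check on why this split scores so well, the information gain can be worked out by hand. This sketch assumes the impurity measure is the Shannon entropy in bits, $H = -\\sum_k p_k \\log_2 p_k$, which the notebook does not state explicitly:\n",
"\n",
"$$H_{\\text{parent}} = -3 \\cdot \\tfrac{1}{3}\\log_2\\tfrac{1}{3} = \\log_2 3 \\approx 1.585, \\qquad H_L = 0, \\qquad H_R = -2 \\cdot \\tfrac{1}{2}\\log_2\\tfrac{1}{2} = 1$$\n",
"\n",
"$$\\text{gain} = H_{\\text{parent}} - \\tfrac{50}{150} H_L - \\tfrac{100}{150} H_R \\approx 1.585 - 0 - 0.667 = 0.918$$\n",
"\n",
"The pure Left leaf contributes no entropy, so the only remaining uncertainty is the even Versicolor/Virginica mix in the Right leaf."
]
},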
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<br>\n",
"<a name='Verifying'/>"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"####Verifying a Split\n",
"[[back to top]](#Sections)\n",
"\n",
"One way of verifying this split is by making species predictions based on the sample features using the generated decision stump and then comparing the predictions to the actual species labels using a [confusion matrix](https://de.wikipedia.org/wiki/Beurteilung_eines_Klassifikators#Wahrheitsmatrix:_Richtige_und_falsche_Klassifikationen)."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"predictions = apply_tree(stump, features);\n",
"confusion_matrix(labels, predictions)"
],
"language": "python",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 30,
"text": [
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 50 0 0\n",
" 0 50 0\n",
" 0 50 0\n",
"Accuracy: 0.6666666666666666\n",
"Kappa: 0.49999999999999994"
]
}
],
"prompt_number": 30
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"We can see that all Setosa and Versicolor samples were correctly classified, while all Virginica samples were misclassified as Versicolor, resulting in an overall accuracy of $\\frac{2}{3}$."
]
},
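{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"Both reported figures can be reproduced by hand from the matrix. This sketch assumes that rows are the actual classes, columns are the predicted classes, and that the reported Kappa is Cohen's kappa, $\\kappa = \\frac{p_o - p_e}{1 - p_e}$. Here $p_o$ is the observed agreement (the accuracy) and $p_e$ is the agreement expected by chance from the row and column totals:\n",
"\n",
"$$p_o = \\frac{50 + 50 + 0}{150} = \\frac{2}{3}, \\qquad p_e = \\frac{50 \\cdot 50 + 50 \\cdot 100 + 50 \\cdot 0}{150^2} = \\frac{7500}{22500} = \\frac{1}{3}$$\n",
"\n",
"$$\\kappa = \\frac{\\frac{2}{3} - \\frac{1}{3}}{1 - \\frac{1}{3}} = \\frac{1}{2}$$\n",
"\n",
"The result agrees with the printed value of 0.4999... up to floating-point rounding."
]
},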
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"So now we have a method of separating the Setosa species from the others (via petal length), but how can we separate the remaining two species from each other?\n",
"\n",
"We could manually remove all the Setosa samples from the original dataset and generate another decision stump on the set consisting of only Versicolor and Virginica samples."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"no_setosa_labels = labels[51:end];\n",
"no_setosa_features = features[51:end, :];\n",
"stump2 = build_stump(no_setosa_labels, no_setosa_features);\n",
"print_tree(stump2)"
],
"language": "python",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"Feature 4, Threshold 1.8\n"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"L-> versicolor : 49/54\n",
"R-> virginica : 45/46\n"
]
}
],
"prompt_number": 54
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"This time the most predictive split takes place on the 4th feature (petal width) with a threshold of 1.8 cm. The split isn\u2019t as clean as the first one, but it still places 94 of the 100 remaining samples (49 + 45) in the leaf matching their species."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"We are now in a position to join the two splits generated by the decision stumps by manually cascading them to form a decision tree which will separate all three species from each other."
]
},
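{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"A minimal sketch of that manual cascade, written as a plain function rather than with any DecisionTree API. The name `cascade_predict` is hypothetical, and the sketch assumes the second stump uses the same rule as the first (values below the threshold go Left):\n",
"\n",
"```julia\n",
"# Hypothetical helper: chains the two stump splits printed above.\n",
"# x is one sample: [SepalLength, SepalWidth, PetalLength, PetalWidth]\n",
"function cascade_predict(x)\n",
"    if x[3] < 3.0          # first stump: petal length\n",
"        return \"setosa\"\n",
"    elseif x[4] < 1.8      # second stump: petal width\n",
"        return \"versicolor\"\n",
"    else\n",
"        return \"virginica\"\n",
"    end\n",
"end\n",
"\n",
"cascade_predict([5.1, 3.5, 1.4, 0.2])  # first iris row; returns \"setosa\"\n",
"```\n",
"\n",
"Going by the leaf counts printed above, this cascade would label 50 + 49 + 45 = 144 of the 150 training samples correctly, a training accuracy of 0.96."
]
},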
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"###Automatic Cascading of Splits\n",
"[[back to top]](#Sections)\n",
"\n",
"So far we've gone through building a decision stump, manually removing the well-separated samples from our labels and features, building a second decision stump on the remaining samples, and finally manually cascading the two splits to generate a decision tree.\n",
"\n",
"Instead of all this we could streamline the entire process.\n",
"\n",
"We could recursively split the dataset into subsets until the resulting subsets correspond to pure leaves, each containing samples of only one species."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"tree = build_tree(labels, features);\n",
"print_tree(tree)"
],
"language": "python",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"Feature 3, Threshold 3.0\n"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"L-> setosa : 50/50\n",
"R-> Feature 4, Threshold 1.8\n",
" L-> Feature 3, Threshold 5.0\n",
" L-> Feature 4, Threshold 1.7\n",
" L-> versicolor : 47/47\n",
" R-> virginica : 1/1\n",
" R-> Feature 4, Threshold 1.6\n",
" L-> virginica : 3/3\n",
" R-> Feature 1, Threshold 7.2\n",
" L-> versicolor : 2/2\n",
" R-> virginica : 1/1\n",
" R-> Feature 3, Threshold 4.9\n",
" L-> Feature 1, Threshold 6.0\n",
" L-> versicolor : 1/1\n",
" R-> virginica : 2/2\n",
" R-> virginica : 43/43\n"
]
}
],
"prompt_number": 55
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"Note that all the leaves are pure and some of them contain only one sample. This is a fully grown tree, where every sample has been forced into a leaf using different feature thresholds. But this forcing of samples onto solo leaves will often lead to over-fitting, and that\u2019s where the pruning of leaves will help."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"###Leaf Pruning\n",
"[[back to top]](#Sections)\n",
"\n",
"Pruning makes decision trees less prone to over-fitting, which tends to improve their accuracy on unseen data while also reducing their size and complexity. Splits with little predictive power are removed, producing simpler models that are still accurate.\n",
"\n",
"The pruning scheme is a bottom-up (pessimistic) process, which needs to be repeated until no more merges can be made. Let\u2019s get a count of the number of leaves in our tree generated in the previous section, prune it with a purity threshold of 90%, and get a count of the leaves of the new pruned tree:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"length(tree)"
],
"language": "python",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 13,
"text": [
"9"
]
}
],
"prompt_number": 13
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"pruned = prune_tree(tree, 0.9);"
],
"language": "python",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"prompt_number": 15
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"length(pruned)"
],
"language": "python",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 16,
"text": [
"8"
]
}
],
"prompt_number": 16
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"This pruning cut off only one of the leaves, so let\u2019s get more aggressive and use a 60% threshold, and see what it looks like:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"pruned = prune_tree(tree, 0.6);\n",
"length(pruned)"
],
"language": "python",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 18,
"text": [
"3"
]
}
],
"prompt_number": 18
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"print_tree(pruned)"
],
"language": "python",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"Feature 3, Threshold 3.0\n"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"L-> setosa : 50/50\n",
"R-> Feature 4, Threshold 1.8\n",
" L-> versicolor : 49/54\n",
" R-> virginica : 45/46\n"
]
}
],
"prompt_number": 19
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"But how can we determine a good threshold to use? \n",
"\n",
"One approach is to prune the original tree using a variety of different thresholds, take the number of leaves for each pruned tree, plot the two variables against each other, and look for the *elbow* or where the most dramatic drop occurs in the plot. Another approach is to use cross validation."
]
},
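{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"A minimal sketch of the first approach, assuming `tree` is the fully grown tree built earlier and that `prune_tree` and `length` behave as in the cells above. The names `thresholds` and `leaf_counts`, and the particular threshold values, are illustrative choices:\n",
"\n",
"```julia\n",
"# Illustrative purity thresholds to try (not taken from the original notebook)\n",
"thresholds = [0.5, 0.6, 0.7, 0.8, 0.9, 1.0]\n",
"\n",
"# Number of leaves left after pruning the full tree at each threshold\n",
"leaf_counts = [length(prune_tree(tree, t)) for t in thresholds]\n",
"\n",
"# Plot leaves against threshold and look for the elbow\n",
"using PyPlot\n",
"plot(thresholds, leaf_counts, \"b-o\")\n",
"xlabel(\"Purity Threshold\"); ylabel(\"Number of Leaves\")\n",
"```\n",
"\n",
"From the cells above we already know two points on this curve: 3 leaves at a threshold of 0.6 and 8 leaves at 0.9."
]
},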
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"###N-Fold Cross Validation\n",
"[[back to top]](#Sections)\n",
"\n",
"Cross validation is a model validation technique for assessing how well a model will generalize to an independent data set. \n",
"\n",
"This typically involves splitting the source dataset into training and testing sets, where the training set is used to generate the model and the testing set to measure its prediction accuracy. \n",
"\n",
"[N-fold cross validation](https://de.wikipedia.org/wiki/Kreuzvalidierungsverfahren#Problemstellung) randomly partitions the original set into N equal-sized subsets. Each subset is used once as the testing set while the remaining samples are used for training, resulting in N prediction performance measurements."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"purities = linspace(0.1, 1.0, 10);\n",
"accuracies = zeros(length(purities));\n",
"[accuracies[i] = mean(nfoldCV_tree(labels, features, purities[i], 10)) for i in 1:length(purities)]"
],
"language": "python",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"Fold 1"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 4 0 0\n",
" 5 0 0\n",
" 6 0 0\n",
"Accuracy: 0.26666666666666666\n",
"Kappa: 0.0\n",
"\n",
"Fold 2\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 4 0 0\n",
" 5 0 0\n",
" 6 0 0\n",
"Accuracy: 0.26666666666666666\n",
"Kappa: 0.0\n",
"\n",
"Fold 3\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 0 6 0\n",
" 0 4 0\n",
" 0 5 0\n",
"Accuracy: 0.26666666666666666\n",
"Kappa: 0.0\n",
"\n",
"Fold 4\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 0 8 0\n",
" 0 2 0\n",
" 0 5 0\n",
"Accuracy: 0.13333333333333333\n",
"Kappa: 0.0\n",
"\n",
"Fold 5\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 4 0 0\n",
" 6 0 0\n",
" 5 0 0\n",
"Accuracy: 0.26666666666666666\n",
"Kappa: 0.0\n",
"\n",
"Fold 6\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 0 7 0\n",
" 0 4 0\n",
" 0 4 0\n",
"Accuracy: 0.26666666666666666\n",
"Kappa: 0.0\n",
"\n",
"Fold "
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"7\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 0 6 0\n",
" 0 3 0\n",
" 0 6 0\n",
"Accuracy: 0.2\n",
"Kappa: 0.0\n",
"\n",
"Fold 8\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 4 0 0\n",
" 7 0 0\n",
" 4 0 0\n",
"Accuracy: 0.26666666666666666\n",
"Kappa: 0.0\n",
"\n",
"Fold 9\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 3 0 0\n",
" 7 0 0\n",
" 5 0 0\n",
"Accuracy: 0.2\n",
"Kappa: 0.0\n",
"\n",
"Fold 10\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 4 0 0\n",
" 7 0 0\n",
" 4 0 0\n",
"Accuracy: 0.26666666666666666\n",
"Kappa: 0.0\n",
"\n",
"Mean Accuracy: 0.24\n",
"\n",
"Fold 1\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 0 0 7\n",
" 0 0 6\n",
" 0 0 2\n",
"Accuracy: 0.13333333333333333\n",
"Kappa: 0.0\n",
"\n",
"Fold 2\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 0 0 8\n",
" 0 0 4\n",
" 0 0 3\n",
"Accuracy: 0.2\n",
"Kappa: 0.0\n",
"\n",
"Fold 3\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 0 0 5\n",
" 0 0 8\n",
" 0 0 2\n",
"Accuracy: 0.13333333333333333\n",
"Kappa: 0.0\n",
"\n",
"Fold 4\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 3 0 0\n",
" 6 0 0\n",
" 6 0 0\n",
"Accuracy: 0.2\n",
"Kappa: 0.0\n",
"\n",
"Fold 5\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 0 0 7\n",
" 0 0 5\n",
" 0 0 3\n",
"Accuracy: 0.2\n",
"Kappa: 0.0\n",
"\n",
"Fold 6\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 2 0 0\n",
" 5 0 0\n",
" 8 0 0\n",
"Accuracy: 0.13333333333333333\n",
"Kappa: 0.0\n",
"\n",
"Fold 7\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 0 0 7\n",
" 0 0 5\n",
" 0 0 3\n",
"Accuracy: 0.2\n",
"Kappa: 0.0\n",
"\n",
"Fold 8\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 0 3 0\n",
" 0 3 0\n",
" 0 9 0\n",
"Accuracy: 0.2\n",
"Kappa: 0.0\n",
"\n",
"Fold 9\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 0 4 0\n",
" 0 3 0\n",
" 0 8 0\n",
"Accuracy: 0.2\n",
"Kappa: 0.0\n",
"\n",
"Fold 10\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 4 0 0\n",
" 5 0 0\n",
" 6 0 0\n",
"Accuracy: 0.26666666666666666\n",
"Kappa: 0.0\n",
"\n",
"Mean Accuracy: 0.18666666666666665\n",
"\n",
"Fold 1\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 0 0 6\n",
" 0 0 7\n",
" 0 0 2\n",
"Accuracy: 0.13333333333333333\n",
"Kappa: 0.0\n",
"\n",
"Fold 2\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 0 0 9\n",
" 0 0 4\n",
" 0 0 2\n",
"Accuracy: 0.13333333333333333\n",
"Kappa: 0.0\n",
"\n",
"Fold 3\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 0 5 0\n",
" 0 4 0\n",
" 0 6 0\n",
"Accuracy: 0.26666666666666666\n",
"Kappa: 0.0\n",
"\n",
"Fold 4\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 2 0 0\n",
" 4 0 0\n",
" 9 0 0\n",
"Accuracy: 0.13333333333333333\n",
"Kappa: 0.0\n",
"\n",
"Fold 5\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 0 0 5\n",
" 0 0 8\n",
" 0 0 2\n",
"Accuracy: 0.13333333333333333\n",
"Kappa: 0.0\n",
"\n",
"Fold 6\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 0 5 0\n",
" 0 3 0\n",
" 0 7 0\n",
"Accuracy: 0.2\n",
"Kappa: 0.0\n",
"\n",
"Fold 7\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 0 0 5\n",
" 0 0 7\n",
" 0 0 3\n",
"Accuracy: 0.2\n",
"Kappa: 0.0\n",
"\n",
"Fold 8\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 4 0 0\n",
" 6 0 0\n",
" 5 0 0\n",
"Accuracy: 0.26666666666666666\n",
"Kappa: 0.0\n",
"\n",
"Fold 9\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 0 5 0\n",
" 0 4 0\n",
" 0 6 0\n",
"Accuracy: 0.26666666666666666\n",
"Kappa: 0.0\n",
"\n",
"Fold 10\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 0 4 0\n",
" 0 3 0\n",
" 0 8 0\n",
"Accuracy: 0.2\n",
"Kappa: 0.0\n",
"\n",
"Mean Accuracy: 0.1933333333333333\n",
"\n",
"Fold 1\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 8 0 0\n",
" 0 2 0\n",
" 0 5 0\n",
"Accuracy: 0.6666666666666666\n",
"Kappa: 0.48979591836734687\n",
"\n",
"Fold 2\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 3 0 0\n",
" 0 6 0\n",
" 0 6 0\n",
"Accuracy: 0.6\n",
"Kappa: 0.375\n",
"\n",
"Fold 3\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 6 0 0\n",
" 0 0 5\n",
" 0 0 4\n",
"Accuracy: 0.6666666666666666\n",
"Kappa: 0.5098039215686274\n",
"\n",
"Fold 4\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 4 0 0\n",
" 0 0 8\n",
" 0 0 3\n",
"Accuracy: 0.4666666666666667\n",
"Kappa: 0.3181818181818182\n",
"\n",
"Fold 5\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 8 0 0\n",
" 0 0 4\n",
" 0 0 3\n",
"Accuracy: 0.7333333333333333\n",
"Kappa: 0.5714285714285714\n",
"\n",
"Fold 6\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 6 0 0\n",
" 0 3 0\n",
" 0 6 0\n",
"Accuracy: 0.6\n",
"Kappa: 0.4444444444444444\n",
"\n",
"Fold 7\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 3 0 0\n",
" 0 6 0\n",
" 0 6 0\n",
"Accuracy: 0.6\n",
"Kappa: 0.375\n",
"\n",
"Fold 8\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 2 0 0\n",
" 1 0 6\n",
" 0 0 6\n",
"Accuracy: 0.5333333333333333\n",
"Kappa: 0.2857142857142857\n",
"\n",
"Fold 9\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 4 0 0\n",
" 0 0 7\n",
" 0 0 4\n",
"Accuracy: 0.5333333333333333\n",
"Kappa: 0.3636363636363636\n",
"\n",
"Fold 10\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 6 0 0\n",
" 0 2 0\n",
" 0 7 0\n",
"Accuracy: 0.5333333333333333\n",
"Kappa: 0.38596491228070173\n",
"\n",
"Mean Accuracy: 0.5933333333333333\n",
"\n",
"Fold 1\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 6 0 0\n",
" 0 0 5\n",
" 0 0 4\n",
"Accuracy: 0.6666666666666666\n",
"Kappa: 0.5098039215686274\n",
"\n",
"Fold "
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"2\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 6 0 0\n",
" 0 0 6\n",
" 0 0 3\n",
"Accuracy: 0.6\n",
"Kappa: 0.4444444444444444\n",
"\n",
"Fold 3\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 1 0 0\n",
" 0 5 0\n",
" 0 9 0\n",
"Accuracy: 0.4\n",
"Kappa: 0.12337662337662345\n",
"\n",
"Fold 4\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 4 0 0\n",
" 0 4 0\n",
" 0 7 0\n",
"Accuracy: 0.5333333333333333\n",
"Kappa: 0.3636363636363636\n",
"\n",
"Fold 5\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 7 0 0\n",
" 0 4 0\n",
" 0 4 0\n",
"Accuracy: 0.7333333333333333\n",
"Kappa: 0.5833333333333333\n",
"\n",
"Fold 6\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 6 0 0\n",
" 0 4 0\n",
" 0 5 0\n",
"Accuracy: 0.6666666666666666\n",
"Kappa: 0.5098039215686274\n",
"\n",
"Fold 7\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 4 0 0\n",
" 0 4 0\n",
" 0 7 0\n",
"Accuracy: 0.5333333333333333\n",
"Kappa: 0.3636363636363636\n",
"\n",
"Fold 8\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 4 0 0\n",
" 0 0 6\n",
" 0 0 5\n",
"Accuracy: 0.6\n",
"Kappa: 0.4155844155844156\n",
"\n",
"Fold 9\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 7 0 0\n",
" 0 0 5\n",
" 0 0 3\n",
"Accuracy: 0.6666666666666666\n",
"Kappa: 0.506578947368421\n",
"\n",
"Fold 10\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 5 0 0\n",
" 1 0 6\n",
" 0 0 3\n",
"Accuracy: 0.5333333333333333\n",
"Kappa: 0.375\n",
"\n",
"Mean Accuracy: 0.5933333333333333\n",
"\n",
"Fold 1\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 5 0 0\n",
" 0 5 0\n",
" 0 1 4\n",
"Accuracy: 0.9333333333333333\n",
"Kappa: 0.9\n",
"\n",
"Fold 2\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 4 0 0\n",
" 0 5 0\n",
" 0 0 6\n",
"Accuracy: 1.0\n",
"Kappa: 1.0\n",
"\n",
"Fold 3\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 6 0 0\n",
" 0 4 1\n",
" 0 0 4\n",
"Accuracy: 0.9333333333333333\n",
"Kappa: 0.8993288590604027\n",
"\n",
"Fold 4\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 3 0 0\n",
" 0 4 0\n",
" 0 0 8\n",
"Accuracy: 1.0\n",
"Kappa: 1.0\n",
"\n",
"Fold 5\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 6 0 0\n",
" 0 2 0\n",
" 0 0 7\n",
"Accuracy: 1.0\n",
"Kappa: 1.0\n",
"\n",
"Fold 6\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 4 0 0\n",
" 0 3 0\n",
" 0 0 8\n",
"Accuracy: 1.0\n",
"Kappa: 1.0\n",
"\n",
"Fold 7\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 3 0 0\n",
" 0 8 0\n",
" 0 2 2\n",
"Accuracy: 0.8666666666666667\n",
"Kappa: 0.765625\n",
"\n",
"Fold 8\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 7 0 0\n",
" 0 5 0\n",
" 0 1 2\n",
"Accuracy: 0.9333333333333333\n",
"Kappa: 0.8928571428571429\n",
"\n",
"Fold 9\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 6 0 0\n",
" 1 5 0\n",
" 0 0 3\n",
"Accuracy: 0.9333333333333333\n",
"Kappa: 0.8958333333333334\n",
"\n",
"Fold 10\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 6 0 0\n",
" 0 6 1\n",
" 0 0 2\n",
"Accuracy: 0.9333333333333333\n",
"Kappa: 0.8936170212765958\n",
"\n",
"Mean Accuracy: 0.9533333333333335\n",
"\n",
"Fold 1\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 5 0 0\n",
" 0 2 0\n",
" 0 0 8\n",
"Accuracy: 1.0\n",
"Kappa: 1.0\n",
"\n",
"Fold 2\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 5 0 0\n",
" 0 5 1\n",
" 0 0 4\n",
"Accuracy: 0.9333333333333333\n",
"Kappa: 0.9\n",
"\n",
"Fold 3\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 4 0 0\n",
" 0 6 0\n",
" 0 0 5\n",
"Accuracy: 1.0\n",
"Kappa: 1.0\n",
"\n",
"Fold 4\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 7 0 0\n",
" 0 4 0\n",
" 0 0 4\n",
"Accuracy: 1.0\n",
"Kappa: 1.0\n",
"\n",
"Fold 5\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 6 0 0\n",
" 0 2 1\n",
" 0 0 6\n",
"Accuracy: 0.9333333333333333\n",
"Kappa: 0.8936170212765958\n",
"\n",
"Fold 6\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 4 0 0\n",
" 1 7 0\n",
" 0 2 1\n",
"Accuracy: 0.8\n",
"Kappa: 0.653846153846154\n",
"\n",
"Fold "
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"7\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 7 0 0\n",
" 0 3 0\n",
" 0 0 5\n",
"Accuracy: 1.0\n",
"Kappa: 1.0\n",
"\n",
"Fold 8\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 4 0 0\n",
" 0 7 0\n",
" 0 0 4\n",
"Accuracy: 1.0\n",
"Kappa: 1.0\n",
"\n",
"Fold 9\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 5 0 0\n",
" 0 3 1\n",
" 0 2 4\n",
"Accuracy: 0.8\n",
"Kappa: 0.7000000000000001\n",
"\n",
"Fold 10\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 3 0 0\n",
" 0 7 0\n",
" 0 0 5\n",
"Accuracy: 1.0\n",
"Kappa: 1.0\n",
"\n",
"Mean Accuracy: 0.9466666666666667\n",
"\n",
"Fold 1\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 6 0 0\n",
" 0 6 0\n",
" 0 0 3\n",
"Accuracy: 1.0\n",
"Kappa: 1.0\n",
"\n",
"Fold 2\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 6 0 0\n",
" 1 2 0\n",
" 0 1 5\n",
"Accuracy: 0.8666666666666667\n",
"Kappa: 0.7916666666666667\n",
"\n",
"Fold 3\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 6 0 0\n",
" 0 3 0\n",
" 0 0 6\n",
"Accuracy: 1.0\n",
"Kappa: 1.0\n",
"\n",
"Fold 4\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 3 0 0\n",
" 0 5 0\n",
" 0 0 7\n",
"Accuracy: 1.0\n",
"Kappa: 1.0\n",
"\n",
"Fold 5\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 3 0 0\n",
" 0 3 0\n",
" 0 2 7\n",
"Accuracy: 0.8666666666666667\n",
"Kappa: 0.7826086956521741\n",
"\n",
"Fold 6\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 3 0 0\n",
" 0 7 0\n",
" 0 1 4\n",
"Accuracy: 0.9333333333333333\n",
"Kappa: 0.8928571428571429\n",
"\n",
"Fold 7\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 3 0 0\n",
" 0 8 1\n",
" 0 1 2\n",
"Accuracy: 0.8666666666666667\n",
"Kappa: 0.7619047619047619\n",
"\n",
"Fold 8\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 6 0 0\n",
" 0 3 1\n",
" 0 1 4\n",
"Accuracy: 0.8666666666666667\n",
"Kappa: 0.7972972972972974\n",
"\n",
"Fold 9\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 8 0 0\n",
" 0 3 1\n",
" 0 0 3\n",
"Accuracy: 0.9333333333333333\n",
"Kappa: 0.8905109489051095\n",
"\n",
"Fold 10\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 6 0 0\n",
" 0 5 1\n",
" 0 0 3\n",
"Accuracy: 0.9333333333333333\n",
"Kappa: 0.8979591836734694\n",
"\n",
"Mean Accuracy: 0.9266666666666667\n",
"\n",
"Fold 1\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 6 0 0\n",
" 0 1 2\n",
" 0 1 5\n",
"Accuracy: 0.8\n",
"Kappa: 0.6808510638297872\n",
"\n",
"Fold 2\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 4 0 0\n",
" 0 3 1\n",
" 0 0 7\n",
"Accuracy: 0.9333333333333333\n",
"Kappa: 0.8936170212765958\n",
"\n",
"Fold 3\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 6 0 0\n",
" 1 3 0\n",
" 0 0 5\n",
"Accuracy: 0.9333333333333333\n",
"Kappa: 0.8972602739726028\n",
"\n",
"Fold 4\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 7 0 0\n",
" 0 5 0\n",
" 0 0 3\n",
"Accuracy: 1.0\n",
"Kappa: 1.0\n",
"\n",
"Fold 5\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 5 0 0\n",
" 0 5 0\n",
" 0 1 4\n",
"Accuracy: 0.9333333333333333\n",
"Kappa: 0.9\n",
"\n",
"Fold 6\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 6 0 0\n",
" 0 4 1\n",
" 0 0 4\n",
"Accuracy: 0.9333333333333333\n",
"Kappa: 0.8993288590604027\n",
"\n",
"Fold 7\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 4 0 0\n",
" 0 9 0\n",
" 0 1 1\n",
"Accuracy: 0.9333333333333333\n",
"Kappa: 0.8717948717948718\n",
"\n",
"Fold 8\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 4 0 0\n",
" 0 5 0\n",
" 0 0 6\n",
"Accuracy: 1.0\n",
"Kappa: 1.0\n",
"\n",
"Fold 9\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 2 0 0\n",
" 0 4 1\n",
" 0 0 8\n",
"Accuracy: 0.9333333333333333\n",
"Kappa: 0.8837209302325582\n",
"\n",
"Fold 10\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 6 0 0\n",
" 0 5 0\n",
" 0 1 3\n",
"Accuracy: 0.9333333333333333\n",
"Kappa: 0.8979591836734694\n",
"\n",
"Mean Accuracy: 0.9333333333333333\n",
"\n",
"Fold 1\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 4 0 0\n",
" 0 3 0\n",
" 0 1 7\n",
"Accuracy: 0.9333333333333333\n",
"Kappa: 0.8936170212765958\n",
"\n",
"Fold 2\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 7 0 0\n",
" 0 4 0\n",
" 0 0 4\n",
"Accuracy: 1.0\n",
"Kappa: 1.0\n",
"\n",
"Fold 3\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 5 0 0\n",
" 0 6 0\n",
" 0 0 4\n",
"Accuracy: 1.0\n",
"Kappa: 1.0\n",
"\n",
"Fold 4\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 2 0 0\n",
" 1 5 1\n",
" 0 0 6\n",
"Accuracy: 0.8666666666666667\n",
"Kappa: 0.7887323943661971\n",
"\n",
"Fold 5\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 6 0 0\n",
" 0 4 1\n",
" 0 1 3\n",
"Accuracy: 0.8666666666666667\n",
"Kappa: 0.7972972972972974\n",
"\n",
"Fold 6\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 6 0 0\n",
" 0 4 0\n",
" 0 1 4\n",
"Accuracy: 0.9333333333333333\n",
"Kappa: 0.8993288590604027\n",
"\n",
"Fold 7\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 6 0 0\n",
" 0 2 1\n",
" 0 1 5\n",
"Accuracy: 0.8666666666666667\n",
"Kappa: 0.7916666666666667\n",
"\n",
"Fold 8\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 6 0 0\n",
" 0 5 2\n",
" 0 0 2\n",
"Accuracy: 0.8666666666666667\n",
"Kappa: 0.7945205479452055\n",
"\n",
"Fold 9\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 4 0 0\n",
" 0 3 0\n",
" 0 0 8\n",
"Accuracy: 1.0\n",
"Kappa: 1.0\n",
"\n",
"Fold 10\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 4 0 0\n",
" 0 8 0\n",
" 0 1 2\n",
"Accuracy: 0.9333333333333333\n",
"Kappa: 0.8854961832061069\n",
"\n",
"Mean Accuracy: 0.9266666666666667\n"
]
},
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 46,
"text": [
"10-element Array{Float64,1}:\n",
" 0.24 \n",
" 0.186667\n",
" 0.193333\n",
" 0.593333\n",
" 0.593333\n",
" 0.953333\n",
" 0.946667\n",
" 0.926667\n",
" 0.933333\n",
" 0.926667"
]
}
],
"prompt_number": 46
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"using PyPlot\n",
"plot(purities, accuracies, \"b-o\")\n",
"xlabel(\"Purity Threshold\"); ylabel(\"Accuracy\")"
],
"language": "python",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"metadata": {},
"output_type": "display_data",
"png": "iVBORw0KGgoAAAANSUhEUgAAAr8AAAImCAYAAABacOJlAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAAPYQAAD2EBqD+naQAAIABJREFUeJzs3Xl41OW9/vE7C0uAQCSGLQbBUCRIgoCA4gKCkbDKIsSI7RHp75ziGrXQBaWoaBFPKXZRe3koGCcMEWYCiIAIgkJlp1oIFRCEoEGBhC2AQDLz+2NKJCSBTDIzzyzv13Xlon6/s9wzjeH2yWeeb5jT6XQKAAAACAHhpgMAAAAAvkL5BQAAQMig/AIAACBkUH4BAAAQMii/AAAACBmUXwAAAIQMyi8AAABCBuUXAAAAIYPyCwAAgJDhF+W3uLhYEydO1L333qu4uDiFh4frhRdeqPb9Dx8+rIcfflhxcXFq2LChevXqpY8//tiLiQEAABCI/KL8Hj16VG+//bYuXLig4cOHS5LCwsKqdd9z586pX79+Wr16tf70pz9p8eLFat68udLS0vTpp596MzYAAAACTKTpAJLUpk0bHTt2TJJUWFio//u//6v2fWfNmqW8vDytX79ePXv2lCT16dNHnTt31sSJE7VhwwavZAYAAEDg8YuV30s5nU63bp+bm6sOHTqUFV9JioiI0EMPPaRNmzbp0KFDno4IAACAAOV35dddO3bsUEpKSoXjycnJkqS8vDxfRwIAAICfCvjyW1RUpKZNm1Y4fvFYYWGhryMBAADATwV8+QUAAACqyy8+8FYbsbGxKioqqnD84rHY2NhK73fo0CHmgQEAAPxYy5Yt1bJlS48+ZsCX3+TkZP3rX/+qcHz79u2SpE6dOlU4d+jQId1yyy0qKCjwej4AAADUTKtWrbRlyxaPFuCAL7/Dhw/Xo48+qk2bNqlHjx6SpJKSElksFt16661q0aJFhfscOnRIBQUFslgsSkpK8nVkv5WZmamZM2eajuFXeE8q4j0pj/ejIt6TinhPKuI9qYj3pLx///vfeuihh3To0KHgLL/Lli3T6dOnderUKUmuXRoWLFggSRo0aJCioqI0btw4ZWVlad++fUpISJAkPfLII/rrX/+qUaNGadq0aYqLi9Mbb7yhPXv2aOXKlVd8zqSkJHXt2tW7LyyAxMTE8H5chvekIt6T8ng/KuI9qYj3pCLek4p4T3zDb8rvo48+qgMHDkhyXd1t/vz5mj9/vsLCwvT111+rdevWcjgccjgc5fYCrlu3rlatWqWJEyfqiSee0JkzZ9SlSxctW7ZMd955p6mXAwAAAD/kN+X366+/vuptZs+erdmzZ1c43qxZM82ZM8cLqQAAABBM2OoMAAAAIYPyizIZGRmmI/gd3pOKeE/K4/2oiPekIt6TinhPKuI98Y0w56UDtCFi27Zt6tatm7Zu3cpgOQAAgB/yVl9j5RcAAAAhg/ILAACAkEH5BQAAQMig/AIAACBkUH4BAAAQMii/AAAACBmUXwAAAIQMyi8AAABCBuUXAAAAIYPyCwAAgJBB+QUAAEDIoPwCAAAgZESaDgAACHxHjhzRxInTtWnTTpWURCgyslQ9enTU9OkTFRcXZzoeAJSh/AIAauXw4cPq1esB7d37iqTpksIkObRz5yatXZuu9etzKMAA/AZjDwCAWvnVr177T/G9Va7iK7n+erlVe/e+rIkTp5sLBwCXofwCAGpl06adknpWcban/vGPnSoulpxOX6YCgMox9gAAqJWSkgj9uOJ7uXDt2ROh6GgpLEyKji7/1bhxzY9F+vnfYMxBA/7Jz390AAD8XWRkqSSnKi/ADiUklOrVV6VTp6STJ11/Xvp18qR09Gj5fz51Sjp37srPGxXluSJdv76rnHsKc9CA/6L8AgBqpUePjtq5c6NcM7+X26h+/ToqI8P9x71woWIhvrw0V/bP338vffVV+WPFxVd+rshIzxXpRo0un4O+qPwc9OzZr7n/pgCoNcovAKBWpk+fqLVr07V378tyzf6GS3JI2qjExEmaPj2nRo9bp47U
tKnrq7YcDun06SuX5qqOFRRUPFZScuXnCwvbKdeKb2V66h//mCqn07OrzQCqh/ILAKiVuLg4ffZZjtq3ny6nc6patLh0vtU/fr0fHv7jymyrVrV7LKfTNZJxpdI8aVKEjhy58hx048bSjTdKHTr8+GeHDtJPfuIawwDgHZRfAECtffttnE6ceE1Ll0oDBphO411hYa5yWr++1KxZ5beZObNUR45ceQ76iSekXbukL7+Uli+XCgt/fPw2bcqX4ot/Nm/OajFQW5RfAECtWSxSXJyUmmo6iX+ozhz0hAnljx49+mMZvvjnBx9If/6zVFrquk2TJuVXiS/+73btpLp1vf2qgOBA+QUA1EppqTR3rvTAA/6//Ziv1GQO+tprXV+3317++Pnz0t69Pxbii+V48WLp+HHXbSIipLZtK5biDh1cjwn/xZZ4vsePKQBArXz8sfTdd9JDD5lO4j/i4uK0fn3Of0rN1MtKjXtz0HXrSklJrq9LOZ3S4cMVS7HdLu3f7/qQn+T6wODlhfjGG6UbbnB9qBDmsCWeGZRfAECtWCyuD2l17246iX+Ji4vz6nZmYWGuGeDmzaW77ip/7ocfXNu9XTpCsX27tGCB6wN5kmuVvl27yj90d801XouNS7AlnhmUXwBAjZ0+7VppnDCBD2L5k/r1pU6dXF+XcjqlQ4cqrhbPmycdOPDj7eLiKh+haNPGNWKBq3M6f9xer6pt9ZYuvfKWeMuXT5XFUvX+0g0bunYygXsovwCAGlu82HUBiTFjTCdBdYSFubZ6a9VKuvvu8ufOnJH27ClfirdskbKzXeck1wjGT35S+RhF48ZXf35/n28tKXFvD+grHTt1ylWAqxIRITmdV740+HffReinP71y5kaNandhlkv/2V/GYC5+n3zyyXqvPD7lFwBQY9nZ0m23SYmJppOgtho0kDp3dn1dyuGQvv22/AjFl19KWVnSN9/8eLuWLSvfiaJ1a9fqpDfmW51O14iHpwrr2bNXfr769SsvjXFxrhnqKxXLy4+5VudLtXNn1VvidexYqq1ba3alw4MHKx672uurV6/2Vzi8eCwqqma/DSr/ffKApFvcf5CroPwCAGrkyBHX/rR/+pPpJPCm8HApIcH1dflWdsXF0u7d5VeLP/tMmjPHVUolV8lr3146efI17d9f9Xxrevp0jR37mtuF9UpX2wsL+3Fl9PJydv317q2ONmrk+ZXRq22J16NHx7I9pT2xMF5S4vr/zN0iXVjo+hDl5be50sr2pReWcac4v/XWpXPQ22r/oitB+QUA1EhOjqtcjB5tOglMadRI6trV9XUph0PKzy9firOyrjzfunr1VK1e7fogXmXls3FjKT7evULl7zOx3ro0eFUiI6WYGNdXbTmdrnGYmlwy/PDhisfOn7/4yFf6PvEMyi8AoEYsFiktjX1kUVF4uOvDcW3auL5HJOnjjyO0e3fV862JiRHascP1q/dQ+fCkJ7fE87WwMNd/XDRsKLVoUfvHO3/eVYK7d4/Q11979xuA8gsAcNuePdLGja5dAoDqiIwslVT1fGu9eqWqX9/HofyAt7fECxR160qxsVJU1JW+TzzDj38ZAADwV9nZrl8rDxliOgkCRY8eHSVtrOLsxv+cR6i78veJZ1B+AQBucTpd5XfkSNcOAUB1TJ8+UYmJv5W0Xq65Vv3nz/X/mW+daC4c/Ebl3yeeRfkFALhl0ybX1cO4nDHccXG+9eGH7erYcYjatx+qjh2H6OGH7VzGF2Uu/T5p2zbTK8/BzC8AwC0Wi2tP1z59TCdBoGG+FdVx8ftk27Zt6tatm8cf3y9WfouLi5WZman4+HhFRUWpS5cuysmp3vYeH374oW6//XY1aNBAMTExGjp0qHbu3OnlxAAQmi5ccH3I7cEHucwtgMDkF+V3xIgRysrK0pQpU7R8+XJ1795dGRkZslqtV7zfokWLNGDAALVo0UJ2u11vvfWW9uzZozvvvFP79u3zUXoACB0rVkhH
jzLyACBwGR97WLp0qVauXCmr1ar09HRJUu/evXXgwAFNmDBB6enpCq9ih+pf/epXuvnmm2Wz2cqO9erVS+3bt9fkyZNlsVh88hoAIFRYLNJNN1W8BC4ABArjK7+5ubmKjo7WqFGjyh0fO3asCgoKtHFj5dtdFBYWavfu3Uq7uHv2f7Ru3Vo33XSTFi5cKOeVrrsHAHDLqVPSokWuVd9QuQgBgOBjvPzu2LFDSUlJFVZ3k5OTJUl5eXmV3u/8f66DV69evQrn6tWrpzNnzmjv3r0eTgsAoSs3Vzp71jXvCwCBynj5LSwsVNOmTSscv3issLCw0vs1b95cTZs21bp168odP378uHbs2KGwsLAq7wsAcJ/FIvXuLbVubToJANSc8fJbU+Hh4Xrssce0atUqvfzyyzp8+LC++uorPfTQQzp79qycTmeVs8IAAPcUFEirVkljxphOAgC1Y7wdxsbGVrpCW1RUVHa+KpMnT9bTTz+tl156SS1atFD79u0VHh6usWPHSpLi4+O9ExoAQsy8eVJkpHT//aaTAEDtGN/tISUlRVarVQ6Ho9xK7fbt2yVJnTp1qvK+ERER+sMf/qCXXnpJX3/9ta699lo1b95c/fv31w033KBWrVpd8bkzMzMVExNT7lhGRoYyMjJq8YoAIPhYLNLgwdI115hOAiAYWa3WClvcHj9+3CvPFeY0vCXC8uXLNXDgQM2bN0+jR48uO56Wlqa8vDzl5+crzI2PFW/btk09e/bUjBkz9MQTT1R5m27dumnr1q3q2rVrrV8DAASzvDypUyfJbpeGDzedBkCo8FZfM77ym5aWptTUVI0fP14nT55UYmKirFarVqxYoezs7LLiO27cOGVlZWnfvn1KSEiQJH3yySfauHGjOnfuLKfTqU2bNmn69OkaMGCAHn/8cZMvCwCCRna2FBMjDRxoOgkA1J7x8itJdrtdkyZN0uTJk1VUVKSkpKQKK8EOh0MOh6Pc3r1169bVwoUL9corr+jcuXNq3769XnrpJT355JNurRYDACrncEhz50qjR0uV7CwJAAHH+NiDCYw9AED1rF0r3XWX9Omn0p13mk4DIJR4q68Z3+0BAOC/LBbXvr633246CQB4BuUXAFCpc+ek995z7e3LtukAggU/zgAAlVq6VDp+XHroIdNJAMBzKL8AgEpZLFKXLlLHjqaTAIDnUH4BABUcPy4tWcKqL4DgQ/kFAFSwYIFUUiI98IDpJADgWZRfAEAFFovUr590lavEA0DAofwCAMrJz5c++cS1ywMABBvKLwCgnLlzpagoafhw00kAwPMovwCAMk6n9O670n33SY0bm04DAJ5H+QUAlPniC2nnTnZ5ABC8KL8AgDLZ2dK110r33ms6CQB4B+UXACBJKi11zfs+8IBUp47pNADgHZRfAIAkac0aqaCAXR4ABDfKLwBAkmtv38REqWdP00kAwHsovwAAnTkj2WyuD7qFhZlOAwDeQ/kFAOj996VTpxh5ABD8KL8AAFksrnGHn/zEdBIA8C7KLwCEuKNHpeXL2dsXQGig/AJAiHvvPdeV3dLTTScBAO+j/AJAiLNYpP79pbg400kAwPsiTQcAAJizd6+0fr3r4hYAEApY+QWAEJadLTVqJN13n+kkAOAblF8ACFFOp2vkYcQIqUED02kAwDcovwAQorZskfbsYZcHAKGF8gsAIcpikVq0kPr2NZ0EAHyH8gsAIejCBclqlR58UIqIMJ0GAHyH8gsAIWjlSunIES5nDCD0UH4BIARZLFJSktSli+kkAOBblF8ACDGnTkm5ua4PuoWFmU4DAL5F+QWAELNwoXT2rGveFwBCDeUXAEJMdrZ0551SmzamkwCA71F+ASCEfPed9NFH7O0LIHRRfgEghMybJ0VGSqNGmU4CAGZQfgEghFgs0sCB0jXXmE4CAGZQfgEgRPz739LWrYw8AAhtlF8ACBHZ2VKTJtKgQaaTAIA5lF8ACAFOp6v8jhol1a9vOg0A
mEP5BYAQ8Nln0v79jDwAAOUXAEKAxSIlJLj29wWAUEb5BYAgd/68lJPjuqJbOD/1AYQ4v/gxWFxcrMzMTMXHxysqKkpdunRRTk5Ote67cuVK9evXT82aNVN0dLQ6d+6sP//5z3I4HF5ODQCBYdky6dgxRh4AQPKT8jtixAhlZWVpypQpWr58ubp3766MjAxZrdYr3m/58uW69957JUmzZs3SokWL1KdPHz311FN65plnfBEdAPyexSJ17ix16mQ6CQCYF2k6wNKlS7Vy5UpZrValp6dLknr37q0DBw5owoQJSk9PV3gVv6d79913Vb9+fS1ZskRRUVGSpL59+2rXrl2aM2eOZs6c6bPXAQD+6Phx6f33palTTScBAP9gfOU3NzdX0dHRGnXZtTbHjh2rgoICbdy4scr7RkVFqU6dOqp/2b49TZo0KSvDABDKbDbXzG9GhukkAOAfjJffHTt2KCkpqcLqbnJysiQpLy+vyvs+9thjcjgcevLJJ3Xo0CEdP35cWVlZWrhwoX71q195NTcABILsbKlvXyk+3nQSAPAPxsceCgsL1a5duwrHmzZtWna+Kl26dNGyZct0//33669//askKSIiQtOmTVNmZqZ3AgNAgDh4UFqzRvr7300nAQD/Ybz81sa6des0aNAg3X333frv//5vNWzYUKtWrdKkSZN09uxZPffcc6YjAoAxVqtUr540YoTpJADgP4yX39jY2EpXd4uKisrOV+Wpp55S27ZtlZubq7CwMEmuD8uFh4drypQpGjNmjNq2bVvl/TMzMxUTE1PuWEZGhjIYjgMQBCwWaehQqXFj00kA4MqsVmuFXb6OHz/ulecyXn5TUlJktVrlcDjKzf1u375dktTpCnvz5OXlacyYMWXF96JbbrlFDodDX3755RXL78yZM9W1a9davgIA8D//+pe0fbv08sumkwDA1VW2+Lht2zZ169bN489l/ANvw4cPV3FxsRYsWFDu+Jw5cxQfH6+ePXtWed+EhARt3ry5wgUt1q9fL0m67rrrPB8YAAKAxSLFxkr9+5tOAgD+xfjKb1pamlJTUzV+/HidPHlSiYmJslqtWrFihbKzs8tWdceNG6esrCzt27dPCQkJkqRnn31Wjz76qIYMGaL/+Z//UVRUlFatWqUZM2YoNTW1bMcIAAglDoc0d66Uni7VrWs6DQD4F+PlV5LsdrsmTZqkyZMnq6ioSElJSZo3b55Gjx5ddhuHwyGHwyGn01l27Be/+IVatWqlP/zhD/p//+//6cyZM2rbtq2mTJmip59+2sRLAQDjPvlE+vZbLmcMAJUJc17aJkPExRmSrVu3MvMLIOiMG+fa4uyrr6TLPhIBAAHDW33N+MwvAMBzzp6VFiyQxoyh+AJAZSi/ABBEliyRTp50lV8AQEWUXwAIIhaL1L27dOONppMAgH+i/AJAkCgslJYu5YNuAHAllF8ACBLz50tOp2uLMwBA5Si/ABAkLBbp3nul5s1NJwEA/0X5BYAgsG+f9I9/8EE3ALgayi8ABIG5c6WGDaVhw0wnAQD/RvkFgADndLpGHoYPdxVgAEDVKL8AEOC2bpV27WKXBwCoDsovAAQ4i8X1Ibd+/UwnAQD/R/kFgABWUiLNmydlZEiRkabTAID/o/wCQABbtUr6/ntGHgCguii/ABDALBbXpYy7djWdBAACA+UXAAJUcbFkt7tWfcPCTKcBgMBA+QWAALVokXTmjPTgg6aTAEDgoPwCQICyWKTbb5duuMF0EgAIHJRfAAhA338vffQRH3QDAHdRfgEgAOXkSOHh0qhRppMAQGCh/AJAALJYpIEDpdhY00kAILCwJToABJhdu6TNm6X33jOdBAACDyu/ABBgsrOlxo2lwYNNJwGAwEP5BYAA4nS6Rh7uv1+KijKdBgACD+UXAALI+vXS11+zywMA1BTlFwACSHa2dN11Uu/eppMAQGCi/AJAgDh/3rXF2YMPurY5AwC4jx+fABAgPvxQKixk
5AEAaoPyCwABwmKRkpNdXwCAmqH8AkAAOHFCWryYVV8AqC3KLwAEALtdOndOysgwnQQAAhvlFwACgMUi9ekjJSSYTgIAgY3yCwB+7ttvpdWrGXkAAE+g/AKAn7Napbp1pZEjTScBgMBH+QUAP2exSEOGSE2amE4CAIGP8gsAfmz7dumLLxh5AABPofwCgB/LzpaaNpUGDDCdBACCA+UXAPyUwyHNnSuNHu2a+QUA1B7lFwD81Nq10sGDjDwAgCdRfgHAT1ksUps2Uq9eppMAQPCg/AKAH/rhB2n+fNeqb1iY6TQAEDz8ovwWFxcrMzNT8fHxioqKUpcuXZSTk3PV+/Xp00fh4eFVfh0+fNgH6QHA8z74QDpxQhozxnQSAAgukaYDSNKIESO0ZcsWvfrqq2rfvr2ys7OVkZEhh8OhjCtcyP7NN9/UqVOnyh07ffq00tLSdMstt6hZs2bejg4AXmGxSN26SR06mE4CAMHFePldunSpVq5cKavVqvT0dElS7969deDAAU2YMEHp6ekKD698gTopKanCsXfeeUcXLlzQz3/+c6/mBgBvKSpyrfxOn246CQAEH+NjD7m5uYqOjtaoUaPKHR87dqwKCgq0ceNGtx5v1qxZio6OLivSABBo5s+XSkulBx4wnQQAgo/x8rtjxw4lJSVVWN1NTk6WJOXl5VX7sXbv3q1169bpgQceUIMGDTyaEwB8JTtbSk2VWrQwnQQAgo/x8ltYWKimTZtWOH7xWGFhYbUf6+9//7skady4cZ4JBwA+tn+/a39f9vYFAO8wXn49paSkRO+8846Sk5PVo0cP03EAoEbmzpUaNJCGDTOdBACCk/HyGxsbW+nqblFRUdn56li6dKm+//57Vn0BBCynU3r3XVfxbdTIdBoACE7Gd3tISUmR1WqVw+EoN/e7fft2SVKnTp2q9TizZs1SvXr19NOf/rTaz52ZmamYmJhyxzIyMq64vRoAeMs//yl9+aU0Y4bpJADgW1arVVartdyx48ePe+W5wpxOp9Mrj1xNy5cv18CBAzVv3jyNHj267HhaWpry8vKUn5+vsKtc3ui7775TQkKC7r///gpvXGW2bdumbt26aevWreratWutXwMAeMIzz7j29y0okCKNL00AgFne6mvGf7ympaUpNTVV48eP18mTJ5WYmCir1aoVK1YoOzu7rPiOGzdOWVlZ2rdvnxISEso9xjvvvKPS0lL29gUQsEpLJatVysig+AKAN/nFj1i73a5JkyZp8uTJKioqUlJSUoWVYIfDIYfDocoWqmfPnq22bduqX79+vowNAB7z8cfSd9+xywMAeJvxsQcTGHsA4G/+67+kDRtcM79XmfQCgJDgrb5mfLcHAAh1p09Ldrtr1ZfiCwDeRfkFAMMWL5aKi6UHHzSdBACCH+UXAAyzWKTbbpMSE00nAYDgR/kFAIMOH5Y+/JAPugGAr1B+AcCgnBzXnO8lm9sAALyI8gsABmVnSwMGSNdeazoJAIQGv9jnFwBC0Z490saNrtVfAIBvsPILAIZkZ0vR0dKQIaaTAEDooPwCgAFOp2uXh5Ejpago02kAIHRQfgHAgI0bpb172eUBAHyN8gsABlgsUqtWUp8+ppMAQGih/AKAj1244PqQ24MPShERptMAQGih/AKAj61YIR09ysgDAJhA+QUAH7NYpE6dpJQU00kAIPRQfgHAh06elBYulMaMcV3ZDQDgW5RfAPCh3Fzphx9c874AAN+j/AKAD1ksUu/eUuvWppMAQGii/AKAjxQUSKtW8UE3ADCJ8gsAPjJvnlSnjnT//aaTAEDoovwCgI9YLNKQIVJMjOkkABC6KL8A4AN5edI//8nIAwCYRvkFAB/Iznat+A4YYDoJAIQ2yi8AeJnD4Sq/o0dL9eqZTgMAoY3yCwBetm6dlJ/PyAMA+APKLwB4mcUiXX+9dPvtppMAACi/AOBF585J8+e7Lmcczk9cADCOH8UA4EVLl0rHj7vKLwDAPMov
AHiRxSJ17Sp17Gg6CQBAovwCgNccOyYtWcKqLwD4E8ovAHjJggVSSYn0wAOmkwAALqL8AoCXWCxSv35Sq1amkwAALqL8AoAX5OdLn37K3r4A4G8ovwDgBXPnSlFR0vDhppMAAC5F+QUAD3M6pXfflYYNk6KjTacBAFyK8gsAHvbFF9LOnezyAAD+iPILAB5msUjXXivde6/pJACAy1F+AcCDSktd874PPCDVqWM6DQDgcpRfAPCg1aulQ4fY5QEA/JXb5XfBggVyOBzeyAIAAS87W2rXTurRw3QSAEBl3C6/o0eP1vXXX6+pU6fq8OHD3sgEAAHpzBnJZnOt+oaFmU4DAKiM2+V3zZo1uu222/Tiiy+qdevWeuihh7R+/fpahSguLlZmZqbi4+MVFRWlLl26KCcnp9r3X7RokXr37q0mTZqoUaNG6tSpk95+++1aZQIAd73/vnTqFLs8AIA/c7v83nXXXXrvvfe0f/9+TZw4UatWrdLtt9+uW265RbNnz9a5c+fcDjFixAhlZWVpypQpWr58ubp3766MjAxZrdar3nfatGkaOXKkUlJSNH/+fL3//vt69NFHdeHCBbdzAEBtWCxSz56usQcAgH8Kczqdzto8wIULFzR//nz98Y9/1NatWxUbG6tx48bpqaeeUsuWLa96/6VLl2rw4MGyWq1KT08vO96/f3/l5eUpPz9f4eGVd/StW7eqZ8+emjZtmn75y19WO/O2bdvUrVs3bd26VV27dq32/QCgKkeOSK1aSX/8o/T446bTAEDg81Zfq/VuD/v379fGjRu1Z88eRUZGqlOnTnr99dfVvn17LV68+Kr3z83NVXR0tEaNGlXu+NixY1VQUKCNGzdWed+//OUvql+/vp544onavgwAqJX33nNd2e2S/4YHAPihGpVfh8OhRYsW6d5771WHDh00d+5cPfbYY/r666+1evVqHThwQH369NEzzzxz1cfasWOHkpKSKqzuJicnS5Ly8vKqvO+nn36qpKQkzZ8/XzfeeKMiIyOVkJCg3/z7KdvOAAAgAElEQVTmN4w9APApi0VKS5Pi4kwnAQBcSaS7d5g2bZreeust5efnKyUlRW+//bbGjBmjevXqld2mWbNmmjBhgu6+++6rPl5hYaHaVTIg17Rp07LzVfn222919OhRPfXUU5o6dao6duyolStXatq0aTp48KAsFou7Lw8A3LZ3r7Rhg1SNjykAAAxzu/w+99xzGjp0qN555x317t27ytslJibq+eefr1W4q3E4HDp16pTmzZun0aNHS5J69+6t06dPa+bMmXrhhReUmJjo1QwAkJ0tNWokDR1qOgkA4GrcLr9fffWV2rRpc9XbxcfHa8qUKVe9XWxsbKWru0VFRWXnr3Tfw4cPq3///uWOp6WlaebMmfr8888pv4AHHDlyRBMnTtemTTtVUhKhyMhS9ejRUdOnT1RcCP6e//L3Y//+UiUkdNTp0xPVoEHovR8AEEjcLr+tWrXS6dOn1bBhwwrniouLVbduXdWtW7faj5eSkiKr1SqHw1Fu7nf79u2SpE6dOlV5386dO2vFihVVng+7yi7zmZmZiomJKXcsIyNDGRkZ1YkOhITDhw+rV68HtHfvK5KmSwqT5NDOnZu0dm261q/PCakCXNX7sXfvJt12W+i9HwDgCVartcIWt8ePH/fOkznd9LOf/cz5wAMPVHruwQcfdD7yyCNuPd6yZcucYWFhzpycnHLH+/fv77zuuuucDoejyvu+/fbbzrCwMOfcuXPLHX/yySedkZGRzvz8/Ervt3XrVqck59atW93KCoSihx/+pVNa73TtZXD512fOhx/+pemIPsX7AQC+4a2+5vbK75o1a/T73/++0nNDhgzRr3/9a7ceLy0tTampqRo/frxOnjypxMREWa1WrVixQtnZ2WWrt+PGjVNWVpb27dunhIQESdLDDz+st956S48++qiOHj2qpKQkrVy5Um+88YbGjx9fdjsANbdp0065
Vjgr01OrVk1VNXY1DBqrVl35/di0aaov4wAA3OR2+f3+++/VqlWrSs81b95c3333ndsh7Ha7Jk2apMmTJ6uoqEhJSUnlPsQmuT7c5nA45LzkmhyRkZH66KOP9Nvf/lavvPKKioqKdMMNN+jVV1+t1jZrAK6upCRCrl/tVyZcBw9G6L77fJnItCu/H673CwDgr9wuvzExMdqzZ4/69OlT4dzevXsVHR3tdoiGDRtq5syZmjlzZpW3mT17tmbPnl3h+DXXXKM333xTb775ptvPC+DqIiNLJTlVeeFzqH37Uq1b5+NQBt1xR6l27676/XC9XwAAf+V2+b377rs1bdo0jRgxotxODIWFhZo2bZr69u3r0YAAzOrRo6N27two6dZKzm5Ur14dQ+rCDr16ddTu3VW/Hz16dPR1JACAG9wuv7/73e/UvXt3tW/fXqNHj9Z1112ngwcPav78+bpw4YJeeOEFb+QEYMj06RP18cfpys9/WVJPuS4M6ZC0UYmJkzR9eo7ZgD42ffpErV2brr17eT8AIBC5XX47dOigdevW6ZlnntHbb78th8OhiIgI9e7dWzNmzFCHDh28kROAIXFxcXr44RxNnTpd7dtPlcNx6T6/obetV1xcnNavz/nPPr9TL9v3OPTeDwAING6XX8m1v+6qVat05swZHTt2TE2bNlVUVJSnswHwEx9+GKehQ19Tbq7pJP4hLi5Os2e/ZjoGAKAGalR+L2rQoIEaNGjgqSwA/NA330gbN0rvvms6CQAAtVej8ltSUqJly5bpyy+/1NmzZyucnzx5cq2DAfAPublSnTrS4MGmkwAAUHtul9/CwkLdcccd2rVrV5W3ofwCwcNmk/r1ky67EjgAAAEp3N07TJo0SfXr19f+/fslSRs2bNDu3bv17LPPqn379srPz/d0RgCGHD4srV0rjRxpOgkAAJ7hdvldtWqVnnnmmbKrvEVERKhdu3Z67bXXdM899+iXv/ylx0MCMGPRItefoXUFNwBAMHO7/H7zzTdq06aNIiIiFB4ertOnT5edGzJkiD766COPBgRgjs0m3XWXQuoiFgCA4OZ2+b322mt17NgxhYWFqWXLltq+fXvZuWPHjqmkpMSjAQGYceyYtGoVIw8AgODi9gfeunbtqry8PA0dOlSDBg3SSy+9pMaNG6tu3br6zW9+o1tvreySnwACzZIlUkmJNHy46SQAAHiO2+X38ccf1969eyVJL774ojZs2KD/+q//kiQlJibq9ddf92xCAEbYbNKtt0rx8aaTAADgOW6X39TUVKWmpkqSmjVrpm3btmnHjh0KCwtTUlKSIiNrdd0MAH6guFj68EPppZdMJwEAwLPcmvk9c+aMevXqpZUrV/74AOHhSklJUXJyMsUXCBLLlkk//CCNGGE6CQAAnuVW+W3QoIF27NhByQWCnM0m3XyzdMMNppMAAOBZbu/2cOutt2rTpk3eyALAD/zwg/TBB+zyAAAITm4v4c6YMUNDhw5V8+bNNXLkSDVq1MgbuQAY8tFHrplfyi8AIBi5vfJ722236dtvv9XYsWPVuHFjRUdHKzo6uux/N27c2Bs5AfiIzSZ16CAlJZlOAgCA57m98jvyKstBYWFhNQ4DwKwLF6TFi6VHHzWdBAAA73C7/M6ZM8cLMQD4gzVrXFd2Y+QBABCs3B57ABC8bDapTRvXTg8AAAQjt1d+33nnnauONvzsZz+rcSAAZpSWSrm50k9/KjG9BAAIVm6X37Fjx171NpRfIPB89pl0+DAjDwCA4OZ2+d23b1+FY4WFhVq0aJFycnJktVo9EgyAb9lsUqtWUs+eppMAAOA9bpffNm3aVHqsW7duOn/+vF5//XW98847nsgGwEecTslul4YPl8L5JAAAIIh59K+5fv36afHixZ58SAA+sGWLdPAgIw8AgODn0fKbn5+viIgITz4kAB+w2aTYWOnOO00nAQDAu9wee/j0008rHDt37py++OIL/f73v1e/fv08EgyA
bzidrvI7bJgU6fZPBAAAAovbf9X16dOnynP33HOP/vznP9cmDwAf27FD+uor6U9/Mp0EAADvc7v8fvzxxxWO1a9fX23atFGLFi08EgqA79hsUuPGUt++ppMAAOB9Hl35BRB47HZpyBCpXj3TSQAA8D63P/C2a9cuffLJJ5WeW7Nmjfbs2VPrUAB8Y88eaft2dnkAAIQOt8vvM888o0WLFlV67v3339ezzz5b61AAfMNmkxo0kPr3N50EAADfcLv8btmyRXdWsR9S7969tWnTplqHAuAbdrs0YICrAAMAEArcLr8nTpxQdHR0peeioqJ07NixWocC4H35+dLmzYw8AABCi9vlt1WrVtq4cWOl5zZv3qyWLVvWOhQA77Pbpbp1pUGDTCcBAMB33C6/w4cP17Rp0ypsebZ69WpNmzZNw4cP91g4AN5jt0upqa5tzgAACBVub3X2/PPP68MPP9Q999yjG2+8Udddd50OHjyo3bt366abbtKUKVO8EBOAJ333nbRunTRrlukkAAD4ltsrvzExMVq/fr1eeOEFXXPNNdq/f79iY2P14osvav369WrSpIk3cgLwoIULpfBwaehQ00kAAPAtt1d+JSk6OlrPP/+8nn/+eU/nAeADdrvUp48UG2s6CQAAvuX2yu/hw4e1a9euSs/t2rVLR44cqVGQ4uJiZWZmKj4+XlFRUerSpYtycnKuer85c+YoPDy80q/Dhw/XKAsQzIqKpNWr2eUBABCa3F75feyxxxQTE6O33367wrkZM2bo5MmTslqtbgcZMWKEtmzZoldffVXt27dXdna2MjIy5HA4lJGRcdX7z5kzRx06dCh3rGnTpm7nAILd4sVSaak0bJjpJAAA+J7b5fezzz7TzJkzKz3Xv39/Pfnkk26HWLp0qVauXCmr1ar09HRJrgtmHDhwQBMmTFB6errCw6+8SN2pUyd17drV7ecGQo3dLvXqJbErIQAgFLk99nD06FFde+21lZ6LiYmp0dhDbm6uoqOjNWrUqHLHx44dq4KCgir3Fb6U0+l0+3mBUHPqlLRiBSMPAIDQ5Xb5bdasmf71r39Vem7Hjh2KrcEnaHbs2KGkpKQKq7vJycmSpLy8vKs+xuDBgxUZGanY2FiNHDmyWvcBQs0HH0jnzklsxw0ACFVul98BAwbolVdeqfCht927d+uVV17RwIED3Q5RWFhY6XzuxWOFhYVV3rdly5Z67rnnNGvWLK1Zs0YvvfSSNm/erFtvvVXbt293OwsQzOx2qVs3qU0b00kAADDD7Znf3/3ud1qyZIk6d+6su+++u+wiF6tXr9a1116rF154wRs5q9S/f3/179+/7J/vuOMODRo0SMnJyZo8ebJyc3N9mgfwV2fPSkuXSpMmmU4CAIA5bpff+Ph4bd68WZMnT9ayZcu0cuVKxcXF6ac//alefPFF1alTx+0QsbGxla7uFhUVlZ13x/XXX6/bb79dGzZsuOLtMjMzFRMTU+5YRkZGtXaXAALNhx9Kp09LI0aYTgIAQHlWq7XCbmHHjx/3ynPV6CIX8fHxmnXJdVEdDoeWLl2qxx9/XB988IHOnTvn1uOlpKTIarXK4XCUm/u9OLbQqVOnmsRUWFjYFc/PnDmTHSIQMux26aabpBtvNJ0EAIDyKlt83LZtm7p16+bx53J75vdSe/fu1W9/+1u1bt1aQ4cO1bJlyzSyBh8jHz58uIqLi7VgwYJyx+fMmaP4+Hj17NnTrcfbt2+f1q5dq9tuu83tLEAwOn/etb8vuzwAAEKd2yu/Z8+e1fz58zVr1iytXbu27Pizzz6rX//61zXa7SEtLU2pqakaP368Tp48qcTERFmtVq1YsULZ2dllK7jjxo1TVlaW9u3bp4SEBElSamqq+vbtq5tuukmNGjXS9u3bNX36dEVGRuqll15yOwsQjD7+WDpxgpEHAACqXX43bdqkWbNmad68eTp16pTi4uL0+OOPa9CgQRowYICGDBlSo+J7kd1u16RJ
kzR58mQVFRUpKSlJ8+bN0+jRo8tu43A45HA4yu3pm5ycrOzsbB08eFBnz55Vs2bNdM899+j5559Xu3btapwHCCZ2u5SYKKWkmE4CAIBZ1Sq/ycnJysvLU4MGDTRs2DCNGTNGqampioyM9NgwcsOGDTVz5swqrx4nSbNnz9bs2bPLHZsxY4ZHnh8IVqWl0sKF0tix0lXG4AEACHrVKr95eXmKiorSiy++qEceeaTCDgkA/NfatdKRI4w8AAAgVfMDb6+//rp+8pOf6Je//KVatGih4cOHa8GCBTp//vxVd1QAYJbdLl13ndS9u+kkAACYV63y+8QTT+jzzz/X5s2b9cgjj2j16tUaPXq0mjdvrvHjx3s7I4Aacjhc5XfECCm8Vnu7AAAQHNz667Bbt2564403dOjQIWVlZenmm2/WvHnzJEk///nP9b//+79XvBQxAN/atEn69ltGHgAAuKhGa0FRUVF66KGHtHr1au3Zs0e/+c1vdPr0aU2cOFHXXXedpzMCqCG7XWrWTLrjDtNJAADwD7X+RWhiYqJefvll5efn6/3339eAAQM8kQtALTmdks0mDRsmRUSYTgMAgH+o0eWNKxMREaFBgwZp0KBBnnpIALXwxRfSvn2MPAAAcCk+AgMEKbtdiomR7r7bdBIAAPwH5RcIUjabNHSoVLeu6SQAAPgPyi8QhL78Utq5k5EHAAAuR/kFgpDdLjVsKN17r+kkAAD4F8ovEIRsNmnQICkqynQSAAD8C+UXCDJffy1t28bIAwAAlaH8AkEmN1eqV08aONB0EgAA/A/lFwgyNpvUv78UHW06CQAA/ofyCwSRggLps88YeQAAoCqUXyCILFwoRUa69vcFAAAVUX6BIGKzSX37StdcYzoJAAD+ifILBImjR6VPPmHkAQCAK6H8AkFi8WLJ4ZCGDTOdBAAA/0X5BYKEzSbdeafUvLnpJAAA+C/KLxAETpyQPvqIkQcAAK6G8gsEgQ8+kC5coPwCAHA1lF8gCNhsUo8eUkKC6SQAAPg3yi8Q4E6flpYtY9UXAIDqoPwCAe7DD6WzZ6WRI00nAQDA/1F+gQBns0kpKVK7dqaTAADg/yi/QAA7d05asoSRBwAAqovyCwSwVaukkycZeQAAoLoov0AAs9mk9u2lm24ynQQAgMBA+QUCVEmJtGiRa+QhLMx0GgAAAgPlFwhQn34qFRYy8gAAgDsov0CAstmk1q2lbt1MJwEAIHBQfoEA5HBIubmMPAAA4C7KLxCANmyQDh1i5AEAAHdRfoEAZLNJLVpIvXqZTgIAQGCh/AIBxul0ld9hw6Rw/g0GAMAt/NUJBJh//lM6cICRBwAAaoLyCwQYm01q2lTq3dt0EgAAAg/lFwggF0cehg6V6tQxnQYAgMDjF+W3uLhYmZmZio+PV1RUlLp06aKcnBy3H+e5555TeHi4kpOTvZASMO/f/5Z27WLkAQCAmoo0HUCSRowYoS1btujVV19V+/btlZ2drYyMDDkcDmVkZFTrMT7//HP94Q9/UPPmzRXGxqcIUjabFB0t3XOP6SQAAAQm4+V36dKlWrlypaxWq9LT0yVJvXv31oEDBzRhwgSlp6cr/CofaS8pKdHYsWP1i1/8Qp9//rkKCwt9ER3wOZtNGjRIql/fdBIAAAKT8bGH3NxcRUdHa9SoUeWOjx07VgUFBdq4ceNVH2PatGk6fvy4pk6dKqfT6a2ogFF790pffMHIAwAAtWG8/O7YsUNJSUkVVncvzu3m5eVd8f47d+7Uyy+/rDfffFMNGzb0Wk7ANLvdteI7YIDpJAAABC7j5bewsFBNmzatcPzisSuNMJSWluqRRx7RyJEjlZaW5rWMgD+w26W0NIn/xgMAoOaMz/zWxh//+Eft3btXS5YsMR0F8KpvvpE2bJDefdd0EgAAApvx8hsbG1vp6m5RUVHZ+crk5+dr8uTJmj59uiIjI3X8+HFJrg+/lZaW6sSJE6pXr57q88kgBIHcXNe+voMH
m04CAEBgM15+U1JSZLVa5XA4ys39bt++XZLUqVOnSu+3b98+/fDDD3ryySf15JNPVjh/zTXXKDMzUzNmzKjyuTMzMxUTE1PuWEZGRrW3VwN8xW6X+vWTLvt2BQAgKFitVlmt1nLHLi5selqY0/D2CMuXL9fAgQM1b948jR49uux4Wlqa8vLylJ+fX+m+vSdOnNAXX3xR7pjT6VRmZqZOnjyp2bNnKz4+XomJiRXuu23bNnXr1k1bt25V165dPf+iAA86ckRq0UL629+kn//cdBoAAHzDW33N+MpvWlqaUlNTNX78eJ08eVKJiYmyWq1asWKFsrOzy4rvuHHjlJWVpX379ikhIUFNmjTRXXfdVeHxmjRpopKSkkrPAYFo4ULXn/fdZzYHAADBwHj5lSS73a5JkyZp8uTJKioqUlJSUoWVYIfDIYfDcdV9fMPCwrjCG4KK3S7ddZcUF2c6CQAAgc/42IMJjD0gUBw/LjVrJs2YIT3+uOk0AAD4jrf6mvF9fgFU7f33pQsXpOHDTScBACA4UH4BP2a3S7feKsXHm04CAEBwoPwCfqq4WFq+XBo50nQSAACCB+UX8FPLlkk//CCNGGE6CQAAwYPyC/gpu126+WbphhtMJwEAIHhQfgE/9MMP0pIljDwAAOBplF/AD330kWvml/ILAIBnUX4BP2S3Sx06SElJppMAABBcKL+An7lwQVq0iFVfAAC8gfIL+Jk1a6Rjxyi/AAB4A+UX8DN2u9SmjWunBwAA4FmUX8CPlJZKubmuVd+wMNNpAAAIPpRfwI989pn0/feMPAAA4C2UX8CP2O1Sq1ZSz56mkwAAEJwov4CfcDpd5Xf4cCmcfzMBAPAK/ooF/MSWLVJ+PiMPAAB4E+UX8BN2uxQbK915p+kkAAAEL8ov4AecTslmk4YNkyIjTacBACB4UX4BP7Bjh7RnDyMPAAB4G+UX8AN2u9S4sdS3r+kkAAAEN8ov4AdsNmnIEKlePdNJAAAIbpRfwLA9e6Tt2xl5AADAFyi/gGF2u9SggdS/v+kkAAAEP8ovYJjNJg0Y4CrAAADAuyi/gEH5+dLmzYw8AADgK5RfwKDcXKluXWnQINNJAAAIDZRfwCCbTUpNdW1zBgAAvI/yCxjy3XfSunWMPAAA4EuUX8CQRYuk8HBp6FDTSQAACB2UX8AQm03q00eKjTWdBACA0EH5BQwoKpJWr2bkAQAAX6P8Aga8/75UWioNG2Y6CQAAoYXyCxhgs0m9ekktW5pOAgBAaKH8Aj526pS0YgUjDwAAmED5BXxs6VLp3Dlp+HDTSQAACD2UX8DHbDapWzepTRvTSQAACD2UX8CHzp51rfwy8gAAgBmUX8CHVqyQTp+WRowwnQQAgNBE+QV8yGaTbrpJuvFG00kAAAhNlF/AR86flxYvZuQBAACT/KL8FhcXKzMzU/Hx8YqKilKXLl2Uk5Nz1futXLlSqampio+PV/369dW8eXP169dPy5Yt80FqwD2rV0snTjDyAACASX5RfkeMGKGsrCxNmTJFy5cvV/fu3ZWRkSGr1XrF+xUVFSk5OVkzZ87URx99pL/97W+qU6eOBg0apOzsbB+lB6rHZpMSE6WUFNNJAAAIXWFOp9NpMsDSpUs1ePBgWa1Wpaenlx3v37+/8vLylJ+fr/Dw6nf0kpIStW3bVjfccIM++eSTSm+zbds2devWTVu3blXXrl1r/RqAqyktdV3NbexY6dVXTacBAMD/eauvGV/5zc3NVXR0tEaNGlXu+NixY1VQUKCNGze69XiRkZFq0qSJIiMjPRkTqJV166QjRxh5AADANOPld8eOHUpKSqqwupucnCxJysvLu+pjOBwOlZSUqKCgQL/73e+0e/duPf30017JC9SEzSZdd53UvbvpJAAAhDbjy6OFhYVq165dheNNmzYtO381AwcO1IoVKyRJDRo0UHZ2tgYPHuzZoEANORyS3e7a5cGNCR4AAOAFxsuvJ/zlL3/RiRMndOjQ
Ib377rsaM2aMzp8/rzFjxpiOBmjzZunbbxl5AADAHxgvv7GxsZWu7hYVFZWdv5pLV44HDx6sgQMH6oknnqD8wi/YbFKzZtIdd5hOAgAAjJfflJQUWa1WORyOcnO/27dvlyR16tTJ7cfs3r27li9frsOHD6tZs2ZV3i4zM1MxMTHljmVkZCgjI8Pt5wQq43S6yu+wYVJEhOk0AAD4J6vVWmGL2+PHj3vluYxvdbZ8+XINHDhQ8+bN0+jRo8uOp6WllW11FhYWVu3Hczqduvvuu7V9+3YdOXKk0m3S2OoMvvLFF9LNN0vLl0v9+5tOAwBA4PBWXzO+8puWlqbU1FSNHz9eJ0+eVGJioqxWq1asWKHs7Oyy4jtu3DhlZWVp3759SkhIkCTdd999uvnmm9W5c2fFxsaqoKBAc+bM0aeffqo33njDrf2BAW+w2aSYGOnuu00nAQAAkh+UX0my2+2aNGmSJk+erKKiIiUlJVVYCXY4HHI4HLp0ofqOO+7QggUL9Je//EUnT55UTEyMunfvrg8++EADBgww8VKAcmw2aehQqW5d00kAAIDkB2MPJjD2AF/48kspKUlauFC67z7TaQAACCxBe4U3IFjZ7VLDhtK995pOAgAALqL8Al5is0mDBklRUaaTAACAiyi/gBfs3y9t28aFLQAA8DeUX8AL7HapXj1p4EDTSQAAwKUov4AX2GyufX2jo00nAQAAl6L8Ah526JD02WeMPAAA4I8ov4CH5eZKkZGu/X0BAIB/ofwCHmazSX37StdcYzoJAAC4HOUX8KCjR6VPPmHkAQAAf0X5BTxo8WLJ4ZCGDTOdBAAAVIbyC3iQ3S7deafUvLnpJAAAoDKUX8BDTp6UPvqIkQcAAPwZ5RfwkCVLpPPnKb8AAPgzyi/gIXa71KOHlJBgOgkAAKgK5RfwgDNnpGXLWPUFAMDfUX4BD1i+3FWAR440nQQAAFwJ5RfwALtdSkmR2rUznQQAAFwJ5ReopXPnpPffZ+QBAIBAQPkFamnVKtc2Z4w8AADg/yi/QC3Z7VL79tJNN5lOAgAArobyC9RCSYm0cKFr5CEszHQaAABwNZRfoBY+/VQqLGTkAQCAQEH5BWrBbpdat5a6dTOdBAAAVEdIl9/7739SY8dO0JEjR0xHQQByOFzll5EHAAACR0iX36+/nqk5c0bqttvSKcBw24YN0qFDjDwAABBIQrr8ul7+rdq792VNnDjddBgEGLtdatFC6tXLdBIAAFBdIV5+L+qpTZt2mg6BAOJ0SjabNGyYFM6/RQAABAz+2pYkhevcuQjTIRBA/vlPaf9+Rh4AAAg0lF9JkkP79pVq7Fhp82bTWRAI7HapaVOpd2/TSQAAgDsov5KkjerataPWrJF69JBuuUX6+9+lM2dM54K/stmkoUOlOnVMJwEAAO4I8fLrkLReiYmTtGzZRH31lbRkidS8ufTzn0vx8dLTT0u7d5vOCX+yc6f05ZeMPAAAEIhCuvy2bZuphx+2a/36HMXFxSkiQho0SPrgA2nvXul//keyWKQbb5RSU12/6i4pMZ0aptntUnS0dM89ppMAAAB3hXT5XbDgT5o9+zXFxcVVONe2rTRtmvTNN64CfOaMa6WvTRvpxRelggLf54V/sNlc/5FUv77pJAAAwF0hXX6ro149acwY6R//kD7/XBo8WJo+Xbr+emnUKGn1ate2VwgN+/a5vg8YeQAAIDBRft3QubP01lvSt99Kf/yjlJcn9e0rdewo/fnP0okTphPC2+x214rvgAGmkwAAgJqg/NZAkybS44+7yu+aNVJKivTMM1KrVtJ//7drZRDByWaT0tKkhg1NJwEAADVB+a2FsDDXPq85OVJ+vv5/e3ceFlW9/wH8fYZhl0WIFHABtRQBE7hJWYob4obXJXdTAfGJqyJ2nzSXXK5LuVaWWXpFwAWXEE2voimBeQXc6ppgP1JM8+IKEZKSwHx/f0zMdaXsItEAACAASURBVBxQYDZg3q/n
OY/OlznnfM6HM/N8+J7v+R7MmQMcPgz4+Skfebt1K1BaauwoSVdu3AAyMznkgYiIqCFj8asjrq7A/PnA1atAcrJyNoAJE4AWLYDZs5VjRalh27dPOa/voEHGjoSIiIjqisWvjsnlwJAhwJEjyvmBJ04ENm0C2rUDBgxQziNcUWHsKKkukpKA3r0BR0djR0JERER1xeJXj154AVizRnmDXGwscO8eEBoKtG0LvP8+cOeOsSOkmrp7FzhxgkMeiIiIGjoWvwZgbQ1MmgScPg2cOaPsPfzHP5RDIsaNA06e5HRp9d3+/cp///pX48ZBRERE2qk3xW9JSQliYmLg7u4Oa2tr+Pn5YdeuXc9cLykpCSNHjoSnpydsbGzg6emJ8ePH4/LlywaIuvb+8hdg82Zlb/CKFcpiuFu3/02jdv++sSOkqiQlAd27A1U8D4WIiIgakHpT/A4bNgwJCQlYtGgRUlJS8PLLL2PMmDFITEx86nqrVq1CaWkpFixYgCNHjmDp0qX47rvv4O/vj5ycHANFX3tOTsDMmcCPPwJff60cEzx1KuDurvz34kVjR0iVioqA48c55IGIiKgxkBs7AAA4dOgQjh07hsTERIwaNQoAEBQUhGvXruGdd97BqFGjIJNVXacfOHBA4/HEvXr1goeHBz788ENs2rRJ7/FrQyYD+vRRLjduKG+O27gR+OwzZU9jVBQwbBhgYWHsSE3XwYNAWRkwdKixIyEiIiJt1Yue3+TkZNjZ2WHEiBFq7WFhYcjPz0dWVla16z5Z+AKAq6sr3N3dcePGDZ3Hqk8tWgCLFyvnDN69GzAzA8aMAVq2VE6jdv26sSM0TUlJwCuvKHvliYiIqGGrF8XvxYsX4eXlpdG76+vrCwDIzs6u1fby8vJw/fp1eHt76yxGQzI3B0aMAFJTlU+RGzVK+fhkT0/lDVdHjgAKhbGjNA0lJUBKCoc8EBERNRb1ovgtKCiAk5OTRntlW0FBQY23VV5ejvDwcNjZ2WHmzJk6i9FYOnYE1q1T3iD3+efK3t9+/YAXXwRWrwZqkRqqg5QU5VP6hg0zdiRERESkC/Wi+NUVhUKBiIgInDp1CgkJCXBvRNepmzQBIiOB8+eBU6eUj0+eN095KX7SJCAri9Ol6UNSEtC5M9CmjbEjISIiIl2oFze8OTs7V9m7W1hYqPr5swghEBkZie3btyMhIQGhoaHPXCcmJgaOTzyua8yYMRgzZkwNIzc8SQJefVW5rFkDbNmi7BGOjwf8/YG//U05TtjGxtiRNlx3797FrFkrkZmZg//7PzM891wFwsI6YuXKWVWOMSciIiLtJCYmaszwVVRUpJ+diXpgypQpws7OTlRUVKi1JyYmCkmSREZGxlPXVygUIjw8XMhkMhEXF/fM/Z07d04AEOfOndMq7vqiokKIQ4eEGDRICEkSwsFBiBkzhPjxR2NH1vDcvn1btG3bUwAZAlAIZX96hQAyRNu2PcWdO3eMHSIREZFJ0Fe9Vi+GPQwdOhQlJSX48ssv1drj4uLg7u6OwMDAatcVf/b4xsXFYePGjZg4caK+w613ZDKgf3/gwAEgL0/Z+7tjB9Chg/JpcklJyqm66Nlmz16FK1eWA3gFgPRnqwzAK7hyZRlmzVppvOCIiIhIa/Vi2EO/fv0QHByMqKgoFBcXo23btkhMTMTRo0exfft2SJKyCImIiEBCQgLy8vLQsmVLAEB0dDRiY2MRHh4OHx8fZGZmqrZraWkJPz8/oxyTsXh4AMuXAwsXKoveDRuAN94A3NyAKVOU44bd3P73/spL/KdP56C83AxyeQW6dGm4l/iFUN6gdv++ciku/t//q2t7/PXp0zkAqitwA3H69FJDHg4RERHpWL0ofgFg7969mDdvHhYsWIDCwkJ4eXlh586dGDlypOo9CoUCCoUC4rE7uw4ePAhJkhAbG4vY2Fi1bXp4eCAvL89gx1CfWFoCY8cqlwsXlEXw6tXAkiXAkCHK3mFv7zt47bXR
f/Z0roSyp1OBnJzT+PbbUcjI2GWQAlihUE4pVpuC9WlFbEVF9fuSJMDOTnOxt1f+4XDhghn++EOqZm0ZysvN9JECIiIiMhBJCNObI+D8+fMICAjAuXPn4O/vb+xwDKa4GNi2Tfn0uOxswN7+HRQXD4fyEv+TMjBp0l5s2bKqym09elS7HtWntZWUPD1uc/P/FahVFa1Pe/1km42NcphIdby9ByIn5yD+N+ThcQp07BiK7Ox/PT1gIiIi0pq+6rV60/NL+mdvr+zxjYoCvv0WCA19+iX+vXuX4vbtqgvWP/54+r5sbKouQF1dlXMU16ZotbTUdSaq16VLR+TkZKHqPwiy0KVLR8MFQ0RERDrH4tcESRLQvTvQvLkZiouffonfygpwcal5z6qdnXJOYnkDPbNWrpyFb78dhStXlgEIhPJmNwWALLRtOw8rV+4yboBERESklQZaopAuyOUVAASqu8Tv4VGBvXsNHJSRubi4ICNj1583AS594iZAw4yBJiIiIv1h8WvCeIm/ai4uLtWOdSYiIqKGrV7M80vGsXLlLLRtOxdABpSX9vHnvxl/XuKfZbzgiIiIiPSAPb8mjJf4iYiIyNSw+DVxvMRPREREpoTDHoiIiIjIZLD4JSIiIiKTweKXiIiIiEwGi18iIiIiMhksfomIiIjIZLD4JSIiIiKTweKXiIiIiEwGi18iIiIiMhksfomIiIjIZLD4JSIiIiKTweKXiIiIiEwGi18iIiIiMhksfomIiIjIZLD4JSIiIiKTweKXiIiIiEwGi18iIiIiMhksfomIiIjIZLD4JSIiIiKTweKXiIiIiEwGi18iIiIiMhksfomIiIjIZLD4JSIiIiKTweKXiIiIiEwGi18iIiIiMhksfomIiIjIZLD4JSIiIiKTweKXiIiIiEwGi18iIiIiMhksfomIiIjIZLD4JSIiIiKTweKXiIiIiEwGi18iIiIiMhksfomIiIjIZNSb4rekpAQxMTFwd3eHtbU1/Pz8sGvXrmeud+PGDcTExCAoKAiOjo6QyWSIj483QMRERERE1NDUm+J32LBhSEhIwKJFi5CSkoKXX34ZY8aMQWJi4lPXu3z5Mnbs2AErKysMHDgQACBJkiFCJiIiIqIGpl4Uv4cOHcKxY8ewYcMGREZGIigoCBs3bkRwcDDeeecdKBSKatcNCgrCnTt3cOTIEbz99tsGjLrxedYfGqaIOdHEnKhjPjQxJ5qYE03MiSbmxDDqRfGbnJwMOzs7jBgxQq09LCwM+fn5yMrKqnbdx3t5hRB6i9EU8EOniTnRxJyoYz40MSeamBNNzIkm5sQw6kXxe/HiRXh5eUEmUw/H19cXAJCdnW2MsIiIiIiokakXxW9BQQGcnJw02ivbCgoKDB0SERERETVC9aL4JSIiIiIyBLmxAwAAZ2fnKnt3CwsLVT/Xh0uXLulluw1VUVERzp8/b+ww6hXmRBNzoo750MScaGJONDEnmpgTdXqr00Q9MGXKFGFnZycqKirU2hMTE4UkSSIjI6NG2zlz5oyQJEnEx8c/9X35+fnCzc1NAODChQsXLly4cOFSTxc3NzeRn59f5xqzKvWi53fo0KHYtGkTvvzyS4wcOVLVHhcXB3d3dwQGBup0f66urjh79ixu3ryp0+0SERERke64urrC1dVVp9usF8Vvv379EBwcjKioKBQXF6Nt27ZITEzE0aNHsX37dtV0ZhEREUhISEBeXh5atmypWv/LL78EAOTl5QEAzpw5AxsbGwDAG2+8UeU+9ZFMIiIiIqrfJCHqx+S4v//+O+bNm4fdu3ejsLAQXl5emDNnjlpPcFhYGBISEnD16lW0atVK1f74FGmSJKnm+5UkCRUVFYY7CCIiIiKq1+pN8UtEREREpG+NaqqzkpISxMTEwN3dHdbW1vDz88OuXbueud6NGzcQExODoKAgODo6QiaTIT4+3gAR619dc5KUlISRI0fC09MTNjY28PT0xPjx43H58mUDRK1f
dc3JsWPHEBwcDHd3d1hZWaFZs2bo3bs3Dh8+bICo9aeu+XjS/PnzIZPJVA+naejqmpe4uDjIZLIqlzt37hggcv3Q9jzZv38/goKC4ODggCZNmsDHxwebNm3SY8T6V9ec9OjRo9pzxJTPk2PHjqF37954/vnnYWdnh5deegmffPIJFAqFnqPWL21ycuTIEbz22muwsbGBo6MjBg8ejJycHD1HrF8lJSWYNWsW+vbtCxcXF8hkMixevLjG69+5cweTJk2Ci4sLbG1t0bVrV6SmptYuCJ3ePmdkwcHBomnTpmLjxo0iLS1NREZGCkmSxI4dO5663jfffCNcXFxE3759xdixY2s0Y0RDUdecBAYGitDQUBEbGytOnDghtm3bJjp27Cjs7OxEdna2gaLXj7rmZNeuXWLmzJli9+7d4sSJEyI5OVmEhIQISZLEtm3bDBS97tU1H4/77rvvhJWVlWjevLnw9fXVY7SGU9e8bNmyRfUdkpWVpbaUlZUZKHrd0+Y8ef/994WZmZmYNm2aOHLkiEhNTRXr168X69evN0Dk+lPXnOTk5GicG6mpqcLCwkJ07drVQNHrR11zcvjwYSFJkujVq5f46quvxPHjx0V0dLSQJEnMmDHDQNHrR11zsm/fPiFJkhg2bJg4fPiwSExMFB06dBBOTk7iypUrBope965evSocHR1Fjx49VLlYvHhxjdYtLS0VPj4+olWrVmLHjh3i2LFjYsiQIcLc3Fykp6fXOIZGU/z+61//EpIkiZ07d6q19+3bV7i7u2tMo/Y4hUKh+v/Zs2cbTfGrTU7u3Lmj0Zafny8sLCzE5MmTdR6roWiTk6qUlZWJFi1aiO7du+syTIPRRT7KyspE586dRUxMjOjRo0ejKH61yUtl8Xvu3Dl9h2kw2uTj7NmzwszMTKxatUrfYRqUrr9L4uLihCRJIjY2VpdhGpQ2ORk7dqywtrYWDx48UGsPCQkRDg4OeonXELTJSfv27YWfn59a27Vr14SlpaUYN26cXuI1tHv37tWq+F2/fr2QJElkZmaq2srLy4W3t7cIDAys8X4bzbCH5ORk2NnZYcSIEWrtYWFhyM/PR1ZWVrXrVs4mAUB1s1xjoE1OXFxcNNpcXV3h7u6OGzdu6DxWQ9EmJ1WRy+VwcHCAXF4vJk6pNV3k44MPPkBRURGWLl3aaD4/ushLY8kFoF0+Pv30U1hZWWH69On6DtOgdP1dsnnzZtjZ2WHUqFG6DNOgtMmJtbU1zM3NYWVlpdbu4OAAa2trvcRrCHXNSUFBAXJzc9GvXz+19latWsHb2xv79u1rFN8xtT2G5ORkdOjQQW0KXDMzM4wfPx6nT5+u8RS2jab4vXjxIry8vNRmfgCgGn+YnZ1tjLCMStc5ycvLw/Xr1+Ht7a2zGA1NFzlRKBQoLy9Hfn4+Fi5ciNzcXMycOVMv8eqbtvnIycnBsmXLsGHDBtja2uotTkPTxXkyaNAgyOVyODs7Y/jw4Q36O0ibfJw4cQJeXl7Ys2cP2rdvD7lcjpYtW2LOnDkoKyvTa9z6pMvv19zcXJw8eRKjR49WTdPZEGmTk6lTp0KhUCA6Oho3b95EUVEREhISsG/fPsyePVuvcetTXXPy6NEjAIClpaXGzywtLfHgwQNcuXJFx9HWfxcvXkSnTp002mv7uWs0xW9BQQGcnJw02ivbqnp8cmOny5yUl5cjPDwcdnZ2DbbQA3STkwEDBsDCwgItWrTAmjVrsH37dgwaNEjnsRqCNvmoqKhAeHg4hg8frtE70dBpkxdXV1fMnz8fmzdvRlpaGpYsWYIzZ87glVdewQ8//KC3mPVJm3z897//RW5uLmbMmIGYmBgcP34ckyZNwurVqxEWFqa3mPVNl9+vsbGxAJRz2Tdk2uTEz88Phw8fxp49e+Du7g4nJydERERg+fLliImJ0VvM+lbXnDRr1gxOTk44efKkWntRUREuXrwISZJMsq4pLCzUyeeuYV6rJYNSKBSIiIjA
qVOnkJSUBHd3d2OHZFSffvopfvvtN9y8eRNbt27FuHHj8OjRI4wbN87YoRnUhx9+iCtXruDgwYPGDqVeCQkJQUhIiOr166+/joEDB8LX1xcLFixAcnKyEaMzPIVCgfv372Pnzp2qeduDgoLw+++/46OPPsLixYvRtm1bI0dpPOXl5YiPj4evry+6dOli7HCM5uTJkxg4cCB69uyJKVOmwNbWFsePH8e8efPw8OFDzJ8/39ghGpRMJsPUqVOxZMkSLFu2DJGRkSguLkZMTAwePnwIIYRGbzLVXKPJnLOzc5UVf2FhoernpkYXORFCIDIyEtu3b0dcXBxCQ0N1Hqch6SIn7dq1Q0BAAAYNGoRdu3ahT58+DXY8Y13zcf36dSxYsAALFy6EXC5HUVERioqKUF5ejoqKCvz2228oLS3Va+z6pOvvk9atW+O1115DZmamTuIzNG3y4ezsDEmS1P4gAKC6WvD999/rMFLD0dU5cujQIdy+fbvB9/oC2uVkxowZ8PT0RHJyMgYMGICgoCD84x//wLvvvotFixbh6tWreotbn7TJyYIFCzBz5kwsWbIEzZs3x4svvgiZTKa6YmKKHVHOzs6q3D2utp+7RlP8durUCZcuXdKYD7DyMqOPj48xwjIqbXMihMDkyZMRFxeHzZs3Y+zYsXqL1VD0cZ68/PLLKCoqapBzc9Y1H3l5eSgtLUV0dDScnJxUy6lTp3Dp0iU0bdoUc+fO1Xv8+qKv75PHb65tSLTJx0svvfTUm1pMMSeP27x5MywtLfHmm2/qPEZD0yYn2dnZCAgI0Dgf/vKXv0ChUODHH3/UfcAGoE1OzMzMsGbNGhQWFuKHH37AzZs38dVXX+HatWto06YN3Nzc9Bp7feTr64sLFy5otNf6u7nG80LUc5VzBO7atUutPSQkRLRo0UJtOrOnOXPmTKOZ6kybnCgUChERESFkMpn45z//qe9QDUZX50klhUIhgoKChJOTU62nNqoP6pqPoqIikZ6errakpaWJzp07izZt2oj09HRx+fJlQxyCXuj6PLly5YqwtbUVw4YN02WYBqNNPjZt2lTlnKbR0dFCLpeL69ev6yVmfdPFOXLz5k0hl8vF6NGj9RWmQWmTk3bt2glfX1+N79G5c+cKSZLEhQsX9BKzvun6u+TcuXNCLpeLdevW6TJMo7l7926tpjrbsGGDkCRJZGVlqdrKysqEt7e3ePXVV2u830ZT/AqhnDfPyclJbNq0SaSmplY5kXR4eHiVX7h79uwRe/bsEStWrBCSJIlp06ap2hqyuuZk2rRpQpIkERERITIzM0VGRoZqOX/+vDEORWfqmpPBgweLBQsWiKSkJJGWliZ27Ngh+vbtKyRJEhs2bDDGoeiENp+bJwUFBQkfHx99h2wQdc1Lnz59xPLly8X+/fvF8ePHxUcffSTc3NyEg4NDg35ATF3zUVZWJgICAoSjo6NYt26d+Prrr8Xs2bOFXC4X06dPN8ah6Iy2n50PPvhASJIkjh07Zsiw9aquOaksagYMGCD2798vjh49KmbPni3Mzc1F3759jXEoOlPXnKSlpYkVK1aIlJQUcfjwYbF48WJha2srQkNDa1001zeHDh0Se/bsEbGxsUKSJDFy5EhVzVU513NVOfnjjz/UHnLx9ddfi6FDhwoLCwtx4sSJGu+/URW/JSUlYsaMGcLV1VVYWlqKzp07a/y1NWnSJCGTycS1a9fU2iVJUi0ymUzt/w1ZXXPi4eGhlofHF09PT0Mfhk7VNScrV64UXbp0EU5OTkIul4vnnntO9O/fXxw6dMjQh6BT2nxuntRYHnIhRN3zMnPmTOHt7S3s7e2Fubm5cHd3FxMmTBA//fSToQ9Bp7Q5TwoLC8Vbb70lmjdvLiwsLESHDh3EmjVrDBm+Xmj72Wnfvr1o06aNocI1CG1ysn//ftG9e3fx/PPPiyZNmghfX1+xbNkyjQdfNDR1zcmpU6fEq6++KhwcHISVlZXo1KmTWLt2
rSgvLzf0Ieich4dHtTVXZQ6qO09u374tJk6cKJydnYW1tbXo2rWrOH78eK32LwnRCGZJJiIiIiKqgUZzwxsRERER0bOw+CUiIiIik8Hil4iIiIhMBotfIiIiIjIZLH6JiIiIyGSw+CUiIiIik8Hil4iIiIhMBotfIiIiIjIZLH6JiIiIyGSw+CWiBi8uLg4ymUy1mJubo2XLlggPD0d+fr5O9/Xzzz9DJpMhISFB1Xbq1CksXrwYv/32m072kZaWpnY81S1mZmYAgEWLFkEmk6GwsFAn+9eWPuLp0aMHevbs+cz3Vf5+4uPjdbZvImpc5MYOgIhIV+Li4tChQwc8fPgQ6enpeP/995Geno6LFy/C2tpaJ/twc3NDZmYm2rRpo2qrLH7DwsLg4OCg9T4CAgKQmZmpei2EwNChQ9GuXTusXr1a6+03RJIkQZKkWr2fiKgqLH6JqNHw8fGBv78/ACAoKAgVFRVYsmQJ9u3bhzFjxmi17YqKClRUVMDCwgJdunSp8j1CCK32UcnOzk5jHxYWFnB0dKx233UlhMAff/wBKysrnW5X14QQLGiJSCc47IGIGq3AwEAAwLVr1wBUf+l80qRJ8PT0VL2uvHS+atUqLF26FJ6enrCyskJaWprGsIdFixZh1qxZAABPT0/VkIT09HRERETAyckJDx8+1Nhnr1694OPjo9PjvXXrFsaMGQNHR0c0b94c4eHhKC4uVnuPTCbD9OnT8fnnn8PLywtWVlaqY/npp58wduxYNGvWDFZWVujYsSM+++wztfUVCgWWLl2K9u3bw9bWFk2bNsVLL72EdevW1Sme0tJSzJkzB56enrC0tESLFi0wbdq0Gg0hyc/Px8iRI2Fvbw9HR0eMHj0at27dqm3aiMjEsOeXiBqty5cvAwBcXFxUbdX1HlbVvm7dOrRv3x5r166Fvb092rVrp9G7GxkZiV9//RWffPIJkpOT4erqCgDw8vJC06ZNsWXLFuzYsQMRERGqdXJycpCWlqZRWGpr+PDhGD16NCIjI3HhwgXMmTMHkiRh8+bNau/bt28fTp48iUWLFqF58+ZwcXFBTk4OunbtCg8PD6xduxbNmzdHSkoKoqOjce/ePSxYsAAAsHLlSixevBjvvfceunfvjrKyMly6dKnKYvVZ8QghMGTIEKSmpmLu3Lno1q0b/vOf/2DhwoXIyMhARkYGLCwsqjzWhw8fok+fPrh16xY++OADvPjiizh48CBGjRql05wSUSMkiIgauC1btghJkkRWVpYoKysT9+/fFwcPHhQuLi7C3t5e3LlzRwghRFBQkOjZs6fG+hMnThQeHh6q11evXhWSJIkXXnhBlJeXq7238mfx8fGqtlWrVglJksS1a9c0tt2jRw/h5+en1hYVFSUcHR3F77//XuNjbN26tQgNDa3yZwsXLhSSJInVq1ertU+dOlVYW1urtUmSJJo2bSqKiorU2kNCQkSrVq3E/fv31dqnT58urK2tVe8fNGiQ8Pf3f2qsNY0nJSWlyvft3r1bSJIkNm3apGp78ne3YcMGIUmSOHDggNq6U6ZM0fj9EBE9jsMeiKjReOWVV2BhYQF7e3uEhobCzc0NKSkpaj2/tTF48GDVjAp1FR0dje+//x6nTp0CABQXF2Pr1q2YOHEibGxstNr2kwYPHqz22tfXF6Wlpbh7965ae69evdRuzCstLcXx48cxdOhQWFlZoby8XLX0798fpaWlqhvwAgMD8f3332Pq1Kk4cuSIxjCG2sSTmpoKQDns5HFvvPEGbG1tVT+vyjfffAN7e3sMGjRIrX3s2LHVrkNEBHDYAxE1Ilu3boWXlxfkcjmaNWuGZs2aabW9yiEM2vjrX/+K1q1bY/369ejatSvi4uLw4MEDTJ06VettP8nZ2VnttaWlJQBojDl+8rgKCgpQUVGBdevWVTl2V5Ik3Lt3DwAwZ84c2NraYtu2bfj8889hZmaG7t27Y8WKFQgICKhVPAUFBZDL5RrvkyQJ
zZo1Q0FBQbXHWlBQUOXvV9vfORE1fuz5JaJGw8vLC/7+/ujUqVOVRZCVlRVKS0s12gsKCqoc86uL2QVkMhmmTp2KpKQk3Lp1C5999hn69OmDF154Qett19WTx9W0aVOYmZkhLCwMZ8+e1VjOnDmD/v37AwDMzMwwc+ZMnDt3Dr/++isSExPxyy+/ICQkpMrcPo2zszPKy8tVhXUlIQRu3bqF55577qnrVnVzG294I6JnYfFLRCbD09MTubm5ePTokaqtoKAA//73v7XabmWP5oMHD6r8+eTJkyGXyzF27Fjk5uZi2rRpWu1P12xsbNCzZ0+cP38evr6+8Pf311icnJw01rO3t8fw4cPxt7/9DYWFhfj5559rtd8+ffoAALZt26bWnpSUhAcPHqB3797VrturVy/cv38fBw4cUGvfsWNHrWIgItPDYQ9EZDLefPNNfPHFFxg/fjwmT56MgoICrFq1Cg4ODlrN0dupUycAwMcff4wJEybA3NwcHTp0QJMmTQAAjo6Oqn17eHggNDRUJ8ejSx9//DFef/11dOvWDVFRUWjdujXu37+Py5cv48CBA6rxt6GhofD19UVAQABcXFxw7do1fPTRR/Dw8Kh1b3ZwcDBCQkIwe/ZsFBcXo2vXrrhw4QIWLlwIf39/vPnmm2rvf/x3NGHCBHz44YeYMGECli1bhnbt2uHQoUM4evSo9skgokaNPb9E1CjUZIhC165dER8fj+zsbAwZMgTLly/H3Llz0aNHD62GOAQFBWHOnDk4cOAAunXrhsDAQJw/f17tPaNHjwYAREVF1WkfT4vvaU8/q+lxeXl54fz58/Dx8cH8+fMREhKCyZMnY+/evQgODla9r1evXjhx4gSioqLQt29fvPfeewgODkZ6errq5sDaxJOcnIy3334b4uW9bAAAAORJREFUW7ZswcCBA7F27VpMnDgRqampMDc3r/YYra2tkZqaij59+uDdd9/FiBEjkJ+fj507d9boeInIdElCm+4OIiKqkb///e/44osv8Msvv6Bp06bGDoeIyGRx2AMRkR5lZmYiNzcXGzZswFtvvcXCl4jIyNjzS0SkRzKZDLa2thgwYAC2bNmi87l9iYiodlj8EhEREZHJ4A1vRERERGQyWPwSERERkclg8UtEREREJoPFLxERERGZDBa/RERERGQyWPwSERERkclg8UtEREREJoPFLxERERGZDBa/RERERGQy/h9OVcs3g+N1FAAAAABJRU5ErkJggg==",
"text": [
"Figure(PyObject <matplotlib.figure.Figure object at 0x7fc17639b5d0>)"
]
},
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 52,
"text": [
"PyObject <matplotlib.text.Text object at 0x7fc1763c41d0>"
]
}
],
"prompt_number": 52
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"It looks like we achieve high prediction accuracy together with low model complexity at a purity threshold of about 60%."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Random Forests\n",
"[[back to top]](#Sections)\n",
"\n",
"Decision trees can be *bagged* (combined by bootstrap aggregating) to create ensemble tree learners, one such type being the [random forest](https://de.wikipedia.org/wiki/Random_Forest).\n",
"\n",
"Each tree in a random forest is trained on a bootstrapped sample of the original dataset, and for each split, only a randomly chosen subset of the features is considered. The trees are fully grown and not pruned, and the majority vote of their individual predictions is taken as the forest\u2019s overall prediction."
]
},
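{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"The three ingredients described above (bootstrap sampling, a random feature subset per split, and majority voting) can be sketched in a few lines of plain Julia. This is only an illustration, not how `build_forest` is implemented: the helper names `bootstrap_indices`, `random_feature_subset` and `majority_vote` are hypothetical, and the cell assumes a recent Julia (1.x) in which `Random` is a standard library."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"using Random\n",
"\n",
"# Hypothetical helper: draw n row indices uniformly with replacement (a bootstrap sample).\n",
"bootstrap_indices(rng, n) = rand(rng, 1:n, n)\n",
"\n",
"# Hypothetical helper: pick m distinct feature columns out of p to consider at one split.\n",
"random_feature_subset(rng, p, m) = randperm(rng, p)[1:m]\n",
"\n",
"# Hypothetical helper: majority vote over the per-tree predictions for one sample.\n",
"function majority_vote(votes)\n",
"    counts = Dict{eltype(votes),Int}()\n",
"    for v in votes\n",
"        counts[v] = get(counts, v, 0) + 1\n",
"    end\n",
"    best, best_count = first(votes), 0\n",
"    for v in votes  # scan in vote order so that ties go to the earliest label\n",
"        if counts[v] > best_count\n",
"            best, best_count = v, counts[v]\n",
"        end\n",
"    end\n",
"    return best\n",
"end\n",
"\n",
"rng = MersenneTwister(1)\n",
"rows = bootstrap_indices(rng, 150)       # assumes the 150 rows of the iris data\n",
"cols = random_feature_subset(rng, 4, 2)  # 2 of the 4 iris features\n",
"majority_vote([\"setosa\", \"virginica\", \"setosa\"])"
],
"language": "python",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": []
},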
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"Random forests are less prone to over-fitting than individual decision trees, and thus generalize better. They are also cheaper to train per split, since each split searches only a random subset of the features rather than the entire feature space. In exchange, they lose the transparency of a single decision tree and are less interpretable."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"forest = build_forest(labels, features, 2, 10);\n",
"[length(tree) for tree in forest.trees]'"
],
"language": "python",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 28,
"text": [
"1x10 Array{Any,2}:\n",
" 8 11 5 9 11 11 8 10 7 8"
]
}
],
"prompt_number": 28
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
"We can also use the forest to make predictions, and to perform 3-fold cross-validation with the same arguments:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"predictions = apply_forest(forest, features);\n",
"nfoldCV_forest(labels, features, 2, 10, 3);"
],
"language": "python",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"Fold 1"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 18 0 0\n",
" 0 15 1\n",
" 0 0 16\n",
"Accuracy: 0.98\n",
"Kappa: 0.969951923076923\n",
"\n",
"Fold 2\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 18 0 0\n",
" 0 16 2\n",
" 0 2 12\n",
"Accuracy: 0.92\n",
"Kappa: 0.8792270531400966\n",
"\n",
"Fold 3\n",
"Classes: {\"setosa\",\"versicolor\",\"virginica\"}\n",
"Matrix: 3x3 Array{Int64,2}:\n",
" 14 0 0\n",
" 1 15 0\n",
" 0 8 12\n",
"Accuracy: 0.82\n",
"Kappa: 0.7324613555291318\n",
"\n",
"Mean Accuracy: 0.9066666666666666\n"
]
}
],
"prompt_number": 53
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"###Summary\n",
"[[back to top]](#Sections)\n",
"\n",
"- Apart from making predictions, decision trees can be used for feature selection. \n",
"\n",
"- By looking at the top and bottom splits of a trained tree, we can get a sense of which features are worth keeping and which ones are worth discarding, respectively.\n",
"\n",
"- This way, if the decision trees or random forests provide poor accuracy results for the dataset at hand, we could pass on the features with more predictive power to another type of classifier and potentially speed it up drastically."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
}
],
"metadata": {}
}
]
}