Commit d3ac3df7 authored by ill-phil

updated plots for interactive slides

parent 43d2dc30
%% Cell type:markdown id: tags:
# Simple machine learning with Python & Scikit-learn
## Python - What's that?
- Programming language: Python
- Popular in general & especially in data analytics & data science
- Interpreted/scripting language -> no annoying compilation, the code just runs directly
- Who else uses Python???
    - Youtube
    - CERN
    - NASA
    - Wikipedia
    - Google
    - ...
- Popular applications written in Python/with Python interfaces:
    - 3D:
        - Blender
        - Cinema 4D
        - FreeCAD
        - Ultimaker Cura
    - 2D:
        - GIMP
        - Scribus
        - Inkscape
- Jupyter Notebooks:
    - Still want to remember why you did what, e.g. last month? Jupyter helps you structure and comment your code
    - THIS HERE is a Jupyter notebook, write code & text
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
%% Cell type:markdown id: tags:
## Want to do this yourself?
Install Anaconda & follow one of the countless guides online, e.g.
https://jupyter.readthedocs.io/en/latest/install.html#installing-jupyter-using-anaconda-and-conda
%% Cell type:markdown id: tags:
## Programming
### Libraries - using software written by people who can code better than me
Importing libraries -
https://docs.python.org/3/reference/import.html
What
> import ***;
is for Java, and
> #include <***.h>
is for C++,
> import ***
is for Python.
%% Cell type:code id: tags:
``` python
import numpy as np  # THE mathematics library for Python
# While importing, the name used to call the library can be redefined as well. Too lazy to write "numpy"? Rename it "np"
import pandas as pd  # powerful library for working with large datasets - I use it to load a simple CSV...
import pickle  # saving data for further use, hence "pickle"
from tqdm import tqdm  # nice progress bar for long-running tasks
print("Finished importing stuff")
```
%% Output
Finished importing stuff
%% Cell type:markdown id: tags:
<br>
<br>
<br>
<br>
<br>
<br>
<br>
### THE DATA
Well, data science requires some data. So I provided some here: [BIG CSV FILE - basically a spreadsheet to kill your Excel](./data4.csv)
Although the following data analysis could probably be done in Excel, that would not be fun, so rather use this notebook.
%% Cell type:code id: tags:
``` python
print("Loading 100MB CSV")
# Use pandas' read_csv function, which outputs a so-called DataFrame - see below what that can do
csv = pd.read_csv("./data_train.csv", encoding="ISO-8859-1")
print("DONE")
```
%% Output
Loading 100MB CSV
DONE
%% Cell type:markdown id: tags:
That wasn't really quick, but let's look at the info:
%% Cell type:code id: tags:
``` python
csv.head()
```
%% Output
   index          time  position       EX0       EX1       EX2  EBX_rawest  \
0    0.0  1.560785e+09     -0.00 -0.000146 -0.000148 -0.000149    0.000008
1    1.0  1.560785e+09     -0.05 -0.000148 -0.000150 -0.000151    0.000009
2    2.0  1.560785e+09     -0.10 -0.000151 -0.000153 -0.000154    0.000010
3    3.0  1.560785e+09     -0.15 -0.000154 -0.000156 -0.000157    0.000011
4    4.0  1.560785e+09     -0.20 -0.000157 -0.000160 -0.000161    0.000012

   ECX_rawest  EXX_rawest  EXX_rawest_abbe  ...   A Z-Axis  C Air New  \
0   -0.000013   -0.000146        -0.000149  ...  24.922462  24.405299
1   -0.000013   -0.000148        -0.000151  ...  24.933953  24.398915
2   -0.000012   -0.000151        -0.000155  ...  24.938310  24.396495
3   -0.000013   -0.000154        -0.000158  ...  24.942666  24.394075
4   -0.000016   -0.000157        -0.000162  ...  24.947021  24.391655

   M Table Front  M Table Side  A Rotation Table  M Bellow X  C X-Axis Air  \
0          25.62     25.346120         24.504701   25.036120     24.341418
1          25.62     25.343566         24.511085   25.033566     24.332481
2          25.62     25.342598         24.513505   25.032598     24.329092
3          25.62     25.341630         24.515925   25.031630     24.325705
4          25.62     25.340662         24.518345   25.030662     24.322317

   A Machine door  C Scale X-Axis  M Scale Y-Achse Case
0       24.026120       25.023880                 24.75
1       24.023566       25.026434                 24.75
2       24.022598       25.027402                 24.75
3       24.021630       25.028370                 24.75
4       24.020662       25.029338                 24.75

[5 rows x 40 columns]
%% Cell type:code id: tags:
``` python
print("This csv contains", csv.size, "fields")
```
%% Output
This csv contains 6252200 fields
%% Cell type:code id: tags:
``` python
print(csv.shape)
print("In", csv.shape[0], "rows")
```
%% Output
(156305, 40)
In 156305 rows
%% Cell type:markdown id: tags:
### THE DATA AGAIN
We're the machine tool laboratory, so this is obviously machine tool data ;)
<img src="img/Render_01.PNG" width="50%">
Over multiple weeks, we took a machine tool, <br>
made it move from 0 to 500mm in 50mm steps (see column position), <br>
measured the temperature at multiple points in the machine &<br>
measured the deviation between ideal position and real position using 3 laser interferometers.
TODO: Measurement setup
Metal expands & contracts with its thermal state, so now it's possible to correlate temperature and position deviation.
#### BUT WHY?
Make a machine tool more precise using __only software__, which is basically free - who would NOT do that?!
... well, to use this kind of software it might be necessary to calibrate the machine tool extensively, but that might still be a small cost factor compared to hardware modifications.
#### What are we trying to achieve?
Thermal influences on the machine are responsible for up to 75% of the target/actual deviation of the tool center position. They can be measured quite easily, but the functional dependence between the temperatures at multiple points in the machine and the deviation is not quite as simple.
One could model the machine in CAD, create thermo-mechanical analyses in FEM software, model the machine's environment and air flows for a CFD software and then run that software stack continuously (which is actually being done as well), or do the next best, but much simpler thing: <br>
Training an algorithm on data to predict the thermal deviation from multiple temperature sensors. This approach requires basically no knowledge about the system at all.
%% Cell type:code id: tags:
``` python
# Select the position column plus all columns from "A Z-Axis" up to and including "M Scale Y-Achse Case"
x = csv[
    ["position"] +
    list(csv.columns[list(csv.columns).index("A Z-Axis"):list(csv.columns).index("M Scale Y-Achse Case")+1])
]
print("This is the X part to train the algorithms on:\n", x)
```
%% Output
This is the X part to train the algorithms on:
         position   A Z-Axis  C Air New  M Table Front  M Table Side  \
0          -0.00  24.922462  24.405299          25.62     25.346120
1          -0.05  24.933953  24.398915          25.62     25.343566
2          -0.10  24.938310  24.396495          25.62     25.342598
3          -0.15  24.942666  24.394075          25.62     25.341630
4          -0.20  24.947021  24.391655          25.62     25.340662
...          ...        ...        ...            ...           ...
156300     -0.30  23.994117  23.785936          24.91     24.660000
156301     -0.35  23.989774  23.788305          24.91     24.660000
156302     -0.40  23.985429  23.790675          24.91     24.660000
156303     -0.45  23.981085  23.793044          24.91     24.660000
156304     -0.50  23.962510  23.803176          24.91     24.660000

        A Rotation Table  M Bellow X  C X-Axis Air  A Machine door  \
0              24.504701   25.036120     24.341418       24.026120
1              24.511085   25.033566     24.332481       24.023566
2              24.513505   25.032598     24.329092       24.022598
3              24.515925   25.031630     24.325705       24.021630
4              24.518345   25.030662     24.322317       24.020662
...                  ...         ...           ...             ...
156300         23.608075   24.180000     23.673957       22.808021
156301         23.605310   24.180000     23.675537       22.807232
156302         23.602546   24.180000     23.677117       22.806442
156303         23.599782   24.180000     23.678696       22.805652
156304         23.587961   24.180000     23.685451       22.802275

        C Scale X-Axis  M Scale Y-Achse Case
0            25.023880                 24.75
1            25.026434                 24.75
2            25.027402                 24.75
3            25.028370                 24.75
4            25.029338                 24.75
...                ...                   ...
156300       24.350000                 24.06
156301       24.350000                 24.06
156302       24.350000                 24.06
156303       24.350000                 24.06
156304       24.350000                 24.06

[156305 rows x 11 columns]
%% Cell type:code id: tags:
``` python
y = {}
mult = 1e6
print("FROM HERE ON, ALL Y-VALUES ARE IN µm, as there used to be problems with precision with very small y-values")
# Every column whose name contains "E" is an error/deviation signal - those become the prediction targets
for dataset in [dataset for dataset in list(csv.columns) if "E" in dataset]:
    y[dataset] = csv[[dataset]].values*mult
print("And those are the Y-Vectors:\n")
for i in y:
    print(i, y[i], "\n\n")
```
%% Output
FROM HERE ON, ALL Y-VALUES ARE IN µm, as there used to be problems with precision with very small y-values
And those are the Y-Vectors:
EX0 [[-145.58887844]
 [-147.70383915]
 [-150.63214641]
 ...
 [-163.62984328]
 [-163.42779672]
 [-161.77353804]]
EX1 [[-147.52466373]
 [-149.75382681]
 [-153.0037913 ]
 ...
 [-172.47091093]
 [-173.38001859]
 [-173.11703744]]
EX2 [[-148.69029205]
 [-150.68360674]
 [-153.56378397]
 ...
 [-165.64035997]
 [-165.34269771]
 [-163.72420809]]
EBX_rawest [[ 8.23738418]
 [ 8.72335174]
 [10.09210592]
 ...
 [37.62156446]
 [42.3498803 ]
 [48.27021022]]
ECX_rawest [[-13.19750469]
 [-12.67986208]
 [-12.47505343]
 ...
 [ -8.55539017]
 [ -8.14851484]
 [ -8.30072361]]
EXX_rawest [[-145.58887844]
 [-147.70383915]
 [-150.63214641]
 ...
 [-163.62984328]
 [-163.42779672]
 [-161.77353804]]
EXX_rawest_abbe [[-148.71554643]
 [-151.27421295]
 [-155.20160986]
 ...
 [-188.25386036]
 [-191.44300996]
 [-193.90254047]]
EXX [[  0.        ]
 [ -2.1149607 ]
 [ -5.04326797]
 ...
 [-11.22441365]
 [-11.02236709]
 [ -9.36810841]]
EXX_abbe [[  0.        ]
 [ -2.55866651]
 [ -6.48606343]
 ...
 [-18.38372173]
 [-21.57287132]
 [-24.03240183]]
EBX [[ 0.        ]
 [ 0.48596755]
 [ 1.85472173]
 ...
 [14.42182827]
 [19.15014412]
 [25.07047403]]
ECX [[  0.        ]
 [  0.51764261]
 [  0.72245126]
 ...
 [-14.67985857]
 [-14.27298324]
 [-14.42519201]]
EXX_compRB [[0.        ]
 [0.15308936]
 [0.20854658]
 ...
 [3.78256741]
 [3.72149844]
 [4.28200059]]
EXX_compRB_abbe [[0.        ]
 [0.22794299]
 [0.01507092]
 ...
 [2.97160382]
 [2.58834382]
 [3.0130646 ]]
EBX_compRB [[ 0.        ]
 [-0.03857213]
 [ 0.39504006]
 ...
 [ 1.09849965]
 [ 1.45237801]
 [ 1.61783558]]
ECX_compRB [[ 0.        ]
 [-0.23926568]
 [-0.41526189]
 ...
 [ 0.21006921]
 [ 0.58245007]
 [ 0.68225539]]
EXX_alpha_lit_raw [[-145.58887844]
 [-137.69326552]
 [-130.61022469]
 ...
 [ -85.70984328]
 [ -75.76779672]
 [ -64.37353804]]
EXX_alpha_lit_raw_abbe [[-148.71554643]
 [-141.26363932]
 [-135.17968815]
 ...
 [-110.33386036]
 [-103.78300996]
 [ -96.50254047]]
EXX_alpha_calc_raw [[-145.58887844]
 [-132.89787225]
 [-121.01906716]
 ...
 [ -48.38360627]
 [ -33.77578008]
 [ -17.71574178]]
EXX_alpha_calc_raw_abbe [[-148.71554643]
 [-140.70140285]
 [-134.05517172]
 ...
 [-105.95754117]
 [ -98.85965087]
 [ -91.03214148]]
EXX_alpha_lit [[ 0.        ]
 [ 7.89561293]
 [14.97865375]
 ...
 [66.69558635]
 [76.63763291]
 [88.03189159]]
EXX_alpha_lit_abbe [[ 0.        ]
 [ 7.45190711]
 [13.53585828]
 ...
 [59.53627827]
 [66.08712868]
 [73.36759817]]
EXX_alpha_calc [[  0.        ]
 [ 12.6910062 ]
 [ 24.56981129]
 ...
 [104.02182336]
 [118.62964954]
 [134.68968785]]
EXX_alpha_calc_abbe [[ 0.        ]
 [ 8.01414358]
 [14.66037471]
 ...
 [63.91259746]
 [71.01048777]
 [78.83799716]]
EXX_alpha_lit_comp [[0.        ]
 [0.14832618]
 [0.19971855]
 ...
 [1.57852137]
 [1.24194088]
 [1.52693028]]
EXX_alpha_lit_abbe_comp [[0.        ]
 [0.22317981]
 [0.00624288]
 ...
 [0.76755777]
 [0.10878626]
 [0.2579943 ]]
EXX_alpha_calc_comp [[0.        ]
 [0.14604446]
 [0.19548963]
 ...
 [0.52271098]
 [0.05415144]
 [0.2071612 ]]
EXX_alpha_calc_abbe_comp [[ 0.        ]
 [ 0.22291229]
 [ 0.00574706]
 ...
 [ 0.64376915]
 [-0.03047626]
 [ 0.10325781]]
%% Cell type:markdown id: tags:
### Algorithm Choice
Choosing a suitable algorithm depends on multiple factors.
First off, the task at hand requires "Supervised Learning" - a function should be trained on existing examples f(x)=y to predict y for a given x.
Many supervised learning algorithms can do __CLASSIFICATION__ (Is this an apple or a banana?), e.g. Neural Networks, Naive Bayes, Decision Trees, Support Vector Machines, etc. <br>
But classification is not interesting at all here. The output should be a continuous variable - a task called __REGRESSION.__ <br>
Suitable algorithms here are Neural Networks (again), Support Vector Regression, or multiple others shown later on.
So what's __THE BEST__ algorithm for the task??
I have __NO__ idea. <br>
Also, it depends.
Often with machine learning a general direction can be given, but multiple possible solutions with similar prediction quality exist, so for a first shot, use trial and error with a bit of additional brain power and __A WHOLE LOT OF COMPUTING POWER__ - a quick cross-validated comparison of a few candidate regressors, as sketched below, is a good starting point.
<br>
<br>
<br>
<br>
<br>
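A minimal sketch of that trial-and-error idea: compare a few regressors with 3-fold cross-validation. The subsample step of 50 and the model settings are arbitrary assumptions, chosen only to keep the runtime short.
%% Cell type:code id: tags:
``` python
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

# Small subsample so this runs in seconds, not hours
x_small = x.values[::50]
y_small = y["EXX"].ravel()[::50]

candidates = {
    "LinearRegression": LinearRegression(),
    "KNeighborsRegressor": KNeighborsRegressor(n_neighbors=5),
    "DecisionTreeRegressor": DecisionTreeRegressor(max_depth=10),
}
for name, model in candidates.items():
    scores = cross_val_score(model, x_small, y_small, cv=3)  # default scoring for regressors: R^2
    print(name, "mean R^2:", scores.mean())
```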
%% Cell type:code id: tags:
``` python
y["EXX"]
```
%% Output
array([[  0.        ],
       [ -2.1149607 ],
       [ -5.04326797],
       ...,
       [-11.22441365],
       [-11.02236709],
       [ -9.36810841]])
%% Cell type:markdown id: tags:
### Programming with the algorithms
General direction:
1. Import the model to be used explicitly
    - > from sklearn.neural_network import MLPRegressor
1. Create an instance of that object, give it some settings (more on that later)
    - > regressor_model = MLPRegressor([hidden_layer_sizes=(100, ), activation='relu', solver='adam', alpha=0.0001, batch_size='auto', learning_rate='constant', learning_rate_init=0.001, ...])
    - Mainly the hidden layers & the number of iterations are interesting - but for everything there are useful default values inside *scikit-learn.*
    - The format of the first and the last layer is automatically set by the format of the input & output data
1. Train it using regressor_model.fit(X, y)
<br><br>
1. To predict with the model on new data, now simply use regressor_model.predict(x)
%% Cell type:code id: tags:
``` python
from sklearn.neural_network import MLPRegressor
regressor_model = MLPRegressor(hidden_layer_sizes=(50, ), max_iter=20, verbose=1, alpha=0.001, batch_size='auto', learning_rate='constant', learning_rate_init=0.01)
regressor_model.fit(x, y["EXX"].ravel())
```
%% Output
Iteration 1, loss = 8.38764054
Iteration 2, loss = 3.72069698
Iteration 3, loss = 2.57127449
Iteration 4, loss = 2.22488600
Iteration 5, loss = 2.02794688
Iteration 6, loss = 1.89214184
Iteration 7, loss = 1.69559486
Iteration 8, loss = 1.61075341
Iteration 9, loss = 1.54779916
Iteration 10, loss = 1.40130177
Iteration 11, loss = 1.31125627
Iteration 12, loss = 1.36531385
Iteration 13, loss = 1.31427094
Iteration 14, loss = 1.22763524
Iteration 15, loss = 1.17399099
Iteration 16, loss = 1.22866807
Iteration 17, loss = 1.15295456
Iteration 18, loss = 1.13846861
Iteration 19, loss = 1.10823847
Iteration 20, loss = 1.13801330
C:\Anaconda3\envs\jupyter\lib\site-packages\sklearn\neural_network\_multilayer_perceptron.py:571: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (20) reached and the optimization hasn't converged yet.
  % self.max_iter, ConvergenceWarning)
MLPRegressor(activation='relu', alpha=0.001, batch_size='auto', beta_1=0.9,
             beta_2=0.999, early_stopping=False, epsilon=1e-08,
             hidden_layer_sizes=(50,), learning_rate='constant',
             learning_rate_init=0.01, max_fun=15000, max_iter=20, momentum=0.9,
             n_iter_no_change=10, nesterovs_momentum=True, power_t=0.5,
             random_state=None, shuffle=True, solver='adam', tol=0.0001,
             validation_fraction=0.1, verbose=1, warm_start=False)
%% Cell type:markdown id: tags:
Soooo, what now? What do I do with that thing??
Obviously use it for its intended purpose: Prediction.
Let's try:
%% Cell type:code id: tags:
``` python
print([x.values[12345]])
print("Prediction", regressor_model.predict([x.values[12345]]))
print("Reality:", y["EXX"][12345])
```
%% Output
[array([-0.25      , 24.95746094, 24.72      , 25.77      , 25.52500601,
       24.77252704, 25.16      , 24.66      , 24.36750901, 25.14      ,
       24.85500601])]
Prediction [-11.65489425]
Reality: [-12.17030708]
%% Cell type:markdown id: tags:
An exemplary prediction with an error of around 1µm, I like that :)
If it's not around 1µm, rest assured that it's not due to my lack of subtraction skills, but due to the random initial values for all nodes in the network - every training run starts from different weights and therefore ends at a slightly different model.
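A single sample is a weak check, though. As a quick (and admittedly optimistic, since it reuses the training data) sanity check, the error can also be computed over all rows - a minimal sketch:
%% Cell type:code id: tags:
``` python
from sklearn.metrics import mean_absolute_error, r2_score

# Predict all 156305 rows at once and compare against the measured EXX values
y_pred = regressor_model.predict(x)
print("Mean absolute error [µm]:", mean_absolute_error(y["EXX"].ravel(), y_pred))
print("R^2 score:", r2_score(y["EXX"].ravel(), y_pred))
```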
%% Cell type:code id: tags:
``` python
# As said before, I will test multiple methods against each other
# So the results (like regressor_model) need to be stored in the following dictionary
eval_tools = {
    "KNeighborsRegressor": {},
    "LinearRegression": {},
    "RandomForestClassifier": {},
    "SVM": {},
    "NN": {},
}
print("Trained models will be stored here:", eval_tools)
```
%% Output
Trained models will be stored here: {'KNeighborsRegressor': {}, 'LinearRegression': {}, 'RandomForestClassifier': {}, 'SVM': {}, 'NN': {}}
%% Cell type:markdown id: tags:
<br>
<br>
<br>
<br>
<br>
#### Standard Scaling
The X data might be very different. Here it's temperatures, so values between 15 and 40. But also positions with values between -0.5 & 0. Data could also be in much higher numbers.
Some algorithms do not like that. To make all data look alike, there is the so-called *standard scaler* in sklearn.
$$ z = \frac{x - u}{s} $$
with $z$ as the new, scaled value, $x$ as the old value, $u$ the average over all $x$s and $s$ as the standard deviation over all $x$s.
So afterwards everything is (per definition) in a range around 0, with roughly 95% between -2 and +2 for approximately normal data - see standard deviation.
%% Cell type:code id: tags:
``` python
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
x_scaled = sc.fit_transform(x)
x_scaled
# Wow, now it looks really boring and non-human-readable
```
%% Output
array([[ 1.71505175, -0.66339907, -0.78043436, ..., -0.42618374,
        -0.60411572, -0.56115931],
       [ 1.37205413, -0.64987993, -0.7894403 , ..., -0.42860905,
        -0.60028578, -0.56115931],
       [ 1.02905651, -0.64475477, -0.79285448, ..., -0.42952849,
        -0.59883384, -0.56115931],
       ...,
       [-1.02892923, -1.76580651, -1.64749359, ..., -1.58459194,
        -1.61482099, -1.46201097],
       [-1.37192685, -1.77091687, -1.64415116, ..., -1.58534204,
        -1.61482099, -1.46201097],
       [-1.71492448, -1.79277049, -1.62985781, ..., -1.58854971,
        -1.61482099, -1.46201097]])
%% Cell type:markdown id: tags:
<br>
<br>
<br>
<br>
<br>
#### Using smart Trial & Error - *Hyperparameter Tuning*
There are multiple parameters for the algorithm which can be set beforehand (see above). To trade computing power for brain power, let the computer find the best parameters for the algorithm.
That's what *GridSearchCV* is for.
Instead of training the regressors directly using all the data, GridSearchCV is given multiple values for each parameter of the algorithm and trains the algorithm with all possible combinations of the parameters, but only on a subset of the training data & tests the algorithm on the remaining data. The combination of parameters which is most accurate at predicting wins.
Guess what, that might take a while...
"A while" might be anything between getting a fresh coffee and it being done, and going on holiday & coming back to 1% progress - it obviously depends on the parameter choice.
That's also the reason why it will not be done live here. For this data here, the time range was a short city trip holiday on a high performance computer.
Instead of using the exhaustive approach of GridSearchCV, it's also possible to take $n$ random combinations of parameters.
-> RandomizedSearchCV
__Try it yourself!__ A small RandomizedSearchCV sketch follows below.
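A minimal RandomizedSearchCV sketch - the subsample step of 50, the parameter ranges and the tiny `n_iter` are arbitrary assumptions, chosen purely so it finishes quickly:
%% Cell type:code id: tags:
``` python
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neighbors import KNeighborsRegressor

search = RandomizedSearchCV(
    KNeighborsRegressor(),
    param_distributions={
        "n_neighbors": [2, 5, 10, 20, 50],
        "weights": ["uniform", "distance"],
    },
    n_iter=5,   # try only 5 random parameter combinations instead of all 10
    cv=3,
    verbose=1,
)
search.fit(x_scaled[::50], y["EXX"].ravel()[::50])
print("Best parameters:", search.best_params_)
```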
%% Cell type:code id: tags:
``` python
from sklearn.linear_model import LinearRegression
import pickle
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.svm import SVR
```
%% Cell type:code id: tags:
``` python
print("this code commented out does the heavy lifting. Not doing that today.")
for error_name, y_train in tqdm(y.items()):
    y_train = y_train.ravel()
    n_jobs = 2
    """
    neigh = KNeighborsRegressor(n_jobs=n_jobs)  # n_jobs is the number of processors
    params = {
        "n_neighbors": [5, 10, 20, 50],  # [4, 6, 8, 10, 12,]
        "leaf_size": [200],  # [2, 4, 8, 16, 32],
        "weights": ["distance"],
        "algorithm": ["auto"],
        "n_jobs": [n_jobs],
    }
    # Making models with hyper parameter sets
    # cv (cross validation) groups can be raised...
    neigh1 = GridSearchCV(neigh, param_grid=params, n_jobs=n_jobs, cv=3, verbose=1)
    # Learning
    neigh1.fit(x_scaled, y_train)
    # The best hyper parameter set
    reg = LinearRegression().fit(x_scaled, y_train)
    clfs = RandomForestRegressor(n_jobs=n_jobs, oob_score=True)
    params = {
        'n_estimators': [50, 100, 200],
        'max_depth': [5, 10, 20, 50],
        'n_jobs': [n_jobs]
    }
    clfs = GridSearchCV(clfs, param_grid=params, n_jobs=n_jobs, cv=3, verbose=1)
    clfs.fit(x_scaled, y_train)
    # The best hyper parameter set
    # print("Best Hyper Parameters:\n", clfs.best_params_)
    # y_pred = clfs.predict(X_test)
    # score = mean_absolute_error(y_test, y_pred)
    # print(score)
    svm = GridSearchCV(
        estimator=SVR(kernel='rbf'),
        param_grid={
            'C': [0.1, 1, 10, 100, 1000],
            'epsilon': [0.0001, 0.001, 0.01, 0.1, 1, 10],
            'gamma': [0.0001, 0.001, 0.01, 0.1, 1, 5]
        }, verbose=1, n_jobs=n_jobs)
    svm.fit(x_scaled, y_train)
    mlp = MLPRegressor()
    param_grid = {'hidden_layer_sizes': [(100), (100, 100), (50), (50, 50)],
                  'activation': ['relu'],
                  'solver': ['adam'],
                  'learning_rate': ['adaptive'],
                  'learning_rate_init': [0.01],
                  'power_t': [0.5],
                  'alpha': [0.0001],
                  'max_iter': [200, 1000],
                  'early_stopping': [False],
                  'warm_start': [False]}
    nn = GridSearchCV(mlp, param_grid=param_grid, verbose=True, n_jobs=n_jobs)
    nn.fit(x_scaled, y_train)
    eval_tools["KNeighborsRegressor"][error_name] = neigh1
    eval_tools["LinearRegression"][error_name] = reg
    eval_tools["RandomForestClassifier"][error_name] = clfs
    eval_tools["StandardScaler"] = sc
    eval_tools["NN"][error_name] = nn
    eval_tools["SVM"][error_name] = svm
    for i in eval_tools:
        print(i)
        filename = "ML_eval_tools_" + i + "_" + x_train_name + ".sav"
        pickle.dump(eval_tools[i], open(filename, "wb"))
    """
```
%% Output
100%|██████████████████████████████████████████████████████████████████████████████████████████| 27/27 [00:00<?, ?it/s]
this code commented out does the heavy lifting. Not doing that today.
%% Cell type:markdown id: tags:
Instead, let's just try something out with a subset of the original data and illegally shortened training phases.
%% Cell type:code id: tags:
``` python
eval_tools = {
    "LinearRegression": {},
    "NN": {},
}
for error_name in tqdm(["EXX", "EXX_compRB"]):
    regressor_model = MLPRegressor(hidden_layer_sizes=(50, ), max_iter=20, verbose=1, alpha=0.001, batch_size='auto', learning_rate='constant', learning_rate_init=0.01)
    regressor_model.fit(x_scaled, y[error_name].ravel())
    reg = LinearRegression().fit(x_scaled, y[error_name].ravel())
    eval_tools["StandardScaler"] = sc
    eval_tools["LinearRegression"][error_name] = reg
    eval_tools["NN"][error_name] = regressor_model
for i in eval_tools:
    print(i)
    filename = "ML_eval_tools_" + i + ".sav"
    pickle.dump(eval_tools[i], open(filename, "wb"))
```
%% Output
  0%|                                                                                           | 0/2 [00:00<?, ?it/s]
Iteration 1, loss = 2.37517953
Iteration 2, loss = 0.12905149
Iteration 3, loss = 0.09789837
Iteration 4, loss = 0.06032041
Iteration 5, loss = 0.05504129
Iteration 6, loss = 0.05392060
Iteration 7, loss = 0.05141265
Iteration 8, loss = 0.05067939
Iteration 9, loss = 0.05143948
Iteration 10, loss = 0.04910270
Iteration 11, loss = 0.05031202
Iteration 12, loss = 0.04809143
Iteration 13, loss = 0.04870659
Iteration 14, loss = 0.04770111
Iteration 15, loss = 0.04766778
Iteration 16, loss = 0.04743532
Iteration 17, loss = 0.04751504
Iteration 18, loss = 0.04508614
Iteration 19, loss = 0.04696037
C:\Anaconda3\lib\site-packages\sklearn\neural_network\multilayer_perceptron.py:566: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (20) reached and the optimization hasn't converged yet.
  % self.max_iter, ConvergenceWarning)
 50%|██████████████████████████████████████████                                          | 1/2 [00:09<00:09,  9.79s/it]
Iteration 20, loss = 0.04541264
Iteration 1, loss = 0.08507141
Iteration 2, loss = 0.04744902
Iteration 3, loss = 0.04485742
Iteration 4, loss = 0.04368564
Iteration 5, loss = 0.04222761
Iteration 6, loss = 0.04236689
Iteration 7, loss = 0.04155312
Iteration 8, loss = 0.04141260
Iteration 9, loss = 0.04099000
Iteration 10, loss = 0.04064505
Iteration 11, loss = 0.04033634
Iteration 12, loss = 0.04061129
Iteration 13, loss = 0.03987049
Iteration 14, loss = 0.04025951
Iteration 15, loss = 0.03992405
Iteration 16, loss = 0.04006715
Iteration 17, loss = 0.03967800
Iteration 18, loss = 0.03929736
Iteration 19, loss = 0.03960700
C:\Anaconda3\lib\site-packages\sklearn\neural_network\multilayer_perceptron.py:566: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (20) reached and the optimization hasn't converged yet.
  % self.max_iter, ConvergenceWarning)
100%|████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:19<00:00,  9.96s/it]
Iteration 20, loss = 0.03957958
LinearRegression
NN
StandardScaler
%% Cell type:markdown id: tags:
## Validation Data
So the neural network can, as shown, reproduce the EXX error now. But the common (and smart) way of validating machine learning algorithms is a split into three parts:
- Training dataset (biggest part)
    - Data which the algorithm is trained with, hence the name
- Test set (small part)
    - This dataset is used to check after each iteration of training how well the trained model performs
- Validation dataset (small part)
    - This subset is kept back until the training is finished.
    - For the purpose described above, it is crucial that the algorithm of choice also performs well in a temperature range which was not included in the training dataset - the correction model for thermal drift of the machine tool should not produce garbage in new temperature states, e.g. on a hot summer day.
<br>The chosen model should be able to extrapolate - many machine learning models lack in that regard. This can be tested with the validation dataset, e.g. with a split like the sketch below.
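A minimal sketch of such a three-way split with scikit-learn - the 70/15/15 ratio is an arbitrary assumption, and for the extrapolation test described above one would rather split by temperature range than randomly:
%% Cell type:code id: tags:
``` python
from sklearn.model_selection import train_test_split

y_exx = y["EXX"].ravel()
# First split off 30%, then halve that 30% into test & validation
x_train, x_rest, y_train, y_rest = train_test_split(x_scaled, y_exx, test_size=0.3)
x_test, x_val, y_test, y_val = train_test_split(x_rest, y_rest, test_size=0.5)
print("Train:", x_train.shape, "Test:", x_test.shape, "Validation:", x_val.shape)
```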
%% Cell type:markdown id: tags:
## White Box Knowledge
So, machine learning is nice in the regard that no prior knowledge about the predicted system is necessary.
__But I HAVE SOME KNOWLEDGE, LET ME USE THAT!__
Also, using some existing knowledge can be quite beneficial.
- Machine learning creates a black box, which spits out some result - I do not know what it did to get there (yes, for the more simple algorithms I could find out)
- I could create some kind of hybrid model and mix my own knowledge (white box) and the machine learning black box. Black and white mixed is grey, thus grey box. <br>
Some approaches might not even work without additional white box modelling, see below.
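A minimal sketch of that grey box idea: let a simple physical formula explain the bulk of the drift and train the ML part only on the residual. The linear expansion term, the expansion coefficient, the reference length, the 20 °C reference and the choice of the "A Z-Axis" sensor are all stand-in assumptions here, not the actual machine model.
%% Cell type:code id: tags:
``` python
import numpy as np
from sklearn.linear_model import LinearRegression

# White box part: linear thermal expansion delta_L = alpha * L * delta_T (assumed values!)
alpha = 11.8e-6  # thermal expansion coefficient of steel [1/K], literature value
L = 0.5e6        # reference length in µm (500 mm)
delta_T = x["A Z-Axis"].values - 20.0  # temperature rise over an assumed 20 °C reference
white_box = alpha * L * delta_T        # physically motivated drift estimate in µm

# Black box part: learn only what the physical formula cannot explain
residual = y["EXX"].ravel() - white_box
grey_rest = LinearRegression().fit(x_scaled, residual)

# Grey box prediction = white box + black box
prediction = white_box + grey_rest.predict(x_scaled)
print("Mean absolute grey box error [µm]:", np.mean(np.abs(y["EXX"].ravel() - prediction)))
```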
%% Cell type:markdown id: tags:
## Result Discussion
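To discuss results, the pickled models from above can be loaded back and scored, e.g. with $R^2$ - a minimal sketch, assuming the .sav files written by the cell above exist next to the notebook (and, again optimistically, scoring on the training data):
%% Cell type:code id: tags:
``` python
import pickle
from sklearn.metrics import r2_score

for name in ["LinearRegression", "NN"]:
    # Each file holds a dict mapping the error name to its trained model
    models = pickle.load(open("ML_eval_tools_" + name + ".sav", "rb"))
    for error_name, model in models.items():
        score = r2_score(y[error_name].ravel(), model.predict(x_scaled))
        print(name, error_name, "R^2:", round(score, 4))
```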
%% Cell type:code id: tags:
``` python
```
%% Cell type:code id: tags:
``` python
```