Commit d3ac3df7 authored by ill-phil

updated plots for interactive slides

parent 43d2dc30
%% Cell type:markdown id: tags:
# Simple machine learning with Python & Scikit-learn
## Python - What's that?
- Programming language: Python
- Popular in general & especially in data analytics & data science
- Interpreted/scripting language -> no annoying compilation, the code just runs directly
- Who else uses Python???
    - Youtube
    - CERN
    - NASA
    - Wikipedia
    - Google
    - ...
- Popular applications written in Python/with Python interfaces:
    - 3D:
        - Blender
        - Cinema 4D
        - FreeCAD
        - Ultimaker Cura
    - 2D:
        - GIMP
        - Scribus
        - Inkscape
- Jupyter Notebooks:
    - Still want to remember why you did what, e.g. last month? Jupyter helps you structure and comment your code
    - THIS HERE is a Jupyter notebook, write code & text
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
%% Cell type:markdown id: tags:
## Want to do this yourself?
Install Anaconda & follow one of the countless guides online, e.g.
https://jupyter.readthedocs.io/en/latest/install.html#installing-jupyter-using-anaconda-and-conda
%% Cell type:markdown id: tags:
## Programming
### Libraries - using software written by people who can code better than me
Importing libraries -
https://docs.python.org/3/reference/import.html
What
> import ***;
is for Java, and
> #include <***.h>
is for C++,
> import ***
is for Python.
%% Cell type:code id: tags:
``` python
import numpy as np  # THE mathematics library for Python
# While importing, the name used to call the library can be redefined as well. Too lazy to write "numpy"? Rename it "np"
import pandas as pd  # powerful library for working with large datasets - I use it to load a simple CSV...
import pickle  # saving data for further use, hence "pickle"
from tqdm import tqdm  # nice progress bar for long-running tasks
print("Finished importing stuff")
```
%% Output
Finished importing stuff
%% Cell type:markdown id: tags:
<br>
<br>
<br>
<br>
<br>
<br>
<br>
### THE DATA
Well, data science requires some data. So I provided some here: [BIG CSV FILE - basically a spreadsheet to kill your Excel](./data4.csv)
Although the following data analysis could probably be done in Excel, that would not be fun, so rather use this notebook.
%% Cell type:code id: tags:
``` python
print("Loading 100MB CSV")
# Use pandas' read_csv function, which outputs a so-called DataFrame - see below what that can do
csv = pd.read_csv("./data_train.csv", encoding="ISO-8859-1")
print("DONE")
```
%% Output
Loading 100MB CSV
DONE
%% Cell type:markdown id: tags:
That wasn't really quick, but let's look at the info:
%% Cell type:code id: tags:
``` python
csv.head()
```
%% Output
   index          time  position       EX0       EX1       EX2  EBX_rawest  \
0    0.0  1.560785e+09     -0.00 -0.000146 -0.000148 -0.000149    0.000008
1    1.0  1.560785e+09     -0.05 -0.000148 -0.000150 -0.000151    0.000009
2    2.0  1.560785e+09     -0.10 -0.000151 -0.000153 -0.000154    0.000010
3    3.0  1.560785e+09     -0.15 -0.000154 -0.000156 -0.000157    0.000011
4    4.0  1.560785e+09     -0.20 -0.000157 -0.000160 -0.000161    0.000012

   ECX_rawest  EXX_rawest  EXX_rawest_abbe  ...   A Z-Axis  C Air New  \
0   -0.000013   -0.000146        -0.000149  ...  24.922462  24.405299
1   -0.000013   -0.000148        -0.000151  ...  24.933953  24.398915
2   -0.000012   -0.000151        -0.000155  ...  24.938310  24.396495
3   -0.000013   -0.000154        -0.000158  ...  24.942666  24.394075
4   -0.000016   -0.000157        -0.000162  ...  24.947021  24.391655

   M Table Front  M Table Side  A Rotation Table  M Bellow X  C X-Axis Air  \
0          25.62     25.346120         24.504701   25.036120     24.341418
1          25.62     25.343566         24.511085   25.033566     24.332481
2          25.62     25.342598         24.513505   25.032598     24.329092
3          25.62     25.341630         24.515925   25.031630     24.325705
4          25.62     25.340662         24.518345   25.030662     24.322317

   A Machine door  C Scale X-Axis  M Scale Y-Achse Case
0       24.026120       25.023880                 24.75
1       24.023566       25.026434                 24.75
2       24.022598       25.027402                 24.75
3       24.021630       25.028370                 24.75
4       24.020662       25.029338                 24.75

[5 rows x 40 columns]
%% Cell type:code id: tags:
``` python
print("This csv contains", csv.size, "fields")
```
%% Output
This csv contains 6252200 fields
%% Cell type:code id: tags:
``` python
print(csv.shape)
print("In", csv.shape[0], "rows")
```
%% Output
(156305, 40)
In 156305 rows
%% Cell type:markdown id: tags:
### THE DATA AGAIN
We're the machine tool laboratory, so this is obviously machine tool data ;)
<img src="img/Render_01.PNG" width="50%">
Over multiple weeks, we took a machine tool, <br>
made it move from 0 to 500mm in 50mm steps (see column position), <br>
measured the temperature at multiple points in the machine &<br>
measured the deviation between ideal position and real position using 3 laser interferometers.
TODO: Measurement setup
Metal expands & contracts with its thermal state, so now it's possible to correlate temperature and position deviation.
#### BUT WHY?
Make a machine tool more precise using __only software__, which is basically free - who would NOT do that?!
... well, to use this kind of software it might be necessary to calibrate the machine tool extensively, but that might still be a small cost factor compared to hardware modifications.
#### What are we trying to achieve?
Thermal influences on the machine are responsible for up to 75% of the target/actual deviation of the tool center position. They can be measured quite easily, but the functional dependence between the temperatures at multiple points in the machine and the deviation is not quite as simple.
One could model the machine in CAD, create thermo-mechanical analyses in FEM software, model the machine's environment and air flows for a CFD software and then run that software stack continuously (which is actually being done as well), or do the next best, but much simpler thing: <br>
Training an algorithm on data to predict the thermal deviation from multiple temperature sensors. This approach requires basically no knowledge about the system at all.
%% Cell type:code id: tags:
``` python
# Select the position column plus all columns from "A Z-Axis" up to and including "M Scale Y-Achse Case"
x = csv[
    ["position"] +
    list(csv.columns[list(csv.columns).index("A Z-Axis"):list(csv.columns).index("M Scale Y-Achse Case")+1])
]
print("This is the X part to train the algorithms on:\n", x)
```
%% Output
This is the X part to train the algorithms on:
         position   A Z-Axis  C Air New  M Table Front  M Table Side  \
0          -0.00  24.922462  24.405299          25.62     25.346120
1          -0.05  24.933953  24.398915          25.62     25.343566
2          -0.10  24.938310  24.396495          25.62     25.342598
3          -0.15  24.942666  24.394075          25.62     25.341630
4          -0.20  24.947021  24.391655          25.62     25.340662
...          ...        ...        ...            ...           ...
156300     -0.30  23.994117  23.785936          24.91     24.660000
156301     -0.35  23.989774  23.788305          24.91     24.660000
156302     -0.40  23.985429  23.790675          24.91     24.660000
156303     -0.45  23.981085  23.793044          24.91     24.660000
156304     -0.50  23.962510  23.803176          24.91     24.660000

        A Rotation Table  M Bellow X  C X-Axis Air  A Machine door  \
0              24.504701   25.036120     24.341418       24.026120
1              24.511085   25.033566     24.332481       24.023566
2              24.513505   25.032598     24.329092       24.022598
3              24.515925   25.031630     24.325705       24.021630
4              24.518345   25.030662     24.322317       24.020662
...                  ...         ...           ...             ...
156300         23.608075   24.180000     23.673957       22.808021
156301         23.605310   24.180000     23.675537       22.807232
156302         23.602546   24.180000     23.677117       22.806442
156303         23.599782   24.180000     23.678696       22.805652
156304         23.587961   24.180000     23.685451       22.802275

        C Scale X-Axis  M Scale Y-Achse Case
0            25.023880                 24.75
1            25.026434                 24.75
2            25.027402                 24.75
3            25.028370                 24.75
4            25.029338                 24.75
...                ...                   ...
156300       24.350000                 24.06
156301       24.350000                 24.06
156302       24.350000                 24.06
156303       24.350000                 24.06
156304       24.350000                 24.06

[156305 rows x 11 columns]
%% Cell type:code id: tags:
``` python
y = {}
mult = 1e6
print("FROM HERE ON, ALL Y-VALUES ARE IN µm, as there used to be problems with precision with very small y-values")
# Every column whose name contains "E" is an error/deviation signal - those become the prediction targets
for dataset in [dataset for dataset in list(csv.columns) if "E" in dataset]:
    y[dataset] = csv[[dataset]].values*mult
print("And those are the Y-Vectors:\n")
for i in y:
    print(i, y[i], "\n\n")
```
%% Output
FROM HERE ON, ALL Y-VALUES ARE IN µm, as there used to be problems with precision with very small y-values
And those are the Y-Vectors:
EX0 [[-145.58887844]
 [-147.70383915]
 [-150.63214641]
 ...
 [-163.62984328]
 [-163.42779672]
 [-161.77353804]]
EX1 [[-147.52466373]
 [-149.75382681]
 [-153.0037913 ]
 ...
 [-172.47091093]
 [-173.38001859]
 [-173.11703744]]
EX2 [[-148.69029205]
 [-150.68360674]
 [-153.56378397]
 ...
 [-165.64035997]
 [-165.34269771]
 [-163.72420809]]
EBX_rawest [[ 8.23738418]
 [ 8.72335174]
 [10.09210592]
 ...
 [37.62156446]
 [42.3498803 ]
 [48.27021022]]
ECX_rawest [[-13.19750469]
 [-12.67986208]
 [-12.47505343]
 ...
 [ -8.55539017]
 [ -8.14851484]
 [ -8.30072361]]
EXX_rawest [[-145.58887844]
 [-147.70383915]
 [-150.63214641]
 ...
 [-163.62984328]
 [-163.42779672]
 [-161.77353804]]
EXX_rawest_abbe [[-148.71554643]
 [-151.27421295]
 [-155.20160986]
 ...
 [-188.25386036]
 [-191.44300996]
 [-193.90254047]]
EXX [[  0.        ]
 [ -2.1149607 ]
 [ -5.04326797]
 ...
 [-11.22441365]
 [-11.02236709]
 [ -9.36810841]]
EXX_abbe [[  0.        ]
 [ -2.55866651]
 [ -6.48606343]
 ...
 [-18.38372173]
 [-21.57287132]
 [-24.03240183]]
EBX [[ 0.        ]
 [ 0.48596755]
 [ 1.85472173]
 ...
 [14.42182827]
 [19.15014412]
 [25.07047403]]
ECX [[  0.        ]
 [  0.51764261]
 [  0.72245126]
 ...
 [-14.67985857]
 [-14.27298324]
 [-14.42519201]]
EXX_compRB [[0.        ]
 [0.15308936]
 [0.20854658]
 ...
 [3.78256741]
 [3.72149844]
 [4.28200059]]
EXX_compRB_abbe [[0.        ]
 [0.22794299]
 [0.01507092]
 ...
 [2.97160382]
 [2.58834382]
 [3.0130646 ]]
EBX_compRB [[ 0.        ]
 [-0.03857213]
 [ 0.39504006]
 ...
 [ 1.09849965]
 [ 1.45237801]
 [ 1.61783558]]
ECX_compRB [[ 0.        ]
 [-0.23926568]
 [-0.41526189]
 ...
 [ 0.21006921]
 [ 0.58245007]
 [ 0.68225539]]
EXX_alpha_lit_raw [[-145.58887844]
 [-137.69326552]
 [-130.61022469]
 ...
 [ -85.70984328]
 [ -75.76779672]
 [ -64.37353804]]
EXX_alpha_lit_raw_abbe [[-148.71554643]
 [-141.26363932]
 [-135.17968815]
 ...
 [-110.33386036]
 [-103.78300996]
 [ -96.50254047]]
EXX_alpha_calc_raw [[-145.58887844]
 [-132.89787225]
 [-121.01906716]
 ...
 [ -48.38360627]
 [ -33.77578008]
 [ -17.71574178]]
EXX_alpha_calc_raw_abbe [[-148.71554643]
 [-140.70140285]
 [-134.05517172]
 ...
 [-105.95754117]
 [ -98.85965087]
 [ -91.03214148]]
EXX_alpha_lit [[ 0.        ]
 [ 7.89561293]
 [14.97865375]
 ...
 [66.69558635]
 [76.63763291]
 [88.03189159]]
EXX_alpha_lit_abbe [[ 0.        ]
 [ 7.45190711]
 [13.53585828]
 ...
 [59.53627827]
 [66.08712868]
 [73.36759817]]
EXX_alpha_calc [[  0.        ]
 [ 12.6910062 ]
 [ 24.56981129]
 ...
 [104.02182336]
 [118.62964954]
 [134.68968785]]
EXX_alpha_calc_abbe [[ 0.        ]
 [ 8.01414358]
 [14.66037471]
 ...
 [63.91259746]
 [71.01048777]
 [78.83799716]]
EXX_alpha_lit_comp [[0.        ]
 [0.14832618]
 [0.19971855]
 ...
 [1.57852137]
 [1.24194088]
 [1.52693028]]
EXX_alpha_lit_abbe_comp [[0.        ]
 [0.22317981]
 [0.00624288]
 ...
 [0.76755777]
 [0.10878626]
 [0.2579943 ]]
EXX_alpha_calc_comp [[0.        ]
 [0.14604446]
 [0.19548963]
 ...
 [0.52271098]
 [0.05415144]
 [0.2071612 ]]
EXX_alpha_calc_abbe_comp [[ 0.        ]
 [ 0.22291229]
 [ 0.00574706]
 ...
 [ 0.64376915]
 [-0.03047626]
 [ 0.10325781]]
%% Cell type:markdown id: tags:
### Algorithm Choice
Choosing a suitable algorithm depends on multiple factors.
First off, the task at hand requires "Supervised Learning" - a function should be trained on existing examples f(x)=y to predict y for a given x.
Many supervised learning algorithms can do __CLASSIFICATION__ (Is this an apple or a banana?), e.g. Neural Networks, Naive Bayes, Decision Trees, Support Vector Machines, etc. <br>
But classification is not interesting at all here. The output should be a continuous variable - a task called __REGRESSION.__ <br>
Suitable algorithms here are Neural Networks (again), Support Vector Regression, or multiple others shown later on.
So what's __THE BEST__ algorithm for the task??
I have __NO__ idea. <br>
Also, it depends.
Often with machine learning a general direction can be given, but multiple possible solutions with similar prediction quality exist, so for a first shot, use trial and error with a bit of additional brain power and __A WHOLE LOT OF COMPUTING POWER__ - a quick cross-validated comparison of a few candidate regressors, as sketched below, is a good starting point.
<br>
<br>
<br>
<br>
<br>
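A minimal sketch of that trial-and-error idea: compare a few regressors with 3-fold cross-validation. The subsample step of 50 and the model settings are arbitrary assumptions, chosen only to keep the runtime short.
%% Cell type:code id: tags:
``` python
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

# Small subsample so this runs in seconds, not hours
x_small = x.values[::50]
y_small = y["EXX"].ravel()[::50]

candidates = {
    "LinearRegression": LinearRegression(),
    "KNeighborsRegressor": KNeighborsRegressor(n_neighbors=5),
    "DecisionTreeRegressor": DecisionTreeRegressor(max_depth=10),
}
for name, model in candidates.items():
    scores = cross_val_score(model, x_small, y_small, cv=3)  # default scoring for regressors: R^2
    print(name, "mean R^2:", scores.mean())
```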
%% Cell type:code id: tags:
``` python
y["EXX"]
```
%% Output
array([[  0.        ],
       [ -2.1149607 ],
       [ -5.04326797],
       ...,
       [-11.22441365],
       [-11.02236709],
       [ -9.36810841]])
%% Cell type:markdown id: tags:
### Programming with the algorithms
General direction:
1. Import the model to be used explicitly
    - > from sklearn.neural_network import MLPRegressor
1. Create an instance of that object, give it some settings (more on that later)
    - > regressor_model = MLPRegressor([hidden_layer_sizes=(100, ), activation='relu', solver='adam', alpha=0.0001, batch_size='auto', learning_rate='constant', learning_rate_init=0.001, ...])
    - Mainly the hidden layers & the number of iterations are interesting - but for everything there are useful default values inside *scikit-learn.*
    - The format of the first and the last layer is automatically set by the format of the input & output data
1. Train it using regressor_model.fit(X, y)
<br><br>
1. To predict with the model on new data, now simply use regressor_model.predict(x)
%% Cell type:code id: tags:
``` python
from sklearn.neural_network import MLPRegressor
regressor_model = MLPRegressor(hidden_layer_sizes=(50, ), max_iter=20, verbose=1, alpha=0.001, batch_size='auto', learning_rate='constant', learning_rate_init=0.01)
regressor_model.fit(x, y["EXX"].ravel())
```
%% Output
Iteration 1, loss = 8.38764054
Iteration 2, loss = 3.72069698
Iteration 3, loss = 2.57127449
Iteration 4, loss = 2.22488600
Iteration 5, loss = 2.02794688
Iteration 6, loss = 1.89214184
Iteration 7, loss = 1.69559486
Iteration 8, loss = 1.61075341
Iteration 9, loss = 1.54779916
Iteration 10, loss = 1.40130177
Iteration 11, loss = 1.31125627
Iteration 12, loss = 1.36531385
Iteration 13, loss = 1.31427094
Iteration 14, loss = 1.22763524
Iteration 15, loss = 1.17399099
Iteration 16, loss = 1.22866807
Iteration 17, loss = 1.15295456
Iteration 18, loss = 1.13846861
Iteration 19, loss = 1.10823847
Iteration 20, loss = 1.13801330
C:\Anaconda3\envs\jupyter\lib\site-packages\sklearn\neural_network\_multilayer_perceptron.py:571: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (20) reached and the optimization hasn't converged yet.
  % self.max_iter, ConvergenceWarning)
MLPRegressor(activation='relu', alpha=0.001, batch_size='auto', beta_1=0.9,
             beta_2=0.999, early_stopping=False, epsilon=1e-08,
             hidden_layer_sizes=(50,), learning_rate='constant',
             learning_rate_init=0.01, max_fun=15000, max_iter=20, momentum=0.9,
             n_iter_no_change=10, nesterovs_momentum=True, power_t=0.5,
             random_state=None, shuffle=True, solver='adam', tol=0.0001,
             validation_fraction=0.1, verbose=1, warm_start=False)
%% Cell type:markdown id: tags:
Soooo, what now? What do I do with that thing??
Obviously use it for its intended purpose: Prediction.
Let's try:
%% Cell type:code id: tags:
``` python
print([x.values[12345]])
print("Prediction", regressor_model.predict([x.values[12345]]))
print("Reality:", y["EXX"][12345])
```
%% Output
[array([-0.25      , 24.95746094, 24.72      , 25.77      , 25.52500601,
       24.77252704, 25.16      , 24.66      , 24.36750901, 25.14      ,
       24.85500601])]
Prediction [-11.65489425]
Reality: [-12.17030708]
%% Cell type:markdown id: tags:
An exemplary prediction with an error of around 1µm, I like that :)
If it's not around 1µm, rest assured that it's not due to my lack of subtraction skills, but due to the random initial values for all nodes in the network - every training run starts from different weights and therefore ends at a slightly different model.
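A single sample is a weak check, though. As a quick (and admittedly optimistic, since it reuses the training data) sanity check, the error can also be computed over all rows - a minimal sketch:
%% Cell type:code id: tags:
``` python
from sklearn.metrics import mean_absolute_error, r2_score

# Predict all 156305 rows at once and compare against the measured EXX values
y_pred = regressor_model.predict(x)
print("Mean absolute error [µm]:", mean_absolute_error(y["EXX"].ravel(), y_pred))
print("R^2 score:", r2_score(y["EXX"].ravel(), y_pred))
```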
%% Cell type:code id: tags:
``` python
# As said before, I will test multiple methods against each other
# So the results (like regressor_model) need to be stored in the following dictionary
eval_tools = {
    "KNeighborsRegressor": {},
    "LinearRegression": {},
    "RandomForestClassifier": {},
    "SVM": {},
    "NN": {},
}
print("Trained models will be stored here:", eval_tools)
```
%% Output
Trained models will be stored here: {'KNeighborsRegressor': {}, 'LinearRegression': {}, 'RandomForestClassifier': {}, 'SVM': {}, 'NN': {}}
%% Cell type:markdown id: tags:
<br>
<br>
<br>
<br>
<br>
#### Standard Scaling
The X data might be very different. Here it's temperatures, so values between 15 and 40. But also positions with values between -0.5 & 0. Data could also be in much higher numbers.
Some algorithms do not like that. To make all data look alike, there is the so-called *standard scaler* in sklearn.
$$ z = \frac{x - u}{s} $$
with $z$ as the new, scaled value, $x$ as the old value, $u$ the average over all $x$s and $s$ as the standard deviation over all $x$s.
So afterwards everything is (per definition) in a range around 0, with roughly 95% between -2 and +2 for approximately normal data - see standard deviation.
%% Cell type:code id: tags:
``` python
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
x_scaled = sc.fit_transform(x)
x_scaled
# Wow, now it looks really boring and non-human-readable
```
%% Output
array([[ 1.71505175, -0.66339907, -0.78043436, ..., -0.42618374,
        -0.60411572, -0.56115931],
       [ 1.37205413, -0.64987993, -0.7894403 , ..., -0.42860905,
        -0.60028578, -0.56115931],
       [ 1.02905651, -0.64475477, -0.79285448, ..., -0.42952849,
        -0.59883384, -0.56115931],
       ...,
       [-1.02892923, -1.76580651, -1.64749359, ..., -1.58459194,
        -1.61482099, -1.46201097],
       [-1.37192685, -1.77091687, -1.64415116, ..., -1.58534204,
        -1.61482099, -1.46201097],
       [-1.71492448, -1.79277049, -1.62985781, ..., -1.58854971,
        -1.61482099, -1.46201097]])
%% Cell type:markdown id: tags:
<br>
<br>
<br>
<br>
<br>
#### Using smart Trial & Error - *Hyperparameter Tuning*
There are multiple parameters for the algorithm which can be set beforehand (see above). To trade computing power for brain power, let the computer find the best parameters for the algorithm.
That's what *GridSearchCV* is for.
Instead of training the regressors directly using all the data, GridSearchCV is given multiple values for each parameter of the algorithm and trains the algorithm with all possible combinations of the parameters, but only on a subset of the training data & tests the algorithm on the remaining data. The combination of parameters which is most accurate at predicting wins.
Guess what, that might take a while...
"A while" might be anything between getting a fresh coffee and it being done, and going on holiday & coming back to 1% progress - it obviously depends on the parameter choice.
That's also the reason why it will not be done live here. For this data here, the time range was a short city trip holiday on a high performance computer.
Instead of using the exhaustive approach of GridSearchCV, it's also possible to take $n$ random combinations of parameters.
-> RandomizedSearchCV
__Try it yourself!__ A small RandomizedSearchCV sketch follows below.
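A minimal RandomizedSearchCV sketch - the subsample step of 50, the parameter ranges and the tiny `n_iter` are arbitrary assumptions, chosen purely so it finishes quickly:
%% Cell type:code id: tags:
``` python
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neighbors import KNeighborsRegressor

search = RandomizedSearchCV(
    KNeighborsRegressor(),
    param_distributions={
        "n_neighbors": [2, 5, 10, 20, 50],
        "weights": ["uniform", "distance"],
    },
    n_iter=5,   # try only 5 random parameter combinations instead of all 10
    cv=3,
    verbose=1,
)
search.fit(x_scaled[::50], y["EXX"].ravel()[::50])
print("Best parameters:", search.best_params_)
```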
%% Cell type:code id: tags:
``` python
from sklearn.linear_model import LinearRegression
import pickle
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.svm import SVR
```
%% Cell type:code id: tags:
``` python
print("this code commented out does the heavy lifting. Not doing that today.")
for error_name, y_train in tqdm(y.items()):
    y_train = y_train.ravel()
    n_jobs = 2
    """
    neigh = KNeighborsRegressor(n_jobs=n_jobs)  # n_jobs is the number of processors
    params = {
        "n_neighbors": [5, 10, 20, 50],  # [4, 6, 8, 10, 12,]
        "leaf_size": [200],  # [2, 4, 8, 16, 32],
        "weights": ["distance"],
        "algorithm": ["auto"],
        "n_jobs": [n_jobs],
    }
    # Making models with hyper parameter sets
    # cv (cross validation) groups can be raised...
    neigh1 = GridSearchCV(neigh, param_grid=params, n_jobs=n_jobs, cv=3, verbose=1)
    # Learning
    neigh1.fit(x_scaled, y_train)
    # The best hyper parameter set
    reg = LinearRegression().fit(x_scaled, y_train)
    clfs = RandomForestRegressor(n_jobs=n_jobs, oob_score=True)
    params = {
        'n_estimators': [50, 100, 200],
        'max_depth': [5, 10, 20, 50],
        'n_jobs': [n_jobs]
    }
    clfs = GridSearchCV(clfs, param_grid=params, n_jobs=n_jobs, cv=3, verbose=1)
    clfs.fit(x_scaled, y_train)
    # The best hyper parameter set
    # print("Best Hyper Parameters:\n", clfs.best_params_)
    # y_pred = clfs.predict(X_test)
    # score = mean_absolute_error(y_test, y_pred)
    # print(score)
    svm = GridSearchCV(
        estimator=SVR(kernel='rbf'),
        param_grid={
            'C': [0.1, 1, 10, 100, 1000],
            'epsilon': [0.0001, 0.001, 0.01, 0.1, 1, 10],
            'gamma': [0.0001, 0.001, 0.01, 0.1, 1, 5]
        }, verbose=1, n_jobs=n_jobs)
    svm.fit(x_scaled, y_train)
    mlp = MLPRegressor()
    param_grid = {'hidden_layer_sizes': [(100), (100, 100), (50), (50, 50)],
                  'activation': ['relu'],
                  'solver': ['adam'],
                  'learning_rate': ['adaptive'],
                  'learning_rate_init': [0.01],
                  'power_t': [0.5],
                  'alpha': [0.0001],
                  'max_iter': [200, 1000],
                  'early_stopping': [False],
                  'warm_start': [False]}
    nn = GridSearchCV(mlp, param_grid=param_grid, verbose=True, n_jobs=n_jobs)
    nn.fit(x_scaled, y_train)
    eval_tools["KNeighborsRegressor"][error_name] = neigh1
    eval_tools["LinearRegression"][error_name] = reg
    eval_tools["RandomForestClassifier"][error_name] = clfs
    eval_tools["StandardScaler"] = sc
    eval_tools["NN"][error_name] = nn
    eval_tools["SVM"][error_name] = svm
    for i in eval_tools:
        print(i)
        filename = "ML_eval_tools_" + i + "_" + x_train_name + ".sav"
        pickle.dump(eval_tools[i], open(filename, "wb"))
    """
```
%% Output
100%|██████████████████████████████████████████████████████████████████████████████████████████| 27/27 [00:00<?, ?it/s]
this code commented out does the heavy lifting. Not doing that today.
%% Cell type:markdown id: tags:
Instead, let's just try something out with a subset of the original data and illegally shortened training phases.
%% Cell type:code id: tags:
``` python
eval_tools = {
    "LinearRegression": {},
    "NN": {},
}
for error_name in tqdm(["EXX", "EXX_compRB"]):
    regressor_model = MLPRegressor(hidden_layer_sizes=(50, ), max_iter=20, verbose=1, alpha=0.001, batch_size='auto', learning_rate='constant', learning_rate_init=0.01)
    regressor_model.fit(x_scaled, y[error_name].ravel())
    reg = LinearRegression().fit(x_scaled, y[error_name].ravel())
    eval_tools["StandardScaler"] = sc
    eval_tools["LinearRegression"][error_name] = reg
    eval_tools["NN"][error_name] = regressor_model
for i in eval_tools:
    print(i)
    filename = "ML_eval_tools_" + i + ".sav"
    pickle.dump(eval_tools[i], open(filename, "wb"))
```
%% Output
  0%|                                                                                           | 0/2 [00:00<?, ?it/s]
Iteration 1, loss = 2.37517953
Iteration 2, loss = 0.12905149
Iteration 3, loss = 0.09789837
Iteration 4, loss = 0.06032041
Iteration 5, loss = 0.05504129
Iteration 6, loss = 0.05392060
Iteration 7, loss = 0.05141265
Iteration 8, loss = 0.05067939
Iteration 9, loss = 0.05143948
Iteration 10, loss = 0.04910270
Iteration 11, loss = 0.05031202
Iteration 12, loss = 0.04809143
Iteration 13, loss = 0.04870659
Iteration 14, loss = 0.04770111
Iteration 15, loss = 0.04766778
Iteration 16, loss = 0.04743532
Iteration 17, loss = 0.04751504
Iteration 18, loss = 0.04508614
Iteration 19, loss = 0.04696037
C:\Anaconda3\lib\site-packages\sklearn\neural_network\multilayer_perceptron.py:566: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (20) reached and the optimization hasn't converged yet.
  % self.max_iter, ConvergenceWarning)
 50%|██████████████████████████████████████████                                          | 1/2 [00:09<00:09,  9.79s/it]
Iteration 20, loss = 0.04541264
Iteration 1, loss = 0.08507141
Iteration 2, loss = 0.04744902
Iteration 3, loss = 0.04485742
Iteration 4, loss = 0.04368564
Iteration 5, loss = 0.04222761
Iteration 6, loss = 0.04236689
Iteration 7, loss = 0.04155312
Iteration 8, loss = 0.04141260
Iteration 9, loss = 0.04099000
Iteration 10, loss = 0.04064505
Iteration 11, loss = 0.04033634
Iteration 12, loss = 0.04061129
Iteration 13, loss = 0.03987049
Iteration 14, loss = 0.04025951
Iteration 15, loss = 0.03992405
Iteration 16, loss = 0.04006715
Iteration 17, loss = 0.03967800
Iteration 18, loss = 0.03929736
Iteration 19, loss = 0.03960700
C:\Anaconda3\lib\site-packages\sklearn\neural_network\multilayer_perceptron.py:566: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (20) reached and the optimization hasn't converged yet.
  % self.max_iter, ConvergenceWarning)
100%|████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:19<00:00,  9.96s/it]
Iteration 20, loss = 0.03957958
LinearRegression
NN
StandardScaler
%% Cell type:markdown id: tags:
## Validation Data
So the neural network can, as shown, reproduce the EXX error now. But the common (and smart) way of validating machine learning algorithms is a split into three parts:
- Training dataset (biggest part)
    - Data which the algorithm is trained with, hence the name
- Test set (small part)
    - This dataset is used to check after each iteration of training how well the trained model performs
- Validation dataset (small part)
    - This subset is kept back until the training is finished.
    - For the purpose described above, it is crucial that the algorithm of choice also performs well in a temperature range which was not included in the training dataset - the correction model for thermal drift of the machine tool should not produce garbage in new temperature states, e.g. on a hot summer day.
<br>The chosen model should be able to extrapolate - many machine learning models lack in that regard. This can be tested with the validation dataset, e.g. with a split like the sketch below.
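A minimal sketch of such a three-way split with scikit-learn - the 70/15/15 ratio is an arbitrary assumption, and for the extrapolation test described above one would rather split by temperature range than randomly:
%% Cell type:code id: tags:
``` python
from sklearn.model_selection import train_test_split

y_exx = y["EXX"].ravel()
# First split off 30%, then halve that 30% into test & validation
x_train, x_rest, y_train, y_rest = train_test_split(x_scaled, y_exx, test_size=0.3)
x_test, x_val, y_test, y_val = train_test_split(x_rest, y_rest, test_size=0.5)
print("Train:", x_train.shape, "Test:", x_test.shape, "Validation:", x_val.shape)
```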
%% Cell type:markdown id: tags:
## White Box Knowledge
So, machine learning is nice in the regard that no prior knowledge about the predicted system is necessary.
__But I HAVE SOME KNOWLEDGE, LET ME USE THAT!__
Also, using some existing knowledge can be quite beneficial.
- Machine learning creates a black box, which spits out some result - I do not know what it did to get there (yes, for the more simple algorithms I could find out)
- I could create some kind of hybrid model and mix my own knowledge (white box) and the machine learning black box. Black and white mixed is grey, thus grey box. <br>
Some approaches might not even work without additional white box modelling, see below.
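A minimal sketch of that grey box idea: let a simple physical formula explain the bulk of the drift and train the ML part only on the residual. The linear expansion term, the expansion coefficient, the reference length, the 20 °C reference and the choice of the "A Z-Axis" sensor are all stand-in assumptions here, not the actual machine model.
%% Cell type:code id: tags:
``` python
import numpy as np
from sklearn.linear_model import LinearRegression

# White box part: linear thermal expansion delta_L = alpha * L * delta_T (assumed values!)
alpha = 11.8e-6  # thermal expansion coefficient of steel [1/K], literature value
L = 0.5e6        # reference length in µm (500 mm)
delta_T = x["A Z-Axis"].values - 20.0  # temperature rise over an assumed 20 °C reference
white_box = alpha * L * delta_T        # physically motivated drift estimate in µm

# Black box part: learn only what the physical formula cannot explain
residual = y["EXX"].ravel() - white_box
grey_rest = LinearRegression().fit(x_scaled, residual)

# Grey box prediction = white box + black box
prediction = white_box + grey_rest.predict(x_scaled)
print("Mean absolute grey box error [µm]:", np.mean(np.abs(y["EXX"].ravel() - prediction)))
```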
%% Cell type:markdown id: tags:
## Result Discussion
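To discuss results, the pickled models from above can be loaded back and scored, e.g. with $R^2$ - a minimal sketch, assuming the .sav files written by the cell above exist next to the notebook (and, again optimistically, scoring on the training data):
%% Cell type:code id: tags:
``` python
import pickle
from sklearn.metrics import r2_score

for name in ["LinearRegression", "NN"]:
    # Each file holds a dict mapping the error name to its trained model
    models = pickle.load(open("ML_eval_tools_" + name + ".sav", "rb"))
    for error_name, model in models.items():
        score = r2_score(y[error_name].ravel(), model.predict(x_scaled))
        print(name, error_name, "R^2:", round(score, 4))
```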
%% Cell type:code id: tags:
``` python
```
%% Cell type:code id: tags:
``` python
```