{ "cells": [ { "cell_type": "markdown", "id": "metric-heath", "metadata": {}, "source": [ "## Deep learning example from Accounting/Finance" ] }, { "cell_type": "markdown", "id": "operating-assignment", "metadata": {}, "source": [ "The following is a simple example of deep learning that uses accounting/finance data. It demonstrates how to apply a deep learning model to traditional structured data. However, it also shows that deep learning is usually not the best option for structured data with relatively small datasets (<100k observations). Deep learning models tend to perform better with large unstructured datasets." ] }, { "cell_type": "code", "execution_count": 1, "id": "twelve-oregon", "metadata": {}, "outputs": [], "source": [ "import tensorflow as tf\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "import numpy as np" ] }, { "cell_type": "markdown", "id": "concerned-baking", "metadata": {}, "source": [ "The data has a little under 20k observations. The variables are financial ratios and board characteristics of S&P 1500 companies." ] }, { "cell_type": "code", "execution_count": 2, "id": "existing-ancient", "metadata": {}, "outputs": [], "source": [ "compu_df = pd.read_csv('_data.txt', delimiter='\\t')" ] }, { "cell_type": "code", "execution_count": 3, "id": "polar-active", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
| | GVKEY | datadate | fyear | cusip | conm | act | at | bkvlps | capx | ceq | ... | ind_chairman_is_ex_ceo | ind_independent_board_members | ind_strictly_independent_board_members | ind_board_member_affiliations | ind_non_executive_board_members | ind_board_gender_diversity_percent | ind_board_specific_skills_percent | ind_executive_members_gender_diversity_percent | ind_average_board_tenure | ind_board_member_compensation |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 21542 | 20081231 | 2008 | 000360206 | AAON INC | 80.118 | 140.743 | 5.6088 | 9.610 | 96.522 | ... | 0.0 | 87.500 | 57.140 | 1.425 | 90.91 | 13.395 | 50.61 | 7.735 | 8.270 | 1650394.0 |
| 1 | 21542 | 20091231 | 2009 | 000360206 | AAON INC | 96.240 | 156.211 | 6.8544 | 9.774 | 117.999 | ... | 0.0 | 87.500 | 50.000 | 1.180 | 90.00 | 12.915 | 60.00 | 6.670 | 8.705 | 1590889.5 |
| 2 | 21542 | 20101231 | 2010 | 000360206 | AAON INC | 91.748 | 160.277 | 7.0725 | 17.470 | 116.739 | ... | 0.0 | 84.620 | 51.925 | 1.000 | 90.00 | 11.110 | 58.33 | 9.090 | 8.780 | 1801674.0 |
| 3 | 21542 | 20111231 | 2011 | 000360206 | AAON INC | 84.387 | 178.981 | 4.9762 | 35.914 | 122.504 | ... | 0.0 | 86.670 | 50.000 | 1.090 | 90.00 | 11.110 | 57.14 | 10.000 | 9.180 | 1847006.5 |
| 4 | 21542 | 20121231 | 2012 | 000360206 | AAON INC | 91.546 | 193.493 | 5.6341 | 14.147 | 138.136 | ... | 0.0 | 87.500 | 50.000 | 1.190 | 90.00 | 14.290 | 54.55 | 9.090 | 9.170 | 1810953.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 19632 | 28191 | 20151231 | 2015 | V7780T103 | ROYAL CARIBBEAN GROUP | 837.022 | 20921.855 | 36.9876 | 1613.340 | 8063.039 | ... | 1.0 | 84.620 | 50.000 | 0.905 | 83.33 | 16.670 | 53.85 | 12.500 | 8.880 | 1744895.0 |
| 19633 | 28191 | 20161231 | 2016 | V7780T103 | ROYAL CARIBBEAN GROUP | 748.305 | 22310.324 | 42.5054 | 2494.363 | 9121.412 | ... | 1.0 | 83.330 | 50.000 | 0.880 | 83.33 | 16.670 | 53.85 | 14.290 | 9.340 | 1737800.0 |
| 19634 | 28191 | 20171231 | 2017 | V7780T103 | ROYAL CARIBBEAN GROUP | 843.028 | 22296.317 | 50.1659 | 564.138 | 10702.303 | ... | 1.0 | 83.330 | 50.000 | 0.885 | 83.33 | 20.000 | 57.14 | 13.395 | 9.105 | 1793588.5 |
| 19635 | 28191 | 20181231 | 2018 | V7780T103 | ROYAL CARIBBEAN GROUP | 1242.044 | 27698.270 | 53.1319 | 3660.028 | 11105.461 | ... | 1.0 | 85.710 | 50.000 | 0.745 | 81.82 | 22.220 | 58.33 | 14.290 | 9.180 | 1858984.0 |
| 19636 | 28191 | 20191231 | 2019 | V7780T103 | ROYAL CARIBBEAN GROUP | 1162.628 | 30320.284 | 58.2557 | 3024.663 | 12163.846 | ... | 1.0 | 85.165 | 48.075 | 0.800 | 80.00 | 25.000 | 60.00 | 15.190 | 9.195 | 1884002.5 |

19637 rows × 102 columns

| | at | bkvlps | capx | ceq | csho | cstk | dlc | dltt | dvc | ebit | ... | age | tridx | mb | cap_int | lvg | roa | roe | roi | fyear | current_roa |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1 | 140.743 | 5.6088 | 9.610 | 96.522 | 17.209 | 0.071 | 2.992 | 0.000 | 5.621 | 43.388 | ... | 204.0 | 1.070373 | 3.722721 | 0.068280 | 0.030998 | 0.203129 | 0.079563 | 0.296192 | 2009.0 | 0.177459 |
| 2 | 156.211 | 6.8544 | 9.774 | 117.999 | 17.215 | 0.071 | 0.076 | 0.000 | 6.201 | 43.754 | ... | 216.0 | 1.017457 | 2.843429 | 0.062569 | 0.000644 | 0.177459 | 0.082621 | 0.234926 | 2010.0 | 0.136601 |
| 3 | 160.277 | 7.0725 | 17.470 | 116.739 | 16.506 | 0.068 | 0.000 | 0.000 | 6.067 | 32.715 | ... | 228.0 | 1.494462 | 3.988689 | 0.108999 | 0.000000 | 0.136601 | 0.047020 | 0.187547 | 2011.0 | 0.078142 |
| 4 | 178.981 | 4.9762 | 35.914 | 122.504 | 24.618 | 0.098 | 4.575 | 0.000 | 5.935 | 23.971 | ... | 240.0 | 1.646141 | 4.117600 | 0.200658 | 0.037346 | 0.078142 | 0.027727 | 0.114168 | 2012.0 | 0.141860 |
| 5 | 193.493 | 5.6341 | 14.147 | 138.136 | 24.518 | 0.098 | 0.000 | 0.000 | 8.840 | 44.238 | ... | 252.0 | 1.706581 | 3.704230 | 0.073114 | 0.000000 | 0.141860 | 0.053644 | 0.198710 | 2013.0 | 0.174277 |
| 6 | 215.444 | 4.4702 | 9.041 | 164.106 | 36.711 | 0.147 | 0.000 | 0.000 | 7.428 | 55.803 | ... | 264.0 | 3.949485 | 7.147331 | 0.041965 | 0.000000 | 0.174277 | 0.032012 | 0.228797 | 2014.0 | 0.189424 |
| 7 | 233.117 | 3.2208 | 16.127 | 174.059 | 54.042 | 0.216 | 0.000 | 0.000 | 9.656 | 71.563 | ... | 276.0 | 4.185799 | 6.951689 | 0.069180 | 0.000000 | 0.189424 | 0.036494 | 0.253696 | 2015.0 | 0.196381 |
| 8 | 232.854 | 3.3750 | 20.967 | 178.918 | 53.012 | 0.212 | 0.000 | 0.000 | 11.857 | 71.695 | ... | 288.0 | 4.381591 | 6.880000 | 0.090044 | 0.000000 | 0.196381 | 0.037149 | 0.255581 | 2016.0 | 0.208069 |
| 9 | 256.530 | 3.9106 | 26.604 | 205.898 | 52.651 | 0.211 | 0.000 | 0.000 | 12.676 | 79.574 | ... | 300.0 | 6.286182 | 8.451389 | 0.103707 | 0.000000 | 0.208069 | 0.030674 | 0.259235 | 2017.0 | 0.183631 |
| 10 | 296.780 | 4.5252 | 41.713 | 237.226 | 52.423 | 0.210 | 0.000 | 0.000 | 13.653 | 74.148 | ... | 312.0 | 7.030023 | 8.110139 | 0.140552 | 0.000000 | 0.183631 | 0.028326 | 0.229730 | 2018.0 | 0.138132 |
| 11 | 308.197 | 4.7604 | 37.268 | 247.499 | 51.991 | 0.208 | 0.000 | 0.000 | 16.717 | 57.678 | ... | 324.0 | 6.776646 | 7.364927 | 0.120923 | 0.000000 | 0.138132 | 0.023355 | 0.172009 | 2019.0 | 0.144608 |
| 12 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 13 | 1377.511 | 16.8937 | 27.535 | 656.895 | 38.884 | 44.201 | 63.600 | 392.984 | 0.000 | 125.529 | ... | 444.0 | 0.386537 | 1.089755 | 0.019989 | 0.695064 | 0.057096 | 0.109870 | 0.074914 | 2009.0 | 0.029731 |
| 14 | 1501.042 | 18.9167 | 28.855 | 746.906 | 39.484 | 44.870 | 100.833 | 336.191 | 0.000 | 95.415 | ... | 456.0 | 0.518011 | 1.214800 | 0.019223 | 0.585112 | 0.029731 | 0.049185 | 0.041225 | 2010.0 | 0.040984 |
| 15 | 1703.727 | 21.0112 | 124.879 | 835.845 | 39.781 | 44.986 | 114.075 | 329.802 | 2.983 | 137.016 | ... | 468.0 | 0.695924 | 1.307398 | 0.073298 | 0.531052 | 0.040984 | 0.063897 | 0.059932 | 2011.0 | 0.030844 |
| 16 | 2195.653 | 21.4697 | 91.218 | 864.649 | 40.273 | 44.849 | 122.865 | 669.489 | 12.081 | 142.360 | ... | 480.0 | 0.322468 | 0.892886 | 0.041545 | 0.916388 | 0.030844 | 0.087720 | 0.044105 | 2012.0 | 0.025738 |
| 17 | 2136.900 | 23.3254 | 37.600 | 918.600 | 39.382 | 44.700 | 86.400 | 622.200 | 11.900 | 136.600 | ... | 492.0 | 0.546782 | 0.800844 | 0.017596 | 0.771391 | 0.025738 | 0.074763 | 0.035675 | 2013.0 | 0.033144 |
| 18 | 2199.500 | 25.2654 | 26.500 | 999.500 | 39.560 | 44.700 | 69.700 | 564.300 | 11.800 | 142.600 | ... | 504.0 | 0.669914 | 1.108631 | 0.012048 | 0.634317 | 0.033144 | 0.065790 | 0.046581 | 2014.0 | 0.006733 |
| 19 | 1515.000 | 23.8574 | 46.300 | 845.100 | 35.423 | 44.900 | 69.000 | 85.000 | 11.900 | -8.600 | ... | 516.0 | 0.823134 | 1.164419 | 0.030561 | 0.182227 | 0.006733 | 0.010365 | 0.010967 | 2015.0 | 0.033077 |
| 20 | 1442.100 | 25.0847 | 88.400 | 865.800 | 34.515 | 44.900 | 12.000 | 136.100 | 10.400 | 66.100 | ... | 528.0 | 0.688919 | 1.048049 | 0.061299 | 0.171056 | 0.033077 | 0.052568 | 0.047610 | 2016.0 | 0.037564 |
| 21 | 1504.100 | 26.6112 | 33.600 | 914.200 | 34.354 | 45.200 | 2.000 | 155.300 | 10.200 | 77.200 | ... | 540.0 | 0.995869 | 1.241958 | 0.022339 | 0.172063 | 0.037564 | 0.049762 | 0.052828 | 2017.0 | 0.010232 |
| 22 | 1524.700 | 26.9703 | 22.000 | 936.300 | 34.716 | 45.300 | 0.000 | 177.200 | 10.300 | 86.000 | ... | 552.0 | 1.283369 | 1.456788 | 0.014429 | 0.189256 | 0.010232 | 0.011437 | 0.014010 | 2018.0 | 0.004943 |
| 23 | 1517.200 | 26.0406 | 17.400 | 905.900 | 34.788 | 45.300 | 0.000 | 141.700 | 10.500 | 110.700 | ... | 564.0 | 0.870483 | 1.433915 | 0.011468 | 0.156419 | 0.004943 | 0.005774 | 0.007159 | 2019.0 | 0.002116 |
| 24 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 25 | 1549.913 | 12.6376 | 34.063 | 644.051 | 50.963 | 0.581 | 0.000 | 230.000 | 25.271 | 114.309 | ... | 468.0 | 0.820994 | 1.507406 | 0.021977 | 0.357115 | 0.029314 | 0.046798 | 0.051981 | 2009.0 | 0.035692 |
| 26 | 1521.153 | 13.2923 | 18.582 | 687.050 | 51.688 | 0.517 | 0.000 | 172.500 | 26.727 | 82.506 | ... | 480.0 | 0.973093 | 1.554283 | 0.012216 | 0.251073 | 0.035692 | 0.050842 | 0.063164 | 2010.0 | 0.041404 |
| 27 | 1548.670 | 14.0406 | 23.942 | 739.025 | 52.635 | 0.526 | 0.000 | 140.500 | 28.152 | 118.882 | ... | 492.0 | 1.198713 | 1.873139 | 0.015460 | 0.190115 | 0.041404 | 0.046320 | 0.072904 | 2011.0 | 0.036446 |
| 28 | 1879.598 | 14.9230 | 22.124 | 795.886 | 53.333 | 0.533 | 0.000 | 300.000 | 29.744 | 124.148 | ... | 504.0 | 1.101268 | 1.381760 | 0.011771 | 0.376938 | 0.036446 | 0.062292 | 0.062510 | 2012.0 | 0.033480 |
| 29 | 1869.251 | 15.6340 | 28.052 | 850.398 | 54.394 | 0.544 | 0.000 | 215.000 | 31.309 | 111.711 | ... | 516.0 | 1.064461 | 1.276065 | 0.015007 | 0.252823 | 0.033480 | 0.057671 | 0.058740 | 2013.0 | 0.034399 |

30 rows × 37 columns
\n", "