From 88d51e43b2bfdf764384142f892a88929d877692 Mon Sep 17 00:00:00 2001 From: Jan Date: Thu, 7 Jul 2022 00:00:10 -0600 Subject: [PATCH] cryptic graph execution error --- rnn_tutorial_jm_output.ipynb | 1981 ++++++++++++++++++++---------------------- 1 file changed, 932 insertions(+), 1049 deletions(-) rewrite rnn_tutorial_jm_output.ipynb (80%) diff --git a/rnn_tutorial_jm_output.ipynb b/rnn_tutorial_jm_output.ipynb dissimilarity index 80% index 833a8c9..957f102 100644 --- a/rnn_tutorial_jm_output.ipynb +++ b/rnn_tutorial_jm_output.ipynb @@ -1,1049 +1,932 @@ -{ - "nbformat": 4, - "nbformat_minor": 0, - "metadata": { - "colab": { - "name": "rnn_tutorial_jm.ipynb", - "provenance": [], - "collapsed_sections": [] - }, - "kernelspec": { - "name": "python3", - "display_name": "Python 3" - }, - "language_info": { - "name": "python" - } - }, - "cells": [ - { - "cell_type": "markdown", - "metadata": { - "id": "hEhvqXQBNN6r" - }, - "source": [ - "# Understanding Simple Recurrent Neural Networks In Keras\n", - "From https://machinelearningmastery.com/understanding-simple-recurrent-neural-networks-in-keras/ \n", - "With changes by Jan Mandel\n", - "2021-12-26: small changes for clarity\n", - "2022-06-16: added same by functional interface" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "pINycjukQijP" - }, - "source": [ - "This tutorial is designed for anyone looking for an understanding of how recurrent neural networks (RNN) work and how to use them via the Keras deep learning library. While all the methods required for solving problems and building applications are provided by the Keras library, it is also important to gain an insight on how everything works. In this article, the computations taking place in the RNN model are shown step by step. 
Next, a complete end to end system for time series prediction is developed.\n", - "\n", - "After completing this tutorial, you will know:\n", - "\n", - "* The structure of RNN\n", - "* How RNN computes the output when given an input\n", - "* How to prepare data for a SimpleRNN in Keras\n", - "* How to train a SimpleRNN model\n", - "\n", - "Let’s get started." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "3D7sqhF-QtJ4" - }, - "source": [ - "## Tutorial Overview\n", - "\n", - "This tutorial is divided into two parts; they are:\n", - "\n", - "1. The structure of the RNN\n", - " 1. Different weights and biases associated with different layers of the RNN.\n", - " 2. How computations are performed to compute the output when given an input.\n", - "2. A complete application for time series prediction." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "jU0opwkmQ9wm" - }, - "source": [ - "## Prerequisites\n", - "\n", - "It is assumed that you have a basic understanding of RNNs before you start implementing them. An [Introduction To Recurrent Neural Networks And The Math That Powers Them](https://machinelearningmastery.com/an-introduction-to-recurrent-neural-networks-and-the-math-that-powers-them) gives you a quick overview of RNNs.\n", - "\n", - "Let’s now get right down to the implementation part." 
- ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "-UcYCr1FMaE7" - }, - "source": [ - "## Import section" - ] - }, - { - "cell_type": "code", - "metadata": { - "id": "FimVgAB4LsI9" - }, - "source": [ - "from pandas import read_csv\n", - "import numpy as np\n", - "from keras.models import Sequential\n", - "from keras.layers import Dense, SimpleRNN\n", - "from sklearn.preprocessing import MinMaxScaler\n", - "from sklearn.metrics import mean_squared_error\n", - "import math\n", - "import matplotlib.pyplot as plt\n", - "import tensorflow as tf" - ], - "execution_count": 1, - "outputs": [] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "O-okDDv9Md4Z" - }, - "source": [ - "## Keras SimpleRNN" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "W-WQZp0qRta0" - }, - "source": [ - "The function below returns a model that includes a SimpleRNN layer and a Dense layer for learning sequential data. The input_shape specifies the parameter (time_steps x features). We’ll simplify everything and use univariate data, i.e., one feature only; the time_steps are discussed below." 
- ] - }, - { - "cell_type": "code", - "source": [ - "api_type = 2 # 1 = sequential, 2 = functional" - ], - "metadata": { - "id": "Qf35beQp1Ewv" - }, - "execution_count": 2, - "outputs": [] - }, - { - "cell_type": "code", - "metadata": { - "id": "Aoh0LWQ7Ltdk" - }, - "source": [ - "def create_RNN_sequential(hidden_units, dense_units, input_shape, activation):\n", - " model = Sequential()\n", - " model.add(SimpleRNN(hidden_units, input_shape=input_shape, \n", - " activation=activation[0]))\n", - " model.add(Dense(units=dense_units, activation=activation[1]))\n", - " model.compile(loss='mean_squared_error', optimizer='adam')\n", - " return model" - ], - "execution_count": 3, - "outputs": [] - }, - { - "cell_type": "code", - "source": [ - "def create_RNN_functional(hidden_units, dense_units, input_shape=None, activation=None,\n", - " return_sequences=False,stateful=False,batch_shape=None):\n", - " inputs = tf.keras.Input(shape=input_shape)\n", - " if stateful:\n", - " x = tf.keras.layers.SimpleRNN(hidden_units, batch_shape=batch_shape,\n", - " return_sequences=return_sequences,\n", - " stateful=true,\n", - " activation=activation[0])(inputs)\n", - " else:\n", - " x = tf.keras.layers.SimpleRNN(hidden_units, input_shape=input_shape,\n", - " return_sequences=return_sequences,\n", - " activation=activation[0])(inputs)\n", - " outputs = tf.keras.layers.Dense(dense_units, activation=activation[1])(x)\n", - " model = tf.keras.Model(inputs=inputs, outputs=outputs)\n", - " model.compile(loss='mean_squared_error', optimizer='adam')\n", - " return model" - ], - "metadata": { - "id": "zBQxW0h8pvGb" - }, - "execution_count": 4, - "outputs": [] - }, - { - "cell_type": "code", - "source": [ - "def create_RNN(hidden_units, dense_units, input_shape, activation):\n", - " if api_type==1:\n", - " print('using sequential api')\n", - " return create_RNN_sequential(hidden_units, dense_units, input_shape, activation)\n", - " if api_type==2:\n", - " print('using functional api')\n", - " return 
create_RNN_functional(hidden_units, dense_units, input_shape, activation)\n", - " print('api_type must be 1 or 2, got ',api_type)\n", - " raise(ValueError)\n", - "\n", - "demo_model = create_RNN(hidden_units=2, dense_units=1, input_shape=(3,1), \n", - " activation=['linear', 'linear'])" - ], - "metadata": { - "id": "dYLai9HypVRo", - "colab": { - "base_uri": "https://localhost:8080/" - }, - "outputId": "48aa6528-3189-45d0-f808-17bba44233a0" - }, - "execution_count": 5, - "outputs": [ - { - "output_type": "stream", - "name": "stdout", - "text": [ - "using functional api\n" - ] - } - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "_CZtOyDPSuMy" - }, - "source": [ - "The object demo_model is returned with 2 hidden units created via a the SimpleRNN layer and 1 dense unit created via the Dense layer. The input_shape is set at 3×1 and a linear activation function is used in both layers for simplicity. Just to recall the linear activation function f(x)=x makes no change in the input." 
- ] - }, - { - "cell_type": "code", - "source": [ - "print(dir(demo_model))\n", - "# help(demo_model)\n", - "help(demo_model.get_weights)" - ], - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/" - }, - "id": "TgDQmxwQpUtw", - "outputId": "b8666c9a-8d1d-45b7-d166-f1ae8071232f" - }, - "execution_count": 6, - "outputs": [ - { - "output_type": "stream", - "name": "stdout", - "text": [ - "['_SCALAR_UPRANKING_ON', '_TF_MODULE_IGNORED_PROPERTIES', '__call__', '__class__', '__copy__', '__deepcopy__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_activity_regularizer', '_add_trackable', '_add_trackable_child', '_add_variable_with_custom_getter', '_assert_compile_was_called', '_assert_weights_created', '_auto_track_sub_layers', '_autocast', '_autographed_call', '_base_model_initialized', '_build_input_shape', '_call_accepts_kwargs', '_call_arg_was_passed', '_call_fn_arg_defaults', '_call_fn_arg_positions', '_call_fn_args', '_call_full_argspec', '_callable_losses', '_cast_single_input', '_check_call_args', '_checkpoint_dependencies', '_clear_losses', '_cluster_coordinator', '_compile_was_called', '_compiled_trainable_state', '_compute_dtype', '_compute_dtype_object', '_compute_output_and_mask_jointly', '_compute_tensor_usage_count', '_configure_steps_per_execution', '_conform_to_reference_input', '_dedup_weights', '_default_training_arg', '_deferred_dependencies', '_delete_tracking', '_deserialization_dependencies', '_deserialize_from_proto', '_distribution_strategy', '_dtype', '_dtype_policy', '_dynamic', '_eager_losses', '_enable_dict_to_input_mapping', '_expects_mask_arg', '_expects_training_arg', '_feed_input_names', '_feed_input_shapes', 
'_feed_inputs', '_flatten', '_flatten_layers', '_flatten_modules', '_flatten_to_reference_inputs', '_functional_construction_call', '_gather_children_attribute', '_gather_saveables_for_checkpoint', '_get_call_arg_value', '_get_callback_model', '_get_cell_name', '_get_compile_args', '_get_existing_metric', '_get_input_masks', '_get_legacy_saved_model_children', '_get_node_attribute_at_index', '_get_optimizer', '_get_save_spec', '_get_trainable_state', '_graph_network_add_loss', '_graph_network_add_metric', '_handle_activity_regularization', '_handle_deferred_dependencies', '_handle_deferred_layer_dependencies', '_handle_weight_regularization', '_in_multi_worker_mode', '_inbound_nodes', '_inbound_nodes_value', '_infer_output_signature', '_init_batch_counters', '_init_call_fn_args', '_init_graph_network', '_init_set_name', '_initial_weights', '_input_coordinates', '_input_layers', '_input_spec', '_insert_layers', '_instrument_layer_creation', '_instrumented_keras_api', '_instrumented_keras_layer_class', '_instrumented_keras_model_class', '_is_compiled', '_is_graph_network', '_is_layer', '_is_model_for_instrumentation', '_jit_compile', '_keras_api_names', '_keras_api_names_v1', '_keras_tensor_symbolic_call', '_layer_call_argspecs', '_layer_checkpoint_dependencies', '_list_extra_dependencies_for_serialization', '_list_functions_for_serialization', '_lookup_dependency', '_losses', '_map_resources', '_maybe_build', '_maybe_cast_inputs', '_maybe_create_attribute', '_maybe_initialize_trackable', '_maybe_load_initial_epoch_from_ckpt', '_metrics', '_metrics_lock', '_must_restore_from_config', '_name', '_name_based_attribute_restore', '_name_based_restores', '_name_scope', '_nested_inputs', '_nested_outputs', '_network_nodes', '_no_dependency', '_nodes_by_depth', '_non_trainable_weights', '_obj_reference_counts', '_obj_reference_counts_dict', '_object_identifier', '_outbound_nodes', '_outbound_nodes_value', '_outer_name_scope', '_output_coordinates', '_output_layers', 
'_output_mask_cache', '_output_shape_cache', '_output_tensor_cache', '_predict_counter', '_preload_simple_restoration', '_preserve_input_structure_in_config', '_reset_compile_cache', '_restore_from_checkpoint_position', '_run_eagerly', '_run_internal_graph', '_saved_model_arg_spec', '_saved_model_inputs_spec', '_self_name_based_restores', '_self_saveable_object_factories', '_self_setattr_tracking', '_self_tracked_trackables', '_self_unconditional_checkpoint_dependencies', '_self_unconditional_deferred_dependencies', '_self_unconditional_dependency_names', '_self_update_uid', '_serialize_to_proto', '_set_call_arg_value', '_set_connectivity_metadata', '_set_dtype_policy', '_set_inputs', '_set_mask_keras_history_checked', '_set_mask_metadata', '_set_output_names', '_set_save_spec', '_set_trainable_state', '_set_training_mode', '_setattr_tracking', '_should_cast_single_input', '_should_compute_mask', '_should_eval', '_single_restoration_from_checkpoint_position', '_split_out_first_arg', '_stateful', '_steps_per_execution', '_supports_masking', '_symbolic_call', '_tensor_usage_count', '_test_counter', '_tf_api_names', '_tf_api_names_v1', '_thread_local', '_track_trackable', '_trackable_children', '_trackable_saved_model_saver', '_trackable_saver', '_tracking_metadata', '_train_counter', '_trainable', '_trainable_weights', '_training_state', '_unconditional_checkpoint_dependencies', '_unconditional_dependency_names', '_undeduplicated_weights', '_update_uid', '_updated_config', '_updates', '_use_input_spec_as_call_signature', '_validate_compile', '_validate_graph_inputs_and_outputs', '_validate_target_and_loss', 'activity_regularizer', 'add_loss', 'add_metric', 'add_update', 'add_variable', 'add_weight', 'apply', 'build', 'built', 'call', 'compile', 'compiled_loss', 'compiled_metrics', 'compute_dtype', 'compute_loss', 'compute_mask', 'compute_metrics', 'compute_output_shape', 'compute_output_signature', 'count_params', 'distribute_strategy', 'dtype', 'dtype_policy', 
'dynamic', 'evaluate', 'evaluate_generator', 'finalize_state', 'fit', 'fit_generator', 'from_config', 'get_config', 'get_input_at', 'get_input_mask_at', 'get_input_shape_at', 'get_layer', 'get_losses_for', 'get_output_at', 'get_output_mask_at', 'get_output_shape_at', 'get_updates_for', 'get_weights', 'history', 'inbound_nodes', 'input', 'input_mask', 'input_names', 'input_shape', 'input_spec', 'inputs', 'layers', 'load_weights', 'loss', 'losses', 'make_predict_function', 'make_test_function', 'make_train_function', 'metrics', 'metrics_names', 'name', 'name_scope', 'non_trainable_variables', 'non_trainable_weights', 'optimizer', 'outbound_nodes', 'output', 'output_mask', 'output_names', 'output_shape', 'outputs', 'predict', 'predict_function', 'predict_generator', 'predict_on_batch', 'predict_step', 'reset_metrics', 'reset_states', 'run_eagerly', 'save', 'save_spec', 'save_weights', 'set_weights', 'state_updates', 'stateful', 'stop_training', 'submodules', 'summary', 'supports_masking', 'test_function', 'test_on_batch', 'test_step', 'to_json', 'to_yaml', 'train_function', 'train_on_batch', 'train_step', 'train_tf_function', 'trainable', 'trainable_variables', 'trainable_weights', 'updates', 'variable_dtype', 'variables', 'weights', 'with_name_scope']\n", - "Help on method get_weights in module keras.engine.training:\n", - "\n", - "get_weights() method of keras.engine.functional.Functional instance\n", - " Retrieves the weights of the model.\n", - " \n", - " Returns:\n", - " A flat list of Numpy arrays.\n", - "\n" - ] - } - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "BI0vVCjQTBPF" - }, - "source": [ - "Look at the model following https://machinelearningmastery.com/visualize-deep-learning-neural-network-model-keras/ :" - ] - }, - { - "cell_type": "code", - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/", - "height": 598 - }, - "id": "Rdz2nblaTMV5", - "outputId": "9afa6abe-c1e0-47cd-9c93-0b5ad1943e1e" - }, - "source": [ - 
"print(demo_model.summary())\n", - "from keras.utils.vis_utils import plot_model\n", - "plot_model(demo_model, to_file='model_plot.png', \n", - " show_shapes=True, show_layer_names=True)" - ], - "execution_count": 7, - "outputs": [ - { - "output_type": "stream", - "name": "stdout", - "text": [ - "Model: \"model\"\n", - "_________________________________________________________________\n", - " Layer (type) Output Shape Param # \n", - "=================================================================\n", - " input_1 (InputLayer) [(None, 3, 1)] 0 \n", - " \n", - " simple_rnn (SimpleRNN) (None, 2) 8 \n", - " \n", - " dense (Dense) (None, 1) 3 \n", - " \n", - "=================================================================\n", - "Total params: 11\n", - "Trainable params: 11\n", - "Non-trainable params: 0\n", - "_________________________________________________________________\n", - "None\n" - ] - }, - { - "output_type": "execute_result", - "data": { - "text/plain": [ - "" - ], - "image/png": "\n" - }, - "metadata": {}, - "execution_count": 7 - } - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "6KMS82uNTZz4" - }, - "source": [ - "If we have m\n", - " hidden units (\n", - "m\n", - "=\n", - "2\n", - " in the above case), then:\n", - "* Input: \n", - "x\n", - "∈\n", - "R\n", - "* Hidden unit: \n", - "h\n", - "∈\n", - "Rm\n", - "* Weights for input units: \n", - "wx\n", - "∈\n", - " Rm\n", - "* Weights for hidden units: \n", - "wh\n", - "∈\n", - "Rm x m\n", - "* Bias for hidden units: \n", - "bh\n", - "∈\n", - "R\n", - "m\n", - "*Weight for the dense layer: \n", - "wy\n", - "∈\n", - "R\n", - "m\n", - "*Bias for the dense layer: \n", - "by\n", - "∈\n", - "R\n", - "\n", - "Let’s look at the above weights. The weights are generated randomly so they will be different every time. The important thing is to learn what the structure of each object being used looks like and how it interacts with others to produce the final output. 
The get_weights() method of the model object returns a list of arrays, which consists of the weights and the bias of each layer, in the order of the layers. The first layer's input takes two entries, the (external) input values and the values of the hidden variables from the previous step." - ] - }, - { - "cell_type": "code", - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/" - }, - "id": "f-C4rYMDL1_l", - "outputId": "f6d67620-1382-4ec7-9562-917469c4a5df" - }, - "source": [ - "w = demo_model.get_weights()\n", - "#print(len(w),' weight arrays:',w)\n", - "wname=('wx','wh','bh','wy','by','wz','bz')\n", - "for i in range(len(w)):\n", - " print(i,':',wname[i],'shape=',w[i].shape)\n", - "\n", - "wx = w[0]\n", - "wh = w[1]\n", - "bh = w[2]\n", - "wy = w[3]\n", - "by = w[4]" - ], - "execution_count": 8, - "outputs": [ - { - "output_type": "stream", - "name": "stdout", - "text": [ - "0 : wx shape= (1, 2)\n", - "1 : wh shape= (2, 2)\n", - "2 : bh shape= (2,)\n", - "3 : wy shape= (2, 1)\n", - "4 : by shape= (1,)\n" - ] - } - ] - }, - { - "cell_type": "code", - "source": [ - "# help(SimpleRNN)" - ], - "metadata": { - "id": "kNzpNlwP4MSs" - }, - "execution_count": 9, - "outputs": [] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "oks2sHZlUQZB" - }, - "source": [ - "Now let’s do a simple experiment to see how the layers from a SimpleRNN and Dense layer produce an output. Keep this figure in view.\n", - "" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "eGi9cUkgUkTe" - }, - "source": [ - "We’ll input x for three time steps and let the network generate an output. The values of the hidden units at time steps 1, 2 and 3 will be computed. \n", - "h0\n", - " is initialized to the zero vector. The output \n", - "o3\n", - " is computed from \n", - "h3\n", - " and \n", - "w3\n", - ". 
An activation function is linear, f(x)=x, so the update of $h(k)$ and the output $o(k)$ are given by" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "QSD8ABJXxS7K" - }, - "source": [ - "\\begin{align*}\n", - "h\\left( 0\\right) = &\\left[\n", - "\\begin{array}\n", - "[c]{c}%\n", - "0\\\\\n", - "0\n", - "\\end{array}\n", - "\\right] \\\\\n", - "h\\left( k+1\\right) =&x\\left( k\\right) \\left[\n", - "\\begin{array}\n", - "[c]{c}%\n", - "w_{x,0}\\\\\n", - "w_{x,1}%\n", - "\\end{array}\n", - "\\right] +\\left[ h_{0}(k),h_{1}(k)\\right] \\left[\n", - "\\begin{array}\n", - "[c]{cc}%\n", - "w_{h,00} & w_{h,01}\\\\\n", - "w_{h,10} & w_{h,11}%\n", - "\\end{array}\n", - "\\right] +\\left[\n", - "\\begin{array}\n", - "[c]{c}%\n", - "b_{h,0}\\\\\n", - "b_{h,1}%\n", - "\\end{array}\n", - "\\right] \\\\\n", - "o(k+1)=& \\left[ h_{0}(k+1),h_{1}(k+1)\\right] \\left[\n", - "\\begin{array}\n", - "[c]{c}%\n", - "w_{y,0}\\\\\n", - "w_{y,1}%\n", - "\\end{array}\n", - "\\right] \n", - "\\end{align*}\n" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "MAvXBZodx8RP" - }, - "source": [ - "We compute this for $k=1,2,3$ and compare with the output of the model:" - ] - }, - { - "cell_type": "code", - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/" - }, - "id": "kMOhAImzL7Ch", - "outputId": "c726a247-33ef-469b-867a-a861d12d4c94" - }, - "source": [ - "w = demo_model.get_weights()\n", - "wx = w[0]\n", - "wh = w[1]\n", - "bh = w[2]\n", - "wy = w[3]\n", - "by = w[4]\n", - "x = np.array([1, 2, 3])\n", - "# Reshape the input to the required sample_size x time_steps x features \n", - "x_input = np.reshape(x,(1, 3, 1))\n", - "y_pred_model = demo_model.predict(x_input)\n", - "\n", - "\n", - "m = 2\n", - "h0 = np.zeros(m)\n", - "h1 = np.dot(x[0], wx) + np.dot(h0,wh) + bh\n", - "h2 = np.dot(x[1], wx) + np.dot(h1,wh) + bh\n", - "h3 = np.dot(x[2], wx) + np.dot(h2,wh) + bh\n", - "o3 = np.dot(h3, wy) + by\n", - "\n", - "print('h1 = ', h1,'h2 = ', h2,'h3 = 
', h3)\n", - "\n", - "print(\"Prediction from network \", y_pred_model)\n", - "print(\"Prediction from our computation \", o3)" - ], - "execution_count": 10, - "outputs": [ - { - "output_type": "stream", - "name": "stdout", - "text": [ - "h1 = [[-1.06725907 0.72348773]] h2 = [[-0.85886271 1.63453803]] h3 = [[-1.46430677 1.54551815]]\n", - "Prediction from network [[0.42967746]]\n", - "Prediction from our computation [[0.42967746]]\n" - ] - } - ] - }, - { - "cell_type": "code", - "source": [ - "# the same using arrays\n", - "demo_model = create_RNN_functional(hidden_units=2, dense_units=1, input_shape=(3,1), \n", - " activation=['linear', 'linear'],return_sequences=True)\n", - "w = demo_model.get_weights()\n", - "\n", - "x = np.array([1, 2, 3])\n", - "# Reshape the input to the required sample_size x time_steps x features \n", - "x_input = np.reshape(x,(1, 3, 1))\n", - "y_pred_model = demo_model.predict(x_input)\n", - "\n", - "h = np.zeros(2)\n", - "o = np.empty(3)\n", - "for i in range(3):\n", - " h = np.dot(x[i], w[0]) + np.dot(h, w[1]) + w[2]\n", - " o[i]=np.dot(h, w[3]) + w[4]\n", - "\n", - "print(\"Prediction from network \", y_pred_model)\n", - "print(\"Prediction from our computation \", o)" - ], - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/" - }, - "id": "Q2kuhi2KBNY4", - "outputId": "7d5d5917-8bea-4221-f9ef-2a85f2d96776" - }, - "execution_count": 11, - "outputs": [ - { - "output_type": "stream", - "name": "stdout", - "text": [ - "Prediction from network [[[1.0083389 ]\n", - " [0.89703053]\n", - " [1.7940611 ]]]\n", - "Prediction from our computation [1.00833888 0.89703049 1.794061 ]\n" - ] - } - ] - }, - { - "cell_type": "code", - "source": [ - "# stateful\n", - "demo_model = create_RNN_functional(hidden_units=2, dense_units=1, input_shape=(3,1), \n", - " activation=['linear', 'linear'],return_sequences=True)\n", - "w = demo_model.get_weights()\n", - "\n", - "x = np.array([1, 2, 3])\n", - "# Reshape the input to the required 
sample_size x time_steps x features \n", - "x_input = np.reshape(x,(1, 3, 1))\n", - "y_pred_model = demo_model.predict(x_input)\n", - "\n", - "h = np.zeros(2)\n", - "o = np.empty(3)\n", - "for i in range(3):\n", - " h = np.dot(x[i], w[0]) + np.dot(h, w[1]) + w[2]\n", - " o[i]=np.dot(h, w[3]) + w[4]\n", - "\n", - "print(\"Prediction from network \", y_pred_model)\n", - "print(\"Prediction from our computation \", o)" - ], - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/" - }, - "id": "SLYRVhFGHa4r", - "outputId": "ffcae068-f129-496e-f9c1-2b5d8ee59e5a" - }, - "execution_count": 12, - "outputs": [ - { - "output_type": "stream", - "name": "stdout", - "text": [ - "Prediction from network [[[0.02811029]\n", - " [0.50526243]\n", - " [1.0105247 ]]]\n", - "Prediction from our computation [0.02811029 0.5052624 1.01052474]\n" - ] - } - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "JopR12ZaVAyS" - }, - "source": [ - "The predictions came out the same! This confirms that we know what the network is doing." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "dHGF-tofMpJP" - }, - "source": [ - "## Step 1, 2: Reading Data and Splitting Into Train And Test" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "5x748YZuY-yL" - }, - "source": [ - "The following function reads the train and test data from a given URL and splits it into a given percentage of train and test data. It returns single dimensional arrays for train and test data after scaling the data between 0 and 1 using MinMaxScaler from scikit-learn." 
- ] - }, - { - "cell_type": "code", - "metadata": { - "id": "JyrxUuiuL8gv" - }, - "source": [ - "# Parameter split_percent defines the ratio of training examples\n", - "def get_train_test(data, split_percent=0.8):\n", - " scaler = MinMaxScaler(feature_range=(0, 1))\n", - " data = scaler.fit_transform(data).flatten()\n", - " n = len(data)\n", - " # Point for splitting data into train and test\n", - " split = int(n*split_percent)\n", - " train_data = data[range(split)]\n", - " test_data = data[split:]\n", - " return train_data, test_data, data\n", - "\n", - "sunspots_url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/monthly-sunspots.csv'\n", - "df = read_csv(sunspots_url, usecols=[1], engine='python')\n", - "train_data, test_data, data = get_train_test(np.array(df.values.astype('float32')))" - ], - "execution_count": 13, - "outputs": [] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "iCsHwJOcZMJ7" - }, - "source": [ - "Let's print the data shape so that we know what we got." - ] - }, - { - "cell_type": "code", - "metadata": { - "id": "h5AmHug8JViT", - "colab": { - "base_uri": "https://localhost:8080/" - }, - "outputId": "6a29edd5-9583-4443-86b9-edbdff8050c8" - }, - "source": [ - "data.shape" - ], - "execution_count": 14, - "outputs": [ - { - "output_type": "execute_result", - "data": { - "text/plain": [ - "(2820,)" - ] - }, - "metadata": {}, - "execution_count": 14 - } - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "QHoBV8CSMt44" - }, - "source": [ - "## Step 3: Reshaping Data For Keras" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "B1CW_mu8Zbwb" - }, - "source": [ - "The next step is to prepare the data for Keras model training. The input array should be shaped as: **(total_samples, x time_steps, x features)**.\n", - "There are many ways of preparing time series data for training. We’ll create input rows with non-overlapping time steps. An example is shown in the figure below. 
Here time_steps denotes the number of previous time steps to use for predicting the next value of the time series data. We have for time_steps = 2, features = 1, and the first 6 terms are split total_samples=3 samples: 0, 10 predict the next term 20, then 20, 30 predict the next term 40, etc." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "OeEc_dmqZmtx" - }, - "source": [ - "" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "iLqm8291Pd5X" - }, - "source": [ - "The following function get_XY() takes a one dimensional array as input and converts it to the required input X and target Y arrays." - ] - }, - { - "cell_type": "code", - "metadata": { - "id": "IxJEj52BL__o" - }, - "source": [ - "# Prepare the input X and target Y\n", - "def get_XY(dat, time_steps):\n", - " # Indices of target array\n", - " Y_ind = np.arange(time_steps, len(dat), time_steps)\n", - " Y = dat[Y_ind]\n", - " # Prepare X\n", - " rows_x = len(Y)\n", - " X = dat[range(time_steps*rows_x)]\n", - " X = np.reshape(X, (rows_x, time_steps, 1)) \n", - " return X, Y" - ], - "execution_count": 15, - "outputs": [] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "RFhadJjzQO7p" - }, - "source": [ - "For illustration, on the simple example above it returns the expected result: " - ] - }, - { - "cell_type": "code", - "metadata": { - "id": "V38oXJ32QiFK", - "colab": { - "base_uri": "https://localhost:8080/" - }, - "outputId": "78928106-fde1-494d-b317-7acb619e1bb5" - }, - "source": [ - "dat = np.linspace(0.,70.,8).reshape(-1,1)\n", - "print(\"dat shape=\",dat.shape)\n", - "X, Y = get_XY(dat, 2)\n", - "print(\"X shape=\",X.shape)\n", - "print(\"Y shape=\",Y.shape)\n", - "#print('dat=',dat)\n", - "print('X=',X)\n", - "print('Y=',Y)\n" - ], - "execution_count": 16, - "outputs": [ - { - "output_type": "stream", - "name": "stdout", - "text": [ - "dat shape= (8, 1)\n", - "X shape= (3, 2, 1)\n", - "Y shape= (3, 1)\n", - "X= [[[ 0.]\n", - " [10.]]\n", - "\n", - " [[20.]\n", - 
" [30.]]\n", - "\n", - " [[40.]\n", - " [50.]]]\n", - "Y= [[20.]\n", - " [40.]\n", - " [60.]]\n" - ] - } - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "4V4IE7TvQpDW" - }, - "source": [ - "Now use it for the sunspot data. We’ll use 12 time_steps for the sunspots dataset as the sunspots generally have a cycle of 12 months. You can experiment with other values of time_steps." - ] - }, - { - "cell_type": "code", - "metadata": { - "id": "YBnBPDxiQsjL", - "colab": { - "base_uri": "https://localhost:8080/" - }, - "outputId": "adb1b3e9-ce11-42b7-855b-66db73c58a51" - }, - "source": [ - "time_steps = 24\n", - "trainX, trainY = get_XY(train_data, time_steps)\n", - "testX, testY = get_XY(test_data, time_steps)\n", - "print(\"trainX shape=\",trainX.shape)\n", - "print(\"trainY shape=\",trainY.shape)\n", - "print(\"testX shape=\",testX.shape)\n", - "print(\"testY shape=\",testY.shape)" - ], - "execution_count": 17, - "outputs": [ - { - "output_type": "stream", - "name": "stdout", - "text": [ - "trainX shape= (93, 24, 1)\n", - "trainY shape= (93,)\n", - "testX shape= (23, 24, 1)\n", - "testY shape= (23,)\n" - ] - } - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "Xz2JRTGKMzo2" - }, - "source": [ - "## Step 4: Create RNN Model And Train" - ] - }, - { - "cell_type": "code", - "metadata": { - "id": "SyAE6XLnMGDO", - "colab": { - "base_uri": "https://localhost:8080/" - }, - "outputId": "28868a27-1106-405e-e7ca-cc7573bd8286" - }, - "source": [ - "model = create_RNN(hidden_units=3, dense_units=1, input_shape=(time_steps,1), \n", - " activation=['tanh', 'tanh'])\n", - "model.fit(trainX, trainY, epochs=20, batch_size=1, verbose=2)" - ], - "execution_count": 18, - "outputs": [ - { - "output_type": "stream", - "name": "stdout", - "text": [ - "using functional api\n", - "Epoch 1/20\n", - "93/93 - 2s - loss: 0.0133 - 2s/epoch - 22ms/step\n", - "Epoch 2/20\n", - "93/93 - 0s - loss: 0.0103 - 497ms/epoch - 5ms/step\n", - "Epoch 3/20\n", - "93/93 - 1s - loss: 
0.0090 - 550ms/epoch - 6ms/step\n", - "Epoch 4/20\n", - "93/93 - 0s - loss: 0.0079 - 479ms/epoch - 5ms/step\n", - "Epoch 5/20\n", - "93/93 - 1s - loss: 0.0071 - 526ms/epoch - 6ms/step\n", - "Epoch 6/20\n", - "93/93 - 1s - loss: 0.0069 - 620ms/epoch - 7ms/step\n", - "Epoch 7/20\n", - "93/93 - 1s - loss: 0.0065 - 521ms/epoch - 6ms/step\n", - "Epoch 8/20\n", - "93/93 - 1s - loss: 0.0061 - 591ms/epoch - 6ms/step\n", - "Epoch 9/20\n", - "93/93 - 1s - loss: 0.0060 - 550ms/epoch - 6ms/step\n", - "Epoch 10/20\n", - "93/93 - 1s - loss: 0.0058 - 591ms/epoch - 6ms/step\n", - "Epoch 11/20\n", - "93/93 - 1s - loss: 0.0058 - 532ms/epoch - 6ms/step\n", - "Epoch 12/20\n", - "93/93 - 1s - loss: 0.0055 - 593ms/epoch - 6ms/step\n", - "Epoch 13/20\n", - "93/93 - 1s - loss: 0.0053 - 618ms/epoch - 7ms/step\n", - "Epoch 14/20\n", - "93/93 - 1s - loss: 0.0052 - 616ms/epoch - 7ms/step\n", - "Epoch 15/20\n", - "93/93 - 1s - loss: 0.0051 - 587ms/epoch - 6ms/step\n", - "Epoch 16/20\n", - "93/93 - 1s - loss: 0.0051 - 571ms/epoch - 6ms/step\n", - "Epoch 17/20\n", - "93/93 - 1s - loss: 0.0050 - 598ms/epoch - 6ms/step\n", - "Epoch 18/20\n", - "93/93 - 1s - loss: 0.0050 - 559ms/epoch - 6ms/step\n", - "Epoch 19/20\n", - "93/93 - 1s - loss: 0.0049 - 655ms/epoch - 7ms/step\n", - "Epoch 20/20\n", - "93/93 - 1s - loss: 0.0049 - 519ms/epoch - 6ms/step\n" - ] - }, - { - "output_type": "execute_result", - "data": { - "text/plain": [ - "" - ] - }, - "metadata": {}, - "execution_count": 18 - } - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "tluiPIaxM9FH" - }, - "source": [ - "## Step 5: Compute And Print The Root Mean Square Error" - ] - }, - { - "cell_type": "code", - "metadata": { - "id": "LnWdCmqJMK-k", - "colab": { - "base_uri": "https://localhost:8080/" - }, - "outputId": "61fa0c32-96e7-4af5-cc84-2853c8bccee9" - }, - "source": [ - "def print_error(trainY, testY, train_predict, test_predict): \n", - " # Error of predictions\n", - " train_rmse = math.sqrt(mean_squared_error(trainY, 
train_predict))\n", - " test_rmse = math.sqrt(mean_squared_error(testY, test_predict))\n", - " # Print RMSE\n", - " print('Train RMSE: %.3f RMSE' % (train_rmse))\n", - " print('Test RMSE: %.3f RMSE' % (test_rmse)) \n", - "\n", - "# make predictions\n", - "train_predict = model.predict(trainX)\n", - "test_predict = model.predict(testX)\n", - "# Mean square error\n", - "print_error(trainY, testY, train_predict, test_predict)" - ], - "execution_count": 19, - "outputs": [ - { - "output_type": "stream", - "name": "stdout", - "text": [ - "Train RMSE: 0.069 RMSE\n", - "Test RMSE: 0.102 RMSE\n" - ] - } - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "eed7244NNHs9" - }, - "source": [ - "" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "nDwJvflyNGz3" - }, - "source": [ - "## Step 6: View The result" - ] - }, - { - "cell_type": "code", - "metadata": { - "id": "fQPjWHWvMMGL", - "colab": { - "base_uri": "https://localhost:8080/", - "height": 420 - }, - "outputId": "ad07c596-5267-4cba-cca7-e44f8649cff7" - }, - "source": [ - "# Plot the result\n", - "def plot_result(trainY, testY, train_predict, test_predict):\n", - " actual = np.append(trainY, testY)\n", - " predictions = np.append(train_predict, test_predict)\n", - " rows = len(actual)\n", - " plt.figure(figsize=(15, 6), dpi=80)\n", - " plt.plot(range(rows), actual)\n", - " plt.plot(range(rows), predictions)\n", - " plt.axvline(x=len(trainY), color='r')\n", - " plt.legend(['Actual', 'Predictions'])\n", - " plt.xlabel('Observation number after given time steps')\n", - " plt.ylabel('Sunspots scaled')\n", - " plt.title('Actual and Predicted Values. The Red Line Separates The Training And Test Examples')\n", - "plot_result(trainY, testY, train_predict, test_predict)" - ], - "execution_count": 20, - "outputs": [ - { - "output_type": "display_data", - "data": { - "text/plain": [ - "
" - ], - "image/png": "\n" - }, - "metadata": { - "needs_background": "light" - } - } - ] - } - ] -} \ No newline at end of file +{ + "nbformat": 4, + "nbformat_minor": 0, + "metadata": { + "colab": { + "name": "rnn_tutorial_jm.ipynb", + "provenance": [], + "collapsed_sections": [] + }, + "kernelspec": { + "name": "python3", + "display_name": "Python 3" + }, + "language_info": { + "name": "python" + } + }, + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "hEhvqXQBNN6r" + }, + "source": [ + "# Understanding Simple Recurrent Neural Networks In Keras\n", + "From https://machinelearningmastery.com/understanding-simple-recurrent-neural-networks-in-keras/ \n", + "With changes by Jan Mandel\n", + "2021-12-26: small changes for clarity\n", + "2022-06-16: added same by functional interface" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "pINycjukQijP" + }, + "source": [ + "This tutorial is designed for anyone looking for an understanding of how recurrent neural networks (RNN) work and how to use them via the Keras deep learning library. While all the methods required for solving problems and building applications are provided by the Keras library, it is also important to gain an insight on how everything works. In this article, the computations taking place in the RNN model are shown step by step. Next, a complete end to end system for time series prediction is developed.\n", + "\n", + "After completing this tutorial, you will know:\n", + "\n", + "* The structure of RNN\n", + "* How RNN computes the output when given an input\n", + "* How to prepare data for a SimpleRNN in Keras\n", + "* How to train a SimpleRNN model\n", + "\n", + "Let’s get started." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "3D7sqhF-QtJ4" + }, + "source": [ + "## Tutorial Overview\n", + "\n", + "This tutorial is divided into two parts; they are:\n", + "\n", + "1. The structure of the RNN\n", + " 1. 
Different weights and biases associated with different layers of the RNN.\n", + " 2. How computations are performed to compute the output when given an input.\n", + "2. A complete application for time series prediction." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "jU0opwkmQ9wm" + }, + "source": [ + "## Prerequisites\n", + "\n", + "It is assumed that you have a basic understanding of RNNs before you start implementing them. An [Introduction To Recurrent Neural Networks And The Math That Powers Them](https://machinelearningmastery.com/an-introduction-to-recurrent-neural-networks-and-the-math-that-powers-them) gives you a quick overview of RNNs.\n", + "\n", + "Let’s now get right down to the implementation part." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "-UcYCr1FMaE7" + }, + "source": [ + "## Import section" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "FimVgAB4LsI9" + }, + "source": [ + "from pandas import read_csv\n", + "import numpy as np\n", + "from keras.models import Sequential\n", + "from keras.layers import Dense, SimpleRNN\n", + "from sklearn.preprocessing import MinMaxScaler\n", + "from sklearn.metrics import mean_squared_error\n", + "import math\n", + "import matplotlib.pyplot as plt\n", + "import tensorflow as tf" + ], + "execution_count": 18, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "O-okDDv9Md4Z" + }, + "source": [ + "## Keras SimpleRNN" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "W-WQZp0qRta0" + }, + "source": [ + "The function below returns a model that includes a SimpleRNN layer and a Dense layer for learning sequential data. The input_shape specifies the parameter (time_steps x features). We’ll simplify everything and use univariate data, i.e., one feature only; the time_steps are discussed below." 
+ ] + }, + { + "cell_type": "code", + "source": [ + "api_type = 2 # 1 = sequential, 2 = functional" + ], + "metadata": { + "id": "Qf35beQp1Ewv" + }, + "execution_count": 19, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "Aoh0LWQ7Ltdk" + }, + "source": [ + "def create_RNN_sequential(hidden_units, dense_units, input_shape, activation):\n", + " model = Sequential()\n", + " model.add(SimpleRNN(hidden_units, input_shape=input_shape, \n", + " activation=activation[0]))\n", + " model.add(Dense(units=dense_units, activation=activation[1]))\n", + " model.compile(loss='mean_squared_error', optimizer='adam')\n", + " return model" + ], + "execution_count": 20, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "def create_RNN_functional(hidden_units, dense_units, input_shape=None, activation=None,\n", + " return_sequences=False,stateful=False,batch_shape=None):\n", + " if stateful:\n", + " inputs = tf.keras.Input(batch_shape=batch_shape)\n", + " else:\n", + " inputs = tf.keras.Input(shape=input_shape)\n", + " #inputs = tf.keras.Input(shape=input_shape)\n", + " x = tf.keras.layers.SimpleRNN(hidden_units,\n", + " return_sequences=return_sequences,\n", + " stateful=stateful,\n", + " activation=activation[0])(inputs)\n", + " outputs = tf.keras.layers.Dense(dense_units, activation=activation[1])(x)\n", + " model = tf.keras.Model(inputs=inputs, outputs=outputs)\n", + " model.compile(loss='mean_squared_error', optimizer='adam')\n", + " return model" + ], + "metadata": { + "id": "zBQxW0h8pvGb" + }, + "execution_count": 33, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "def create_RNN(hidden_units, dense_units, input_shape, activation):\n", + " if api_type==1:\n", + " print('using sequential api')\n", + " return create_RNN_sequential(hidden_units, dense_units, input_shape, activation)\n", + " if api_type==2:\n", + " print('using functional api')\n", + " return create_RNN_functional(hidden_units, dense_units, input_shape, 
activation)\n", + " print('api_type must be 1 or 2, got ',api_type)\n", + " raise(ValueError)\n", + "\n", + "demo_model = create_RNN(hidden_units=2, dense_units=1, input_shape=(3,1), \n", + " activation=['linear', 'linear'])" + ], + "metadata": { + "id": "dYLai9HypVRo", + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "10369aad-12d7-4a76-bb7f-a77bb7418a3f" + }, + "execution_count": 22, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "using functional api\n" + ] + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "_CZtOyDPSuMy" + }, + "source": [ + "The object demo_model is returned with 2 hidden units created via the SimpleRNN layer and 1 dense unit created via the Dense layer. The input_shape is set to 3×1, and a linear activation function is used in both layers for simplicity. Recall that the linear activation function f(x)=x leaves its input unchanged." + ] + }, + { + "cell_type": "code", + "source": [ + "print(dir(demo_model))\n", + "# help(demo_model)\n", + "help(demo_model.get_weights)" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "TgDQmxwQpUtw", + "outputId": "0e5a264f-3e1f-4e7c-83b7-683af91accae" + }, + "execution_count": 23, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "['_SCALAR_UPRANKING_ON', '_TF_MODULE_IGNORED_PROPERTIES', '__call__', '__class__', '__copy__', '__deepcopy__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_activity_regularizer', '_add_trackable', '_add_trackable_child', '_add_variable_with_custom_getter', '_assert_compile_was_called', '_assert_weights_created', 
'_auto_track_sub_layers', '_autocast', '_autographed_call', '_base_model_initialized', '_build_input_shape', '_call_accepts_kwargs', '_call_arg_was_passed', '_call_fn_arg_defaults', '_call_fn_arg_positions', '_call_fn_args', '_call_full_argspec', '_callable_losses', '_cast_single_input', '_check_call_args', '_checkpoint_dependencies', '_clear_losses', '_cluster_coordinator', '_compile_was_called', '_compiled_trainable_state', '_compute_dtype', '_compute_dtype_object', '_compute_output_and_mask_jointly', '_compute_tensor_usage_count', '_configure_steps_per_execution', '_conform_to_reference_input', '_dedup_weights', '_default_training_arg', '_deferred_dependencies', '_delete_tracking', '_deserialization_dependencies', '_deserialize_from_proto', '_distribution_strategy', '_dtype', '_dtype_policy', '_dynamic', '_eager_losses', '_enable_dict_to_input_mapping', '_expects_mask_arg', '_expects_training_arg', '_feed_input_names', '_feed_input_shapes', '_feed_inputs', '_flatten', '_flatten_layers', '_flatten_modules', '_flatten_to_reference_inputs', '_functional_construction_call', '_gather_children_attribute', '_gather_saveables_for_checkpoint', '_get_call_arg_value', '_get_callback_model', '_get_cell_name', '_get_compile_args', '_get_existing_metric', '_get_input_masks', '_get_legacy_saved_model_children', '_get_node_attribute_at_index', '_get_optimizer', '_get_save_spec', '_get_trainable_state', '_graph_network_add_loss', '_graph_network_add_metric', '_handle_activity_regularization', '_handle_deferred_dependencies', '_handle_deferred_layer_dependencies', '_handle_weight_regularization', '_in_multi_worker_mode', '_inbound_nodes', '_inbound_nodes_value', '_infer_output_signature', '_init_batch_counters', '_init_call_fn_args', '_init_graph_network', '_init_set_name', '_initial_weights', '_input_coordinates', '_input_layers', '_input_spec', '_insert_layers', '_instrument_layer_creation', '_instrumented_keras_api', '_instrumented_keras_layer_class', 
'_instrumented_keras_model_class', '_is_compiled', '_is_graph_network', '_is_layer', '_is_model_for_instrumentation', '_jit_compile', '_keras_api_names', '_keras_api_names_v1', '_keras_tensor_symbolic_call', '_layer_call_argspecs', '_layer_checkpoint_dependencies', '_list_extra_dependencies_for_serialization', '_list_functions_for_serialization', '_lookup_dependency', '_losses', '_map_resources', '_maybe_build', '_maybe_cast_inputs', '_maybe_create_attribute', '_maybe_initialize_trackable', '_maybe_load_initial_epoch_from_ckpt', '_metrics', '_metrics_lock', '_must_restore_from_config', '_name', '_name_based_attribute_restore', '_name_based_restores', '_name_scope', '_nested_inputs', '_nested_outputs', '_network_nodes', '_no_dependency', '_nodes_by_depth', '_non_trainable_weights', '_obj_reference_counts', '_obj_reference_counts_dict', '_object_identifier', '_outbound_nodes', '_outbound_nodes_value', '_outer_name_scope', '_output_coordinates', '_output_layers', '_output_mask_cache', '_output_shape_cache', '_output_tensor_cache', '_predict_counter', '_preload_simple_restoration', '_preserve_input_structure_in_config', '_reset_compile_cache', '_restore_from_checkpoint_position', '_run_eagerly', '_run_internal_graph', '_saved_model_arg_spec', '_saved_model_inputs_spec', '_self_name_based_restores', '_self_saveable_object_factories', '_self_setattr_tracking', '_self_tracked_trackables', '_self_unconditional_checkpoint_dependencies', '_self_unconditional_deferred_dependencies', '_self_unconditional_dependency_names', '_self_update_uid', '_serialize_to_proto', '_set_call_arg_value', '_set_connectivity_metadata', '_set_dtype_policy', '_set_inputs', '_set_mask_keras_history_checked', '_set_mask_metadata', '_set_output_names', '_set_save_spec', '_set_trainable_state', '_set_training_mode', '_setattr_tracking', '_should_cast_single_input', '_should_compute_mask', '_should_eval', '_single_restoration_from_checkpoint_position', '_split_out_first_arg', '_stateful', 
'_steps_per_execution', '_supports_masking', '_symbolic_call', '_tensor_usage_count', '_test_counter', '_tf_api_names', '_tf_api_names_v1', '_thread_local', '_track_trackable', '_trackable_children', '_trackable_saved_model_saver', '_trackable_saver', '_tracking_metadata', '_train_counter', '_trainable', '_trainable_weights', '_training_state', '_unconditional_checkpoint_dependencies', '_unconditional_dependency_names', '_undeduplicated_weights', '_update_uid', '_updated_config', '_updates', '_use_input_spec_as_call_signature', '_validate_compile', '_validate_graph_inputs_and_outputs', '_validate_target_and_loss', 'activity_regularizer', 'add_loss', 'add_metric', 'add_update', 'add_variable', 'add_weight', 'apply', 'build', 'built', 'call', 'compile', 'compiled_loss', 'compiled_metrics', 'compute_dtype', 'compute_loss', 'compute_mask', 'compute_metrics', 'compute_output_shape', 'compute_output_signature', 'count_params', 'distribute_strategy', 'dtype', 'dtype_policy', 'dynamic', 'evaluate', 'evaluate_generator', 'finalize_state', 'fit', 'fit_generator', 'from_config', 'get_config', 'get_input_at', 'get_input_mask_at', 'get_input_shape_at', 'get_layer', 'get_losses_for', 'get_output_at', 'get_output_mask_at', 'get_output_shape_at', 'get_updates_for', 'get_weights', 'history', 'inbound_nodes', 'input', 'input_mask', 'input_names', 'input_shape', 'input_spec', 'inputs', 'layers', 'load_weights', 'loss', 'losses', 'make_predict_function', 'make_test_function', 'make_train_function', 'metrics', 'metrics_names', 'name', 'name_scope', 'non_trainable_variables', 'non_trainable_weights', 'optimizer', 'outbound_nodes', 'output', 'output_mask', 'output_names', 'output_shape', 'outputs', 'predict', 'predict_function', 'predict_generator', 'predict_on_batch', 'predict_step', 'reset_metrics', 'reset_states', 'run_eagerly', 'save', 'save_spec', 'save_weights', 'set_weights', 'state_updates', 'stateful', 'stop_training', 'submodules', 'summary', 'supports_masking', 
'test_function', 'test_on_batch', 'test_step', 'to_json', 'to_yaml', 'train_function', 'train_on_batch', 'train_step', 'train_tf_function', 'trainable', 'trainable_variables', 'trainable_weights', 'updates', 'variable_dtype', 'variables', 'weights', 'with_name_scope']\n", + "Help on method get_weights in module keras.engine.training:\n", + "\n", + "get_weights() method of keras.engine.functional.Functional instance\n", + " Retrieves the weights of the model.\n", + " \n", + " Returns:\n", + " A flat list of Numpy arrays.\n", + "\n" + ] + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "BI0vVCjQTBPF" + }, + "source": [ + "Look at the model following https://machinelearningmastery.com/visualize-deep-learning-neural-network-model-keras/ :" + ] + }, + { + "cell_type": "code", + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 598 + }, + "id": "Rdz2nblaTMV5", + "outputId": "602e15fa-2bc6-4453-bb12-503ed600a5cf" + }, + "source": [ + "print(demo_model.summary())\n", + "from keras.utils.vis_utils import plot_model\n", + "plot_model(demo_model, to_file='model_plot.png', \n", + " show_shapes=True, show_layer_names=True)" + ], + "execution_count": 24, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Model: \"model_2\"\n", + "_________________________________________________________________\n", + " Layer (type) Output Shape Param # \n", + "=================================================================\n", + " input_5 (InputLayer) [(None, 3, 1)] 0 \n", + " \n", + " simple_rnn_2 (SimpleRNN) (None, 2) 8 \n", + " \n", + " dense_2 (Dense) (None, 1) 3 \n", + " \n", + "=================================================================\n", + "Total params: 11\n", + "Trainable params: 11\n", + "Non-trainable params: 0\n", + "_________________________________________________________________\n", + "None\n" + ] + }, + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "" + ], + "image/png": 
"\n" + }, + "metadata": {}, + "execution_count": 24 + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "6KMS82uNTZz4" + }, + "source": [ + "If we have $m$ hidden units ($m=2$ in the above case), then:\n", + "* Input: $x \\in \\mathbb{R}$\n", + "* Hidden unit: $h \\in \\mathbb{R}^{m}$\n", + "* Weights for input units: $w_x \\in \\mathbb{R}^{m}$\n", + "* Weights for hidden units: $w_h \\in \\mathbb{R}^{m \\times m}$\n", + "* Bias for hidden units: $b_h \\in \\mathbb{R}^{m}$\n", + "* Weight for the dense layer: $w_y \\in \\mathbb{R}^{m}$\n", + "* Bias for the dense layer: $b_y \\in \\mathbb{R}$\n", + "\n", + "Let’s look at the above weights. The weights are generated randomly so they will be different every time. The important thing is to learn what the structure of each object being used looks like and how it interacts with others to produce the final output. The get_weights() method of the model object returns a list of arrays, which consists of the weights and the bias of each layer, in the order of the layers. The first layer's input takes two entries, the (external) input values and the values of the hidden variables from the previous step." 
+ ] + }, + { + "cell_type": "code", + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "f-C4rYMDL1_l", + "outputId": "99e50027-1295-4f2e-8bbc-90790397728d" + }, + "source": [ + "w = demo_model.get_weights()\n", + "#print(len(w),' weight arrays:',w)\n", + "wname=('wx','wh','bh','wy','by','wz','bz')\n", + "for i in range(len(w)):\n", + " print(i,':',wname[i],'shape=',w[i].shape)\n", + "\n", + "wx = w[0]\n", + "wh = w[1]\n", + "bh = w[2]\n", + "wy = w[3]\n", + "by = w[4]" + ], + "execution_count": 25, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "0 : wx shape= (1, 2)\n", + "1 : wh shape= (2, 2)\n", + "2 : bh shape= (2,)\n", + "3 : wy shape= (2, 1)\n", + "4 : by shape= (1,)\n" + ] + } + ] + }, + { + "cell_type": "code", + "source": [ + "# help(SimpleRNN)" + ], + "metadata": { + "id": "kNzpNlwP4MSs" + }, + "execution_count": 26, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "oks2sHZlUQZB" + }, + "source": [ + "Now let’s do a simple experiment to see how the SimpleRNN and Dense layers produce an output. Keep this figure in view.\n", + "" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "eGi9cUkgUkTe" + }, + "source": [ + "We’ll input x for three time steps and let the network generate an output. The values of the hidden units at time steps 1, 2 and 3 will be computed. \n", + "h0\n", + " is initialized to the zero vector. The output \n", + "o3\n", + " is computed from \n", + "h3\n", + " and \n", + "wy\n", + ". 
The activation function is linear, f(x)=x, so the update of $h(k)$ and the output $o(k)$ are given by" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "QSD8ABJXxS7K" + }, + "source": [ + "\\begin{align*}\n", + "h\\left( 0\\right) = &\\left[\n", + "\\begin{array}\n", + "[c]{c}%\n", + "0\\\\\n", + "0\n", + "\\end{array}\n", + "\\right] \\\\\n", + "h\\left( k+1\\right) =&x\\left( k\\right) \\left[\n", + "\\begin{array}\n", + "[c]{c}%\n", + "w_{x,0}\\\\\n", + "w_{x,1}%\n", + "\\end{array}\n", + "\\right] +\\left[ h_{0}(k),h_{1}(k)\\right] \\left[\n", + "\\begin{array}\n", + "[c]{cc}%\n", + "w_{h,00} & w_{h,01}\\\\\n", + "w_{h,10} & w_{h,11}%\n", + "\\end{array}\n", + "\\right] +\\left[\n", + "\\begin{array}\n", + "[c]{c}%\n", + "b_{h,0}\\\\\n", + "b_{h,1}%\n", + "\\end{array}\n", + "\\right] \\\\\n", + "o(k+1)=& \\left[ h_{0}(k+1),h_{1}(k+1)\\right] \\left[\n", + "\\begin{array}\n", + "[c]{c}%\n", + "w_{y,0}\\\\\n", + "w_{y,1}%\n", + "\\end{array}\n", + "\\right] + b_{y}\n", + "\\end{align*}\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "MAvXBZodx8RP" + }, + "source": [ + "We compute this for $k=0,1,2$, which gives $h(1)$, $h(2)$, $h(3)$ and the output $o(3)$, and compare with the output of the model:" + ] + }, + { + "cell_type": "code", + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "kMOhAImzL7Ch", + "outputId": "3c518453-dec4-48b8-8785-02435d4700d0" + }, + "source": [ + "w = demo_model.get_weights()\n", + "wx = w[0]\n", + "wh = w[1]\n", + "bh = w[2]\n", + "wy = w[3]\n", + "by = w[4]\n", + "x = np.array([1, 2, 3])\n", + "# Reshape the input to the required sample_size x time_steps x features \n", + "x_input = np.reshape(x,(1, 3, 1))\n", + "y_pred_model = demo_model.predict(x_input)\n", + "\n", + "\n", + "m = 2\n", + "h0 = np.zeros(m)\n", + "h1 = np.dot(x[0], wx) + np.dot(h0,wh) + bh\n", + "h2 = np.dot(x[1], wx) + np.dot(h1,wh) + bh\n", + "h3 = np.dot(x[2], wx) + np.dot(h2,wh) + bh\n", + "o3 = 
np.dot(h3, wy) + by\n", + "\n", + "print('h1 = ', h1,'h2 = ', h2,'h3 = 
', h3)\n", + "\n", + "print(\"Prediction from network \", y_pred_model)\n", + "print(\"Prediction from our computation \", o3)" + ], + "execution_count": 27, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "h1 = [[ 0.51664114 -0.68749332]] h2 = [[ 1.86312974 -1.14933595]] h3 = [[ 3.30697657 -0.75672767]]\n", + "Prediction from network [[0.8985116]]\n", + "Prediction from our computation [[0.89851162]]\n" + ] + } + ] + }, + { + "cell_type": "code", + "source": [ + "# the same using arrays\n", + "demo_model = create_RNN_functional(hidden_units=2, dense_units=1, input_shape=(3,1), \n", + " activation=['linear', 'linear'],return_sequences=True)\n", + "w = demo_model.get_weights()\n", + "\n", + "x = np.array([1, 2, 3])\n", + "# Reshape the input to the required sample_size x time_steps x features \n", + "x_input = np.reshape(x,(1, 3, 1))\n", + "y_pred_model = demo_model.predict(x_input)\n", + "\n", + "h = np.zeros(2)\n", + "o = np.empty(3)\n", + "for i in range(3):\n", + " h = np.dot(x[i], w[0]) + np.dot(h, w[1]) + w[2]\n", + " o[i]=np.dot(h, w[3]) + w[4]\n", + "\n", + "print(\"Prediction from network \", y_pred_model)\n", + "print(\"Prediction from our computation \", o)" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "Q2kuhi2KBNY4", + "outputId": "406fae03-b596-4f8a-f072-fe20dc50e710" + }, + "execution_count": 28, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Prediction from network [[[0.139544 ]\n", + " [0.09143725]\n", + " [0.18287447]]]\n", + "Prediction from our computation [0.13954399 0.09143724 0.18287446]\n" + ] + } + ] + }, + { + "cell_type": "code", + "source": [ + "# stateful\n", + "demo_model = create_RNN_functional(hidden_units=2, dense_units=1, \n", + " activation=['linear', 'linear'],return_sequences=True,\n", + " stateful=True,batch_shape=(2,3,1))\n", + "print(demo_model.summary())\n", + "from keras.utils.vis_utils import plot_model\n", + 
"plot_model(demo_model, to_file='model_plot.png', \n", + " show_shapes=True, show_layer_names=True)\n", + "\n", + "w = demo_model.get_weights()\n", + "\n", + "x = np.array([1, 2, 3])\n", + "# Reshape the input to the required sample_size x time_steps x features \n", + "x_input = np.reshape(x,(1, 3, 1))\n", + "y_pred_model = demo_model.predict(x_input)\n", + "\n", + "h = np.zeros(2)\n", + "o = np.empty(3)\n", + "for i in range(3):\n", + " h = np.dot(x[i], w[0]) + np.dot(h, w[1]) + w[2]\n", + " o[i]=np.dot(h, w[3]) + w[4]\n", + "\n", + "print(\"Prediction from network \", y_pred_model)\n", + "print(\"Prediction from our computation \", o)" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 1000 + }, + "id": "SLYRVhFGHa4r", + "outputId": "6b98a38d-cd2c-4834-ae11-b90807d8cadf" + }, + "execution_count": 35, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Model: \"model_5\"\n", + "_________________________________________________________________\n", + " Layer (type) Output Shape Param # \n", + "=================================================================\n", + " input_9 (InputLayer) [(2, 3, 1)] 0 \n", + " \n", + " simple_rnn_6 (SimpleRNN) (2, 3, 2) 8 \n", + " \n", + " dense_5 (Dense) (2, 3, 1) 3 \n", + " \n", + "=================================================================\n", + "Total params: 11\n", + "Trainable params: 11\n", + "Non-trainable params: 0\n", + "_________________________________________________________________\n", + "None\n" + ] + }, + { + "output_type": "error", + "ename": "InvalidArgumentError", + "evalue": "ignored", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mInvalidArgumentError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[1;32m 13\u001b[0m \u001b[0;31m# Reshape the input to the required sample_size x time_steps x 
features\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 14\u001b[0m \u001b[0mx_input\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mreshape\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m3\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 15\u001b[0;31m \u001b[0my_pred_model\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mdemo_model\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mpredict\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mx_input\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 16\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 17\u001b[0m \u001b[0mh\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mzeros\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m2\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m/usr/local/lib/python3.7/dist-packages/keras/utils/traceback_utils.py\u001b[0m in \u001b[0;36merror_handler\u001b[0;34m(*args, **kwargs)\u001b[0m\n\u001b[1;32m 65\u001b[0m \u001b[0;32mexcept\u001b[0m \u001b[0mException\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0me\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;31m# pylint: disable=broad-except\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 66\u001b[0m \u001b[0mfiltered_tb\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0m_process_traceback_frames\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0me\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m__traceback__\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 67\u001b[0;31m \u001b[0;32mraise\u001b[0m 
\u001b[0me\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mwith_traceback\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mfiltered_tb\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32mfrom\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 68\u001b[0m \u001b[0;32mfinally\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 69\u001b[0m \u001b[0;32mdel\u001b[0m \u001b[0mfiltered_tb\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/execute.py\u001b[0m in \u001b[0;36mquick_execute\u001b[0;34m(op_name, num_outputs, inputs, attrs, ctx, name)\u001b[0m\n\u001b[1;32m 53\u001b[0m \u001b[0mctx\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mensure_initialized\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 54\u001b[0m tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,\n\u001b[0;32m---> 55\u001b[0;31m inputs, attrs, num_outputs)\n\u001b[0m\u001b[1;32m 56\u001b[0m \u001b[0;32mexcept\u001b[0m \u001b[0mcore\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_NotOkStatusException\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0me\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 57\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mname\u001b[0m \u001b[0;32mis\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;31mInvalidArgumentError\u001b[0m: Graph execution error:\n\nDetected at node 'model_5/simple_rnn_6/TensorArrayUnstack/TensorListFromTensor' defined at (most recent call last):\n File \"/usr/lib/python3.7/runpy.py\", line 193, in _run_module_as_main\n \"__main__\", mod_spec)\n File \"/usr/lib/python3.7/runpy.py\", line 85, in _run_code\n exec(code, run_globals)\n File 
\"/usr/local/lib/python3.7/dist-packages/ipykernel_launcher.py\", line 16, in \n app.launch_new_instance()\n File \"/usr/local/lib/python3.7/dist-packages/traitlets/config/application.py\", line 846, in launch_instance\n app.start()\n File \"/usr/local/lib/python3.7/dist-packages/ipykernel/kernelapp.py\", line 499, in start\n self.io_loop.start()\n File \"/usr/local/lib/python3.7/dist-packages/tornado/platform/asyncio.py\", line 132, in start\n self.asyncio_loop.run_forever()\n File \"/usr/lib/python3.7/asyncio/base_events.py\", line 541, in run_forever\n self._run_once()\n File \"/usr/lib/python3.7/asyncio/base_events.py\", line 1786, in _run_once\n handle._run()\n File \"/usr/lib/python3.7/asyncio/events.py\", line 88, in _run\n self._context.run(self._callback, *self._args)\n File \"/usr/local/lib/python3.7/dist-packages/tornado/platform/asyncio.py\", line 122, in _handle_events\n handler_func(fileobj, events)\n File \"/usr/local/lib/python3.7/dist-packages/tornado/stack_context.py\", line 300, in null_wrapper\n return fn(*args, **kwargs)\n File \"/usr/local/lib/python3.7/dist-packages/zmq/eventloop/zmqstream.py\", line 577, in _handle_events\n self._handle_recv()\n File \"/usr/local/lib/python3.7/dist-packages/zmq/eventloop/zmqstream.py\", line 606, in _handle_recv\n self._run_callback(callback, msg)\n File \"/usr/local/lib/python3.7/dist-packages/zmq/eventloop/zmqstream.py\", line 556, in _run_callback\n callback(*args, **kwargs)\n File \"/usr/local/lib/python3.7/dist-packages/tornado/stack_context.py\", line 300, in null_wrapper\n return fn(*args, **kwargs)\n File \"/usr/local/lib/python3.7/dist-packages/ipykernel/kernelbase.py\", line 283, in dispatcher\n return self.dispatch_shell(stream, msg)\n File \"/usr/local/lib/python3.7/dist-packages/ipykernel/kernelbase.py\", line 233, in dispatch_shell\n handler(stream, idents, msg)\n File \"/usr/local/lib/python3.7/dist-packages/ipykernel/kernelbase.py\", line 399, in execute_request\n user_expressions, 
allow_stdin)\n File \"/usr/local/lib/python3.7/dist-packages/ipykernel/ipkernel.py\", line 208, in do_execute\n res = shell.run_cell(code, store_history=store_history, silent=silent)\n File \"/usr/local/lib/python3.7/dist-packages/ipykernel/zmqshell.py\", line 537, in run_cell\n return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)\n File \"/usr/local/lib/python3.7/dist-packages/IPython/core/interactiveshell.py\", line 2718, in run_cell\n interactivity=interactivity, compiler=compiler, result=result)\n File \"/usr/local/lib/python3.7/dist-packages/IPython/core/interactiveshell.py\", line 2822, in run_ast_nodes\n if self.run_code(code, result):\n File \"/usr/local/lib/python3.7/dist-packages/IPython/core/interactiveshell.py\", line 2882, in run_code\n exec(code_obj, self.user_global_ns, self.user_ns)\n File \"\", line 15, in \n y_pred_model = demo_model.predict(x_input)\n File \"/usr/local/lib/python3.7/dist-packages/keras/utils/traceback_utils.py\", line 64, in error_handler\n return fn(*args, **kwargs)\n File \"/usr/local/lib/python3.7/dist-packages/keras/engine/training.py\", line 1982, in predict\n tmp_batch_outputs = self.predict_function(iterator)\n File \"/usr/local/lib/python3.7/dist-packages/keras/engine/training.py\", line 1801, in predict_function\n return step_function(self, iterator)\n File \"/usr/local/lib/python3.7/dist-packages/keras/engine/training.py\", line 1790, in step_function\n outputs = model.distribute_strategy.run(run_step, args=(data,))\n File \"/usr/local/lib/python3.7/dist-packages/keras/engine/training.py\", line 1783, in run_step\n outputs = model.predict_step(data)\n File \"/usr/local/lib/python3.7/dist-packages/keras/engine/training.py\", line 1751, in predict_step\n return self(x, training=False)\n File \"/usr/local/lib/python3.7/dist-packages/keras/utils/traceback_utils.py\", line 64, in error_handler\n return fn(*args, **kwargs)\n File \"/usr/local/lib/python3.7/dist-packages/keras/engine/base_layer.py\", line 1096, in 
__call__\n outputs = call_fn(inputs, *args, **kwargs)\n File \"/usr/local/lib/python3.7/dist-packages/keras/utils/traceback_utils.py\", line 92, in error_handler\n return fn(*args, **kwargs)\n File \"/usr/local/lib/python3.7/dist-packages/keras/engine/functional.py\", line 452, in call\n inputs, training=training, mask=mask)\n File \"/usr/local/lib/python3.7/dist-packages/keras/engine/functional.py\", line 589, in _run_internal_graph\n outputs = node.layer(*args, **kwargs)\n File \"/usr/local/lib/python3.7/dist-packages/keras/layers/recurrent.py\", line 679, in __call__\n return super(RNN, self).__call__(inputs, **kwargs)\n File \"/usr/local/lib/python3.7/dist-packages/keras/utils/traceback_utils.py\", line 64, in error_handler\n return fn(*args, **kwargs)\n File \"/usr/local/lib/python3.7/dist-packages/keras/engine/base_layer.py\", line 1096, in __call__\n outputs = call_fn(inputs, *args, **kwargs)\n File \"/usr/local/lib/python3.7/dist-packages/keras/utils/traceback_utils.py\", line 92, in error_handler\n return fn(*args, **kwargs)\n File \"/usr/local/lib/python3.7/dist-packages/keras/layers/recurrent.py\", line 1614, in call\n inputs, mask=mask, training=training, initial_state=initial_state)\n File \"/usr/local/lib/python3.7/dist-packages/keras/layers/recurrent.py\", line 826, in call\n zero_output_for_mask=self.zero_output_for_mask)\n File \"/usr/local/lib/python3.7/dist-packages/keras/backend.py\", line 4589, in rnn\n for ta, input_ in zip(input_ta, flatted_inputs))\n File \"/usr/local/lib/python3.7/dist-packages/keras/backend.py\", line 4589, in \n for ta, input_ in zip(input_ta, flatted_inputs))\nNode: 'model_5/simple_rnn_6/TensorArrayUnstack/TensorListFromTensor'\nSpecified a list with shape [2,1] from a tensor with shape [1,1]\n\t [[{{node model_5/simple_rnn_6/TensorArrayUnstack/TensorListFromTensor}}]] [Op:__inference_predict_function_2332]" + ] + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "JopR12ZaVAyS" + }, + "source": [ + "The 
predictions came out the same! This confirms that we know what the network is doing." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "dHGF-tofMpJP" + }, + "source": [ + "## Step 1, 2: Reading Data and Splitting Into Train And Test" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "5x748YZuY-yL" + }, + "source": [ + "The following code reads the data from a given URL, scales it between 0 and 1 using MinMaxScaler from scikit-learn, and splits it into train and test data by a given fraction. The function returns one-dimensional arrays with the train data, the test data, and the complete scaled series." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "JyrxUuiuL8gv" + }, + "source": [ + "# Parameter split_percent is the fraction of the data used for training\n", + "def get_train_test(data, split_percent=0.8):\n", + " scaler = MinMaxScaler(feature_range=(0, 1))\n", + " data = scaler.fit_transform(data).flatten()\n", + " n = len(data)\n", + " # Point for splitting data into train and test\n", + " split = int(n*split_percent)\n", + " train_data = data[:split]\n", + " test_data = data[split:]\n", + " return train_data, test_data, data\n", + "\n", + "sunspots_url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/monthly-sunspots.csv'\n", + "df = read_csv(sunspots_url, usecols=[1], engine='python')\n", + "train_data, test_data, data = get_train_test(np.array(df.values.astype('float32')))" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "iCsHwJOcZMJ7" + }, + "source": [ + "Let's print the data shape so that we know what we got."
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "h5AmHug8JViT" + }, + "source": [ + "data.shape" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "QHoBV8CSMt44" + }, + "source": [ + "## Step 3: Reshaping Data For Keras" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "B1CW_mu8Zbwb" + }, + "source": [ + "The next step is to prepare the data for Keras model training. The input array should be shaped as: **(total_samples, time_steps, features)**.\n", + "There are many ways of preparing time series data for training. We’ll create input rows with non-overlapping time steps. An example is shown in the figure below. Here time_steps denotes the number of previous time steps to use for predicting the next value of the time series data. In the example, time_steps = 2 and features = 1, and the first 6 terms are split into total_samples = 3 samples: 0, 10 predict the next term 20, then 20, 30 predict the next term 40, and so on." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "OeEc_dmqZmtx" + }, + "source": [ + "" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "iLqm8291Pd5X" + }, + "source": [ + "The following function get_XY() takes a one dimensional array as input and converts it to the required input X and target Y arrays."
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "IxJEj52BL__o" + }, + "source": [ + "# Prepare the input X and target Y\n", + "def get_XY(dat, time_steps):\n", + " # Indices of target array: every time_steps-th entry, starting at time_steps\n", + " Y_ind = np.arange(time_steps, len(dat), time_steps)\n", + " Y = dat[Y_ind]\n", + " # Prepare X: one row of time_steps consecutive values per target\n", + " rows_x = len(Y)\n", + " X = dat[:time_steps*rows_x]\n", + " X = np.reshape(X, (rows_x, time_steps, 1))\n", + " return X, Y" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "RFhadJjzQO7p" + }, + "source": [ + "For illustration, on the simple example above it returns the expected result:" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "V38oXJ32QiFK" + }, + "source": [ + "dat = np.linspace(0.,70.,8).reshape(-1,1)\n", + "print(\"dat shape=\",dat.shape)\n", + "X, Y = get_XY(dat, 2)\n", + "print(\"X shape=\",X.shape)\n", + "print(\"Y shape=\",Y.shape)\n", + "#print('dat=',dat)\n", + "print('X=',X)\n", + "print('Y=',Y)\n" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "4V4IE7TvQpDW" + }, + "source": [ + "Now use it for the sunspot data. We’ll use 24 time_steps for the sunspots dataset, that is, two years of monthly observations per sample; a 12-step, one-year window is another natural choice. You can experiment with other values of time_steps."
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "YBnBPDxiQsjL" + }, + "source": [ + "time_steps = 24\n", + "trainX, trainY = get_XY(train_data, time_steps)\n", + "testX, testY = get_XY(test_data, time_steps)\n", + "print(\"trainX shape=\",trainX.shape)\n", + "print(\"trainY shape=\",trainY.shape)\n", + "print(\"testX shape=\",testX.shape)\n", + "print(\"testY shape=\",testY.shape)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Xz2JRTGKMzo2" + }, + "source": [ + "## Step 4: Create RNN Model And Train" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "SyAE6XLnMGDO" + }, + "source": [ + "model = create_RNN(hidden_units=3, dense_units=1, input_shape=(time_steps,1), \n", + " activation=['tanh', 'tanh'])\n", + "model.fit(trainX, trainY, epochs=20, batch_size=1, verbose=2)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "tluiPIaxM9FH" + }, + "source": [ + "## Step 5: Compute And Print The Root Mean Square Error" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "LnWdCmqJMK-k" + }, + "source": [ + "def print_error(trainY, testY, train_predict, test_predict): \n", + " # Root mean square error of the predictions\n", + " train_rmse = math.sqrt(mean_squared_error(trainY, train_predict))\n", + " test_rmse = math.sqrt(mean_squared_error(testY, test_predict))\n", + " # Print RMSE\n", + " print('Train RMSE: %.3f' % (train_rmse))\n", + " print('Test RMSE: %.3f' % (test_rmse)) \n", + "\n", + "# make predictions\n", + "train_predict = model.predict(trainX)\n", + "test_predict = model.predict(testX)\n", + "# Root mean square error\n", + "print_error(trainY, testY, train_predict, test_predict)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "eed7244NNHs9" + }, + "source": [ + "" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "nDwJvflyNGz3" + }, + "source": [ + "## Step 
6: View The Result" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "fQPjWHWvMMGL" + }, + "source": [ + "# Plot the result\n", + "def plot_result(trainY, testY, train_predict, test_predict):\n", + " actual = np.append(trainY, testY)\n", + " predictions = np.append(train_predict, test_predict)\n", + " rows = len(actual)\n", + " plt.figure(figsize=(15, 6), dpi=80)\n", + " plt.plot(range(rows), actual)\n", + " plt.plot(range(rows), predictions)\n", + " plt.axvline(x=len(trainY), color='r')\n", + " plt.legend(['Actual', 'Predictions'])\n", + " plt.xlabel('Observation number after given time steps')\n", + " plt.ylabel('Sunspots scaled')\n", + " plt.title('Actual and Predicted Values. The Red Line Separates The Training And Test Examples')\n", + "plot_result(trainY, testY, train_predict, test_predict)" + ], + "execution_count": null, + "outputs": [] + } + ] +} \ No newline at end of file -- 2.11.4.GIT