jh-dev/test_station.ipynb

   1 {
   2  "cells": [
   3   {
   4    "cell_type": "markdown",
   5    "id": "53b848a1-52c7-4f89-bd36-12ae168d9a71",
   6    "metadata": {},
   7    "source": [
   8     "### Setup"
   9    ]
  10   },
  11   {
  12    "cell_type": "code",
  13    "execution_count": null,
  14    "id": "2889acbb-a0e2-4a41-8b8c-78fb77320abe",
  15    "metadata": {},
  16    "outputs": [],
  17    "source": [
  18     "# Environment\n",
  19     "import numpy as np\n",
  20     "import pandas as pd\n",
  21     "from MesoPy import Meso\n",
  22     "import matplotlib.pyplot as plt\n",
  23     "\n",
  24     "meso_token=\"4192c18707b848299783d59a9317c6e1\"\n",
  25     "m=Meso(meso_token)"
  26    ]
  27   },
  28   {
  29    "cell_type": "markdown",
  30    "id": "f670c34f-ffe4-4e93-a8ef-bde4e15f3a54",
  31    "metadata": {},
  32    "source": [
  33     "### Data Read\n",
  34     "\n",
  35     "Below we print all the RAWS stations in Colorado with full data availability over the time period in question. The search and output is done in the notebook `fine_RAWS_station`.,\n",
  36     "\n",
  37     "Note: the time period is arbitrary at this point, and the `find_RAWS_station` notebook summarized this data availability for this particular time period. We need to consider how this should work for a different time period. Question: will a particular STID give different available data series for different times, or always the same data series names?"
  38    ]
  39   },
  40   {
  41    "cell_type": "code",
  42    "execution_count": null,
  43    "id": "3a684470-b52f-4ad7-b4cb-d2ca9e68c65d",
  44    "metadata": {},
  45    "outputs": [],
  46    "source": [
  47     "# Read Mesonet data\n",
  48     "\n",
  49     "vars='air_temp,relative_humidity,precip_accum,fuel_moisture'\n",
  50     "time_start = \"201806010800\"  # June 1 2018 08:00 in format yyyymmddHHMM\n",
  51     "time_end   = \"201907200900\"  # June 20 2018 09:00 in format yyyymmddHHMM \n",
  52     "stations = pd.read_csv(\"station_df_co.csv\")\n",
  53     "stations[stations[\"fuel_moisture\"]==1].head()"
  54    ]
  55   },
  56   {
  57    "cell_type": "code",
  58    "execution_count": null,
  59    "id": "dca40610-483d-4540-b3eb-4c7ed9efd129",
  60    "metadata": {},
  61    "outputs": [],
  62    "source": [
  63     "meso_ts = m.timeseries(time_start, time_end, \n",
  64     "                       stid=\"CPTC2\", vars=vars)\n",
  65     "station = meso_ts['STATION'][0]"
  66    ]
  67   },
  68   {
  69    "cell_type": "markdown",
  70    "id": "355e0157-9c49-473d-80a7-f05ed5ccb51f",
  71    "metadata": {},
  72    "source": [
  73     "Organize RAWS data."
  74    ]
  75   },
  76   {
  77    "cell_type": "code",
  78    "execution_count": null,
  79    "id": "193485f5-796b-49ed-94d3-60a3746c3573",
  80    "metadata": {},
  81    "outputs": [],
  82    "source": [
  83     "import data_funcs as datf\n",
  84     "\n",
  85     "raws_dat = datf.format_raws(station)"
  86    ]
  87   },
  88   {
  89    "cell_type": "code",
  90    "execution_count": null,
  91    "id": "e90aa779-7ca3-46b2-8275-0df6a11d32cb",
  92    "metadata": {},
  93    "outputs": [],
  94    "source": [
  95     "%matplotlib inline\n",
  96     "plt.figure(figsize=(16,4))\n",
  97     "plt.plot(raws_dat['rh'],linestyle='-',c='k')\n",
  98     "plt.title(station['STID'] + ' rh')\n",
  99     "plt.xlabel('Time (hours)') \n",
 100     "plt.ylabel('Relative Humidity (%)')"
 101    ]
 102   },
 103   {
 104    "cell_type": "code",
 105    "execution_count": null,
 106    "id": "1fa29896-6962-4ef9-af6f-55230591cdb3",
 107    "metadata": {},
 108    "outputs": [],
 109    "source": [
 110     "from datetime import datetime, timedelta, time\n",
 111     "import numpy as np\n",
 112     "import matplotlib.pyplot as plt\n",
 113     "import pytz\n",
 114     "\n",
 115     "%matplotlib inline\n",
 116     "plt.figure(figsize=(16,4))\n",
 117     "plt.plot(raws_dat['fm'],linestyle='-',c='k',label='10-h fuel data')\n",
 118     "plt.title(station['STID'] + ' 10 h fuel moisture data')\n",
 119     "plt.xlabel('Time (hours)') \n",
 120     "plt.ylabel('Fuel moisture content (%)')\n",
 121     "plt.legend()"
 122    ]
 123   },
 124   {
 125    "cell_type": "markdown",
 126    "id": "27bd7ffc-4a72-4683-a97a-05ea289b196e",
 127    "metadata": {},
 128    "source": [
 129     "## Run Augmented Moisture Model with RAWS Data"
 130    ]
 131   },
 132   {
 133    "cell_type": "code",
 134    "execution_count": null,
 135    "id": "70993f7b-c464-477c-8715-d8e873033e88",
 136    "metadata": {},
 137    "outputs": [],
 138    "source": [
 139     "import moisture_models as mod\n",
 140     "\n",
 141     "## Model params\n",
 142     "\n",
 143     "hours=1200 # total simulation\n",
 144     "h2 = 300\n",
 145     "m = np.zeros(hours) # preallocate\n",
 146     "m[0]= raws_dat['fm'][0]             # initial state  \n",
 147     "P = np.zeros(hours)\n",
 148     "P[0] = 1e-3 # background state variance\n",
 149     "H = np.array([1.])   # all oQ = np.array([0.02]) # process noise variancebserved\n",
 150     "Q = np.array([1e-3]) # process noise variance\n",
 151     "R = np.array([1e-3]) # data variance"
 152    ]
 153   },
 154   {
 155    "cell_type": "markdown",
 156    "id": "4860f4a5-a56d-4b90-9b41-5227ec550b27",
 157    "metadata": {},
 158    "source": [
 159     "Augmented Model"
 160    ]
 161   },
 162   {
 163    "cell_type": "code",
 164    "execution_count": null,
 165    "id": "0cc8d535-3d92-4198-a30c-128db9eff4cd",
 166    "metadata": {},
 167    "outputs": [],
 168    "source": [
 169     "m,Ec = mod.run_augmented_kf(raws_dat['fm'],raws_dat['Ed'],raws_dat['Ew'],raws_dat['rain'],h2,hours)  # extract from state"
 170    ]
 171   },
 172   {
 173    "cell_type": "code",
 174    "execution_count": null,
 175    "id": "87e89264-491e-4139-986e-fdc4667e41d6",
 176    "metadata": {},
 177    "outputs": [],
 178    "source": [
 179     "def plot_moisture(hmin,hmax):\n",
 180     "  print('training from 0 to',h2,'plot from',hmin,'to',hmax)\n",
 181     "  plt.figure(figsize=(16,4))\n",
 182     "  plt.plot(range(hmin,hmax),raws_dat['Ed'][hmin:hmax],linestyle='--',c='r',label='Drying Equilibrium (%)')\n",
 183     "  plt.plot(range(hmin,hmax),raws_dat['Ew'][hmin:hmax],linestyle='--',c='b',label='Wetting Equilibrium (%)')\n",
 184     "  plt.plot(range(hmin,hmax),Ec[hmin:hmax],linestyle='--',c='g',label='Equilibrium Correction (%)')\n",
 185     "  plt.plot(range(hmin,hmax),m[hmin:hmax],linestyle='-',c='k',label='filtered')\n",
 186     "  plt.plot(range(hmin,hmax),raws_dat['fm'][hmin:hmax],linestyle='-',c='b',label='RAWS data (%)')\n",
 187     "  plt.plot(range(hmin,hmax),raws_dat['rain'][hmin:hmax],linestyle='-',c='b',label='RTMA rain (mm/h)')\n",
 188     "  if hmin>=h2:\n",
 189     "    plt.plot(m[hmin:h2],linestyle='-',c='k',label='Filtered')\n",
 190     "  h1 = np.maximum(hmin,h2)\n",
 191     "  plt.plot(range(h1,hmax),m[h1:hmax],linestyle='-',c='r',label='Forecast (%)')\n",
 192     "  plt.title(station['STID'] +' Kalman filtering and forecast with augmented state, real data. Training 0:%i hmax' % h2)\n",
 193     "  plt.xlabel('Time (hours)') \n",
 194     "  plt.ylabel('Fuel moisture content (%)')\n",
 195     "  plt.legend()"
 196    ]
 197   },
 198   {
 199    "cell_type": "code",
 200    "execution_count": null,
 201    "id": "cac76c32-2343-4d3d-9eb0-46e90333fca8",
 202    "metadata": {},
 203    "outputs": [],
 204    "source": [
 205     "plot_moisture(0,hours)"
 206    ]
 207   },
 208   {
 209    "cell_type": "code",
 210    "execution_count": null,
 211    "id": "6a86ff9d-bfa3-4c00-849f-90ce105dab40",
 212    "metadata": {},
 213    "outputs": [],
 214    "source": [
 215     "plot_moisture(1000,hours)"
 216    ]
 217   },
 218   {
 219    "cell_type": "markdown",
 220    "id": "bf25236e-8e50-49c6-b2ca-c055ec5b350a",
 221    "metadata": {},
 222    "source": [
 223     "## Model Validation\n",
 224     "\n",
 225     "Calculate Mean Absolute Prediction Error (MAPE) for the forecast versus observed fuel moisture data. For comparison, I will calculate MAPE for the entire series, the forecast period, and just the final value (where errors may have accumulated up to)."
 226    ]
 227   },
 228   {
 229    "cell_type": "code",
 230    "execution_count": null,
 231    "id": "24ed5d78-1de8-4e32-af3c-30f3dc445f3a",
 232    "metadata": {},
 233    "outputs": [],
 234    "source": [
 235     "def mape(a1, a2):\n",
 236     "    if a1.shape==():\n",
 237     "        n=1\n",
 238     "    else:\n",
 239     "        n = len(a1)\n",
 240     "    err = 1/n*np.sum(np.abs(a1-a2))\n",
 241     "    return err"
 242    ]
 243   },
 244   {
 245    "cell_type": "code",
 246    "execution_count": null,
 247    "id": "255fef4c-0f4c-4346-a9d9-239bb75e04c2",
 248    "metadata": {
 249     "tags": []
 250    },
 251    "outputs": [],
 252    "source": [
 253     "print('Total MAPE: '+ str(np.round(mape(raws_dat['fm'][0:hours], m), 5)))\n",
 254     "print('-'*25)\n",
 255     "print('Train Period: '+ str(np.round(mape(raws_dat['fm'][0:300], m[0:300]), 5)))\n",
 256     "print('-'*25)\n",
 257     "print('Test Period: '+ str(np.round(mape(raws_dat['fm'][301:hours], m[301:hours]), 5)))\n",
 258     "print('-'*25)\n",
 259     "print('Final Time: '+ str(np.round(mape(raws_dat['fm'][-1], m[-1]), 5)))"
 260    ]
 261   },
 262   {
 263    "cell_type": "code",
 264    "execution_count": null,
 265    "id": "f4d53996-49c0-4e21-a8c3-7fc51d6a33c8",
 266    "metadata": {},
 267    "outputs": [],
 268    "source": []
 269   }
 270  ],
 271  "metadata": {
 272   "kernelspec": {
 273    "display_name": "Python 3 (ipykernel)",
 274    "language": "python",
 275    "name": "python3"
 276   },
 277   "language_info": {
 278    "codemirror_mode": {
 279     "name": "ipython",
 280     "version": 3
 281    },
 282    "file_extension": ".py",
 283    "mimetype": "text/x-python",
 284    "name": "python",
 285    "nbconvert_exporter": "python",
 286    "pygments_lexer": "ipython3",
 287    "version": "3.9.12"
 288   }
 289  },
 290  "nbformat": 4,
 291  "nbformat_minor": 5
 292 }