--- title: "htsr-package" author: "Pierre Chevallier" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{htsr-package} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` *HydroSciences Laboratory (IRD, CNRS, Univ. Montpellier, IMT Mines Ales), France* **v 2.1.6** # Content * Origin + Brief history + Hydraccess + Sqlite data base * Principles * Dependencies + System library + User library + Special case of RODBC * File formats + Sqlite data base + Time-series + Calibrations and discharge measurements + Gaps * Functions + Short cut fonction + Data base functions + File functions + Hydro-meteo time-series + Plot functions + Micellaneous functions * Development main references + Sqlite management + R coding * Appendix + Correspondence of the data base tables between Hydraccess and Sqlite htsr + Fields of the main tables # Origin ## Brief history IRD and previously ORSTOM have developed quite early (in the 70ties) data bases for storing hydrological and meteorological time-series. During the 80ties the managing tools use home-made codes (Hydrom and Pluviom), which were in the late 90ties adapted with commercial applications like Microsoft Access or Oracle and more recently for web interface. With the development of open access platforms and applications, in the one hand, and the need of more flexibility, in the other hand, it appears useful to build a set of functions able to manage hydro-meteo time-series independently of the operating system and of proprietary tools. The `htsr` package was developed on the basis of the public domain data base management system Sqlite and on the hydro-meteo time-series data base application Hydraccess. It works on Linux, Windows or MacOS platforms, with the free programming language R and the interface RStudio preliminary installed. ## Hydraccess Hydraccess was developed at IRD () by Philippe Vauchel. It is based on several tools provided by Microsoft Office (R) including Microsoft Access. It can be obtained from Hydracces is available in French, English, Spanish, Portuguese and Russian. ## Sqlite data base Sqlite is an embedded data base on the public domain (). It is self-contained, and serverless, without preliminary configuration. It can be installed using the instructions given by . It can be used directly through many dedicated applications available for a large set of operating systems. # Principle The Sqlite data base is organized with the same tables and fields as the Hydraccess structure, except that the table names are changed for practical reasons. A correspondence of these names in given in appendix. In Hydraccess, two families or time-series coexist: "hydro data" (Type_Station = H) and "weather data" (Type_Station = M). The first one includes the main tables Cotes/WL, Debits/DI and Qualite/QU; the second one includes Pluies/PR and Meteo/WE. Other tables regarding discharge measurements or calibration are also affected to the "hydro data" family. This distinction does not exist anymore in the Sqlite data base processes. That means that one should verify if no stations have the same name in the one and the other family. If it is the case the name must be changed before converting a Hydraccess data base in a Sqlite data base. A time-series is always attached to a **station**, which describes its location and managing infos, and to a **sensor**, which is attached to the data type and to the used device. Both are indexed with a main id. In Hydraccess: * **The station id** has no particular syntax; it associated to a number of description fields. Note that the station Name field is mandatory. * **The capteur/sensor id** has at most 5 digits and the first one defines the timestep of the variable: **I** for instantaneous, **J** for daily and **M** for monthly. By convention, a daily value is affected at 12:00:00 and a monthly value to the day 16 at 00:00:00. This convention remains for the Sqlite data base. The other 4 digits of the sensor id are free in Hydraccess. In the Sqlite data base the sensor id length is not any more limited, but it could be a good idea to conserve it. Other description fields are also associated with the sensors. In brief, that means that a time-series record must have 4 objects: * a date-time; * a numeric value; * a station id factor; * a sensor id factor. Practically it associates the two tables "station" and "sensor" with one of the five following: "water level", "discharge", "weather", "precipitation" or "quality". The other associated tables can be used for intermediary calculations, like the computation of the discharge from the water level, or for more detailed information. # Dependencies The following packages and their own dependencies are needed to apply the htsr package. * `tidyverse`, `directlabels`, `editData`, `openair`, `Rcpp`, `RSQLite`, `RODBC`, `shiny`, `shinyFiles`, `terra`, `WriteXLS`, `fs` ## Special case of RODBC Because Hydraccess is only configured for Microsoft Windows, the associated data bases work only on this platform. That means that the operation, which consists to convert a Hydraccess data base into a Sqlite data base must be done in this environment. The `RODBC` package is associated to the htsr package, but it can be only used in the Windows platform through the function `d_convert_hydraccess`. In particular, **this function does not work on Linux**. Because a Hydraccess data base is proprietary depending, a preliminary configuration of the MS-Windows platform must be done: * Install "Microsoft Database Engine" (freely available on Microsoft web site). * In the administration tools, config the connection ODBC 32b with the Access data base to be used. An alternative consists to execute `\windows\sysWOW64\odbcad32.exe`. * Open the RStudio session in 32b (this can be precised in the RStudio general options ; RStudio must be re-launched for taking the change into account). # File formats ## Sqlite data base * extension : `.sqlite` * SQLite data base with 38 tables. ## Time-series * extension : `.hts` * Rdata file containing a “tibble” object, `tstab`, with 4 columns : `Date`, `Value`, `Station`, `Sensor` ``` > tstab Date Value Station Sensor 1 2015-10-03 12:00:00 2.17 CKS2500 IQ ``` ## Gaps * extension : `.gap` * Rdata file containing a “tibble” object, `ze`, with 3 columns ``` > ze # A tibble: 14 x 3 date valeur stacapt 1 2015-10-03 17:00:00 1 _ 2 2015-12-11 23:30:00 1 _ ``` # Functions 6 categories of functions are provided by the htsr package, with, in addition, one short-cut function. They are distinguished with a prefix character and listed below. Infos on their uses are detailed in the on-line help or using in the console the command `?` or `help()`. ## Short-cut function * `fc(file)`, shortcut for file.choose(file), which is very frequently needed. ## Data base functions (prefix `d_` or `ds_`) ### Data base tools * `d_backup(fsq)`, backups a htsr sqlite data base. * `d_compact(bd.sqlite)`, compacts a htsr sqlite data base * `d_create(fsq, cr_table = TRUE, bku = TRUE)`, creates a htsr sqlite data base * `ds_inventory_station()` and `ds_inventory_sensor()`, produce the inventory of the stations, respectively sensors of a selected station, stored in a htsr sqlite data base. * `ds_sensor()`, creates, modifies or removes a sensor. * `ds_station()`, creates, modifies or removes a station. * `d_table(fsq, table, op = "C", bku = TRUE)` creates or removes a table ### Data import/export * `d_exp_discalib(fsq, sta, calib=TRUE, dism=TRUE)`, exports discharge measurements and calibrations from data base * `ds_exp_hts()`, extract hts files from a sqlite data base * `d_imp_hts(fsq, file, table, bku = TRUE)`, imports a hts file into a data base * `d_rem_hts(fsq, table, sta, sen, start = NA, end = NA)`, removes a htsr record from a data base ### Conversion * `d_convert_hydraccess(fsq, db.hydraccess)`, converts a full Hydraccess database into a new htsr sqlite database. **NB:** Only works in Windows environment with a 32b R session. * `d_convert_weewx(fsq, db.weewx, sta, name_st, tzo = "CET", bku = TRUE)`, converts a weewx data base () into a htsr sqlite base. * `d_convert_meteofrance_d(fmeteo)` and `d_convert_meteofrance_h(fmeteo)`, convert a Meteo France csv daily data, respectively hourly, file into a htsr sqlite base. * `d_convert_hubeau(hubeau.dir, station.id, fsqname)`, one or more hydrological station data from the data.eau.france collection into a htsr sqlite base. ## File functions (prefix `f_`) * `f_change_id(file, sta, sen, overwrite)`, changes station id or sensor id in a hts file.. Generates a file with the prefix `nw_`. * `f_convert(file, form_start = "hts", form_end = "xlsx", ta = NA, sen = NA, output = NA, variable = NA)`, converts data-series file in another format: `hts`, `xls/xlsx`, `csv`) and viceversa. * `f_csv_multivar(files, daily = TRUE, fileo = "fileo")`, build a multivariable table file in csv format. * `f_month2day(file)`: interpolation of daily records from a monthly time series * `f_properties(file, gaps = FALSE)` displays the properties of a hts time-series. ## Hydro-meteo time-series (prefix `h_` or `hs_`) ### Data manipulations * `h_addna(file)`, adds records with NA in a time series at given dates. The output file is named with the prefix `nap_`. * `h_changetz`, allows to change the timezone of a time series * `h_common(files)`, extracts 2 (or more) time-series on their common period. Generates a file with the prefix `co_`. * `h_condition(files, condition)`, conditionally extracts a time-series regarding another one. Generates a file with the prefix `cd_`. * `h_cumul(file, start = NA, end = NA)`, cumulates the values of a time-series. Generates a file with the prefix `cu_`. * `h_nodata(file, threshold=NA, test="=", start=NA, end=NA)`, replaces values with NA conditionally or in a time interval. Generates a file with the prefix `na_`. * `h_rbind(files, sensor, gap = TRUE)`, binds 2 time-series on consecutive periods * `h_replace(file, old.val, new.val)`, replaces a value by another. Generates a file with the prefix `re_`. * `h_rollav (file, ti = 7, position = c("central", "right"))`, computes a rolling average of a daily time-series. Generates a file with the prefix `ro_`. * `h_substitute(files)`, substitutes the missing values in a series by existing values of another series. Generates a file with the prefix `su_`. * `h_weightedsum(files, weights = NA, constant = 0)`, makes a weighted sum of time-series. Generates a file with the prefix `ws_`. ### Data critics * `h_gaperr(file, nv = 1, itv0 = 43201, df)`, replaces errors with gaps in a time-series based on neighboring values. Generates a file with the prefix `eg_`. * `h_gapfill(file, npdt)`, produces a simple gapfilling in a time-series. Generates a file with the prefix `gf_`. * `h_gaprem_itv(file, itv0 = 43201)`, removes gaps in a time-series with a time interval threshold. Generates a file with the prefix `gr_`. ### Time treatments * `hs_tstep()`, computes infra-daily data with a fixed time step. Generates a file with the suffix `_xxxx`, where xxxx is the `tst` value. It also makes monthly operations, based on a daily time-series. Generates hts files with the suffixes `_C`, `_G` or `_M` and MS Excel files with the prefixes `ad_` and `cm_`. * `h_year(file, mhy = 1, op = "M", dig = 1)` extracts an annual time series from a daily time series. * `h_restrict(file, start=NA, end=NA)`, restricts a series between 2 dates. Generates hts files with the suffixes `re_`. * `h_season(file, monthstart)`, produces a seasonal selection. Generates a file with the prefix `sx_`, where x is 2, 3 or 4. * `h_avday` computes a one year time series filled with the mean values of each calendar day over an interval longer than 4 years. ### Hydro-meteo processes * `h_stat_basic(file)`, gives basic statistics of a time-series * `h_adjust(file, time_unit = "year")`: adjustment of a time-series to a statistical model. For instance only a linear model is allowed. * `h_wl_di(fsq, sta, seni, seno, dstart = NA, dend = NA, dbo = TRUE)`, computes a discharge time-series from water levels data and calibration curves. * `h_rainsnow(fpr, fta, ta0,ta1,sta=NA)`, shares the solid and liquid precipitations with a temperature criteria. Generates a file with the prefix `pr_`. Generates a file with the prefix `sn_`. ## Plot functions (prefix `p_` or `ps_`) ### General plots * `ps_plothts()`, plots hts files. ### Other plots * `p_box_month(file, title = "Title", axeY = "Y-axis", savefig = FALSE, fileo = "plot.png", width = 8, height = 6)`, plots a boxplot of the 12 months of a time-series. * `p_clim(p_clim <- function (files, type="line", hydro.month=1, title="Title", yaxis="Value", y.down=NA, y.up=NA, rpal=FALSE, pal=mapalette, legend.l=NA))`, plots climatologies in hydrological year. * `p_discalib(fcalib, sen, plotcalib= TRUE, plotdism=TRUE, title="Title", savefig=FALSE, width= 8, height= 6, fout="plot.png", limx =FALSE, limy = FALSE, xinf=NA, xsup=NA, yinf=NA, ysup=NA)`, plots calibration curves water levels vs discharges. * `p_gaps(nbf, title = "Inventory", BW = FALSE, margin = 0.1)`, plots of data inventory * `p_hypso(file, abbrev, prop = FALSE, range=50, fact=5, title="Title", savefig=FALSE, width= 8,height= 6, fileo="plot.png")`, plots the hypsometry curve of one or more basins * `p_scatter(files, intercept.zero = FALSE, remove.zero = FALSE, lg.axis = c(NA, NA),title = "Title")`, plots a scatter plot of 2 or more time-series * `p_wind(fsq, sta, swd, swv, ws.int = 0.5, angle = 45, grid.line = 10, type = "default", breaks = 5, offset = 5, paddle = FALSE)`, plots wind rose ## Weather functions (prefix `w_`) * `w_atmp_alt (f_atmp, f_temp, alt0 = 0, alt)` computes atmospheric pressure, function of altitude. * `w_etp (method = c("Turc", "Penman-Monteith", "Priestley-Taylor", "Makkink", "Heargraves-Samani"), freq = c("day", "month"), f_temp, f_relh = NA, f_radg = NA, f_radn = NA, f_atmp = NA, f_wvel = NA, f_tmin = NA, f_tmax = NA, lat = NA, alt = NA, albedo = NA, z = NA)`, computes the potential evapotranspiration with several methods. Generates a file with the prefix `xEtpyy_`, where x is J (daily) or M (monthly), and yy is Tu (Turc), PM (Penman-Monteith), PT (Priestley-Taylor), Ma (Makkink) or HS (Heargraves-Samani). * `w_temp_alt (file , alt0 = 0, alt, grad = -0.0065)` computes temperature, function of altitude. * `w_spc2rel_hum ((f_spechum, f_temp, f_atm)` converts specific humidity to relative humidity. ## Miscellaneous functions (prefix `z_`) * `z_coord(ncoord = NA, ccoord = NA, type)`, converts coordinate from numeric to character and reverse. # Development main references ## SQLite management * SQLite, sql database engine, 2017, Tutorial Point, * Ripley B., ODBC Connectivity, 2020. * ## R coding * Chang W., 2013. R Graphics Cookbook. O'Reilly. 398p. ISBN 978-1-449-31695-2. Actualized on * Wickham H., 2015. R Packages - Organize, test, document, and share your code. O'Reilly. 190p. ISBN 978-1-491-91059-7. Actualized version on * Wickham H., 2021. Mastering Shiny. 348p. ISBN 978-1-492-04738-4. * Wickham H. & Grolemund G., 2017. R for Data Science - Import, tidy, transform, visualize, and model data. O'Reilly. 494p. ISBN 978-1-491-91039-9. * # Appendix ## Correspondence of the data base tables between Hydraccess and Sqlite htsr ### Main tables ``` -------------------------------------------- Designation Hydraccess Sqlite name --------------- -------------- ------------- Discharge Debits DI Precipitation Pluies PR Quality Qualite QU Sensors Capteurs SS Station Station_Base ST Weather Meteo WE Water level Cotes WL -------------------------------------------- ``` ### Other tables ``` -------------------------------------------------------------- Designation Hydraccess Sqlite name --------------------------- -------------------- ------------- Basin Bassins BA Calibration Date Etal_Dates CD Country Zones_Pays CO Disch. measur. proceeding Jaugeages_Dep DP Discharge measurement Jaugeages DM Elevation zero Zero_NG EZ Equipment Equipements EQ Event Evenements EV Large basis Bassins_Grands LB Liquid flow calibration Etal_HQ LC Manager Gestionnaires MG Nature id Codes_Nature NC Operating mode Modes_Opératoires OM Origine id. Codes_Origine OC Profile Profils_Data PF Profile data Profils_Data PD Propeller Helices PP Quality id. Codes_Qualite QC Region Regions RE River Rivieres RV Sensor comm. Capteurs_Comm SM Sensor history Capteurs_HistApp SH Settings Parametrage SE Small basin Bassins_Petits SB Solid flow calibration Etal_HK SC Station equipment Stations_Equipment SQ Station file Dossiers_Stations SF Sub-zone Zone_Sous SZ Temp_station2 Temp_Stations2 TS Valve Vannes VA Zone Zones ZO -------------------------------------------------------------- ``` ## Fields of the main tables They correspond to French spelling, which is used in Hydraccess. The compulsory fields are marked with (!). ### Stations_Base/ST Ordre, Type_Station (!), Id_Station (!), Id_Secondaire, Id_Tertiaire, Type_Meteo, Nom, Pays, Zone, SousZone, GrandBassin, Bassin, PetitBassin, Riviere, Gestionnaire, Latitude, Longitude, Altitude, Superficie_bv, Mois_Debut_Hydro, Debut_Activite, Activite, Critere_OuiNon, Critere_Numerique, Critere_Texte, Nom_Observateur, Adresse, Teletransmission, Enregistreur, Fictive, Commentaire, Flag, District, Localite ### Capteurs/SS Type_Station (!), Id_Station (!), Capteur (!), Table (!), Nature, Description, Commentaire, Code_Limni, Principal, Fictif, Maj_Journaliers, Maj_Traduction, Acquisition_Auto, Operationnel, Liste_Inst, Liste_Jour, Liste_Mois, Agregation, Decalage_Temps, Mini, Maxi, Gradient_Maxi, Precision, Decimales, Pente ### Cotes/WL, Debits/DI, Meteo/WE, Qualite/QU Type_Station (!), Id_Station (!), Capteur (!), Table (!), Date (!), Valeur, Origine, Qualite ### Pluies/PR Type_Station (!), Id_Station (!), Capteur (!), Table (!), Date (!), Valeur, Origine, Qualite, Nature