Title: | Package to Download and Create the DB of LUCAS Data Harmonized |
---|---|
Description: | Reproduces the harmonized DB of the ESTAT survey of the same name. The survey data is served as separate spreadsheets with noticeable differences in the collected attributes. The tool presented here carries out a series of instructions that harmonize the attributes in terms of name, meaning, and occurrence, while also introducing a series of new variables that add value to the product. Outputs include one harmonized table with all the years, and three separate geometries, corresponding to the theoretical point, the GPS location where the measurement was made, and the 250 m east-facing transect. |
Authors: | Momchil Yordanov [cre], Laura Martinez [aut], Raphael dAndrimont [aut] |
Maintainer: | Momchil Yordanov <[email protected]> |
License: | GPL (>= 3) |
Version: | 1.0 |
Built: | 2025-02-15 04:11:03 UTC |
Source: | https://github.com/cran/lucas |
Adds geometries to the LUCAS harmonized table:
- location of the theoretical point (th_geom), from fields th_long, th_lat
- location of the LUCAS survey (gps_geom), from fields gps_long, gps_lat
- LUCAS transect geometry (trans_geom), from fields gps_long, gps_lat
- distance between the theoretical and survey points (th_gps_dist)
Add_geom(con, save_dir)
con |
Connection to db |
save_dir |
Directory where to save geometries |
To create the connection, see Connect_to_db
## Not run:
Add_geom(con)
## End(Not run)
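The distance field th_gps_dist can be illustrated without a database. The sketch below is not the package's implementation; it assumes coordinates in decimal degrees and a spherical Earth with a radius of 6371000 m, and `haversine_m` is a hypothetical helper name introduced only for this example.

```r
# Hypothetical helper: great-circle distance in metres between two
# longitude/latitude pairs given in decimal degrees (spherical Earth assumed).
haversine_m <- function(lon1, lat1, lon2, lat2, radius = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Toy rows using the field names from the description above.
pts <- data.frame(
  th_long  = c(4.3500, 14.4200),
  th_lat   = c(50.8500, 50.0900),
  gps_long = c(4.3500, 14.4210),
  gps_lat  = c(50.8500, 50.0905)
)

# Distance between the theoretical point and the surveyed GPS point.
pts$th_gps_dist <- haversine_m(pts$th_long, pts$th_lat,
                               pts$gps_long, pts$gps_lat)
```

A spatial database would normally compute this with its own geometry functions; the base R version only shows what the field represents.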
Adds new columns to all tables that will be needed when the tables are merged. These include:
- letter group: first level of the LUCAS land cover/land use classification system
- year: year of survey
- file_path_gisco_n/s/e/w/p: file path to the full HD images on the ESTAT GISCO cloud service for the North, South, East, West, and Point images
Add_new_cols(con, years)
con |
Connection to db |
years |
Numeric vector of years to be harmonised |
To create the connection, see Connect_to_db
## Not run:
Add_new_cols(con, c(2006, 2009, 2012, 2015, 2018))
## End(Not run)
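As a rough illustration of the first two derived columns, the sketch below works on an in-memory data frame rather than a database table. The column names `point_id` and `letter_group` and the sample codes are assumptions made for the example; the package itself may name and compute them differently.

```r
# Toy table standing in for one survey year; the lc1 values follow a
# letter-plus-digits pattern and are illustrative only.
lucas_2018 <- data.frame(
  point_id = c(1L, 2L, 3L),
  lc1      = c("B11", "C10", "E20"),
  stringsAsFactors = FALSE
)

# letter_group: first character of the land cover code (first classification level).
lucas_2018$letter_group <- substr(lucas_2018$lc1, 1, 1)

# year: the survey year that the table belongs to.
lucas_2018$year <- 2018L
```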
Add photo fields 2006
Add missing columns
Add_photo_fields_2006(con)
Add_missing_cols(con, years)
con |
Connection to database |
years |
Numeric vector of years to be harmonised |
Add_photo_fields_2006: adds the missing columns photo_n/e/s/w to the 2006 data, using the information in the EXIF DB
Add_missing_cols: adds the missing columns to all tables before the merge
To create the connection, see Connect_to_db
## Not run:
Add_photo_fields_2006(con)
## End(Not run)
## Not run:
Add_missing_cols(con, c(2006, 2009, 2012, 2015, 2018))
## End(Not run)
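The idea behind adding missing columns before a merge can be sketched on plain data frames. This is an assumption-laden illustration, not the package's database code: `add_missing_cols` is a hypothetical helper, and filling the new columns with NA is an assumed choice.

```r
# Hypothetical helper: add every column that appears in any table but is
# absent from a given one, filled with NA, so all tables share one column set.
add_missing_cols <- function(tables) {
  all_cols <- unique(unlist(lapply(tables, names)))
  lapply(tables, function(tbl) {
    for (col in setdiff(all_cols, names(tbl))) {
      tbl[[col]] <- NA
    }
    tbl
  })
}

# Toy tables: the 2006 table lacks the photo_n column present in 2018.
tables <- list(
  `2006` = data.frame(point_id = 1:2, lc1 = c("A11", "B11")),
  `2018` = data.frame(point_id = 3:4, lc1 = c("C10", "E20"),
                      photo_n = c("1", "2"))
)
harmonised <- add_missing_cols(tables)
```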
Adds a revisit column to the LUCAS harmonized table, showing the number of times the point was revisited across the survey years.
Add_revisit(con)
con |
Connection to db |
To create the connection, see Connect_to_db
## Not run:
Add_revisit(con)
## End(Not run)
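A minimal sketch of a revisit count on a data frame follows. It assumes the count is the number of survey years in which a point appears; whether the package counts the first visit is not stated here, and the `point_id` column name is an assumption.

```r
# Toy harmonized table: one row per point and survey year.
harmo <- data.frame(
  point_id = c(10L, 10L, 10L, 20L, 30L, 30L),
  year     = c(2009L, 2012L, 2018L, 2015L, 2015L, 2018L)
)

# Count, per point, the rows (survey years) in which it appears.
harmo$revisit <- ave(harmo$year, harmo$point_id, FUN = length)
```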
Corrects any typo, spelling mistake, or spelling difference in the user-created mapping CSVs, which are used to generate labels in the subsequent User_friendly() function, by aligning them to the mapping CSV of the latest survey.
Align_Map_CSVs(mapp_csv_folder, years)
mapp_csv_folder |
Directory where mapping files are stored |
years |
Numeric vector of years to be harmonised |
To create the connection, see Connect_to_db
## Not run:
Align_Map_CSVs('/data/LUCAS_harmo/data/mappings', c(2006, 2009, 2012, 2015, 2018))
## End(Not run)
Checks that the user has downloaded all the files needed
Assert_files(data_dir)
data_dir |
Character. Folder where you saved all the micro data downloaded from EUROSTAT |
Nothing if OK; an error if the check fails
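The "nothing if OK, error otherwise" contract can be sketched as follows. `assert_files` is a hypothetical stand-in rather than the package function, and the demonstration only checks two of the expected file names inside a temporary folder.

```r
# Hypothetical helper: stop with an error listing any expected file that is
# absent from data_dir; return nothing (invisibly) when all are present.
assert_files <- function(data_dir, expected) {
  absent <- expected[!file.exists(file.path(data_dir, expected))]
  if (length(absent) > 0) {
    stop("Missing input files: ", paste(absent, collapse = ", "))
  }
  invisible(NULL)
}

# Demonstration in a temporary folder where only the first file exists.
demo_dir <- file.path(tempdir(), "lucas_input_demo")
dir.create(demo_dir, showWarnings = FALSE)
expected <- c("EU_2009_20200213.CSV", "EU_2012_20200213.CSV")
file.create(file.path(demo_dir, expected[1]))

# Capture the error message instead of aborting the script.
result <- tryCatch(assert_files(demo_dir, expected),
                   error = function(e) conditionMessage(e))
```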
Consistency check for the Align_Map_CSVs function, and creation of a new document with the explicit legends documented in document C3 (Classification) from the LUCAS surveys.
Check_Map_CSVs(mapp_csv_folder, years, save_dir)
mapp_csv_folder |
Directory where mapping files are stored |
years |
Numeric vector of years to be harmonised |
save_dir |
Directory where the new C3 legends will be saved. Ideally (for consistency's sake) this directory should be the support_dir from the main.R script. |
To create the connection, see Connect_to_db
Connects to the DB where you want to upload all LUCAS points
Connect_to_db(user, host, port, password, dbname)
user |
Character. User of the database |
host |
Character. Host of the DB |
port |
Integer. Port to connect to, usually 5432 |
password |
Character. Password to access the DB |
dbname |
Character. Name of the database |
Connection to the DB
## Not run:
con <- Connect_to_db("andrrap", "localhost", 5432, "andrrap", "andrrap")
con <- Connect_to_db("martlur", "/var/run/postgresql", 5432, "martlur", "postgres")
con <- Connect_to_db("postgres", "172.15.0.10", 5432, "test", "postgres")
## End(Not run)
Perform consistency checks on newly created tables to ensure conformity in terms of column order and data types
Consistency_check(con, years, manChangedVars)
con |
Connection to db |
years |
Numeric vector of years to be harmonised |
manChangedVars |
File path to a CSV of the attributes, and the relevant years, that have been manually modified and therefore cannot pass a value-consistency check |
To create the connection, see Connect_to_db
## Not run:
Consistency_check(con, c(2006, 2009, 2012, 2015, 2018))
## End(Not run)
Harmonizes longitude values in all tables
Correct_long(lucas2009)
lucas2009 |
Dataframe with the 2009 data |
Correct_long: corrects the longitude values of the 2009 data by applying th_ew to th_long, then drops the th_ew column
To create the connection, see Connect_to_db
## Not run:
Correct_long(lucas2009)
## End(Not run)
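One plausible reading of "applying th_ew to th_long" is sketched below. It rests on an assumption that is not confirmed by this documentation: th_ew holds "E" or "W", and western longitudes are stored as positive numbers that need a negative sign.

```r
# Toy 2009 table with an east/west indicator next to the longitude.
lucas2009 <- data.frame(
  th_long = c(4.35, 8.61, 2.10),
  th_ew   = c("E", "W", "E"),
  stringsAsFactors = FALSE
)

# Assumption: "W" marks western longitudes stored as positive numbers.
west <- lucas2009$th_ew == "W"
lucas2009$th_long[west] <- -abs(lucas2009$th_long[west])

# Drop the indicator column once it has been applied.
lucas2009$th_ew <- NULL
```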
Applies a correction to the values of columns th_long and th_lat according to the latest LUCAS grid
Correct_th_loc(con, lucas_grid)
con |
Connection to db |
lucas_grid |
File path to the csv file of the latest LUCAS grid |
To create the connection, see Connect_to_db
## Not run:
Correct_th_loc(con, lucas_grid)
## End(Not run)
Creates database tags (primary key), an index, a spatial index, and a new id column for the harmonized table
Create_tags(con)
con |
Connection to db |
To create the connection, see Connect_to_db
## Not run:
Create_tags(con)
## End(Not run)
Re-order columns of final tables
Final_order_cols(con)
con |
Connection to db |
To create the connection, see Connect_to_db
## Not run:
Final_order_cols(con)
## End(Not run)
Merges all tables into a single harmonized version containing all years and casts each column to the relevant data type, as mapped in the record descriptor
Merge_harmo(con, rd)
con |
Connection to db |
rd |
Record descriptor in CSV format |
To create the connection, see Connect_to_db
## Not run:
Merge_harmo(con, '/data/LUCAS_harmo/data/supportDocs/LUCAS_harmo_RD.csv')
## End(Not run)
Changes the order of columns to match the last survey (2018) and sets the data type of every column to character varying, in preparation for the merge
Order_cols(con, years)
con |
Connection to db |
years |
Numeric vector of years to be harmonised |
To create the connection, see Connect_to_db
## Not run:
Order_cols(con, c(2006, 2009, 2012, 2015))
## End(Not run)
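The two steps described above, reordering to a reference layout and casting everything to text, can be sketched on data frames. The tables and column names are toy assumptions, and the sketch presumes that both tables already share the same set of columns.

```r
# Reference layout (standing in for the 2018 table) and an older table
# whose columns come in a different order.
reference <- data.frame(point_id = 1L, year = 2018L, lc1 = "B11")
older     <- data.frame(lc1 = "A11", point_id = 2L, year = 2009L)

# Reorder to the reference layout, then cast every column to character.
older <- older[, names(reference), drop = FALSE]
older[] <- lapply(older, as.character)
```

Casting to a single text type first avoids type conflicts when the yearly tables are stacked; the proper types can be restored after the merge.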
Updates values in all tables to fit the last survey (2018) in terms of the coding of different variables; update is based on pre-made mappings
Recode_vars(con, csv, years, nonHarmonizeableVars)
con |
Connection to db |
csv |
CSV file that holds the pre-prepared variable mappings |
years |
Numeric vector of years to be harmonised |
nonHarmonizeableVars |
Character vector of variables that cannot be harmonized between the years. These are variables that were collected in the earlier surveys (before 2018) as ordered categorical variables and in the later surveys (2018 onwards) as discrete numbers. Such attributes are lc1/2_perc, lu1/2_perc and soil_stones_perc. |
To create the connection, see Connect_to_db
## Not run:
Recode_vars(con, '/data/LUCAS_harmo/data/mappings/RecodeVars.csv', c(2006, 2009, 2012, 2015, 2018))
## End(Not run)
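Mapping-based recoding can be sketched as a lookup that leaves unmapped values untouched. The helper `recode_column`, the `old`/`new` column layout, and the values are all made up for illustration; the real layout of the mapping CSV is not described here.

```r
# Hypothetical helper: replace each value found in `old` with the matching
# entry of `new`; values without a mapping are left unchanged.
recode_column <- function(values, old, new) {
  hit <- match(values, old)
  ifelse(is.na(hit), values, new[hit])
}

# Made-up mapping and data; real mappings come from the mapping CSV.
mapping <- data.frame(old = c("1", "2"), new = c("A", "B"),
                      stringsAsFactors = FALSE)
tbl <- data.frame(toy_var = c("1", "2", "9"), stringsAsFactors = FALSE)

tbl$toy_var <- recode_column(tbl$toy_var, mapping$old, mapping$new)
```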
Removes unwanted columns as specified by user
Remove_vars(con, vars)
con |
Connection to db |
vars |
Character vector of variables, specified by name to remove |
To create the connection, see Connect_to_db
## Not run:
Remove_vars(con, vars)
## End(Not run)
Columns with different names between the surveys must be made to fit the last survey before merge
Rename_cols(con, csv)
con |
Connection to the database |
csv |
CSV file with the relevant column name mappings |
To create the connection, see Connect_to_db
## Not run:
Rename_cols(con, '/data/LUCAS_harmo/data/mappings/columnRename.csv')
## End(Not run)
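Renaming columns from a name mapping might look like the sketch below. The `old`/`new` layout of the mapping and the source column names are assumptions for the example, not the contents of columnRename.csv.

```r
# Made-up name mapping: old survey column names and their harmonized names.
col_map <- data.frame(
  old = c("longitude", "latitude"),
  new = c("gps_long", "gps_lat"),
  stringsAsFactors = FALSE
)

tbl <- data.frame(longitude = 4.35, latitude = 50.85, lc1 = "B11")

# Rename only the columns listed in the mapping; others keep their names.
hit <- match(names(tbl), col_map$old)
names(tbl)[!is.na(hit)] <- col_map$new[hit[!is.na(hit)]]
```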
Perform consistency checks on newly created UF fields to ensure conformity in terms of column order and data types
UF_Consistency_check(con)
con |
Connection to db |
To create the connection, see Connect_to_db
## Not run:
UF_Consistency_check(con)
## End(Not run)
Updates the record descriptor by adding a field (year) showing the years for which each variable exists, and by removing the variables listed in the Remove_vars function
Update_rd(con, rd, years)
con |
Connection to db |
rd |
Path to record descriptor csv |
years |
Character vector of the years of survey |
To create the connection, see Connect_to_db
## Not run:
Update_rd(con, rd, years)
## End(Not run)
Upload_exif
Upload_exif(con, exif)
con |
Connection to database |
exif |
The EXIF CSV located in the mappings folder |
Upload_exif: uploads the LUCAS EXIF information to the DB
To create the connection, see Connect_to_db
## Not run:
Upload_exif(con, exif)
## End(Not run)
Uploads to the DB all the 2009-2018 LUCAS CSV files downloaded from https://ec.europa.eu/eurostat/web/lucas/data/primary-data
There should be:
- EU_2012_20200213.CSV
- EU_2018_20200213.CSV
- OutScope_2015_20200225.CSV
- EU_2009_20200213.CSV
- EU_2015_20200225.CSV
For 2006, it first combines the following files into one dataset for the entire year:
- BE_2006_0.xls
- CZ_2006_0.xls
- DE_2006_0.xls
- ES_2006_0.xls
- FR_2006_0.xls
- HU_2006_0.xls
- IT_2006_0.xls
- LU_2006_0.xls
- NL_2006_0.xls
- PL_2006_0.xls
- SK_2006_0.xls
Upload_to_db(data_dir, con)
data_dir |
Character. Folder where you saved all the micro data downloaded from EUROSTAT |
con |
PostgreSQLConnection object. |
Boolean. TRUE if the upload to the DB worked, FALSE otherwise
To create the connection, see Connect_to_db
To assert that you have the files, see Assert_files
## Not run:
Upload_to_db('/data/LUCAS_harmo/data/input', con)
## End(Not run)
Convert values in designated columns (lc1, lc1_spec, lu1, lu1_type, lc2, lc2_spec, lu2, lu2_type, cprn_lc) to uppercase for consistency's sake
Upper_case(con, years)
con |
Connection to db |
years |
Numeric vector of years to be harmonised |
To create the connection, see Connect_to_db
## Not run:
Upper_case(con, c(2009, 2012, 2015, 2018))
## End(Not run)
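The effect on the designated columns can be shown with base R on a toy data frame. The sample values and the `point_id` column are assumptions; only the list of target columns comes from the description above.

```r
# Columns named in the description; only those present in the table are touched.
designated <- c("lc1", "lc1_spec", "lu1", "lu1_type",
                "lc2", "lc2_spec", "lu2", "lu2_type", "cprn_lc")

tbl <- data.frame(
  point_id = 1:2,
  lc1      = c("b11", "C10"),
  lc1_spec = c("b11a", "c10x"),
  stringsAsFactors = FALSE
)

# Convert the designated columns to uppercase, leaving the others as they are.
target <- intersect(designated, names(tbl))
tbl[target] <- lapply(tbl[target], toupper)
```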
Creates columns with labels for coded variables and decodes all variables where possible to explicit labels
User_friendly(con, data_dir, years)
con |
Connection to db |
data_dir |
Directory where files are stored |
years |
Numeric vector of years to be harmonised |
To create the connection, see Connect_to_db
## Not run:
User_friendly(con, '/data/LUCAS_harmo/data/mappings', c(2006, 2009, 2012, 2015, 2018))
## End(Not run)
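Adding a label column next to a coded one can be sketched as a legend lookup. The legend below is illustrative only, and the `letter_group` and `letter_group_label` column names are assumptions; the actual labels should come from the mapping CSVs.

```r
# Illustrative legend; the real labels are read from the mapping CSVs.
legend <- data.frame(
  code  = c("B", "C", "E"),
  label = c("Cropland", "Woodland", "Grassland"),
  stringsAsFactors = FALSE
)

tbl <- data.frame(letter_group = c("C", "B", "Z"), stringsAsFactors = FALSE)

# Add a label column next to the coded one; codes without a legend entry get NA.
tbl$letter_group_label <- legend$label[match(tbl$letter_group, legend$code)]
```

Keeping the coded column and adding the label beside it preserves the original values for anyone who needs to join against the official classification.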