Overview
The Global Ecovillage Network’s (GEN) website contains a directory of hundreds of ecovillages around the world.
My goal: to efficiently transfer all ecovillage data points from GEN’s embedded map to my own Google My Maps.
My approach: download GEN’s geographic data, extract and convert the data to CSV, and upload the CSV to Google My Maps.
Step-by-step
In trying to figure out how to scrape data from embedded maps, I came across this site, which suggested inspecting the HTML for geographic data. Searching the code for terms like “lat”, I discovered that the geographic coordinates were all sitting right in the HTML!
So using wget in bash, I downloaded the HTML:
wget -O GEN_map.html https://ecovillage.org/projects/map/
Exploring the HTML code, I found that the latitude-longitude data for all the points on the map were in one very, very, very long line of code (in this case, line 162).
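A quick way to locate that line without scrolling through the file could look like the sketch below. The helper name find_coord_lines and the three-line sample file are made up for illustration; the only thing taken from the real page is the "lat": key that the coordinates are stored under.

```shell
# find_coord_lines FILE
# Print the line number and length of every line in FILE that contains a "lat": key.
# (find_coord_lines is a hypothetical helper name, not part of the original script.)
find_coord_lines() {
  awk '/"lat":/ { print NR ": " length($0) " characters" }' "$1"
}

# Tiny stand-in for GEN_map.html; the content is invented for illustration.
sample_html=$(mktemp)
printf '%s\n' \
  '<html>' \
  '<body class="page-template">[{"ID":"1","lat":1.5,"lng":2.5}]</body>' \
  '</html>' > "$sample_html"

find_coord_lines "$sample_html"   # reports line 2 of the sample
```

On the real download, the call would be find_coord_lines GEN_map.html, and a single very long line in the output is a good hint that all the records live there.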
Next, I ran a series of piped commands to extract and organize the text and format it as a CSV file:
grep '^<body class=["]page-template' GEN_map.html | # extract line containing coordinate data
awk '{gsub("},{","\n")}1' | # split most ecovillage records onto separate lines
awk '{gsub("}","\n")}1' | # split at any remaining closing braces
awk '{gsub("{","\n")}1' | # split at any remaining opening braces
grep '^\"ID\"' | # extract only lines containing ecovillage records
awk '{gsub("\"ID\":\"","")}1' | # remove string '"ID":"'
awk '{gsub("\",\"post_title\":\"","\t")}1' | # replace string '","post_title":"' with tab
awk '{gsub("\",\"post_type\":\"","\t")}1' | # replace string '","post_type":"' with tab
awk '{gsub("\",\"lat\":","\t")}1' | # replace string '","lat":' with tab
awk '{gsub(",\"lng\":","\t")}1' | # replace string ',"lng":' with tab
awk '{gsub(",","")}1' | # remove commas (in preparation for writing CSV)
sed 's/\\//g' | # remove backslashes (they mess up the syntax)
sed 's/\"//g' | # remove quotation marks (they mess up the syntax)
sed 's/\t/,/g' | # replace tabs with commas
sed '1s/^/ID,post_title,post_type,lat,long\n/' > GEN_map_LatLong.csv # prepend header row and write to CSV
Working out most of the commands above was an iterative process of going back and forth between the script and the output to figure out (1) where to put line breaks, (2) what extraneous text to remove, (3) where to put tab breaks, and (4) which troublesome characters I needed to remove for clean syntax. I chose to start by separating fields with tabs and then replace the tabs with commas after removing all the troublesome commas.
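To see why the tab-first order matters, here is a minimal sketch that runs the field-separating steps on a single made-up record whose title contains a comma. The record values and the function name to_csv_row are assumptions for illustration, and tr stands in for the final sed step so the example also runs where sed does not understand \t.

```shell
# Hypothetical record in the same shape the pipeline expects; note the comma in the title.
record='"ID":"101","post_title":"Sunny Hill, North","post_type":"gen_project","lat":12.5,"lng":-45.25'

# to_csv_row: turn one record (read from stdin) into one CSV row.
to_csv_row() {
  awk '{gsub("\"ID\":\"","")}1' |                  # drop the leading "ID":" label
  awk '{gsub("\",\"post_title\":\"","\t")}1' |     # field boundary -> tab
  awk '{gsub("\",\"post_type\":\"","\t")}1' |      # field boundary -> tab
  awk '{gsub("\",\"lat\":","\t")}1' |              # field boundary -> tab
  awk '{gsub(",\"lng\":","\t")}1' |                # field boundary -> tab
  awk '{gsub(",","")}1' |                          # any comma left is inside a value: remove it
  tr '\t' ','                                      # only now turn tabs into CSV commas
}

printf '%s\n' "$record" | to_csv_row
# prints: 101,Sunny Hill North,gen_project,12.5,-45.25
```

If the tabs were converted to commas before the stray commas were removed, the comma inside the title would be indistinguishable from a field separator and the row would end up with six fields instead of five.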
Keep in mind that the set of commands I used is specific to GEN’s formatting. This script probably would need to be modified to work with other websites.
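Before uploading, it can be worth checking that every row really has the five expected fields. The sketch below is one possible check, not part of the original workflow; check_csv is a hypothetical helper name, and the sample rows (including the deliberately broken one) are invented.

```shell
# check_csv FILE
# Report how many data rows FILE has and list any row that does not have
# exactly five comma-separated fields (the header on line 1 is skipped).
check_csv() {
  awk -F, '
    NR > 1 && NF != 5 { printf "line %d has %d fields: %s\n", NR, NF, $0; bad++ }
    END { printf "%d data rows, %d malformed\n", NR - 1, bad + 0 }
  ' "$1"
}

# Invented sample: one good row and one row that is missing a field.
sample_csv=$(mktemp)
printf '%s\n' \
  'ID,post_title,post_type,lat,long' \
  '101,Sunny Hill North,gen_project,12.5,-45.25' \
  '102,Broken Row,gen_project,7.0' > "$sample_csv"

check_csv "$sample_csv"
# prints:
# line 3 has 4 fields: 102,Broken Row,gen_project,7.0
# 2 data rows, 1 malformed
```

On the real output the call would be check_csv GEN_map_LatLong.csv; any malformed rows point to a record the pipeline did not split or clean as expected.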
Finally, I imported the CSV into Google My Maps.
The results
Here’s a side-by-side comparison of a zoomed-in portion of the two maps, one from GEN and the one I created:
It looks like all the points were transferred successfully!
Resources/links
- Global Ecovillage Network
- Google My Maps
- Importing CSV File to Google Maps (Stack Overflow)
- Scrape data from an interactive map (ParseHub)
- Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License
Was this useful for you? How would you have done it? I’d love to hear your thoughts in the comments below!