Global Watersheds Web App – Help / About
This is the combined Help and About page for the Global Watersheds web app at https://mghydro.com/watersheds/. The web app can find the watershed (drainage area) upstream of almost anyplace on Earth faster than most other methods.
Created with the datasets MERIT-Hydro, MERIT-Basins, and HydroSHEDS.
If you use the app in any published work, you can cite the web page, or if you prefer, the GitHub repository which has a DOI from Zenodo: 10.5281/zenodo.7314287
Heberger, Matthew. delineator.py: fast, accurate global watershed delineation using hybrid vector- and raster-based methods. 2022. https://doi.org/10.5281/zenodo.7314287
The app is free and contains no advertising (and no cookies!). The web hosting costs around $25 per month.
If you found this web app fun or useful, or if it saved you time in your work, consider helping to offset my web hosting cost by sending a contribution via PayPal. I'm a PhD student at the moment, so it will really help!
If you have feedback or suggestions, please get in touch at matt@mghydro.com.
There is no guarantee of the correctness or suitability of these results for any purpose. The author assumes no liability for any harm or damages that result from the use of these data. You should carefully review the results and verify their accuracy.
I have released open-source code for doing watershed delineation on your own computer with Python. This code uses the same hybrid method used in the web app. One advantage of the Python script is that you can run it in "batch mode" to delineate hundreds or thousands of watersheds. The Python script also has several parameters that you can change that alter its performance. By contrast, in the web app, most of these parameters are hard-coded for the sake of simplicity and speed. Visit: https://github.com/mheberger/delineator
If you want to download geodata of your watershed or flowpath, open "Options" and check the "Make downloadable" box. Next, create a new watershed. Any new watersheds you make while this box is checked will be saved on the server and available for download.
Three download formats are available:
I've set up an API so you can get watersheds and rivers without using the map interface. You can use the API with any programming language to automate your workflows, for example to create hundreds (or thousands!) of watersheds. I wrote a blog post about how to do this with Python, and you can download the demo code in a Jupyter notebook.
To use the API, you need to provide a carefully formatted URL. There are two different links -- one for watershed boundaries and another for rivers.
https://mghydro.com/app/watershed_api
https://mghydro.com/app/upstream_rivers_api
You need to append two required parameters (lat, lng) and, optionally, a third parameter for the precision. Parameters are to be entered as a query string:
lat: a number from -60 to +85.
lng: a number from -180 to +180.
precision: "low" or "high", without quotes.
Optional: if omitted, defaults to "low."
(Note that if your watershed has an area of over 50,000 km², the app will automatically revert to low-precision.)
Latitude and longitude should be in decimal degrees (for example, 31.416) and not DMS (for example, 31°26'N).
https://mghydro.com/app/watershed_api?lat=43.253&lng=-77.609&precision=high
https://mghydro.com/app/upstream_rivers_api?lat=43.253&lng=-77.609&precision=high
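As a minimal sketch, the query string can be assembled with Python's standard library instead of by hand. The helper name build_api_url is hypothetical; the endpoints and parameter names are the ones listed above.

```python
from urllib.parse import urlencode

WATERSHED_API = "https://mghydro.com/app/watershed_api"
RIVERS_API = "https://mghydro.com/app/upstream_rivers_api"


def build_api_url(endpoint, lat, lng, precision="low"):
    """Return the endpoint URL with lat, lng and precision as a query string."""
    if precision not in ("low", "high"):
        raise ValueError("precision must be 'low' or 'high'")
    # float() rejects DMS strings such as "31°26'N"; the API expects decimal degrees.
    query = urlencode({"lat": float(lat), "lng": float(lng), "precision": precision})
    return f"{endpoint}?{query}"


print(build_api_url(WATERSHED_API, 43.253, -77.609, "high"))
print(build_api_url(RIVERS_API, 43.253, -77.609, "high"))
```

The two printed URLs match the example links above.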
If the server could handle your request and create a watershed, you will get the HTTP status code 200, and the body of the response will be of mimetype="application/json".
If there is a problem with one of your inputs, you will get a 400 Bad Request status code. For any other kind of error where the app cannot create a watershed (for example, your outlet point is over the ocean), you will get a 404 Not Found status code.
If you see a status code of 500, Internal Server Error, that means something is wrong, so please send me an email and I'll see if I can fix it.
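One way a script might react to these status codes is sketched below, using only the standard library. The function names and the short labels they return are hypothetical, chosen for illustration; only the status codes themselves come from the description above.

```python
import json
import urllib.error
import urllib.request


def interpret_status(status):
    """Map an HTTP status code from the API to a short label."""
    if status == 200:
        return "ok"
    if status == 400:
        return "bad input"      # a problem with lat, lng or precision
    if status == 404:
        return "no watershed"   # for example, the outlet is over the ocean
    if status == 500:
        return "server error"   # worth reporting by email
    return "unexpected"


def fetch_geojson(url, timeout=60):
    """Return (label, parsed GeoJSON or None) for one API request."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return interpret_status(response.status), json.load(response)
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx and 5xx responses.
        return interpret_status(err.code), None
```

A batch job could log every request whose label is not "ok" and move on to the next outlet, rather than stopping at the first failure.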
The response is plain text GeoJSON. Inside is a FeatureCollection. Watersheds only have a single Feature, a Polygon that represents the watershed boundary. (Sometimes the app produces multi-part polygons, but the extra parts are usually the size of single pixels. I programmed the app to discard these, as it makes managing the data much simpler.)
{ "type": "FeatureCollection", "features": [ { "type": "Feature", "geometry": { "type": "Polygon", "coordinates": [ [ [-80.51958, 40.11708], [-80.51375, 40.11791], ... , [-80.51958, 40.11708] ] ] }, "properties": { "area_km2": "421", "outlet_lat": 40.23, "outlet_lng": -80.61 } } ] }
The rivers GeoJSON looks similar. But here, the FeatureCollection will usually contain multiple Features, as each river reach, or segment, is a separate Feature.
{ "type": "FeatureCollection", "features": [ { "type": "Feature", "geometry": { "type": "LineString", "coordinates": [ [-77.64167, 43.12167], [-77.66083, 43.11083], [-77.67417, 43.1075], [-77.67833, 43.09583] ] }, "properties": { "comid": 72056019, "sorder": 4 } }, ... { "type": "Feature", "geometry": { "type": "LineString", "coordinates": [ [-77.92583, 42.06417], [-77.93083, 42.07083], [-77.93667, 42.0725], [-77.9475, 42.06917] ] }, "properties": { "comid": 72058947, "sorder": 1 } } ] }
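A rivers response like the one above can be read with any JSON parser. The sketch below uses a shortened copy of that example (only the two reaches shown in full); the function name summarize_rivers is hypothetical.

```python
import json
from collections import Counter

# Shortened copy of the example response above (two reaches).
sample = """
{"type": "FeatureCollection", "features": [
  {"type": "Feature",
   "geometry": {"type": "LineString", "coordinates":
     [[-77.64167, 43.12167], [-77.66083, 43.11083], [-77.67417, 43.1075], [-77.67833, 43.09583]]},
   "properties": {"comid": 72056019, "sorder": 4}},
  {"type": "Feature",
   "geometry": {"type": "LineString", "coordinates":
     [[-77.92583, 42.06417], [-77.93083, 42.07083], [-77.93667, 42.0725], [-77.9475, 42.06917]]},
   "properties": {"comid": 72058947, "sorder": 1}}
]}
"""


def summarize_rivers(geojson_text):
    """Return (reach count, reaches per stream order, total vertex count)."""
    features = json.loads(geojson_text)["features"]
    per_order = Counter(f["properties"]["sorder"] for f in features)
    vertices = sum(len(f["geometry"]["coordinates"]) for f in features)
    return len(features), dict(per_order), vertices


print(summarize_rivers(sample))  # (2, {4: 1, 1: 1}, 8)
```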
With the API, you can plug the URL right into QGIS. Choose Layer > Add Layer > Add Vector Layer, then in the field "Vector Dataset(s)," add the URL, then click "Add." Adjust the styles, and you can have something that looks like this:
During my PhD research in hydrology and remote sensing, I had to delineate thousands of watersheds. I needed a routine that was fast and accurate, and I wasn't happy with any of the existing software that I tried. I ended up writing some code in Python to do the job. I "invented" a technique that uses both vector- and raster-based data and is both fast and accurate. I put "invented" in quotes because I was surprised no one had done it before, yet I could not find any mention of such a method in the literature. It was only months later that I found a conference paper from 1999 that described the exact method. So let's say I had "rediscovered" it.
I thought the code I wrote was useful so I made it open source and posted it on GitHub. This is great for other programmers and scientists. But what about the other 99% of people? They deserve to know about watersheds too! So I decided to create a web app where anyone could try it out.
As far as I know, this is the only (free) app to delineate watersheds anywhere in the world. It is also much faster than any other method I've tried.
I hope you enjoy using the Global Watersheds app. I am endlessly fascinated by clicking on different places around the world, and seeing where water comes from and where it's going.
The app is available in English and French, since those are the languages that I know well. If you are interested in helping translate the app into another language, please get in touch! The app seems to have many visitors from Latin America, the Middle East, China, and elsewhere.
I spent a lot of time searching for an app that delineates watersheds, and was surprised that something like this did not already exist. Nevertheless, I found some other great sites that are somewhat related:
In my experimentation, there are a few places where the watershed delineation results are just not very good. This can occur where the terrain is very flat, hydrologically complex, or both. For example, there may be irrigation canals or pipelines. Other problem areas are plains and deltas, where rivers branch or braid. The input datasets do not attempt to represent distributaries, where the flow in a river splits or fans out.
Typical problem areas include:
Unfortunately, neither MERIT-Hydro nor HydroSHEDS has data for some islands. Hawaii is missing, as are the Azores. However, there is data for the Canary Islands, Fiji, Tuvalu, the Galapagos, and many others. I recently found a dataset called HDMA that includes data for many of these smaller islands, and I may consider adding it.
The app offers two main functions: "Downstream - trace flow" or "Upstream - delineate watershed."
Tracing the flow downstream gives you an (approximate) flow path to the ocean or to an inland sink -- what hydrologists call an endorheic basin. The flow path is based on following the steepest slope on a digital map of the earth's elevation. The flow path algorithm used here cannot deal with distributaries, such as those that occur in deltas or braided channels. So the flowpath may not go where you expect it to, especially as it gets close to the sea.
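The idea of following the steepest slope can be illustrated on a toy elevation grid. This is only a sketch of the general steepest-descent technique, not the app's code: the grid values are made up, and real datasets typically ship precomputed flow directions instead of recomputing them from elevation.

```python
def trace_downstream(elevation, start):
    """Follow the steepest descent from start until no neighbor is lower.

    elevation is a list of rows; start is a (row, col) tuple.
    Returns the list of visited cells, ending where the path cannot descend.
    """
    rows, cols = len(elevation), len(elevation[0])
    path = [start]
    r, c = start
    while True:
        best, best_drop = None, 0.0
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr, dc) == (0, 0) or not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                # Diagonal neighbors are farther away, so divide by the distance.
                distance = (dr * dr + dc * dc) ** 0.5
                drop = (elevation[r][c] - elevation[nr][nc]) / distance
                if drop > best_drop:
                    best, best_drop = (nr, nc), drop
        if best is None:  # no lower neighbor: a sink or the outlet
            return path
        path.append(best)
        r, c = best


toy_elevation = [
    [9, 8, 7],
    [8, 6, 5],
    [7, 4, 1],
]
print(trace_downstream(toy_elevation, (0, 0)))  # [(0, 0), (1, 1), (2, 2)]
```

Because each cell has exactly one steepest neighbor, a path traced this way can never split, which is why distributaries in deltas cannot be represented.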
A watershed, or drainage basin, is the area upstream or upslope of a point on the earth's surface. Precipitation that falls in a watershed will eventually flow towards the watershed's outlet. So knowing the watershed for a point on a lake, river, or stream can (usually) tell you where the water came from. Watersheds are extremely important in hydrology and environmental science, for everything from studies of flooding to water pollution and much more.
This app considers surface flow paths only. In other words, it does not consider water flowing underground in aquifers. It also ignores most man-made water transfers (in canals or pipelines).
Under Options, you can choose either high-precision or low.
High-precision mode is the default. The app will use it as long as you are zoomed in on the map (zoom level 9 or higher) and have chosen MERIT-Hydro as the data source.
For watersheds with an area over 50,000 km², the app will automatically revert to low-precision mode.
High-precision mode is only available with MERIT-Hydro data. I did not implement it for HydroSHEDS. While HydroSHEDS is an excellent dataset, and has a lot of features that make it very nice to work with, I don't think it is accurate enough to justify doing more detailed calculations. I may change my mind when HydroSHEDS v2 is published.
There are two main differences between high- and low-precision mode. The first difference is in the level of detail in the output. If you zoom in on a river centerline or watershed boundary, you can see the jagged lines that come from extracting shapes from these pixels. In low-precision mode, the watershed boundary has been simplified, and contains fewer vertices. This makes it faster to process and to display.
The second difference has to do with the precision of the watershed boundary with respect to the outlet. In low-precision mode, the watershed boundary will not necessarily intersect the outlet point you requested. This is because, in low-precision mode, the app only uses vector data and methods. Your watershed will be assembled from a series of predefined unit catchment polygons, each with an average size of around 40 km², or 15 square miles. The boundary will always be some distance downstream of the outlet. For large watersheds, you may hardly notice this discrepancy, but for small watersheds, the error can be quite obvious.
In high-precision mode, the app will try to find a watershed boundary that more precisely intersects the outlet point you chose. This requires an extra processing step using raster data and methods. (Raster refers to the use of gridded data.) This method is slower and more computationally intensive than vector methods, but gives more accurate results.
Even in high-precision mode, the app uses vector methods whenever it can, and only uses raster methods to "split" the most downstream unit catchment. Combining the two approaches (raster and vector) is faster and more accurate than using either one by itself. This hybrid method, while not totally unique, is what makes this app's approach different from standard methods of watershed delineation that you can do using a desktop GIS.
By default, the app will display river centerlines in addition to the watershed boundary.
For large watersheds, the app may not display all the river reaches it found. If it did, the map would be too "busy" and hard to read. Also, showing thousands of rivers on the map could make your web browser slow to a crawl or crash. To show the right amount of detail, the app prunes the river network. We only show four orders of rivers, according to their Strahler number. For example, if the most downstream river reach is of order 7, we only show rivers that are order 3 and up. Small headwater streams of order 1 and 2 will not be shown.
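A pruning rule of this kind could be sketched as follows. The cutoff here follows the worked example (a network whose highest order is 7 keeps order 3 and up); the exact rule the app uses is an assumption, and the function name and the span parameter are hypothetical.

```python
def prune_by_order(reaches, span=4):
    """Keep reaches whose Strahler order is within `span` of the highest order.

    reaches is a list of dicts with an "sorder" key, as in the rivers GeoJSON
    properties. span=4 is an assumption that reproduces the order-7 example.
    """
    if not reaches:
        return []
    max_order = max(r["sorder"] for r in reaches)
    min_order = max(1, max_order - span)
    return [r for r in reaches if r["sorder"] >= min_order]


# Toy network with one reach of each order from 1 to 7.
toy_reaches = [{"comid": i, "sorder": i} for i in range(1, 8)]
print([r["sorder"] for r in prune_by_order(toy_reaches)])  # [3, 4, 5, 6, 7]
```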
Nevertheless, downloads will contain all of the rivers in your watershed that are available from the source dataset (MERIT or HydroSHEDS), including all the little headwater streams. The files have an attribute named sorder that you can use to filter the results or to set the symbology if you are making a map. I found that setting the line width proportional to the square root of the stream order looks nice.
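The square-root rule is easy to compute when styling rivers in a script. In this sketch the base width of 0.5 (in whatever units your map uses) is an arbitrary assumption, and the function name is hypothetical.

```python
import math


def line_width(sorder, base_width=0.5):
    """Line width proportional to the square root of the stream order."""
    return base_width * math.sqrt(sorder)


for order in (1, 4, 9):
    print(order, line_width(order))
# 1 0.5
# 4 1.0
# 9 1.5
```

With this scaling, an order-9 river is drawn three times as wide as a headwater stream, rather than nine times as wide.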
The river polyline features have two properties or attributes:
Sometimes the watersheds created by the app are weird or just look wrong. The app does not work that well for small watersheds. The source data is global, and it's not intended for detailed, local applications. If your results look odd, click a slightly different location and try again.
Here is one trick I found for getting good results: start by tracing a downstream flow path first, to see where the dataset thinks the river centerline is. Click somewhere along this line to delineate an upstream watershed. Tracing downstream flowpaths will show you where the river ends and the ocean begins (according to the source data).
You can use keyboard shortcuts to quickly switch back and forth:
If you think there's a bug or something else wrong, please drop me a line at matt@mghydro.com.
When you request a watershed in high-precision mode, for the app to return meaningful results, it needs to "snap the pour point," or move the outlet to a river centerline in the source dataset. The app will try to automatically relocate the watershed outlet, or pour point, to be coincident with a stream channel in the gridded flow accumulation dataset. Finding the right river can be an art and a science. Several different algorithms for snapping pour points have been proposed in the literature; the one we use here is a simple one. Getting good results often requires some trial and error. So, if your watershed is not what you expected, click somewhere else nearby and try again.
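To make the idea concrete, here is one simple snapping strategy: search a small window around the clicked cell and move to the cell with the largest flow accumulation. This is a generic illustration and not necessarily the algorithm the app uses; the grid values, the function name, and the search radius are all assumptions.

```python
def snap_pour_point(accumulation, cell, radius=2):
    """Return the cell with the largest flow accumulation within `radius` cells.

    accumulation is a list of rows; cell is a (row, col) tuple.
    If no cell in the window beats the starting cell, the start is returned.
    """
    rows, cols = len(accumulation), len(accumulation[0])
    r0, c0 = cell
    best = cell
    for r in range(max(0, r0 - radius), min(rows, r0 + radius + 1)):
        for c in range(max(0, c0 - radius), min(cols, c0 + radius + 1)):
            if accumulation[r][c] > accumulation[best[0]][best[1]]:
                best = (r, c)
    return best


# Toy grid: a stream channel runs down the third column.
toy_accumulation = [
    [1, 1, 2, 1],
    [1, 3, 50, 1],
    [1, 1, 60, 2],
    [1, 1, 75, 1],
]
print(snap_pour_point(toy_accumulation, (1, 1), radius=1))  # (2, 2)
```

A larger radius finds bigger rivers farther away, which is one reason a click near a confluence can land on a river you did not intend.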
The app finds all the "unit catchments" that make up the watershed and merges them together. In geographic information sciences, this is called a dissolve or a unary union. This is often the rate-limiting step in our calculations. For enormous watersheds (like the Nile, Amazon, Mississippi, Congo...), it can take a few minutes to merge thousands of little polygons into one big polygon. The good news is, the app only does this work once, and the result is saved and reused (what programmers call memoization). The next time someone requests the same watershed, it should be very fast. So the more you use this app to delineate different watersheds around the world, the faster it will be in the future.
The rivers can look jagged because of the conversion from a raster, or gridded, dataset to vector polylines. If you're making a map, and you're concerned about appearances, there are a couple of things you can try. GIS software has routines that can help you smooth out these jagged lines, and make them more curvy and aesthetically pleasing. For example, in QGIS, open the Processing Toolbox, and search for "Smooth" or the GRASS tool "v.generalize". Note that the lines will end up with more vertices. With some trial and error, you can use "simplify" and "smooth" to improve the appearance of the rivers. The results will not be more accurate, but they may help you make pretty maps!
You can also look for more detailed river data for your maps, for example clipping it to the watershed boundary. For large regions, Natural Earth may be suitable. In the United States, the National Hydrography Dataset is excellent. In Canada, there is CanVec rivers, lakes, and glaciers. You will find many others if you search for "hydrography + GIS + country name" as many countries and regions publish geodata. Otherwise, OpenStreetMap data can be quite good, if inconsistent, and is available globally. You will need to filter the data to select waterways of various types, which can be a little tricky. Luckily, the Yamazaki Lab has done this work and shared the dataset: see OSM Water. This dataset is enormous (7 GB, gzipped), so you will probably need to extract a portion in order to work with it in desktop GIS software. It also represents a snapshot from 2021, so more up-to-date data may be preferable.
This section describes in a bit more detail the algorithm that the app uses to delineate watersheds. The method makes use of two distinct classes of data, vector and raster. This "hybrid" method is faster and more accurate than using either type of data by itself.
By way of background, there are many software tools for automated (computerized) watershed delineation. Most methods use gridded terrain data -- a "digital elevation model" or DEM. Normally, to get the most accurate watershed boundaries, you want to use the highest resolution data that is available. For global studies, the current state of the art is to use a DEM with 3 arcsecond resolution (about 90 meters near the equator). Examples include HydroSHEDS and MERIT-Hydro. In the near future, the standard may shift to 12-m resolution, for example with TanDEM-X. And while there are advantages to using higher resolution data, it requires more computer memory and longer processing times.
For an ordinary user, without access to a supercomputer, things get complicated when dealing with large watersheds. When you try to delineate a watershed near the mouth of the Amazon or Mississippi Rivers with high-resolution data, you need to load huge raster datasets into memory. The processing time can also be very slow.
As an alternative to using large raster datasets to delineate watersheds, one can use a shortcut method with vector data. Here, vector refers to data that is made up of polylines and polygons described by a series of vertices. In the US, there is the National Hydrography Dataset (NHD), or for global studies, one could use HydroBASINS or MERIT-Basins. The creators of these datasets have already done the work of processing gridded terrain data and creating watershed boundaries. In these datasets, the land surface is divided into thousands of polygons called subwatersheds or unit catchments.
To find your watershed, you search for all of the unit catchments that are upstream of your outlet. This can be done very efficiently using a network analysis algorithm. So instead of using millions of small pixels as your building blocks, you use tens or hundreds of polygons (which are much larger than pixels). As a result, processing is usually much faster.
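The upstream search can be sketched as a breadth-first traversal of the catchment network. The sketch assumes each unit catchment records the ID of the catchment it flows into; the toy network, the function name, and the dictionary layout are hypothetical, and real datasets store this link under their own attribute names.

```python
from collections import defaultdict, deque


def upstream_catchments(downstream_of, outlet_id):
    """Return the set of catchment IDs that drain to outlet_id, including itself.

    downstream_of maps each catchment ID to the ID it flows into
    (None where the river reaches the sea or an inland sink).
    """
    # Invert the table: which catchments flow directly into each one?
    flows_into = defaultdict(list)
    for cid, down in downstream_of.items():
        if down is not None:
            flows_into[down].append(cid)

    found = {outlet_id}
    queue = deque([outlet_id])
    while queue:
        current = queue.popleft()
        for neighbor in flows_into[current]:
            if neighbor not in found:
                found.add(neighbor)
                queue.append(neighbor)
    return found


# Toy network: 1 and 2 join at 3; 3 and 4 join at 5; 6 is a separate basin.
toy_network = {1: 3, 2: 3, 3: 5, 4: 5, 5: None, 6: None}
print(sorted(upstream_catchments(toy_network, 5)))  # [1, 2, 3, 4, 5]
```

Each catchment is visited once, so the cost grows with the number of polygons in the watershed rather than the number of pixels.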
To find your watershed boundary, you can merge the selected unit catchments into a single polygon. In geographic science, this is called "dissolve" or "unary union." This operation can usually be done on an ordinary desktop computer.
However, using vector data alone yields imperfect results. The resulting watershed is always going to be a little too big or a little too small, because it is unlikely that the unit catchment boundaries intersect your desired watershed outlet point. For large watersheds, the error may be relatively small, and barely noticeable, especially on a map that is zoomed out to show the whole watershed. But for small watersheds, the error in terms of watershed area may be unacceptably large. And even for the largest watersheds, when you zoom in on the map to the area near the outlet, the results just "look wrong" because the watershed boundary does not go through the outlet point as it should.
When I was working on this problem, and running into "out of memory" errors on my computer, a colleague suggested I try a method that combines both vector data and raster data. With this method, you use vector data and methods for the upper watershed, and raster methods for the downstream portion near the outlet. This "hybrid" method is the best of both worlds since it combines the speed of vector-based methods with the accuracy of raster-based methods.
The hybrid approach is done in four steps once you have identified an outlet location, or point:
In step 3, we use conventional raster-based methods, which are slower but more detailed than vector-based methods. However, we only need to use the raster method on a relatively small area. We open raster datasets (flow direction, flow accumulation) for analysis in "windowed reading" mode. This means we only read into computer memory the portion of the rasters that are within the boundaries of the home unit catchment. Reading a small piece of the raster dataset uses much less memory than reading the entire file into memory.
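To show why windowed reading saves so much memory, the sketch below computes which block of pixels covers a catchment's bounding box in a 3 arcsecond raster. The tile origin, the bounding box, and the function name are hypothetical, and the raster is assumed to be north-up with square cells.

```python
import math

CELLS_PER_DEGREE = 1200  # 3 arcseconds = 1/1200 of a degree


def window_for_bbox(origin_lng, origin_lat, bbox):
    """Return (col_off, row_off, width, height) covering bbox in a north-up raster.

    origin_lng, origin_lat: coordinates of the raster's top-left corner.
    bbox: (min_lng, min_lat, max_lng, max_lat) of the home unit catchment.
    """
    min_lng, min_lat, max_lng, max_lat = bbox
    col_off = math.floor((min_lng - origin_lng) * CELLS_PER_DEGREE)
    row_off = math.floor((origin_lat - max_lat) * CELLS_PER_DEGREE)
    col_end = math.ceil((max_lng - origin_lng) * CELLS_PER_DEGREE)
    row_end = math.ceil((origin_lat - min_lat) * CELLS_PER_DEGREE)
    return col_off, row_off, col_end - col_off, row_end - row_off


# Hypothetical 5 x 5 degree tile (6000 x 6000 cells), top-left corner at 80°W, 45°N.
window = window_for_bbox(-80.0, 45.0, (-77.75, 43.0, -77.5, 43.25))
print(window)  # (2700, 2100, 300, 300)

# Share of the tile that has to be read into memory.
print(window[2] * window[3] / (6000 * 6000))  # 0.0025
```

A raster library that supports windowed reads (rasterio is one example) can then be asked for just this block, here a quarter of one percent of the tile.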
In step 4, we may optionally remove any internal "donut holes" from the watershed.
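For a polygon stored as GeoJSON, removing donut holes amounts to keeping only the first ring, since the GeoJSON format lists the exterior ring first and any interior rings (holes) after it. The sketch below shows this on a made-up square with one hole; the function name is hypothetical.

```python
def remove_holes(polygon_geometry):
    """Return a copy of a GeoJSON Polygon geometry with interior rings dropped."""
    if polygon_geometry["type"] != "Polygon":
        raise ValueError("expected a Polygon geometry")
    # The first ring is the exterior; any further rings are holes.
    return {"type": "Polygon", "coordinates": polygon_geometry["coordinates"][:1]}


# Made-up example: a 4 x 4 square with a 1 x 1 hole in the middle.
with_hole = {
    "type": "Polygon",
    "coordinates": [
        [[0, 0], [4, 0], [4, 4], [0, 4], [0, 0]],
        [[1, 1], [1, 2], [2, 2], [2, 1], [1, 1]],
    ],
}
filled = remove_holes(with_hole)
print(len(with_hole["coordinates"]), len(filled["coordinates"]))  # 2 1
```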
I was delighted by the hybrid method I "invented," impressed with its speed, and eager to share it with my hydrologist pals.
I later discovered that the hybrid method for watershed delineation is not exactly new. I have traced its origin back to a 1999 conference paper by Dean Djokic and Zichuan Ye, programmers at the GIS software firm ESRI. They called their method Fast Watershed Delineation, or FWD. They wrote that it was first created at ESRI in 1997 as a set of scripts called Watershed Delineator, "developed for the Texas Natural Resource Conservation Commission, with the sole purpose of efficiently delineating watersheds." Instead of using an existing vector dataset of unit catchments, which I don't believe existed at the time, the scripts create a new set at runtime using the Arc/Info subwatershed command. While their scripts (in the now defunct scripting language Avenue) were in the public domain, they nevertheless required the purchase of proprietary ESRI software to use them.
Since that time, this hybrid method has been adopted by the U.S. Geological Survey (USGS). Examples include the NHD Watershed Tool for ArcView (circa 2003) (ref: USGS) and NHDPlus Tools for ArcMap (circa 2010) (ref: Horizon Systems). These scripts use vector data on unit catchments created by the USGS and raster data from the National Elevation Dataset. A limitation of these tools is that they require proprietary software, and are limited to watersheds in the United States.
USGS scientists have also used the hybrid method of watershed delineation in web-based applications. A factsheet from 2000 describes its use on the Massachusetts StreamStats website. This web app was subsequently expanded to the rest of the country, and is called StreamStats. (Besides watershed delineation, the StreamStats app can "get basin characteristics and estimates of flow statistics, and more.") Further, the hybrid method is also used by the USGS for its online programming interface (API) for the National Hydrography Dataset, called the NLDI. Using the API, programmers can access USGS services and data to find watersheds, as well as many other features and data related to rivers, lakes, and streams in the US.
After months of searching online, I was convinced that the hybrid method of watershed delineation had never been described in a journal article, and had only appeared in "gray literature" such as conference presentations and manuals for defunct software. Finally, I found a discussion of the hybrid method in a 2014 paper in the journal Computers & Geosciences. In this article, Anthony Castronova and Jonathan Goodall, environmental engineers from Utah State, described the hybrid method using NHDPlus vector data over the United States. The details of their implementation are somewhat different from what I have done, but it is essentially the same method. Unfortunately, the link to download the code has gone dark, as is often the case with articles more than a few years old. Further, although this paper has been cited in other papers 14 times according to Google Scholar, none of these papers involve use of their watershed delineation method.
For this project, I reviewed many articles on watershed delineation. It appears that watershed delineation continues to be a somewhat active area of research in academia. Many of the papers are aimed at developing new, improved algorithms. However, none of these recent papers mention hybrid methods. Outside of those working with NHD data in the US, researchers in the hydrologic community have been slow to adopt hybrid methods, I believe because they are largely unknown. Nevertheless, the idea is quite simple. I am convinced that there are many programmers who have independently "re-invented" this method.
With the publication of my codebase and online demo, I hope to increase awareness of the hybrid method. Another goal is to share open-source code that others can adapt and improve.