Global Watersheds Web App – Help & About

This is the combined Help and About page for the Global Watersheds web app at https://mghydro.com/watersheds/. The web app can find upstream watersheds and downstream flow paths for almost any place on Earth. It uses the latest, state-of-the-art global datasets, and is faster than most other methods.

Have fun exploring! I hope the app helps you to learn about the natural world. Water connects us all, and our waterways are precious resources to be protected and conserved.

Getting Started

Just click somewhere on the map, then click the button Delineate.

The app will show you the watershed boundary as a red line, and the upstream river network as a set of blue lines. A watershed, or drainage basin, is the area upstream or upslope of a point on the earth's surface.

Precipitation that falls in a watershed will flow downhill towards the watershed's outlet. If you know the watershed for a lake or a river, it can help you understand where the water in that lake or river came from. Watersheds are important in hydrology and environmental science, for studies of flooding, water pollution, aquatic habitat, hydropower, and more.

Credits

Created with the datasets MERIT-Hydro, MERIT-Basins, and HydroSHEDS.

When you choose "USGS" as the data source (for the continental United States only), the app uses data and methods by the US Geological Survey via their API, called the Hydro Network-Linked Data Index (NLDI). Curious programmers can do much more with these data using the USGS API directly or via the Python library pynhd.

Citation

If you use the app in any published work, please cite this web page. Here is a suggested citation:

Heberger, Matthew. 2022. Global Watersheds (web application). https://mghydro.com/watersheds.

Contribute

The app is free and has no advertising (and no cookies!). The web hosting costs around $25 per month.

If you found this web app fun or useful, or if it saved you time in your work, please Buy Me a Coffee. ☕ Many cups of coffee were consumed while coding the app!

Feedback

If you have feedback or suggestions, please get in touch at matt@mghydro.com. I love hearing what people are doing with the app.

Terms of Use

There is no guarantee of the correctness or suitability of these results for any purpose. The author assumes no liability for any harm or damage that results from the use of these data. You should carefully review the results and verify their accuracy.

Downloads

If you want to download geodata of your watershed or flowpath, expand the "Options," and check the box "Make downloadable." Now, any new watersheds or flowpaths that you create will be saved on the server and available for download.

Four download formats are available:

Delineate watersheds for a lake or coastline

Most watershed delineation routines find the drainage area upstream of a point. This works well for rivers and streams (most of the time). But what if you want to find the watershed of a lake or inland sea? With this feature, you can draw a polyline or polygon on the map, and find the drainage area for your shape. This feature was inspired by the Chesapeake Bay Conservation Toolbox, which unfortunately no longer seems to be functioning.

How to use this feature: First, under Options, check the box “✒️ Delineate watersheds for a lake or coastline.” A new toolbar will appear on the left side of the map. Click one of the buttons to draw a polyline or a polygon. When you are finished drawing, click the “Delineate” button on the popup window.

Sometimes the results can look a little unexpected, so experiment a bit before you give up. This feature has some limitations:

Continental scale basins

Map Layers

By default, the map shows a basemap provided by OpenStreetMap. You can change the basemap by clicking the widget at the top right. Some of the basemaps have limited geographic coverage, for example, USGS Topographic Maps or GeoPortail France.

An interesting choice is CartoDB Voyager plus its labels, as the labels will show up on top of rivers and the watershed boundary, making them easier to read.

Some of the layers are a little unreliable. But let's not complain, since these are all free!

Thematic Map Layers

Added November 2024. Three new thematic map layers -- population density, land cover, and irrigated lands -- illustrate human activities that can have big impacts on watersheds.

Population Density

Population data comes from GlobPop, by researchers at Beijing Normal University (link to 2024 journal article). This seemed to be the best among the alternatives I looked at (GPW4, GRUMP, LandScan, and WorldPop). This was actually quite surprising, as some of these other datasets are well-known and backed by large institutions. The colors on the map correspond to population density, or people per square kilometer. The colormap is a logarithmic scale ranging from 0 population shown as dark blue, to a high population density of over 100,000 people/km².

The population is estimated here for the year 2020, the most recent year available. The GlobPop dataset includes layers for every year from 1990 to 2020. The spatial resolution of this dataset is 30 arcseconds, or 1/120°. Near the equator, the pixels are approximately 900 m across, and have an area of slightly less than 1 km². The pixels get smaller as you move north or south away from the equator. Based on the resolution, I set the basemap to display at up to Zoom Level 12.
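The pixel sizes quoted above can be checked with a little arithmetic. The following is a minimal sketch that treats the Earth as a sphere with a mean radius of 6371 km (an assumption made for simplicity); the function name pixel_size_km is made up for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius; the Earth is treated as a sphere


def pixel_size_km(resolution_arcsec, latitude_deg):
    """Return (width, height, area) of one grid cell, in km and km²."""
    step_rad = math.radians(resolution_arcsec / 3600.0)
    height = EARTH_RADIUS_KM * step_rad                    # north-south size, constant
    width = height * math.cos(math.radians(latitude_deg))  # east-west size shrinks with latitude
    return width, height, width * height


for lat in (0, 30, 60):
    w, h, a = pixel_size_km(30, lat)
    print(f"latitude {lat}: {w:.3f} km x {h:.3f} km, area {a:.3f} km²")
```

With these assumptions, a 30-arcsecond pixel at the equator comes out a bit over 900 m across, and a pixel at 60° latitude is about half as wide.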

This seemed to be one of the best datasets available, based on my exploration of the free datasets available showing global population. However, it is not perfect. If you zoom in, you will occasionally see unusual patterns, like sudden changes in population density at national or provincial borders, that are likely artifacts of the data processing and not real-world patterns. If you are performing an important analysis, you should check into obtaining more detailed data from local or national governments. For example, in the United States, it is standard to use data from the Census Bureau to estimate population.

Land Cover

Land cover refers to physical objects or vegetation covering the land surface (i.e. forest, grassland, wetland, asphalt). It is commonly determined using aerial and satellite imagery.

Land cover data comes from the GLAD: Global Land Cover and Land Use Change, 2000-2020 (Potapov et al. 2022). This dataset, created by researchers at the University of Maryland, is available online here. Classification is based on satellite imagery from Landsat and machine learning tools.

I investigated a few other options for Land Use. I found NASA's MODIS land use surprisingly inaccurate. Another well-known dataset, WorldCover, is produced by the European Space Agency. There is also Dynamic World from Google and the World Resources Institute. These other datasets either did not offer snapshots of different time periods or required too much pre-processing to use.

Nonetheless, the GLAD dataset required some processing to create tiles to display on the map. I loaded all of the data into a single zarr datastore. This is a new-ish data format that is popular with the big data crowd and Python users. I pre-rendered the tiles at zoom levels 0 to 8 using QGIS and the excellent QTiles plugin. This took several hours. Tiles at higher zoom levels are created on demand using Python scripts on the server. Once we've created them once, they're saved so that the next time a user requests them, it should be lightning fast.

Irrigated Area

Data on irrigation comes from a global dataset published by an international team of researchers (Siebert et al. 2015). The dataset can be downloaded from the website mygeohub.

Interestingly, this dataset covers a much longer historical period than the others shown here, providing estimates of the area equipped for irrigation (AEI) between 1900 and 2005. The variable reported is "the area of land that is equipped with infrastructure to provide water to crops. It includes area equipped for full/partial control irrigation, equipped lowland areas, and areas equipped for spate irrigation, but it excludes rainwater harvesting" (p. 1523 in Siebert et al. 2015). The actual area irrigated in any given year may be much lower than the area equipped, because cropland is either not farmed (fallowed) or no supplemental irrigation is provided beyond rainfall.

Dams

Data on dams comes from the Global Dam Watch database, published in July 2024. For more details, see the journal article by lead researchers at McGill University (Lehner et al. 2024).

I selected the most interesting columns from the data table, and did a bit of clean up of the text, for example fixing entries in all capital letters. Then I converted the data to GeoJSON, which is displayed on the map using the fantastic Marker Cluster plugin for Leaflet maps. This site would not work without all this incredible open-source software!

MERIT Hydrography Data Layers

With these layers, you can display the source data upon which the app is based, by checking "MERIT-Basins unit catchments" and "MERIT-Basins river reaches." I think that this can help give some insight into how the app works and its limitations. Please let me know what you think about this new feature!

Source Code

I have released open-source code for doing watershed delineation on your own computer with Python. This code is based on the same hybrid method used in the web app. One advantage of the Python script is that you can run it in "batch mode" to delineate hundreds or thousands of watersheds. The Python script also has several parameters that you can change to alter its performance. By contrast, in the web app, most of these parameters are hard-coded for the sake of simplicity and speed.

The Python scripts are slower than the web app, especially for large watersheds. For most users, it will be faster to use the web app or the API (see below).

Source code: https://github.com/mheberger/delineator

Citation: Heberger, Matthew. delineator.py: fast, accurate global watershed delineation using hybrid vector- and raster-based methods. 2022. https://doi.org/10.5281/zenodo.7314287

API

I've set up an API so you can get watershed boundaries, upstream river networks, and downstream flowpaths without using the web map interface. You can use the API with any programming language to automate your workflows, for example to create hundreds (or thousands!) of watersheds. I wrote a blog post about how to do this with Python, and you can download the demo code in a Jupyter notebook. If you create code in another language, please send it to me and I'll post that too.

To use the API, you need to provide a carefully formatted URL. There are different URLs for watershed boundaries, the upstream river network, and downstream flow paths. At present, I've only set up the API to work with MERIT-Hydro data.

Base URLs:

        https://mghydro.com/app/watershed_api
        https://mghydro.com/app/upstream_rivers_api
        https://mghydro.com/app/flowpath_api

Parameters

You need to append at least the two required parameters (lat, lng). You can also add an optional parameter for the precision, and two optional parameters that affect the appearance. Enter the parameters as a query string:

        lat: latitude, a number from -60 to +85.
        lng: longitude, a number from -180 to +180.
        precision: "low" or "high", without quotes. Optional: if omitted, defaults to "low."
        simplify: "true" or "false", without quotes. Optional: defaults to "false."
        beautify: "true" or "false", without quotes. Optional: defaults to "false."

See the section Simplify or Beautify for a description of what the last two parameters do.

If your watershed has an area of over 50,000 km², the app will automatically revert to lower-precision mode, even if you specify precision=high.

Latitude and longitude should be in decimal degrees (e.g., 31.416).

Example API calls:

        https://mghydro.com/app/watershed_api?lat=43.253&lng=-77.609&precision=high
        
        https://mghydro.com/app/upstream_rivers_api?lat=43.253&lng=-77.609&precision=high
    
        https://mghydro.com/app/flowpath_api?lat=-3.913&lng=29.84&precision=high
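
If you are scripting many requests, it can help to build these URLs in code rather than by hand. Here is a minimal Python sketch using only the standard library. The base URLs and parameter names come from this page; the function name build_url, the short endpoint keys, and the generic latitude/longitude sanity checks are assumptions made for this example.

```python
from urllib.parse import urlencode

BASE = "https://mghydro.com/app/"
# Short keys are hypothetical labels; the endpoint names are from this page.
ENDPOINTS = {
    "watershed": "watershed_api",
    "rivers": "upstream_rivers_api",
    "flowpath": "flowpath_api",
}


def build_url(kind, lat, lng, precision=None, simplify=None, beautify=None):
    """Build an API URL; optional parameters are left out when None."""
    if kind not in ENDPOINTS:
        raise ValueError(f"unknown endpoint: {kind}")
    # Generic geographic sanity checks (the data coverage may be narrower).
    if not -90 <= lat <= 90 or not -180 <= lng <= 180:
        raise ValueError("lat/lng must be in decimal degrees")
    params = {"lat": lat, "lng": lng}
    if precision is not None:
        if precision not in ("low", "high"):
            raise ValueError('precision must be "low" or "high"')
        params["precision"] = precision
    for name, flag in (("simplify", simplify), ("beautify", beautify)):
        if flag is not None:
            params[name] = "true" if flag else "false"
    return f"{BASE}{ENDPOINTS[kind]}?{urlencode(params)}"


print(build_url("watershed", 43.253, -77.609, precision="high"))
```

The printed URL should match the first example call above.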
    

Status Codes

If the server successfully handles your request and creates a watershed, you will get the HTTP status code 200, and the body of the response will have mimetype="application/json".

If there is a problem with one of your inputs, you will get a 400 Bad Request status code. For any other kind of error where the app cannot create a watershed (for example, your outlet point is over the ocean), you will get a 404 Not Found status code.

If you see a status code of 500, Internal Server Error, that means something is wrong, so please send me an email with as much detail as possible and I'll see if I can fix it.
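
In a script, you will probably want to turn these status codes into readable messages. The following is a sketch under stated assumptions: the helper names explain_status and fetch_geojson are hypothetical, the messages simply paraphrase the descriptions above, and the 60-second timeout is an arbitrary choice.

```python
import json
import urllib.error
import urllib.request

# Messages paraphrase the status code descriptions on this page.
MESSAGES = {
    200: "Success: the response body is GeoJSON.",
    400: "Bad Request: there is a problem with one of your inputs.",
    404: "Not Found: the app could not create a result for this point.",
    500: "Internal Server Error: something is wrong on the server.",
}


def explain_status(code):
    """Return a human-readable explanation for an HTTP status code."""
    return MESSAGES.get(code, f"Unexpected status code {code}.")


def fetch_geojson(url, timeout=60):
    """Request a URL and return the parsed GeoJSON as a dict."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx and 5xx responses.
        raise RuntimeError(explain_status(err.code)) from err
```

If you are making hundreds of requests, consider pausing briefly between calls so that you do not overload the server.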

Response

The response is plain text GeoJSON. Inside is a FeatureCollection. For watersheds, the FeatureCollection has just one Feature, a Polygon that represents the watershed boundary. (Sometimes the app produces multi-part polygons, but the extra parts are usually single pixels. I programmed the app to discard these, as it makes managing the data simpler.)

Example Watershed GeoJSON

{
  "type": "FeatureCollection",
  "features": [
      {
        "type": "Feature",
        "geometry": {
          "type": "Polygon",
          "coordinates": [
            [
              [-80.51958, 40.11708], [-80.51375, 40.11791], ... , 
              [-80.51958, 40.11708]
            ]
          ]
        },
        "properties": {
          "area_km2": "421",
          "outlet_lat": 40.23,
          "outlet_lng": -80.61
        }
      }
  ]
}
    

The rivers GeoJSON looks similar. But here, the FeatureCollection will usually contain multiple Features, as each river reach, or segment, is a separate Feature.

Example Rivers GeoJSON

{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {
        "type": "LineString",
        "coordinates": [
            [-77.64167, 43.12167], [-77.66083, 43.11083], 
            [-77.67417, 43.1075], [-77.67833, 43.09583]
        ]
      },
      "properties": {
        "comid": 72056019,
        "sorder": 4
      }
    },
    
    ...
    
    {
      "type": "Feature",
      "geometry": {
        "type": "LineString",
        "coordinates": [
            [-77.92583, 42.06417], [-77.93083, 42.07083], 
            [-77.93667, 42.0725], [-77.9475, 42.06917]
        ]
      },
      "properties": {
        "comid": 72058947,
        "sorder": 1
      }
    }
  ]
}
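
Once you have a response, reading it in Python takes only a few lines. Here is a minimal sketch using the standard json module. The sample text reuses the two river reaches shown above; the function name summarize_rivers is made up for this example.

```python
import json

# Two river reaches copied from the example above.
SAMPLE = """
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "geometry": {"type": "LineString", "coordinates": [
        [-77.64167, 43.12167], [-77.66083, 43.11083],
        [-77.67417, 43.1075], [-77.67833, 43.09583]]},
     "properties": {"comid": 72056019, "sorder": 4}},
    {"type": "Feature",
     "geometry": {"type": "LineString", "coordinates": [
        [-77.92583, 42.06417], [-77.93083, 42.07083],
        [-77.93667, 42.0725], [-77.9475, 42.06917]]},
     "properties": {"comid": 72058947, "sorder": 1}}
  ]
}
"""


def summarize_rivers(geojson_text):
    """Return (comid, stream order, vertex count) for each river reach."""
    collection = json.loads(geojson_text)
    summary = []
    for feature in collection["features"]:
        props = feature["properties"]
        coords = feature["geometry"]["coordinates"]
        summary.append((props["comid"], props["sorder"], len(coords)))
    return summary


for comid, sorder, n_vertices in summarize_rivers(SAMPLE):
    print(comid, sorder, n_vertices)
```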
    

Usage with QGIS

With the API, you can plug the URL right into QGIS:

Screenshot of QGIS Add Layer dialog showing the option to add data from the Global Watersheds API

Adjust the symbology, and you can have something that looks like this:

Screenshot of QGIS showing the watersheds API in use

Origin

During my PhD research in hydrology and remote sensing, I needed accurate delineations for thousands of watersheds. I wasn't happy with any of the existing software that I tried, so I wrote my own in Python. I "invented" a hybrid method, using both vector and raster data. As a result, it is both fast and accurate. More importantly, it works on a regular laptop -- no need for a supercomputer.

I write "invented" in quotes because I could not find this method described in the literature. Several months later, I found a conference paper from 1999 describing the method, although the authors did not use the word "hybrid." So, let's say I "rediscovered" it. 🫠

I thought my script was useful, so I shared it on GitHub. This is great for other programmers and scientists. But what about the other 99% of people? I decided to convert the scripts into an interactive website where anyone can find a custom watershed.

As of 2022, it was the only (free) app for finding watersheds anywhere in the world. It is also much faster than any other method I've tried.

I hope you enjoy using the Global Watersheds app. I am endlessly fascinated by exploring waterways around the world. I hope the app helps you see the world in a new way!

Inspiration

Here are some sites I found during my research that are somewhat related:

Problem areas

In many areas, automated watershed delineation does not give good results. Delineation by computer is not always correct. If you are using the results for science or engineering, it is imperative that you check your results.

There are several causes for errors. Sometimes they are caused by inaccurate input data, and other times they are the result of the algorithms. Typical problem areas include:

River delta with branching and braided channels
At the outlet of the Lena River in northern Russia, the river forms a broad delta with many branched and braided channels. The app tries to pick the most likely flow pathway, but as you can see, it is only one of hundreds of possibilities.

Watershed Data Report

Example of the watershed data report, showing facts and figures about population and land cover.

New feature added Nov. 2024!

After you’ve created a new watershed, you’ll see a new blue button on the left-side menu. Click this button to create a Watershed Data Report. The report summarizes a variety of information about the watershed:

  1. Political Boundaries
  2. Population
  3. Land Cover
  4. Hydrology
  5. GRACE Total Water Storage
  6. Irrigation
  7. Dams

These data mostly have to do with human impacts on watersheds. Some of the information in the report comes from the thematic layers that you can display on the map (for example, population, irrigated area, and land cover). These data are mostly gridded, or raster datasets. In order to summarize the values, the app calculates sums or averages over the pixels that overlap the watershed. In geographic information science, or GIS, this is referred to as “zonal statistics.” For a discussion of how I do these calculations, see Section 3.4 in my PhD thesis, Calculating Basin Means of [Gridded] Earth Observation Variables. The discussion there centers around Earth Observation data of the water cycle, but applies to any kind of gridded data.
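
To give a feel for zonal statistics, here is a toy sketch in plain Python. It is not the method from the thesis: it assumes each pixel comes with a "coverage" value between 0 and 1 (the fraction of the pixel inside the watershed), and it ignores the fact that pixel areas change with latitude. The name zonal_stats is hypothetical.

```python
def zonal_stats(values, coverage):
    """Return (weighted sum, weighted mean) of the pixels inside a zone.

    values:   2D list of pixel values (None means no data)
    coverage: 2D list, fraction of each pixel inside the watershed (0 to 1)
    """
    total = 0.0
    weight = 0.0
    for value_row, coverage_row in zip(values, coverage):
        for value, fraction in zip(value_row, coverage_row):
            if value is None or fraction <= 0:
                continue  # skip no-data pixels and pixels outside the zone
            total += value * fraction
            weight += fraction
    mean = total / weight if weight else None
    return total, mean


# Toy example: population counts on a 2 x 2 grid.
population = [[10, 20], [30, 40]]
inside = [[1.0, 0.5], [0.0, 0.25]]
print(zonal_stats(population, inside))
```

The weighted sum suits counts such as population, while the weighted mean suits rates or densities.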

I've also included some text in each section containing a brief introduction on how human and environmental factors affect watersheds. If you have any questions about these data, or suggestions for other data layers I might show, don't hesitate to contact me. 😀

Missing Data

Unfortunately, neither MERIT-Hydro nor HydroSHEDS has data for some islands. Hawaii is missing, as are the Azores. However, there is data for the Canary Islands, Fiji, Tuvalu, the Galapagos, and many others. I recently found a dataset called HDMA that includes data for many of these smaller islands, and I may consider adding it. Or maybe small islands will be included in HydroSHEDS version 2?

Trace upstream or downstream

The app offers two main functions: "Downstream - trace flow" or "Upstream - delineate watershed."

You can use keyboard shortcuts to quickly switch back and forth:

Tracing the flow downstream gives you an (approximate) flow path to the ocean or to an inland sink -- what hydrologists call an endorheic basin. The flow path is based on following the steepest slope on a digital map of the earth's elevation. The flow path algorithm cannot deal with distributaries, such as those that occur in deltas or braided channels. So, the flowpath may not always follow the route you expect.
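
The idea of following the steepest slope can be illustrated with a toy elevation grid. This is only a sketch of the general technique, not the app's code: the function name trace_downstream and the tiny 3 x 3 grid are made up, and real hydrography datasets usually store precomputed flow directions instead of recalculating slopes.

```python
import math


def trace_downstream(dem, start):
    """Follow the steepest downhill neighbor until reaching a low point.

    dem:   2D list of elevations
    start: (row, col) of the starting cell
    """
    rows, cols = len(dem), len(dem[0])
    path = [start]
    r, c = start
    while True:
        best, best_slope = None, 0.0
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr or dc) and 0 <= nr < rows and 0 <= nc < cols:
                    # Drop in elevation divided by distance (diagonals are longer).
                    slope = (dem[r][c] - dem[nr][nc]) / math.hypot(dr, dc)
                    if slope > best_slope:
                        best, best_slope = (nr, nc), slope
        if best is None:
            return path  # no lower neighbor: a sink or the edge of the data
        path.append(best)
        r, c = best


dem = [
    [9, 8, 7],
    [8, 6, 5],
    [7, 4, 1],
]
print(trace_downstream(dem, (0, 0)))
```

Because each cell picks a single steepest neighbor, the path can never split, which is why distributaries cannot be represented.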

This app considers surface flow paths only. In other words, it does not consider water flowing underground in aquifers. It also ignores most man-made water transfers (in canals or pipelines).

Data Sources

I set up the app using data from three different sources. Here is a short description of each of the 3 options:

Lower-precision vs. higher-precision mode

Under Options, you can choose the precision of the analysis and the results. Higher-precision mode is the default, and the app will use it whenever you are zoomed in far enough (zoom level 9 or higher on the map) and have chosen MERIT-Hydro as the data source.

For watersheds with an area over 50,000 km², the app will automatically revert to lower-precision mode.

Higher-precision mode is only available with MERIT-Hydro data. I did not implement it for HydroSHEDS. While HydroSHEDS is an excellent dataset, I don't think it is accurate enough to justify doing more detailed calculations. I may change my mind when HydroSHEDS v2 is published.

There are two main differences between higher- and lower-precision mode. The first difference is in the level of detail in the output. If you zoom in on a river centerline or watershed boundary, you can see the jagged lines that come from extracting shapes from the pixels of the gridded source data. In low-precision mode, the watershed boundary is simplified and has fewer vertices. This makes it faster to process and to display.

The second difference has to do with the precision of the watershed boundary with respect to the outlet. In low-precision mode, the watershed boundary will not necessarily intersect the outlet point you requested. This is because, in low-precision mode, the app only uses vector data and methods. Your watershed will be assembled from a series of predefined "unit catchment" polygons. These have an average size of around 40 km², or 15 square miles. The boundary will always be some distance downstream of the outlet. For large watersheds, you may hardly notice this discrepancy, but for small watersheds, the error can be quite obvious.

Watershed delineation at high resolution
Higher-precision watershed. Example shown for Oswaya Creek at Ceres, New York, outlet at 42.0, -78.27.
Watershed delineation at low resolution
Lower-precision watershed. The watershed outlet has been relocated to a point downstream of the requested location; as a result, this watershed is about 10% larger.

In higher-precision mode, the app will try to find a watershed boundary that intersects the outlet point you chose. This requires an extra processing step using raster data and methods. (Raster refers to the use of gridded data.) Raster methods are slower than vector methods and need more data, but they can give more accurate results.

Even in higher-precision mode, the app uses vector methods whenever it can, and only uses raster methods to "split" the most downstream unit catchment. This hybrid method (raster and vector) is faster and more accurate than using either method alone.

Simplify or Beautify

New options added in June 2023. Only available in higher-precision mode with MERIT data.

The map data can look rather jagged, a result of being derived from a gridded dataset. By default, the Simplify option is selected, and the app will simplify polylines. The resulting geodata has fewer vertices, and will download and display more quickly. However, it can still look quite jagged and unrealistic when you zoom in. If you want nice smooth lines, you can choose the Beautify option. The following example shows the effect of these two different options.

Watershed at higher-resolution
Higher-precision
Simplified watershed and rivers
Simplified
Beautified watershed and rivers
Beautified

The beautify feature will make the file sizes larger. Please note that the effect is purely aesthetic! It does not make the lines any more accurate! The curves look nice, but they do not necessarily follow the true rivers or watershed boundaries. In fact, they are often a bit further from the truth.

For those interested in the technical details, the code is using the Douglas-Peucker algorithm for simplification. To smooth watershed boundaries, the app uses the Chaikin algorithm, also known as a corner-cutting algorithm. These methods were fairly easy to add because these functions are included in PostGIS. For the rivers, I'm using a Centripetal Catmull–Rom spline. This was also fairly straightforward to add -- the Wikipedia page has working Python code that I customized slightly. I found that the Chaikin method gave nicer-looking results for watersheds, and Catmull–Rom was better for rivers. By "nicer" I mean they look more like hand-drawn maps.
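
To show what corner cutting does, here is a small pure-Python sketch of the Chaikin algorithm. It is an illustration only (the app relies on PostGIS for this, as described above), and the function name chaikin and its arguments are made up for this example.

```python
def chaikin(points, iterations=1, closed=False):
    """Smooth a line by cutting corners (Chaikin's algorithm).

    points: list of (x, y) tuples. For a closed ring, do not repeat
            the first vertex at the end.
    """
    pts = list(points)
    for _ in range(iterations):
        n = len(pts)
        segments = n if closed else n - 1
        new = []
        for i in range(segments):
            (x0, y0), (x1, y1) = pts[i], pts[(i + 1) % n]
            # Replace each segment with points at 1/4 and 3/4 of its length.
            new.append((0.75 * x0 + 0.25 * x1, 0.75 * y0 + 0.25 * y1))
            new.append((0.25 * x0 + 0.75 * x1, 0.25 * y0 + 0.75 * y1))
        if not closed:
            new = [pts[0]] + new + [pts[-1]]  # keep the endpoints of an open line
        pts = new
    return pts


square = [(0, 0), (4, 0), (4, 4), (0, 4)]
print(chaikin(square, closed=True))
```

Each pass roughly doubles the number of vertices, which is one reason smoothed files are larger. The new vertices also move away from the original ones, so the smoothed line looks nicer without being any more accurate.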

River Centerlines

By default, the app will display the centerlines of upstream rivers in addition to the watershed boundary.

For large watersheds, the app will not display all the river reaches in our database. If it did, the map would be too "busy" and hard to read. Also, showing thousands of rivers on the map could make your web browser slow to a crawl or crash. To show the right amount of detail, the app prunes the river network. The app shows four orders of rivers, according to their Strahler number. For example, if the most downstream river reach is of order 6, we only show rivers that are order 3 and up. Small headwater streams of order 1 and 2 will not be shown.
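
The pruning rule can be written in a few lines of Python. This sketch assumes the rivers are GeoJSON-style features with a stream order property named sorder; the function name prune_by_order is hypothetical.

```python
def prune_by_order(features, orders_shown=4):
    """Keep only the top few Strahler orders of a river network."""
    if not features:
        return []
    max_order = max(f["properties"]["sorder"] for f in features)
    min_order = max_order - orders_shown + 1  # e.g. 6 - 4 + 1 = 3
    return [f for f in features if f["properties"]["sorder"] >= min_order]


# Toy network with one reach for each order from 1 to 6.
rivers = [{"properties": {"sorder": order}} for order in range(1, 7)]
kept = prune_by_order(rivers)
print([f["properties"]["sorder"] for f in kept])
```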

Strahler stream order
Streams ordered by their Strahler number. Illustration from the USGS, public domain.

Nevertheless, download files (except KML) contain all of the rivers in your watershed that are available from the source dataset, including all the little headwater streams. The files have an attribute named sorder. You can use this field to filter how many rivers to display. In ArcGIS, this is a Definition Query. In QGIS, you can open Layer Properties and use the Query Builder.

You can also use the sorder attribute to set the symbology of rivers (for example, color or line width). I found that it looks nice to set the line width proportional to the square root of the stream order.
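
As a quick illustration of the square-root rule, here is a tiny Python sketch. The function name line_width and the base width of 0.5 are arbitrary choices for this example; in desktop GIS software you would normally enter an equivalent expression in the symbology settings.

```python
import math


def line_width(sorder, base_width=0.5):
    """Line width proportional to the square root of the stream order."""
    return base_width * math.sqrt(sorder)


for order in (1, 4, 9):
    print(order, line_width(order))
```

With this scaling, an order-4 river is drawn twice as wide as an order-1 stream, instead of four times as wide.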

Rivers of the Mississippi watershed, unfiltered
If we try to display all the rivers in the Mississippi watershed, there is too much information, and the browser glitches. (Attempting to show 170,345 river reaches, or segments.)
Rivers of the Mississippi watershed, filtered
Pruning the river network to a suitable level of detail. Many smaller streams are not being shown (displaying 12,438 river reaches).

The river polyline features have two properties or attributes:

In addition, if you use the USGS data source, river downloads have the fields lengthkm and gnis_name. The latter is the name of the river or stream, derived from the US Geographic Names Information System. This is nice if you want to add labels to your maps.

What if your results look weird?

Sometimes the watersheds created by the app are weird or just look wrong. The app does not work that well for small watersheds. The source data is global, and it's not intended for detailed, local applications. If your results look odd, click a slightly different location and try again.

Also, if you're using HydroSHEDS, keep in mind that it is an older dataset, and less accurate in some places, particularly in the northern hemisphere above 60° latitude.

To get reliable results for watershed delineation, make sure you click on or near a stream. You can see the streams in the MERIT dataset by turning on the map layer "MERIT-Basins river reaches."

Why does the app show two different watershed outlets?

When you request a watershed in higher-precision mode, for the app to return meaningful results, it needs to "snap the pour point," or move the outlet to a river centerline in the source dataset. The app will try to automatically relocate the watershed outlet, or pour point, to coincide with a stream. Finding the correct river can be an art and a science. Several algorithms have been proposed for pour point snapping; the algorithm used by the app is a simple one. Getting good results often requires some trial and error. So, if your watershed is not what you expected, click somewhere else nearby and try again.
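
To make the idea of snapping concrete, here is a sketch of one common, simple approach: search a small window around the clicked cell and move the outlet to the cell with the highest flow accumulation. This is an assumption for illustration and not necessarily the rule the app uses; the function name snap_pour_point and the toy grid are made up.

```python
def snap_pour_point(flow_acc, point, radius=1):
    """Move a pour point to the highest flow accumulation cell nearby.

    flow_acc: 2D list with the number of upstream cells for each cell
    point:    (row, col) of the requested outlet
    radius:   search distance, in cells
    """
    rows, cols = len(flow_acc), len(flow_acc[0])
    r0, c0 = point
    best = point
    for r in range(max(0, r0 - radius), min(rows, r0 + radius + 1)):
        for c in range(max(0, c0 - radius), min(cols, c0 + radius + 1)):
            if flow_acc[r][c] > flow_acc[best[0]][best[1]]:
                best = (r, c)
    return best


acc = [
    [1, 1, 1, 1],
    [1, 2, 1, 9],
    [1, 3, 25, 1],
    [1, 1, 1, 40],
]
print(snap_pour_point(acc, (1, 1), radius=1))
print(snap_pour_point(acc, (1, 1), radius=2))
```

In this toy grid, a larger search radius snaps to a different, bigger stream. That is one reason why clicking a slightly different location can change the result.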

In lower-precision mode, the app will construct your watershed by merging lots of small "unit catchment" polygons. The app relocates the watershed outlet to coincide with the boundary of a unit catchment. The new outlet may be a kilometer or more from the outlet you requested. For large watersheds, this should not make a big difference. For smaller watersheds, you will usually get better results with higher-precision mode.

Why does it sometimes take a long time to delineate a watershed?

The app finds all the "unit catchment" polygons that make up the watershed and merges them together. In geographic information sciences, this is called a dissolve or a unary union. This is usually the slowest step in our calculations. For enormous watersheds (Nile, Amazon, Mississippi, Congo...), it can take a few minutes to merge thousands of little polygons into one big polygon. The good news is that the app only does this work once, and the result is saved and reused (what programmers call memoization). The next time someone requests the same watershed, or one downstream of yours, it should be very fast. So, the more you use the app to delineate watersheds around the world, the faster it will be (for everyone) in the future.
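
Here is a toy Python sketch of why saved results help downstream requests too. It is only an analogy: it adds up areas instead of merging polygons, it uses an in-memory cache (functools.lru_cache) where the app saves results on the server, and the network and area values are invented.

```python
from functools import lru_cache

# Hypothetical toy network: the catchments directly upstream of each catchment.
UPSTREAM = {"A": ("B", "C"), "B": ("D",), "C": (), "D": ()}
AREA_KM2 = {"A": 40, "B": 35, "C": 50, "D": 45}

calls = []  # records which catchments actually had to be computed


@lru_cache(maxsize=None)
def watershed_area(catchment_id):
    """Total area upstream of (and including) a unit catchment."""
    calls.append(catchment_id)
    upstream_total = sum(watershed_area(u) for u in UPSTREAM[catchment_id])
    return AREA_KM2[catchment_id] + upstream_total


first = watershed_area("B")   # computes B and D
second = watershed_area("A")  # reuses the saved result for B
print(first, second, calls)
```

The second request only has to compute A and C, because the result for B (which already includes D) was saved by the first request.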

Why do the rivers look unrealistic when I zoom in?

Jagged river centerline derived from raster elevation data

The rivers can look jagged because of the conversion from a raster, or gridded, dataset to vector polylines. If you're making a map, and you're concerned about appearances, there are a few things you can try. First, if your watershed is in the continental United States, you can try selecting "USGS" as the data source. Their map data is usually quite good.

Second, you can experiment with the "Simplify" and "Beautify" options added in June 2023. These won't make the river centerlines more accurate, but they may look nicer.

Finally, you could look for more detailed river data for your maps and clip it to the watershed boundary. For large regions, Natural Earth may be suitable. In the United States, the National Hydrography Dataset is excellent. In Canada, there is CanVec rivers, lakes, and glaciers. You will find many others if you search for "hydrography + GIS + country name" as many countries and regions publish geodata. Otherwise, OpenStreetMap data can be quite good, if inconsistent, and is available globally. You will need to filter the data to select waterways of various types, which can be a little tricky. Luckily, the Yamazaki Lab has done this work already and shared the results: see OSM Water. This dataset is enormous (7 GB, gzipped), so you will probably need to extract a portion in order to work with it in desktop GIS software. It is also a snapshot from 2021, so more up-to-date data may be preferable.

Hybrid Method for Watershed Delineation

For those interested in technical details, here is some more information on the algorithm that the app uses to delineate watersheds. The method makes use of two distinct classes of data, vector and raster, and uses different methods with each type of data. This "hybrid" method is often faster and more accurate than using either type of data by itself.

By way of background, there are many software tools for automated (computerized) watershed delineation. Nearly all of these use raster (or gridded) terrain data called a "digital elevation model" or DEM. To get the most accurate watershed boundaries, you want to use the highest resolution data that is available. For global studies, the current state of the art is to use a DEM with 3 arcsecond resolution (about 90 meters near the equator).

Things get complicated when delineating large watersheds with high-resolution data. First, you need to load huge datasets, which may use more memory than an ordinary laptop or desktop computer. Even if you have lots of memory, it may take a long time to process the data, especially for large watersheds.

As an alternative, one can use vector data. This is usually faster but less accurate. Vector data is made up of shapes -- points, polylines, and polygons. Examples of vector hydrography datasets are the National Hydrography Dataset (US), HydroBASINS, and MERIT-Basins. In these datasets, the land surface is typically divided into thousands of polygons referred to as subwatersheds or unit catchments.

To find a watershed using vector data, you search for all the unit catchments that are upstream of a point. This can be done efficiently using a network analysis algorithm. Using raster methods, the building blocks for a large watershed are millions of small pixels. By contrast, vector methods use a few thousand polygons, each of which covers a much larger area than a pixel. As a result, processing is usually much faster. In the case of MERIT-Basins, the average size of a unit catchment is 40 km². In contrast, the pixels in MERIT-Hydro are around 0.008 km². Thus, unit catchments are about 5000 times bigger than pixels.
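
The upstream search can be sketched as a breadth-first traversal of the river network. In this minimal Python example, the toy table of downstream connections and the function name upstream_catchments are invented; real datasets store similar "next downstream" links as an attribute of each unit catchment.

```python
from collections import deque

# Hypothetical toy network: catchment id -> id of the next catchment
# downstream (None means the catchment drains to the sea or a sink).
DOWNSTREAM = {1: None, 2: 1, 3: 1, 4: 2, 5: 2, 6: 3, 7: None}


def upstream_catchments(downstream, home_id):
    """Return the home catchment plus every catchment upstream of it."""
    # Invert the table so we can look up what flows into each catchment.
    flows_into = {}
    for catchment_id, next_down in downstream.items():
        flows_into.setdefault(next_down, []).append(catchment_id)

    found = []
    queue = deque([home_id])
    while queue:
        current = queue.popleft()
        found.append(current)
        queue.extend(flows_into.get(current, []))
    return found


print(upstream_catchments(DOWNSTREAM, 2))
print(upstream_catchments(DOWNSTREAM, 1))
```

Each catchment is visited once, so the search stays fast even for networks with many thousands of unit catchments.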

To find the watershed boundary, you merge the selected unit catchments into a single polygon. In geographic science, this is called "dissolve" or "unary union." This operation can usually be done on an ordinary desktop computer.

However, using vector data alone yields imperfect results. The resulting watershed is always going to be a little too big, because it includes the unit catchment containing the outlet point. For large watersheds, the error may be relatively small, and barely noticeable. But for small watersheds, the error in the watershed area may be unacceptably large. And even for the largest watersheds, when you zoom in on the map to the area near the outlet, the results just look wrong because the watershed boundary does not go through the outlet point as it should.

When I was working on this problem, my PhD advisor suggested I try a method that combines both vector data and raster data. This "hybrid" method is the best of both worlds since it combines the speed of vector-based methods with the accuracy of raster-based methods.

There are four steps in the hybrid method of watershed delineation:

  1. Identify the unit catchment in which the outlet point is located. This is the terminal or "home" unit catchment.
  2. Find the set of unit catchments that are upstream of this unit catchment.
  3. Split the home unit catchment to find the portion of it which is upstream of the outlet.
  4. Combine the split catchment from (3) with the upstream catchments from (2), and merge and dissolve. The result of this step is a single polygon whose boundary represents the watershed of the outlet point.

In step 3, we use conventional raster-based methods, which are more costly. However, we only use detailed raster methods on a relatively small area. The app opens raster datasets (flow direction, flow accumulation) for analysis in "windowed reading" mode. This means we only read into computer memory the portion of the rasters that are within the boundaries of the home unit catchment. Reading a small piece of the raster dataset uses much less memory than reading the entire file into memory.
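
The raster step can be illustrated with a toy flow direction grid, standing in for the small window that covers the home unit catchment. This is a sketch of the general technique rather than the app's code: directions are stored here as (row, column) offsets for readability, whereas real flow direction rasters use integer codes, and the name upstream_cells is hypothetical.

```python
S, E = (1, 0), (0, 1)  # flow direction offsets: south and east

# Toy 3 x 3 window; each cell points to the neighbor it drains into.
FLOW_DIR = [
    [S, S, S],
    [E, S, S],
    [E, E, E],
]


def upstream_cells(flow_dir, outlet):
    """Return the set of cells that drain through the outlet cell."""
    rows, cols = len(flow_dir), len(flow_dir[0])
    basin = {outlet}
    stack = [outlet]
    while stack:
        r, c = stack.pop()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if not (dr or dc) or not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                if (nr, nc) in basin:
                    continue
                fr, fc = flow_dir[nr][nc]
                if (nr + fr, nc + fc) == (r, c):  # neighbor drains into this cell
                    basin.add((nr, nc))
                    stack.append((nr, nc))
    return basin


print(sorted(upstream_cells(FLOW_DIR, (2, 1))))
```

In this example, six of the nine cells drain through the outlet at row 2, column 1. The right-hand column drains out of the window without passing the outlet, so it is left out of the split catchment.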

In step 4, we may optionally remove any internal "donut holes" from the watershed.

The hybrid method is not new, but it has not been well described or documented. With the publication of my codebase and online demo, I hope to increase awareness of the hybrid method. Another goal is to share open-source code that others can adapt and improve.

Watershed Consciousness

My impression is that the app is quite popular with scientists and engineers. And I'm gratified that it has become a useful tool for professionals.

Yet, my main goal in launching the app was to increase “watershed consciousness.” Knowing where you are in a watershed helps you understand your environment. Watersheds help us understand the movement and distribution of plants, animals, and nutrients. On a practical level, watersheds can help you understand where your water supply comes from, and where pollution goes.

Watershed consciousness is an old idea rooted in 1960s environmentalism and the back-to-the-land movement. But I believe it’s still relevant. We all live in a watershed. And we are all downstream of someone else.