Lesson 3: The Data Discovery Process & Approaching a Mapping Project

3.1 Data Discovery & Mapping Projects

In this lesson, you will be introduced to the data discovery process, as well as how to use this process to approach a mapping project.

We will help you understand phases of the data discovery process and determine if a data source is appropriate for your research needs.


3.2 The Data Discovery Process

The short presentation below will introduce you to the data discovery process and how to evaluate your search results. With an emphasis on GIS data, Jaime Martindale describes how elements of a standard research process play out when trying to locate spatial and numeric data sources. Don’t forget to answer the quiz questions in the video!

Pro Tip:

You can switch between, as well as move, the slides and video presentation using the video player.

This short presentation will introduce you to the data discovery process and how to evaluate your search results. With an emphasis on GIS data, I will describe how elements of a standard research process play out when trying to locate spatial and numeric data sources.

Looking at the four basic elements of the research process, we can further define these steps when we are talking about GIS data.

First, when formulating a research question, you begin to think about the theme of your map or what you want to analyze.

In step two, you begin searching for relevant data, using a broad or targeted approach.

Step 3 brings us to the evaluation stage. Use geospatial metadata to help you learn more about the dataset and whether it is suitable.

Finally, in step 4, you use the data in your project. This may involve several geoprocessing tasks, analysis tools, or cartographic design techniques.

Finding data is not a linear process. It requires deliberate exploration. You might begin with a very broad approach like a Google search. With the amount of data on the web, this can be successful if you are willing to wade through many results. How you search is important. Think about the data theme, the time period, and the geographical location you need. Also keep in mind there is a lot of crowd-sourced data on the web.

As you step through your research process, you will begin to see the importance of the evaluation phase. You can use existing research guides or other annotated lists, clearinghouses, and websites to begin to narrow your initial search options. There is a GIS research guide available from the UW Libraries website. Use it to see a detailed list of popular sources for Wisconsin, the US, and around the world. A second option for a more targeted search approach is to think about what government agency or organization is most likely to produce the data you need. Are you looking for energy information in the US? Or climate data for the entire world? A targeted search directly within an agency website may turn up what you need.

Evaluating the data you have found is perhaps one of the most important steps in the research process. You should be looking at these criteria to guide your decision-making process. Who is the author? Can you determine the accuracy of the information? Can you evaluate the objectivity of the data? What is the time period of the content? Does the geographic extent meet your needs? How about the attributes? These basic evaluation criteria will force you to ask questions of your data.

Keep in mind that many times the data itself presents certain limitations. The evaluation phase is where you determine which limitations you can live with, or those that may mean what you’ve found won’t work after all. Use geospatial metadata to assist in the evaluation process. Data without such documentation should be used with caution.

Metadata describes for us the most basic pieces of information about a dataset to help us determine its suitability. Each element of the metadata record can be mapped to the evaluation criteria to help guide your process. Data may not have a formal metadata record attached to it, so look for any details you can find from the website itself, or any text summaries or descriptions given by the data producer that will help you evaluate it. Metadata in any form will also be essential when you are at the point of citing your data sources.
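The idea of mapping metadata elements to the evaluation criteria can be sketched in Python. This is a minimal illustration under stated assumptions, not a formal metadata standard: the field names (originator, accuracy_report, and so on), the evaluate_metadata helper, and the sample record are all hypothetical.

```python
# Hypothetical mapping from evaluation criteria to metadata fields.
# The field names are illustrative assumptions, not a formal standard.
CRITERIA_TO_FIELD = {
    "author": "originator",
    "accuracy": "accuracy_report",
    "objectivity": "purpose",
    "time period": "time_period",
    "geographic extent": "bounding_box",
    "attributes": "attribute_list",
}


def evaluate_metadata(record):
    """Return the evaluation criteria that the record cannot answer."""
    unanswered = []
    for criterion, field in CRITERIA_TO_FIELD.items():
        value = record.get(field)
        # Treat missing, None, or empty values as undocumented.
        if not value:
            unanswered.append(criterion)
    return unanswered


# Made-up record for demonstration only.
sample_record = {
    "originator": "Example County Land Information Office",
    "purpose": "Parcel mapping for tax assessment",
    "time_period": "2019",
    "bounding_box": (-90.0, 42.5, -89.0, 43.5),
    "attribute_list": [],
}

gaps = evaluate_metadata(sample_record)
print(gaps)
```

Any criterion that comes back unanswered is a prompt to look for details on the producer's website or in accompanying text summaries before deciding whether to use the data.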

To summarize this GIS data discovery process overview: We can apply basic research steps to geospatial data discovery. Your approach can be broad, targeted to a specific agency, or you may find that you use both methods. The data discovery process is not linear. Evaluating your search results is an important step in determining whether or not a dataset is suitable for your purposes. Using the metadata that accompanies a dataset is a valuable way to find out more about the author, the purpose of the dataset creation, time period of the content, where it was published, etc.

The data discovery process often takes longer than we anticipate. Keep these basic concepts in mind as you begin your geospatial data research and they will help guide your process.


3.3 Approaching a Mapping Project

In this presentation, Jaime Martindale will guide you through the various phases of a mapping project, with an emphasis on the data discovery aspect. It follows the lecture on the data discovery process above, and highlights things to consider once you have the data you want to use. Don’t forget to answer the quiz questions in the video!

This presentation is an attempt at guiding you through the various phases of a mapping project, with an emphasis on the data discovery aspect. It follows the lecture on the data discovery process, and highlights things to consider once you have the data you want to use.

The entire process of a mapping project as you approach it from the data discovery perspective is rarely linear. You will find yourself immersed in searching, evaluating, refining, and sometimes circling back to initial sources that you may have ruled out for one reason or another.

In a nutshell, this overview is extremely simplified. Once you collect your data, you need to focus on understanding how it’s organized and what format it’s in.

You need to consider the constraints the data may have, and of course, document your sources along the way. The very first step in a mapping project focuses on the idea.

Initially, you should be thinking about what the audience will take away from your map. Really think about what it is you want them to understand.

It may be that your idea reflects your interests or passion for a specific topic. But keep in mind that you may find yourself, at some point in your career, directed to create something that is not reflective of your personal interests.

Either way, these main themes are relevant. As mentioned in the data discovery lecture, looking for data is not a simple, straightforward process. Many people find they spend far more time looking for data than designing the final project itself. The process is iterative. You may find something useful, only to discover something else minutes later that is even better. Or conversely, you may find some data that doesn’t appear to be exactly what you want, only to come back to it later on.

The back-and-forth nature of the discovery process means that documenting what you find along the way is important. Once you’ve found data you think you’d like to use, as you work through the evaluation process you may be asking yourself these questions.

You may have found the perfect dataset, but you have to think about the geographical context. Maybe there is additional information needed to understand it fully? Do you recognize the data format? Can you work with it or does it need to be converted to something else? Does the native format of the data lend itself to an easy, straightforward process of mapping it or analyzing it spatially?

If the data is tabular, can you easily get it into a format that allows you to map it? If you’ve found a geospatial dataset, do you need to do any geoprocessing or re-projecting of it? Think about how much time some of this data manipulation may take, as it will add to your overall time to complete the project. Some datasets that you find will have limitations to them, or constraints that may make the information unusable for your purpose.
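As one example of getting tabular data into a mappable format, the sketch below converts CSV text with coordinate columns into GeoJSON using only the Python standard library. It assumes the table has columns named latitude and longitude holding decimal degrees; those column names, the csv_to_geojson helper, and the sample rows are hypothetical. It does no re-projecting, which would need a dedicated library.

```python
import csv
import io
import json


def csv_to_geojson(csv_text, lon_field="longitude", lat_field="latitude"):
    """Convert CSV text with coordinate columns into a GeoJSON FeatureCollection."""
    features = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            lon = float(row[lon_field])
            lat = float(row[lat_field])
        except (KeyError, ValueError, TypeError):
            # Skip rows without usable coordinates rather than guessing.
            continue
        properties = {k: v for k, v in row.items() if k not in (lon_field, lat_field)}
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": properties,
        })
    return {"type": "FeatureCollection", "features": features}


# Made-up rows for demonstration; the second row has no latitude.
sample_csv = """name,latitude,longitude
Station A,43.07,-89.40
Station B,,-88.00
"""

collection = csv_to_geojson(sample_csv)
print(json.dumps(collection, indent=2))
```

Rows without usable coordinates are skipped here, so it is worth comparing the number of features against the number of input rows to see how much cleanup the table still needs.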

You have to distinguish between the constraints you can deal with or work around, and those that prohibit the information from being useful. You may find the data isn’t at the resolution you were hoping for, and depending on the scale at which you want to map it, it may not be appropriate.

Perhaps the data was collected in geographic units that are either too detailed or too generalized for your needs. The data may have missing information in critical areas that you want to highlight. Not all data is free and open. You may have to be willing to pay for data or find an alternative that doesn’t have a cost, if available.
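One way to check for missing information before committing to a dataset is to count how many geographic units lack a value for the attribute you want to map. The sketch below is a minimal example; the unit_id key, the median_income attribute, the markers treated as missing, and the sample rows are all assumptions made for illustration.

```python
def missing_value_report(rows, field, missing_markers=("", None, "NA")):
    """Return (share_missing, unit_ids) for one attribute field."""
    # Collect the IDs of units whose value is absent or a known placeholder.
    missing_units = [row["unit_id"] for row in rows if row.get(field) in missing_markers]
    share = len(missing_units) / len(rows) if rows else 0.0
    return share, missing_units


# Made-up attribute table for demonstration only.
sample_rows = [
    {"unit_id": "tract-001", "median_income": "52000"},
    {"unit_id": "tract-002", "median_income": ""},
    {"unit_id": "tract-003", "median_income": "NA"},
    {"unit_id": "tract-004", "median_income": "61000"},
]

share, units = missing_value_report(sample_rows, "median_income")
print(f"{share:.0%} of units are missing a value: {units}")
```

Whether a given share of missing values is acceptable depends on where the gaps fall; a few missing units in the area you want to highlight can matter more than many elsewhere.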

These final points offer a summary of things to think about as you approach a mapping project and are in the process of looking for data.

Your research question helps guide your hunt for data, but you need to think about your audience too, and why they will care about your map. Searching for data is time-consuming and often takes longer than we think. Don’t forget about geographical context.

You may need to find additional datasets that help you make your point to a map reader. Datasets often come coupled with specific constraints. Be mindful of those that you can accept, and those that you absolutely cannot. Keep documentation along the way, including source information for data you find. This will be helpful when you need to cite data you decide to use. It will also be helpful if you need to circle back to any data you discovered early on.
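Keeping documentation along the way can be as simple as a running log of sources. The sketch below shows one possible structure written out as CSV; the SourceRecord fields, the status values, and the example entries (including the example.org URLs) are hypothetical and should be adapted to whatever your citation style requires.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class SourceRecord:
    """One dataset encountered during the search (hypothetical structure)."""
    title: str
    producer: str
    url: str
    accessed: str  # date you found it, e.g. "2024-01-15"
    status: str    # e.g. "using", "ruled out", "revisit"
    notes: str = ""


def write_source_log(records, stream):
    """Write the records to an open text stream as CSV with a header row."""
    writer = csv.DictWriter(stream, fieldnames=[f.name for f in fields(SourceRecord)])
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))


# Made-up entries for demonstration only.
log = [
    SourceRecord("County parcels", "Example County", "https://example.org/parcels",
                 "2024-01-15", "using", "Metadata record included"),
    SourceRecord("Statewide land cover", "Example Agency", "https://example.org/landcover",
                 "2024-01-16", "revisit", "Resolution may be too coarse"),
]

buffer = io.StringIO()
write_source_log(log, buffer)
print(buffer.getvalue())
```

Recording a status and a short note for every source, including the ones you rule out, makes it easier to circle back later and to cite the data you end up using.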