Make sure the working directory of your Stata file is the same as the working directory of the data file. Stata's working directory can be changed through: cd "directory path"

To load the dataset, we type: use "distance.

Syntax of geodist

The general syntax of the command is as follows: geodist lat1 lon1 lat2 lon2, generate(new_dist_var)

The first two arguments are the latitude and longitude of the first location, followed by the latitude and longitude of the second location. The latitude must always be written before the longitude. The generate() option is required and specifies the name of the new variable that will store the distance between the two locations. Other options can also be added.

By default, this command generates distances in kilometers. Whenever working with distances, or the geodist command, it is a good idea to store values in variables with the double format rather than the float format. The double format holds about 16 significant digits, against about 7 for float, so it is the more precise of the two.

Calculating the Distance of New York From Other Cities

Let's say we wish to find the distance of New York from other cities in the US. We call geodist with the first two arguments being the latitude (40.697132) and longitude (-73.931351), respectively, of New York. The latitude and longitude of the other cities are stored in the variables 'latitude' and 'longitude' respectively. The distance of each of these cities from New York is stored in a newly generated variable called 'km'. The distances stored in the new 'km' variable are air distances as opposed to road distances.
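As a minimal sketch, the command for this step, reconstructed from the description above, might look like the following. It assumes that the distance data, with the variables latitude and longitude, is already in memory. Because geodist is a user-written command distributed through SSC, the first two lines install it if it is missing.

```stata
* Install geodist from SSC if it is not already available
capture which geodist
if _rc ssc install geodist

* Distance in kilometers (the default unit) from New York to each city
* Assumes the dataset in memory holds the variables latitude and longitude
geodist 40.697132 -73.931351 latitude longitude, generate(km)

* Quick check of the range of the new variable
summarize km
```

Each observation should then hold its own distance from New York in 'km'.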
geodist reports air distances because it calculates the geodetic distance, which is the length of the shortest curve between two points along the earth's surface.

To compute the above distances in miles, we just add the option mile: geodist 40.697132 -73.931351 latitude longitude, gen(mile) mile

This generates a variable 'mile' with the distance of each city from New York stored in miles.

Haversine and Vincenty (1975) Formulae for Distance

By default, geodist uses the Vincenty (1975) formula, which treats the earth as an ellipsoid, to calculate the distance. If we require the Haversine formula, which treats the earth as a sphere, we add the option sphere. The Vincenty (1975) and Haversine formulas produce fairly similar measures of distance.

Using geodist with Longitudes and Latitudes Stored In a Variable

If our longitude and latitude values are stored in variables, and we do not want to type the full numbers into our commands manually, we simply use the respective variable names. Let's first generate variables that hold the longitude and latitude values for New York:

generate baselat = 40.697132
generate baselong = -73.931351

This generates two variables, 'baselat' and 'baselong', that are stored as float, Stata's default storage type. To illustrate the lack of precision that results from the float format, we run the geodist command again using these newly generated variables: geodist baselat baselong latitude longitude, generate(base)

Note that the variable 'base' shows a non-zero distance between New York and itself, even though a city's distance from itself should be zero. A float holds only about seven significant digits, so the coordinates are rounded when they are stored. It is therefore important to store coordinates and distances as double.

Pairwise Combinations of Latitudes and Longitudes

We now move on to finding which cities are located near (or within a certain distance of) others. For this example, we load another data set, 'industrialcities.dta', with observations of four industrial cities and their latitudes and longitudes stored in the respective variables.
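Before working with the second file, two points from the earlier sections can be sketched in code: the sphere option, and storing the base coordinates as double. This is only an illustration, assuming geodist is installed and the distance data with latitude and longitude is still in memory. The names km_sphere, baselat_d, baselong_d and base_d are hypothetical.

```stata
* Haversine (spherical) distance instead of the default Vincenty formula
geodist 40.697132 -73.931351 latitude longitude, generate(km_sphere) sphere

* Store the base coordinates as double so that no digits are rounded away
generate double baselat_d  = 40.697132
generate double baselong_d = -73.931351

* Repeat the variable-based calculation with the double-precision coordinates
geodist baselat_d baselong_d latitude longitude, generate(base_d)
```

With the coordinates held as double, 'base_d' should agree with the distances computed from the typed-in coordinates, rather than drifting as 'base' does.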
The industrialcities.dta file has longitude and latitude data on Iowa City, Boston, Houston and Chicago. We now want to know which cities in our distance.dta file are within 30 km of these four industrial cities.
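One possible way to approach this, not necessarily the one the original tutorial goes on to use, is to pair every industrial city with every city in distance.dta using Stata's cross command and then keep the pairs that are close enough. The sketch below assumes that both files sit in the working directory and that each has a city-name variable called city. That variable name, along with ind_city, ind_lat, ind_lon and dist_km, is hypothetical and should be adjusted to the actual data.

```stata
* Start from the four industrial cities and rename their variables
* so they do not clash with the variables in distance.dta
use "industrialcities.dta", clear
rename (city latitude longitude) (ind_city ind_lat ind_lon)

* Form every pairwise combination of industrial city and other city
cross using "distance.dta"

* Distance in kilometers for each pair, then keep pairs within 30 km
geodist ind_lat ind_lon latitude longitude, generate(dist_km)
keep if dist_km <= 30

* Show the nearby cities grouped by industrial city
sort ind_city dist_km
list ind_city city dist_km, sepby(ind_city)
```

Because cross creates one observation per pair, the number of rows grows as the product of the two file sizes, which is fine for four industrial cities but can become heavy for large files.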