This dataset contains the travel time (in hours) required to move between any pair of centroids of core-based statistical areas (CBSAs) in the United States for each year between 1971 and 2018, as computed in Testoni (2024). For a full description of the methodology and data sources used in the computations, please refer to the paper. More details on optimal travel routes are in Online Appendix A.
Variables:
cbsa1: The ID number of the origin CBSA
cbsa2: The ID number of the destination CBSA
year: The year the travel time is measured
cbsa1_name: The name of the origin CBSA
cbsa2_name: The name of the destination CBSA
cbsa1_cent_lon: Longitude of the origin CBSA centroid
cbsa1_cent_lat: Latitude of the origin CBSA centroid
cbsa2_cent_lon: Longitude of the destination CBSA centroid
cbsa2_cent_lat: Latitude of the destination CBSA centroid
time: Travel time (in hours)
mode: The optimal mode of transportation (car or flight)
connections: The number of connecting flights in the optimal travel route, ranging from 0 to 2. A value of 0 indicates a direct flight between the CBSAs. Routes with more than 2 connections are not considered when computing the optimal travel route. Missing if the optimal mode of transportation is driving (i.e., if mode = "car").
origin_code: Origin airport code. Missing if the optimal mode of transportation is driving (i.e., if mode = "car").
dest_code: Destination airport code. Missing if the optimal mode of transportation is driving (i.e., if mode = "car").
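The missing-value rules for connections, origin_code, and dest_code can be checked with a short script. The following is a minimal sketch in Python, not part of the original dataset documentation. It assumes the data are available as CSV with the column names listed above and that missing values appear as empty strings; the file format and missing-value encoding are not stated here, so adjust accordingly. It also assumes that flight rows always carry all three fields. The sample rows, including the CBSA IDs and airport codes, are made up for illustration.

```python
import csv
import io

# Hypothetical sample rows using the documented column names.
# All values (IDs, times, airport codes) are invented for illustration.
SAMPLE = """cbsa1,cbsa2,year,time,mode,connections,origin_code,dest_code
10100,10140,1990,5.2,car,,,
10100,10180,1990,3.1,flight,1,AAA,BBB
"""

def check_row(row):
    """Return True if the row follows the documented missingness pattern.

    Assumption: a missing value is encoded as an empty string.
    """
    flight_fields = [row["connections"], row["origin_code"], row["dest_code"]]
    if row["mode"] == "car":
        # Driving routes should have no flight-related information.
        return all(value == "" for value in flight_fields)
    if row["mode"] == "flight":
        # Flight routes should have all three fields, with 0 to 2 connections.
        if any(value == "" for value in flight_fields):
            return False
        return int(row["connections"]) in (0, 1, 2)
    # Any other mode value is not described in the variable list.
    return False

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
results = [check_row(row) for row in rows]
```

To run the same check on the real data, replace io.StringIO(SAMPLE) with an open file handle to the dataset.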
Citation:
If you use this dataset in your research, please cite:
Testoni, M. (2024). Transportation networks and competition in the market for corporate control. Strategic Management Journal, 45(6), 1180-1208.
Files:
This dataset contains the text-based patent similarity scores between all companies (identified by PERMCOs) in the CRSP-Compustat Merged universe from 1980 to 2018, as computed in Testoni (2022). For a full description of the similarity measure and the methodology and data sources used in the computations, please refer to Online Appendices A and B of the paper.
Variables:
permco1 and permco2: CRSP company identifiers representing each pair of firms. The similarity measure is symmetric, meaning the score for company A vs. B is the same as B vs. A.
year: The year the similarity is measured. The similarity is computed considering all patents filed by the two firms in the preceding five years.
pattxpr: The uncalibrated similarity score, ranging from 0 (no overlap in keyword distributions) to 1 (identical keyword distributions across patent portfolios). The paper uses a calibrated version of the measure, chosen so that it has the same level of coarseness as a classification based on the three-digit USPTO technological classes. If you require the calibrated similarity measure used in the paper, apply the following transformation:
calibrated score = max(pattxpr - 0.0609853, 0)
See page 3 of the Online Appendix for details.
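The calibration and the symmetry of the measure can be illustrated with a short Python sketch. This is not code from the paper. The helper names (calibrate, pair_key, calibrated_score) and the in-memory scores dictionary are hypothetical, and the PERMCOs and score values are made up. The order-independent key is only useful if the file lists each pair of firms once, which is an assumption and not something stated above.

```python
CALIBRATION_OFFSET = 0.0609853  # constant from the transformation above

def calibrate(pattxpr):
    """Apply calibrated score = max(uncalibrated score - offset, 0)."""
    return max(pattxpr - CALIBRATION_OFFSET, 0.0)

def pair_key(permco1, permco2):
    """Order-independent key: the score for (A, B) equals the score for (B, A)."""
    return (min(permco1, permco2), max(permco1, permco2))

# Hypothetical uncalibrated scores keyed by (pair, year); values are invented.
scores = {
    (pair_key(111, 222), 2000): 0.25,
    (pair_key(111, 333), 2000): 0.04,
}

def calibrated_score(permco1, permco2, year):
    """Look up a pair in either order; return its calibrated score or None if absent."""
    raw = scores.get((pair_key(permco1, permco2), year))
    return None if raw is None else calibrate(raw)
```

For example, calibrated_score(222, 111, 2000) and calibrated_score(111, 222, 2000) return the same value, and any uncalibrated score below the offset is mapped to 0.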
Citation:
If you use this dataset in your research, please cite:
Testoni, M. (2022). The market value spillovers of technological acquisitions: Evidence from patent‐text analysis. Strategic Management Journal, 43(5), 964-985.
Files: