Application Urls¶
Application Urls provide structure and operations on URLS where the file the URL refers to can’t, in general, simply be downloaded. For instance, you may want to refer to a CSV file inside a ZIP archive, or a worksheet in an Excel file. In conjunction with Row Generators, Application Urls are often used to refer to tabular data stored on data repositories. For instance:
- Stored on the web:
http://examples.com/file.csv
- Inside a zip file on the web:
http://example.com/archive.zip#file.csv
- A worksheet in an Excel file:
http://example.com/excel.xls#worksheet
- A worksheet in an Excel file in a ZIP Archive:
http://example.com/archive.zip#excel.xls;worksheet
- An API:
socrata+http://chhs.data.ca.gov/api/views/tthg-z4mf
The module defines an entry_point
, so other modules can extend the types of
URLs can that can produced by parse_app_url()
. For instance,
the pandas-reporter
module extends appurl
to access Census tables from Census
Reporter, using URLs such as censusreporter:B17001/140/05000US06073
Typical use – for downloading an archive and extracting a file from it – is:
from appurl import parse_app_url
from os.path import exists
url = parse_app_url("http://example.com/archive.zip#file.csv")
resource_url = url.get_resource() # Download the .zip file
target_path = resource_url.get_target() # Extract `file.csv` from the .zip
assert(exists(target_path)) # The path to file.csv