Resolving and Extending Urls

The primary interface is appurls.parse_url(), which finds and constructs an appurl.url.Url for a string. The function selects a Url class using two selection criteria. The first is the appurl.urls entry point; the second, applied to the classes that pass the first, is the Url.match method, tried in order of Url.match_priority. Here is the entry point configuration for the appurl package:

    entry_points = {
        'appurl.urls': [
            "* = appurl.url:Url",
            #
            "http: = appurl.web.web:WebUrl",
            "https: = appurl.web.web:WebUrl",
            "s3: = appurl.web.s3:S3Url",
            "socrata+ = appurl.web.socrata:SocrataUrl",
            #
            # Archive Urls
            ".zip = appurl.archive.zip:ZipUrl",
            #
            # File Urls
            ".csv = appurl.file.csv:CsvFileUrl",
            "file: = appurl.file.file:FileUrl",
        ]
    }

The key of each configuration line is a string that selects the matcher used in the first round of matching, and the value is the class to use for that matcher. The match strings are:

‘proto:’ The URL protocol, which is based on either the URL scheme or the scheme extension.
‘.format’ The target file format.
‘schemeext+’ The URL scheme extension.
‘*’ Any URL.
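
As a rough illustration of the four match strings, the sketch below derives the keys that a URL string could be compared against in the first round. This is not the library's code: the function name candidate_keys is hypothetical, and the rules it uses (the scheme extension is the part of the scheme before a "+", the protocol is the scheme extension when present and the scheme otherwise, the format is the file suffix of the path) are assumptions based on the descriptions above.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def candidate_keys(url):
    """Return first-round match strings for a URL (hypothetical helper).

    Assumed rules, not taken from the appurl source:
      - 'schemeext+' is the part of the scheme before a '+'
      - 'proto:' is the scheme extension if there is one, else the scheme
      - '.format' is the file suffix of the URL path
    """
    parsed = urlparse(url)
    scheme = parsed.scheme
    keys = ["*"]  # '*' matches any URL

    if "+" in scheme:
        extension, scheme = scheme.split("+", 1)
        keys.append(extension + "+")
        proto = extension
    else:
        proto = scheme

    if proto:
        keys.append(proto + ":")

    suffix = PurePosixPath(parsed.path).suffix
    if suffix:
        keys.append(suffix.lower())

    return keys


print(candidate_keys("http://example.com/archive.zip"))
print(candidate_keys("socrata+http://example.com/data.csv"))
```

Under these assumptions, the first call yields the keys '*', 'http:' and '.zip', so the Url, WebUrl and ZipUrl classes from the configuration above would all be candidates for the second round.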

Because these configurations are in the entry points, you can extend the AppUrls by declaring entries in the same appurl.urls entry point group in other Python packages.
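
For example, a separate package could register its own classes by declaring the same entry point group in its setup.py. The fragment below is only a sketch: the package name myurls and the classes GzipUrl and FtpUrl are hypothetical, and only the 'appurl.urls' group name and the "key = module:Class" form come from the configuration shown above.

```python
# Fragment of setup.py for a hypothetical package named "myurls".
# The dictionary would be passed to setuptools.setup(entry_points=...).
entry_points = {
    'appurl.urls': [
        # '.format' matcher: hypothetical class for gzip-compressed files
        ".gz = myurls.gzip:GzipUrl",
        # 'proto:' matcher: hypothetical class for FTP urls
        "ftp: = myurls.ftp:FtpUrl",
    ]
}
```

Once such a package is installed, its entries sit in the same group as the ones defined by appurl, so they can take part in the same first round of matching.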

The rowgenerators.appurls.parse_url function collects all of the URL classes that pass the initial matchers and sorts them by the Url.match_priority class property. Then it iterates through the matched classes in priority order, calling the Url.match method of each. The function constructs and returns an instance of the first class that matches.
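
The second round can be sketched with a few toy classes. This is a minimal sketch of the technique, not the library's implementation: the classes below are stand-ins that only borrow the names used above, select_url is a hypothetical function, match is assumed to be a class method that takes the URL string, and lower match_priority values are assumed to be tried first.

```python
class Url:
    """Toy stand-in for the base class; matches any URL."""
    match_priority = 100  # assumption: lower values are tried first

    def __init__(self, url):
        self.url = url

    @classmethod
    def match(cls, url):
        return True


class WebUrl(Url):
    """Toy stand-in that matches http and https URLs."""
    match_priority = 90

    @classmethod
    def match(cls, url):
        return url.startswith(("http:", "https:"))


class CsvFileUrl(Url):
    """Toy stand-in that matches URLs ending in .csv."""
    match_priority = 80

    @classmethod
    def match(cls, url):
        return url.endswith(".csv")


def select_url(url, candidates):
    """Sort candidate classes by priority and build the first that matches."""
    for cls in sorted(candidates, key=lambda c: c.match_priority):
        if cls.match(url):
            return cls(url)
    raise ValueError("No Url class matched: " + url)


candidates = [Url, WebUrl, CsvFileUrl]
print(type(select_url("http://example.com/data.csv", candidates)).__name__)
print(type(select_url("http://example.com/page", candidates)).__name__)
```

In this sketch the first call selects CsvFileUrl, because it has the lowest priority value among the classes whose match method accepts the URL, and the second call selects WebUrl. The base Url class acts as the fallback because its match method accepts everything and it is tried last.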
