Basic Usage
***********

Creating an AppUrl
==================

There are two ways to create an AppUrl: parse a string, or instantiate a
class. If you are starting from a string and, in particular, don't know what
the Url class should be, use :py:func:`.parse_app_url`. If you do know what
kind of Url you want to generate, use the :py:class:`Url` subclass directly.

After creating a URL, basic usage involves either manipulating it or fetching
it. There are two fetching methods: :py:meth:`Url.get_resource`, to download
files from the web, and :py:meth:`Url.get_target`, to extract a file from an
archive or other container. When the operation is unnecessary, such as getting
the resource for a resource URL that has already been downloaded,
:py:meth:`Url.get_resource` returns ``self``.

AppUrls have these components in addition to those of standard URLs:

- A scheme extension, which precedes the scheme and is joined to it with a '+'
- A target\_file, the first part of the URL fragment
- A target\_segment, the second part of the URL fragment, separated from the
  first by a ';'

The ``scheme_extension`` specifies the protocol to use with a standard web
scheme, inspired by GitHub URLs like ``git+http://github.com/example``. The
``target_file`` is usually the file within an archive. It is interpreted as a
regular expression. The ``target_segment`` may be either a name or a number,
and is usually interpreted as the name or number of a worksheet in a
spreadsheet file. Combining these extensions:

::

    ckan+http://example.com/dataset/archive.zip#excel.xlsx;worksheet

This URL may indicate the following: fetch a ZIP file from a CKAN server using
the CKAN protocol, extract the ``excel.xlsx`` file from the ZIP archive, and
open the ``worksheet`` worksheet.

The URLs define a few important concepts:

- resource\_url: The portion of the URL that defines only the resource to be
  accessed or downloaded. In the example above, the resource URL is
  'http://example.com/dataset/archive.zip'
- resource\_file: The basename of the resource URL: 'archive.zip'
- resource\_format: Usually, the extension of the resource\_file: 'zip'
- target\_file: The name of the target file: 'excel.xlsx'
- target\_format: The extension of the target\_file: 'xlsx'

Using AppUrls
=============

Typical use is:

.. code-block:: python

    from appurl import parse_app_url

    url = parse_app_url("http://example.com/archive.zip#file.csv")

    # Download the resource (archive.zip) into the cache
    resource_url = url.get_resource()

    # Extract the target (file.csv) from the downloaded archive
    target_path = resource_url.get_target()

The call to ``url.get_resource()`` downloads the resource file and stores it
in the cache, returning a ``File:`` URL pointing to the downloaded file. If
the file is an archive, the call to ``resource_url.get_target()`` extracts the
target file from the archive. If it is not an archive, the call just returns
the resource URL. The final result is that ``target_path`` is a Url pointing
to a file in the filesystem.

Parsing Strings
===============

.. autofunction:: appurl.parse_app_url

The URL Base Class
==================

.. autoclass:: appurl.Url
    :members:
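
To illustrate the component breakdown described under "Creating an AppUrl",
here is a minimal sketch that splits the example URL using only the Python
standard library. The ``split_app_url`` helper and its dictionary keys are
hypothetical, chosen to mirror the concept names above. It is not part of
``appurl`` and makes no claim about how the library parses URLs internally. It
also treats the ``target_file`` as a literal name, not as a regular
expression.

.. code-block:: python

    from os.path import basename, splitext
    from urllib.parse import urlsplit


    def split_app_url(url):
        """Split an AppUrl-style string into its documented components.

        Hypothetical helper for illustration only; not the appurl API.
        """
        parts = urlsplit(url)

        # The scheme extension is the part of the scheme before the '+'.
        if "+" in parts.scheme:
            scheme_extension, scheme = parts.scheme.split("+", 1)
        else:
            scheme_extension, scheme = None, parts.scheme

        # The fragment holds the target_file, optionally followed by ';'
        # and the target_segment.
        target_file, _, target_segment = parts.fragment.partition(";")

        # The resource URL drops both the scheme extension and the fragment.
        resource_url = "{}://{}{}".format(scheme, parts.netloc, parts.path)
        resource_file = basename(parts.path)

        return {
            "scheme_extension": scheme_extension,
            "resource_url": resource_url,
            "resource_file": resource_file,
            "resource_format": splitext(resource_file)[1][1:] or None,
            "target_file": target_file or None,
            "target_format": splitext(target_file)[1][1:] or None,
            "target_segment": target_segment or None,
        }


    components = split_app_url(
        "ckan+http://example.com/dataset/archive.zip#excel.xlsx;worksheet"
    )

    for name, value in components.items():
        print(name, "=", value)

For the example URL, the sketch yields the same values listed among the
concepts above, such as a ``resource_file`` of ``archive.zip`` and a
``target_format`` of ``xlsx``.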