Command Line
Usage: python -m gatelogue_aggregator [OPTIONS] COMMAND [ARGS]...
Options:
  --version   Show the version and exit.
  -h, --help  Show this message and exit.
Commands:
  drop-sources  Drop all `*Source` tables from the DB from the output of...
  run           actually run the aggregator
Run
Usage: python -m gatelogue_aggregator run [OPTIONS]
actually run the aggregator
Options:
  --cache-dir PATH                where to cache files downloaded from the
                                  Internet (preferably a temporary directory)
                                  [default: /tmp/gatelogue]
  --cache-duration INTEGER        how long to hold cached files retrieved from
                                  URLs for [default: 3600]
  --timeout INTEGER               how long to wait for a network request in
                                  seconds before aborting and failing
                                  [default: 60]
  --cooldown INTEGER              how long to wait before sending new requests
                                  to the same URL if `429 Too Many Requests`
                                  is received [default: 15]
  -o, --output PATH               file to output the result to, as an SQLite
                                  DB [default: data.db]
  -r, --report / -R, --no-report  print a report of all nodes after merger
                                  [default: r]
  -w, --max_workers INTEGER       maximum number of concurrent workers that
                                  download and process data [default: 8]
  -ce, --cache-exclude TEXT       re-retrieve data for these sources instead
                                  of loading from cache (separate with `;`,
                                  use `*` for all sources) [default: ""]
  -i, --include TEXT              sources to retrieve from (do not use with
                                  --exclude) (separate with `;`, use `*` for
                                  all sources) [default: ""]
  -e, --exclude TEXT              sources NOT to retrieve from (do not use
                                  with --include) (separate with `;`, use `*`
                                  for all sources) [default: ""]
  -h, --help                      Show this message and exit.
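
The options above can also be assembled programmatically. The following is a minimal Python sketch, not part of gatelogue_aggregator itself: `build_run_command` is a hypothetical helper, and the source names `SourceA` and `SourceB` in the usage comment are placeholders, since the valid source names are not listed in this help text. It only builds the argument list, joining source names with `;` as the help text describes and refusing to combine `--include` with `--exclude`.

```python
import sys


def build_run_command(output="data.db", include=(), exclude=(),
                      cache_exclude=(), max_workers=None, report=None):
    """Build an argv list for `python -m gatelogue_aggregator run`.

    Hypothetical helper for illustration only; it uses just the flags
    shown in the help text above and does not run anything.
    """
    # The help text says --include and --exclude should not be used together.
    if include and exclude:
        raise ValueError("use either include or exclude, not both")

    cmd = [sys.executable, "-m", "gatelogue_aggregator", "run",
           "--output", str(output)]
    # Multiple sources are passed as one value, separated with `;`.
    if include:
        cmd += ["--include", ";".join(include)]
    if exclude:
        cmd += ["--exclude", ";".join(exclude)]
    if cache_exclude:
        cmd += ["--cache-exclude", ";".join(cache_exclude)]
    if max_workers is not None:
        cmd += ["--max_workers", str(max_workers)]
    # report=None leaves the tool's default untouched.
    if report is True:
        cmd.append("--report")
    elif report is False:
        cmd.append("--no-report")
    return cmd


# Usage (placeholder source names; requires gatelogue_aggregator installed):
#   import subprocess
#   subprocess.run(build_run_command(include=["SourceA", "SourceB"]), check=True)
```

Passing the arguments as a list rather than a single shell string avoids having to quote `;` and `*`, which most shells would otherwise interpret.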
Drop Sources
Usage: python -m gatelogue_aggregator drop-sources [OPTIONS]
Drop all `*Source` tables from the DB from the output of `run`
Options:
  -i, --input PATH   path of the SQLite DB [default: data.db]
  -o, --output PATH  path to output the sourceless DB to [default: data-ns.db]
  -h, --help         Show this message and exit.
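
To illustrate what dropping `*Source` tables means for an SQLite DB, here is a small Python sketch using the standard `sqlite3` module. It is an assumption-laden illustration, not the tool's actual implementation: the table names `Node`, `NodeSource` and `AirportSource` are invented for the example, `*Source` is read as "any table whose name ends in `Source`", and the sketch works on an in-memory database instead of reading `data.db` and writing `data-ns.db`.

```python
import sqlite3


def source_tables(conn):
    """Return the names of all tables whose name ends in 'Source', sorted."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows if name.endswith("Source")]


def drop_source_tables(conn):
    """Drop every `*Source` table and return the names that were dropped."""
    dropped = source_tables(conn)
    for name in dropped:
        # Quote the identifier; double any embedded double quotes.
        quoted = '"' + name.replace('"', '""') + '"'
        conn.execute(f"DROP TABLE {quoted}")
    conn.commit()
    return dropped


# Demonstration on an in-memory DB with hypothetical table names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Node (i INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE NodeSource (i INTEGER, source INTEGER)")
conn.execute("CREATE TABLE AirportSource (i INTEGER, source INTEGER)")

dropped = drop_source_tables(conn)
remaining = [
    name for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
print("dropped:", dropped)
print("remaining:", remaining)
```

For real data, running `python -m gatelogue_aggregator drop-sources -i data.db -o data-ns.db` is the documented route; the sketch is only meant to show the effect on the schema.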