Command-Line Interface
Commands that interact with Elasticsearch (index, reindex, copy and expire) support the netrc file.
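As an illustration of the standard netrc format only, the sketch below writes a one-line netrc entry to a temporary file and reads it back with Python's standard-library netrc module. The machine name, login and password are placeholders that echo the example URIs on this page; how ocdsindex itself looks up the entry is not described here, so treat this as an assumption rather than its actual behaviour.

```python
import netrc
import os
import tempfile

# Placeholder entry in the standard netrc format: machine, login, password.
NETRC_TEXT = "machine host login user password pass\n"

with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "netrc")
    with open(path, "w") as f:
        f.write(NETRC_TEXT)
    # authenticators() returns a (login, account, password) tuple for a known machine.
    login, _account, password = netrc.netrc(path).authenticators("host")

print(login, password)
```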
sphinx
Prints the URL and the documents to index from the OCDS documentation as JSON.
ocdsindex sphinx DIRECTORY BASE_URL
DIRECTORY: the directory to crawl, containing language directories and HTML files
BASE_URL: the URL of the website whose files are crawled
Example:
ocdsindex sphinx path/to/standard/build/ https://standard.open-contracting.org/staging/1.1-dev/ > data.json
The output looks like:
{
"base_url": "https://standard.open-contracting.org/staging/1.1-dev/",
"created_at": 1577880000,
"documents": {
"en": [
{
"url": "https://standard.open-contracting.org/staging/1.1-dev/en/#about",
"title": "Open Contracting Data Standard: Documentation - About",
"text": "The Open Contracting Data Standard …"
}
]
}
}
The documents object contains one key per language, and each language's array contains one object per document.
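A minimal sketch of reading this output in Python, relying only on the keys shown above. The sample is inlined and abbreviated to one document, the helper name count_documents is hypothetical (it is not part of ocdsindex), and interpreting created_at as a Unix timestamp in seconds is an assumption based on the sample value.

```python
import json
from datetime import datetime, timezone

# Sample shaped like the output above, abbreviated to a single document.
SAMPLE = """
{
  "base_url": "https://standard.open-contracting.org/staging/1.1-dev/",
  "created_at": 1577880000,
  "documents": {
    "en": [
      {
        "url": "https://standard.open-contracting.org/staging/1.1-dev/en/#about",
        "title": "Open Contracting Data Standard: Documentation - About",
        "text": "The Open Contracting Data Standard …"
      }
    ]
  }
}
"""


def count_documents(data):
    """Return the number of documents per language key (hypothetical helper)."""
    return {language: len(documents) for language, documents in data["documents"].items()}


data = json.loads(SAMPLE)
counts = count_documents(data)
# Assumption: created_at is a Unix timestamp in seconds.
created = datetime.fromtimestamp(data["created_at"], tz=timezone.utc)
print(data["base_url"], created.isoformat(), counts)
```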
extension-explorer
Prints the URL and the documents to index from the Extension Explorer as JSON.
ocdsindex extension-explorer FILE
FILE: the Extension Explorer’s extensions.json file
Example:
ocdsindex extension-explorer path/to/extension_explorer/data/extensions.json > data.json
index
Adds documents to Elasticsearch indices.
ocdsindex index HOST FILE
HOST: the connection URI for Elasticsearch, like https://user:pass@host:9200
FILE: the file containing the output of the sphinx or extension-explorer command
Example:
ocdsindex index https://user:pass@host:9200 data.json
reindex
Reindexes documents into a new versioned index.
For each ocdsindex_XX alias, creates a new ocdsindex_XX-NNNN index, copies all documents into it, atomically updates the alias to point to the new index, and deletes the old index.
ocdsindex reindex HOST
HOST: the connection URI for Elasticsearch, like https://user:pass@host:9200
Example:
ocdsindex reindex https://user:pass@host:9200
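The atomic alias update described above can be sketched as a request body for Elasticsearch's update-aliases API (POST /_aliases), in which the remove and add actions are applied together. This only builds the body and does not contact a server. The function name and the concrete index names are illustrative placeholders, and whether ocdsindex issues exactly this request is an assumption.

```python
import json


def alias_swap_body(alias, old_index, new_index):
    """Build an update-aliases body that repoints alias in one request (hypothetical helper)."""
    return {
        "actions": [
            # Both actions are applied atomically by Elasticsearch.
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


# Placeholder names following the ocdsindex_XX and ocdsindex_XX-NNNN patterns.
body = alias_swap_body("ocdsindex_en", "ocdsindex_en-0001", "ocdsindex_en-0002")
print(json.dumps(body, indent=2))
```

Deleting the old index would then be a separate step after the alias points to the new index.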
copy
Adds a document with a DESTINATION base URL for each document with a SOURCE base URL.
ocdsindex copy HOST SOURCE DESTINATION
HOST: the connection URI for Elasticsearch, like https://user:pass@host:9200
SOURCE: the base URL of the documents to copy
DESTINATION: the base URL of the documents to create
Example:
ocdsindex copy https://user:pass@host:9200 https://standard.open-contracting.org/staging/latest/ https://standard.open-contracting.org/latest/
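The effect of the command can be sketched in memory, without Elasticsearch. This assumes each stored document carries base_url and url fields and that the copy's url has its SOURCE prefix swapped for DESTINATION; both the field names and the prefix swap are assumptions, and copy_documents is a hypothetical helper.

```python
def copy_documents(documents, source, destination):
    """Return a copy of each document under source, rebased onto destination (hypothetical helper)."""
    copies = []
    for document in documents:
        if document["base_url"] != source:
            continue
        copy = dict(document)
        copy["base_url"] = destination
        # Assumption: the document URL starts with its base URL, so the prefix is swapped too.
        if copy["url"].startswith(source):
            copy["url"] = destination + copy["url"][len(source):]
        copies.append(copy)
    return copies


SOURCE = "https://standard.open-contracting.org/staging/latest/"
DESTINATION = "https://standard.open-contracting.org/latest/"
documents = [
    {"base_url": SOURCE, "url": SOURCE + "en/"},
    {"base_url": "https://standard.open-contracting.org/1.1/", "url": "https://standard.open-contracting.org/1.1/en/"},
]
copies = copy_documents(documents, SOURCE, DESTINATION)
print(copies)
```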
expire
Deletes documents from Elasticsearch indices that were crawled more than 180 days ago.
ocdsindex expire HOST --exclude-file FILENAME
HOST: the connection URI for Elasticsearch, like https://user:pass@host:9200
--exclude-file FILENAME: exclude any document whose base URL is equal to a line in this file
Example:
ocdsindex expire https://user:pass@host:9200 --exclude-file exclude.txt
Where exclude.txt contains:
https://standard.open-contracting.org/latest/
https://standard.open-contracting.org/1.1/
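The selection rule can be sketched in memory as follows. It assumes each document records its crawl time as a created_at Unix timestamp in seconds and its base URL as base_url; these field names and the helper names are assumptions, and the real command operates on Elasticsearch rather than on a Python list.

```python
EXPIRY_SECONDS = 180 * 24 * 60 * 60  # 180 days


def read_excluded(lines):
    """Return the set of non-empty, stripped lines from the exclude file (hypothetical helper)."""
    return {line.strip() for line in lines if line.strip()}


def select_expired(documents, excluded, now):
    """Return documents crawled more than 180 days before now, skipping excluded base URLs."""
    cutoff = now - EXPIRY_SECONDS
    return [
        document
        for document in documents
        if document["created_at"] < cutoff and document["base_url"] not in excluded
    ]


excluded = read_excluded([
    "https://standard.open-contracting.org/latest/\n",
    "https://standard.open-contracting.org/1.1/\n",
])
documents = [
    {"base_url": "https://standard.open-contracting.org/latest/", "created_at": 1577880000},
    {"base_url": "https://standard.open-contracting.org/staging/1.1-dev/", "created_at": 1577880000},
    {"base_url": "https://standard.open-contracting.org/staging/1.1-dev/", "created_at": 1599000000},
]
# A fixed "now" keeps the example deterministic.
to_delete = select_expired(documents, excluded, now=1600000000)
print(to_delete)
```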