count-postings
Usage
Extracts posting counts from an inverted index.
Usage: ../../../build/bin/count-postings [OPTIONS]
Options:
-h,--help Print this help message and exit
-e,--encoding TEXT REQUIRED Index encoding
-i,--index TEXT REQUIRED Inverted index filename
--tokenizer TEXT:{english,whitespace} [english]
Tokenizer
-H,--html Strip HTML
-F,--token-filters TEXT:{krovetz,lowercase,porter2} ...
Token filters
--stopwords TEXT Path to file containing a list of stop words to filter out
-q,--queries TEXT Path to file with queries
--terms TEXT Term lexicon
--weighted Weights scores by query frequency
--sep TEXT Separator string
--query-id Print query ID at the beginning of each line, separated by a colon
-L,--log-level TEXT:{critical,debug,err,info,off,trace,warn} [info]
Log level
--config Configuration .ini file
--sum Sum postings across the query terms; by default, individual list lengths will be printed, separated by the separator defined with --sep
Description
Extracts posting counts from an inverted index.
After parsing each query, it prints the posting list length of each query term; with --sum, the lengths are summed across the terms of the query instead. See parse_collection for more details about parsing options.
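
The difference between the default output and --sum can be illustrated with a small Python sketch. It is not the tool's implementation: the function name format_counts, the toy posting counts, the tab default for the separator, and the choice to count unknown terms as zero are all assumptions made for illustration. Only the behaviors described in the help text above (per-term lengths joined by the separator, the summed variant, and the query ID followed by a colon) are modeled.

```python
from typing import Mapping, Optional, Sequence


def format_counts(
    terms: Sequence[str],
    posting_counts: Mapping[str, int],
    sep: str = "\t",            # assumed default; the real default of --sep is not stated
    sum_counts: bool = False,   # mirrors the --sum flag
    query_id: Optional[str] = None,  # mirrors --query-id when set
) -> str:
    """Format one output line for a single, already parsed query."""
    # Assumption: a term missing from the lexicon contributes zero postings.
    counts = [posting_counts.get(term, 0) for term in terms]
    if sum_counts:
        body = str(sum(counts))
    else:
        body = sep.join(str(count) for count in counts)
    # With a query ID, the ID comes first, separated by a colon.
    return f"{query_id}:{body}" if query_id is not None else body


# Hypothetical posting list lengths for three terms.
toy_index = {"brown": 120, "fox": 45, "quick": 80}

print(format_counts(["quick", "brown", "fox"], toy_index, sep=" "))
print(format_counts(["quick", "brown", "fox"], toy_index,
                    sum_counts=True, query_id="q1"))
```

With these toy numbers, the first call prints the three list lengths in query order, and the second prints a single total prefixed by the query ID.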