count-postings

Usage

Extracts posting counts from an inverted index.
Usage: ../../../build/bin/count-postings [OPTIONS]

Options:
  -h,--help                   Print this help message and exit
  -e,--encoding TEXT REQUIRED Index encoding
  -i,--index TEXT REQUIRED    Inverted index filename
  --tokenizer TEXT:{english,whitespace} [english] 
                              Tokenizer
  -H,--html                   Strip HTML
  -F,--token-filters TEXT:{krovetz,lowercase,porter2} ...
                              Token filters
  --stopwords TEXT            Path to file containing a list of stop words to filter out
  -q,--queries TEXT           Path to file with queries
  --terms TEXT                Term lexicon
  --weighted                  Weights scores by query frequency
  --sep TEXT                  Separator string
  --query-id                  Print query ID at the beginning of each line, separated by a colon
  -L,--log-level TEXT:{critical,debug,err,info,off,trace,warn} [info] 
                              Log level
  --config                    Configuration .ini file
  --sum                       Sum postings across the query terms; by default, individual list lengths will be printed, separated by the separator defined with --sep

Description

Extracts posting counts from an inverted index.

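Each query is parsed first, and the posting count (posting list length) of each resulting term is looked up in the index. By default, the individual list lengths are printed, separated by the string given with --sep; with --sum, they are summed across the query terms into a single number. See parse_collection for more details about parsing options.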
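
The difference between the default output and --sum can be illustrated with a small sketch. This is not the tool's implementation: the toy index, the `count_postings` function, the comma separator, and the choice to report 0 for terms missing from the index are all assumptions made for illustration. The query is assumed to be already parsed into terms.

```python
# Toy inverted index: term -> posting list of document IDs (hypothetical data).
INDEX = {
    "brooklyn": [0, 3, 7, 9],
    "tea": [1, 3],
    "house": [0, 1, 2, 3, 8],
}


def count_postings(terms, index, sep, sum_counts=False, query_id=None):
    """Format the posting counts of one already-parsed query as a line of text."""
    # Posting count of a term = length of its posting list.
    # Assumption: a term absent from the index counts as 0.
    counts = [len(index.get(term, [])) for term in terms]
    if sum_counts:
        # Mirrors --sum: one number per query.
        body = str(sum(counts))
    else:
        # Mirrors the default: one number per term, joined by the separator.
        body = sep.join(str(count) for count in counts)
    # Mirrors --query-id: prefix the line with the ID and a colon.
    return f"{query_id}:{body}" if query_id is not None else body


query = ["brooklyn", "tea", "house"]
print(count_postings(query, INDEX, sep=","))
print(count_postings(query, INDEX, sep=",", sum_counts=True, query_id="q1"))
```

With this toy data, the first call prints one length per term and the second prints a single total prefixed with the query ID.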