create_wand_data

Usage

Creates additional data for query processing.
Usage: ../../../build/bin/create_wand_data [OPTIONS]

Options:
  -h,--help                   Print this help message and exit
  -c,--collection TEXT REQUIRED
                              Collection basename
  -o,--output TEXT REQUIRED   Output filename
  --quantize UINT             Quantizes the scores using this many bits
  --compress Needs: --quantize
                              Compress additional data
  -s,--scorer TEXT REQUIRED   Scorer function
  --bm25-k1 FLOAT Needs: --scorer
                              BM25 k1 parameter.
  --bm25-b FLOAT Needs: --scorer
                              BM25 b parameter.
  --pl2-c FLOAT Needs: --scorer
                              PL2 c parameter.
  --qld-mu FLOAT Needs: --scorer
                              QLD mu parameter.
  --range Excludes: --block-size --lambda
                              Create docid-range based data
  --terms-to-drop TEXT        A filename containing a list of term IDs that we want to drop
  -L,--log-level TEXT:{critical,debug,err,info,off,trace,warn} [info] 
                              Log level
  --config                    Configuration .ini file
[Option Group: blocks]
   
  [At least 1 of the following options are required]
  Options:
    -b,--block-size UINT Excludes: --lambda --range
                                Block size for fixed-length blocks
    -l,--lambda FLOAT Excludes: --block-size --range
                                Lambda parameter for variable blocks

Description

Creates additional data needed for certain query algorithms. See "WAND" Data for more details.

Refer to the queries documentation for details about scoring functions.

Blocks

Each posting list is divided into blocks, and a maximum score is precomputed for each block. Blocks are either fixed-size, with the same length throughout the index as set by --block-size, or variable-size, controlled by the --lambda parameter. [TODO: Explanation needed]
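
The fixed-size case can be illustrated with a minimal Python sketch. This is not the tool's implementation: the function name `fixed_block_max_scores`, the in-memory list of `(docid, score)` pairs, and the choice to record each block's last docid alongside its maximum score are all assumptions made for illustration. It only shows what "one precomputed max score per block" means for a single posting list.

```python
from typing import List, Tuple

Posting = Tuple[int, float]  # (docid, score); hypothetical in-memory layout


def fixed_block_max_scores(
    postings: List[Posting], block_size: int
) -> List[Tuple[int, float]]:
    """Split one posting list into fixed-size blocks.

    Returns one (last_docid_in_block, max_score_in_block) pair per block.
    The final block may be shorter than block_size.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    blocks = []
    for start in range(0, len(postings), block_size):
        block = postings[start:start + block_size]
        last_docid = block[-1][0]
        max_score = max(score for _, score in block)
        blocks.append((last_docid, max_score))
    return blocks


# Illustrative data only: five postings, blocks of two.
example = [(1, 0.5), (4, 1.2), (7, 0.3), (9, 2.0), (12, 0.8)]
example_blocks = fixed_block_max_scores(example, 2)
# example_blocks == [(4, 1.2), (9, 2.0), (12, 0.8)]
```

At query time, an upper bound like this would let an algorithm skip a whole block whose maximum score cannot change the current top results. How variable-size blocks are chosen from --lambda is not covered by this sketch.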