Log
-
Prepare for v0.4.0 release by Caio 4 years ago
-
Use tique::QueryParser 💬 by Caio 4 years ago
Now only `QueryParser` is public, not the mod.
-
Add QueryParser docs by Caio 4 years ago
-
Rename tique's `unstable` feature to `queryparser` 💬 by Caio 4 years ago
... And somehow `tique` can't be built because `cantine` wants a feature that stopped existing? That kinda diminishes the usefulness of workspaces.
-
Use multiple fields when indexing and querying 💬 by Caio 4 years ago
This patch splits the `fulltext` index field into its sources: name, ingredients and instructions, making them available for individual strict querying. Unsurprisingly, using an OR-by-default, boolean-query-based parser with multiple fields yields awful results for simple queries. To account for that, we switch the query parser to use dismax with a 10% tiebreaking increment.
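
To make the "10% tiebreaking increment" concrete, here is a minimal sketch of how a disjunction-max score is usually combined: the best-scoring field wins outright, and every other matching field adds only a fraction of its score. The function name and the sample scores are made up for illustration; this is not the code from the patch.

```rust
/// Combine per-field scores DisMax-style.
///
/// The highest field score is taken as-is; the remaining scores are
/// summed and scaled by `tiebreaker` (0.1 for a 10% increment).
/// Hypothetical helper, not part of tique or tantivy.
fn dismax_score(field_scores: &[f32], tiebreaker: f32) -> f32 {
    // Scores are assumed non-negative, so 0.0 is a safe starting maximum.
    let max = field_scores.iter().copied().fold(0.0_f32, f32::max);
    let sum: f32 = field_scores.iter().sum();
    max + tiebreaker * (sum - max)
}
```

With assumed scores of 2.0 (name), 1.0 (ingredients) and 0.5 (instructions), `dismax_score(&[2.0, 1.0, 0.5], 0.1)` gives roughly 2.15, whereas a plain OR-style sum would give 3.5. That is why a term matching weakly in several fields no longer outranks a strong match in one.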
-
Move definitions around 💬 by Caio 4 years ago
No changes
-
Rust 1.42: Remove `extern crate proc_macro` by Caio 4 years ago
-
Update Cargo.lock by Caio 4 years ago
-
Add support for parsing queries using DisMax by Caio 4 years ago
-
Fix goofy pure negative query detection 💬 by Caio 4 years ago
And the weird leftover copy-pasta in the test got removed
-
Update Cargo.lock by Caio 4 years ago
-
Merge branch 'multi_field_query_parser' by Caio 4 years ago
-
Integrate QueryParser changes into cantine 💬 by Caio 4 years ago
Not done, the default behaviour is changed, but I want to bring this branch back to master for some tests.
-
Add initial DisMaxQuery implementation 💬 by Caio 4 years ago
I was (unintentionally?) made aware that tantivy doesn't have a dismax query when @jackdoe pointed me at his cool new project. So I wrote one. Since I'm hacking on a dumb query parser that allows multiple fields and boosts, this will come in handy very soon. Ref: https://github.com/jackdoe/octopus_query/
-
Add support for changing field name 💬 by Caio 4 years ago
And rename `queryparser::interpreter` mod to `parser`
-
Decide the Occur at the raw parser level by Caio 4 years ago
-
First working field-aware QueryParser by Caio 4 years ago
-
Rename queryparser::parser to queryparser::raw by Caio 4 years ago
-
Ensure it's hard to cause an Err() with this parser by Caio 4 years ago
-
Add support for parsing +mandatory queries 💬 by Caio 4 years ago
And rename `negated` to `prohibited`
-
Add support for strict field names by Caio 4 years ago
-
Add plumbing for field:based -queries:"like these" 💬 by Caio 4 years ago
This patch makes the raw input parser identify field names in queries, but the interpreter completely ignores the information.

The current thing is pretty rudimentary, so here's a brain dump of what I need to figure out when moving this forward:

* Maybe `Vec<(String, Field)>` so that we don't tie to field name
* Default field(s)
* Per field weight
* Decide how to handle unknown fields
  1. Phrases are obviously wrong
  2. Terms might:be:valid in some cases
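
For illustration, a rough sketch of what identifying a field name in a single clause could look like. The `Clause` type, the `parse_clause` function and the choice to treat unknown field prefixes as plain text are all assumptions made for this example; the commit leaves the unknown-field question open and this is not the parser it introduces.

```rust
/// One parsed query clause. Hypothetical type for illustration only.
#[derive(Debug, PartialEq)]
struct Clause<'a> {
    prohibited: bool,
    field: Option<&'a str>,
    text: &'a str,
    is_phrase: bool,
}

/// Parse a single clause such as `field:based` or `-queries:"like these"`.
fn parse_clause<'a>(input: &'a str, known_fields: &[&str]) -> Clause<'a> {
    // A leading `-` marks the clause as prohibited.
    let (prohibited, rest) = match input.strip_prefix('-') {
        Some(r) => (true, r),
        None => (false, input),
    };

    // Assumption: only a known field name counts as a field prefix.
    // Anything else keeps its colons, so `might:be:valid` stays one term.
    let (field, body) = match rest.split_once(':') {
        Some((name, tail)) if known_fields.contains(&name) && !tail.is_empty() => {
            (Some(name), tail)
        }
        _ => (None, rest),
    };

    // A body wrapped in double quotes is a phrase; strip the quotes.
    let (text, is_phrase) = match body.strip_prefix('"').and_then(|b| b.strip_suffix('"')) {
        Some(inner) => (inner, true),
        None => (body, false),
    };

    Clause { prohibited, field, text, is_phrase }
}
```

Under these assumptions, `parse_clause("-queries:\"like these\"", &["queries"])` yields a prohibited phrase on the `queries` field, while `parse_clause("might:be:valid", &["name"])` yields a single unfielded term.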
-
Make code examples slightly easier to manage by Caio 4 years ago
-
check_sim: Bench both queries, add a csv header by Caio 4 years ago
-
Expose Keywords::{clone,len,is_empty}() by Caio 4 years ago
-
Update Cargo.lock by Caio 4 years ago
-
Release tique-0.3.0 by Caio 4 years ago
-
Support for conversion into weighted queries by Caio 4 years ago
-
Upgrade to tantivy 0.12 by Caio 4 years ago
-
Benchmark keyword-based similarity 💬 by Caio 4 years ago
This patch introduces the `check_sim` command, which is what I'm using to explore the indexing quality. In a gist, for every recipe in the index it:

1. Extracts the top 20 keywords
2. Tries to find 10+1 recipes similar to it
3. Measures how many of the found recipes are also in the "canonical" `Recipe.similar_recipe_ids`

The command simply dumps a csv to STDOUT and I run analysis over it. This is an example of the output:

> 1721097,cornbread;dressing;salsa;salad;lettuce,11,5,10,0
> 862391,salmon;fillet;fillets;baked;piece,11,1,10,0
> 1206600,mousse;peppers;partly;bruised;yogurt,11,0,10,0
> 944326,watercress;oranges;sectioned;navel;tough,11,0,10,0
> 243820,roast;freeze;barbecue;cooker;freezer,11,5,10,0

And a breakdown of the first line:

> 1721097: The recipe id
> cornbread;dressing;salsa;salad;lettuce: top 5 keywords
> 11: Number of similar recipes we found (Step 2)
> 5: 5 of the found neighbors were also in the `similar_recipe_ids` set
> 10: `Recipe.similar_recipe_ids.len()`
> 0: Did we manage to find $self in the 10+1 nearest neighbors? At which index?
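
As an example of the kind of analysis that can be run over this output, here is a small sketch that parses one row into a struct and computes the share of canonical similar recipes that were recovered. The struct, its field names and the `recall` helper are invented for this example; only the column order comes from the breakdown in this commit message. The last column is kept as raw text because its encoding for the "not found" case is not spelled out here.

```rust
/// One row of `check_sim` output. Field names are made up for this sketch.
#[derive(Debug)]
struct SimRow {
    recipe_id: u64,
    top_keywords: Vec<String>,
    found: usize,
    overlap: usize,
    canonical_len: usize,
    /// Raw last column; left uninterpreted on purpose.
    self_position: String,
}

/// Parse a single csv line; returns None for malformed rows
/// (which also skips a textual header line, if one is present).
fn parse_row(line: &str) -> Option<SimRow> {
    let cols: Vec<&str> = line.trim().split(',').collect();
    if cols.len() != 6 {
        return None;
    }
    Some(SimRow {
        recipe_id: cols[0].parse().ok()?,
        top_keywords: cols[1].split(';').map(str::to_owned).collect(),
        found: cols[2].parse().ok()?,
        overlap: cols[3].parse().ok()?,
        canonical_len: cols[4].parse().ok()?,
        self_position: cols[5].to_owned(),
    })
}

/// Fraction of the canonical similar recipes that the search recovered.
fn recall(row: &SimRow) -> f64 {
    if row.canonical_len == 0 {
        0.0
    } else {
        row.overlap as f64 / row.canonical_len as f64
    }
}
```

For the first example row, `parse_row` yields recipe id 1721097 with five keywords, and `recall` comes out at 0.5 (5 of the 10 canonical neighbors were found).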