Update running instructions by Caio 4 years ago (log)



A cooking recipe search JSON API with over a million recipes.


This is a cargo workspace.

Running Instructions

You can use the sample data to run a tiny version of the API:

cargo run --bin load /tmp/cantine < cantine/tests/sample_recipes.jsonlines
RUST_LOG=debug BASE_DIR=/tmp/cantine cargo run

API Tutorial

The API is publicly accessible at

You can search via POST on /search:

curl -H "Content-Type: application/json" -d'{ "fulltext": "bacon" }'

The output will contain an array under items with each item containing fields like name, crawl_url, num_ingredients, image and more.

If you want more details about a specific recipe, you can GET at /recipe/{uuid}.

There's one more useful endpoint you can GET: /info. We'll refer to it in more detail later, but it basically describes some of the features we support.
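The two GET endpoints can be wrapped in small shell helpers. This is a minimal sketch: the function names recipe and info are made up for this example, and it assumes the API variable holds the base URL of the instance you are talking to.

```shell
# Hypothetical helpers; the names are illustrative, not part of the project.
# Assumes API holds the base URL of a running instance.

# Fetch the full details of one recipe, given its uuid.
recipe() { curl -s "$API/recipe/$1"; echo; }

# Fetch the description of the supported features and sort options.
info() { curl -s "$API/info"; echo; }
```

With these in place, `recipe SOME_UUID` prints one recipe and `info` prints the feature description.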

Now, to make things easier to read we'll create a simple function in bash:

export API=
function search() { curl -XPOST "$API/search" -H "Content-Type: application/json" -d"$1"; echo; }

Now we can run a more useful search: recipes with bacon and the phrase "deep fry", but without eggs:

search '{ "fulltext": "bacon -egg \"deep fry\"" }'


You should have noticed a next field in the output of our previous search. It should look like base64-encoded gibberish.

If you submit the same search, but with an extra after key with the value you got from next, you get (surprise!) the next results:

search '{ "fulltext": "bacon", "after": "AAAAAABAy6c0cM0Rb7VSU3OJkjB7_hHxeA" }'

Notice that the result contains a next field again? So long as a result contains a next you can keep using it as after to paginate through a result set of any size.
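The pagination loop described above can be sketched in shell. Treat it as a sketch under a few assumptions: search_all and extract_next are names made up for this example, the page limit is an arbitrary safety net, the sed expression is a crude way to read next without a JSON parser (it assumes the value contains no escaped quotes), and the fulltext argument must already be safe to place inside a JSON string. It relies on the search function defined earlier.

```shell
# Print the value of the "next" key found in a JSON response on stdin.
# Prints nothing when the key is absent.
extract_next() {
  tr -d '\n' | sed -n 's/.*"next"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Print every page of a fulltext search, one response per call to search.
# Usage: search_all FULLTEXT [MAX_PAGES]   (MAX_PAGES defaults to 10)
search_all() {
  fulltext=$1
  max_pages=${2:-10}
  after=""
  page=0
  while [ "$page" -lt "$max_pages" ]; do
    if [ -z "$after" ]; then
      query=$(printf '{ "fulltext": "%s" }' "$fulltext")
    else
      query=$(printf '{ "fulltext": "%s", "after": "%s" }' "$fulltext" "$after")
    fi
    response=$(search "$query")
    printf '%s\n' "$response"
    # Feed the "next" of this page back in as "after" for the following one.
    after=$(printf '%s' "$response" | extract_next)
    page=$((page + 1))
    # No "next" in the response means this was the last page.
    [ -n "$after" ] || break
  done
}
```

For example, `search_all bacon 3` prints at most three pages of bacon recipes.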


From the /info endpoint you can learn all the valid sort options. Currently the default is "relevance". You can sort by every feature except the diet-related ones, and you can change the order to ascending:

search '{ "sort": "num_ingredients_asc" }'

Querying Features

From the /info endpoint we can also learn about the features we know about each recipe.

Here's a commented example of what you would see by looking at the output under features.num_ingredients:

{
  // Lowest number of ingredients (at least) one indexed recipe has
  "min": 2,
  // Ditto, but highest
  "max": 93,
  // Number of recipes in the index with the "num_ingredients" feature
  "count": 1183461
}


You can filter by any feature and value range you want. For example, recipes with calories within the [100,350[ range (100 included, 350 excluded):

search '{ "fulltext": "picanha", "filter": { "calories": [100, 350] } }'


You can get a breakdown of any/every feature for arbitrary (half-open) ranges.

Maybe you'd like to see more detailed counts of a search by total time:

search '{ "fulltext": "cheese bacon", "agg": { "total_time": [ [0, 15], [15, 60], [60, 240] ] } }'

The output will contain a new agg field that looks something like this:

  "agg": {
    "total_time": [
      {
        "min": 0,
        "max": 14,
        "count": 3158
      },
      {
        "min": 15,
        "max": 58,
        "count": 8982
      },
      {
        "min": 60,
        "max": 225,
        "count": 1594
      }
    ]
  }

These are, in order, the breakdowns of each of the ranges we requested in the search. So if we add a new filter for [15,60] to the search, we should expect 8982 matching recipes:

search '{ "fulltext": "cheese bacon", "filter": { "total_time": [15, 60] } }'

Of course, you can filter and aggregate as many features/ranges as you want.
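Writing nested ranges by hand gets tedious, so here is a sketch of two helpers that build such queries. The names buckets and filter_agg_query are made up for this example, and the helpers do no JSON escaping, so they assume plain feature names and numeric bounds. Because ranges are half-open, neighbouring buckets can share a boundary without counting a recipe twice.

```shell
# Turn a list of boundaries into contiguous ranges, e.g.
# "0 15 60 240" becomes [ [0, 15], [15, 60], [60, 240] ].
# Expects at least two boundaries.
buckets() {
  out=""
  prev=$1
  shift
  for bound in "$@"; do
    out="${out:+$out, }[$prev, $bound]"
    prev=$bound
  done
  printf '[ %s ]' "$out"
}

# Build a query that filters on one feature and aggregates another.
# Usage: filter_agg_query FULLTEXT FILTER_FEATURE MIN MAX AGG_FEATURE BOUND...
filter_agg_query() {
  fulltext=$1 filter_feature=$2 min=$3 max=$4 agg_feature=$5
  shift 5
  printf '{ "fulltext": "%s", "filter": { "%s": [%s, %s] }, "agg": { "%s": %s } }' \
    "$fulltext" "$filter_feature" "$min" "$max" "$agg_feature" "$(buckets "$@")"
}
```

Combined with the search function from earlier, this could be used as:

search "$(filter_agg_query 'cheese bacon' calories 100 350 total_time 0 15 60 240)"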

NOTE: For performance reasons, the agg field is omitted from the result if too many recipes are found (300k currently).
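Since agg can silently go missing, a script may want to notice when that happens. A rough sketch, where check_agg is a name made up for this example and the grep is a crude stand-in for a real JSON parser:

```shell
# Pass a JSON response through unchanged and fail (with a warning on stderr)
# when it has no "agg" key.
check_agg() {
  response=$(cat)
  printf '%s\n' "$response"
  if ! printf '%s' "$response" | grep -q '"agg"[[:space:]]*:'; then
    echo 'no "agg" in the response: too many matches, consider adding a filter' >&2
    return 1
  fi
}
```

It reads the response from stdin, so it can sit at the end of a pipe: `search '...' | check_agg`.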