Missing vs. Empty Fields

Spatialized founder Jozef Sorocin
Jozef Soročin
Updated 07/13/2025

The topic of empty-ish fields comes up often because:

  • input fields weren't required and were skipped (e.g. optional input fields in signup forms)
  • because of human error (e.g. data entry "interns" forgot to fill out a form field)
  • and last but not least, because of purposefully set empty-ish values (null, [], {}, "")

In Elasticsearch, these values would indicate that a given field does exist:

  • Empty strings, such as "" or "-"
  • Arrays containing null and another value, such as [null, "foo"]
  • A custom null-value , defined in field mapping (discussed later on)

When a doc is inserted into ES, its fields are usually indexed. The indexed value of one of those fields may not exist due to a variety of reasons:

  • The field is simply not present in the source JSON
  • The field in the source JSON is nullnull or [] or [null, null, ...?] or {}
  • The field has "index" : false set in the mapping
  • The field value was malformed and the ignore_malformed parameter was defined in the index mapping

These statements might still be confusing so let's look at a concrete example.

Given the following documents that contain different data types, any of which can be emptyish or even absent (non-existent):

POST myindex/_doc/1
{
	"id": 1
                            
}

POST myindex/_doc/2
{
	"id": 2,
  "potentially_absent_field": null
}

POST myindex/_doc/3
{
	"id": 3,
  "potentially_absent_field": 123
}

I want to find documents that strictly lack the potentially_absent_field, i.e. only doc#1.

At the first glance it looks like we could reverse the exists query and thus target only those docs that do not contain the potentially_absent_field:

POST myindex/_search
{
  "query": {
    "bool": {
      "must_not": [
        {
          "exists": {
            "field": "potentially_absent_field"
          }
        }
      ]
    }
  }
}

Somewhat unexpectedly, this returns both #1 and #2! The reason for that is that doc#2's potentially_absent_field cannot be indexed or searched. It is treated as though it has no values and is, for all intents and purposes, considered a non-existent field — just like the one in doc#1.

Having said that, we essentially have two ways to solve this:

We need an explicit placeholder value that we can target in the query in order to filter out the docs whose potentially_absent_field is present but null.

The mapping parameter null_value enables us to do just that. Its value can be arbitrary but it needs to be of the same data type as the field itself. Since we're dealing with an integer field, we'll pick -1 and update the mapping as follows:

PUT myindex/_mapping
{
  "properties": {
    "potentially_absent_field": {
      "type": "long",
      "null_value": -1
    }
  }
}

Keep in mind that this only influences how the field is indexed and not its original value (i.e. _source that would be retrieved in a search request).

Since this is a "static" mapping update, the already indexed documents won't be affected. So in order for this change to propagate onto the existing docs, we'll need to reindex.

The following trick forces the index to pick up on the new mapping change without any _reindex calls:

Join 200+ developers who've mastered this! Get Complete Access — €19
Already a member? Sign in here