Using static imports to create DSLs for more or less complex Solr object hierarchies

The Poor Man’s DSL

What?

According to Wikipedia

"A domain-specific language (DSL) is a computer language specialized to a particular application domain. This is in contrast to a general-purpose language (GPL), which is broadly applicable across domains."

In our case the DSL is used to create more or less complex object hierarchies. So the domain will be the domain of the object hierarchy and the only thing our DSL can do is create this special object hierarchy.

In this particular case we are creating Solr responses that will be used in unit tests. We want to test parsing Solr responses into our own domain classes.

As creating these Solr-specific hierarchies is rather complex and involves a lot of string literals, a DSL makes it much easier and safer to build them for testing. The result is also far more readable.

Compare this:

QueryResponse queryResponse = new QueryResponse();
NamedList<Object> response = new NamedList<>();

SolrDocumentList documents = new SolrDocumentList();
SolrDocument firstDocument = new SolrDocument();
firstDocument.addField("id","12345");
firstDocument.addField("title","Lord of the Rings");
firstDocument.addField("author","J.R.R Tolkien");
documents.add(firstDocument);

SolrDocument secondDocument = new SolrDocument();
...
documents.add(secondDocument);
documents.setNumFound(152);
response.add("response", documents);

NamedList<Object> facetValue = new NamedList<>();
facetValue.add("val", "High Fantasy");
facetValue.add("count", 737);

NamedList<Object> facetValue2 = new NamedList<>();
facetValue2.add("val", "Dark Fantasy");
facetValue2.add("count", 8);

NamedList<Object> buckets = new NamedList<>();
buckets.add("buckets", newArrayList(new NamedList[] { facetValue, facetValue2 }));

NamedList<Object> facetField = new NamedList<>();
facetField.add("genre", buckets);

response.add("facets", facetField);

queryResponse.setResponse(response);

to this…

QueryResponse queryResponse = queryResponse(
    documents(
        152L,
        document(
            fields(
                field("id", "12345"),
                field("title", "Lord of the Rings"),
                field("author", "J.R.R Tolkien")
            )
        ),
        document(
            fields(
                ...
            )
        )
    ),
    facets(
        facet("genre",
            buckets(
                facetValue("High Fantasy", 737),
                facetValue("Dark Fantasy", 8)
            )
        )
    )
);

Not only is the second snippet more readable than the first; a user of the DSL also does not need to know the intricate and gruesome details of how a Solr response is made up of nested NamedLists. That is a blessing when constructing more than one of these responses.

But how?

So how do we create such a DSL? Basically we use a lot of static methods and static imports and some varargs.
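
To make the effect of static imports concrete, here is a minimal sketch that uses only JDK classes rather than the Solr DSL itself. The class name StaticImportSketch and both method names are made up for illustration. The two methods build the same list; the second one drops the class qualifiers thanks to the static imports, which is what lets nested calls read like a small language.

```java
import static java.util.Arrays.asList;
import static java.util.Collections.singletonMap;

import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical class, not part of the article's DSL.
class StaticImportSketch {

    // Without static imports every call carries its class name.
    static List<Map<String, String>> qualified() {
        return Arrays.asList(
            Collections.singletonMap("id", "12345"),
            Collections.singletonMap("title", "Lord of the Rings"));
    }

    // With static imports the same calls read like nested keywords.
    static List<Map<String, String>> unqualified() {
        return asList(
            singletonMap("id", "12345"),
            singletonMap("title", "Lord of the Rings"));
    }
}
```

The DSL methods in the rest of this article would be imported the same way, with an import static line naming the class that holds them.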

Our goal is a DSL that supports creating a QueryResponse with some documents, including child documents, and the JSON facet response used by Solr. See above for an example.

Let’s start to construct this DSL top down. We start with the queryResponse method. It needs to return a QueryResponse and takes a list of Solr documents and a "list" of facet values. In Solr, documents are represented by a SolrDocumentList and facets by a NamedList object.

public static QueryResponse queryResponse(SolrDocumentList documents, NamedList<Object> facets) {
  QueryResponse queryResponse = new QueryResponse();

  NamedList<Object> response = new NamedList<>();
  response.add("response", documents);

  // Facets are optional; only add them when present.
  if (facets != null) {
    response.add("facets", facets);
  }

  queryResponse.setResponse(response);
  return queryResponse;
}

The next step is to create documents, set fields, add child documents and collect them into a SolrDocumentList.

Let’s start with the list of documents. We have a variable number of documents to add to a SolrDocumentList, so we use varargs to express the variability. We also added the number of found documents as a parameter. The advantage is that it is a mandatory parameter and must be set, whereas a call to the setter may be forgotten.

public static SolrDocumentList documents(long numFound, SolrDocument... documents) {
  SolrDocumentList solrDocumentList = new SolrDocumentList();
  solrDocumentList.addAll(Arrays.asList(documents));
  solrDocumentList.setNumFound(numFound);
  return solrDocumentList;
}

Next are the single documents. Solr fields are pairs of field name and field value, where the field value may be a single value or a list of values. The pair is modeled as Map.Entry<String, Object>, and we need to allow multiple fields. We could use a vararg parameter here, but firstly a document without any field does not make sense, so we make the fields a mandatory parameter of document. Secondly, as we have another parameter and Java only allows the last parameter to be a vararg, we use a list.

The child documents are also modeled as a list. Using varargs here would allow us to just write

document(
    fields(...),
    document(...),
    document(...)
)

but we explicitly want to mark child documents as such, as shown below.

document(
    fields(...),
    children(
        document(...),
        document(...)
    )
)

That is why the child documents are passed as a list, which in turn requires another method, children, that collects them and returns a list. To allow documents without child documents we overload the document method without the second parameter; it delegates with an empty list of child documents.

Here are these three methods that create documents and child documents:

public static SolrDocument document(List<Map.Entry<String, Object>> fields) {
  return document(fields, Collections.emptyList());
}

// Takes the children as a list so that both children(...) and
// Collections.emptyList() can be passed as the second argument.
public static SolrDocument document(List<Map.Entry<String, Object>> fields,
                                    List<SolrDocument> childDocuments) {
  SolrDocument document = new SolrDocument();
  fields.forEach(e -> document.addField(e.getKey(), e.getValue()));
  document.addChildDocuments(childDocuments);
  return document;
}

public static List<SolrDocument> children(SolrDocument... documents) {
  return Arrays.asList(documents);
}

Fields are created in a very similar way. We have a method fields that takes a variable number of Map.Entry objects. These are created by the field methods, which are overloaded to take different value types.

@SafeVarargs
public static List<Map.Entry<String, Object>> fields(Map.Entry<String, Object>... fields) {
  return Arrays.asList(fields);
}

public static Map.Entry<String, Object> field(String key, String value) {
  return new ImmutablePair<>(key, value);
}

public static Map.Entry<String, Object> field(String key, List<String> value) {
  return new ImmutablePair<>(key, value);
}
...
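
The fields method above also accepts an empty argument list. If a document without any field should be rejected by the compiler, one option is to split the varargs into a mandatory first parameter and an optional rest. The following is a sketch under that assumption, not the article's implementation: the class name FieldsSketch is hypothetical, and a JDK entry type stands in for ImmutablePair so that the example has no external dependencies.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical class, not part of the article's DSL.
class FieldsSketch {

    // Stand-in for the article's field(...) methods, backed by a JDK entry type.
    static Map.Entry<String, Object> field(String key, Object value) {
        return new AbstractMap.SimpleImmutableEntry<>(key, value);
    }

    // The mandatory first parameter turns fields() without arguments into a
    // compile-time error; further fields stay optional via varargs.
    @SafeVarargs
    static List<Map.Entry<String, Object>> fields(Map.Entry<String, Object> first,
                                                  Map.Entry<String, Object>... rest) {
        List<Map.Entry<String, Object>> all = new ArrayList<>();
        all.add(first);
        all.addAll(Arrays.asList(rest));
        return all;
    }
}
```

Call sites look the same as before, for example fields(field("id", "12345"), field("title", "Lord of the Rings")).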

All these methods may be combined to create a document with a child document like this:

document(
    fields(
        field("id", "54641874"),
        field("title", "Good Omens"),
        field("author", newArrayList("Neil Gaiman", "Terry Pratchett"))
    ),
    children(
        document(
            fields(
                field("id", "54641874_chapter_one"),
                field("title", "In the Beginning")
            )
        )
    )
)

Facets

Now to facets. Facets are a little more complex, as they are made up of nested NamedList objects, which are basically untyped. A NamedList is an ordered list of name/value pairs; it resembles a Map but keeps insertion order and allows repeated names. This time we will construct the DSL bottom up because I think this will be easier. We will create the response for a terms facet query.

Faceting over a field creates buckets with values and the number of documents these values occur in. So let’s start with the values. A value is a NamedList with two keys, val and count. This is still easy to create.

public static NamedList<Object> facetValue(String value, Integer count) {
  NamedList<Object> facet = new NamedList<>();
  facet.add("val", value);
  facet.add("count", count);
  return facet;
}

The buckets are just a list of values, represented as a NamedList with the key buckets whose value is the list of facet values, each of them a NamedList as well.

@SafeVarargs
public static NamedList<Object> buckets(NamedList<Object>... facets) {
  NamedList<Object> namedList = new NamedList<>();
  namedList.add("buckets", newArrayList(facets));
  return namedList;
}

The buckets are nested into a NamedList with a key named after the facet field or any other string. So we have a method that takes the facet name and the buckets and creates a NamedList with the name as key and the buckets as value.

public static NamedList<Object> facet(String name, NamedList<Object> buckets) {
  NamedList<Object> namedList = new NamedList<>();
  namedList.add(name, buckets);
  return namedList;
}

Finally, the facets part of the query response. It is the combination of several facets, i.e. NamedLists with the facet name as key and the buckets as value. What we do here is take these NamedLists and merge them into a single NamedList.

@SafeVarargs
public static NamedList<Object> facets(NamedList<Object>... facets) {
  return Arrays.stream(facets).reduce(new NamedList<>(), (n1, n2) -> {
    n1.addAll(n2);
    return n1;
  });
}
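
The reduce call above mutates its identity value. That works for a sequential stream, but Stream.reduce is meant for functions that do not modify their arguments; for filling a mutable container the JDK offers collect instead. Below is a sketch of the same merge written as a mutable reduction. It is an illustration under assumptions: a List of Map.Entry objects serves as a JDK-only stand-in for NamedList<Object>, and the names MergeSketch, named and merge are hypothetical.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical class; a List of entries stands in for Solr's NamedList.
class MergeSketch {

    // Builds a one-element "named list" holding a single name/value pair.
    static List<Map.Entry<String, Object>> named(String name, Object value) {
        List<Map.Entry<String, Object>> list = new ArrayList<>();
        list.add(new AbstractMap.SimpleImmutableEntry<>(name, value));
        return list;
    }

    // Merges all parts into one fresh list, keeping the order of the arguments.
    // collect creates the container itself, so no shared identity object is mutated.
    @SafeVarargs
    static List<Map.Entry<String, Object>> merge(List<Map.Entry<String, Object>>... parts) {
        return Arrays.stream(parts)
                .collect(ArrayList::new, List::addAll, List::addAll);
    }
}
```

Since the facets method already relies on NamedList.addAll, the same shape should carry over to NamedList, although that variant is not shown or tested here.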

Now we have the possibility to create a response with documents and facets.

queryResponse(
    documents(
        152L,
        document(
            fields(
                field("id", "12345"),
                field("title", "Lord of the Rings"),
                field("author", "J.R.R Tolkien")
            )
        ),
        document(
            fields(
                ...
            )
        )
    ),
    facets(
        facet("genre",
            buckets(
                facetValue("High Fantasy", 737),
                facetValue("Dark Fantasy", 8)
            )
        )
    )
);

Conclusion

And with this we have a DSL that allows us to generate a Solr QueryResponse in a far more readable manner than using the plain Solr classes. The readability is mainly achieved by a structure that closely follows the JSON representation of the query response and by using static imports. If the methods had to be qualified with the class name, as some style guides require, the readability would IMHO not be as good.

This DSL may be extended with more methods that create other parts of the query response, such as highlighting or query suggestions. I only created the parts I needed for testing my own functionality.

One caveat of this DSL is type safety. Due to the "untyped" NamedLists, a construct like:

facets(
    facetValue("Low Fantasy", 789),
    facetValue("Dark Fantasy", 456),
    facetValue("Cyberpunk", 123),
    facet("genre",
//      buckets(
            facetValue("High Fantasy", 123)
//      )
   )
)

would be syntactically correct but would produce an invalid query response. Using real domain objects for the different parts, or a language that better supports the creation of DSLs, would prevent these potential errors. But that is why it’s called "a poor man’s DSL". These techniques may be used to create DSLs for any object hierarchy with more or less effort. See the links for some other examples.
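
To illustrate what "real domain objects" could look like, here is a minimal sketch with one small type per level of the facet hierarchy. All class names (FacetValue, Buckets, Facet, TypedFacetDsl) are hypothetical, and a LinkedHashMap stands in for NamedList so that the example runs without SolrJ. With these signatures, facets(facetValue(...)) or facet("genre", facetValue(...)) no longer compile.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical wrapper types; none of these names come from SolrJ.
final class FacetValue {
    final String value;
    final int count;
    FacetValue(String value, int count) { this.value = value; this.count = count; }
}

final class Buckets {
    final List<FacetValue> values;
    Buckets(List<FacetValue> values) { this.values = values; }
}

final class Facet {
    final String name;
    final Buckets buckets;
    Facet(String name, Buckets buckets) { this.name = name; this.buckets = buckets; }
}

class TypedFacetDsl {
    static FacetValue facetValue(String value, int count) { return new FacetValue(value, count); }
    static Buckets buckets(FacetValue... values) { return new Buckets(Arrays.asList(values)); }
    static Facet facet(String name, Buckets buckets) { return new Facet(name, buckets); }

    // Only Facet instances are accepted here. The untyped structure is built
    // in one place, with LinkedHashMap standing in for NamedList.
    static Map<String, Object> facets(Facet... facets) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Facet facet : facets) {
            List<Map<String, Object>> bucketList = new ArrayList<>();
            for (FacetValue v : facet.buckets.values) {
                Map<String, Object> entry = new LinkedHashMap<>();
                entry.put("val", v.value);
                entry.put("count", v.count);
                bucketList.add(entry);
            }
            Map<String, Object> wrapper = new LinkedHashMap<>();
            wrapper.put("buckets", bucketList);
            result.put(facet.name, wrapper);
        }
        return result;
    }
}
```

The price is a few extra classes, which is exactly the effort the poor man's version avoids.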

content by Florian Schulz

Links, References, Further Reading
