JSON Schema to Validate Objects for Downstream Consumers

Applying JsonSchema to Json objects in JavaScript to introduce resiliency in integration points.

A recurring pattern for me, an engineer who prefers JavaScript, is ensuring my downstream consumers can deserialize JSON into a structured object. When invoking an API (REST or otherwise), the payload is validated in the request / response cycle. However, when communicating through asynchronous mechanisms such as AWS Kinesis or Apache Kafka, that immediate feedback is not provided and the producer of the data is not aware of the consumer's ability to handle that payload.

A schema is the structure of how one organizes data. In relational database systems the schema is formalized through the Data Definition Language's (DDL) common syntax, such as CREATE TABLE <blah>. Structured documents have had a number of schema types to guide content: SGML leveraged the DTD (dating back to the mid 1980s). Both were supplanted by XML and XML Schema respectively (dating to the late 1990s and early 2000s).

While at a 100% full stack JavaScript shop I found and leveraged JOI to validate the schema of JSON objects. However, JOI doesn't work across a wide range of technologies, including Java, which is essential in our enterprise ecosystem. Recently I have turned to JsonSchema to get that cross-platform validation capability.

JsonSchema allows us to define:

  • required and optional fields
  • nested objects
  • prebuilt validations
  • custom validations

Consider the following sample, where some objects have a birthDay property, some address objects have a zip property, and others have an apartmentNumber property.

Person.json

[
{
"firstName": "john",
"lastName": "smith",
"birthDay": "1980-01-01",
"address": {
"streetNumber": "10",
"streetName": "Washington",
"city": "Boston",
"state": "MA",
"zip": "02110"
}
},
{
"firstName": "john",
"lastName": "smith",
"address": {
"streetNumber": "10",
"streetName": "Washington",
"apartmentNumber": "a",
"city": "Boston",
"state": "MA",
"zip": "02110"
}
},
{
"firstName": "john",
"lastName": "smith",
"address": {
"streetNumber": "10",
"streetName": "Washington",
"apartmentNumber": "a",
"city": "Boston",
"state": "MA"
}
}
]

We can craft the following JsonSchema, omitting the optional properties from the “required” array of properties. Further, we can associate types (such as string) with the properties and associate predefined formats with them (such as the birthDay property, which has the type string and the format date). In the nested object address we can specify a custom pattern for the zip property, which leverages a regular expression. Note that this schema lists zip as required within address, so the third sample object, which has no zip, fails validation.

PersonSchema_v1.0.json

{
"definitions": {},
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "http://example.com/root.json",
"type": "object",
"title": "The Person Schema",
"required": [
"firstName",
"lastName",
"address"
],
"properties": {
"firstName": {
"$id": "#/properties/firstName",
"type": "string",
"title": "The firstName Schema"
},
"lastName": {
"$id": "#/properties/lastName",
"type": "string",
"title": "The lastName Schema"
},
"birthDay": {
"$id": "#/properties/birthDay",
"type": "string",
"format": "date",
"title": "The birthDay Schema"
},
"address": {
"$id": "#/properties/address",
"type": "object",
"title": "The address Schema",
"required": [
"streetNumber",
"streetName",
"city",
"state",
"zip"
],
"properties": {
"streetNumber": {
"$id": "#/properties/address/properties/streetNumber",
"type": "string",
"title": "The address/streetNumber Schema"
},
"streetName": {
"$id": "#/properties/address/properties/streetName",
"type": "string",
"title": "The address/streetName Schema"
},
"apartmentNumber": {
"$id": "#/properties/address/properties/apartmentNumber",
"type": "string",
"title": "The address/apartmentNumber Schema"
},
"city": {
"$id": "#/properties/address/properties/city",
"type": "string",
"title": "The address/city Schema"
},
"state": {
"$id": "#/properties/address/properties/state",
"type": "string",
"title": "The address/state Schema"
},
"zip": {
"$id": "#/properties/address/properties/zipCode",
"type": "string",
"title": "The address/zipCode Schema",
"pattern": "^[0-9]{5}$"
}
}
}
}
}
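To get a feel for what the zip pattern accepts, here is a minimal sketch that applies the same regular expression with plain JavaScript, outside of any schema validator. The sample values and the names zipPattern, candidates and zipResults are my own illustrations, not part of the schema above.

```javascript
// The same regular expression used by the "pattern" keyword for zip.
// JSON Schema patterns are not implicitly anchored, so ^ and $ matter here.
const zipPattern = new RegExp('^[0-9]{5}$');

// Illustrative inputs (assumptions, not taken from Person.json).
const candidates = ['02110', '2110', '02110-1234', 'abcde'];

const zipResults = candidates.map(zip => ({ zip, matches: zipPattern.test(zip) }));

console.log(zipResults);
```

Only the first value is exactly five digits, so it is the only one that matches; a ZIP+4 value such as 02110-1234 would need a looser pattern.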

The syntax is a little unwieldy to create from scratch by hand. However, there are tools such as https://jsonschema.net/ which take a sample object as input and generate a JsonSchema you can use as a starting point.

Validating an input document with the schema should be simple.

/*****************************************************************
* Validate some persons and handle the results:
* handle successes with console.log,
* handle failures with console.error.
* getListValidator, map and bimap come from the library code
* shown further down.
******************************************************************/
const personSchema = require('./PersonSchema_v1.0.json')
, persons = require('./Person.json') ;

const personsValidator = getListValidator(personSchema)

const results = personsValidator(persons)
// => [Either a' a]

const pretty = x => JSON.stringify(x, null, 4)
const handleSuccess = x => console.log("valid: ", pretty(x))
const handleFailure = x => console.error("invalid: ", pretty(x))

const handleValidation = bimap(handleFailure)(handleSuccess)
const handleValidations = map(handleValidation)

handleValidations(results)

We need a way to pass in a list of “n” objects and receive back a list of “n” results, each of which is either the valid object or, for an object that failed validation, that object together with the corresponding validation errors.

We need a library to make this work:

const Ajv = require('ajv')
, { Left, Right, pipe, map, bimap } = require('sanctuary') ;

// :: (a -> boolean) -> a -> Either a' a
const validateToEither = validator => x => {
const validatorResult = validator(x)
if(validatorResult === true){
return Right(x)
} else {
return Left({"input": x, "error": validator.errors})
}
}

// :: JsonSchema -> (a -> boolean)
const getBaseValidator = schema => {
const ajv = new Ajv({}) ;
return ajv.compile(schema);
}

// :: JsonSchema -> a -> Either a' a
const getObjectValidator = pipe([
getBaseValidator,
validateToEither
])

// :: JsonSchema -> [{}] -> [Either a' a]
const getListValidator = pipe([
getObjectValidator,
map
])

We could write all of this code ourselves by mapping over the initial list and building a data structure to handle the output, but the study of sophisticated type systems defines a category of higher types. In this space of higher types resides the Either type. We can think of the Either type as an abstract class with two concrete subclasses: Left, which holds an error object, and Right, which holds a success object. Once we have that list of Either objects we can handle the errors with little effort by applying bimap to each one, providing two functions: one applied to the objects of type Left and another applied to the objects of type Right. This relieves our application code of having to fork to handle success and failure scenarios.
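To make the Either idea concrete, here is a minimal hand-rolled sketch in plain JavaScript. It is not how Sanctuary implements Left, Right or bimap (the real ones are richer and type checked); the tagged-object representation and the names summarize and outcomes are assumptions for illustration only.

```javascript
// Hand-rolled stand-ins for illustration only, not Sanctuary's implementation.
const Left = value => ({ tag: 'Left', value });
const Right = value => ({ tag: 'Right', value });

// bimap applies onLeft to a Left's value and onRight to a Right's value,
// returning a new Either with the same tag.
const bimap = onLeft => onRight => either =>
  either.tag === 'Left'
    ? Left(onLeft(either.value))
    : Right(onRight(either.value));

// One function per branch; the caller never writes an if/else.
const summarize = bimap(err => `invalid: ${err}`)(person => `valid: ${person.firstName}`);

const outcomes = [
  Right({ firstName: 'john' }),
  Left('zip is required'),
].map(summarize);

console.log(outcomes.map(e => e.value));
```

The branching lives in bimap once, which is why the application code only needs to supply the two handlers.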

The real work is done in the function getListValidator (which you should subsume into a library). It takes a JsonSchema and returns a function which expects a list of objects that will be validated against that JsonSchema. It outputs a list of Either values indicating each object's success or failure. The Hindley-Milner type signature provides insight into how the function works: JsonSchema -> [a] -> [Either a' a]. Everything to the right of an arrow is a return type; when more arrows follow, that return type is itself a function. The variable a represents a generic type and a' represents some derivative of that type.
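One way to read that signature is as nested arrow functions: supply the schema, get back a function that waits for the list. The sketch below shows this shape without Ajv or Sanctuary. The missingFields checker is a hypothetical stand-in that only looks at the “required” array, and the Eithers are modelled as plain tagged objects, so treat it as an illustration of the signature rather than a replacement for the library code.

```javascript
// Stand-in for a compiled schema validator: this hypothetical checker only
// reports which names in schema.required are absent from the object.
const missingFields = schema => obj =>
  schema.required.filter(name => !(name in obj));

// JsonSchema -> [a] -> [Either a' a], written as nested arrow functions.
// Eithers are modelled as plain tagged objects rather than Sanctuary values.
const getListValidator = schema => list =>
  list.map(obj => {
    const missing = missingFields(schema)(obj);
    return missing.length === 0
      ? { tag: 'Right', value: obj }
      : { tag: 'Left', value: { input: obj, error: missing } };
  });

// Supplying only the schema yields a function that still waits for the list.
const validatePersons = getListValidator({ required: ['firstName', 'lastName'] });

const checked = validatePersons([
  { firstName: 'john', lastName: 'smith' },
  { firstName: 'john' },
]);

console.log(checked.map(e => e.tag));
```

The output list always has the same length as the input list, which is the “n in, n out” property described earlier.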

getBaseValidator allows us to leverage the ajv library to perform the schema validation. Perhaps the most interesting function in the library is validateToEither, which takes the ajv validator and returns a function that takes an object and returns the result of validating that object (against the ajv validator) in the form of an Either.
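A detail worth noticing in validateToEither is that it reads validator.errors immediately after the call. An Ajv compiled validator returns a boolean and stores the failure details on its errors property, which is overwritten on the next call. The sketch below imitates that convention with a fake validator so it runs without Ajv; makeFakeValidator, toEither and the zip check are hypothetical names and logic of my own.

```javascript
// Hypothetical validator that imitates the convention the article relies on:
// it returns a boolean and records the latest failure details on `.errors`.
const makeFakeValidator = () => {
  const validate = obj => {
    const ok = typeof obj.zip === 'string';
    validate.errors = ok ? null : [{ message: 'zip must be a string' }];
    return ok;
  };
  validate.errors = null;
  return validate;
};

// Same shape as validateToEither, with plain tagged objects instead of Sanctuary.
const toEither = validator => x =>
  validator(x)
    ? { tag: 'Right', value: x }
    : { tag: 'Left', value: { input: x, error: validator.errors } };

const check = toEither(makeFakeValidator());
const bad = check({ city: 'Boston' });
const good = check({ zip: '02110' }); // this call resets the validator's errors

// The Left still carries the errors captured when its object was validated.
console.log(bad.value.error);
```

Because each Left captures the errors at the moment of failure, later validations cannot clobber them, which is what makes it safe to map one compiled validator over a whole list.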

Interested in your thoughts!

UPDATE

We could rewrite these two compositions in more traditional form like this:

const getObjectValidator = jsonSchema => {
const objectValidator = getBaseValidator(jsonSchema)
const objectValidatorToEither =
validateToEither(objectValidator)
return objectValidatorToEither
}
