File formatting
We support two main integration methods: API and file.
What is a file
- A file is a piece of data:
- It can contain anything: an image, JSON, Excel, CSV, XML. The content can be a list (a batch) or a single object.
- A batch is a file that contains a list:
- The list can consist of CSV records, JSON, JSON Lines, XML or XML lines, and so on.
- The terms batch, blob and file are sometimes used interchangeably in the documentation; they are near-synonyms.
- Batch does NOT refer to a schedule (i.e. how often something runs), only to the fact that the file contains a list.
Batch-file formatting
- As mentioned, a batch can contain any list of data, but the recommended format is as follows:
- Content type: application/x-jsonlines
- Each data item is serialized to a single line of JSON (JSON streaming)
- Properties use camelCasing
- Enumerations are serialized as strings
- Each serialized data item is written as a string to a stream
- The string is UTF-8 encoded
- The string ends with a newline character
- The stream is compressed using GZip
- Example of batch (before GZip)
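The recommended format can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the actual export code: the `OrderStatus` enumeration, the `orderId` and `orderStatus` properties, and the function names are hypothetical, and serializing an enumeration by its member name is one possible reading of "enumerations are serialized as strings".

```python
import gzip
import io
import json
from enum import Enum


class OrderStatus(Enum):
    """Hypothetical enumeration, used only for illustration."""
    OPEN = 1
    CLOSED = 2


def to_json_line(item: dict) -> str:
    """Serialize one data item to a single line of JSON ending in a newline."""
    def encode(value):
        # Assumption: enumerations are written as their member name (a string).
        if isinstance(value, Enum):
            return value.name
        raise TypeError(f"Cannot serialize {type(value).__name__}")

    # json.dumps never emits raw line breaks inside a value, so the result
    # is always exactly one line.
    return json.dumps(item, default=encode, ensure_ascii=False) + "\n"


def write_batch(items) -> bytes:
    """Write every item as a UTF-8 encoded line to a GZip-compressed stream."""
    buffer = io.BytesIO()
    with gzip.GzipFile(fileobj=buffer, mode="wb") as stream:
        for item in items:
            stream.write(to_json_line(item).encode("utf-8"))
    return buffer.getvalue()


def read_batch(payload: bytes) -> list:
    """Decompress a batch and parse each non-empty line as JSON."""
    text = gzip.decompress(payload).decode("utf-8")
    return [json.loads(line) for line in text.split("\n") if line]
```

A consumer can process such a batch line by line, since every line is a complete JSON document on its own. Property names in the items are expected to be camelCased already, for example `{"orderId": "A-1", "orderStatus": OrderStatus.OPEN}`.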
- In some cases we also do XML batch export, with the following format:
- Content type: application/x-xmllines
- Each data item is serialized to a single line of XML
- All insignificant whitespace is removed
- The XML prolog is not included (i.e. <?xml version="1.0" encoding="UTF-8"?>)
- Each serialized data item is written as a string to a stream
- The string is UTF-8 encoded
- The string ends with a newline character
- The stream is compressed using GZip
- Example of batch (before GZip)
- This example represents an ARTS XML export, including the namespace for LRS extensions; most XML elements are omitted for brevity
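The XML lines rules above can be sketched the same way. This is a hedged illustration, not the ARTS export itself: the `Transaction`, `SequenceNumber` and `Note` elements and the function names are hypothetical, and treating whitespace-only text nodes as insignificant is an assumption that may not hold for every schema.

```python
import gzip
import xml.etree.ElementTree as ET


def strip_insignificant_whitespace(element: ET.Element) -> None:
    """Drop text and tail nodes that contain only whitespace (assumption)."""
    for node in element.iter():
        if node.text is not None and not node.text.strip():
            node.text = None
        if node.tail is not None and not node.tail.strip():
            node.tail = None


def to_xml_line(element: ET.Element) -> str:
    """Serialize one data item to a single line of XML without a prolog."""
    strip_insignificant_whitespace(element)
    # encoding="unicode" returns a str and emits no XML declaration.
    line = ET.tostring(element, encoding="unicode")
    if "\n" in line or "\r" in line:
        # A line break inside a text value would break the one-item-per-line rule.
        raise ValueError("Serialized item contains a line break")
    return line + "\n"


def write_xml_batch(elements) -> bytes:
    """Join the lines, encode them as UTF-8 and compress the result with GZip."""
    text = "".join(to_xml_line(element) for element in elements)
    return gzip.compress(text.encode("utf-8"))
```

Because the prolog is left out and each line is a complete element, a reader can decompress the stream, split it on newline characters and parse every line independently.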