The Warp10OutputFormat is a standard Hadoop OutputFormat which can store Geo Time Series into a Warp 10 instance, either distributed or standalone.

As such, it can be used anywhere a Hadoop OutputFormat can be used, for example from within a MapReduce, Pig or Spark job.

Using the Warp10OutputFormat

The Warp10OutputFormat expects (key,value) tuples as input. The key is discarded so you can safely pass an instance of NullWritable as the key. The value is expected to be a Geo Time Series wrapper either in raw format as produced by WRAPRAW or in text format as produced by WRAP. The raw format must be passed as a BytesWritable, the text format as Text.

The Warp10OutputFormat is flexible enough to accept mixed inputs in both raw and text formats.

The behavior of the Warp10OutputFormat is governed by the following configuration keys:

warp10.gzip: Set to true to compress the payload sent to Warp 10. This defaults to false.
warp10.endpoint: Warp 10 update endpoint to use (http://host:port/api/vX/update).
warp10.token: Write token to use for pushing data.
warp10.maxrate: Maximum write rate (in datapoints per second) when talking to the update endpoint. Note that this is a maximum rate per task; if your job has multiple tasks pushing data, they will each enforce this limit. By default there is no maximum rate enforced.
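
Because warp10.maxrate applies per task, the aggregate write rate of a job can reach the number of tasks multiplied by that value. The sketch below is an illustration, not part of the Warp10OutputFormat: it derives a per-task value from an overall budget and collects the four keys in a map. The class name, method names and the notion of a "global budget" are hypothetical, and it assumes warp10.maxrate accepts a whole number written as a string.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not shipped with Warp 10.
class Warp10OutputSettings {

    // Largest per-task rate such that `tasks` tasks, each writing at that
    // rate, stay within `globalBudget` datapoints per second overall.
    static long perTaskMaxRate(long globalBudget, int tasks) {
        if (tasks <= 0 || globalBudget < tasks) {
            throw new IllegalArgumentException("need tasks > 0 and globalBudget >= tasks");
        }
        return globalBudget / tasks; // integer division rounds down
    }

    // Collects the configuration keys listed above as plain strings.
    static Map<String, String> settings(String endpoint, String token,
                                        boolean gzip, long globalBudget, int tasks) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("warp10.endpoint", endpoint);
        settings.put("warp10.token", token);
        settings.put("warp10.gzip", Boolean.toString(gzip));
        // Assumption: a whole number of datapoints per second is accepted.
        settings.put("warp10.maxrate", Long.toString(perTaskMaxRate(globalBudget, tasks)));
        return settings;
    }
}
```

Each entry of the resulting map could then be copied into the job's Hadoop Configuration with its set(key, value) method. With a budget of 10000 datapoints per second and 4 tasks, each task would be limited to 2500.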

Note that the Warp10OutputFormat can be instantiated with a suffix. Each configuration key is first looked up with that suffix appended. If no suffixed version is found, the unsuffixed key is looked up, and if that is not found either, the default value (when one exists) is used.
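
The lookup order described above can be sketched as follows. This is an illustration over a plain map, not the actual Warp 10 code: the class and method names are hypothetical, and it assumes the suffix is concatenated to the key as-is, so any separator such as a dot is part of the suffix itself.

```java
import java.util.Map;

// Hypothetical illustration of the suffixed lookup order.
class SuffixedLookup {

    // Returns the suffixed value if present, else the unsuffixed value,
    // else the default (which may be null when no default exists).
    static String lookup(Map<String, String> conf, String suffix,
                         String key, String defaultValue) {
        if (suffix != null) {
            String suffixed = conf.get(key + suffix);
            if (suffixed != null) {
                return suffixed;
            }
        }
        String plain = conf.get(key);
        return plain != null ? plain : defaultValue;
    }
}
```

For example, if the configuration holds both warp10.token and warp10.token.archive, an instance created with the (hypothetical) suffix ".archive" would use the second value, while an instance with any other suffix would fall back to warp10.token. A key such as warp10.gzip that is set in neither form would resolve to its default.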