Overview
The Warp10OutputFormat is a standard Hadoop OutputFormat that stores Geo Time Series in a Warp 10 instance, either distributed or standalone.
As such, it can be used anywhere a Hadoop OutputFormat can be used, for example from within a MapReduce, Pig or Spark job.
Using the Warp10OutputFormat
The Warp10OutputFormat expects (key,value) tuples as input. The key is discarded, so you can safely pass an instance of NullWritable as the key. The value is expected to be a Geo Time Series wrapper, either in raw format as produced by WRAPRAW or in text format as produced by WRAP. The raw format must be passed as a BytesWritable, the text format as a Text.
The Warp10OutputFormat accepts mixed inputs, so a single job can emit values in both raw and text formats.
The behavior of the OutputFormat is governed by the following configuration keys:
Key | Description |
---|---|
warp10.gzip | Set to true to compress the payload sent to Warp 10. Defaults to false. |
warp10.endpoint | Warp 10 update endpoint to use (http://host:port/api/vX/update) |
warp10.token | Write token to use for pushing data |
warp10.maxrate | Maximum write rate (in datapoints per second) when talking to the update endpoint. This is a maximum rate per task: if your job has multiple tasks pushing data, each of them enforces this limit independently. By default, no maximum rate is enforced. |
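Because warp10.maxrate applies per task, the total load on the update endpoint grows with the number of writing tasks. The sketch below illustrates that arithmetic only. The class and method names are hypothetical and not part of Warp 10 or Hadoop, and it assumes whole-number rates for simplicity.

```java
public class RateBudget {
    // Hypothetical helper: upper bound on the combined write rate when every
    // task enforces the same warp10.maxrate independently.
    public static long aggregateRate(long perTaskMaxRate, int tasks) {
        return Math.multiplyExact(perTaskMaxRate, (long) tasks);
    }

    // Hypothetical helper: value to give warp10.maxrate so that all tasks
    // together stay within a global budget (integer division rounds down).
    public static long perTaskRate(long globalBudget, int tasks) {
        if (tasks <= 0 || globalBudget < tasks) {
            throw new IllegalArgumentException("need tasks > 0 and globalBudget >= tasks");
        }
        return globalBudget / tasks;
    }

    public static void main(String[] args) {
        // 8 tasks, each limited to 25,000 datapoints per second
        System.out.println(aggregateRate(25_000L, 8));
        // Global budget of 100,000 datapoints per second shared by 8 tasks
        System.out.println(perTaskRate(100_000L, 8));
    }
}
```

With these example figures, 8 tasks limited to 25,000 datapoints per second can together push up to 200,000 datapoints per second, and a global budget of 100,000 datapoints per second corresponds to a per-task limit of 12,500.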
Note that the Warp10OutputFormat can be instantiated with a suffix. Each configuration key is first looked up with that suffix appended. If no suffixed version is found, the unsuffixed key is looked up, and if that is not found either, the default value (when one exists) is used.
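The lookup order described above can be sketched as follows. This is an illustration, not the actual Warp10OutputFormat code: a plain Map stands in for the Hadoop configuration, the helper name is hypothetical, the token values are placeholders, and the suffix is assumed to be appended verbatim (so any separator such as a dot must be part of the suffix itself).

```java
import java.util.HashMap;
import java.util.Map;

public class SuffixedLookup {
    // Hypothetical helper: try the suffixed key first, then the plain key,
    // then fall back to the default value (which may be null).
    public static String lookup(Map<String, String> conf, String key,
                                String suffix, String defaultValue) {
        if (suffix != null && !suffix.isEmpty()) {
            String suffixed = conf.get(key + suffix);
            if (suffixed != null) {
                return suffixed;
            }
        }
        String plain = conf.get(key);
        return plain != null ? plain : defaultValue;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("warp10.gzip", "true");
        conf.put("warp10.token", "GLOBAL_TOKEN");          // placeholder value
        conf.put("warp10.token.archive", "ARCHIVE_TOKEN"); // placeholder value

        // Suffixed key exists, so it takes precedence over the plain key
        System.out.println(lookup(conf, "warp10.token", ".archive", null));
        // No suffixed key, so the plain key is used
        System.out.println(lookup(conf, "warp10.gzip", ".archive", "false"));
        // Neither key is set and there is no default, so the result is null
        System.out.println(lookup(conf, "warp10.maxrate", ".archive", null));
    }
}
```

A suffix makes it possible for several Warp10OutputFormat instances in the same job to share one configuration while each one overrides only the keys that differ, such as the token or the endpoint.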