Warp10OutputFormat is a standard Hadoop OutputFormat which can store Geo Time Series™ into a Warp 10™ instance, either distributed or standalone.
As such, it can be used anywhere a Hadoop OutputFormat can be used, for example from within a MapReduce, Pig, or Spark job.
Warp10OutputFormat expects (key, value) tuples as input. The key is discarded, so you can safely pass an instance of NullWritable as the key. The value is expected to be a Geo Time Series™ wrapper, either in raw format as produced by WRAPRAW or in text format as produced by WRAP. The raw format must be passed as a BytesWritable and the text format as Text.
Warp10OutputFormat is flexible enough to accept mixed inputs in both raw and text formats.
The behavior of the OutputFormat is governed by the following configuration keys:
|warp10.gzip||Set to |
|warp10.endpoint||Warp 10™ update endpoint to use (|
|warp10.token||Write token to use for pushing data|
|warp10.maxrate||Maximum write rate (in datapoints per second) when talking to the update endpoint. This is a maximum rate per task: if your job has multiple tasks pushing data, each task enforces this limit independently. By default, no maximum rate is enforced.|
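Because warp10.maxrate applies per task, the aggregate rate seen by the update endpoint scales with the number of tasks writing concurrently. The following is a minimal sketch of that arithmetic, not part of Warp 10: the class MaxRatePlanner and its methods are hypothetical helpers, and the sketch assumes all tasks run at the same time and that whole datapoints per second are sufficient.

```java
// Hypothetical helper, not part of Warp 10: illustrates how a per-task
// warp10.maxrate translates into an aggregate write rate, and back.
class MaxRatePlanner {

    // Worst-case aggregate rate if every task writes at its own limit.
    static long aggregateRate(long perTaskMaxRate, int tasks) {
        if (tasks <= 0) {
            throw new IllegalArgumentException("tasks must be positive");
        }
        return perTaskMaxRate * tasks;
    }

    // Per-task limit to configure so that the aggregate stays at or below
    // the target (integer division rounds down, which errs on the safe side).
    static long perTaskRate(long targetAggregateRate, int tasks) {
        if (tasks <= 0) {
            throw new IllegalArgumentException("tasks must be positive");
        }
        return targetAggregateRate / tasks;
    }

    public static void main(String[] args) {
        // 8 tasks, each limited to 10,000 datapoints per second.
        System.out.println(aggregateRate(10_000L, 8)); // prints 80000
        // Per-task limit for a 50,000 datapoints per second budget over 8 tasks.
        System.out.println(perTaskRate(50_000L, 8));   // prints 6250
    }
}
```

The value returned by perTaskRate is what you would set as warp10.maxrate if the goal is to cap the job as a whole rather than each task.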
Note that the
Warp10OutputFormat can be instantiated with a suffix. Each configuration key is first looked up with that suffix appended. If no suffixed version is found, the unsuffixed key is looked up, and if that is not found either, the default value (when one exists) is used.
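The lookup order described above can be sketched as follows. This is an illustration only and not the Warp 10 implementation: the class SuffixLookup, the method lookup, the plain Map standing in for the Hadoop configuration, the suffix ".archive", and the token values are all hypothetical. The sketch also assumes the suffix is appended to the key verbatim, so any separator is part of the suffix string passed in.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the suffixed-key lookup order.
class SuffixLookup {

    // Returns the suffixed value if present, else the unsuffixed value,
    // else the supplied default (which may be null when no default exists).
    static String lookup(Map<String, String> conf, String key,
                         String suffix, String defaultValue) {
        String value = conf.get(key + suffix);
        if (value == null) {
            value = conf.get(key);
        }
        return value != null ? value : defaultValue;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("warp10.token", "SHARED_TOKEN");          // placeholder value
        conf.put("warp10.token.archive", "ARCHIVE_TOKEN"); // placeholder value

        // The suffixed key wins when it is set.
        System.out.println(lookup(conf, "warp10.token", ".archive", null));
        // No suffixed key for this suffix, so the unsuffixed key is used.
        System.out.println(lookup(conf, "warp10.token", ".live", null));
        // Neither version is set and there is no default.
        System.out.println(lookup(conf, "warp10.maxrate", ".archive", null));
    }
}
```

This layering lets several differently suffixed instances share one configuration, each overriding only the keys that differ.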