Avro & Java: Record Parsing

This tutorial will guide you through how to convert json to avro and then back to json. I suggest you first read through the documentation on Avro to familiarize yourself with it. This tutorial assumes you have a maven project already setup and a resources folder.

POM:

Add Avro Dependency

 

 

 

 

Add Jackson Dependency

Avro Schema File:

Next you need to create the avro schema file in your resources folder. Name the file “schema.avsc”. The extension avsc is the Avro schema extension.

  1. {
  2. "namespace": "test.avro",
  3. "type": "record",
  4. "name": "MY_NAME",
  5. "fields": [
  6. {"name": "name_1", "type": "int"},
  7. {"name": "name_2", "type": {"type": "array", "items": "float"}},
  8. {"name": "name_3", "type": "float"}
  9. ]
  10. }

Json Record to Validate:

Next you need to create a json file that conforms to your schema you just made. Name the file “record.json” and put it in your resources folder. The contents can be whatever you want as long as it conforms to your schema above.

  1. { "name_1": 234, "name_2": [23.34,654.98], "name_3": 234.7}

It’s Avro Time:

Imports:

  1. import java.io.ByteArrayOutputStream;
  2. import java.io.DataInputStream;
  3. import java.io.File;
  4. import java.io.IOException;
  5. import java.io.InputStream;
  6.  
  7. import org.apache.avro.Schema;
  8. import org.apache.avro.generic.GenericData;
  9. import org.apache.avro.generic.GenericDatumReader;
  10. import org.apache.avro.generic.GenericDatumWriter;
  11. import org.apache.avro.io.DatumReader;
  12. import org.apache.avro.io.Decoder;
  13. import org.apache.avro.io.DecoderFactory;
  14. import org.apache.avro.io.Encoder;
  15. import org.apache.avro.io.EncoderFactory;
  16.  
  17. import com.fasterxml.jackson.databind.JsonNode;
  18. import com.fasterxml.jackson.databind.ObjectMapper;

Conversion to Avro and Back:

  1. private void run() throws IOException {
  2. //Get the schema and json record from resources
  3. final ClassLoader loader = getClass().getClassLoader();
  4. final File schemaFile = new File(loader.getResource("schema.avsc").getFile());
  5. final InputStream record = loader.getResourceAsStream("record.json");
  6. //Create avro schema
  7. final Schema schema = new Schema.Parser().parse(schemaFile);
  8.  
  9. //Encode to avro
  10. final byte[] avro = encodeToAvro(schema, record);
  11.  
  12. //Decode back to json
  13. final JsonNode node = decodeToJson(schema, avro);
  14.  
  15. System.out.println(node);
  16. System.out.println("done");
  17. }
  18.  
  19. /**
  20. * Encode json to avro
  21. *
  22. * @param schema the schema the avro pertains to
  23. * @param record the data to convert to avro
  24. * @return the avro bytes
  25. * @throws IOException if decoding fails
  26. */
  27. private byte[] encodeToAvro(Schema schema, InputStream record) throws IOException {
  28. final DatumReader<GenericData.Record> reader = new GenericDatumReader<>(schema);
  29. final DataInputStream din = new DataInputStream(record);
  30. final Decoder decoder = new DecoderFactory().jsonDecoder(schema, din);
  31. final Object datum = reader.read(null, decoder);
  32. final GenericDatumWriter<Object> writer = new GenericDatumWriter<>(schema);
  33. final ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
  34. final Encoder encoder = new EncoderFactory().binaryEncoder(outputStream, null);
  35. writer.write(datum, encoder);
  36. encoder.flush();
  37.  
  38. return outputStream.toByteArray();
  39. }
  40.  
  41. /**
  42. * Decode avro back to json.
  43. *
  44. * @param schema the schema the avro pertains to
  45. * @param avro the avro bytes
  46. * @return the json
  47. * @throws IOException if jackson fails
  48. */
  49. private JsonNode decodeToJson(Schema schema, byte[] avro) throws IOException {
  50. final ObjectMapper mapper = new ObjectMapper();
  51. final DatumReader<GenericData.Record> reader = new GenericDatumReader<>(schema);
  52. final Decoder decoder = new DecoderFactory().binaryDecoder(avro, null);
  53. final JsonNode node = mapper.readTree(reader.read(null, decoder).toString());
  54.  
  55. return node;
  56. }

AWS: Java Kinesis Lambda Handler

This entry is part 2 of 5 in the series AWS & Java

If you want to write a Lambda for AWS in Java that connects to a Kinesis Stream. You need to have the handler.

Maven:

  1. <dependency>
  2. <groupId>com.amazonaws</groupId>
  3. <artifactId>aws-java-sdk</artifactId>
  4. <version>1.11.109</version>
  5. </dependency>
  6. <dependency>
  7. <groupId>com.fasterxml.jackson.core</groupId>
  8. <artifactId>jackson-databind</artifactId>
  9. <version>2.7.1</version>
  10. </dependency>

This is the method that AWS Lambda will call. It will look similar to the one below.

  1. import com.amazonaws.services.lambda.runtime.Context;
  2. import com.amazonaws.services.lambda.runtime.events.KinesisEvent;
  3. import com.amazonaws.services.lambda.runtime.events.KinesisEvent.KinesisEventRecord;
  4. import com.amazonaws.services.kinesis.model.Record;
  5. import com.fasterxml.jackson.databind.ObjectMapper;
  6. import com.fasterxml.jackson.databind.JsonNode;
  7.  
  8. public void kinesisRecordHandler(KinesisEvent kinesisEvent, Context context) {
  9. final String awsRequestId = context.getAwsRequestId();
  10. final int memoryLimitMb = context.getMemoryLimitInMB();
  11. final int remainingTimeInMillis = context.getRemainingTimeInMillis();
  12.  
  13. for (final KinesisEventRecord kinesisRec : kinesisEvent.getRecords()) {
  14. final Record record = kinesisRec.getKinesis();
  15.  
  16. //We get the kinesis data information
  17. final JsonNode recData = new ObjectMapper().readValue(record.getData().array(), JsonNode.class);
  18.  
  19. final String bucketName = recData.get("bucket").asText();
  20. final String key = recData.get("key").asText();
  21. }
  22. }

The thing to note when you setup you Lambda is how to setup the “Handler” field in the “Configuration” section on AWS. It is in the format “##PACKAGE##.##CLASS##::##METHOD##”.

Java: Jackson Json

When you want to work with JSON the package of choice is Jackson. I find it so useful and easy to use.

Maven:

  1. <dependency>
  2. <groupId>com.fasterxml.jackson.core</groupId>
  3. <artifactId>jackson-core</artifactId>
  4. <version>2.7.1</version>
  5. </dependency>
  6. <dependency>
  7. <groupId>com.fasterxml.jackson.core</groupId>
  8. <artifactId>jackson-annotations</artifactId>
  9. <version>2.7.1</version>
  10. </dependency>
  11. <dependency>
  12. <groupId>com.fasterxml.jackson.core</groupId>
  13. <artifactId>jackson-databind</artifactId>
  14. <version>2.7.1</version>
  15. </dependency>

There are so many things that we can do with Jackson. It’s actually very exciting.

Let’s start with converting a json string to ObjectNode.

  1. import com.fasterxml.jackson.databind.ObjectMapper;
  2. import com.fasterxml.jackson.databind.node.ObjectNode;
  3.  
  4. //We need an object mapper to help with the conversion
  5. final ObjectMapper mapper = new ObjectMapper();
  6. //Next we read the json string into ObjectNode
  7. final ObjectNode node = (ObjectNode)mapper.readTree(jsonString);
  8.  
  9. //You can check if the objectnode has a key by
  10. node.has("my_key")
  11.  
  12. //You can get the value by
  13. node.get("my_key")
  14.  
  15. //To get the value as a string using
  16. .asText() at the end of the get
  17.  
  18. //To get as int
  19. .asInt()
  20.  
  21. //To get as boolean
  22. .asBoolean()
  23.  
  24. //To get as double
  25. .asDouble()
  26.  
  27. //To get as Long
  28. .asLong()

Next let’s convert a json string to JsonNode

  1. import com.fasterxml.jackson.databind.JsonNode;
  2. import com.fasterxml.jackson.databind.ObjectMapper;
  3.  
  4. //You need an object mapper to help with the conversion
  5. final ObjectMapper mapper = new ObjectMapper();
  6. //Next we read the json string into a jsonnode
  7. final JsonNode node = mapper.readTree(jsonString);
  8. //We can if you want to switch it to an object node fairly easily.
  9. final ObjectNode objNode = (ObjectNode)node;
  10.  
  11. //Put value in the object node
  12. objNode.put("my_key", 1);
  13.  
  14. //You can also set json node
  15. objNode.set("my_key", myJsonNode);

You can also create a new object node by using JsonNodeFactory. See below.

  1. import com.fasterxml.jackson.databind.node.JsonNodeFactory;
  2. import com.fasterxml.jackson.databind.node.ObjectNode;
  3.  
  4. //If you want to create an instance of ObjectNode
  5. final ObjectNode objNode = JsonNodeFactory.instance.objectNode();

If you want to work with json array’s it’s also pretty straight forward. Again we work with JsonNodeFactory to declare it out.

  1. import com.fasterxml.jackson.databind.node.ArrayNode;
  2. import com.fasterxml.jackson.databind.node.JsonNodeFactory;
  3.  
  4. private ArrayNode arrNode = JsonNodeFactory.instance.arrayNode();
  5.  
  6. //We can then add to it
  7. arrNode.add();

Jackson also has generators which allow us to create json on the fly.

  1. import com.fasterxml.jackson.core.JsonFactory;
  2. import com.fasterxml.jackson.core.JsonGenerator;
  3. import com.fasterxml.jackson.core.JsonProcessingException;
  4.  
  5. //It throws a "JsonProcessingException'
  6.  
  7. final JsonFactory jfactory = new JsonFactory();
  8. ByteArrayOutputStream out=new ByteArrayOutputStream();
  9.  
  10. try (final JsonGenerator jg = jfactory.createGenerator(out, JsonEncoding.UTF8)) {
  11. jg.setCodec(objectMapper);
  12.  
  13. jg.writeStartObject();
  14.  
  15. jg.writeFieldName("my_key");
  16. jg.writeString("hi");
  17.  
  18. jg.writeFieldName("next_key");
  19. jg.writeObject(my_obj);
  20.  
  21. jg.writeFieldName("third_key");
  22. jg.writeTree(my_obj_node);
  23.  
  24. jg.writeEndObject();
  25. }

Jackson JsonViews are really cool in my opinion. We can have an object that has multiple properties and we can set a view to each one of these properties. Which means when we serialize the object we will only get those fields that are associated to that view and vise versa on the deserialize.

  1. import com.fasterxml.jackson.annotation.JsonCreator;
  2. import com.fasterxml.jackson.annotation.JsonProperty;
  3. import com.fasterxml.jackson.annotation.JsonView;
  4.  
  5. //You can however many views we want to work with in our objects. They must all be classes.
  6. //If you extend a class then it will contain all properties of that view and the view it extends.
  7. public class MyClassViews {
  8. public static class ViewA {}
  9. public static class ViewB extends ViewA {}
  10. }
  11.  
  12. //Let's create an object for our serialization and deserialization
  13. public class MyClass {
  14. private final int id;
  15. private final String value;
  16.  
  17. //This constructor is what the deserializer would use.
  18. @JsonCreator
  19. public MyClass(@JsonProperty("id") int id, @JsonProperty("value") String value) {
  20. }
  21.  
  22. @JsonView(MyClassViews.ViewA.class)
  23. @JsonProperty("id") //You don't have to put the name you could leave it as @JsonProperty
  24. public int getId() {
  25. return id;
  26. }
  27.  
  28. @JsonView(MyClassViews.ViewB.class)
  29. @JsonProperty("value")
  30. public String getValue() {
  31. return value;
  32. }
  33. }

I didn’t come up with this next part but it’s so awesome that I wanted to include it. Let’s say you had an object that was populated not using any views but the object does contain views that will eventually pass up to a UI but you only want to pass using a specific view. Then you can do the following.

  1. import javax.ws.rs.core.StreamingOutput;
  2. import com.fasterxml.jackson.databind.ObjectMapper;
  3.  
  4. final StreamingOutput jsonStream = new StreamingOutput() {
  5. @Override
  6. public void write(OutputStream out) throws IOException {
  7. final ObjectMapper mapper = new ObjectMapper();
  8. mapper.writerWithView(MyClassViews.ViewB.class).writeValue(out, MyClass);
  9. }
  10. };

If you want to write the object to string using a view. All you need to do is

  1. final String jsonString = new ObjectMapper().writerWithView(ViewB).writeValueAsString(MyClass);

If you then want to convert our new jsonString to the object using the view do the following:

  1. new ObjectMapper().readerWithView(ViewB.class).forType(MyClass.class).readValue(jsonString);

If you want to convert a json string to object.

  1. new ObjectMapper().readValue(myJsonString, MyClass.class)