Hive & Java: Connect to Remote Kerberos Hive using KeyTab

In this tutorial I will show you how to connect to a remote Kerberized Hive cluster using Java. If you haven’t installed Hive yet, follow the tutorial.

Import SSL Cert to Java:

Follow this tutorial on “Installing unlimited strength encryption Java libraries”.

If you are on Windows, do the following:

    #Import it
    "C:\Program Files\Java\jdk1.8.0_171\bin\keytool" -import -file hadoop.csr -keystore "C:\Program Files\Java\jdk1.8.0_171\jre\lib\security\cacerts" -alias "hadoop"

    #Check it
    "C:\Program Files\Java\jdk1.8.0_171\bin\keytool" -list -v -keystore "C:\Program Files\Java\jdk1.8.0_171\jre\lib\security\cacerts"

    #If you want to delete it
    "C:\Program Files\Java\jdk1.8.0_171\bin\keytool" -delete -alias hadoop -keystore "C:\Program Files\Java\jdk1.8.0_171\jre\lib\security\cacerts"

POM.xml:

    <dependency>
      <groupId>org.apache.hive</groupId>
      <artifactId>hive-jdbc</artifactId>
      <version>2.3.3</version>
      <exclusions>
        <exclusion>
          <groupId>jdk.tools</groupId>
          <artifactId>jdk.tools</artifactId>
        </exclusion>
      </exclusions>
    </dependency>

Imports:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.security.UserGroupInformation;
    import java.sql.SQLException;
    import java.sql.Connection;
    import java.sql.ResultSet;
    import java.sql.Statement;
    import java.sql.DriverManager;

Connect:

    // Setup the configuration object.
    final Configuration config = new Configuration();

    config.set("fs.defaultFS", "swebhdfs://hadoop:50470");
    config.set("hadoop.security.authentication", "kerberos");
    config.set("hadoop.rpc.protection", "integrity");

    System.setProperty("https.protocols", "TLSv1,TLSv1.1,TLSv1.2");
    System.setProperty("java.security.krb5.conf", "C:\\Program Files\\Java\\jdk1.8.0_171\\jre\\lib\\security\\krb5.conf");
    System.setProperty("java.security.krb5.realm", "REALM.CA");
    System.setProperty("java.security.krb5.kdc", "REALM.CA");
    System.setProperty("sun.security.krb5.debug", "true");
    System.setProperty("javax.net.debug", "all");
    System.setProperty("javax.net.ssl.keyStorePassword", "changeit");
    System.setProperty("javax.net.ssl.keyStore", "C:\\Program Files\\Java\\jdk1.8.0_171\\jre\\lib\\security\\cacerts");
    System.setProperty("javax.net.ssl.trustStore", "C:\\Program Files\\Java\\jdk1.8.0_171\\jre\\lib\\security\\cacerts");
    System.setProperty("javax.net.ssl.trustStorePassword", "changeit");
    System.setProperty("javax.security.auth.useSubjectCredsOnly", "false");

    UserGroupInformation.setConfiguration(config);
    UserGroupInformation.setLoginUser(UserGroupInformation.loginUserFromKeytabAndReturnUGI("hive/hadoop@REALM.CA", "c:\\data\\hive.service.keytab"));

    System.out.println(UserGroupInformation.getLoginUser());
    System.out.println(UserGroupInformation.getCurrentUser());

    //Add the hive driver
    Class.forName("org.apache.hive.jdbc.HiveDriver");

    //Connect to hive jdbc
    Connection connection = DriverManager.getConnection("jdbc:hive2://hadoop:10000/default;principal=hive/hadoop@REALM.CA");
    Statement statement = connection.createStatement();

    //Create a table
    String createTableSql = "CREATE TABLE IF NOT EXISTS "
        + " employee ( eid int, name String, "
        + " salary String, designation String)"
        + " COMMENT 'Employee details'"
        + " ROW FORMAT DELIMITED"
        + " FIELDS TERMINATED BY '\t'"
        + " LINES TERMINATED BY '\n'"
        + " STORED AS TEXTFILE";

    System.out.println("Creating Table: " + createTableSql);
    statement.executeUpdate(createTableSql);

    //Show all the tables to ensure we successfully added the table
    String showTablesSql = "show tables";
    System.out.println("Show All Tables: " + showTablesSql);
    ResultSet res = statement.executeQuery(showTablesSql);

    while (res.next()) {
        System.out.println(res.getString(1));
    }

    //Drop the table
    String dropTablesSql = "DROP TABLE IF EXISTS employee";

    System.out.println("Dropping Table: " + dropTablesSql);
    statement.executeUpdate(dropTablesSql);

    System.out.println("Finish!");

Eclipse/Maven: Jacoco Integration

This tutorial will guide you through configuring Jacoco in your Maven application and installing the Eclipse plugin.

First, open the Eclipse Marketplace and search for “EclEmma”.

Next, click Install and accept the license agreement (after reading it, of course). Once the installation completes, restart Eclipse.

Once Eclipse opens again you can edit the “Code Coverage” settings under “Window/Preferences”.

You can now run “Code Coverage” through Eclipse by right-clicking your project. As you can see below, I have not written any unit tests yet :(.

 

Pom.xml

Build

    <build>
      <plugins>
        <plugin>
          <groupId>org.jacoco</groupId>
          <artifactId>jacoco-maven-plugin</artifactId>
          <version>0.8.1</version>
          <configuration>
            <!-- Path to the output file for execution data. (Used in initialize phase) -->
            <destFile>${project.build.directory}/coverage-reports/jacoco-unit.exec</destFile>
            <!-- File with execution data. (Used in package phase) -->
            <dataFile>${project.build.directory}/coverage-reports/jacoco-unit.exec</dataFile>
            <excludes>
            </excludes>
          </configuration>
          <executions>
            <execution>
              <id>jacoco-initialization</id>
              <phase>initialize</phase>
              <goals>
                <!-- https://www.eclemma.org/jacoco/trunk/doc/prepare-agent-mojo.html -->
                <goal>prepare-agent</goal>
              </goals>
            </execution>
            <execution>
              <id>jacoco-site</id>
              <phase>package</phase>
              <goals>
                <!-- https://www.eclemma.org/jacoco/trunk/doc/report-mojo.html -->
                <goal>report</goal>
              </goals>
            </execution>
          </executions>
        </plugin>
      </plugins>
    </build>


Eclipse/Maven: FindBugs/SpotBugs Integration

This tutorial will guide you through configuring FindBugs/SpotBugs in your Maven application and installing the Eclipse plugin.

First, open the Eclipse Marketplace and search for “SpotBugs”.

Next, click Install and accept the license agreement (after reading it, of course). Once the installation completes, restart Eclipse.

Once Eclipse opens again, right-click the project(s) you want to activate FindBugs/SpotBugs for and click “Properties”. Click “SpotBugs” and then make the following changes.

Now you can run SpotBugs by right-clicking your project and selecting SpotBugs, then “Find Bugs”.

Pom.xml

Reporting

    <reporting>
      <plugins>
        <plugin>
          <groupId>com.github.spotbugs</groupId>
          <artifactId>spotbugs-maven-plugin</artifactId>
          <version>3.1.3</version>
        </plugin>
      </plugins>
    </reporting>

Build

    <build>
      <plugins>
        <plugin>
          <groupId>com.github.spotbugs</groupId>
          <artifactId>spotbugs-maven-plugin</artifactId>
          <version>3.1.3</version>
          <dependencies>
            <dependency>
              <groupId>com.github.spotbugs</groupId>
              <artifactId>spotbugs</artifactId>
              <version>3.1.3</version>
            </dependency>
          </dependencies>
          <configuration>
            <effort>Max</effort>
            <threshold>Low</threshold>
            <failOnError>true</failOnError>
            <plugins>
              <plugin>
                <groupId>com.h3xstream.findsecbugs</groupId>
                <artifactId>findsecbugs-plugin</artifactId>
                <version>LATEST</version>
              </plugin>
            </plugins>
          </configuration>
        </plugin>
      </plugins>
    </build>

Maven Commands

    mvn spotbugs:spotbugs

    #Generates the report site
    mvn site

Eclipse/Maven: PMD Integration

This tutorial will guide you through configuring PMD in your Maven application and installing the Eclipse plugin.

First, open the Eclipse Marketplace and search for “PMD”.

Next, click Install and accept the license agreement (after reading it, of course). Once the installation completes, restart Eclipse.

Once Eclipse opens again, right-click the project(s) you want to activate PMD for and click “Properties”. Click “PMD” and then click “Enable PMD for this project”. You will need to create a rule set. To do that go here.

Pom.xml

Reporting

You will need both reporting plugins in your project. “maven-jxr-plugin” fixes an issue with not finding the xRef.

    <reporting>
      <plugins>
        <plugin>
          <groupId>org.apache.maven.plugins</groupId>
          <artifactId>maven-pmd-plugin</artifactId>
          <version>3.9.0</version>
        </plugin>
        <plugin>
          <groupId>org.apache.maven.plugins</groupId>
          <artifactId>maven-jxr-plugin</artifactId>
          <version>2.5</version>
        </plugin>
      </plugins>
    </reporting>

Build

You will need the following configuration to use the “mvn pmd:???” commands.

    <build>
      <plugins>
        <plugin>
          <groupId>org.apache.maven.plugins</groupId>
          <artifactId>maven-pmd-plugin</artifactId>
          <version>3.9.0</version>
          <configuration>
            <failOnViolation>true</failOnViolation>
            <verbose>true</verbose>
            <targetJdk>1.8</targetJdk>
            <includeTests>false</includeTests>
            <excludes>
            </excludes>
            <excludeRoots>
              <excludeRoot>target/generated-sources/stubs</excludeRoot>
            </excludeRoots>
          </configuration>
          <executions>
            <execution>
              <phase>test</phase>
              <goals>
                <goal>pmd</goal>
                <goal>cpd</goal>
                <goal>cpd-check</goal>
                <goal>check</goal>
              </goals>
            </execution>
          </executions>
        </plugin>
      </plugins>
    </build>

Maven Commands

    mvn pmd:check
    mvn pmd:pmd

    #cpd checks for copy/paste issues
    mvn pmd:cpd-check
    mvn pmd:cpd

    #Generates the report site
    mvn site

Eclipse/Maven: CheckStyle Integration

This tutorial will guide you through configuring CheckStyle in your Maven application and installing the Eclipse plugin.

First, open the Eclipse Marketplace and search for “Checkstyle”.

Next, click Install and accept the license agreement (after reading it, of course). Once the installation completes, restart Eclipse.

Once Eclipse opens again, right-click the project(s) you want to activate CheckStyle for and activate it. There are also properties you can configure through Eclipse’s preferences. I suggest you go there and configure it. You can also customize your checkstyle or make your own. Up to you.

Pom.xml

Build

When you run “mvn checkstyle:check”, it will fail the build if there are any violations.

    <build>
      <plugins>
        <plugin>
          <groupId>org.apache.maven.plugins</groupId>
          <artifactId>maven-checkstyle-plugin</artifactId>
          <version>3.0.0</version>
          <executions>
            <execution>
              <id>validate</id>
              <phase>validate</phase>
              <configuration>
                <encoding>UTF-8</encoding>
                <consoleOutput>true</consoleOutput>
                <failsOnError>true</failsOnError>
                <linkXRef>false</linkXRef>
              </configuration>
              <goals>
                <goal>check</goal>
              </goals>
            </execution>
          </executions>
        </plugin>
      </plugins>
    </build>

Reporting

You can generate an HTML report by running “mvn checkstyle:checkstyle” with the following configuration.

    <reporting>
      <plugins>
        <plugin>
          <groupId>org.apache.maven.plugins</groupId>
          <artifactId>maven-checkstyle-plugin</artifactId>
          <version>3.0.0</version>
          <reportSets>
            <reportSet>
              <reports>
                <report>checkstyle</report>
              </reports>
            </reportSet>
          </reportSets>
        </plugin>
      </plugins>
    </reporting>

Java: Command Line Arguments Parsing

This tutorial will guide you through how to do command line argument parsing easily. For this example we will use the Commons CLI package.

pom.xml

    <properties>
      <commonscli.version>1.4</commonscli.version>
    </properties>

    <dependencies>
      <!-- https://mvnrepository.com/artifact/commons-cli/commons-cli -->
      <dependency>
        <groupId>commons-cli</groupId>
        <artifactId>commons-cli</artifactId>
        <version>${commonscli.version}</version>
      </dependency>
    </dependencies>

Imports

    import org.apache.commons.cli.CommandLine;
    import org.apache.commons.cli.CommandLineParser;
    import org.apache.commons.cli.DefaultParser;
    import org.apache.commons.cli.HelpFormatter;
    import org.apache.commons.cli.Option;
    import org.apache.commons.cli.Options;
    import org.apache.commons.cli.ParseException;

Main

    public static void main(String[] args) {
        final Options options = new Options();
        Option startOption = new Option("s", "start", true, "Start the process.");
        startOption.setRequired(true);
        options.addOption(startOption);

        final HelpFormatter help = new HelpFormatter();
        final CommandLineParser parser = new DefaultParser();
        CommandLine cmd = null;

        try {
            cmd = parser.parse(options, args);
        } catch (final ParseException e) {
            help.printHelp("java -jar myApp.jar", "My Header", options, "-s must be specified");
            return;
        }

        final boolean doStart = Boolean.valueOf(cmd.getOptionValue("s"));

        if (doStart) {
            //Do my work here
        }
    }
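A note on the `-s` value above: Boolean.valueOf only returns true for the literal string "true" (case-insensitive), so something like "-s yes" would silently come out false. A minimal sketch of that behavior:

```java
public class BooleanParseDemo {
    public static void main(String[] args) {
        // Boolean.valueOf returns true only for "true", ignoring case
        System.out.println(Boolean.valueOf("true"));        // true
        System.out.println(Boolean.valueOf("TRUE"));        // true
        // Any other value, including null, yields false
        System.out.println(Boolean.valueOf("yes"));         // false
        System.out.println(Boolean.valueOf((String) null)); // false
    }
}
```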

Avro & Java: Record Parsing

This tutorial will guide you through how to convert json to avro and then back to json. I suggest you first read through the documentation on Avro to familiarize yourself with it. This tutorial assumes you have a Maven project already set up with a resources folder.

POM:

Add Avro Dependency
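The Avro dependency snippet did not survive in this copy of the post; a minimal sketch of what it would look like (the version shown is an assumption, check Maven Central for the one you want):

```xml
<dependency>
  <groupId>org.apache.avro</groupId>
  <artifactId>avro</artifactId>
  <version>1.8.2</version>
</dependency>
```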


Add Jackson Dependency
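The Jackson dependency snippet is also missing here; a sketch of what it would look like (again, the version is an assumption):

```xml
<dependency>
  <groupId>com.fasterxml.jackson.core</groupId>
  <artifactId>jackson-databind</artifactId>
  <version>2.9.5</version>
</dependency>
```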

Avro Schema File:

Next you need to create the Avro schema file in your resources folder. Name the file “schema.avsc”. The extension .avsc is the Avro schema extension.

    {
      "namespace": "test.avro",
      "type": "record",
      "name": "MY_NAME",
      "fields": [
        {"name": "name_1", "type": "int"},
        {"name": "name_2", "type": {"type": "array", "items": "float"}},
        {"name": "name_3", "type": "float"}
      ]
    }

Json Record to Validate:

Next you need to create a json file that conforms to the schema you just made. Name the file “record.json” and put it in your resources folder. The contents can be whatever you want as long as they conform to the schema above.

    { "name_1": 234, "name_2": [23.34,654.98], "name_3": 234.7}

It’s Avro Time:

Imports:

    import java.io.ByteArrayOutputStream;
    import java.io.DataInputStream;
    import java.io.File;
    import java.io.IOException;
    import java.io.InputStream;

    import org.apache.avro.Schema;
    import org.apache.avro.generic.GenericData;
    import org.apache.avro.generic.GenericDatumReader;
    import org.apache.avro.generic.GenericDatumWriter;
    import org.apache.avro.io.DatumReader;
    import org.apache.avro.io.Decoder;
    import org.apache.avro.io.DecoderFactory;
    import org.apache.avro.io.Encoder;
    import org.apache.avro.io.EncoderFactory;

    import com.fasterxml.jackson.databind.JsonNode;
    import com.fasterxml.jackson.databind.ObjectMapper;

Conversion to Avro and Back:

    private void run() throws IOException {
        //Get the schema and json record from resources
        final ClassLoader loader = getClass().getClassLoader();
        final File schemaFile = new File(loader.getResource("schema.avsc").getFile());
        final InputStream record = loader.getResourceAsStream("record.json");
        //Create avro schema
        final Schema schema = new Schema.Parser().parse(schemaFile);

        //Encode to avro
        final byte[] avro = encodeToAvro(schema, record);

        //Decode back to json
        final JsonNode node = decodeToJson(schema, avro);

        System.out.println(node);
        System.out.println("done");
    }

    /**
     * Encode json to avro
     *
     * @param schema the schema the avro pertains to
     * @param record the data to convert to avro
     * @return the avro bytes
     * @throws IOException if decoding fails
     */
    private byte[] encodeToAvro(Schema schema, InputStream record) throws IOException {
        final DatumReader<GenericData.Record> reader = new GenericDatumReader<>(schema);
        final DataInputStream din = new DataInputStream(record);
        final Decoder decoder = new DecoderFactory().jsonDecoder(schema, din);
        final Object datum = reader.read(null, decoder);
        final GenericDatumWriter<Object> writer = new GenericDatumWriter<>(schema);
        final ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
        final Encoder encoder = new EncoderFactory().binaryEncoder(outputStream, null);
        writer.write(datum, encoder);
        encoder.flush();

        return outputStream.toByteArray();
    }

    /**
     * Decode avro back to json.
     *
     * @param schema the schema the avro pertains to
     * @param avro the avro bytes
     * @return the json
     * @throws IOException if jackson fails
     */
    private JsonNode decodeToJson(Schema schema, byte[] avro) throws IOException {
        final ObjectMapper mapper = new ObjectMapper();
        final DatumReader<GenericData.Record> reader = new GenericDatumReader<>(schema);
        final Decoder decoder = new DecoderFactory().binaryDecoder(avro, null);
        final JsonNode node = mapper.readTree(reader.read(null, decoder).toString());

        return node;
    }

Java: Basic Dropwizard Project

This entry is part 1 of 5 in the series Dropwizard

This tutorial will guide you through how to create a bare-bones Dropwizard app. Ensure you have Eclipse installed, or whatever IDE you decide to use. You can use their documentation as a guide. This is the first tutorial in the Dropwizard series.

Setup Eclipse Archetype:

Select New Maven Project; then, in the archetype selection screen, if the Dropwizard archetype isn’t there you can add it using the settings below.

Filter for Dropwizard Archetype:

Set Project Settings:

POM:

Ensure the pom has the correct “dropwizard.version” and that the dependency “dropwizard-core” is there.

Our Configuration:

Now that the project is created let’s have a look at what our configuration looks like. Pretty basic but that’s all we need right now.

    package ca.gaudreault.mydropwizardapp;

    import io.dropwizard.Configuration;
    import com.fasterxml.jackson.annotation.JsonProperty;
    import org.hibernate.validator.constraints.*;
    import javax.validation.constraints.*;

    public class MyDropwizardAppConfiguration extends Configuration {
    }

Our Application:

This is what our application class looks like. It’s empty for now.

    package ca.gaudreault.mydropwizardapp;

    import io.dropwizard.Application;
    import io.dropwizard.setup.Bootstrap;
    import io.dropwizard.setup.Environment;

    public class MyDropwizardAppApplication extends Application<MyDropwizardAppConfiguration> {

        public static void main(final String[] args) throws Exception {
            new MyDropwizardAppApplication().run(args);
        }

        @Override
        public String getName() {
            return "MyDropwizardApp";
        }

        @Override
        public void initialize(final Bootstrap<MyDropwizardAppConfiguration> bootstrap) {

        }

        @Override
        public void run(final MyDropwizardAppConfiguration configuration,
                final Environment environment) {

        }
    }

Config.yml:

This is what our config.yml file looks like at the start.

    logging:
      level: INFO
      loggers:
        ca.gaudreault: DEBUG

Setup Debug Configuration:

Setup your debug configuration like the below setup.

Running:

Once you run it, two sites will be available.

  1. http://localhost:8080
     - Your main site
  2. http://localhost:8081
     - Your operational site (health, etc)

Java: String.format

To format a string with parameters use String.format. There are different format specifiers depending on the value type (boolean, number, string, etc.).

Format with Boolean

    String.format("Boolean: %b", true)

Format with Number

    String.format("Number: %d", 32)

Note that “%d” is the specifier for decimal integers; “%n” is actually the platform line separator, not a number.

Format with String

    String.format("String: %s", "Information")
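Putting the specifiers together, here is a small runnable sketch (Locale.ROOT is used for the floating point case so the decimal separator is stable across machines):

```java
import java.util.Locale;

public class FormatDemo {
    public static void main(String[] args) {
        // %b boolean, %d decimal integer, %s string
        System.out.println(String.format("Boolean: %b", true));         // Boolean: true
        System.out.println(String.format("Number: %d", 32));            // Number: 32
        System.out.println(String.format("String: %s", "Information")); // String: Information

        // Width and precision flags work with the same specifiers
        System.out.println(String.format("Padded: %05d", 32));          // Padded: 00032
        System.out.println(String.format(Locale.ROOT, "Two decimals: %.2f", 3.14159)); // Two decimals: 3.14
    }
}
```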

AWS: Java Post to Kinesis Queue

This entry is part 4 of 5 in the series AWS & Java

Posting to an AWS Kinesis queue is simple and straightforward. As always, you should refer to the AWS documentation.

Put Multiple Records On Queue

Import the following

    import java.nio.ByteBuffer;
    import java.util.ArrayList;
    import java.util.List;

    import com.amazonaws.auth.AWSStaticCredentialsProvider;
    import com.amazonaws.auth.BasicAWSCredentials;
    import com.amazonaws.services.kinesis.AmazonKinesis;
    import com.amazonaws.services.kinesis.AmazonKinesisClientBuilder;
    import com.amazonaws.services.kinesis.model.PutRecordsRequest;
    import com.amazonaws.services.kinesis.model.PutRecordsRequestEntry;
    import com.amazonaws.services.kinesis.model.PutRecordsResult;

Put Records

    AmazonKinesisClientBuilder clientBuilder = AmazonKinesisClientBuilder.standard()
        .withRegion(myRegion)
        .withCredentials(new AWSStaticCredentialsProvider(new BasicAWSCredentials(myAccessKeyId, mySecretKey)));
    AmazonKinesis kinesisClient = clientBuilder.build();
    PutRecordsRequest putRecordsRequest = new PutRecordsRequest();
    putRecordsRequest.setStreamName(myQueue);
    List<PutRecordsRequestEntry> putRecordsRequestEntryList = new ArrayList<>();

    //You can put multiple entries at once if you wanted to
    PutRecordsRequestEntry putRecordsRequestEntry = new PutRecordsRequestEntry();
    putRecordsRequestEntry.setData(ByteBuffer.wrap(myData));
    putRecordsRequestEntry.setPartitionKey(myKey);
    putRecordsRequestEntryList.add(putRecordsRequestEntry);

    putRecordsRequest.setRecords(putRecordsRequestEntryList);
    PutRecordsResult putResult = kinesisClient.putRecords(putRecordsRequest);

Put Single Record On Queue

Import the following

    import java.nio.ByteBuffer;
    import java.nio.charset.StandardCharsets;

    import com.amazonaws.auth.AWSStaticCredentialsProvider;
    import com.amazonaws.auth.BasicAWSCredentials;
    import com.amazonaws.services.kinesis.AmazonKinesis;
    import com.amazonaws.services.kinesis.AmazonKinesisClientBuilder;
    import com.amazonaws.services.kinesis.model.PutRecordRequest;
    import com.amazonaws.services.kinesis.model.PutRecordResult;

Put Record

    AmazonKinesisClientBuilder clientBuilder = AmazonKinesisClientBuilder.standard()
        .withRegion(myRegion)
        .withCredentials(new AWSStaticCredentialsProvider(new BasicAWSCredentials(myAccessKeyId, mySecretKey)));
    AmazonKinesis kinesisClient = clientBuilder.build();
    PutRecordRequest putRecordRequest = new PutRecordRequest();
    putRecordRequest.setStreamName(myQueue);

    putRecordRequest.setData(ByteBuffer.wrap(data.getBytes(StandardCharsets.UTF_8)));
    putRecordRequest.setPartitionKey(myKey);

    PutRecordResult putResult = kinesisClient.putRecord(putRecordRequest);

You have now put one or more records onto the queue. Congratulations!

AWS: Java S3 Upload

This entry is part 3 of 5 in the series AWS & Java

If you want to push data to AWS S3 there are a few different ways to do it. I will show you two that I have used.

Option 1: putObject

    import java.io.InputStream;

    import com.amazonaws.AmazonClientException;
    import com.amazonaws.ClientConfiguration;
    import com.amazonaws.auth.AWSCredentialsProvider;
    import com.amazonaws.services.s3.AmazonS3;
    import com.amazonaws.services.s3.AmazonS3ClientBuilder;
    import com.amazonaws.services.s3.model.ObjectMetadata;

    ClientConfiguration config = new ClientConfiguration();
    config.setSocketTimeout(SOCKET_TIMEOUT);
    config.setMaxErrorRetry(RETRY_COUNT);
    config.setClientExecutionTimeout(CLIENT_EXECUTION_TIMEOUT);
    config.setRequestTimeout(REQUEST_TIMEOUT);
    config.setConnectionTimeout(CONNECTION_TIMEOUT);

    AWSCredentialsProvider credProvider = ...;
    String region = ...;

    AmazonS3 s3Client = AmazonS3ClientBuilder.standard().withCredentials(credProvider).withRegion(region).withClientConfiguration(config).build();

    InputStream stream = ...;
    String bucketName = .....;
    String keyName = ...;
    String mimeType = ...;

    //You use metadata to describe the data.
    final ObjectMetadata metaData = new ObjectMetadata();
    metaData.setContentType(mimeType);

    //There are overrides available. Find the one that suits what you need.
    try {
        s3Client.putObject(bucketName, keyName, stream, metaData);
    } catch (final AmazonClientException ex) {
        //Log the exception
    }

Option 2: MultiPart Upload

    import java.io.InputStream;

    import com.amazonaws.AmazonClientException;
    import com.amazonaws.ClientConfiguration;
    import com.amazonaws.auth.AWSCredentialsProvider;
    import com.amazonaws.event.ProgressEvent;
    import com.amazonaws.event.ProgressEventType;
    import com.amazonaws.event.ProgressListener;
    import com.amazonaws.services.s3.AmazonS3;
    import com.amazonaws.services.s3.AmazonS3ClientBuilder;
    import com.amazonaws.services.s3.model.ObjectMetadata;
    import com.amazonaws.services.s3.transfer.TransferManager;
    import com.amazonaws.services.s3.transfer.TransferManagerBuilder;
    import com.amazonaws.services.s3.transfer.Upload;

    ClientConfiguration config = new ClientConfiguration();
    config.setSocketTimeout(SOCKET_TIMEOUT);
    config.setMaxErrorRetry(RETRY_COUNT);
    config.setClientExecutionTimeout(CLIENT_EXECUTION_TIMEOUT);
    config.setRequestTimeout(REQUEST_TIMEOUT);
    config.setConnectionTimeout(CONNECTION_TIMEOUT);

    AWSCredentialsProvider credProvider = ...;
    String region = ...;

    AmazonS3 s3Client = AmazonS3ClientBuilder.standard().withCredentials(credProvider).withRegion(region).withClientConfiguration(config).build();

    InputStream stream = ...;
    String bucketName = .....;
    String keyName = ...;
    long contentLength = ...;
    String mimeType = ...;

    //You use metadata to describe the data. You need the content length so the multi part upload knows how big it is
    final ObjectMetadata metaData = new ObjectMetadata();
    metaData.setContentLength(contentLength);
    metaData.setContentType(mimeType);

    TransferManager tf = TransferManagerBuilder.standard().withS3Client(s3Client).build();
    tf.getConfiguration().setMinimumUploadPartSize(UPLOAD_PART_SIZE);
    tf.getConfiguration().setMultipartUploadThreshold(UPLOAD_THRESHOLD);
    Upload xfer = tf.upload(bucketName, keyName, stream, metaData);

    ProgressListener progressListener = new ProgressListener() {
        public void progressChanged(ProgressEvent progressEvent) {
            if (xfer == null)
                return;
            if (progressEvent.getEventType() == ProgressEventType.TRANSFER_FAILED_EVENT
                    || progressEvent.getEventType() == ProgressEventType.TRANSFER_PART_FAILED_EVENT) {
                //Log the message
            }
        }
    };

    xfer.addProgressListener(progressListener);
    xfer.waitForCompletion();

Java: Enums

Below is an example of how to create an enum in Java.

    public enum MyEnum {
        VAL(0), NEXTVAL(1);

        private int value;

        private MyEnum(int value) {
            setValue(value);
        }

        public int getValue() {
            return value;
        }

        public void setValue(int value) {
            this.value = value;
        }
    }
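Here is a quick sketch of how the enum above might be used; values(), name(), ordinal(), and valueOf(String) come for free with every Java enum. (In this sketch the field is made final and the setter dropped, since enum constants are shared singletons and are best kept immutable.)

```java
public class MyEnumDemo {
    public enum MyEnum {
        VAL(0), NEXTVAL(1);

        private final int value;

        MyEnum(int value) {
            this.value = value;
        }

        public int getValue() {
            return value;
        }
    }

    public static void main(String[] args) {
        // Iterate the constants in declaration order
        for (MyEnum e : MyEnum.values()) {
            System.out.println(e.name() + " = " + e.getValue());
        }

        // Look a constant up by name
        MyEnum parsed = MyEnum.valueOf("NEXTVAL");
        System.out.println(parsed.ordinal()); // 1
    }
}
```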

Java: Input/Output Streams

This page gives some basic examples of how to convert OutputStreams to InputStreams and vice versa.

Maven:

    <dependency>
      <groupId>org.logback-extensions</groupId>
      <artifactId>logback-ext-loggly</artifactId>
      <version>0.1.4</version>
    </dependency>

To convert an InputStream to OutputStream we can do it using IoUtils.copy as demonstrated below.

    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.InputStream;

    import ch.qos.logback.ext.loggly.io.IoUtils;

    InputStream input = ##INPUTSTREAM##;

    //Convert InputStream to OutputStream
    try (FileOutputStream out = new FileOutputStream(file)) {
        IoUtils.copy(input, out);
    } catch (final IOException e) {
        //Log the exception
    }

To convert a ByteArrayOutputStream to a ByteArrayInputStream we can do it as demonstrated below.

    final ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
    //Convert to ByteArrayInputStream
    final ByteArrayInputStream byteArrayInputStream = new ByteArrayInputStream(byteArrayOutputStream.toByteArray());
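The two conversions can be combined into a small runnable round trip. (writeBytes and readAllBytes require Java 11 and 9 respectively; on older JDKs use write with a manual read loop.)

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class StreamRoundTrip {
    public static void main(String[] args) throws IOException {
        // Write some data into an in-memory OutputStream
        final ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.writeBytes("hello streams".getBytes(StandardCharsets.UTF_8));

        // Convert the captured bytes into an InputStream
        final ByteArrayInputStream in = new ByteArrayInputStream(out.toByteArray());

        // Read everything back out
        final String roundTripped = new String(in.readAllBytes(), StandardCharsets.UTF_8);
        System.out.println(roundTripped); // hello streams
    }
}
```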

AWS: Java Kinesis Lambda Handler

This entry is part 2 of 5 in the series AWS & Java

If you want to write a Lambda for AWS in Java that connects to a Kinesis stream, you need a handler.

Maven:

    <dependency>
      <groupId>com.amazonaws</groupId>
      <artifactId>aws-java-sdk</artifactId>
      <version>1.11.109</version>
    </dependency>
    <dependency>
      <groupId>com.fasterxml.jackson.core</groupId>
      <artifactId>jackson-databind</artifactId>
      <version>2.7.1</version>
    </dependency>

This is the method that AWS Lambda will call. It will look similar to the one below.

    import java.io.IOException;

    import com.amazonaws.services.lambda.runtime.Context;
    import com.amazonaws.services.lambda.runtime.events.KinesisEvent;
    import com.amazonaws.services.lambda.runtime.events.KinesisEvent.KinesisEventRecord;
    import com.amazonaws.services.kinesis.model.Record;
    import com.fasterxml.jackson.databind.ObjectMapper;
    import com.fasterxml.jackson.databind.JsonNode;

    public void kinesisRecordHandler(KinesisEvent kinesisEvent, Context context) throws IOException {
        final String awsRequestId = context.getAwsRequestId();
        final int memoryLimitMb = context.getMemoryLimitInMB();
        final int remainingTimeInMillis = context.getRemainingTimeInMillis();

        for (final KinesisEventRecord kinesisRec : kinesisEvent.getRecords()) {
            final Record record = kinesisRec.getKinesis();

            //We get the kinesis data information
            final JsonNode recData = new ObjectMapper().readValue(record.getData().array(), JsonNode.class);

            final String bucketName = recData.get("bucket").asText();
            final String key = recData.get("key").asText();
        }
    }

The thing to note when you set up your Lambda is how to fill in the “Handler” field in the “Configuration” section on AWS. It is in the format “##PACKAGE##.##CLASS##::##METHOD##”.
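For example, if the method above lived in a hypothetical class KinesisLambda in a hypothetical package ca.gaudreault.mylambda, the “Handler” field would read:

```text
ca.gaudreault.mylambda.KinesisLambda::kinesisRecordHandler
```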

AWS: Java S3 Lambda Handler

This entry is part 1 of 5 in the series AWS & Java

If you want to write a Lambda for AWS in Java that connects to S3, you need a handler.

Maven:

    <dependency>
      <groupId>com.amazonaws</groupId>
      <artifactId>aws-java-sdk</artifactId>
      <version>1.11.109</version>
    </dependency>

This is the method that AWS Lambda will call. It will look similar to the one below.

    import com.amazonaws.services.lambda.runtime.Context;
    import com.amazonaws.services.lambda.runtime.events.S3Event;
    import com.amazonaws.services.s3.event.S3EventNotification.S3Entity;
    import com.amazonaws.services.s3.event.S3EventNotification.S3EventNotificationRecord;

    public void S3Handler(S3Event s3e, Context context) {
        final String awsRequestId = context.getAwsRequestId();
        final int memoryLimitMb = context.getMemoryLimitInMB();
        final int remainingTimeInMillis = context.getRemainingTimeInMillis();

        for (final S3EventNotificationRecord s3Rec : s3e.getRecords()) {
            final S3Entity record = s3Rec.getS3();
            final String bucketName = record.getBucket().getName();
            final String key = record.getObject().getKey();
        }
    }

The thing to note when you set up your Lambda is how to fill in the “Handler” field in the “Configuration” section on AWS. It is in the format “##PACKAGE##.##CLASS##::##METHOD##”.
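As a hypothetical example, if the class above were named “MyLambda” and lived in a package called “com.example” (both made-up names), the Handler field would read:

```
com.example.MyLambda::S3Handler
```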

Java: Jackson Json

When you want to work with JSON the package of choice is Jackson. I find it so useful and easy to use.

Maven:

  1. <dependency>
  2.     <groupId>com.fasterxml.jackson.core</groupId>
  3.     <artifactId>jackson-core</artifactId>
  4.     <version>2.7.1</version>
  5. </dependency>
  6. <dependency>
  7.     <groupId>com.fasterxml.jackson.core</groupId>
  8.     <artifactId>jackson-annotations</artifactId>
  9.     <version>2.7.1</version>
  10. </dependency>
  11. <dependency>
  12.     <groupId>com.fasterxml.jackson.core</groupId>
  13.     <artifactId>jackson-databind</artifactId>
  14.     <version>2.7.1</version>
  15. </dependency>

There are so many things that we can do with Jackson. It’s actually very exciting.

Let’s start with converting a json string to ObjectNode.

  1. import com.fasterxml.jackson.databind.ObjectMapper;
  2. import com.fasterxml.jackson.databind.node.ObjectNode;
  3.  
  4. // We need an object mapper to help with the conversion
  5. final ObjectMapper mapper = new ObjectMapper();
  6. // Next we read the json string into an ObjectNode
  7. final ObjectNode node = (ObjectNode)mapper.readTree(jsonString);
  8.  
  9. // You can check if the ObjectNode has a key
  10. node.has("my_key");
  11.  
  12. // You can get the value by
  13. node.get("my_key");
  14.  
  15. // To get the value as a string, append .asText() to the get
  16. node.get("my_key").asText();
  17.  
  18. // To get it as an int
  19. node.get("my_key").asInt();
  20.  
  21. // To get it as a boolean
  22. node.get("my_key").asBoolean();
  23.  
  24. // To get it as a double
  25. node.get("my_key").asDouble();
  26.  
  27. // To get it as a long
  28. node.get("my_key").asLong();

Next let’s convert a json string to JsonNode

  1. import com.fasterxml.jackson.databind.JsonNode;
  2. import com.fasterxml.jackson.databind.ObjectMapper;
  3. import com.fasterxml.jackson.databind.node.ObjectNode;
  4.  
  5. // You need an object mapper to help with the conversion
  6. final ObjectMapper mapper = new ObjectMapper();
  7. // Next we read the json string into a JsonNode
  8. final JsonNode node = mapper.readTree(jsonString);
  9. // If you want, you can cast it to an ObjectNode fairly easily.
  10. final ObjectNode objNode = (ObjectNode)node;
  11.  
  12. // Put a value in the object node
  13. objNode.put("my_key", 1);
  14.  
  15. // You can also set a JsonNode
  16. objNode.set("my_key", myJsonNode);

You can also create a new object node by using JsonNodeFactory. See below.

  1. import com.fasterxml.jackson.databind.node.JsonNodeFactory;
  2. import com.fasterxml.jackson.databind.node.ObjectNode;
  3.  
  4. //If you want to create an instance of ObjectNode
  5. final ObjectNode objNode = JsonNodeFactory.instance.objectNode();

If you want to work with json arrays it’s also pretty straightforward. Again we use JsonNodeFactory to create one.

  1. import com.fasterxml.jackson.databind.node.ArrayNode;
  2. import com.fasterxml.jackson.databind.node.JsonNodeFactory;
  3.  
  4. private ArrayNode arrNode = JsonNodeFactory.instance.arrayNode();
  5.  
  6. // We can then add to it
  7. arrNode.add("value");

Jackson also has generators which allow us to create json on the fly.

  1. import com.fasterxml.jackson.core.JsonEncoding;
  2. import com.fasterxml.jackson.core.JsonFactory;
  3. import com.fasterxml.jackson.core.JsonGenerator;
  4. import com.fasterxml.jackson.core.JsonProcessingException;
  5. import java.io.ByteArrayOutputStream;
  6.  
  7. // Note: the generator methods throw JsonProcessingException (a subclass of IOException)
  8. // objectMapper is an ObjectMapper declared elsewhere
  9.  
  10. final JsonFactory jfactory = new JsonFactory();
  11. final ByteArrayOutputStream out = new ByteArrayOutputStream();
  12.  
  13. try (final JsonGenerator jg = jfactory.createGenerator(out, JsonEncoding.UTF8)) {
  14.     jg.setCodec(objectMapper);
  15.  
  16.     jg.writeStartObject();
  17.  
  18.     jg.writeFieldName("my_key");
  19.     jg.writeString("hi");
  20.  
  21.     jg.writeFieldName("next_key");
  22.     jg.writeObject(my_obj);
  23.  
  24.     jg.writeFieldName("third_key");
  25.     jg.writeTree(my_obj_node);
  26.  
  27.     jg.writeEndObject();
  28. }

Jackson JsonViews are really cool in my opinion. We can have an object with multiple properties and assign a view to each one. When we serialize the object we only get the fields associated with that view, and vice versa on deserialization.

  1. import com.fasterxml.jackson.annotation.JsonCreator;
  2. import com.fasterxml.jackson.annotation.JsonProperty;
  3. import com.fasterxml.jackson.annotation.JsonView;
  4.  
  5. // You can have however many views you want to work with in your objects. They must all be classes.
  6. // If you extend a class then it will contain all properties of that view and the view it extends.
  7. public class MyClassViews {
  8.     public static class ViewA {}
  9.     public static class ViewB extends ViewA {}
  10. }
  11.  
  12. // Let's create an object for our serialization and deserialization
  13. public class MyClass {
  14.     private final int id;
  15.     private final String value;
  16.  
  17.     // This constructor is what the deserializer would use.
  18.     @JsonCreator
  19.     public MyClass(@JsonProperty("id") int id, @JsonProperty("value") String value) {
  20.         this.id = id;
  21.         this.value = value;
  22.     }
  23.  
  24.     @JsonView(MyClassViews.ViewA.class)
  25.     @JsonProperty("id") // You don't have to put the name; you could leave it as @JsonProperty
  26.     public int getId() {
  27.         return id;
  28.     }
  29.  
  30.     @JsonView(MyClassViews.ViewB.class)
  31.     @JsonProperty("value")
  32.     public String getValue() {
  33.         return value;
  34.     }
  35. }

I didn’t come up with this next part but it’s so awesome that I wanted to include it. Let’s say you have an object that was populated without using any views, but the object does have views that will eventually be passed up to a UI, and you only want to send it using a specific view. Then you can do the following.

  1. import java.io.IOException;
  2. import java.io.OutputStream;
  3. import javax.ws.rs.core.StreamingOutput;
  4. import com.fasterxml.jackson.databind.ObjectMapper;
  5.  
  6. final StreamingOutput jsonStream = new StreamingOutput() {
  7.     @Override
  8.     public void write(OutputStream out) throws IOException {
  9.         final ObjectMapper mapper = new ObjectMapper();
  10.         // myObject is an instance of MyClass
  11.         mapper.writerWithView(MyClassViews.ViewB.class).writeValue(out, myObject);
  12.     }
  13. };

If you want to write the object to a string using a view, all you need to do is

  1. final String jsonString = new ObjectMapper().writerWithView(MyClassViews.ViewB.class).writeValueAsString(myObject);

If you then want to convert our new jsonString back to the object using the view, do the following:

  1. new ObjectMapper().readerWithView(MyClassViews.ViewB.class).forType(MyClass.class).readValue(jsonString);

If you want to convert a json string to an object without a view:

  1. final MyClass obj = new ObjectMapper().readValue(myJsonString, MyClass.class);

Java: Connect to Postgres

Below are the steps to setup a connection to a Postgres DB and some other options that you can use.

Pom.xml:

  1. <dependency>
  2.     <groupId>org.postgresql</groupId>
  3.     <artifactId>postgresql</artifactId>
  4.     <version>9.4.1208.jre7</version>
  5. </dependency>

Import:

  1. import java.sql.Connection;
  2. import java.sql.DriverManager;
  3. import java.sql.PreparedStatement;
  4. import java.sql.ResultSet;
  5. import java.sql.SQLException;

Build The Connection:

  1. Class.forName("org.postgresql.Driver");
  2. Connection connection = DriverManager.getConnection(##URL##, ##USER##, ##PASS##);

Preparing the Query:
We use “PreparedStatement” to set up the query, which allows us to use parameters in it.

  1. PreparedStatement ps = connection.prepareStatement("select id from table where column=?");

If your query has parameters (i.e. ?) then you will need to pass in a value for each one. Below, ##POSITION## is the position at which the parameter appears in the query (starting at 1) and ##VALUE## is the value to bind. You can set various types of data, for example integer, json, etc.

  1. ps.setString(##POSITION##, ##VALUE##);

After we perform a query we need to retrieve the data. We use “ResultSet” for that. How we read the data depends on whether the query returns a single row or multiple rows. Below are examples of both.

  1. ResultSet rs = ps.executeQuery();
  2.  
  3. // Single row returned
  4. if (rs.next()) {
  5.     int variable = rs.getInt("id");
  6. }
  7.  
  8. // Multiple rows returned
  9. while (rs.next()) {
  10.     // Do Something
  11. }

Insert:
If you want to perform an insert you don’t need “executeQuery”; you can just call “execute”.

  1. ps = connection.prepareStatement("insert into mytable (column) values (?)");
  2. ps.setInt(##POSITION##, ##VALUE##);
  3. ps.execute();

Batching:
Sometimes we have a lot of updates or inserts to perform. We should batch that.

  1. // You set up the preparedStatement and then add to the batch.
  2. ps.addBatch();
  3.  
  4. // You can set a max batch size to send the batch once it hits that amount
  5. if (++count % batchSize == 0) {
  6.     ps.executeBatch();
  7. }
  8.  
  9. // After the loop, execute whatever is left in the final partial batch
  10. ps.executeBatch();
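To make the flush logic concrete, here is a small, self-contained sketch. No database is involved: a list of recorded batch sizes stands in for a real PreparedStatement, and the class and method names (BatchFlush, flushSizes) are made up for illustration. The modulo check inside the loop and the leftover flush after it are the same pattern as above.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchFlush {
    // Returns the size of each batch that would be sent for n items
    // with the given batch size: flush inside the loop whenever the
    // counter hits a multiple of batchSize, then once more at the end
    // for any leftover items.
    public static List<Integer> flushSizes(int n, int batchSize) {
        final List<Integer> flushes = new ArrayList<>();
        int count = 0;
        int pending = 0;
        for (int i = 0; i < n; i++) {
            pending++;                // stands in for ps.addBatch()
            if (++count % batchSize == 0) {
                flushes.add(pending); // stands in for ps.executeBatch()
                pending = 0;
            }
        }
        if (pending > 0) {
            flushes.add(pending);     // the final partial batch
        }
        return flushes;
    }

    public static void main(String[] args) {
        System.out.println(flushSizes(10, 4)); // prints [4, 4, 2]
    }
}
```

Without the flush after the loop, the last two of the ten items above would never be sent.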

Passing Json as a Parameter:
We create the postgres object. Set the type to json and the value as a string in json format. Next we set the preparedStatement parameter as the postgres object that we just created.

  1. import org.postgresql.util.PGobject;
  2.  
  3. final PGobject jsonObject = new PGobject();
  4. jsonObject.setType("json");
  5. jsonObject.setValue(value.toString());
  6.  
  7. // Now set the json object into the preparedStatement
  8. ps.setObject(##POSITION##, jsonObject);

Java: ExecutorService / Future

If you want to spin off a bunch of threads and manage them and their responses effectively you can do it this way.

  1. final ExecutorService executor = Executors.newFixedThreadPool(numThreads);
  2. final Collection<Future<TYPE>> futures = new LinkedList<Future<TYPE>>();
  3.  
  4. // Write a loop if you want
  5. final Callable<TYPE> callable = new MyClass();
  6. futures.add(executor.submit(callable));
  7. executor.shutdown();
  8.  
  9. // We need to monitor the queries for their data being returned.
  10. for (final Future<TYPE> future : futures) {
  11.     try {
  12.         final TYPE val = future.get();
  13.     } catch (final InterruptedException e) {
  14.     } catch (final ExecutionException e) {
  15.     }
  16. }


The Callable needs a class that implements it, so you will need something like this.

  1. package mypackage;
  2.  
  3. import java.util.concurrent.Callable;
  4. import org.apache.log4j.Logger;
  5.  
  6. public class MyClass implements Callable<TYPE> {
  7.  
  8.     static final Logger logger = Logger.getLogger(MyClass.class);
  9.  
  10.     MyClass() {
  11.     }
  12.  
  13.     /**
  14.      * This is how each caller queries for the data. It can be called many times and runs on threads from the calling class,
  15.      * so data is returned as it gets it.
  16.      */
  17.     @Override
  18.     public TYPE call() throws Exception {
  19.         try {
  20.             return null;
  21.         } catch (final Exception e) {
  22.             return null;
  23.         }
  24.     }
  25. }
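Putting the executor and the Callable together, here is a minimal runnable sketch using only the JDK. SquareTask and FutureDemo are made-up names, and Integer stands in for the TYPE placeholder; the submit / shutdown / Future.get flow is the same as above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FutureDemo {
    // A stand-in task: squares a number on a worker thread.
    static class SquareTask implements Callable<Integer> {
        private final int n;
        SquareTask(int n) { this.n = n; }
        @Override
        public Integer call() {
            return n * n;
        }
    }

    public static List<Integer> runAll(int numThreads, int... inputs) {
        final ExecutorService executor = Executors.newFixedThreadPool(numThreads);
        final List<Future<Integer>> futures = new ArrayList<>();
        for (final int n : inputs) {
            futures.add(executor.submit(new SquareTask(n)));
        }
        executor.shutdown(); // no new tasks accepted; submitted ones keep running

        final List<Integer> results = new ArrayList<>();
        for (final Future<Integer> future : futures) {
            try {
                results.add(future.get()); // blocks until that task finishes
            } catch (final InterruptedException | ExecutionException e) {
                throw new RuntimeException(e);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(runAll(2, 1, 2, 3)); // prints [1, 4, 9]
    }
}
```

Because we iterate the futures in submission order, the results come back in the same order as the inputs even though the tasks run concurrently.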