Getting Started with Jakarta EE 9: Jakarta Persistence API (JPA)

Originally published on 22 Sep 2021
Last updated on 19 Dec 2023

The Jakarta Persistence API (JPA) lets the runtime store Java objects in a database and read database records back into objects. You can use it to read and write Java instances easily from and to the database.

With the help of annotations on Java classes and instance variables, the mapping is defined between the Java world and the database world.

In this blog, we cover some of the basic aspects of the JPA specification and how you can use it. The specification is rather large, so make sure you also consult the documentation and other resources to discover everything it can do.

Configuration

The configuration and the reading of data described in this section are also covered in the video that accompanies this blog.

As always, little configuration is needed to use this feature, but since we are now connecting to an external system, we need to define a few more aspects.

To connect to a database, we need to indicate where the database is located and what parameters are required for the connection. At the lowest level, this is done using the DataSource, Connection, Statement, and ResultSet which are part of the JDBC specification. JPA uses a higher abstraction level which means you don't need to interact on this detailed level with the database connection.

The runtime still needs a DataSource so that the JPA implementation can do its work. We provide this DataSource through a JNDI entry created on the server. The Data Source with JPA video explains how to configure a DataSource and its JNDI entry within Payara Server.

Within our project, we need to define a persistence.xml file to indicate the JNDI data source we will use.

<?xml version="1.0" encoding="UTF-8"?>
<persistence xmlns="https://jakarta.ee/xml/ns/persistence"
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xsi:schemaLocation="https://jakarta.ee/xml/ns/persistence https://jakarta.ee/xml/ns/persistence/persistence_3_0.xsd"
   version="3.0">
   <persistence-unit name="TestUnit">
      <jta-data-source>jdbc/local-mysql</jta-data-source>
      <properties>
         <property name="jakarta.persistence.schema-generation.database.action" value="drop-and-create"/>
         <property name="jakarta.persistence.sql-load-script-source" value="META-INF/defaultdata.sql"/>
         <property name="eclipselink.logging.level" value="FINEST"/>
      </properties>
   </persistence-unit>
</persistence>

Jakarta EE 9 uses version 3 of the JPA specification and that is the version of the XML namespace we define at the root level of the XML file.

In this file, we define the Persistence unit which is a connection to a database. You can have several units within the same application if you want to access multiple databases.

The JNDI name for the data source is specified with the <jta-data-source> element. JTA stands for Jakarta Transactions (formerly the Java Transaction API), and using this element means that transaction management, such as commits and rollbacks, is handled by the runtime, Payara Server in our example. How this works is covered later in this article.

You can also define several configuration options for the unit with the properties element. In the example, we have defined two properties from the specification itself that instruct the runtime to create the required tables based on the mapping defined in the application and to load some test data. This is nice for a demo like this, but for production scenarios you need to maintain the database outside of the application, of course.

The third property is specific to EclipseLink, the JPA implementation within Payara, and sets the logging level to FINEST.

There are many more options from Jakarta or the JPA implementation that can be found within the documentation.

Define Mapping

As mentioned, JPA works based on the mapping between the Java instance variables and database tables and fields. With a few annotations, we can define how a Java class can be turned into an entity Object.

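import java.io.Serializable;

import jakarta.persistence.Column;
import jakarta.persistence.Entity;
import jakarta.persistence.Id;
import jakarta.persistence.Table;

// Maps the Company class to the Company table
@Entity
@Table(name = "Company")
public class Company implements Serializable {

   // Primary key of the table
   @Id
   @Column(name = "id")
   private Long id;

   @Column(name = "name")
   private String name;

   // Setters and getters
}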

The @Entity indicates to the runtime this class is used by JPA and the @Table indicates the table name we use to store Java instances. Each Entity requires an Id field that corresponds with the Primary Key of the database table. The @Column annotation is optional and not required in this case as the variable name is the same as the database field name.

The @Column annotation has attributes for length and nullability, but these are only used when the database schema is created by the JPA implementation (as we do in this demo). I didn't specify them because in real-world projects the database must be maintained outside the application.

Reading from Database

Now that we have defined how the runtime connects to the database (the JNDI data source name) and how database records map to Java instances, we can read a record from the database. Actions on the database are performed within a transaction, so we can either use EJB beans, which are transactional by default, or give a CDI bean transactional capabilities. The following CDI bean reads a record from the Company table we have defined.

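import jakarta.enterprise.context.ApplicationScoped;
import jakarta.persistence.EntityManager;
import jakarta.persistence.PersistenceContext;
import jakarta.transaction.Transactional;

@ApplicationScoped
@Transactional
public class CompanyService {

   // Injected by the container; bound to the only persistence unit (TestUnit)
   @PersistenceContext
   private EntityManager em;

   // Looks up a Company by its primary key
   public Company findCompany(Long id) {
      return em.find(Company.class, id);
   }
}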

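The @Transactional annotation defines the transactional capabilities. When the method starts, a transaction is started (unless one is already active), and it is committed when the method ends without an exception. If the method throws an unchecked exception (a RuntimeException), a rollback is performed. By default, checked exceptions do not trigger a rollback unless you list them in the rollbackOn attribute of the annotation.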
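
The commit-or-rollback behavior can be pictured with a small plain-Java sketch. It is only a model of the wrapper the container places around a transactional method, not the actual implementation: the FakeTransaction class and the runInTransaction helper are hypothetical names invented for this illustration and are not part of Jakarta Transactions. The sketch rolls back on unchecked exceptions, which is the default for @Transactional.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for a real transaction; it only records what happened.
class FakeTransaction {
    final List<String> events = new ArrayList<>();

    void begin()    { events.add("begin"); }
    void commit()   { events.add("commit"); }
    void rollback() { events.add("rollback"); }
}

class TransactionSketch {

    // Models the wrapper around a transactional method: begin first,
    // roll back when the work throws an unchecked exception,
    // commit when the work returns normally.
    static <T> T runInTransaction(FakeTransaction tx, Supplier<T> work) {
        tx.begin();
        T result;
        try {
            result = work.get();
        } catch (RuntimeException e) {
            tx.rollback();
            throw e;
        }
        tx.commit();
        return result;
    }
}
```

A call that returns normally records begin followed by commit; a call whose work throws a RuntimeException records begin followed by rollback, and the exception still reaches the caller.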

The EntityManager implements the high-level concept of accessing the database. The @PersistenceContext defines which Persistence unit we want to use. Since we have only defined one within the persistence.xml, we don't need to specify the name. Otherwise, we would have to specify the name with the unitName attribute of the @PersistenceContext annotation.

One of the EntityManager methods is find(), which returns the entity whose primary key matches the provided value, or null when no such record exists.

Instead of just reading a single record, you can also read all records from the table by creating a 'Query'. The syntax is similar to SQL but has a few differences and is called JPQL (Jakarta Persistence Query Language). The main difference is that you refer to the Java names and not the database names, as the idea of JPA is that you work with Java objects and the system looks them up for you in the database based on the mapping you have defined.

public List<Company> allCompanies() {
   return em.createQuery("SELECT c FROM Company c", Company.class).getResultList();
}

The above code selects all Company entities. The name used in the query is the entity name (by default the class name, unless the @Entity annotation specifies another one), and the JPA implementation uses the mapping to determine the table. Later on, we will see some more examples of how you can limit the returned records by using query restrictions.

Converters

Not all data can easily be serialized to and from the database. Strings and numbers are handled by default, but other or more complex constructs need a converter that performs the conversion.

One of the Java types that needs a conversion is the enum. You have 2 options to store an enum value within the database: as a number representing the enum's ordinal, or as text representing the enum's name. The number seems appealing but is rather dangerous, as reordering the enum values in code causes the values in the database to be interpreted differently. The following definition is the safer choice for an enum.

@Column(name = "GENDER")
@Enumerated(EnumType.STRING)
private Gender gender;
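
To see why storing the ordinal is fragile, here is a small plain-Java sketch with no JPA involved. The enums GenderV1 and GenderV2 and the helper methods are hypothetical: they stand for the same enum before and after a developer adds a constant at the front, and for what an ordinal-based or name-based mapping would write and read.

```java
// Hypothetical first version of the enum, as it existed when the rows were written.
enum GenderV1 { MALE, FEMALE }

// Hypothetical later version: a new constant was added at the front.
enum GenderV2 { UNKNOWN, MALE, FEMALE }

class EnumStorageSketch {

    // What an ordinal-based mapping would store in a numeric column.
    static int storeAsOrdinal(GenderV1 value) {
        return value.ordinal();
    }

    // What a name-based mapping (EnumType.STRING) would store in a text column.
    static String storeAsName(GenderV1 value) {
        return value.name();
    }

    // Reading a stored ordinal back with the later version of the enum.
    static GenderV2 readOrdinal(int stored) {
        return GenderV2.values()[stored];
    }

    // Reading a stored name back with the later version of the enum.
    static GenderV2 readName(String stored) {
        return GenderV2.valueOf(stored);
    }
}
```

A value written as MALE comes back as UNKNOWN when it travels through the ordinal, while the stored name still resolves to MALE.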

A date also needs some additional information when it is stored. A Java Date instance contains both the day and the hour/minute information, but maybe you are only interested in the date and not in the hours. With an annotation, you can define which portions are considered by the runtime.

@Column(name = "HIRE_DATE")
@Temporal(TemporalType.DATE)
private Date hireDate;

In the case of the hire date, only the day information, and not the hour information, is important and so we specify the TemporalType.DATE value.
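
The difference between the full Date value and its date portion can be shown in plain Java. This is only an illustration of the concept, not what the JPA implementation does internally. The DateOnlySketch class is a hypothetical name, and interpreting the value in UTC is an assumption made so that the result does not depend on the machine's time zone.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.Date;

class DateOnlySketch {

    // Keeps only the calendar day of a java.util.Date, interpreted in UTC.
    // UTC is an assumption of this sketch; a real application has to decide
    // which time zone defines "the day".
    static LocalDate dateOnly(Date timestamp) {
        return timestamp.toInstant().atZone(ZoneOffset.UTC).toLocalDate();
    }
}
```

Two Date instances on the same day, one in the morning and one in the evening, are different values but yield the same date portion.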

And as in so many cases within the Jakarta EE specifications, there is an API that lets you define your own converters. For example, suppose we want to store the favorite color of the employee. This is a possible solution.

import java.awt.Color; // assumption: the Color type used here is java.awt.Color

import jakarta.persistence.AttributeConverter;
import jakarta.persistence.Converter;

@Converter
public class ColorConverter implements AttributeConverter<Color, String> {

   // Entity attribute -> database column, stored as "r,g,b"
   @Override
   public String convertToDatabaseColumn(Color attribute) {
      if (attribute == null) {
         return null;
      }
      StringBuilder result = new StringBuilder();
      result.append(attribute.getRed())
          .append(",")
          .append(attribute.getGreen())
          .append(",")
          .append(attribute.getBlue());
      return result.toString();
   }

   // Database column -> entity attribute
   @Override
   public Color convertToEntityAttribute(String dbData) {
      if (dbData == null || dbData.isBlank()) {
        return null;
      }
      String[] parts = dbData.split(",");

      return new Color(Integer.parseInt(parts[0]),
         Integer.parseInt(parts[1]),
         Integer.parseInt(parts[2]));
   }
}

It stores the Color value as a String in the format r,g,b. The 2 methods of the AttributeConverter interface perform the conversion in both directions. The above code needs a few more safety checks, of course, but it gives you an idea of how to create a converter. It can be used by either setting the autoApply attribute of the @Converter annotation or explicitly indicating it on an entity field.

@Column(name = "FAVORITE_COLOR")
@Convert(converter = ColorConverter.class)
private Color favoriteColor;
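
The safety checks mentioned for the converter could look like the following plain-Java sketch. It leaves out the JPA interface so that it runs on its own. The ColorCodec class and its method names are hypothetical, it assumes the Color type is java.awt.Color, and the validation rules (exactly three parts, each between 0 and 255) are one possible interpretation of "safe" rather than anything the specification requires.

```java
import java.awt.Color;

class ColorCodec {

    // Formats a color as "r,g,b"; a null color maps to a null column value.
    static String toColumn(Color color) {
        if (color == null) {
            return null;
        }
        return color.getRed() + "," + color.getGreen() + "," + color.getBlue();
    }

    // Parses "r,g,b" and rejects malformed input with a descriptive exception.
    static Color fromColumn(String dbData) {
        if (dbData == null || dbData.isBlank()) {
            return null;
        }
        String[] parts = dbData.split(",");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected r,g,b but got: " + dbData);
        }
        int[] rgb = new int[3];
        for (int i = 0; i < 3; i++) {
            try {
                rgb[i] = Integer.parseInt(parts[i].trim());
            } catch (NumberFormatException e) {
                throw new IllegalArgumentException("Not a number in: " + dbData, e);
            }
            if (rgb[i] < 0 || rgb[i] > 255) {
                throw new IllegalArgumentException("Component out of range in: " + dbData);
            }
        }
        return new Color(rgb[0], rgb[1], rgb[2]);
    }
}
```

The two converter methods could delegate to helpers like these, so that a corrupt column value fails with a clear message instead of an ArrayIndexOutOfBoundsException.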

Writing to Database

This section is also covered in the second part of the video that accompanies this blog.

Writing to the database is also rather easy to achieve. The main concern is that the ID field, the primary key of a database table, should be autogenerated, as best practice is that it should not have any business value. So we cannot expect client code that uses the Persistence Context to fill in the value, and we must also guarantee that the value is unique.

Most databases offer a strategy to assist you with this problem, such as the auto-increment fields of MySQL or the sequences of the Oracle database. JPA can rely on these database capabilities for the primary key field. You just need to indicate that the field is not required to have a value when you persist the entity. This is needed because a basic check of JPA is that all Id fields have a value when we save entities to the database.

@Id
@GeneratedValue
@Column(name = "id")
private Long id;

In our example in the video, the table is generated by JPA and so does not use an auto-increment field. Since the table already contains some records, the insert fails because JPA falls back to a simple increment starting from 1.

In our case, we can indicate that JPA needs to maintain a table with the latest value for the primary key, and we can also define an initial value so that we don't have a conflict with already existing records.

@Id
@TableGenerator(name = "companyGen", table = "ID_GEN",
   pkColumnName = "GEN_NAME", valueColumnName = "GEN_VAL",
   pkColumnValue = "CompanyGen", initialValue = 10, allocationSize = 1)
@GeneratedValue(generator = "companyGen")
@Column(name = "id")
private Long id;
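
The idea of a generator table can be modeled in a few lines of plain Java. This is a conceptual sketch only and not the algorithm EclipseLink uses: the IdTableSketch class is hypothetical, and the exact first value a real JPA implementation hands out for a given initialValue may differ. It only shows why starting above the existing keys avoids a conflict.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of a generator table such as ID_GEN:
// one row per generator name, holding the next value to hand out.
class IdTableSketch {

    private final Map<String, Long> nextValueByGenerator = new HashMap<>();

    // Registers a generator row with its starting value.
    void register(String generatorName, long initialValue) {
        nextValueByGenerator.put(generatorName, initialValue);
    }

    // Hands out the current value and advances the row by one,
    // which models an allocation size of 1.
    long next(String generatorName) {
        Long current = nextValueByGenerator.get(generatorName);
        if (current == null) {
            throw new IllegalArgumentException("Unknown generator: " + generatorName);
        }
        nextValueByGenerator.put(generatorName, current + 1);
        return current;
    }
}
```

In this model, a generator registered with a starting value of 10 never produces a key that collides with records whose keys are below 10.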

Parent Child Relations

In almost all cases, there are parent-child relations between tables. In our case, employees belong to a certain company. Within the database, this is handled by defining a foreign key from the child to the parent, from Employee to Company. Within JPA, we can express this relation with the @ManyToOne annotation.

You can remember the correct annotation name with the trick that there are Many employees for One company. Since there is also the @OneToMany annotation, which can be used in the Company class to get all employees of a company, it is good to have an easy-to-remember trick to pick the correct one. But also keep in mind that @OneToMany is the inverse of what the database has and requires more queries, and is therefore not recommended. Later in this section, I'll go a bit deeper into another performance issue you can have.

@ManyToOne()
@JoinColumn(name = "COMPANY_ID")
private Company company;

Besides the @ManyToOne, we also specify the column name in the database, but this time with the dedicated @JoinColumn annotation and not the regular @Column.

Reading all the employees can be done just as we have seen before, with a JPQL query.

public List<Employee> getAllEmployees() {
   return em.createQuery("SELECT e FROM Employee e", Employee.class).getResultList();
}

When you look at the server log, you can see that the JPA implementation performs 2 queries: one to retrieve the contents of the Employee table and one for the Company table. It launches an additional query for each parent that is found in the result (unless that parent is already in the JPA cache, but that is an aspect of JPA we can't cover in this introductory article). So if you are reading a child table whose records refer to 100 different parents (like companies), it executes 100 additional queries. This is known as the N+1 Select problem. It is therefore recommended to set the eclipselink.logging.level property during development so that you can see which queries JPA performs and how the transactions are handled.
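
The arithmetic behind the N+1 Select problem can be illustrated with a small simulation in plain Java. Nothing here talks to a database or to JPA: the QueryCounter class, its method name and the input list are hypothetical and only count how many SELECT statements a per-parent lookup would issue, with a set standing in for the cache of parents that are already loaded.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class QueryCounter {

    // Counts the statements needed when every distinct parent is fetched
    // with its own SELECT: one query for the child table plus one query
    // per parent that has not been loaded yet.
    static int selectsWithPerParentLookup(List<Long> companyIdPerEmployee) {
        int selects = 1; // the SELECT on the Employee table
        Set<Long> alreadyLoaded = new HashSet<>();
        for (Long companyId : companyIdPerEmployee) {
            if (alreadyLoaded.add(companyId)) {
                selects++; // one SELECT on the Company table for this parent
            }
        }
        return selects;
    }
}
```

For 300 employees spread over 100 companies, the method returns 101: one query for the employees and 100 for the companies.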

In our case, we can solve the N+1 Select problem in two ways. When we know that the Company information is not needed by the method that reads the Employee table, we can indicate that the parent should be loaded lazily: the query for the parent is only performed when we access the parent within the same session/transaction.

@ManyToOne(fetch = FetchType.LAZY)

Or we can indicate that the query that reads the records from the Employee table must also fetch the fields of the Company table by performing a join.

    public List<Employee> getAllEmployees() {
        return em.createQuery("SELECT e FROM Employee e JOIN FETCH e.company", Employee.class).getResultList();
    }

This join fetch then results in a SELECT statement similar to this one:

SELECT t1.EMPLOYEE_ID, t1.FAVORITE_COLOR, t1.FIRST_NAME, t1.GENDER, t1.HIRE_DATE, t1.LAST_NAME, t1.COMPANY_ID, t0.id, t0.name 
FROM Company t0, Employee t1 
WHERE ((t0.id = t1.COMPANY_ID))

This example shows that you must always carefully inspect which queries the JPA implementation sends to the database. You can define very complex mappings with many relations between many tables, but that might result in too many queries, in reading data that is not needed, and in a performance impact on the application. By keeping an eye on what the system executes, you can tune the mapping and the JPQL queries and make sure there are no performance issues related to JPA in production. And if needed, you can fall back to executing native database queries and semi-automatic mapping to Data Transfer Objects.

Query Restrictions

The last piece of functionality we discuss in this introduction to JPA is the ability to limit the returned records by defining restrictions. A restriction could be that we only retrieve employees who have been working for the company for more than three years or, as in our simple example here, that we return the employees of only one company.

public List<Employee> getEmployeesOfCompany(Long companyId) {
   TypedQuery<Employee> query = em.createQuery("SELECT e FROM Employee e WHERE e.company.id = :companyId", Employee.class);
   query.setParameter("companyId", companyId);
   return query.getResultList();
}

JPQL supports named placeholders for parameter values, like :companyId. When a query has several parameters, named placeholders are easier to identify than the positional ? placeholders used with plain JDBC.

Use Jakarta JPA to Avoid Writing Boilerplate Code for Conversion and Transaction Management

With Jakarta Persistence, you can read and write Java instances easily from and to the database. By defining a mapping with annotations, the EntityManager can use the database as storage while developers keep working with Java objects. They don't need to care about the database details, but as we saw, you must be aware that the JPA implementation doesn't always generate the most optimal queries unless you tweak the code. JPA helps you write more readable code, and developers don't need to write boilerplate code for conversion and transaction management.
