JDBC Performance Tips




Performance tips for the IBM Developer Kit for Java JDBC driver

Why Optimize?

• On average, a web request performs 4 database queries.
• Experience has shown that database calls are typical performance bottleneck.
• Bad JDBC can overwhelm the database.


Avoid SELECT * SQL queries
SELECT * FROM... is a common way to state a query in SQL. Often, however, you do not need to query all the fields. For each column that is to be returned, the JDBC driver must do the additional work of binding and returning the row. Even if your application never uses a particular column, the JDBC driver has to be made aware of it and has to reserve space for its use. If your tables have few columns that are not used, this is not significant overhead. For a large number of unused columns, however, the overhead can be significant. A better solution is to list the columns that your application is interested in individually, like this:
     SELECT COL1, COL2, COL3 FROM...

Use getXXX(int) instead of getXXX(String)
Use the ResultSet getXXX methods that take numeric values instead of the versions that take column names. While the freedom to use your column names instead of numeric constants seems like an advantage, the database itself only knows how to deal with column indexes. Therefore, each getXXX method with a column name you call must be resolved by the JDBC driver before it can be passed to the database. Because getXXX methods are typically called inside loops that could be run millions of times, this little bit of overhead can rapidly accumulate.
Avoid getObject calls for Java primitive types
When getting values from the database of primitive types (ints, longs, floats, and so on), it is faster to use the get method specific to the primitive type (getInt, getLong, getFloat) than to use getObject. The getObject call does the work of the get for the primitive type, and then creates an object to return to you. This is typically done in loops, potentially creating millions of objects with short lifespans. Using getObject for primitive commands has the added drawback of frequently activating the garbage collector, further degrading performance.
Use PreparedStatement over Statement
If you are writing an SQL statement that is used more than once, it performs better as a PreparedStatement than as a Statement object. Every time you run a statement, you go through a two step process: the statement is prepared, and then the statement is processed. When you use a prepared statement, the statement is prepared only at the time that it is constructed, not each time it is run. Though it is recognized that a PreparedStatement performs faster than a Statement, this advantage is often neglected by programmers. Due to the performance boost that PreparedStatements provide, it is wise to use them in the design of your applications wherever possible.
Avoid DatabaseMetaData calls
Be aware that some of the DatabaseMetaData calls can be expensive. In particular, the getBestRowIdentifier, getCrossReference, getExportedKeys, and getImportedKeys methods can be costly. Some DataBaseMetaData calls involve complex join conditions over system-level tables. Use them only if you need their information, not just for convenience.
Use the correct commit level for your application
JDBC provides several commit levels which determine how multiple transactions against the system affect each other. The default is to use the lowest commit level. This means that transactions can see some of each other's work through commit boundaries. This introduces the possibility of certain database anomalies. Some programmers increase the commit level so that they do not have to worry about these anomalies occurring. Be aware that higher commit levels involve the database hanging onto more course-grained locks. This limits the amount of concurrency that the system can have, severely slowing the performance of some applications. Often, the anomaly conditions cannot occur because of the design of the application in the first place. Take time to understand what you are trying to accomplish and limit your transaction isolation level to the lowest level you can safely use.
Consider storing data in Unicode
Java requires all character data that it works with (Strings) to be in Unicode. Therefore, any table that does not have Unicode data requires the JDBC driver to translate the data back and forth as it is put into the database and retrieved out of the database. If the table is already in Unicode, the JDBC driver does not need to translate the data and can therefore place the data from the database faster. Take care to understand that data in Unicode may not work with non-Java applications, which do not know how to deal with Unicode. Also keep in mind that non-character data does not perform any faster, as there is never a translation of this data. Another consideration is that data stored in Unicode takes up twice as much space as single byte data does. If you have many character columns that are read many times, however, the performance gained by storing your data in Unicode can be significant.
Use stored procedures
The use of stored procedures is supported in Java. Stored procedures can perform faster by allowing the JDBC driver to run static SQL instead of dynamic SQL. Do not create stored procedures for each individual SQL statement you run in your program. Where possible, however, create a stored procedure that runs a group of SQL statements.
Use BigInt instead of Numeric or Decimal
Instead of using Numeric or Decimal fields that have a scale of 0, use the BigInt data type. BigInt translates directly into the Java primitive type Long whereas Numeric or Decimal data types translate into String or BigDecimal objects. As noted in Avoid getObject calls for Java primitive types, using primitive data types is preferable to using types that require object creation.
Explicitly close your JDBC resources when done with them
ResultSets, Statements, and Connections should be explicitly closed by the application when they are no longer needed. This allows the resources to be cleaned up in the most efficient way possible and can increase performance. Further, database resources that are not explicitly closed can cause resource leaks and database locks to be held longer than necessary. This can lead to application failures or reduced concurrency in applications.
Use connection pooling
Connection pooling is a strategy by which JDBC Connection objects get reused for multiple users instead of each user request creating its own Connection object. Connection objects are expensive to create. Instead of having each user create a new one, a pool of them should be shared in applications where performance is critical. Many products (such as WebSphere) provide Connection pooling support that can be used with little additional effort on the user's part. If you do not want to use a product with connection pooling support, or prefer to build your own for better control over how the pool works and performs, it is reasonably easy to do so.
Consider using PreparedStatement pooling
Statement pooling works similarly to Connection pooling. Instead of just putting Connections into a pool, put an object that contains the Connection and the PreparedStatements a pool. Then, retrieve that object and access the specific statement you want to use. This can dramatically increase performance.
Use efficient SQL
Because JDBC is built on top of SQL, just about anything that makes for efficient SQL also makes for efficient JDBC. Hence, JDBC benefits from optimized queries, wisely chosen indices, and other aspects of good SQL design.

Comments