Create an account

Very important

  • To access the important data of the forums, you must be active in each forum and especially in the leaks and database leaks section, send data and after sending the data and activity, data and important content will be opened and visible for you.
  • You will only see chat messages from people who are at or below your level.
  • More than 500,000 database leaks and millions of account leaks are waiting for you, so access and view with more activity.
  • Many important data are inactive and inaccessible for you, so open them with activity. (This will be done automatically)


Thread Rating:
  • 605 Vote(s) - 3.43 Average
  • 1
  • 2
  • 3
  • 4
  • 5
What is the "N+1 selects problem" in ORM (Object-Relational Mapping)?

#11
N+1 select issue is a pain, and it makes sense to detect such cases in unit tests.
I have developed a small library for verifying the number of queries executed by a given test method or just an arbitrary block of code - [JDBC Sniffer][1]

Just add a special JUnit rule to your test class and place annotation with expected number of queries on your test methods:

@Rule
public final QueryCounter queryCounter = new QueryCounter();

@Expectation(atMost = 3)
@Test
public void testInvokingDatabase() {
// your JDBC or JPA code
}

[1]:

[To see links please register here]

Reply

#12
[Here's a good description of the problem][1]

Now that you understand the problem it can typically be avoided by doing a join fetch in your query. This basically forces the fetch of the lazy loaded object so the data is retrieved in one query instead of n+1 queries. Hope this helps.


[1]:

[To see links please register here]

Reply

#13
Check Ayende post on the topic: [Combating the Select N + 1 Problem In NHibernate][1].

Basically, when using an ORM like NHibernate or EntityFramework, if you have a one-to-many (master-detail) relationship, and want to list all the details per each master record, you have to make N + 1 query calls to the database, "N" being the number of master records: 1 query to get all the master records, and N queries, one per master record, to get all the details per master record.

More database query calls → more latency time → decreased application/database performance.

However, ORMs have options to avoid this problem, mainly using JOINs.


[1]:

[To see links please register here]

Reply

#14
N+1 SELECT problem is really hard to spot, especially in projects with large domain, to the moment when it starts degrading the performance. Even if the problem is fixed i.e. by adding eager loading, a further development may break the solution and/or introduce N+1 SELECT problem again in other places.

I've created open source library [jplusone](

[To see links please register here]

) to address those problems in JPA based Spring Boot Java applications. The library provides two major features:
1. Generates reports correlating SQL statements with executions of JPA operations which triggered them and places in source code of your application which were involved in it
<pre>
2020-10-22 18:41:43.236 DEBUG 14913 --- [ main] c.a.j.core.report.ReportGenerator :
ROOT
com.adgadev.jplusone.test.domain.bookshop.BookshopControllerTest.shouldGetBookDetailsLazily(BookshopControllerTest.java:65)
com.adgadev.jplusone.test.domain.bookshop.BookshopController.getSampleBookUsingLazyLoading(BookshopController.java:31)
com.adgadev.jplusone.test.domain.bookshop.BookshopService.getSampleBookDetailsUsingLazyLoading [PROXY]
SESSION BOUNDARY
OPERATION [IMPLICIT]
com.adgadev.jplusone.test.domain.bookshop.BookshopService.getSampleBookDetailsUsingLazyLoading(BookshopService.java:35)
com.adgadev.jplusone.test.domain.bookshop.Author.getName [PROXY]
com.adgadev.jplusone.test.domain.bookshop.Author [FETCHING ENTITY]
STATEMENT [READ]
select [...] from
author author0_
left outer join genre genre1_ on author0_.genre_id=genre1_.id
where
author0_.id=1
OPERATION [IMPLICIT]
com.adgadev.jplusone.test.domain.bookshop.BookshopService.getSampleBookDetailsUsingLazyLoading(BookshopService.java:36)
com.adgadev.jplusone.test.domain.bookshop.Author.countWrittenBooks(Author.java:53)
com.adgadev.jplusone.test.domain.bookshop.Author.books [FETCHING COLLECTION]
STATEMENT [READ]
select [...] from
book books0_
where
books0_.author_id=1
</pre>

2. Provides API which allows to write tests checking how effectively your application is using JPA (i.e. assert amount of lazy loading operations )

``` lang-java
@SpringBootTest
class LazyLoadingTest {

@Autowired
private JPlusOneAssertionContext assertionContext;

@Autowired
private SampleService sampleService;

@Test
public void shouldBusinessCheckOperationAgainstJPlusOneAssertionRule() {
JPlusOneAssertionRule rule = JPlusOneAssertionRule
.within().lastSession()
.shouldBe().noImplicitOperations().exceptAnyOf(exclusions -> exclusions
.loadingEntity(Author.class).times(atMost(2))
.loadingCollection(Author.class, "books")
);

// trigger business operation which you wish to be asserted against the rule,
// i.e. calling a service or sending request to your API controller
sampleService.executeBusinessOperation();

rule.check(assertionContext);
}
}
```


Reply

#15
Without going into tech stack implementation details, architecturally speaking there are at least two solutions to N + 1 Problem:

- Have Only 1 - big query - with Joins. This makes a lot of information be transported from the database to the application layer, especially if there are multiple child records. The typical result of a database is a set of rows, not graph of objects (there are solutions to that with different DB systems)
- Have Two(or more for more children needed to be joined) Queries - 1 for the parent and after you have them - query by IDs the children and map them. This will minimize data transfer between the DB and APP layers.

Reply

#16
**N+1 problem in Hibernate & Spring Data JPA**

N+1 problem is a performance issue in Object Relational Mapping that fires multiple select queries (N+1 to be exact, where N = number of records in table) in database for a single select query at application layer. Hibernate & Spring Data JPA provides multiple ways to catch and address this performance problem.

**What is N+1 Problem?**

To understand N+1 problem, lets consider with a scenario. Let’s say we have a collection of *User* objects mapped to *DB_USER* table in database, and each user has collection or *Role* mapped to *DB_ROLE* table using a joining table *DB_USER_ROLE*. At the ORM level a **User** has **many to many** relationship with **Role**.

Entity Model
@Entity
@Table(name = "DB_USER")
public class User {

@Id
@GeneratedValue(strategy=GenerationType.AUTO)
private Long id;
private String name;

@ManyToMany(fetch = FetchType.LAZY)
private Set<Role> roles;
//Getter and Setters
}

@Entity
@Table(name = "DB_ROLE")
public class Role {

@Id
@GeneratedValue(strategy= GenerationType.AUTO)
private Long id;

private String name;
//Getter and Setters
}


**A user can have many roles. Roles are loaded Lazily.** Now lets say we want to *fetch all users from this table and print roles for each one*. Very naive Object Relational implementation could be -
**UserRepository** with **findAllBy** method

public interface UserRepository extends CrudRepository<User, Long> {

List<User> findAllBy();
}

**The equivalent SQL queries executed by ORM will be:**

First Get **All User** (1)

Select * from DB_USER;

Then get **roles for each user** executed N times (where N is number of users)

Select * from DB_USER_ROLE where userid = <userid>;

So we need **one select for User** and **N additional selects for fetching roles for each user**, where *N is total number of users*. ***This is a classic N+1 problem in ORM***.

**How to identify it?**

Hibernate provide tracing option that enables SQL logging in the console/logs. ***using logs you can easily see if hibernate is issuing N+1 queries for a given call***.

If you see multiple entries for SQL for a given select query, then there are high chances that its due to N+1 problem.

**N+1 Resolution**

**At SQL level**, what ORM needs to achieve to avoid N+1 is to *fire a query that joins the two tables and get the combined results in single query*.

***Fetch Join SQL that retrieves everything (user and roles) in Single Query***

**OR Plain SQL**

select user0_.id, role2_.id, user0_.name, role2_.name, roles1_.user_id, roles1_.roles_id from db_user user0_ left outer join db_user_roles roles1_ on user0_.id=roles1_.user_id left outer join db_role role2_ on roles1_.roles_id=role2_.id

**Hibernate & Spring Data JPA provide mechanism to solve the N+1 ORM issue.**

**1. Spring Data JPA Approach:**

If we are using Spring Data JPA, then we have two options to achieve this - using ***EntityGraph*** or using ***select query with fetch join.***

public interface UserRepository extends CrudRepository<User, Long> {

List<User> findAllBy();

@Query("SELECT p FROM User p LEFT JOIN FETCH p.roles")
List<User> findWithoutNPlusOne();

@EntityGraph(attributePaths = {"roles"})
List<User> findAll();
}

N+1 queries are issued at database level using left join fetch, we resolve the N+1 problem using attributePaths, Spring Data JPA avoids N+1 problem

**2. Hibernate Approach:**

If its pure Hibernate, then the following solutions will work.

Using **HQL** :

from User u *join fetch* u.roles roles roles

Using **Criteria** API:

Criteria criteria = session.createCriteria(User.class);
criteria.setFetchMode("roles", FetchMode.EAGER);

All these approaches work similar and they issue a similar database query with left join fetch





Reply

#17
### A generalisation of N+1

The N+1 problem is an ORM specific name of a problem where you move loops that could be reasonably executed on a server to the client. The generic problem isn't specific to ORMs, you can have it with any remote API. [In this article, I've shown how JDBC roundtrips are very costly][1], if you're calling an API N times instead of only 1 time. The difference in the example is whether you're calling the Oracle PL/SQL procedure:

- `dbms_output.get_lines` (call it once, receive N items)
- `dbms_output.get_line` (call it N times, receive 1 item each time)

They're logically equivalent, but due to the latency between server and client, you're adding N latency waits to your loop, instead of waiting only once.

### The ORM case

In fact, the ORM-y N+1 problem isn't even ORM specific either, you can achieve it by running your own queries manually as well, e.g. when you do something like this in PL/SQL:

```sql
-- This loop is executed once
for parent in (select * from parent) loop

-- This loop is executed N times
for child in (select * from child where parent_id = parent.id) loop
...
end loop;
end loop;
```

It would be much better to implement this using a join (in this case):

```sql
for rec in (
select *
from parent p
join child c on c.parent_id = p.id
)
loop
...
end loop;
```

Now, the loop is executed only once, and the logic of the loop has been moved from the client (PL/SQL) to the server (SQL), which can even optimise it differently, e.g. by running a hash join (`O(N)`) rather than a nested loop join (`O(N log N)` with index)

### Auto-detecting N+1 problems

If you're using JDBC, [you could use jOOQ as a JDBC proxy behind the scenes to auto-detect your N+1 problems][2]. jOOQ's parser normalises your SQL queries and caches data about consecutive executions of parent and child queries. This even works if your queries aren't exactly the same, but semantically equivalent.


[1]:

[To see links please register here]

[2]:

[To see links please register here]

Reply

#18
What is the N+1 query problem
-----------------------------

The N+1 query problem happens when the data access framework executed N additional SQL statements to fetch the same data that could have been retrieved when executing the primary SQL query.

The larger the value of N, the more queries will be executed, the larger the performance impact. And, unlike the slow query log that can help you find slow running queries, the N+1 issue won’t be spot because each individual additional query runs sufficiently fast to not trigger the slow query log.

The problem is executing a large number of additional queries that, overall, take sufficient time to slow down response time.

Let’s consider we have the following post and post_comments database tables which form a one-to-many table relationship:

[![The `post` and `post_comments` tables][4]][4]

We are going to create the following 4 `post` rows:

INSERT INTO post (title, id)
VALUES ('High-Performance Java Persistence - Part 1', 1)

INSERT INTO post (title, id)
VALUES ('High-Performance Java Persistence - Part 2', 2)

INSERT INTO post (title, id)
VALUES ('High-Performance Java Persistence - Part 3', 3)

INSERT INTO post (title, id)
VALUES ('High-Performance Java Persistence - Part 4', 4)

And, we will also create 4 `post_comment` child records:

INSERT INTO post_comment (post_id, review, id)
VALUES (1, 'Excellent book to understand Java Persistence', 1)

INSERT INTO post_comment (post_id, review, id)
VALUES (2, 'Must-read for Java developers', 2)

INSERT INTO post_comment (post_id, review, id)
VALUES (3, 'Five Stars', 3)

INSERT INTO post_comment (post_id, review, id)
VALUES (4, 'A great reference book', 4)

N+1 query problem with plain SQL
--------------------------------

If you select the `post_comments` using this SQL query:

List<Tuple> comments = entityManager.createNativeQuery("""
SELECT
pc.id AS id,
pc.review AS review,
pc.post_id AS postId
FROM post_comment pc
""", Tuple.class)
.getResultList();

And, later, you decide to fetch the associated `post` `title` for each `post_comment`:

for (Tuple comment : comments) {
String review = (String) comment.get("review");
Long postId = ((Number) comment.get("postId")).longValue();

String postTitle = (String) entityManager.createNativeQuery("""
SELECT
p.title
FROM post p
WHERE p.id = :postId
""")
.setParameter("postId", postId)
.getSingleResult();

LOGGER.info(
"The Post '{}' got this review '{}'",
postTitle,
review
);
}

You are going to trigger the N+1 query issue because, instead of one SQL query, you executed 5 (1 + 4):

SELECT
pc.id AS id,
pc.review AS review,
pc.post_id AS postId
FROM post_comment pc

SELECT p.title FROM post p WHERE p.id = 1
-- The Post 'High-Performance Java Persistence - Part 1' got this review
-- 'Excellent book to understand Java Persistence'

SELECT p.title FROM post p WHERE p.id = 2
-- The Post 'High-Performance Java Persistence - Part 2' got this review
-- 'Must-read for Java developers'

SELECT p.title FROM post p WHERE p.id = 3
-- The Post 'High-Performance Java Persistence - Part 3' got this review
-- 'Five Stars'

SELECT p.title FROM post p WHERE p.id = 4
-- The Post 'High-Performance Java Persistence - Part 4' got this review
-- 'A great reference book'

Fixing the N+1 query issue is very easy. All you need to do is extract all the data you need in the original SQL query, like this:

List<Tuple> comments = entityManager.createNativeQuery("""
SELECT
pc.id AS id,
pc.review AS review,
p.title AS postTitle
FROM post_comment pc
JOIN post p ON pc.post_id = p.id
""", Tuple.class)
.getResultList();

for (Tuple comment : comments) {
String review = (String) comment.get("review");
String postTitle = (String) comment.get("postTitle");

LOGGER.info(
"The Post '{}' got this review '{}'",
postTitle,
review
);
}

This time, only one SQL query is executed to fetch all the data we are further interested in using.

N+1 query problem with JPA and Hibernate
----------------------------------------

When using JPA and Hibernate, there are several ways you can trigger the N+1 query issue, so it’s very important to know how you can avoid these situations.

For the next examples, consider we are mapping the `post` and `post_comments` tables to the following entities:

[![`Post` and `PostComment` entities][5]][5]

The JPA mappings look like this:

@Entity(name = "Post")
@Table(name = "post")
public class Post {

@Id
private Long id;

private String title;

//Getters and setters omitted for brevity
}

@Entity(name = "PostComment")
@Table(name = "post_comment")
public class PostComment {

@Id
private Long id;

@ManyToOne
private Post post;

private String review;

//Getters and setters omitted for brevity
}

## `FetchType.EAGER`

Using `FetchType.EAGER` either implicitly or explicitly for your JPA associations is a bad idea because you are going to fetch way more data that you need. More, the `FetchType.EAGER` strategy is also prone to N+1 query issues.

Unfortunately, the `@ManyToOne` and `@OneToOne` associations use `FetchType.EAGER` by default, so if your mappings look like this:

@ManyToOne
private Post post;

You are using the `FetchType.EAGER` strategy, and, every time you forget to use `JOIN FETCH` when loading some `PostComment` entities with a JPQL or Criteria API query:

List<PostComment> comments = entityManager
.createQuery("""
select pc
from PostComment pc
""", PostComment.class)
.getResultList();

You are going to trigger the N+1 query issue:

SELECT
pc.id AS id1_1_,
pc.post_id AS post_id3_1_,
pc.review AS review2_1_
FROM
post_comment pc

SELECT p.id AS id1_0_0_, p.title AS title2_0_0_ FROM post p WHERE p.id = 1
SELECT p.id AS id1_0_0_, p.title AS title2_0_0_ FROM post p WHERE p.id = 2
SELECT p.id AS id1_0_0_, p.title AS title2_0_0_ FROM post p WHERE p.id = 3
SELECT p.id AS id1_0_0_, p.title AS title2_0_0_ FROM post p WHERE p.id = 4

Notice the additional SELECT statements that are executed because the `post` association has to be fetched prior to returning the `List` of `PostComment` entities.

Unlike the default fetch plan, which you are using when calling the `find` method of the `EntityManager`, a JPQL or Criteria API query defines an explicit plan that Hibernate cannot change by injecting a JOIN FETCH automatically. So, you need to do it manually.

If you didn't need the `post` association at all, you are out of luck when using `FetchType.EAGER` because there is no way to avoid fetching it. That's why it's better to use `FetchType.LAZY` by default.

But, if you wanted to use `post` association, then you can use `JOIN FETCH` to avoid the N+1 query problem:

List<PostComment> comments = entityManager.createQuery("""
select pc
from PostComment pc
join fetch pc.post p
""", PostComment.class)
.getResultList();

for(PostComment comment : comments) {
LOGGER.info(
"The Post '{}' got this review '{}'",
comment.getPost().getTitle(),
comment.getReview()
);
}

This time, Hibernate will execute a single SQL statement:

SELECT
pc.id as id1_1_0_,
pc.post_id as post_id3_1_0_,
pc.review as review2_1_0_,
p.id as id1_0_1_,
p.title as title2_0_1_
FROM
post_comment pc
INNER JOIN
post p ON pc.post_id = p.id

-- The Post 'High-Performance Java Persistence - Part 1' got this review
-- 'Excellent book to understand Java Persistence'

-- The Post 'High-Performance Java Persistence - Part 2' got this review
-- 'Must-read for Java developers'

-- The Post 'High-Performance Java Persistence - Part 3' got this review
-- 'Five Stars'

-- The Post 'High-Performance Java Persistence - Part 4' got this review
-- 'A great reference book'

## `FetchType.LAZY`

Even if you switch to using `FetchType.LAZY` explicitly for all associations, you can still bump into the N+1 issue.

This time, the `post` association is mapped like this:

@ManyToOne(fetch = FetchType.LAZY)
private Post post;

Now, when you fetch the `PostComment` entities:

List<PostComment> comments = entityManager
.createQuery("""
select pc
from PostComment pc
""", PostComment.class)
.getResultList();

Hibernate will execute a single SQL statement:

SELECT
pc.id AS id1_1_,
pc.post_id AS post_id3_1_,
pc.review AS review2_1_
FROM
post_comment pc

But, if afterward, you are going to reference the lazy-loaded `post` association:

for(PostComment comment : comments) {
LOGGER.info(
"The Post '{}' got this review '{}'",
comment.getPost().getTitle(),
comment.getReview()
);
}

You will get the N+1 query issue:

SELECT p.id AS id1_0_0_, p.title AS title2_0_0_ FROM post p WHERE p.id = 1
-- The Post 'High-Performance Java Persistence - Part 1' got this review
-- 'Excellent book to understand Java Persistence'

SELECT p.id AS id1_0_0_, p.title AS title2_0_0_ FROM post p WHERE p.id = 2
-- The Post 'High-Performance Java Persistence - Part 2' got this review
-- 'Must-read for Java developers'

SELECT p.id AS id1_0_0_, p.title AS title2_0_0_ FROM post p WHERE p.id = 3
-- The Post 'High-Performance Java Persistence - Part 3' got this review
-- 'Five Stars'

SELECT p.id AS id1_0_0_, p.title AS title2_0_0_ FROM post p WHERE p.id = 4
-- The Post 'High-Performance Java Persistence - Part 4' got this review
-- 'A great reference book'

Because the `post` association is fetched lazily, a secondary SQL statement will be executed when accessing the lazy association in order to build the log message.

Again, the fix consists in adding a `JOIN FETCH` clause to the JPQL query:

List<PostComment> comments = entityManager.createQuery("""
select pc
from PostComment pc
join fetch pc.post p
""", PostComment.class)
.getResultList();

for(PostComment comment : comments) {
LOGGER.info(
"The Post '{}' got this review '{}'",
comment.getPost().getTitle(),
comment.getReview()
);
}

And, just like in the `FetchType.EAGER` example, this JPQL query will generate a single SQL statement.

> Even if you are using `FetchType.LAZY` and don't reference the child association of a bidirectional `@OneToOne` JPA relationship, you can still trigger the N+1 query issue.

## How to automatically detect the N+1 query issue

If you want to automatically detect N+1 query issue in your data access layer, you can use the [`db-util`][7] open-source project.

First, you need to add the following Maven dependency:

<dependency>
<groupId>com.vladmihalcea</groupId>
<artifactId>db-util</artifactId>
<version>${db-util.version}</version>
</dependency>

Afterward, you just have to use `SQLStatementCountValidator` utility to assert the underlying SQL statements that get generated:

SQLStatementCountValidator.reset();

List<PostComment> comments = entityManager.createQuery("""
select pc
from PostComment pc
""", PostComment.class)
.getResultList();

SQLStatementCountValidator.assertSelectCount(1);

In case you are using `FetchType.EAGER` and run the above test case, you will get the following test case failure:

SELECT
pc.id as id1_1_,
pc.post_id as post_id3_1_,
pc.review as review2_1_
FROM
post_comment pc

SELECT p.id as id1_0_0_, p.title as title2_0_0_ FROM post p WHERE p.id = 1

SELECT p.id as id1_0_0_, p.title as title2_0_0_ FROM post p WHERE p.id = 2


-- SQLStatementCountMismatchException: Expected 1 statement(s) but recorded 3 instead!

[4]:

[5]:

[7]:

[To see links please register here]

Reply

#19
Let's say you have a collection of `Car` objects (database rows), and each `Car` has a collection of `Wheel` objects (also rows). In other words, `Car` → `Wheel` is a 1-to-many relationship.

Now, let's say you need to iterate through all the cars, and for each one, print out a list of the wheels. The naive O/R implementation would do the following:

SELECT * FROM Cars;

And then **for each `Car`:**

SELECT * FROM Wheel WHERE CarId = ?

In other words, you have one select for the Cars, and then N additional selects, where N is the total number of cars.

Alternatively, one could get all wheels and perform the lookups in memory:

SELECT * FROM Wheel;

This reduces the number of round-trips to the database from N+1 to 2.
Most ORM tools give you several ways to prevent N+1 selects.

Reference: *[Java Persistence with Hibernate][1]*, chapter 13.

[1]:

[To see links please register here]

Reply

#20
A good explanation of the problem can be found in the [Phabricator documentation][1]

**TL;DR**

> It is much faster to issue 1 query which returns 100 results than to
> issue 100 queries which each return 1 result.
>
> Load all your data before iterating through it.


**More in-detail**

> The N+1 query problem is a common performance antipattern. It looks like this:
>
> $cats = load_cats();
> foreach ($cats as $cat) {
> $cats_hats => load_hats_for_cat($cat);
> // ...
> }
>
> Assuming `load_cats()` has an implementation that boils down to:
>
> SELECT * FROM cat WHERE ...
>
> ..and `load_hats_for_cat($cat)` has an implementation something like this:
>
> SELECT * FROM hat WHERE catID = ...
>
> ..you will issue "N+1" queries when the code executes, where N is the number of cats:
>
> SELECT * FROM cat WHERE ...
> SELECT * FROM hat WHERE catID = 1
> SELECT * FROM hat WHERE catID = 2
> SELECT * FROM hat WHERE catID = 3
> SELECT * FROM hat WHERE catID = 4
> ...


[1]:

[To see links please register here]

Reply



Forum Jump:


Users browsing this thread:
1 Guest(s)

©0Day  2016 - 2023 | All Rights Reserved.  Made with    for the community. Connected through