Discussion:
[PATCH] nilfs2: fix issue with potential infinite loop in nilfs_mdt_get_block()
Vyacheslav Dubeyko
2014-06-29 13:36:50 UTC
Permalink
From: Vyacheslav Dubeyko <Vyacheslav.Dubeyko-***@public.gmane.org>
Subject: [PATCH] nilfs2: fix issue with potential infinite loop in nilfs_mdt_get_block()

Potentially, there are situations when nilfs_mdt_read_block() can
return -ENOENT and nilfs_mdt_create_block() can return -EEXIST.
Such situation results in infinite loop inside nilfs_mdt_get_block()
method. This patch fixes issue with potential infinite loop in
nilfs_mdt_get_block() by means of uncomment limitation of read-create
loop retries.

Signed-off-by: Vyacheslav Dubeyko <Vyacheslav.Dubeyko-***@public.gmane.org>
CC: Vyacheslav Dubeyko <slava-***@public.gmane.org>
CC: Ryusuke Konishi <konishi.ryusuke-***@public.gmane.org>
---
fs/nilfs2/mdt.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/nilfs2/mdt.c b/fs/nilfs2/mdt.c
index c4dcd1d..d1e121f 100644
--- a/fs/nilfs2/mdt.c
+++ b/fs/nilfs2/mdt.c
@@ -254,7 +254,7 @@ int nilfs_mdt_get_block(struct inode *inode, unsigned long blkoff, int create,

ret = nilfs_mdt_create_block(inode, blkoff, out_bh, init_block);
if (unlikely(ret == -EEXIST)) {
- /* create = 0; */ /* limit read-create loop retries */
+ create = 0; /* limit read-create loop retries */
goto retry;
}
return ret;
--
1.7.9.5


--
To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Ryusuke Konishi
2014-06-29 18:33:45 UTC
Permalink
Post by Vyacheslav Dubeyko
Subject: [PATCH] nilfs2: fix issue with potential infinite loop in nilfs_mdt_get_block()
Potentially, there are situations when nilfs_mdt_read_block() can
return -ENOENT and nilfs_mdt_create_block() can return -EEXIST.
Did you actually see this situation happened ?

nilfs_mdt_read_block() returns -ENOENT only if nilfs_grab_buffer()
didn't find a buffer head on page cache and nilfs_bmap_lookup() hit
a hole block.

nilfs_mdt_create_block() returns -EEXIST only if nilfs_grab_buffer()
got a buffer head in uptodate state or nilfs_bmap_insert() failed due
to an existing block.

So, these usually don't happen at the same time. What situation
do you suppose ?

This function is so primitive to easily change even though the retry
loop looks an interim implementation of an early date. I would
understand what is going on and whole impact of the change.
Post by Vyacheslav Dubeyko
Such situation results in infinite loop inside nilfs_mdt_get_block()
method. This patch fixes issue with potential infinite loop in
nilfs_mdt_get_block() by means of uncomment limitation of read-create
loop retries.
If this patch is applied, nilfs_mdt_get_block() can return -ENOENT
even for the case of create == true. This is unexpected for caller
functions. I think we should return proper error code which fits to
the exceptional situation that causes an infinite repetition (if it
actually happens).

Regards,
Ryusuke Konishi
Post by Vyacheslav Dubeyko
---
fs/nilfs2/mdt.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/nilfs2/mdt.c b/fs/nilfs2/mdt.c
index c4dcd1d..d1e121f 100644
--- a/fs/nilfs2/mdt.c
+++ b/fs/nilfs2/mdt.c
@@ -254,7 +254,7 @@ int nilfs_mdt_get_block(struct inode *inode, unsigned long blkoff, int create,
ret = nilfs_mdt_create_block(inode, blkoff, out_bh, init_block);
if (unlikely(ret == -EEXIST)) {
- /* create = 0; */ /* limit read-create loop retries */
+ create = 0; /* limit read-create loop retries */
goto retry;
}
return ret;
--
1.7.9.5
--
To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in
More majordomo info at http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Vyacheslav Dubeyko
2014-06-30 07:14:20 UTC
Permalink
Post by Ryusuke Konishi
Post by Vyacheslav Dubeyko
Subject: [PATCH] nilfs2: fix issue with potential infinite loop in nilfs_mdt_get_block()
Potentially, there are situations when nilfs_mdt_read_block() can
return -ENOENT and nilfs_mdt_create_block() can return -EEXIST.
Did you actually see this situation happened ?
Yes, I've caught this issue during testing extended attributes support.
I think that initial reason of the issue is bug in my code. But, anyway,
it is possible to encounter such issue in the future because of some
modification of the code.
Post by Ryusuke Konishi
nilfs_mdt_read_block() returns -ENOENT only if nilfs_grab_buffer()
didn't find a buffer head on page cache and nilfs_bmap_lookup() hit
a hole block.
nilfs_mdt_create_block() returns -EEXIST only if nilfs_grab_buffer()
got a buffer head in uptodate state or nilfs_bmap_insert() failed due
to an existing block.
So, these usually don't happen at the same time. What situation
do you suppose ?
In brief, I have such situation:

First step - trying to create node.

/* .... */

err = nilfs_palloc_prepare_alloc_entry(xafile, req);

/* .... */

err = nilfs_palloc_get_entry_block(xafile, req->pr_entry_nr, 1,
&req->pr_entry_bh);

/* failure in the code */

nilfs_palloc_abort_alloc_entry(xafile, req);

Second step - trying to create node again.

/* .... */

err = nilfs_palloc_prepare_alloc_entry(xafile, req);

/* .... */

err = nilfs_palloc_get_entry_block(xafile, req->pr_entry_nr, 1,
&req->pr_entry_bh);

I have infinite loop here, in nilfs_palloc_get_entry_block(). Every next
trying to create node results in infinite loop even if I restart system.
So, I suppose that, maybe, there is some issue in
nilfs_palloc_abort_alloc_entry() on the first step. Or, maybe, I am
using nilfs_palloc_prepare_alloc_entry() and
nilfs_palloc_get_entry_block() by means of improper way.
Post by Ryusuke Konishi
This function is so primitive to easily change even though the retry
loop looks an interim implementation of an early date. I would
understand what is going on and whole impact of the change.
Post by Vyacheslav Dubeyko
Such situation results in infinite loop inside nilfs_mdt_get_block()
method. This patch fixes issue with potential infinite loop in
nilfs_mdt_get_block() by means of uncomment limitation of read-create
loop retries.
If this patch is applied, nilfs_mdt_get_block() can return -ENOENT
even for the case of create == true. This is unexpected for caller
functions. I think we should return proper error code which fits to
the exceptional situation that causes an infinite repetition (if it
actually happens).
Yes, maybe, this patch is improper. But, anyway, we need to have some
limitation for repetition. And, of course, it needs to remove this
commented line. So, what your final vision of proper fix for the issue?

Thanks,
Vyacheslav Dubeyko.


--
To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Ryusuke Konishi
2014-07-01 18:07:44 UTC
Permalink
Post by Vyacheslav Dubeyko
Post by Ryusuke Konishi
nilfs_mdt_read_block() returns -ENOENT only if nilfs_grab_buffer()
didn't find a buffer head on page cache and nilfs_bmap_lookup() hit
a hole block.
nilfs_mdt_create_block() returns -EEXIST only if nilfs_grab_buffer()
got a buffer head in uptodate state or nilfs_bmap_insert() failed due
to an existing block.
So, these usually don't happen at the same time. What situation
do you suppose ?
First step - trying to create node.
/* .... */
err = nilfs_palloc_prepare_alloc_entry(xafile, req);
/* .... */
err = nilfs_palloc_get_entry_block(xafile, req->pr_entry_nr, 1,
&req->pr_entry_bh);
/* failure in the code */
nilfs_palloc_abort_alloc_entry(xafile, req);
Second step - trying to create node again.
/* .... */
err = nilfs_palloc_prepare_alloc_entry(xafile, req);
/* .... */
err = nilfs_palloc_get_entry_block(xafile, req->pr_entry_nr, 1,
&req->pr_entry_bh);
I have infinite loop here, in nilfs_palloc_get_entry_block(). Every next
trying to create node results in infinite loop even if I restart system.
So, I suppose that, maybe, there is some issue in
nilfs_palloc_abort_alloc_entry() on the first step. Or, maybe, I am
using nilfs_palloc_prepare_alloc_entry() and
nilfs_palloc_get_entry_block() by means of improper way.
Thanks for explaning the situation. At first glance, your steps
themselves look ok to me.

Let's narrow down the issue first.

The infinite loop is caused by the combination of the following two
behaviors:

1) nilfs_mdt_read_block() returned -ENOENT that nilfs_bmap_lookup()
returned.

2) nilfs_mdt_create_block() returned -EEXIST.
This function can return -EEXIST in the following two cases.

2-1) nilfs_bmap_insert() returned -EEXIST.

2-2) The buffer head gained through nilfs_grab_buffer() was
up-to-date.

Could you confirm which is happening between 2-1 and 2-2 by inserting
debug code in nilfs_mdt_create_block() ?
Post by Vyacheslav Dubeyko
Yes, maybe, this patch is improper. But, anyway, we need to have some
limitation for repetition. And, of course, it needs to remove this
commented line. So, what your final vision of proper fix for the issue?
We need to find the root cause of the issue to know proper fix.

Thanks,
Ryusuke Konishi
--
To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Loading...