hg status slow
Eric Bloodworth
2005-09-17 03:59:19 UTC
I have a tree where there is a directory which is not checked into
Mercurial. This directory has tons of files, and causes hg status to
take a long time to return. Adding this directory to .hgignore has no
effect on the timing, but neither does using -X. I think I understand
why .hgignore has no effect on file traversal and probably can't be
easily made to, but I don't understand why -X doesn't cause traversal of
the given directory to be avoided. Are there any plans to change the
current behavior?

-- Eric
Bryan O'Sullivan
2005-09-17 05:05:36 UTC
Post by Eric Bloodworth
Adding this directory to .hgignore has no
effect on the timing, but neither does using -X.
I'm not surprised by the former, but the latter does surprise me.

Have you used strace or something similar to see which files are actually
having stat called on them?

In my tests, the .hgignore file has no effect on the number of stat
calls, while -X nukes all the ones I'd expect.

<b
Bryan O'Sullivan
2005-09-17 07:41:23 UTC
Post by Bryan O'Sullivan
I'm not surprised by the former, but the latter does surprise me.
Problem fixed; see my other mail.

<b
Eric Bloodworth
2005-09-17 17:55:23 UTC
Post by Bryan O'Sullivan
Post by Bryan O'Sullivan
I'm not surprised by the former, but the latter does surprise me.
Problem fixed; see my other mail.
<b
Well, it is *much* faster. However, it changes the behavior in a bad
way, IMO. I have some directories that are matched by .hgignore, but I
also have some files in those directories which are tracked by
mercurial. To be more concrete, I'm tracking my home directory in
Mercurial: I ignore all files starting with ".", but I also track a few
files in .vim, and some other files starting with ".". This way status
doesn't spew out a load of ignored configuration files. Here
is .hgignore:

^\.
^(dev|tmp|apps|data|extern)
~$
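
To check which paths these three patterns catch, here is a minimal Python sketch. It assumes the patterns are treated as unanchored Python regular expressions searched against the repo-relative path. The helper name `is_ignored` and the sample paths other than the `.vim` one are made up for illustration.

```python
import re

# The three patterns from the .hgignore above, joined into one regex.
PATTERNS = [r"^\.", r"^(dev|tmp|apps|data|extern)", r"~$"]
IGNORE_RE = re.compile("|".join("(?:%s)" % p for p in PATTERNS))

def is_ignored(path):
    """Hypothetical helper: True if any pattern matches somewhere in path."""
    return IGNORE_RE.search(path) is not None

for p in [".vim/_vimrc_common", "dev/scratch.c", "notes.txt~", "src/main.c"]:
    print(p, is_ignored(p))
```

The tracked files under .vim match the first pattern, so the whole .vim directory counts as ignored even though a few files inside it are tracked.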


Using your patched hg (actually the tip of your repo), directories which
match in .hgignore cause files that I tracked in those directories to be
considered removed:

R .vim/_vimrc_common
R .vim/plugin/bufexplorer.vim

NOTE: In 0.6c, adding files matched by .hgignore was impossible without
editing .hgignore, adding the file, and then changing it back. In 0.7,
this improved, so you can explicitly add files that are otherwise
ignored via .hgignore. I think this is much less confusing, and an
improvement.

Now, if I hg add a new file in .vim, each such added file is considered
removed.

$ touch .vim/test_file
$ hg add .vim/test_file
$ hg status
R .vim/_vimrc_common
R .vim/plugin/bufexplorer.vim
R .vim/test_file



-- Eric
Bryan O'Sullivan
2005-09-18 05:56:22 UTC
Post by Eric Bloodworth
Well, it is *much* faster. However, it changes the behavior in a bad
way, IMO. I have some directories that are matched by .hgignore, but I
also have some files in those directories which are tracked by
mercurial.
Hmm. What happens if you don't use this patched version, but use -X to
exclude that directory?

I don't think this will be difficult to fix. Based on your description,
I think what's happening is that the dirstate walking code is filtering
the list of real directories it is walking, but not the manifest it is
comparing against. The result is that a file shows up in the manifest,
but not in the (filtered) real filesystem, so it looks like it's been
deleted.
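
The suspected failure mode can be reproduced with plain lists and sets. This only illustrates the description above and is not Mercurial's code. `find_removed` and the sample names `notes.txt` and `.bashrc` are hypothetical, and "ignored" is simplified to "starts with a dot".

```python
def find_removed(manifest, files_seen_on_disk):
    """Files Mercurial knows about that the walk did not see."""
    return sorted(set(manifest) - set(files_seen_on_disk))

manifest = [".vim/_vimrc_common", ".vim/plugin/bufexplorer.vim", "notes.txt"]
on_disk = list(manifest) + [".bashrc"]  # everything tracked still exists

# The walk skips ignored paths (here: anything starting with "."),
# but the manifest is compared unfiltered.
walked = [f for f in on_disk if not f.startswith(".")]

print(find_removed(manifest, walked))
# prints ['.vim/_vimrc_common', '.vim/plugin/bufexplorer.vim']
```

Both tracked files still exist on disk, yet they come out as removed because only one side of the comparison was filtered.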

I'll look at it tomorrow. Thanks for trying it out.

<b
Eric Bloodworth
2005-09-18 18:40:46 UTC
Post by Bryan O'Sullivan
Post by Eric Bloodworth
Well, it is *much* faster. However, it changes the behavior in a bad
way, IMO. I have some directories that are matched by .hgignore, but I
also have some files in those directories which are tracked by
mercurial.
Hmm. What happens if you don't use this patched version, but use -X to
exclude that directory?
In both versions, the tracked files in the .hgignore'd directories are
not listed.

FWIW, in either version, if I add .vim/test_file, and exclude .vim, the
added file doesn't get listed.

-- Eric
Bryan O'Sullivan
2005-09-18 21:00:24 UTC
Post by Bryan O'Sullivan
I don't think this will be difficult to fix.
It won't be as trivial as I thought, either.

The walk code copies the manifest (files that Mercurial already knows it
manages), then goes through the filesystem, removing from the copy each
file it finds. What's left in the copy after this is the set of files
that are known to Mercurial, but don't exist in the filesystem.
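
As a rough sketch of that copy-and-subtract idea (not the real walk code), the following uses `os.walk` on a throwaway directory. The function name `missing_files` and the files `a.txt` and `b.txt` are assumptions made for the example.

```python
import os
import tempfile

def missing_files(root, manifest):
    """Copy the manifest, walk the tree, drop every file found; return the rest."""
    remaining = set(manifest)  # work on a copy, leave the caller's list alone
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            remaining.discard(rel.replace(os.sep, "/"))
    return sorted(remaining)

with tempfile.TemporaryDirectory() as root:
    # Only a.txt exists on disk; b.txt is tracked but missing.
    open(os.path.join(root, "a.txt"), "w").close()
    result = missing_files(root, ["a.txt", "b.txt"])
    print(result)  # prints ['b.txt']
```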

Now that we don't walk directories that show up in the .hgignore file,
the logic needs to become a little more complicated. Files that are in
the manifest but matched by .hgignore still need to be walked, unless an
earlier attempt to walk them failed or they've been eliminated from
consideration by -I or -X or such.

Subtle stuff.

<b
Matt Mackall
2005-09-18 22:03:56 UTC
Post by Bryan O'Sullivan
Post by Bryan O'Sullivan
I don't think this will be difficult to fix.
It won't be as trivial as I thought, either.
I don't like the idea of walking ignored subdirectories at all. There
are really important performance reasons not to do it that can't be
worked around. Consider the case where the number of ignored files is
_much_ larger than the number of checked in files. Simply running find
| wc in my home directory takes minutes.
--
Mathematics is the supreme nostalgia of our time.
Bryan O'Sullivan
2005-09-18 22:06:37 UTC
Post by Matt Mackall
I don't like the idea of walking ignored subdirectories at all.
Nor do I. But I found a nice tidy fix that only walks the files that
are in the dirstate map in these directories, so the world is a happier
place once more. The only code that needed changing was
localrepository.changes.
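
One way to picture that approach, purely as a sketch and not the actual change to localrepository.changes: prune ignored directories from the walk, then check the tracked files inside them one at a time. `removed_files`, the single `IGNORE` pattern, and the file names `untracked_cache`, `notes.txt` and `gone.txt` are assumptions for illustration, and -I/-X handling is left out.

```python
import os
import re
import tempfile

IGNORE = re.compile(r"^\.")  # first pattern from the .hgignore in this thread

def removed_files(root, tracked):
    """Walk non-ignored directories, then check ignored-but-tracked files one by one."""
    remaining = set(tracked)
    for dirpath, dirnames, filenames in os.walk(root):
        rel_dir = os.path.relpath(dirpath, root)
        rel_dir = "" if rel_dir == "." else rel_dir.replace(os.sep, "/") + "/"
        # Prune ignored subdirectories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames if not IGNORE.search(rel_dir + d)]
        for name in filenames:
            remaining.discard(rel_dir + name)
    # Tracked files hidden by the ignore rules were not visited above;
    # test each one directly instead of walking its whole directory.
    for path in list(remaining):
        if IGNORE.search(path) and os.path.lexists(os.path.join(root, path)):
            remaining.discard(path)
    return sorted(remaining)

with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, ".vim"))
    for name in (".vim/_vimrc_common", ".vim/untracked_cache", "notes.txt"):
        open(os.path.join(root, name), "w").close()
    result = removed_files(root, [".vim/_vimrc_common", "notes.txt", "gone.txt"])
    print(result)  # prints ['gone.txt']
```

The cost of the second loop grows with the number of tracked files in ignored directories, not with the number of ignored files, which is the property Matt asks for above.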

http://hg.serpentine.com/mercurial/bos

<b
--
Bryan O'Sullivan <***@serpentine.com>