Mail Archives: cygwin/2010/06/01/17:42:18

Message-ID: <80373222dd5d43b134a5ede7036e7674.squirrel@www.webmail.wingert.org>
In-Reply-To: <4C03D6C5.4050004@x-ray.at>
References: <efe8a37b2e4466daa7b6eb1aa610c3d7 DOT squirrel AT www DOT webmail DOT wingert DOT org> <20100530170747 DOT GA8605 AT ednor DOT casa DOT cgf DOT cx> <f460895a8fc53da26cb91259a4005da2 DOT squirrel AT www DOT webmail DOT wingert DOT org> <4C03D6C5 DOT 4050004 AT x-ray DOT at>
Date: Tue, 1 Jun 2010 14:42:39 -0700
Subject: Re: Cygwin Performance and stat()
From: "Christopher Wingert" <mailbox AT wingert DOT org>
To: cygwin AT cygwin DOT com

I think there are many use cases where the extra information is
unnecessary (*I assume* ACL information accounts for most of the problem).
Most applications need only the filename, the size, and the three
timestamps, so Cygwin's stat() is overkill for them.  If I could tell the
emulation layer (via an environment flag) or the actual utility
(bash/ls/make/find/du, via a command-line switch) to skip the rest, I
think I could save a lot of time waiting.
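
As a minimal sketch of the environment-flag idea above: a utility could
check a flag once at startup and then choose between a full stat() and a
cheaper lookup.  The variable name MYLS_FAST_STAT and the function
want_fast_stat() are hypothetical, made up for illustration; they are not
existing Cygwin settings or APIs.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical flag name, invented for this sketch (not a Cygwin option). */
#define FAST_STAT_ENV "MYLS_FAST_STAT"

/*
 * Returns 1 if the user asked for the cheap path (name, size and
 * timestamps only), 0 if the full POSIX-style stat() should be used.
 * Only the exact value "1" enables the fast path.
 */
int want_fast_stat( void )
{
   const char *v = getenv( FAST_STAT_ENV );
   return ( v != NULL ) && ( strcmp( v, "1" ) == 0 );
}
```

A caller would test want_fast_stat() once and skip the ACL and inode work
when it returns 1.  Whether the emulation layer could honor such a flag
safely is exactly what the rest of this thread debates.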

To highlight how bad this problem is: I have a network drive with 681
subdirectories and approximately 90k files.  The times for retrieving the
directory information compare as follows:

* DOS "dir /s" takes 17 seconds.
* Cygwin "ls -lR" takes 5950 seconds (about 1 hour 40 minutes).
* msls -lR takes 55 seconds.
* myls (see code below) takes 7 seconds.

Each test was run twice, each time after a reboot, to make sure no
caching was involved.

To be clear, Cygwin ls is 850X slower than myls (5950 seconds versus 7).
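
The ratios behind that figure can be checked with a few lines of C.  This
is only a worked-arithmetic sketch using the timings listed above; the
helper name slowdown() is made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* How many times slower `seconds` is than `baseline_seconds`. */
double slowdown( double seconds, double baseline_seconds )
{
   assert( baseline_seconds > 0.0 );
   return seconds / baseline_seconds;
}

/* Print Cygwin "ls -lR" (5950 s) relative to each of the other tools. */
void print_slowdowns( void )
{
   const double cygwin_ls = 5950.0;

   printf( "vs dir /s:   %.0fx\n", slowdown( cygwin_ls, 17.0 ) );
   printf( "vs msls -lR: %.0fx\n", slowdown( cygwin_ls, 55.0 ) );
   printf( "vs myls:     %.0fx\n", slowdown( cygwin_ls, 7.0 ) );
}
```

So Cygwin ls comes out 350 times slower than "dir /s", roughly 108 times
slower than msls, and 850 times slower than myls.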

msls can be retrieved here http://utools.com/msls.htm

myls is as follows:

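#include <stdio.h>
#include <string.h>
#include <windows.h>

/* Recursively print every file under dir together with its size. */
static void dodir( const char *dir )
{
   WIN32_FIND_DATAA findData;
   HANDLE f;
   char spec[ 1024 ];
   char fname[ 1024 ];

   printf( "DIR %s\n", dir );
   snprintf( spec, sizeof( spec ), "%s\\*.*", dir );

   f = FindFirstFileA( spec, &findData );
   if ( f == INVALID_HANDLE_VALUE )
   {
      return;   /* directory missing or not readable */
   }
   do
   {
      snprintf( fname, sizeof( fname ), "%s\\%s", dir, findData.cFileName );
      if ( findData.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY )
      {
         /* skip "." and ".." so the recursion terminates */
         if ( ( strcmp( findData.cFileName, "." ) != 0 ) &&
              ( strcmp( findData.cFileName, ".." ) != 0 ) )
         {
            dodir( fname );
         }
      }
      else
      {
         /* nFileSizeLow holds only the low 32 bits of the file size */
         printf( "%s %lu\n", fname, (unsigned long) findData.nFileSizeLow );
      }
   }
   while( FindNextFileA( f, &findData ) );
   FindClose( f );
}

int main( void )
{
   dodir( "p:\\dl" );
   return 0;
}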
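
One caveat about the listing: it prints only nFileSizeLow, so files of
4 GB or more would show a truncated size.  WIN32_FIND_DATA carries the
size as two 32-bit halves, nFileSizeHigh and nFileSizeLow.  A minimal,
portable sketch of combining them follows; the function name
full_file_size() is made up, and uint32_t stands in for the Win32 DWORD
type so the snippet compiles without windows.h.

```c
#include <stdint.h>

/*
 * Combine the high and low 32-bit halves of a file size (as found in
 * WIN32_FIND_DATA's nFileSizeHigh / nFileSizeLow) into one 64-bit value.
 */
uint64_t full_file_size( uint32_t size_high, uint32_t size_low )
{
   return ( (uint64_t) size_high << 32 ) | size_low;
}
```

In myls this would be called as
full_file_size( findData.nFileSizeHigh, findData.nFileSizeLow ), with the
result printed using a 64-bit format specifier.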




> Christopher Wingert wrote:
>> I assume POSIX compatibility.  However, I bet there are cases where one
>> can sacrifice compatibility for performance (configurable with an
>> environment flag of course).
>>
>> See
>> http://marc.info/?l=git&m=122278284210941
>> for an example.
>
> This git do_stat is only meant as a 50% implementation for relative
> paths known beforehand, and is therefore only useful to certain apps. It
> can never be useful for the cygwin1.dll layer, because cygwin has to
> provide the POSIX compat. layer, and not 50% shortcuts for apps which
> don't need the other 50%: ACLs, mounts, symlinks, inodes.
>
> A better caching stat, or a cygwin extension for relative (deeper-only)
> paths, would be possible, but a better caching stat would need more
> memory and sacrifice speed for the first stat.
> A fast relative stat is very unlikely to be #ifdef'd into some apps just
> for us. So it's more likely that the apps which need it will come up
> with their own 50% less, but 50% faster bits, as git did.
>
>>> On Sun, May 30, 2010 at 08:54:10AM -0700, Christopher Wingert wrote:
>>>> I was looking into speeding up stat() performance.  More specifically
>>>> bash, ls, test, stat performance.  I've seen the subject come up
>>>> before.  Git recently implemented a native Win32 work around.  Are
>>>> there any cygwin patches around?
>>>
>>> If there was a way to make stat() faster why wouldn't it be in the
>>> source code already?
> --
> Reini Urban
> http://phpwiki.org/  http://murbreak.at/
>
> --
> Problem reports:       http://cygwin.com/problems.html
> FAQ:                   http://cygwin.com/faq/
> Documentation:         http://cygwin.com/docs.html
> Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
>
>



