Date: Fri, 14 Aug 2009 11:41:49 -0400
From: Mike Schmidt
To: "Pierre A. Humblet"
CC: cygwin AT cygwin DOT com
Subject: Re: cron cannot change user

Pierre A. Humblet wrote:
> ----- Original Message -----
> From: "Mike Schmidt"
> To: "Pierre A. Humblet"
> Cc:
> Sent: Friday, August 14, 2009 2:18 AM
> |
> | Pierre A. Humblet wrote:
> | > ----- Original Message -----
> | > From: "Mike Schmidt"
> | > To: cygwin
> | > Sent: Thursday, August 13, 2009 2:09 PM
> | >
> | >
> | > | On all the systems where cron works I did NOT run cron-config. I used
> | > | the following line to install cron:
> | > | cygrunsrv --install cron --path=/usr/sbin/cron --desc='Cygwin cron
> | > | service' --type=auto --neverexits -a '-n'
> | > |
> | > | It has always worked for me.
> | > |
> | > | Nothing cron-related in /tmp
> | > |
> | > | When it stopped working on 1 system, I ran cron-config on that system.
> | > | Here is the result:
> | > |
> | > | Running cron_diagnose ...
> | > | It appears that you do not have an entry for:
> | > | NT AUTHORITY\SYSTEM
> | > | in /etc/passwd.
> | >
> | > OK, that happens because your environment has
> | > USERDOMAIN=NT AUTHORITY
> | > USERNAME=SYSTEM
> | > and that confuses cron-config (it's trying to check you have a passwd entry)
> | > I suspect that you are logged in under ssh or some such.
> | > It's a Cygwin issue with a long history, but it's not something to worry about for cron.
> | > I will fix that test.
> | >
> | >
> | True enough. I am running remotely under ssh at this point. My company
> | has a number of digital signage systems in the field running Windows
> | (usually it's under Linux, but sometimes we have no choice) and I need to
> | install cygwin along with ssh and cron and other stuff to fit my support
> | needs. So I try to do this invisibly, since otherwise I am operating in
> | full view of whatever audience there may be.
> |
> | ================================================================
> | > | attached is the cronbug.txt
> | > |
> | > | Note that if I run cron-config on any of the systems that work perfectly
> | > | fine, I get the same warnings about the account and the environment, but
> | > | when I restore the service with the cygrunsrv command above, it works
> | > | fine. Only this 1 system refuses. They use the same userids and
> | > | configuration as far as I can tell.
> | >
> | >
> | > What do you mean by "refuses"?
> | >
> | >
> | I mean that all the cron jobs on the system that refuses are not
> | executed (or their results are not kept)
> | >
> | > As far as I can see from the log, cron does change to the "impact" user and the command runs.
> | > What led you to the "Subject:" of your e-mail?
> | > What's happening to all the echo `date` > /tmp/date you have been running recently ?
> | > It's kind of weird they don't run every minute.
> | >
> | >
> | The commands don't run as far as I can tell. All of these commands log
> | into logfiles in /home/impact, and this works fine if I run cron
> | manually (not as a service) or if I run the commands manually under the
> | impact account. /tmp/date is properly written when cron is not a service. It was
> | a debug tool I put in there just to make sure that I had something that
> | didn't depend on the account at all. As for the subject of my email, I
> | suppose that is more a guess than anything else.
> | > I also see some MAIL (mailed 56 bytes of output but got status 0x0001)
> | > That would happen if /tmp is not writable. Any reason that would be the case ?
> | > Just to be sure, please edit /bin/cronlog and change all "exit 1" to "exit 123".
> | >
> | >
> | ls -l /:
> | drwxrwx---+ 2 SYSTEM root 0 Aug 8 03:59 bin/
> | dr-xr-xr-x 1 0 root 0 Dec 31 1969 cygdrive/
> | drwxrwx---+ 2 SYSTEM root 0 Jun 16 20:20 dev/
> | drwxr-xr-x+ 10 SYSTEM root 0 Aug 2 22:03 etc/
> | drwxrwxrwx+ 5 impact None 0 Aug 8 01:20 home/
> | drwxrwx---+ 11 SYSTEM root 0 Aug 8 01:10 lib/
> | dr-xr-xr-x 1 impact None 0 Nov 30 2006 proc/
> | drwxrwx---+ 3 SYSTEM root 0 Jun 16 20:20 srv/
> | drwxrwxrwt+ 2 SYSTEM root 0 Aug 13 21:19 tmp/
> | drwxrwx---+ 12 SYSTEM root 0 Jun 16 20:19 usr/
> | drwxr-xr-x+ 9 SYSTEM root 0 Jun 16 20:31 var/
> |
> | ls -l /tmp
> | total 1.0K
> | -rw-r--r-- 1 impact None 29 Aug 13 04:00 date
> |
> | The date file comes from the last time I ran cron manually. When the
> | system rebooted at 4am (daily reboot) cron came back as a service, and
> | /tmp/date stopped.
> | None of the log files written by the other commands are modified, and
> | one of the commands (./agent.exe register) does a heartbeat via a wget
> | to another system, which is definitely not happening.
> |
> | After changing the exit 1 to exit 123 in /bin/cronlog, I see no changes.
> | What is that supposed to do? I also then commented out the rm commands
> | that delete the temp files. Even after the cronevents showed mail
> | messages there are none of the expected files in /tmp.
> |
> | I noticed on the neighbor system (that works perfectly) that even there
> | the events listed in cronevents do not quite correspond to what cron
> | actually does. For example, the 'check_jezam' script actually runs
> | every minute, based on its own log. But cronevents only records every
> | second or third call.
> |
> | Back to the broken system: here is the log after I changed cronlog with
> | exit 123 and commented out the rm commands:
> |
>
> | 2009/08/13 21:55:02 [SYSTEM] /usr/sbin/cron: PID 2068: (impact) MAIL
> | (mailed 56 bytes of output but got status 0x0001
> | )
> |
> | Note that according to the event log, neither of the scripts that run
> | every 5 or 6 minutes are run. The check_jezam script which is supposed
> | to run every minute runs less often than the date echo script, even
> | though both are scheduled the same.
> |
> | In any case, /tmp/date is not written, nor are any of the tmp mail files.
> | Nothing is written into any of the logs for agent.exe or check_jezam.
> | And no heartbeat is signalled. I'm convinced these jobs are not run,
> | even though the cron event log seems to say they are.
> |
> | So what can I do to debug this problem? This is driving me nuts!
> |
> | Thanks a lot for looking at this with me.
> |
>
> Mike,
>
> I am completely baffled as well. There are a number of issues.
> 1) Not all runs are listed by cronevents.
> No idea if cron doesn't run (why) or if it's a logging failure.
> cronevents is just a utility that dumps selected Windows event log entries.
> Could be a bug in cronevents (I doubt it).
> Can you access the windows log by using the Windows event viewer (system32/eventvwr.exe) ?
> 2) Cron starts the command but nothing happens.
> No idea
> 3) Some MAIL output fails. Changing the exit status in cronlog was to ascertain if it was due to an access problem. Now we know it isn't.
>
> What do you mean by "run cron manually". Do you just type /usr/sbin/cron ? As what user ?
> Apparently cron used to work. Can you think of what has changed when it stopped working?
>
> Here are two things you could try
> 1) Run the cron daemon as user "impact" (use cron-config, some log and pid files owned by SYSTEM have to be deleted)
> 2) Run the daemon in debug mode
> cygrunsrv -I cron -p /usr/sbin/cron -a "-x sch,proc,pars,load,misc"
> A lot of debug output will be written to /var/log/cron.log , so stop cron after a few minutes
> and send me the file. This will show us if cron works OK every minute, but won't help determine why
> commands do nothing.
> Unfortunately cron doesn't log exec failures, it just mails them (which requires another exec) :(
>
> Pierre
>

I will run the debug tonight; unfortunately I have to leave for the rest
of the day. However, here are some pieces of information:

1) I started a manual run as user impact: just /usr/sbin/cron. I deleted
the 0-size /var/log/cron belonging to system, but the pid file was
written without needing to delete it.
2) I checked the event log with sysinternals psloglist; it shows the same
as cronevents. No surprise there.
3) I changed the crontab that writes the date to accumulate instead of
overwriting, so we can check how it runs.
4) Both the date and the 'check_jezam' script log properly every minute
without fail (despite what the eventlog says).

I will mail these logs + the eventlog with the debug output tonight. In
the meantime, gotta go.

Thanks very very much for your help!

Mike

--
Problem reports: http://cygwin.com/problems.html
FAQ: http://cygwin.com/faq/
Documentation: http://cygwin.com/docs.html
Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple