Tuesday, May 28, 2013

Count the number of lines added to a log file in a time interval

I needed to estimate how many lines are added to a log file in 10 minutes. The idea was to start from a command like tail -f file | wc -l, but that never ends: tail -f never exits, so wc never sees an end of file and stays blocked reading from the read end of the pipe whenever the pipe is empty.

So all I need to do is kill tail after a certain time (e.g. 10 minutes). Now, if I run tail -f file | wc -l in the background by adding a & at the end, the whole pipeline becomes one job for the shell, and kill %1 kills both the tail and the wc process. That way I never get any output from wc.

The solution was to save the PID of the tail process in a file, e.g. tail.pid, and then kill only tail by its PID. To save the PID we write the variable $! to descriptor number 3. $! expands to the process ID of the most recently executed background (asynchronous) command; since tail has just been started in the background, that is tail's PID. Writing it to standard output instead would send it down the pipe to wc, where it would be counted as a line. Before tail & echo is executed we make sure descriptor number 3 is open and redirected to a file (this is what 3>tail.pid does):

$ ( tail -f trace.130524.txt & echo $! >&3 ) 3>tail.pid | wc -l &
$ sleep 600 && kill $(< tail.pid)
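
The descriptor trick can be checked in isolation. The following is only a small sketch under assumptions of mine: sleep 1 stands in for tail -f, cat stands in for wc, and the pid file is a temporary file rather than tail.pid. It shows that the PID written to descriptor 3 ends up in the file, while only ordinary output travels through the pipe.

```shell
#!/usr/bin/env bash
# Scaled-down check of the descriptor trick; sleep stands in for tail -f.
pidfile=$(mktemp)   # hypothetical stand-in for tail.pid

# Inside the subshell, stdout is the pipe and descriptor 3 is the file.
# The PID is written to descriptor 3, so only "to the pipe" reaches the reader.
# The reader (cat) returns once sleep exits and the pipe is closed.
piped=$( ( sleep 1 & echo $! >&3; echo "to the pipe" ) 3>"$pidfile" | cat )

saved_pid=$(< "$pidfile")
echo "pipe carried: $piped"
echo "file holds PID: $saved_pid"
rm -f "$pidfile"
```

If the PID had been written with a plain echo $!, it would have shown up in the pipe as an extra line.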


The command substitution $(< file) is just a faster replacement for $(cat file).
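
For repeated use, the two commands could be wrapped in a small bash function. This is a sketch, not a polished tool: the name count_new_lines, its arguments and the temporary pid file are my own choices. It also assumes that only newly added lines should be counted, so it passes -n 0 to tail; without it, tail first prints the last 10 lines already in the file and wc counts those too.

```shell
#!/usr/bin/env bash
# count_new_lines FILE SECONDS
# Hypothetical helper: prints how many lines are appended to FILE within SECONDS.
count_new_lines() {
    local file=$1 seconds=$2 pidfile wc_pid
    pidfile=$(mktemp)

    # -n 0 makes tail skip the lines already in the file.
    # tail's PID goes to descriptor 3 (the pid file); its output goes to wc.
    ( tail -n 0 -f "$file" & echo $! >&3 ) 3>"$pidfile" | wc -l &
    wc_pid=$!   # for a background pipeline, $! is the PID of its last command (wc)

    sleep "$seconds"
    kill "$(< "$pidfile")"   # stop tail only; wc then reaches end of file
    wait "$wc_pid"           # let wc print the count before returning
    rm -f "$pidfile"
}
```

With the file from the example above, the call would be count_new_lines trace.130524.txt 600.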
