John Reese wrote:
> A question for the community-
> 
> Is there a way to monitor disk usage in real time *without* a disk read?
> 
> Here's the situation: Company managers have dumped large files onto the
> production server in the course of a single business day. The server has
> no capacity problem, but the backup system chokes and we lose our
> backup. The managers have no idea of the size file they are putting on
> the server, and they tend to do so in batches, which means that there is
> no 'incremental' buildup to watch over the course of a couple of days.
> It happens all at once, probably within the space of a few minutes.
> 
> I can catch the problem with a simple 'df -h' toward the end of the day,
> but if I see there's a problem for the backup there still is no way to
> find the problem directory unless I do a disk read with one of the many
> commands available. I can reduce the scope of the search, but the
> narrowest scope would still leave me with scores of gigabytes and
> hundreds of thousands of files. This would coincide with peak server
> activity, and even if I re-nice the disk read, I would still slow down
> the company for up to an hour at a critical time of day.
> 
> What I need is some sort of disk usage accounting that 1) does not rely
> on a real-time disk read; 2) can locate data growth by file and
> directory.  Is there anything out there that fills the bill?
> 
> John Reese
> 

I don't know of anything that will do exactly what you want.  Here are a
couple of things that might give you part of what you are looking for.

1. iostat will give you info about data written per partition.  The info
for this is stored in /proc & /sys, so you shouldn't have to do any
additional disk I/O to get it.  The counters are cumulative, so by
sampling them periodically you can get things like the amount of data
written to a partition in the last 6 hours.
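
If you would rather sample the counters yourself than run iostat, here
is a rough Python sketch.  It assumes the usual Linux /proc/diskstats
layout (device name in the 3rd column, sectors written in the 10th,
counted in 512-byte units); the function names are just ones I made up.
Note it only narrows things down to a partition, not a directory.

```python
import time

SECTOR_BYTES = 512  # /proc/diskstats counts sectors in 512-byte units


def parse_diskstats(text):
    """Return {device: bytes written since boot} from /proc/diskstats text."""
    written = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 10:
            continue  # skip blank or unexpectedly short lines
        # fields[2] is the device name, fields[9] is sectors written
        written[fields[2]] = int(fields[9]) * SECTOR_BYTES
    return written


def bytes_written_between(before, after):
    """Per-device growth in bytes written between two samples."""
    return {dev: after[dev] - before[dev] for dev in after if dev in before}


def read_diskstats(path="/proc/diskstats"):
    """Read the kernel counters; this does not touch the data disks."""
    with open(path) as f:
        return parse_diskstats(f.read())


if __name__ == "__main__":
    first = read_diskstats()
    time.sleep(60)  # sampling interval, pick whatever suits you
    second = read_diskstats()
    for dev, delta in sorted(bytes_written_between(first, second).items()):
        print(f"{dev}: {delta / 1e6:.1f} MB written")
```

Run from cron every few minutes and logged, something like this would
at least tell you when the big writes happened and on which partition.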

2. Process accounting can also give you I/O, but it is set up to report
things per user.  So this would tell you that user bob had 200 GB of I/O
in the last 6 hours.  However, process accounting requires that you log
all process info to a file and then process that file.  This can lead to
a small performance hit on the machine.  Processing the accounting files
isn't trivial either, but you could do that on another box.

--
Lee