Zucca Moderator
Joined: 14 Jun 2007 Posts: 3687 Location: Rasi, Finland
|
Posted: Mon Jul 29, 2024 12:15 pm Post subject: Using basic shell tools to get mount (fs) stats |
|
|
I thought I could programmatically just parse output of df -PTk, but it seems that the output isn't meant for parsing. Just have a mount point path with a white space in it, and the parsing fails.
Does anyone here know if there are any ways to do this with basic shell tools? I have tried to use busybox only.
If I loosen the requirement a bit, I could have a very small C program that gets all the same info as df, but separated with tabs or (even better) NUL. Another way would be to use python for that, but then it would require more stuff, since python is an interpreted language.
I found it strange that there are no sysfs entries which provide this information. Yes, we have /proc/mounts, but that's mount stats, not filesystem stats. No free space numbers there. _________________ ..: Zucca :..
My gentoo installs: | init=/sbin/openrc-init
-systemd -logind -elogind seatd |
Quote: | I am NaN! I am a man! |
szatox Advocate
Joined: 27 Aug 2013 Posts: 3421
|
Posted: Mon Jul 29, 2024 1:08 pm Post subject: |
|
|
Well, you do know the number of columns, there are no empty columns, and I think the only column that MAY contain spaces is the last one, so... Is this good enough?
Code: | df -PTk | while read -r A B C D E F G; do printf '%s\n' "$A $B $C $D $E $F $G"; done |
A space in the device name will mess things up, but who does that anyway? _________________ Make Computing Fun Again |
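A minimal sketch of the same trick wrapped in a function, assuming it is fed single data lines from df -PTk. The name parse_df_line and the sample line are made up for illustration. read hands everything left over to the last variable, which is why a mount point with spaces in it stays in one piece:

```shell
#!/bin/sh
# parse_df_line is a hypothetical helper, not part of any tool.
# $1: one data line (not the header) of `df -PTk` output.
parse_df_line() {
    printf '%s\n' "$1" | {
        # Six fixed columns, then the rest of the line lands in "mount".
        read -r dev type size used avail pct mount
        printf '%s|%s|%s|%s\n' "$dev" "$type" "$avail" "$mount"
    }
}

# Invented sample line; real df output pads the columns with more spaces.
parse_df_line '/dev/sda1 ext4 1000 400 600 40% /mnt/my disk'
```

The same caveat applies as for the one-liner: a space in the device name shifts every column.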
John R. Graham Administrator
Joined: 08 Mar 2005 Posts: 10655 Location: Somewhere over Atlanta, Georgia
|
Posted: Mon Jul 29, 2024 1:15 pm Post subject: Re: Using basic shell tools to get mount (fs) stats |
|
|
Zucca wrote: | ... Does anyone here know if there are any ways to do this with basic shell tools? I have tried to use busybox only. | Wouldn't the busybox built-in AWK count as part of "basic shell tools"? There's a full set of string manipulation functions there.
- John _________________ I can confirm that I have received between 0 and 499 National Security Letters.
Last edited by John R. Graham on Mon Jul 29, 2024 1:20 pm; edited 1 time in total |
lars_the_bear Guru
Joined: 05 Jun 2024 Posts: 517
|
Posted: Mon Jul 29, 2024 1:18 pm Post subject: |
|
|
If the mount point never contains more than one space in a row, you can just collapse the runs of spaces and then parse the result with `cut`:
Code: | df -PTk | tr -s ' ' | cut -d ' ' -f 7-
|
That avoids the bash loop, but with that limitation. Parsing with awk is also possible, but it's surprisingly nasty to get 'rest of line' as a variable in awk, without picking up unwanted delimiters.
If you're even considering using C, why not just use statfs() to get the filesystem metrics, rather than transforming the output of df?
BR, Lars. |
pingtoo Veteran
Joined: 10 Sep 2021 Posts: 1238 Location: Richmond Hill, Canada
|
Posted: Mon Jul 29, 2024 1:52 pm Post subject: |
|
|
I am not sure whether busybox has this command, but using Code: | findmnt -Dl | # whatever parser you like
| may be easier. You can add the -b option if you want the output in bytes. |
Zucca Moderator
Joined: 14 Jun 2007 Posts: 3687 Location: Rasi, Finland
|
Posted: Mon Jul 29, 2024 2:00 pm Post subject: Re: Using basic shell tools to get mount (fs) stats |
|
|
John R. Graham wrote: | Wouldn't the busybox built-in AWK count as part of "basic shell tools"? Full set of string manipulation functions there.
- John | Hm. Yes. Assuming device node paths don't contain whitespace (which is very reasonable), I could read each column in awk (or in shell, really) through $1 alone, shifting after each column value is stored, and then finally take the mount point path (which may contain whitespace) from $0 (or from $* in shell).
That should do it.
Thanks.
It's not completely foolproof, as someone may name their logical volumes with whitespace (although I would assume udev would convert whitespace to "_" or "-"). Very much a corner case.
I'll work something out. This would be easy with sh, but I want to juggle with awk. :D _________________ ..: Zucca :..
Zucca Moderator
Joined: 14 Jun 2007 Posts: 3687 Location: Rasi, Finland
|
Posted: Mon Jul 29, 2024 2:19 pm Post subject: |
|
|
lars_the_bear wrote: | If you're even considering using C, why not just use statfs() to get the filesystem metrics, rather than transforming the output of df?
BR, Lars. |
- My C skills aren't that good.
- I'd prefer to use only busybox.
Finally I came up with this: awk code: | BEGIN {
    while (("df -PTk 2> /dev/null" | getline) > 0) {
        if (g != 1) {
            g = 1
            continue
        }
        devpath=$1
        type=$2
        size=$3
        used=$4
        free=$5
        perc=$6
        for (f=1; f<7; f++) $f=""
        sub(/^[[:space:]]+/,"")
        mount=$0
        # Do something with all this.
    }
} | Not very elegant, as awk doesn't really have shift. _________________ ..: Zucca :..
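One side effect of zeroing the fields: assigning to a field makes awk rebuild $0 with single spaces between fields, so a run of spaces inside the mount point gets squeezed. A possible alternative, sketched here under the same assumption that the first six columns never contain whitespace, is to strip those six columns off a copy of the line with sub(). The name mount_from_df_line and the sample input are illustrative only:

```shell
#!/bin/sh
# mount_from_df_line is a hypothetical helper.
# $1: one data line (not the header) of `df -PTk` output.
mount_from_df_line() {
    printf '%s\n' "$1" | awk '{
        rest = $0
        # Remove one leading column plus the whitespace after it, six times.
        for (i = 1; i <= 6; i++)
            sub(/^[ \t]*[^ \t]+[ \t]+/, "", rest)
        print rest
    }'
}

# Invented sample line with a double space inside the mount point.
mount_from_df_line '/dev/sda1  ext4  1000  400  600  40% /mnt/my  disk'
```

Because $0 itself is never modified, the spacing inside the mount point is left as df printed it.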
John R. Graham Administrator
Joined: 08 Mar 2005 Posts: 10655 Location: Somewhere over Atlanta, Georgia
|
Posted: Mon Jul 29, 2024 6:51 pm Post subject: |
|
|
Zucca wrote: | ...awk doesn't really have shift. | No need for shift...or zeroing out all of those intervening fields. Just do Code: | mount = substr($0, 83) |
Note that substring operations can be done in bash as well (see "Substring Expansion" in the man page), but I like bash + AWK way better than bash alone.
- John _________________ I can confirm that I have received between 0 and 499 National Security Letters. |
Zucca Moderator
Joined: 14 Jun 2007 Posts: 3687 Location: Rasi, Finland
|
Posted: Mon Jul 29, 2024 8:24 pm Post subject: |
|
|
John R. Graham wrote: | No need for shift...or zeroing out all of those intervening fields. Just do Code: | mount = substr($0, 83) |
| I found out that the zeroing was the easier way, since you'd need to add up all length()'s and the spacing in between the fields to get the offset for substr(). Or is there a simpler way?
I thought of using split(), to create an array from $0... That would be another way too. _________________ ..: Zucca :..
szatox Advocate
Joined: 27 Aug 2013 Posts: 3421
|
Posted: Mon Jul 29, 2024 8:34 pm Post subject: |
|
|
I think you just overcomplicated with awk the same thing I did with while read... And in the end you'll still have to actually access the data in the output _somehow_, instead of just referring one of the variables bound by read.
BTW, how do you even run this awk script? I can't make it produce any results _________________ Make Computing Fun Again |
Zucca Moderator
Joined: 14 Jun 2007 Posts: 3687 Location: Rasi, Finland
|
Posted: Mon Jul 29, 2024 9:11 pm Post subject: |
|
|
szatox wrote: | BTW, how do you even run this awk script? I can't make it produce any results | Replace Code: | # Do something with all this. | with... for example. Then run it with... or just add shebang: and make it executable.
szatox wrote: | I think you just overcomplicated with awk the same thing I did with while read... And in the end you'll still have to actually access the data in the output _somehow_, instead of just referring one of the variables bound by read. | I sure did overcomplicate it. Zucca wrote: | ...but I want to juggle with awk. :D |
The point of all this is to write fast (lightweight) scripts that I use with my status bar (yambar).
But really I should just write this in C as a yambar module. _________________ ..: Zucca :..
John R. Graham Administrator
Joined: 08 Mar 2005 Posts: 10655 Location: Somewhere over Atlanta, Georgia
|
Posted: Mon Jul 29, 2024 10:55 pm Post subject: |
|
|
Zucca wrote: | I found out that the zeroing was the easier way, since you'd need to add up all length()'s and the spacing in between the fields to get the offset for substr(). Or is there a simpler way? | I was assuming that the output was fixed width; admittedly, that might be an unwarranted assumption, so: Code: | {
    while (...) {
        if (NR == 1) {
            mount_offset = index($0, "Mounted")
            continue
        }
        ...
        mount = substr($0, mount_offset)
        # Do something with all this.
    }
} | - John _________________ I can confirm that I have received between 0 and 499 National Security Letters. |
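To try that idea without touching a real system, here is a small sketch that reads df-style text on standard input. It assumes every data row is aligned with the header, an assumption that breaks as soon as some column overflows its width. mounts_by_header_offset is a hypothetical name, and the sample rows are generated with printf rather than taken from real output:

```shell
#!/bin/sh
# mounts_by_header_offset is a hypothetical helper: it prints the last column
# of aligned df-style input, located via the "Mounted on" header.
mounts_by_header_offset() {
    awk '
        NR == 1 { off = index($0, "Mounted"); next }  # where the last column starts
        off > 0 { print substr($0, off) }             # from that column to end of line
    '
}

# Invented, aligned sample in the shape of `df -PTk` output.
fmt='%-10s %-5s %11s %5s %9s %8s %s\n'
{
    printf "$fmt" Filesystem Type 1024-blocks Used Available Capacity 'Mounted on'
    printf "$fmt" /dev/sda1 ext4 1000 400 600 40% '/mnt/my disk'
} | mounts_by_header_offset
```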
Goverp Advocate
Joined: 07 Mar 2007 Posts: 2177
|
Posted: Tue Jul 30, 2024 7:33 am Post subject: Re: Using basic shell tools to get mount (fs) stats |
|
|
Zucca wrote: | I thought I could programmatically just parse output of df -PTk, but it seems that the output isn't meant for parsing. Just have a mount point path with a white space in it, and the parsing fails.
... |
Read up on awk FIELDWIDTHS. It's designed for exactly this purpose. _________________ Greybeard |
lars_the_bear Guru
Joined: 05 Jun 2024 Posts: 517
|
Posted: Tue Jul 30, 2024 8:06 am Post subject: |
|
|
Zucca wrote: | My C skills aren't that good. I'd prefer to use only busybox. |
Shame, because it's so much easier (and less error-prone) to get this kind of information in C than by hacking around with awk, etc. I concede that I might be the only person on Earth who thinks this ;)
BR, Lars.
Code: |
#include <sys/statvfs.h>
#include <stdio.h>
#include <mntent.h>
int main () {
    FILE *f = setmntent ("/proc/mounts", "r");
    if (f) {
        struct mntent *mnt;
        while ((mnt = getmntent (f))) { // Get next mount
            struct statvfs result;
            if (statvfs (mnt->mnt_dir, &result) == 0) {
                if (result.f_blocks) { // Only show mounts that have a size
                    double free_percent = 100.0 * result.f_bfree / result.f_blocks;
                    // Show mount point and free space
                    printf ("mount='%s' free=%.1f%%\n", mnt->mnt_dir, free_percent);
                }
            }
        }
        fclose (f);
    }
}
|
Zucca Moderator
Joined: 14 Jun 2007 Posts: 3687 Location: Rasi, Finland
|
Posted: Tue Jul 30, 2024 8:33 am Post subject: Re: Using basic shell tools to get mount (fs) stats |
|
|
Goverp wrote: | Read up on awk FIELDWIDTHS. It's designed for exactly this purpose. | For that to work I'd need to force gnu coreutils. Looks like FIELDWIDTHS is gawk only, and for some reason busybox df fixed width columns break when one of the columns is too wide.
Anyways I got my awk code to cope with busybox too.
lars_the_bear wrote: | Shame, because it's so much easier (and less error-prone) to get this kind of information in C, than by hacking around with awk, etc. I concede that I might be the only person on Earth who thinks this ;)
BR, Lars. | No. You're absolutely right. That's much simpler than all this other "hackery".
My quest here was really not to parse the output of df, since it isn't really meant to be parsed, but to find another solution using basic shell tools. But there isn't one, I assume. Thus the C route is the simplest. _________________ ..: Zucca :..
bstaletic Guru
Joined: 05 Apr 2014 Posts: 363
|
Posted: Tue Jul 30, 2024 6:30 pm Post subject: |
|
|
@lars_the_bear That fclose(f) should have been endmntent(f).
man setmntent wrote: |
The setmntent() function opens the filesystem description file filename and returns a file pointer which can be used by getmntent(). The argument type is the type of access required
and can take the same values as the mode argument of fopen(3). The returned stream should be closed using endmntent() rather than fclose(3).
|
RumpletonBongworth n00b
Joined: 17 Jun 2024 Posts: 74
|
Posted: Wed Jul 31, 2024 1:34 am Post subject: Re: Using basic shell tools to get mount (fs) stats |
|
|
Zucca wrote: | I'll work something out. This would be easy with sh, but I want to juggle with awk. |
However easy you may think it is, it usually isn't. Below is a script that should work with any conforming implementation of sh(1) and df(1), while also depending on proc(5).
Code: |
#!/bin/sh
# Slow. Portable. Don't use it for status bar duty. This is really to prove the
# point that correctness and speed sometimes cannot be attained at the same time
# if relying solely on sh and the standard utilities.
# Don't perform pathname expansion.
set -f
while read -r _ mountpoint _; do
    # Decode octal escape sequences.
    mountpoint=$(printf %b. "$mountpoint")
    mountpoint=${mountpoint%.}
    # Collect the stats. Doing this per-mountpoint is expensive but safe.
    # POSIX does not guarantee any particular format for <file system root>.
    set -- $(df -kP -- "$mountpoint" | tail -n +2)
    # Without -T, the first df column is the file system name, not its type.
    fsname=$1 blocks=$2 used=$3 free=$4 capacity=$5
    # Do as you wish with the above-assigned variables ...
done < /proc/self/mounts
|
It could be made less expensive by having findmnt(1) from util-linux be a requirement. Of course, you would forgo strict XCU compatibility by doing that. As such, you'll have to decide what your priorities are. The suggestion to write it in C may be the most appropriate one for you. |
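The decoding step is the part of the script that is easiest to get wrong, so here it is in isolation as a sketch. decode_mount_field is a made-up name and the sample path is invented. The trailing dot keeps a trailing newline in the decoded path from being eaten by the command substitution, and is removed again afterwards:

```shell
#!/bin/sh
# decode_mount_field is a hypothetical helper.
# $1: a mount point field as it appears in /proc/self/mounts.
decode_mount_field() {
    # %b expands octal escapes such as \040 (space) and \011 (tab).
    decoded=$(printf '%b.' "$1")
    decoded=${decoded%.}
    printf '%s' "$decoded"
}

# Invented sample: a mount point with one space in it.
decode_mount_field '/mnt/my\040disk'; echo
```

Note that a caller who captures the function's output with $(...) loses trailing newlines again; reading the variable decoded directly avoids that.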
lars_the_bear Guru
Joined: 05 Jun 2024 Posts: 517
|
Posted: Wed Jul 31, 2024 7:11 am Post subject: |
|
|
bstaletic wrote: | @lars_the_bear That fclose(f) should have been endmntent(f).
|
Yeah, probably. The glibc implementation of endmntent() is just a call to fclose(), but I suppose that can't be guaranteed.
A bigger problem with my code is that it has no error checking. I don't think I claimed it was production-quality code.
BR, Lars. |
Zucca Moderator
Joined: 14 Jun 2007 Posts: 3687 Location: Rasi, Finland
|
Posted: Wed Jul 31, 2024 10:17 am Post subject: Re: Using basic shell tools to get mount (fs) stats |
|
|
RumpletonBongworth wrote: | Zucca wrote: | I'll work something out. This would be easy with sh, but I want to juggle with awk. :D |
However easy you may think it is, it usually isn't. Below is a script that should work with any conforming implementation of sh(1) and df(1), while also depending on proc(5). | Well... I have a history of torturing myself trying to create shell scripts that work on almost every shell environment. I have that "ps clone" updated version somewhere on my other box. It fares quite well against the procps and busybox versions in terms of speed. I was pretty surprised it turned out to be that fast. It was a fun exercise. _________________ ..: Zucca :..
RumpletonBongworth n00b
Joined: 17 Jun 2024 Posts: 74
|
Posted: Wed Jul 31, 2024 4:13 pm Post subject: Re: Using basic shell tools to get mount (fs) stats |
|
|
Yes, it can be fun to write programs within such constraints. Regarding your other post, there are a few potential pitfalls: it uses find -regex (not portable); it uses stat (not a standard utility); it uses realpath (only recently standardised by Issue 8). The find dependency is a tractable one, at least.
Code: |
for d in /proc/*/; do
    d=${d%/}
    case ${d##*/} in
        *[!0-9]*) continue
    esac
    printf '%s\n' "$d"
done
|
Zucca Moderator
Joined: 14 Jun 2007 Posts: 3687 Location: Rasi, Finland
|
Posted: Wed Jul 31, 2024 4:27 pm Post subject: |
|
|
Yeah. I think I loosened the restrictions and made sure it runs on bash+coreutils or with busybox internals. _________________ ..: Zucca :..
Goverp Advocate
Joined: 07 Mar 2007 Posts: 2177
|
Posted: Wed Jul 31, 2024 7:35 pm Post subject: |
|
|
Here follows some pure shell script (tested with dash) to parse columnar stuff like the "df" output in question.
You can "trivially" extend it to handle more columns (probably build a delimited series of masks for each column to iterate over. Also, beware of right-justified columns and headings...)
I assume the column widths are set according to the df output - if they're fixed, you can just generate the masks or even type them in.
Code: | #!/bin/dash
### Split off first column of something with a header line
# Generate a shell pattern matching the specified column width.
# This can be optimized using powers of two, but for now:
mask() {
    count="$1"
    m=""
    while [ "$count" -gt 0 ]
    do
        m="?$m"
        count=$(( count - 1 ))
    done
    printf '%s' "$m"
}
read -r hdr
# Calculate the width of the first column, using the header line
not1="${hdr#* [! ]}" # hdr less everything to 1st letter of second word
col1="${hdr%$not1}" # everything to 1st letter of second word
col1="${col1%?}" # this is the heading for column 1, including padding
width1="${#col1}" # So we have width of column 1.
m1=$(mask "$width1")
while read -r line
do
    rest="${line#$m1}"
    word1="${line%$rest}"
    printf '%s.\n' "$word1"
done
|
_________________ Greybeard |
RumpletonBongworth n00b
Joined: 17 Jun 2024 Posts: 74
|
Posted: Wed Jul 31, 2024 8:31 pm Post subject: |
|
|
Goverp wrote: | Here follows some pure shell script (tested with dash) to parse columnar stuff like the "df" output in question. |
It cannot be presumed that the newline character indicates the end of a record printed by df(1). Taking the busybox implementation as an example, the <file system root> field is emitted as raw bytes. In the event that a mount point contains a newline character - however unlikely that may be in practice - its output will no longer be columnar in nature. GNU coreutils, on the other hand, produces corrupted output by converting such characters to "?". As such, its output remains columnar but the bytes that form the mount point may be incorrect. The proc(5) interface does not suffer from this problem since it presents 'special' characters as octal escape sequences that can be trivially decoded with printf %b. It is a pity that df(1) is not required to act so sensibly.
EDIT: As an aside, be careful when writing parameter expansions of the form ${word1#$word2}. The value of word2 will be treated as a pattern in turn, which matters if globbing metacharacters are present. Wherever the intent is to have said value be treated as a literal string, the expansion may be double-quoted e.g. ${word1#"$word2"}. |
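A small sketch of the difference, using made-up function names and a contrived value that contains a bracket expression:

```shell
#!/bin/sh
# Both helpers are hypothetical; they differ only in the inner quoting.
strip_prefix_as_pattern() { printf '%s\n' "${1#$2}"; }
strip_prefix_as_literal() { printf '%s\n' "${1#"$2"}"; }

# As a pattern, [x] matches the single character "x", which is not how the
# value starts, so nothing is removed and "[x]yz" is printed.
strip_prefix_as_pattern '[x]yz' '[x]'

# As a literal, the three characters "[x]" are removed and "yz" is printed.
strip_prefix_as_literal '[x]yz' '[x]'
```

The outer double quotes around the whole expansion do not change this; only quoting inside the braces makes the value literal.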
Goverp Advocate
Joined: 07 Mar 2007 Posts: 2177
|
Posted: Thu Aug 01, 2024 8:44 am Post subject: |
|
|
RumpletonBongworth wrote: | ...
EDIT: As an aside, be careful when writing parameter expansions of the form ${word1#$word2}. The value of word2 will be treated as a pattern in turn, which matters if globbing metacharacters are present. Wherever the intent is to have said value be treated as a literal string, the expansion may be double-quoted e.g. ${word1#"$word2"}. |
Ah, but if you read the code carefully, you'll notice that the expansion is exactly what I want. _________________ Greybeard |
Goverp Advocate
Joined: 07 Mar 2007 Posts: 2177
|
Posted: Thu Aug 01, 2024 8:54 am Post subject: |
|
|
RumpletonBongworth wrote: | Goverp wrote: | Here follows some pure shell script (tested with dash) to parse columnar stuff like the "df" output in question. |
It cannot be presumed that the newline character indicates the end of a record printed by df(1).
... |
Probably so (though anyone who includes a newline character in a mount point deserves all the problems they get). In a system with such a weird mountpoint, I expect parsing "df" output for statistics will be the least of their problems! Whatever, I was exhibiting some pure shell code to handle parsing text in columns; if the input is not columnar due to embedded newlines in the data, I can therefore claim it's a user error.
If the last column is right-justified (so all lines are of fixed length), something based on my code could be used to (1) determine the column widths from the header line, and (2) split off columns one-by-one from each line, then eat the newline, and so on - though I guess it would need to read the entire input in one pass. Alternatively, if the input from a normal "read line" is too short, add a newline, then read and append the next line. _________________ Greybeard |