Author |
Message |
as.gentoo Guru
Joined: 07 Aug 2004 Posts: 321
|
Posted: Fri Aug 24, 2018 1:57 pm Post subject: |
|
|
VinzC wrote: | I'm confident that you probably have a strong technical ground but do you have data to back up your claims? Do/did you see an impact on performance with the current state of caching algorithms you're pointing at? | I don't have the background, otherwise I wouldn't ask here. Since I think ZFS fits my needs best, I tried to understand how caching is done there. IMHO it's more advanced than the page cache, and there are a lot of other nice features. Of course there are some drawbacks, but by far not big enough to make me use EXT or XFS in general.
In this case it would be better not to use ZFS for that single drive: if the data gets lost it is no big deal (RAID part of ZFS not needed), and the data won't change (no snapshots needed)… I do not want to have two zpools … just like you wouldn't want two same-type FS to work separately (no RAM/code sharing/…) next to each other on the same box.
EDIT: It's not exactly like that (when having two pools) but it works more or less that way.
Last edited by as.gentoo on Fri Aug 24, 2018 3:15 pm; edited 6 times in total |
|
|
|
steveL Watchman
Joined: 13 Sep 2006 Posts: 5153 Location: The Peanut Gallery
|
Posted: Fri Aug 24, 2018 2:05 pm Post subject: Re: How do I inhibit that a large file spoils the page cache |
|
|
as.gentoo wrote: | I have a partition containing videos and music. I won't watch the same movie(s) every day! However, watching a movie will (in case the page cache is fully used) kick older data from the page cache, most probably data that is really suited to be kept in RAM.
I know that there is a tool that can force data to be kept in the page cache ( dev-util/vmtouch ), I need the opposite.
..
Can / how do I tell the kernel to omit putting files / directory contents into the page cache? | I am not sure you can, nor that you need to.
As you said the real issue is "to prevent a huge file drive away all the cache-worthy files from the cache" and that does not really happen: that's the point of the kernel handling it.
I think the confusion is that the bcache ("buffer cache") is always going to contain the file data when it is read, because that is how files are read. The bcache is the original underlying memory implementation since UNIX was first developed. cf: Lions and Bach.
There is no way to avoid that, in general: the file must be read into memory.
An application such as a video player would be well advised to use posix_madvise with POSIX_MADV_SEQUENTIAL in general, and madvise specifically on Linux, since POSIX_MADV_DONTNEED is ignored there (see the latter link.)
Other than that, the kernel will evict the pages if they're not used; quite often they are.
For instance, you say a tarball will not be read soon after creation, but I quite often do a quick untar and a cmp run to verify the backup.
In general, it does no harm for the bcache to stay current; it actually does a lot of good, which is the point of it.
Typically it only keeps the more recently-read chunks; you're seeing more, because you have so much RAM. |
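To make the madvise suggestion above concrete, here is a minimal sketch of a reader that maps a file and hints sequential access. It assumes Python 3.8+ on Linux, where `mmap.mmap.madvise` wraps madvise(2); the function name `count_bytes_sequential` and the chunk size are hypothetical, and the hint is only advice that the kernel is free to ignore.

```python
import mmap
import os


def count_bytes_sequential(path, chunk_size=1 << 20):
    """Walk a file through a read-only mapping, hinting sequential access."""
    size = os.path.getsize(path)
    if size == 0:
        return 0  # an empty file cannot be mapped
    total = 0
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        # Guarded: madvise and MADV_SEQUENTIAL are not available on every platform.
        if hasattr(m, "madvise") and hasattr(mmap, "MADV_SEQUENTIAL"):
            m.madvise(mmap.MADV_SEQUENTIAL)
        for start in range(0, size, chunk_size):
            # A real player would decode the chunk here; this sketch only counts bytes.
            total += len(m[start:start + chunk_size])
    return total
```

The file still passes through memory, as the post says; the hint only tells the kernel that pages behind the read position are unlikely to be needed again.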
|
|
|
as.gentoo Guru
Joined: 07 Aug 2004 Posts: 321
|
Posted: Fri Aug 24, 2018 2:21 pm Post subject: |
|
|
Are you sure? Here's what the kernel doc says (from /usr/src/linux/Documentation/bcache.txt):
Quote: | Say you've got a big slow raid 6, and an X-25E or three. Wouldn't it be
nice if you could use them as cache... Hence bcache.
[…]
It's designed to avoid random writes at all costs; it fills up an erase block
sequentially, then issues a discard before reusing it.
[…]
Bcache detects sequential IO and skips it; |
Quote: | As you said the real issue is "to prevent a huge file drive away all the cache-worthy files from the cache" and that does not really happen: that's the point of the kernel handling it. | Well, the output of vmtouch shows that the whole page cache holds solely the data of the big file. |
|
|
|
1clue Advocate
Joined: 05 Feb 2006 Posts: 2569
|
Posted: Fri Aug 24, 2018 2:46 pm Post subject: |
|
|
Have you considered copying the programs you want to cache into a RAM disk and running them from there? If you do that, then those files will always be in there unless you run out of RAM, and the cached data won't affect your system.
It's a little awkward, (I'd say sorry for the pun but I'm not really very sorry) but using ramdisks is mature and well documented. It may be a workaround for what you're experiencing. |
|
|
|
tholin Apprentice
Joined: 04 Oct 2008 Posts: 207
|
Posted: Fri Aug 24, 2018 2:52 pm Post subject: |
|
|
as.gentoo wrote: | Are you sure, here's what the kernel Doc says (from /usr/src/linux/Documentation/bcache.txt) |
Bcache (drivers/md/bcache) and buffer cache are completely different things. Bcache is used for having an ssd cache data for a slower spinning disk. That's unrelated to the page cache.
The buffer cache was removed long ago, but for legacy reasons the kernel still reports that some data is used by the buffer cache. The page cache is what steveL is really talking about.
as.gentoo wrote: | Well, the output of vmtouch shows that the whole page cache holds solely the data of the big file. |
Only because you read it twice in a short period of time. If you had read it only once it wouldn't have replaced the older data. |
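The "read once versus read twice" behaviour can be illustrated with a toy model. This is a deliberately simplified sketch, not the kernel's actual algorithm: it assumes only that pages touched once sit on an "inactive" list and are reclaimed first, while pages touched again are promoted to an "active" list. The real kernel also demotes active pages and tracks refaults, which this model leaves out. All names (`TwoListCache`, `simulate`) and sizes are hypothetical.

```python
from collections import OrderedDict


class TwoListCache:
    """Toy two-list page cache: pages seen once are 'inactive', seen again 'active'."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.inactive = OrderedDict()  # read once; oldest entry first
        self.active = OrderedDict()    # read at least twice while resident

    def access(self, page):
        if page in self.active:
            self.active.move_to_end(page)   # refresh recency
        elif page in self.inactive:
            del self.inactive[page]         # second touch: promote
            self.active[page] = True
        else:
            self.inactive[page] = True      # first touch: start on inactive
            self._evict()

    def _evict(self):
        # Reclaim from the inactive list first; use active only if inactive is empty.
        while len(self.inactive) + len(self.active) > self.capacity:
            victims = self.inactive if self.inactive else self.active
            victims.popitem(last=False)

    def resident(self, pages):
        return sum(1 for p in pages if p in self.inactive or p in self.active)


def simulate(capacity=100, hot_pages=50, big_pages=1000):
    cache = TwoListCache(capacity)
    hot = [("hot", i) for i in range(hot_pages)]
    big = [("big", i) for i in range(big_pages)]
    for _ in range(2):      # hot data is read twice, so it gets promoted
        for page in hot:
            cache.access(page)
    for page in big:        # the big file is streamed exactly once
        cache.access(page)
    return cache.resident(hot), cache.resident(big)
```

With these made-up sizes, `simulate()` returns `(50, 50)`: every hot page survives, and only the last 50 pages of the streamed file stay resident. Reading the big file a second time while its pages are still resident would promote them, and they would then compete with the hot data on equal terms.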
|
|
|
as.gentoo Guru
Joined: 07 Aug 2004 Posts: 321
|
Posted: Fri Aug 24, 2018 2:55 pm Post subject: |
|
|
1clue wrote: | Have you considered copying the programs you want to cache into a RAM disk and running them from there? If you do that, then those files will always be in there unless you run out of RAM, and the cached data won't affect your system. It may be a workaround for what you're experiencing. |
Yes, it may be a way but there will be tons of libraries… maybe /lib* could be synced to ramdisk…
Does anybody have experience with that?
OTOH this is a workstation / home box: I use a lot of programs, and some of the data should be available (if only extended) over a long period.
I could create a new rootFS on tmpfs - maybe using LXC. Sounds very cumbersome to me.
Last edited by as.gentoo on Fri Aug 24, 2018 3:04 pm; edited 1 time in total |
|
|
|
1clue Advocate
Joined: 05 Feb 2006 Posts: 2569
|
Posted: Fri Aug 24, 2018 2:59 pm Post subject: |
|
|
as.gentoo wrote: | 1clue wrote: | Have you considered copying the programs you want to cache into a RAM disk and running them from there? If you do that, then those files will always be in there unless you run out of RAM, and the cached data won't affect your system. It may be a workaround for what you're experiencing. |
Yes, it may be a way but there will be tons of libraries… maybe /lib* could be synced to ramdisk…
Does anybody have experience with that? |
There's a wiki on how to boot from a ramdisk. I don't suggest that, although it may be the easiest way for you and also may be feasible if the boot disk is small. How big is your non-data install?
Edit: This may be helpful: https://www.linuxquestions.org/questions/linuxquestions-org-member-success-stories-23/how-to-boot-os-into-ram-for-speed-and-silence-662116/ |
|
|
|
as.gentoo Guru
Joined: 07 Aug 2004 Posts: 321
|
Posted: Fri Aug 24, 2018 3:10 pm Post subject: |
|
|
I have a zpool with a RAID-10 (4x 932 GB) plus two (ZFS) level 2 cache SSDs (when the RAM cache (L1 cache) is full, data is moved to the dedicated fast RAID-1 SSDs (L2 cache)).
The remaining SSD holds 230 GB, to be used for S2D and a read-only data partition.
I plan on booting from a stick. AFAIK I need a ramdisk for loading the zfs module anyway. I already created one or two working initrds - although not with zfs but mdadm….
I think what was meant above is a RAM drive, which you create like Code: | mount -t tmpfs none /mnt/ramdisk -o 'size=5G' |
EDIT
1clue wrote: | There's a wiki on how to boot from a ramdisk. I don't suggest that, although it may be the easiest way for you and also may be feasible if the boot disk is small. How big is your non-data install? | Ah, you (1clue) did write about RAMDISK. Where's the connection to using an init-ramdisk?
Non-data install? You mean /var and /usr … roughly 128 GB, but I'm trying a lot of new things so it might be somewhat more.
Last edited by as.gentoo on Fri Aug 24, 2018 5:10 pm; edited 7 times in total |
|
|
|
as.gentoo Guru
Joined: 07 Aug 2004 Posts: 321
|
Posted: Fri Aug 24, 2018 3:45 pm Post subject: |
|
|
steveL wrote: | There is no way to avoid that, in general: the file must be read into memory. | I thought - maybe you could assign an amount of RAM space for a file (vmtouch can do that), and if the file is bigger, then the data that was read first is overwritten with the part of the file that is needed at that moment.
When RAM is full the OOM killer usually terminates a process, so the kernel does know what data belongs together (of course it does). It's not just a bunch of sectors/pages in nowhere. A per-file caching quota could be implemented. I'm not asking for that, just whether it is available or not.
I'm slowly getting the impression that the page cache is robust and fits most people's needs but it's (probably) not configurable. At least not to check sizes and prevent caching automatically - there is nothing like /sys/…/do_not_cache_bigger_files_than. |
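The thread does not identify a kernel-side per-file limit, but a program that reads the file itself can approximate one by telling the kernel to drop each chunk after use. The sketch below assumes a Unix system where Python exposes `os.posix_fadvise`; the function name `read_without_caching` is hypothetical, and `POSIX_FADV_DONTNEED` is a hint that the kernel may honour only partially.

```python
import os


def read_without_caching(path, chunk_size=1 << 20):
    """Read a file sequentially, asking the kernel to drop each chunk afterwards."""
    can_advise = hasattr(os, "posix_fadvise")  # absent on some platforms
    total = 0
    offset = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        if can_advise:
            # Length 0 means "to the end of the file".
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            # A real consumer would process the chunk here.
            total += len(chunk)
            if can_advise:
                # Hint that the pages just read will not be needed again.
                os.posix_fadvise(fd, offset, len(chunk), os.POSIX_FADV_DONTNEED)
            offset += len(chunk)
    finally:
        os.close(fd)
    return total
```

This only helps for programs you control; it does nothing for an unmodified video player, which is the limitation the post is pointing at.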
|
|
|
1clue Advocate
Joined: 05 Feb 2006 Posts: 2569
|
Posted: Fri Aug 24, 2018 5:54 pm Post subject: |
|
|
as.gentoo wrote: | I have a zpool with a RAID-10 (4x 932 GB) plus two (ZFS) level 2 cache SSDs (when the RAM cache (L1 cache) is full, data is moved to the dedicated fast RAID-1 SSDs (L2 cache)).
The remaining SSD holds 230 GB, to be used for S2D and a read-only data partition.
I plan on booting from a stick. AFAIK I need a ramdisk for loading the zfs module anyway. I already created one or two working initrds - although not with zfs but mdadm….
I think what was meant above is a RAM drive, which you create like Code: | mount -t tmpfs none /mnt/ramdisk -o 'size=5G' |
EDIT
1clue wrote: | There's a wiki on how to boot from a ramdisk. I don't suggest that, although it may be the easiest way for you and also may be feasible if the boot disk is small. How big is your non-data install? | Ah, you (1clue) did write about RAMDISK. Where's the connection to using an init-ramdisk?
Non-data install? You mean /var and /usr … roughly 128 GB, but I'm trying a lot of new things so it might be somewhat more. |
- You might have noticed that I suggested loading only those binaries you want cached in RAM on the ramdisk.
- I just looked on the "heaviest" gui system I have, and my binaries take around 11 GB. Not sure exactly what new things you're loading, but seeing as I don't use most of what's on my disk I have to wonder what sort of software you're using.
- At any rate, the code you want cached in RAM fits in RAM because you're caching it. I suggested a boot-from-RAM type install because it was possible it would work for you. Evidently not.
- Not sure if you intended sarcasm but I certainly read that from your post. Please keep in mind that a bunch of people who don't have your problem are trying to help you solve your problem. Much of this is brainstorming.
|
|
|
|
as.gentoo Guru
Joined: 07 Aug 2004 Posts: 321
|
Posted: Fri Aug 24, 2018 6:33 pm Post subject: |
|
|
tholin wrote: | as.gentoo wrote: | Are you sure, here's what the kernel Doc says (from /usr/src/linux/Documentation/bcache.txt) |
Bcache (drivers/md/bcache) and buffer cache are completely different things. Bcache is used for having an ssd cache data for a slower spinning disk. That's unrelated to the page cache.
The buffer cache was removed long ago, but for legacy reasons the kernel still reports that some data is used by the buffer cache. The page cache is what steveL is really talking about.
as.gentoo wrote: | Well, the output of vmtouch shows that the whole page cache holds solely the data of the big file. |
Only because you read it twice in a short period of time. If you had read it only once it wouldn't have replaced the older data. |
Code: | $> free -h
total used free shared buff/cache available
Mem: 62G 3.2G 4.5G 1.8G 55G 57G
Swap: 0B
#> dd if=/tmp/file1 bs=1G count=65 of=/dev/zero
65+0 records in
65+0 records out
69793218560 bytes (70 GB, 65 GiB) copied, 633.417 s, 110 MB/s
#> vmtouch -v * | wc -l
vmtouch: WARNING: not following symbolic link dev-zero
21397
#> vmtouch -v /tmp/file1
/tmp/file1
[ oOOOOOOOOOOOOOOOOOOOOOOOo ] 13234439/33554432
Files: 1
Directories: 0
Resident Pages: 13234439/33554432 50G/128G 39.4%
Elapsed: 1.3741 seconds
#> vmtouch -v /tmp/file2
/tmp/file2
[o ] 1/39842298
Files: 1
Directories: 0
Resident Pages: 1/39842298 4K/151G 2.51e-06%
Elapsed: 1.0271 seconds
#> dd if=/tmp/file2 bs=1G count=65 of=/dev/zero
65+0 records in
65+0 records out
69793218560 bytes (70 GB, 65 GiB) copied, 958.117 s, 72.8 MB/s
#> sudo vmtouch -v /tmp/file1
/tmp/file1
[ ] 0/33554432
Files: 1
Directories: 0
Resident Pages: 0/33554432 0/128G 0%
Elapsed: 0.69987 seconds
#> vmtouch -v /tmp/file2
/tmp/file2
[o oOOOOOOOOOOOOOOOOOOOo ] 13332832/39842298
Files: 1
Directories: 0
Resident Pages: 13332832/39842298 50G/151G 33.5%
Elapsed: 1.4155 seconds |
Good, so only ~3X% of each big file is cached.
And the cached data of big-file-1 in RAM is completely replaced by big-file-2.
Nice! It's not plain LRU.
So as far as I understand, adjusting the page cache is possible, but not on a per-file-limit basis. It's probably OK the way it works.
Thanks!
Last edited by as.gentoo on Fri Aug 24, 2018 7:26 pm; edited 4 times in total |
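The vmtouch figures above can be checked with a little arithmetic. This sketch assumes 4 KiB pages, which matches the "1 page = 4K" line in the output; the helper name `resident_summary` is hypothetical.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as the "1/39842298 4K/151G" line suggests
GIB = 1024 ** 3


def resident_summary(resident_pages, total_pages, page_size=PAGE_SIZE):
    """Convert vmtouch page counts into GiB and a residency percentage."""
    return {
        "resident_gib": resident_pages * page_size / GIB,
        "total_gib": total_pages * page_size / GIB,
        "percent": 100.0 * resident_pages / total_pages,
    }


file1 = resident_summary(13234439, 33554432)   # first vmtouch run on /tmp/file1
file2 = resident_summary(13332832, 39842298)   # last vmtouch run on /tmp/file2
```

For file1 this gives a total of exactly 128 GiB and roughly 39.4% resident (about 50 GiB); for file2 it gives roughly 33.5% resident, in line with what vmtouch printed.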
|
|
|
as.gentoo Guru
Joined: 07 Aug 2004 Posts: 321
|
Posted: Fri Aug 24, 2018 6:52 pm Post subject: |
|
|
1clue wrote: | - You might have noticed that I suggested loading only those binaries you want cached in RAM on the ramdisk.
- I just looked on the "heaviest" gui system I have, and my binaries take around 11 GB. Not sure exactly what new things you're loading, but seeing as I don't use most of what's on my disk I have to wonder what sort of software you're using.
- At any rate, the code you want cached in RAM fits in RAM because you're caching it. I suggested a boot-from-RAM type install because it was possible it would work for you. Evidently not.
- Not sure if you intended sarcasm but I certainly read that from your post. Please keep in mind that a bunch of people who don't have your problem are trying to help you solve your problem. Much of this is brainstorming.
|
- Actually I did not. I'm sorry for that. # no sarcasm!
- I do not want to explain this part. I mentioned some use cases. Please take it as a static thing.
- … I didn't make the connection. My fault - see #1.
- When I look at what I wrote I can't find sarcasm, and it's not my intention to be sarcastic. I appreciate everybody's help.
However, I got a bit tired of wanting an answer to a question while the focus was completely on what I could/should do differently. I'm no rookie. Again, it's fine to tell me and the forum members how I could do it differently. You told me that you don't know the answer to my question - again, that's okay.
Until the posting of tholin & steveL I was waiting for an answer to my question. *shrug* |
|
|
|
1clue Advocate
Joined: 05 Feb 2006 Posts: 2569
|
Posted: Fri Aug 24, 2018 7:36 pm Post subject: |
|
|
as.gentoo wrote: | 1clue wrote: | - You might have noticed that I suggested loading only those binaries you want cached in RAM on the ramdisk.
- I just looked on the "heaviest" gui system I have, and my binaries take around 11 GB. Not sure exactly what new things you're loading, but seeing as I don't use most of what's on my disk I have to wonder what sort of software you're using.
- At any rate, the code you want cached in RAM fits in RAM because you're caching it. I suggested a boot-from-RAM type install because it was possible it would work for you. Evidently not.
- Not sure if you intended sarcasm but I certainly read that from your post. Please keep in mind that a bunch of people who don't have your problem are trying to help you solve your problem. Much of this is brainstorming.
|
- Actually I did not. I'm sorry for that. # no sarcasm!
- I do not want to explain this part. I mentioned some use cases. Please take it as a static thing.
- … I didn't make the connection. My fault - see #1.
- When I look at what I wrote I can't find sarcasm, and it's not my intention to be sarcastic. I appreciate everybody's help.
However, I got a bit tired of wanting an answer to a question while the focus was completely on what I could/should do differently. I'm no rookie. Again, it's fine to tell me and the forum members how I could do it differently. You told me that you don't know the answer to my question - again, that's okay.
Until the posting of tholin & steveL I was waiting for an answer to my question. *shrug* |
Peace. I misunderstood something you said. It happens often on forums.
WRT answering your question, I can't answer it so I'm trying to come up with ideas that may help you work around it. If I'm being irritating about it, I'll stop. |
|
|
|
as.gentoo Guru
Joined: 07 Aug 2004 Posts: 321
|
Posted: Fri Aug 24, 2018 7:51 pm Post subject: |
|
|
1clue wrote: | as.gentoo wrote: | However, I got a bit tired of wanting an answer to a question while the focus was completely on what I could/should do differently. I'm no rookie. Again, it's fine to tell me and the forum members how I could do it differently. You told me that you don't know the answer to my question - again, that's okay.
Until the posting of tholin & steveL I was waiting for an answer to my question. *shrug* |
Peace. I misunderstood something you said. It happens often on forums.
WRT answering your question, I can't answer it so I'm trying to come up with ideas that may help you work around it. If I'm being irritating about it, I'll stop. | Now that was sarcasm, at least I hope so!
Nah, it's fine now that I know what I wanted to know in the first place. I think I was just impatient. |
|
|
|
VinzC Watchman
Joined: 17 Apr 2004 Posts: 5098 Location: Dark side of the mood
|
Posted: Sun Aug 26, 2018 6:49 pm Post subject: |
|
|
I learnt something new today! |
|
|
|
steveL Watchman
Joined: 13 Sep 2006 Posts: 5153 Location: The Peanut Gallery
|
Posted: Fri Aug 31, 2018 2:39 pm Post subject: |
|
|
tholin wrote: | The buffer cache was removed long ago, but for legacy reasons the kernel still reports that some data is used by the buffer cache. The page cache is what steveL is really talking about. | Indeed (in terms of usage); another term for the original UNIX bcache is "block cache".
It is in fact a general power-of-2 block-size allocator, from which the kernel malloc ("map alloc") function draws blocks of either 64 or 512 bytes (in the configuration given.)
I mention the detail solely to appeal to coders ;) who really do need to read Lions and Bach.
Lions has all the code; Bach adds diagrams and a more comprehensive overview, to Lions' wonderful commentary.
Wonderfully concise code, and very efficient. |
|
|
|
|