Gentoo Forums
2022: Your (home) Gentoo server: how much data do you hoard?

How much data do you hoard at home?
< 250GB. I only have my phone.  5%  [ 2 ]
250GB ≤ x < 500GB I have a few computers  13%  [ 5 ]
500GB ≤ x < 1TB I store a few movies  10%  [ 4 ]
1TB ≤ x < 5TB I store a lot of stuff...  34%  [ 13 ]
5TB ≤ x < 10TB I store one heck of a lot of stuff...  10%  [ 4 ]
10TB ≤ x < 50TB Movies are all on disk  21%  [ 8 ]
50TB ≤ x < 200TB OMG  5%  [ 2 ]
≥ 200TB I'm waiting for Exabyte drives, and not silly tape.  0%  [ 0 ]
Total Votes : 38

Author Message
eccerr0r
Watchman
Joined: 01 Jul 2004
Posts: 9824
Location: almost Mile High in the USA

Posted: Mon Aug 29, 2022 3:23 am    Post subject: 2022: Your (home) Gentoo server: how much data do you hoard?

Curious about home users of Gentoo servers, or if you don't have a "server"... a workstation or laptop is fine. All your data.
What is your *working set* of data hoarded?

Don't count:
- Empty space. So if you're using 4KB on a 1ZB disk, you vote 4KB.
- Redundant data: if the data you have is a copy of another disk you own, don't count it, unless that other media is slow or unwieldy like tape, Blu-ray, or remote...

I'm curious: with the pervasiveness of streaming media, it almost seems people don't bother with running media servers anymore, they just run off the network. Is this the case nowadays? Note: I do not stream media, I don't have any streaming accounts: no NetF***, no Disease Plus, no Aneurysm Prime, etc., etc... I do PVR OTA, and that's consuming a significant amount of working set data, but I suspect my working set is still fairly low... or is it?

I also wonder if some of these categories are ridiculous, or if there aren't enough categories...

I count about 3TB of random crap on my machines, a lot of that is my PVR box recordings, and things like git histories aren't tiny...
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?

steve_v
Guru
Joined: 20 Jun 2004
Posts: 409
Location: New Zealand

Posted: Mon Aug 29, 2022 4:13 am

Not sure if it exactly counts since my fileserver runs Devuan (for the low maintenance unattended-upgrades). All the machines it serves are Gentoo boxes though, so:

Somewhere around ~18TB used of ~40TB currently. ZFS space accounting is a little weird and I'm excluding backups, snapshots, extra copies etc.
I do take out the trash from time to time, so it's been floating around 50% for a while now. Frankly a fair bit of what's on there is only so I can seed it, essentially as a middle-finger to our self-styled "big tech" overlords.


Screw streaming services, and screw SaaS/MaaS and all the other "cloud" bollocks. I started out ripping all my (legit) CDs, DVDs and LPs and recording TV (circa 1999), but since free-to-air is all but gone here and you can't easily get physical disks (or own anything you "buy" in general) any more... It would seem bittorrent is the future.

I've been saying it for decades, if you want people to buy media rather than torrenting it, make it easier than torrenting it...
Yet here we are with a bunch of competing services, platform exclusives, region-locking, onerous browser requirements and broken DRM. Basically, same shit different decade... With the added stab in the eye that even when you do pay for content, it's no longer yours to keep and play at your leisure.

I also self-host my mail, "cloud" file storage and link sharing, phone backup and sync (nextcloud), music streaming (supysonic) and a bunch of other stuff on that box. I don't really have any "devices" I'd care to watch video on that aren't PCs though, so that's just served over NFS locally.
_________________
Once is happenstance. Twice is coincidence. Three times is enemy action. Four times is Official GNOME Policy.

Spanik
Veteran
Joined: 12 Dec 2003
Posts: 1003
Location: Belgium

Posted: Mon Aug 29, 2022 8:32 am

About 2.4TB. Most of it is (scans of) photos, recordings of LPs and concerts, a bit of video, scanned documents and magazines... The most important bits are sitting on a RAID5 in the PC (and backed up in at least two different locations).

Like Steve I want to rip my cd collection and scan my whole collection of negatives and slides. But at the same time I really like the immediate-ness of taking a cd, plopping it in the player and have music. Instead of booting a player, searching the stuff,... Just not having to use a screen/keyboard/mouse/whatever is so great.
_________________
Expert in non-working solutions

NeddySeagoon
Administrator
Joined: 05 Jul 2003
Posts: 54578
Location: 56N 3W

Posted: Mon Aug 29, 2022 8:47 am

Well, I have a CD/DVD/BD collection on 4x4TB HDD in RAID5. It's about 90% full.

I still have all my distfiles from mid 2006, which comes in very handy. They are also online.
Lots of random Gentoo ISOs and stage3 files for several architectures. Some are also online.
Some very early Beta Gentoo Live DVDs, left over from testing for Likewhoa.

All my downloaded junk since around 2003, when I started with Gentoo.

All the leftovers from putting together Historical Gentoo because I needed to download and unpick lots of source RPMs to find bits.

Not counted, about 12TB of unused space for future expansion:)

It boils down to not having any pressure on storage these days and being lazy.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.

Goverp
Advocate
Joined: 07 Mar 2007
Posts: 2179

Posted: Mon Aug 29, 2022 9:00 am

A little over 250 GB, one machine, most is photographs, very few videos (I've a box of camcorder tapes that some day I'll transfer, which will probably double the amount of space needed). The rest is just OS, documents.

What's possibly interesting is a comparison; I'm by nature tidy - I delete old stuff and dislike duplicates. I inherited a project from a friend; he's the opposite - dupes all over the place. The project (~150 GB) is one book, about 1,500 pages of text, which of course should be way smaller, but not the way he worked. Everything was screen grabs from PDF files (originals still there), turned into JPEGs, imported into Microsoft Publisher (this guy only uses Windows), and thence to a new PDF. All intermediate items were saved. Also, he dumped everything into a humongous Publisher file as a sort of backup, at the end of each day. And kept a copy of this file (about 1.5 GB) perhaps weekly. Plus random intermediate versions. I wrote a program to count dupes; the record was 26 bitwise-identical copies of one file, and that's not even counting the scans and Publisher files containing embedded copies. So much data, so little information!

Compulsory xkcd cartoon.
_________________
Greybeard

pjp
Administrator
Joined: 16 Apr 2002
Posts: 20485

Posted: Mon Aug 29, 2022 2:41 pm

eccerr0r wrote:
What is your *working set* of data hoarded?

Don't count:
- Empty space. So if you're using 4KB on a 1ZB disk, you vote 4KB.
- Redundant data: if the data you have is a copy of another disk you own, don't count it, unless that other media is slow or unwieldy like tape, Blu-ray, or remote...
750G+, but I personally only consider the disorganized duplicated data to be "hoarded".

For example, I have home directory backups that go back at least to 2009. That consumes 2.9G. A similar scripts directory at 925M. Then there's some work related stuff around 7.5G. I have binpkgs duplicated in a chroot and in a directory where it is made available via webhost. But that keeps me from having to deal with NFS. I know there are OS backups that duplicate data, but I don't know how much is duplicated (mostly anything outside of /etc could be deleted).

I only have 83G of movies, but I only started that collection in the last few years or so. I've never set up a media server because it seemed like more work than I cared about. I have a text file as the "guide." I copy something to my laptop, watch it, then delete it from the laptop. I use VLC, but it is awful (their fdroid app is so much worse). I don't think I've watched anything that way for over a year.

I had Prime for the shipping and used their Video streaming briefly. The selection wasn't great, and other than Star Trek, it didn't have much I wanted to watch (I did watch a few movies because they were "free"). I didn't renew Prime in 2018/19 because the cost kept going up, and the value down. Plus I wasn't interested in dealing with counterfeit items, so I also stopped buying from them completely. Otherwise, I've not used any other services (never used Netflix, even for their DVD service).

I have almost 15G of audio that isn't music (included in the 750G), and 45G of music (not in the 750G). Plus some music that I still need to convert from more recently acquired CDs.


As an aside, I don't think there are any good hoarded data management solutions. fdupes or similar doesn't produce usable output for my use case. I've started something in Python. It grabs file metadata and puts it into an SQLite database. I then create a "duplicate candidate" index based on the same file size. I need to add checksums for those duplicates. I haven't yet found a way to throttle disk reads to avoid causing system IO issues. The big problem is how to use the results to eliminate the chaff.
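
A minimal sketch of the approach described above, not the actual script: the table name `files` and the helpers `scan`, `throttled_sha256` and `find_duplicates` are hypothetical names chosen for illustration. It assumes that sleeping between read chunks is an acceptable way to cap the read rate (one simple option among several), and it does not handle unreadable files.

```python
import hashlib
import os
import sqlite3
import time


def scan(db, root):
    """Record path, size and mtime for every regular file under root."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "path TEXT PRIMARY KEY, size INTEGER, mtime REAL, sha256 TEXT)"
    )
    # Index on size so same-size "duplicate candidates" are cheap to find.
    db.execute("CREATE INDEX IF NOT EXISTS files_by_size ON files(size)")
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            st = os.stat(path)
            db.execute(
                "INSERT OR REPLACE INTO files(path, size, mtime) VALUES (?, ?, ?)",
                (path, st.st_size, st.st_mtime),
            )
    db.commit()


def throttled_sha256(path, chunk_size=1 << 20, max_bytes_per_sec=20 * 1024 * 1024):
    """Hash a file, sleeping after each chunk to cap the average read rate."""
    digest = hashlib.sha256()
    delay = chunk_size / max_bytes_per_sec  # assumed throttle: fixed pause per chunk
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
            time.sleep(delay)
    return digest.hexdigest()


def find_duplicates(db, **hash_options):
    """Checksum only files that share a size, then group paths by checksum."""
    # Empty files are skipped here; they are all trivially identical.
    rows = db.execute(
        "SELECT path FROM files WHERE size > 0 AND size IN ("
        "SELECT size FROM files GROUP BY size HAVING COUNT(*) > 1)"
    ).fetchall()
    for (path,) in rows:
        db.execute(
            "UPDATE files SET sha256 = ? WHERE path = ?",
            (throttled_sha256(path, **hash_options), path),
        )
    db.commit()
    groups = {}
    for sha, path in db.execute(
        "SELECT sha256, path FROM files WHERE sha256 IS NOT NULL ORDER BY path"
    ):
        groups.setdefault(sha, []).append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

Calling `scan(db, root)` and then `find_duplicates(db)` returns lists of paths with identical content. Deciding which copy in each list to keep is still the hard part and is left open here.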
_________________
Quis separabit? Quo animo?

sdauth
l33t
Joined: 19 Sep 2018
Posts: 651
Location: Ásgarðr

Posted: Mon Aug 29, 2022 3:17 pm

Desktop: 60TB (52TB used currently)
Filled with Linux ISOs of course :wink:
All mirrored using rsync to a small custom NAS (built with old parts) and 6x12TB drives.
I do not use ZFS (hardware is too old, no ECC, some disks have different RPM as well), nor mdadm, as I don't need the performance bump.

eccerr0r
Watchman
Joined: 01 Jul 2004
Posts: 9824
Location: almost Mile High in the USA

Posted: Mon Aug 29, 2022 3:44 pm

I suppose my intent for the duplicates clause is if you spread data across multiple drives, don't count *known* backups (like if you use your server to hold backups of another machine). Also I sort of wanted to make clear that you don't count the parity blocks in RAID systems, so if I filled up my three 2TB disk RAID5, I'd only have 4TB of data - not 6TB. If you just so happen to have two copies of the same movie and forgot about it, or if you have two copies of a movie at different bitrates, then don't worry about the duplicate rule.
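
The parity rule above can be sketched as a one-line calculation. This assumes equal-sized disks and the usual minimum of three for RAID5; the function name `raid5_usable_tb` is made up for illustration.

```python
def raid5_usable_tb(disk_count, disk_size_tb):
    """Usable data capacity of a RAID5 array of equal-sized disks.

    One disk's worth of space holds parity, so it does not count as data.
    """
    if disk_count < 3:
        raise ValueError("RAID5 is assumed to need at least three disks")
    return (disk_count - 1) * disk_size_tb


print(raid5_usable_tb(3, 2))  # three 2TB disks -> 4
```

By the same rule, a 4x4TB RAID5 holds at most 12TB of countable data.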

I'm not counting the bunch of DVDs I burned of old distfiles. I really should stop using recordable DVDs and copy them back to a hard disk as I think the data longevity on hard drives tends to be a bit longer than recordable DVD. However this is only a fraction of a TB (it takes two full-sized cakeboxes to get 1TB!) so it's really a drop in the bucket :D

Also I wasn't sure about how to account for filesystems that do versioning (are old versions backups?), copy on write (if you have two copies and one was changed by one byte, does it count?), and deduplication. Just make up a number! Incidentally, please discuss if you use versioning/CoW/deduplication! I've been thinking about setting it up so I can squeeze more out of my meager 4TB further.

But... the current results... 8O ... 52TB... at least nobody has 200TB yet at home...

So far I'm shocked the distribution is fairly even... despite having an exponential bracket system. Oh and btw, no problems, I just wanted to count Gentoo users - so that's fine if you run Debian or even Windows as your hoard machine(s). Just wanted to exclude people who run solely Windows or MacOS or whatnot.
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?

NeddySeagoon
Administrator
Joined: 05 Jul 2003
Posts: 54578
Location: 56N 3W

Posted: Mon Aug 29, 2022 4:01 pm

eccerr0r,

My April 2003 original install liveCD still works, so optical media can't be all bad.

I've moved up to dual layer BD (50GB) for backups. DVDs are now far too small.
My BD drive will burn 100GB media too but the price will need to drop a long way before I try one of those.

Quote:
.. at least nobody has 200TB yet at home...

... that they admit to :)
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.

eccerr0r
Watchman
Joined: 01 Jul 2004
Posts: 9824
Location: almost Mile High in the USA

Posted: Mon Aug 29, 2022 4:15 pm

I've already had some optical media fail on me, so I do have some distrust.
The worst so far is CD-RW, then DVDRW. I've had several CD-Rs fail and not sure about DVDR longevity after the first burn verification cycle, but suspecting it's no different than CD-R.

Yes I'm lumping + and - media together.

I don't have a bluray recorder yet, and not sure of the value proposition yet. Media costs are annoying since they're one time use. Optical rewrite technology just didn't seem very good at all...
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?

pjp
Administrator
Joined: 16 Apr 2002
Posts: 20485

Posted: Mon Aug 29, 2022 4:53 pm

Notice that Sony bought the rights to the post-DVD market and prices remain ridiculous. I've never owned a BR drive and I can't imagine needing one in the future. I'm holding out for the holographic 3D cubes that were capable of half-terabit per square inch in 2005 :)
_________________
Quis separabit? Quo animo?

eccerr0r
Watchman
Joined: 01 Jul 2004
Posts: 9824
Location: almost Mile High in the USA

Posted: Mon Aug 29, 2022 7:35 pm

How close are we to being able to reasonably fit 1PB at home? I suppose it's reasonable already, we just need 50 20TB HDDs...

Well, maybe or maybe not... Next year Seagate supposedly will have 30TB HDDs, so it will drop down to 33 drives.

I still don't know what one would do with 1PB of storage...then once we hit 1PB then those tape drives become relevant and the brand becomes a true capacity...

What would one store with 1PB? Will 3D cameras or "surround vision" cinematography become the norm, and we get so immersed that we miss details behind us (and making it difficult to hide stunt wires)?

I just wonder whether there will be a point where piracy becomes so prohibitively hard that only old stuff can be stored?

but I already own two Exabyte tape drives!
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?

pjp
Administrator
Joined: 16 Apr 2002
Posts: 20485

Posted: Mon Aug 29, 2022 10:05 pm

I was going to say 34 drives, but 33 works. My first thought was that they were SMR drives. I presume not having heard disaster stories about HAMR is a Good Thing.
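
The 33-versus-34 question is just rounding. A quick sketch, assuming decimal units (1PB = 1000TB) and raw capacity with no redundancy; `drives_needed` is a made-up helper name.

```python
import math


def drives_needed(target_tb, drive_tb):
    """Smallest number of drives whose raw capacity reaches the target."""
    return math.ceil(target_tb / drive_tb)


print(drives_needed(1000, 20))  # 50
print(drives_needed(1000, 30))  # 34
print(33 * 30)                  # 990, i.e. 33 drives fall 10TB short of 1PB
```

So 33 drives "works" only if being about 1% short of a full petabyte is close enough.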

As drives kept growing into the gigabyte range, I was sure that home use tape drives would make a resurgence :( I think the last ones I recall were in the 40MB range.

For the cost of tape hardware and tapes, it is a hard sell vs more hard drives. Unfortunately no decent mechanism has become "common" to make it easier to use HDs as a tape replacement. I'm aware of the "enthusiast" brands that offer what seems like their own proprietary format. I'm not really interested in a "service" that can arbitrarily and without warning lock me out of my own data.
_________________
Quis separabit? Quo animo?

eccerr0r
Watchman
Joined: 01 Jul 2004
Posts: 9824
Location: almost Mile High in the USA

Posted: Mon Aug 29, 2022 10:19 pm

As long as people feel vendor lock-in is acceptable we'll continue to have proprietary devices...

Then there's another issue: how long does it take to duplicate (or resilver) a 30TB hard disk at usual hard disk speeds (read and write, especially if the target disk is HAMR or SMR!)? I was just playing with a 36GB SCA disk that could resilver in about a half hour. A 30TB disk seems to take more than a day, and is getting dangerously close to 2...
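
A rough back-of-the-envelope version of that estimate. The 200 MB/s sustained rate is an assumption picked for illustration, not a measured figure for any particular drive, and `copy_hours` is a made-up helper; decimal units are used throughout.

```python
def copy_hours(capacity_tb, throughput_mb_per_s):
    """Hours to read or write a whole disk once at a sustained rate."""
    seconds = capacity_tb * 1_000_000 / throughput_mb_per_s
    return seconds / 3600


# 36GB in half an hour implies roughly 20 MB/s sustained.
print(36_000 / (0.5 * 3600))

# 30TB at an assumed 200 MB/s: about 41.7 hours, so more than a day.
print(round(copy_hours(30, 200), 1))
```

Any slowdown from SMR rewrites or from a busy array pushes the figure further toward, or past, two days.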
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?

djdunn
l33t
Joined: 26 Dec 2004
Posts: 812

Posted: Tue Aug 30, 2022 12:42 am

I'm at less than 5TB, I guess I need to catch up...
_________________
“Music is a moral law. It gives a soul to the Universe, wings to the mind, flight to the imagination, a charm to sadness, gaiety and life to everything. It is the essence of order, and leads to all that is good and just and beautiful.”

― Plato

eccerr0r
Watchman
Joined: 01 Jul 2004
Posts: 9824
Location: almost Mile High in the USA

Posted: Tue Aug 30, 2022 12:49 am

lol don't leave me (and a lot of others of us here) behind!
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?

Leonardo.b
Guru
Joined: 10 Oct 2020
Posts: 308

Posted: Tue Aug 30, 2022 1:33 pm

eccerr0r wrote:

I still don't know what one would do with 1PB of storage...

Recordings from surveillance cameras in high definition.

Offline maps, if you want them with pictures.

Full images from CT scans (or similar). Not limited to some specific parts, I mean a complete set of pictures of everything that was recorded.
As definition improves, a hospital would need a lot of disk space to do that, for every patient.

Perfect Gentleman
Veteran
Joined: 18 May 2014
Posts: 1255

Posted: Tue Aug 30, 2022 1:49 pm

14TB of music.

eccerr0r
Watchman
Joined: 01 Jul 2004
Posts: 9824
Location: almost Mile High in the USA

Posted: Tue Aug 30, 2022 1:58 pm

Leonardo.b wrote:
eccerr0r wrote:

I still don't know what one would do with 1PB of storage...

Recordings from surveillance cameras in high definition.

Offline maps, if you want them with pictures.

Full images from CT scans (or similar). Not limited to some specific parts, I mean a complete set of pictures of everything that was recorded.
As definition improves, a hospital would need a lot of disk space to do that, for every patient.


Again this query is for home use?

I keep a copy of the whole OpenStreetMap USA map (without history/metadata) - it's about 9GB for the PBF. This is not very much for even 1TB disks... Granted if we all kept tile servers, then fine yes we'd need more space, but even then - only one copy is really needed and still probably will fit in 1TB for the world map. (I keep the OSM USA PBF for converting to Garmin format; and the Garmin format is probably close to 1:1 for its "tile" representation which is quite a bit more useful as it's routeable unlike png tiles.)

Businesses, hospitals, governments, etc. sure, but I don't see 1PB yet for home use. Most people only store photos, perhaps some videos, but most of the home use as far as I know are copies of commercially produced movies and music, perhaps a few games, etc...and as far as I can tell, this will continue up until they stop making it accessible...
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?

Zucca
Moderator
Joined: 14 Jun 2007
Posts: 3708
Location: Rasi, Finland

Posted: Tue Aug 30, 2022 3:35 pm

Perfect Gentleman wrote:
14TB of music.
Holy Jeebus! Are those flac? CD quality or over?

My photos and videos take a little under 1TB, and I thought it was a lot for a non professional photo snapper.
_________________
..: Zucca :..

My gentoo installs:
init=/sbin/openrc-init
-systemd -logind -elogind seatd

Quote:
I am NaN! I am a man!

eccerr0r
Watchman
Joined: 01 Jul 2004
Posts: 9824
Location: almost Mile High in the USA

Posted: Tue Aug 30, 2022 5:35 pm

Yeah I do suspect music aficionados will give flak if it's not FLAC.

Speaking of massive amounts of data, are people already running > 1Gb/sec Ethernet? How are you transferring files between machines?

Hmm...was messing with a bit more of encrypted root disks and seems my Athlon II X2 255 uses a lot of cpu cycles to maintain disk speeds, which is expected as it does not have AES instructions. But the GbE is fairly well utilized, and is fully utilized if I didn't encrypt.

Alas faster Ethernet is probably bottlenecked by the PCI/PCIe-x1 on the Athlon too, though that Celeron 1200 which bottlenecks a standard PCI Gbit Ethernet is even worse.
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?

Spanik
Veteran
Joined: 12 Dec 2003
Posts: 1003
Location: Belgium

Posted: Tue Aug 30, 2022 7:44 pm

eccerr0r wrote:
Yeah I do suspect music aficionados will give flak if it's not FLAC.

Speaking of massive amounts of data, are people already running > 1Gb/sec Ethernet? How are you transferring files between machines?

My music is as .wav on the disk. I don't see sense in compressing (even if it is flac). But then I do not have 14TB of it.

I started bonding my ethernet between the desktop, switch and nas. All 3 support that and I have now set it at LACP. Just two Gb interfaces bonded. I don't think I will notice much of it but it was a nice experience setting it all up.
_________________
Expert in non-working solutions

Leonardo.b
Guru
Joined: 10 Oct 2020
Posts: 308

Posted: Tue Aug 30, 2022 8:39 pm

eccerr0r, yeah, nowadays I can't see a use case for such an enormous disk space.
The future is unknown, but I would not be surprised.

I read that Google Maps holds more than 20PB of data, including the 3D walk-in pictures.

NeddySeagoon
Administrator
Joined: 05 Jul 2003
Posts: 54578
Location: 56N 3W

Posted: Tue Aug 30, 2022 8:56 pm

eccerr0r,

If it's not analogue all the way, it's not music.
Philips made a pig's ear of CDs when they designed the system. It should have been log weighted but log amplifiers were difficult to make at the time.

Let me just finish by asserting, with no evidence whatsoever, that MP3 is what's left when you throw the music away. :)

At my age, it doesn't matter much any more :(
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.

eccerr0r
Watchman
Joined: 01 Jul 2004
Posts: 9824
Location: almost Mile High in the USA

Posted: Tue Aug 30, 2022 11:58 pm

Next, will movies be saved on storage frame by frame in raw uncompressed format, leading to the same kind of disk space bloat as going from MP3 to FLAC? (I suspect those of us who have/use prosumer or professional still cameras will be shooting pictures in raw format, and those also get big...)

I suppose with video, people can keep on increasing sample rate(resolution) and I don't think we've hit the point where increasing has diminishing returns like with audio... but still, this seems like a travesty of bits used...

TBH ever since I first learned of MP3 compression I've had a hard time discerning the difference. At this point it's a bit easier to tell, but the question is "do I care" after all these years of not really being able to tell. JPG however was easier to tell from originals...
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?
All times are GMT
Page 1 of 2