DIY NAS with many TB but low hardware requirements?

I would like to make my own NAS.

Basically I would like a lot of TB. For me that’s probably somewhere between 50 and 100 TB for now, and perhaps between 100 and 200 TB within 5 years.

I would very much like, but might not actually *need*, some kind of deduplication to save space.

But I want the hardware to be cheap and not to use too much power.

I am thinking of building my own server and installing the needed software myself, i.e. I don’t think I want FreeNAS or another turnkey solution.

I am thinking of a normal PC. I have an old one with an i3 processor. It only has 3 SATA ports, but I am thinking of installing a 4- or 8-port SATA adapter. It has just 12 GB of memory.

Right now I am thinking in the direction of FreeBSD on ZFS. Probably 6×18 TB hard drives in raidz1 or raidz2. That would give me 108 TB raw, but somewhere around 70-90 TB usable depending on whether I choose raidz1 or raidz2.
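As a quick sanity check on those numbers, here is a minimal sketch of the usual back-of-the-envelope estimate: usable space is roughly the number of data drives (total drives minus parity drives) times the drive size. The function name is made up for illustration, and the result ignores ZFS metadata, padding, and the TB-versus-TiB difference, so real pools come out somewhat lower.

```python
def raidz_usable_tb(drives: int, drive_tb: float, parity: int) -> float:
    """Rough usable capacity of one raidz vdev, before filesystem overhead."""
    if parity not in (1, 2, 3):
        raise ValueError("parity must be 1, 2, or 3 (raidz1/2/3)")
    if drives <= parity:
        raise ValueError("need more drives than parity devices")
    # Each parity level costs roughly one drive's worth of space.
    return (drives - parity) * drive_tb


if __name__ == "__main__":
    for parity in (1, 2):
        usable = raidz_usable_tb(drives=6, drive_tb=18, parity=parity)
        print(f"6 x 18 TB in raidz{parity}: about {usable:.0f} TB before overhead")
```

For the layout above this gives about 90 TB for raidz1 and about 72 TB for raidz2, which matches the 70-90 TB range.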

For backup I am actually thinking of a Raspberry Pi in a different location. I am thinking of keeping it turned off for 14 days at a time, and then every 14 days it will get some incremental backup files. I am thinking of making one every day, perhaps even more often than that, and then storing them on a 3rd computer, or perhaps in the cloud, until day 14. The reasons for this are a) to make it use close to zero power, since it’s not going to be at my place, but also b) to make it a little bit safer. If I get hacked in a bad way, chances are that I’ll find out and stop the backup process, and then have a good copy plus some incremental files that I will be very careful about applying, or perhaps discard.

But I am very much in doubt about the Raspberry setup. I don’t want to make a raidz or striping or anything like that over several USB disks on the Raspberry. It sounds a bit like a disaster waiting to happen! But can I back up a 50 TB dataset in a good way to several smaller filesystems/datasets?
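One way to think about that last question is as a packing problem: if the big dataset is already split into top-level directories (or child datasets), each one can be assigned whole to a single independent USB drive, with no striping involved. The sketch below uses the classic first-fit decreasing heuristic. The directory names, sizes, and drive capacities are invented for illustration, and a real plan would also leave headroom for growth.

```python
def plan_backup(dir_sizes_tb: dict[str, float],
                drive_capacities_tb: list[float]) -> list[list[str]]:
    """Assign each directory to exactly one drive (first-fit decreasing)."""
    free = list(drive_capacities_tb)
    plan: list[list[str]] = [[] for _ in free]
    # Place the largest directories first; they are the hardest to fit.
    for name, size in sorted(dir_sizes_tb.items(), key=lambda kv: kv[1], reverse=True):
        for i, room in enumerate(free):
            if size <= room:
                plan[i].append(name)
                free[i] = room - size
                break
        else:
            raise ValueError(f"{name} ({size} TB) does not fit on any drive")
    return plan


if __name__ == "__main__":
    # Hypothetical layout: sizes in TB, three 16 TB backup drives.
    dirs = {"videos-a": 14, "videos-b": 12, "videos-c": 10,
            "scrapes": 6, "personal": 2, "source": 1}
    for number, contents in enumerate(plan_backup(dirs, [16, 16, 16]), start=1):
        print(f"drive {number}: {', '.join(contents)}")
```

Because every directory lives on exactly one drive, losing a drive only loses that drive's directories, which is the property the poster is after when avoiding raidz or striping over USB.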

For now I think this is what I can afford to do.

First priority for improvements would be backing up locally, and hence getting to 3 copies.

Second priority would be to get a better fileserver. Hopefully by then I will be more informed and know better what I want. But for now, I am thinking of something faster, and especially of getting lots of ECC memory and two mirrored NVMe SSDs. I’m thinking 128 GB memory, 2×2 TB NVMe, I guess at least a 2.5 Gbps NIC, and at that time probably 8, 10 or 12 drives of 20 TB each in raidz2 or perhaps even raidz3. Yes, I know, 20 TB isn’t out yet. But I’m not ready to buy yet either. I guess 20 TB would be the sweet spot in 1-2 years, which is likely when I will buy.

I don’t know if it makes too much difference, but my hoarding is partly videos. They do not change. They do not require much speed. They don’t compress. But there are also a lot of HTML files and some databases that I populate myself. I am probably not neurotypical and I like to collect data. And it takes up a little more space than most people imagine 🙂 RSS files and some screen scraping that I process and put into databases. And then of course personal files, and Python, Java, PHP etc. source. As well as the boring standard stuff: some personal images, some mails, some letters, Excel spreadsheets etc.

I’m not really sure how and exactly what I want. But I suppose that I’ll learn that it would be better to have two or more different filesystems with different rules. Snapshots of my videos are actually not important at all, I suppose. And there’s no reason to compress those files. My source files are the exact opposite. I would like compression, I would like snapshots. And I suppose speed is of some importance, especially to be able to open/read hundreds of small files fairly fast.
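In ZFS terms, "different filesystems with different rules" maps onto separate datasets with per-dataset properties. The sketch below only builds the `zfs create -o property=value` command lines as strings so they can be reviewed before anything is run. The pool name `tank` and the dataset names are hypothetical, the values simply mirror the preferences stated above, and `compression=zstd` assumes an OpenZFS 2.0 or newer system.

```python
# Hypothetical layout: one dataset per kind of data, each with its own rules.
DATASETS = {
    "tank/videos": {"compression": "off"},    # already compressed, never changes
    "tank/source": {"compression": "zstd"},   # small text files, compress well
    "tank/personal": {"compression": "lz4"},  # mixed content, cheap compression
}


def zfs_create_commands(datasets: dict[str, dict[str, str]]) -> list[str]:
    """Build one `zfs create` command line per dataset (nothing is executed)."""
    commands = []
    for name, props in datasets.items():
        parts = ["zfs", "create"]
        for key, value in sorted(props.items()):
            parts += ["-o", f"{key}={value}"]
        parts.append(name)
        commands.append(" ".join(parts))
    return commands


if __name__ == "__main__":
    for command in zfs_create_commands(DATASETS):
        print(command)
```

Snapshot schedules are usually handled by a separate tool per dataset, so they are left out of this sketch.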

Is FreeBSD, ZFS, Samba/NFS the right solution? Or is there something better? And do you have some ideas about how to back it up? I would prefer the Raspberry with external USB drives. But how to make that work is still unclear to me.

8 Comments

  1. Lots of storage, cheap hardware, and low power use. That combination is going to be hard to find, even if it is the direction you want to go. But I had the same requirement, so this is what I did.

    My primary server is an older SuperMicro X10 server. The CPU is a low power Xeon E3-1285L v4 and it’s maxed out with memory, NVMe on the PCIe bus, and four 3.5" 10TB HDDs. All in all, it draws about 60 watts and runs 24/7. This is my main server; it runs all my VMs (Proxmox), Docker containers, and everything else. My day to day downloads happen on this server.

    Then I have cold storage. A Ceph cluster on 2 old Dell R710 servers with an assortment of 8TB to 16TB HDD in them. It’s well over 100TB now and growing. Ceph lets me add or subtract disk or storage nodes anytime. I got tired of playing the NAS upgrade shuffle and this solved that.

    Every 2 weeks or so, I start up the Ceph cluster, move that period’s collection from my primary server to the Ceph cluster, then shut it back down.

    The only downside to this is that my entire collection is not online 24/7, but I don’t need it to be. The active stuff stays on the primary server, and when I am done with it, onto Ceph it goes and I tend to never touch it again.

  2. I don’t use dedupe. And I have been getting away with running an 80TB NAS for 2 years now off an old motherboard and RAM from my girlfriend’s old build from 2013. It’s an i5 3470 with only 8GB RAM. I stuffed an LSI card in it to get 8 more ports and it’s running 11 8TB drives in ZFS raidz2 right now. It’s been rock solid. It’s not the fastest but can maintain several 4K streams concurrently as well as audio streams. Decoding is all done client end; this is a simple file server. Works for my needs. Next one will have more RAM though lol.

  3. > I am thinking of a normal PC. I have an old one with an i3 processor. It only has 3 SATA ports, but I am thinking of installing a 4- or 8-port SATA adapter. It has just 12 GB of memory.

    > Right now I am thinking in the direction of FreeBSD on ZFS.

    it will do. rn im running FreeBSD/ZFS on a Core2Duo with 4GB of RAM and it’s enough. granted, i don’t have so many TBs, but it does the job.

    > I would very much like, but might not actually need, some kind of deduplication to save space.

    dedup is kinda a Holy Grail(tm) that (almost) nobody needs. it’s an enormous memory hog, and very few datasets (very specific ones) will benefit from it. i’d say just forget it.

    > I’m not really sure how and exactly what I want. …snip… especially to be able to open/read hundreds of small files fairly fast.

    as the other poster said, ZFS offers enough granularity to accommodate you.

    > Is FreeBSD, ZFS, Samba/NFS the right solution? Or is there something better?

    yes, and IMO no. my take on [ZFS](https://www.reddit.com/r/DataHoarder/comments/nbfx0e/advice_on_first_home_nas/gyhlhmu/).

  4. We are using Gigabyte Brixes. Used ones are around the Raspi price, but they have 4 real USB 3.0 ports. That’s important, because it allows you to connect 4 drives directly or 4*8 drives with powered USB hubs. External drives are usually cheaper than internal ones for marketing reasons.

    A multiple machine setup increases your availability and allows for unlimited scaling. I keep ext4 on my drives. My Brixes usually have 8GB RAM per 20TB of space, but if you’re not going to torture them with VMs or containers, then you’re fine for as many TB as your drives allow.

    We pool our drives with mergerfs and rclone union. Again, it allows for unlimited growth with common hardware.

    If we’re speaking about arrays above 8 drives, you can saturate your 1Gbps network without slowdowns. You can play with load on your UPSes too!

    Go vertical, because there is no limit. Horizontal way is for those hella rich people.

    The next step is git-annex and waking the machines using Wake-on-LAN. It allows you to power down machines and drives which you do not need at the moment, and turn on the ones which you need when you need them!

  5. You’ve got an inverse relationship between your needs here. I understand keeping your costs low, but the use case you are describing is working directly against that, especially if you are looking at TrueNAS/ZFS.

    The rule of thumb I’ve seen for ZFS is 1 GB RAM per TB of storage. This can vary depending on applications and features (such as deduplication), but it can kind of chart your path forward.

    If I were you, I would start building with what you have, then procure the parts you need as you get closer to outgrowing them. For example, take the i3/12GB system you have and get a NAS going with 10/12/14TB drives (based on whatever drive density is the best value when you buy). All you need to get is drives.

    When you outgrow it, one of the great features of TrueNAS Core is that you can physically disconnect your boot and storage pools, move them to a new machine, and boot right up. Or you can upgrade your existing system; it’s all up to you and what you are willing to spend money on. This gives you time to get going with what you have and not get mired in design issues.

    You may find that you need a lot less horsepower than even the TrueNAS Core specs recommend. I’m running mine on Ivy Bridge gear with about 10 drives in it, and it’s very economical on power. Best of luck.

  6. I don’t use dedup, but I hear you will need lots of RAM, assuming you mean block level dedup. Assume at bare minimum 2-3GB for every 1TB of data, maybe as much as 5GB. So investigate thoroughly.

    I also think if you’re hoarding videos your dedup ratio might not be so great because of the random nature of the format and they are already greatly compressed, there’s likely not going to be an extensive amount of common blocks that can be deduped. At best you’ll get *maybe* 10-15% savings, probably less. Not worth all the added RAM and extra overhead. All your other stuff will likely have a high dedup ratio, but do they really take up that much space to begin with?

    Avoid the SSD cache in my opinion. It won’t benefit you much. Even with 2.5G, a multi-disk raidz/raidz2 will saturate that no problem. Heck, I run a measly six disk RAID 6 over a 10G network and it nearly saturates that line. SSD will help if you transfer a massive amount of small files regularly, but overall it’s just one more component in the pipeline to rule out when you root cause any issues. But if you do run SSD cache, be sure to run it in a mirror config.

    For additional SATA ports be sure to pick up an LSI SAS card (flashed to IT mode). Dual port cards (supporting up to 4 SATA drives per port) are found all over eBay for under $50. They are a better option than those cheap “SATA controller” cards. The SAS card is more robust overall.

    The Raspberry Pi is a fun project board, but I’m not sure it’s up to the task for the volume of data you’re considering. The USB ports are all tied together to a single PCIe lane, and your throughput will be dismal across multiple disks. Especially if you plan to run regular surface scans and validate data integrity, as well as occasional restore tests.

  7. I have a RPi4 with a 4 bay USB 3.0 enclosure, running OMV NAS software. Works fine. It also runs Emby media server. The drives are pooled using mergerfs.

    I also have an old SFF i5 PC with a large media storage, running Ubuntu MATE and also Emby. The SFF PC is connected to a 5 bay USB 3.1 gen2 enclosure. The drives are pooled using mergerfs.

    In addition I use a 4TB SATA SSD, internally in the SFF PC, as a simple form of cache for new files. This means that as new files are added to the media pool they are automatically stored on the SATA SSD. As more new files are added to the SSD, eventually the oldest files on the SSD are migrated from the SSD to the HDDs in the slower USB enclosure. This means that new files can, for a while, be accessed locally at internal SATA SSD speeds. Great for checking/re-encoding/renaming and so on. And still the files are automatically migrated to the slower USB mergerfs pool, after a while. Without the path changing.

    This tiered storage setup is described on the mergerfs GitHub page. I just automated some parts.

    Also in addition, I have an extra external USB drive that I use for SnapRAID redundancy of the mergerfs pool on the SFF PC. This drive can be turned off except when I sync (update parity) or scrub (check files against parity).

    Performance of a mergerfs pool over a single USB cable is similar to what you can expect from a single external USB drive. This means that performance is fine for few (say 1-2) simultaneous users, but not for many simultaneous users.

    Drives used are 16TB EXOS X16 and 4TB Crucial MX500.

    As for your question of how to back up the RPi4: Either get two RPi4s and sync between them. Or connect more drives and have more than one mergerfs pool (there are 10-drive USB enclosures) and sync between the pools. And perhaps also to another NAS.

    You can run whole file deduplication software on a RPi4 without any problems. Either to silently hardlink duplicate files or to report them for action.

  8. ZFS can do compression per dataset. So you can have one folder with ZSTD and high compression, another folder with, for example, images using very fast lz4 compression, and another folder where just zeros are compressed. Completely turning off compression is a very niche use case; I doubt you want that.

    I have compression on even on video datasets. It gives up if it can’t compress, but still compresses all the metadata.

    ZFS makes it easy to push datasets to another server.

    There are tools to do automated snapshots and you just define yourself how often what dataset gets snapshotted and at what point older snapshots get deleted.

    /r/zfs

    Edit: Don’t use de-dup on ZFS! From what I have been told it has a very narrow use-case and most people should not use it.
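To put the RAM rules of thumb quoted in this thread side by side (about 1 GB per TB for plain ZFS, and 2 to 5 GB per TB with block-level dedup), here is a minimal sketch applied to the poster's planned pool. The figures are the commenters' folklore-level estimates, not numbers from ZFS documentation, and the names in the code are made up for illustration.

```python
# Rules of thumb quoted in the comments, as (low, high) GB of RAM per TB of pool.
RULES_GB_PER_TB = {
    "plain ZFS": (1.0, 1.0),
    "ZFS with dedup": (2.0, 5.0),
}


def ram_range_gb(pool_tb: float, rule: str) -> tuple[float, float]:
    """Return the (low, high) RAM suggestion in GB for a pool of the given size."""
    low, high = RULES_GB_PER_TB[rule]
    return pool_tb * low, pool_tb * high


if __name__ == "__main__":
    installed_gb = 12  # the i3 machine described in the post
    pool_tb = 72       # 6 x 18 TB in raidz2, before overhead
    for rule in RULES_GB_PER_TB:
        low, high = ram_range_gb(pool_tb, rule)
        print(f"{rule}: {low:.0f} to {high:.0f} GB suggested, {installed_gb} GB installed")
```

Several commenters report running large pools on far less than 1 GB per TB, so the plain ZFS figure looks conservative. The dedup range of 144 to 360 GB against 12 GB installed is what makes the advice against dedup easy to follow.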