Copied files from one drive to another and `diff` says a lot of them differ

I have two 6TB drives (~2.3TB used) in my desktop that I use for storing photos, videos, and music. I bought them back when I used Windows, but after using Linux for about 2 years I figured it was time to get away from the `NTFS` partitions and from manually “mirroring” the drives by copying every new file to both of them.

1. I made sure that everything I usually access on my drives was still good.
2. I reformatted one as `ext4` and then copied what was on the `NTFS` drive over to the newly formatted `ext4` one (planning to format the other as `ext4` as well and set up an rsync cron job to keep them both identical).
3. Then I started looking a bit more seriously into digital storage techniques and saw that things like ZFS had the whole bit rot mitigation thing with a mirror VDEV and `zpool scrub`. Soo…
4. I wiped the `NTFS` drive after I confirmed that the data on the `ext4` drive was still accessible, and set up a pool with `sudo zpool create vault sdb`.
5. Then I copied from the `ext4` drive over to the newly created zpool using `cp -rv /media/username/ext4-drive/ /vault/`.
6. I wanted to make sure the transfer went all right, because I caught some weird errors at the beginning of the `cp`; it looked like some permission stuff was off, and I thought some of the files might not have transferred. So I ran `diff -rq /media/username/ext4-drive/ /vault/`, which reported that a whole slew of files differed. Most of them were `.ORF` and `.JPG` files (from my camera), plus some `.flac` and `.mp3` files from my music collection and a few `.MOV` files, also from my camera. Anywho, I tried opening one of the `.JPG` files that differed; both drives’ versions opened fine, and nothing looked wrong with the photo in an image viewer.
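
One way to double-check a copy like the one in step 6 is to hash every file on both sides and compare the two lists. The following is only a minimal sketch, assuming GNU `find`, `sort`, `sha256sum`, `mktemp`, and `diff` are available; `manifest` and `compare_trees` are hypothetical helper names, not existing tools.

```shell
# Hypothetical helper: print "checksum  ./relative/path" for every regular
# file under the directory given as $1, sorted by path so that two
# manifests line up.
manifest() {
    ( cd "$1" && find . -type f -exec sha256sum {} + | sort -k 2 )
}

# Hypothetical helper: compare the manifests of two directory trees.
# Prints diff output for files whose checksum differs or that exist on one
# side only. Returns diff's status: 0 if the manifests match, 1 if not.
compare_trees() {
    ct_src_list=$(mktemp) || return 2
    ct_dst_list=$(mktemp) || return 2
    manifest "$1" > "$ct_src_list"
    manifest "$2" > "$ct_dst_list"
    ct_status=0
    diff "$ct_src_list" "$ct_dst_list" || ct_status=$?
    rm -f "$ct_src_list" "$ct_dst_list"
    return "$ct_status"
}
```

With the paths from the post, this would be called as `compare_trees /media/username/ext4-drive /vault`. It reads every file on both drives in full, so on ~2.3TB it will take a while.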

Really, I am wondering what I should do now, because I want to wipe the `ext4` drive and add it to my VDEV so that ZFS handles the mirroring for me, but I do not want to do so until I know that the versions of my files in `/vault` are OK. What should I do to confirm that the files are all right? Open all of the ones that differed? Does that even matter, or could it just be metadata that differs? Should I re-copy the files that differed?
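
For a single pair of files, the content-versus-metadata question can be checked directly: `cmp` compares the bytes, while `stat` shows permissions, owner, and modification time. This is a rough sketch, assuming GNU `stat` (for the `-c` format option); `classify_pair` is a hypothetical helper name, and the set of metadata fields it looks at is an arbitrary choice.

```shell
# Hypothetical helper: say whether two copies of a file differ in content,
# in metadata only (mode, owner, group, mtime), or not at all.
classify_pair() {
    if ! cmp -s "$1" "$2"; then
        # cmp -s is silent and exits non-zero when the bytes differ.
        echo "content differs"
    elif [ "$(stat -c '%a %U %G %Y' "$1")" != "$(stat -c '%a %U %G %Y' "$2")" ]; then
        # Same bytes, but mode, owner, group, or modification time differ.
        echo "metadata only"
    else
        echo "identical"
    fi
}
```

It would be run on both versions of one of the reported files, with the path on the `ext4` drive as the first argument and the path under `/vault` as the second.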

Any help is greatly appreciated. Thanks.

2 Comments

  1. It’s most likely metadata differences.

    Also, use `rsync` instead of `cp`.

  2. Is your ZFS using compression? That would explain the difference in hashes.