
Moment of silence for >1 petabyte data loss..

My work just reported a massive data loss exceeding 1 PB on our supercomputer.
It represents years of cluster time running engineering simulations, which is apparently mostly unrecoverable and had not been backed up because of its sheer size.
The issue stemmed from a software bug in which file path inputs were not sanitized, although I would ultimately place the blame on the server admin’s permission policy and the lack of safeguards that allowed this to happen.
At least we won’t need to buy hard drives for a while 😛
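The post doesn’t say what the unsanitized-path bug actually looked like, but a classic failure mode is joining an unchecked, user-supplied path onto a cleanup root and then deleting the result. A minimal containment guard, sketched in Python (the `SCRATCH` root and function name are hypothetical; `is_relative_to` needs Python 3.9+):

```python
from pathlib import Path

SCRATCH = Path("/data/scratch")  # hypothetical cleanup root

def safe_delete_target(user_path: str) -> Path:
    """Resolve a user-supplied path and refuse anything outside SCRATCH."""
    target = (SCRATCH / user_path).resolve()
    # resolve() collapses "../" components and symlinks before the check.
    if not target.is_relative_to(SCRATCH.resolve()):  # Python 3.9+
        raise ValueError(f"refusing to touch path outside scratch: {target}")
    return target
```

The point is to resolve the path first (collapsing any `../` tricks) and only then check containment; checking before resolving is exactly the hole that path traversal exploits.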


36 Comments

  1. 1+ Petabyte (less than 2 I guess) is not that much data and hasn’t been for the last 5 years. I really don’t understand this kind of thinking.

    *How is it possible to have the money for a supercomputer, petabytes of storage but not backups? If your data is not worth backing up, is what you do really worth anything?*

    Maybe it was a calculated risk, but at that scale, data loss is just one human mistake away.

    I think many people don’t intrinsically realise the true cost of reliable data storage. People don’t want to accept that to safely store x amount of data, you need somewhere between 2x and 3x+ (with backup history) of raw storage capacity in additional drives or even tape.
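The 2x–3x multiplier above is easy to sanity-check with back-of-envelope numbers; the RAID overhead and retention factors below are illustrative assumptions, not figures from the thread:

```python
def raw_capacity_needed(data_tb: float, raid_overhead: float = 1.25,
                        backup_copies: int = 1, history_factor: float = 0.5) -> float:
    """Raw TB consumed: primary copy (with RAID/parity overhead),
    full backup copies, and incremental/retention history."""
    primary = data_tb * raid_overhead
    backups = data_tb * backup_copies
    history = data_tb * history_factor
    return primary + backups + history

# For the ~1 PB in the post (1000 TB of live data):
print(raw_capacity_needed(1000))  # 2750.0 TB, i.e. ~2.75x the live data
```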

  2. You blame the sysadmin for code he didn’t write that killed a server that management didn’t want to pay for backups for?

    Shit happens. It’s always stupid shit, and people should’ve known better, but given enough time it will always happen. This is a failure of your organization and its culture, not of a single person.

  3. > file path inputs were not sanitized. Although I would ultimately place the blame on the server admin’s permission policy and lack of safeguards for allowing this to happen.

    Spotted the SWE… or, everyone blames the SysAdmin for whatever happens. Restrict stuff so it can’t happen? Obstructing work. Don’t, and something bad happens? It’s your fault.

    (work has EBs upon EBs. you can back up anything if you care enough…)

  4. Who the hell has the money to build a supercomputer setup with an array >1PB but somehow doesn’t have the money or general investment into backing up the data that requires a supercomputer to generate in the first place? Shit, you guys should double down and get a series of teamed 10Gbit lines and use them to write test data to /dev/null all day.

  5. There is no such thing as “too much data to back up”, only “too much data to lose”.

  6. *sniff sniff*… do you smell something burning? Oh right someone’s getting fired today

  7. Sysadmin 101: If it’s not backed up, it doesn’t exist.
    Sysadmin 201: If you haven’t restored it this week, it’s not backed up.

  8. your company leaders are insane for not having a backup of years of work.

  9. If you can afford 1 PB of storage, you can afford a second one.

    Otherwise the data isn’t important.

  10. As someone who actually builds petabyte scale storage and backup solutions, this is truly a WTF.

    1. If you wanted to go with spinning rust, you’re talking 88 × 16 TB drives (RAID 60). It’s going to occupy a whopping 10U.

    2. If you wanted to dead-store the data… imagine zero compression; you can get 12 TB tapes… you’d need ~90 of them: 84 for data plus some spares.
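A quick check of the figures in this comment (capacities as nominal decimal terabytes; the RAID 60 group layout is an assumption, since the comment doesn’t specify one):

```python
import math

DATA_TB = 1000   # ~1 PB
DRIVE_TB = 16
TAPE_TB = 12     # LTO-8 native capacity, no compression

# RAID 60: assume 8 RAID-6 groups of 11 drives, each group losing 2 to parity.
groups, per_group = 8, 11
drives = groups * per_group                      # 88 drives, as the comment says
usable_tb = groups * (per_group - 2) * DRIVE_TB  # 1152 TB usable, just over 1 PB

tapes = math.ceil(DATA_TB / TAPE_TB)             # 84 tapes for the data alone
print(drives, usable_tb, tapes)
```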

  11. On the bright side, now you won’t be receiving the relentless “please free up x% storage on the /data partition or scheduling will be paused on the machine” emails for a few weeks.

    :gives ample side-eye to the ass with 340TB of critical experimental results on the scratch partition for the last 8 months:

  12. Yeah, now is when your company begins to see how cheap that $150k local backup plan was.

  13. Woah holy shit OP I hope you’re doing o-

    >My work just reported a massive data loss exceeding 1 PB on our supercomputer.

    Oh, technically not your data or fault. Phew. But still, a big F for that petabyte.

  14. Well, from observation, big-data science collection and movie production don’t like spending more than the minimum, and often have to redo work.

    Look at Pixar and Toy Story 2

    https://thenextweb.com/media/2012/05/21/how-pixars-toy-story-2-was-deleted-twice-once-by-technology-and-again-for-its-own-good/

  15. >not been backed up due to the large capacity

    AWS Glacier Deep Archive would have cost $1k/month. That seems like a pretty good price to not have this happen, if you ask me.
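The commenter’s ~$1k/month figure checks out against Glacier Deep Archive’s published storage rate (about $0.00099 per GB-month in us-east-1 at the time; retrieval and request fees are excluded here):

```python
GB_PER_PB = 1_000_000               # decimal PB
PRICE_USD_PER_GB_MONTH = 0.00099    # Deep Archive storage, assumed list price

monthly_usd = GB_PER_PB * PRICE_USD_PER_GB_MONTH
print(f"${monthly_usd:,.0f}/month")  # $990/month
```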

  16. I assume they already looked into filesystem recovery? Seems odd that a path error would make everything completely unrecoverable unless it was present for a long time.

  17. This is a common problem with data lakes too. If it’s computed data, it’s generally not worth the cost of backup. The amount of data can also be an issue, as can restoration time. However, archiving this data is often worthwhile but few in IT separate backup from archiving.

  18. An engineering company that couldn’t evaluate the cost vs. risk of failing to provision a backup.

    This is my surprised face.

    You don’t happen to work for Boeing, do you? Was the supercomputer the one that coincidentally did the simulations on the Max?

  19. “Had not been backed up due to the large capacity.” So perhaps they just needed 112 LTO-8 Type M 9 TB tapes @ $70? each = $7,840, plus the cost of a large tape library — which, if people think that is expensive, must be peanuts compared to the cost of 1 PB of primary storage. They’re probably also not running Lustre/ZFS and using snapshots to protect the data against accidental deletion.

  20. How much is this gonna cost you? More than $500k? Cuz that’s what you could get a PB for.

  21. 1 PB with no backups… who was the sysadmin, Linus Tech Tips?

  22. >had not been backed up due to *[not being important]*

    FTFY. If it’s not backed up, it’s not important.

  23. Well I hope they learn their lesson and get a proper backup system now.

    There are companies and organizations that create and back up >1 PB.

    It’s not like it is impossible.

  24. Casual passerby but still.

    # ⠀⠀⠀⢀⡤⢶⣶⣶⡄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ ⠀⠀⢀⣠⣤⣤⣤⣿⣧⣀⣀⣀⣀⣀⣀⣀⣀⣤⡄⠀ ⢠⣾⡟⠋⠁⠀⠀⣸⠇⠈⣿⣿⡟⠉⠉⠉⠙⠻⣿⡀ ⢺⣿⡀⠀⠀⢀⡴⠋⠀⠀⣿⣿⡇⠀⠀⠀⠀⠀⠙⠇ ⠈⠛⠿⠶⠚⠋⣀⣤⣤⣤⣿⣿⣇⣀⣀⣴⡆⠀⠀⠀ ⠀⠀⠀⠀⠠⡞⠋⠀⠀⠀⣿⣿⡏⠉⠛⠻⣿⡀⠀⠀ ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣿⣿⡇⠀⠀⠀⠈⠁⠀⠀ ⠀⠀⣠⣶⣶⣶⣶⡄⠀⠀⣿⣿⡇⠀⠀⠀⠀⠀⠀⠀ ⠀⢰⣿⠟⠉⠙⢿⡟⠀⠀⣿⣿⡇⠀⠀⠀⠀⠀⠀⠀ ⠀⢸⡟⠀⠀⠀⠘⠀⠀⠀⣿⣿⠃⠀⠀⠀⠀⠀⠀⠀ ⠀⠈⢿⡄⠀⠀⠀⠀⠀⣼⣿⠏⠀⠀⠀⠀⠀⠀⠀⠀ ⠀⠀⠀⠙⠷⠶⠶⠶⠿⠟⠉⠀⠀⠀⠀⠀⠀⠀⠀⠀

  25. Time to divide it in half, then, and build up a second storage array on another machine to use for backups.