
Re: ZFS performance (was: Re: deduplicating file systems: VDO with Debian?)



On Fri, 2022-11-11 at 14:48 -0500, Michael Stone wrote:
> On Fri, Nov 11, 2022 at 07:15:07AM +0100, hw wrote:
> > There was no misdiagnosis.  Have you ever had a failed SSD?  They
> > usually just disappear.
> 
> Actually, they don't; that's a somewhat unusual failure mode.

What else happens?  All the ones I have seen fail simply disappeared.

> [...]
> I've had way more dead hard drives, which is typical.

Because there were more hard drives than SSDs?

> > There was no "not normal" territory, either, unless maybe you consider
> > ZFS cache as "not normal".  In that case, I would argue that SSDs are
> > well suited for such applications because they allow for lots of IOPS
> > and high data transfer rates, and a hard disk probably wouldn't have
> > failed in place of the SSD because they don't wear out so quickly.
> > Since SSDs are so well suited for such purposes, that can't be "not
> > normal" territory for them.  Perhaps they just need to be more
> > resilient than they are.
> 
> You probably bought the wrong SSD.

Not really, it was just an SSD.  Two of them were used as cache, and that they
failed was not surprising.  It's really unfortunate that SSDs fail particularly
fast when used for the very purposes they are particularly useful for.

> SSDs write in erase-block units, 
> which are on the order of 1-4MB. If you're writing many many small 
> blocks (as you would with a ZFS ZIL cache) there's significant write 
> amplification. For that application you really need a fairly expensive 
> write-optimized SSD, not a commodity (read-optimized) SSD.

If you can get one, you can use one.  The question is whether it's worthwhile
to spend the extra money on special SSDs that aren't readily available, or
whether it's better to just replace common, readily available ones from time
to time.
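
To put rough numbers on the write amplification described above, here is a
back-of-the-envelope sketch in Python.  The erase-block size, record size,
endurance rating and daily write volume are all assumptions chosen for
illustration; I don't have figures for the drives that actually failed.

# Back-of-the-envelope sketch of erase-block write amplification and the
# resulting SSD wear.  Every figure below is an assumption picked for
# illustration, not a measurement of any particular drive or workload.

ERASE_BLOCK = 2 * 1024 * 1024        # assumed erase-block size: 2 MiB
RECORD_SIZE = 4 * 1024               # assumed small sync-write size: 4 KiB
RATED_ENDURANCE = 300e12             # assumed consumer-SSD rating: 300 TBW
HOST_WRITES_PER_DAY = 50e9           # assumed small-write volume: 50 GB/day

# Worst case: every small record forces a read-modify-write of a whole
# erase block, so the flash sees ERASE_BLOCK bytes for every RECORD_SIZE
# bytes of host data.  Real controllers coalesce writes, so treat this
# as an upper bound rather than a prediction.
write_amplification = ERASE_BLOCK / RECORD_SIZE

flash_writes_per_day = HOST_WRITES_PER_DAY * write_amplification
days_to_rated_endurance = RATED_ENDURANCE / flash_writes_per_day

print(f"write amplification factor: {write_amplification:.0f}x")
print(f"flash written per day:      {flash_writes_per_day / 1e12:.1f} TB")
print(f"rated endurance reached in: {days_to_rated_endurance:.0f} days")

Even if the controller coalesces most of those writes, the gap between that
worst case and a plain sequential workload is large enough to explain why a
cache/log device can reach its rated endurance long before the data disks do.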

>  (And in fact, 
> SSD is *not* ideal for this because the data is written sequentially and 
> basically never read so low seek times aren't much benefit; NVRAM is 
> better suited.)

If you can get that.

> If you were using it for L2ARC cache then mostly that 
> makes no sense for a backup server. Without more details it's really 
> hard to say any more. Honestly, even with the known issues of using 
> commodity SSD for SLOG I find it really hard to believe that your 
> backups were doing enough sync transactions for that to matter--far 
> more likely is still that you simply got a bad copy, just like you can 
> get a bad hd. Sometimes you get a bad part, that's life. Certainly not 
> something to base a religion on.
> 

That can happen one way or another.  The last SSD that failed was used as a
system disk in a Linux server in a btrfs mirror.  Nothing much was written to
it.

> > Considering that, SSDs generally must be of really bad quality for that to
> > happen, don't you think?
> 
> No, I think you're making unsubstantiated statements, and I'm mostly 
> trying to get better information on the record for others who might be 
> reading.

I didn't keep detailed records and don't remember all the details, so the
better information you're looking for is not available.  I can say that SSDs
failed at about the same rate as HDDs, because that is my experience, and
that's enough for me.

You need to understand what experience is and what it means.

