
Advice on server hardware to use for a small dedicated data center




I'm very happy to have asked this question here; my brain is enjoying it. Thank you again for everything, I'll definitely stick with Debian.

> Something seems a little flaky about that architecture.  Rsyncing Postgres data?  There are lots of better ways to merge data into a database.  Particularly in these days of cheap, continuous, broadband connectivity.  And if you can't get business broadband in all your locations, cellular modems are dirt cheap.  (I speak as someone who's designed more than a few mobile data collection systems - everything from tactical military systems to transit buses)

Exactly, what a bad idea. I meant to write sync, not rsync (autocorrect got me). I was thinking about WAL archiving for point-in-time recovery.

> Rsync is just wrong for that kind of application.  What are you syncing, anyway?  2500 Postgres instances, some raw data files for input to a single instance of Postgres, something else?  And if the Postgres instance is feeding live data to apps, you really need to focus first on your high-availability strategy - single points of failure will kill you.

That is not for the database but for raw files like images, video and audio, where I planned to use rsync (I have not yet tested whether git could do the trick here, but I thought about it).

> As a startup, you've got way too many other things to worry about than home brewing an IT environment - focus on your core product/service, whatever that is

Exactly, but I have seen that the product we are going to build will cover all those aspects, i.e. building this means our product is ready and we have gained more experience on the subject.

 > I added indexes for the relevant fields, and the query took a few minutes. Powerful hardware is nice, but using it efficiently is even better

Infinitely right, I never stop repeating this to my team.
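
To make the point about indexes concrete for my team, here is a minimal sketch. It uses Python's built-in sqlite3 module rather than PostgreSQL, only so that it runs anywhere without a server; the `sales` table, its columns and the index name are made up for illustration. It prints the query plan before and after creating an index on the filtered column.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the planner's description of how SQLite would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the fourth column of each plan row.
    return " | ".join(row[3] for row in rows)


# Hypothetical schema: one row per sale, tagged with the restaurant it came from.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    "id INTEGER PRIMARY KEY, restaurant_id INTEGER, sold_at TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales (restaurant_id, sold_at, amount) VALUES (?, ?, ?)",
    [(i % 50, "2020-06-28", 10.0 + i) for i in range(1000)],
)

sql = "SELECT * FROM sales WHERE restaurant_id = ?"

# Without an index the planner has to scan the whole table.
plan_before = query_plan(conn, sql, (7,))

# With an index on the filtered column it can search the index instead.
conn.execute("CREATE INDEX idx_sales_restaurant ON sales (restaurant_id)")
plan_after = query_plan(conn, sql, (7,))

print("before:", plan_before)
print("after: ", plan_after)
```

On PostgreSQL the equivalent check would be done with EXPLAIN on the real query; the exact plan text differs between engines and versions.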

> ZFS requires a different way of thinking,
> ...
> So, you can replicate over a network in real-time and you can send a replication
> stream to a file and receive it later.

At what level do we have to change our thinking? For example, will this affect the way we think about SELinux (normally not)?

> Do you have good backup and restore processes for your current systems?
> What are your plans for the new systems?
> ...
> Do you desire 24x7 operations, live maintenance, automatic fail-over,
> high-availability, or similar?  If so, how?
> ...
> What is your budget and schedule?  Manpower?

About the budget and manpower: we are preparing a financial plan to present to our investors, so nothing is decided for the moment. That is why all your answers and suggestions are welcome and very useful.
What is ideal for 24/7 operations, high availability and automatic fail-over? There are 8 of us in the team for the moment: 2 web and mobile developers, and only 2 people who know about systems and networks (admin, security, ...), but we plan to hire more people.
So please, make suggestions; in the worst case we will have learned more things.

+2 for Docker and Kubernetes


> We also had some HPE ProLiant DL380, that's pure crap (a lot of disks died
> in the same month, BIOS updates are non-free, watermarked hard disks).
> For now, they are all out of order, and it's far too expensive to repair
> them.

Thank you very much for this warning. One more thing to remove from the list of things to evaluate. And somewhere in my mind I was suspecting things like this :) .


On Sun, 28 Jun 2020 at 20:51, Miles Fidelman <mfidelman@meetinghouse.net> wrote:

On 6/28/20 6:37 AM, echo test wrote:


Hello,

Thank you for all your answers, and sorry for the late reply.
 
> I prefer ZFS but I find that lots of corps prefer mdadm. I really think that's simply
> because ZFS came from Sun and they lack Solaris backgrounds. Now, in a low-RAM
> environment with simpler disc needs, I would probably go with mdadm.
> Anything else I would choose ZFS. Its ability to take care of itself is surprisingly
> strong. Less work for me after the set up and installation.

Since ZFS is a filesystem and mdadm is a utility, I think I'll go for mdadm. I didn't know that Debian supported ZFS; I have always used ext4.

It's a stack.  You build up from disk, to block-level raid, to volume manager, to file system, to access protocol.

ZFS includes multiple levels of the stack.  And yes there are ZFS implementations for Debian, along with a dozen or more other file systems.


Sorry if I'm misunderstanding: are you saying that Debian cannot scale in a bigger enterprise?
Can you tell me what happened with hardware RAID solutions?

> "small" could be anything from 10 to 1000 users. Mentioning some numbers
> could get you more useful recommendations.
> In any case, some interesting hardware not mentioned so far (don't
> forget about the power consumption).

Small here means, for me, about 2000 users. All of them are restaurants that save their sales history locally on their own server; then, 2 or 3 times in the morning, they will rsync their Postgres data to my data center.
About power consumption, any advice on low-power hardware is also welcome.

Something seems a little flaky about that architecture.  Rsyncing Postgres data?  There are lots of better ways to merge data into a database.  Particularly in these days of cheap, continuous, broadband connectivity.  And if you can't get business broadband in all your locations, cellular modems are dirt cheap.  (I speak as someone who's designed more than a few mobile data collection systems - everything from tactical military systems to transit buses).




> Supermicro 1U servers - run two or more of them
> and it's easy to turn them into a high-available cluster
> ...
> Note:  I'm seriously considering migrating from Debian for our
> next refresh - I really don't like systemd - might go all the way to BSD
> or an OpenSolaris distro.

Supermicro definitely seems to offer some great stuff; I will take them into account. Why do you dislike systemd? I have heard many people say the same thing and I don't really understand their motivation, except that initd is less invasive.

It's a spaghetti coded package of crap, that takes over your system and does things its way.  I prefer modularity, and control over my systems.


I don't really know how to answer your question, but let's try. We are a startup, and for the moment we have a production and a development environment. In fact, production is really just a test environment, because we do continuous delivery: we push every day in order to find out more quickly when something has been broken that our semi-automated tests didn't detect. Personally, I'm self-taught, and probably many people on my team are too. So any advice here is also welcome.
We want to be able to handle 2500+ rsync runs in the morning (probably spreading them over time to avoid a single big load acting like a DDoS) and, for each customer of my clients (the restaurants), a GET and a PUT profile request.
Note: customer profiles are shared across restaurants, and customers can find/filter restaurants on the website, which is not yet built but we are working on it.
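
To illustrate what I mean by distributing the syncs in time, here is a rough sketch, not something we have built. It assumes each restaurant has a stable ID and that all uploads should land inside a single window (3 hours here, starting at 02:00; both values and the function names are made up). Each ID is hashed to a fixed offset inside the window, so every restaurant always syncs at the same time of day and the load is spread roughly evenly without any central scheduler.

```python
import hashlib
from datetime import datetime, timedelta


def sync_offset_seconds(client_id: str, window_seconds: int) -> int:
    """Map a client ID to a stable offset (in seconds) inside the sync window."""
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the hash as an integer and fold it into the window.
    return int.from_bytes(digest[:8], "big") % window_seconds


def sync_time(client_id: str, window_start: datetime,
              window_seconds: int = 3 * 3600) -> datetime:
    """Return the moment at which this client should start its upload."""
    offset = sync_offset_seconds(client_id, window_seconds)
    return window_start + timedelta(seconds=offset)


if __name__ == "__main__":
    # Hypothetical window start and restaurant IDs, for illustration only.
    start = datetime(2020, 6, 29, 2, 0)
    for cid in ("restaurant-0001", "restaurant-0002", "restaurant-0003"):
        print(cid, sync_time(cid, start).strftime("%H:%M:%S"))
```

Each restaurant's server could compute its own slot from its ID, so nothing has to be coordinated from the data center. The same idea applies whatever transfer method ends up being used.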

Rsync is just wrong for that kind of application.  What are you syncing, anyway?  2500 Postgres instances, some raw data files for input to a single instance of Postgres, something else?  And if the Postgres instance is feeding live data to apps, you really need to focus first on your high-availability strategy - single points of failure will kill you.

Come to think of it, you're a poster child for doing everything in the cloud.  As a startup, you've got way too many other things to worry about than home brewing an IT environment - focus on your core product/service, whatever that is.  (Now, if you're setting up a service bureau, that's another story - in which case, hire some folks who actually know how to do this stuff.  Here, I'm speaking as someone who HAS homebrewed a small service bureau, with serious experience in computing & IT - back before any of this stuff was available off the shelf.  It's a royal PITA.  These days, I'm far more likely to set up a new domain, or app, on a hosting service, than on our cluster - unless & until I know that it needs to be around for a while.  Life's too short.)

Miles Fidelman


-- 
In theory, there is no difference between theory and practice.
In practice, there is.  .... Yogi Berra

Theory is when you know everything but nothing works. 
Practice is when everything works but no one knows why. 
In our lab, theory and practice are combined: 
nothing works and no one knows why.  ... unknown
