We want to do everything we can so that you can trust DotNest with your valuable websites, which is why we care very much about minimizing outages. DotNest opened about a month ago, and in this first month the uptime ratio of DotNest tenants was more than 99.99% (as measured by Uptime Robot)!
Actually, it was even a bit more than 99.99%: we had only one outage, on the 27th of March, and it lasted less than 5 minutes! To be honest, there was another issue that affected all tenants but technically doesn't count as downtime: on the third day after launch, a configuration issue caused CSS not to load for tenants for about 20 minutes. We learned from this and added fail-safes so it never happens again.
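To put that percentage into perspective, here is a small back-of-the-envelope sketch of the arithmetic. It assumes a 30-day month (the exact measurement window and outage duration aren't stated here), and the function names are purely illustrative, not part of DotNest or Uptime Robot.

```python
MINUTES_PER_DAY = 24 * 60


def uptime_percent(downtime_minutes: float, period_days: int = 30) -> float:
    """Uptime as a percentage of the period, given total downtime in minutes."""
    total_minutes = period_days * MINUTES_PER_DAY
    return 100.0 * (1 - downtime_minutes / total_minutes)


def downtime_budget_minutes(target_percent: float, period_days: int = 30) -> float:
    """Minutes of downtime allowed in the period while still meeting the target."""
    total_minutes = period_days * MINUTES_PER_DAY
    return total_minutes * (1 - target_percent / 100.0)


if __name__ == "__main__":
    # Assumption: a 30-day month, i.e. 43,200 minutes in total.
    print(f"Budget for 99.99%: {downtime_budget_minutes(99.99):.2f} minutes")
    # Hypothetical outage lengths, only to show how sensitive the figure is.
    for minutes in (4, 5):
        print(f"{minutes} min of downtime -> {uptime_percent(minutes):.4f}% uptime")
```

Under that 30-day assumption, 99.99% leaves room for only a little over four minutes of downtime per month, so a single outage of a few minutes is about all such a figure can absorb.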
How do we manage to get 99.99% uptime?
- We don't operate with planned downtime. When we designed our deployment process, we started from the ideal of having no downtime at all when updating DotNest or doing maintenance work. This doesn't mean that nothing can ever go very wrong and cause an outage, but if that happens it is an exception: we plan for no-downtime maintenance.
- We have a roll-back strategy in case something unexpected happens: if the proverbial you-know-what hits the fan, we can still get away with a very short outage by hitting the "roll everything back" button.
- At the moment, only Azure failures cause downtime. As you may know, DotNest runs on Microsoft's excellent cloud platform, Azure. Azure is very reliable, but there can still be brief outages. When this happens, what we can do is limited, since our core infrastructure is affected. In fact, the one outage mentioned above was also caused by an Azure service failing in the background. While we can't do anything about Azure service outages, we are actively working on making DotNest more tolerant of failures in its server backend.
Is 99.99%+ good enough for you? Then jump on the DotNest train! If not, rest assured that we'll get better :-).