There is a saying, "Price, Quality, Speed of delivery: pick any two", that sums up most of our purchases. You can get something cheap and quick like a McDonald's burger, or have a nice restaurant meal, but you'll wait longer and pay more. Makes sense, doesn't it?
In the software world, quality is rarely one of the two that gets picked, and believe it or not, the developers are not the problem - we *want* to make software that is bulletproof, but the cost is too high. Customers won't pay 5x the estimate for that level of quality, and neither will the managers who are pushing to "release next week!"
That may sound harsh, but it is business - people need things fast, and they can't wait a year to get a perfectly polished product, so they will get something that does the job with a few bugs and workarounds so they can get on with their business.
But what if you want software to last a long time - I mean a *really* long time, so that it is still running when you are in a retirement home - what sort of things do you need to think about in the design phase of this?
In this article I'll talk about the things that make software brittle and how to avoid them, so that you end up with a system that runs for as long as your hardware lives.
We want to build a piece of software that will run without falling over. For the purpose of this blog, we will focus on software developed for PCs / servers that will be deployed to a website (everyone calls it the cloud, but it really is just someone else's computer).
One thing to keep in mind is that unless you are writing code for a long range space probe, no one is going to pay you to do this, so you will be doing it after hours in your own time.
Think about the software as a whole - what does it do, who uses it, how often, what MUST it do, and what are the nice features that don't really matter if they fail? In other words, separate the critical functionality from the fluff.
Separating the MUST from the NICE determines how you'll handle the errors for each - anything nice that fails can return gracefully and let the user know 'sorry, that isn't going to happen right now', but the MUST functionality needs a plan to work around the failure.
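As a rough sketch of that split, assuming Python and two helper names I have made up for illustration (`run_nice` and `run_must`), the two kinds of error handling might look like this. A NICE feature is allowed to give up politely; a MUST feature gets a second route to the same result.

```python
import logging

logger = logging.getLogger("longlived")  # hypothetical logger name


def run_nice(feature, apology="Sorry, that isn't going to happen right now."):
    """Run an optional feature; if it fails, log it and return a polite message."""
    try:
        return feature()
    except Exception:
        logger.exception("optional feature failed")
        return apology


def run_must(primary, backup):
    """Run critical functionality; if the primary route fails, use the backup plan."""
    try:
        return primary()
    except Exception:
        logger.exception("primary route failed, switching to backup")
        return backup()
```

What the backup actually is (a second server, a local copy, a queue to retry later) depends on your system; the point is only that the MUST code has one and the NICE code doesn't need one.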
Look at the software documentation (and if you don't have software documentation, that is the first job!), identify every component, and think about what happens if it fails.
What hardware to run on is probably the most obvious initial question, but it is not the right one to ask - hardware changes quite quickly, and it is more important to work out what operating system or software platform you should use, as these are updated to run on new hardware as it comes out. If you are hosting this on your own local system, consider something simple like the Arduino or Raspberry Pi - if the software is intensive, then go for multiple rack mounted computers to avoid a single point of failure. Battery backup is a must in either case.
The only real choice is an open source operating system such as Linux or BSD. Large companies like Microsoft and Apple do support older versions of their OS for a limited time, but they have always stopped that support eventually. Once this happens, you will likely need to upgrade the software - often this is simple, but other times it is not.
Either way, for long term software, don't lock yourself into a large vendor whose primary source of income is to sell you a new version every 3-4 years - it is in their best interests to make sure that old software doesn't work after a while.
For the language, pick anything that has a public specification and is in wide public use - that means you can't go wrong with the C language, which runs on just about any hardware and has been used for years, though you need to watch for pointer issues and security flaws (more on this below).
Keeping dependencies to a minimum goes against the current trend, where people deploy quickly and use any and all libraries to reduce the amount of code they need to develop. Long term use is different - you need to very carefully consider any addition or dependency to your software, as it adds another way for it to break when a library becomes unsupported, changes, or has a serious security flaw.
Ideally, you would not use anything that you couldn't write yourself. In a normal job, if you can either write a pure code solution to do a task or pick a library, you would pick the library (your boss and customers don't want you charging your time to re-invent the wheel), but that is not the situation here.
You've picked your hardware and language, so let's get to work - what's next?
1. Catch all the errors
You should never see a 500 server error on the web, or a core dump from a C program - every non-trivial function (and maybe even the trivial ones) needs to look after itself. Treat every input as potentially hostile, assume everything is going to fail, and manage what happens when it does - the software must exit gently from failures and not just crash.
2. Don't trust anything
- all inputs have to be very carefully checked.
- don't trust a network call - they absolutely will fail. Treat a read/write to a network as 'slow, try, try again' rather than as a simple file.save()
3. Don't use external libraries unless you absolutely must
If you are not in a rush, write the libraries yourself rather than depend on a massive stack that becomes brittle.
4. Don't use *any* APIs or online services
They will be deprecated, change, stop, or start charging $5,000 a month if they get too successful.
Even with paid services, you have no guarantees that the service you use will not change.
If you really have to use them, then cache the information and wrap every call, so that when they do change, you still have the most recent data from them to use in your software, along with the date it was last updated.
My Weather App
Today's Temp = 25 [last updated 12/12/2004]
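Rules 1, 2 and 4 can be sketched together in a few lines of Python. This is only a sketch under my own assumptions: the names `validate_city`, `call_with_retry`, `CachedReading` and `format_reading` are invented for illustration, the cache is a small JSON file, and `fetch` stands in for whatever call you make to the outside service.

```python
import json
import time
from datetime import date


def validate_city(raw):
    """Rule 2: treat input as hostile; accept only short alphabetic names."""
    if not isinstance(raw, str):
        raise ValueError("city must be text")
    city = raw.strip()
    if not (0 < len(city) <= 50) or not city.replace(" ", "").isalpha():
        raise ValueError("city must be 1 to 50 letters")
    return city


def call_with_retry(fetch, attempts=3, delay_seconds=1.0, sleep=time.sleep):
    """'Slow, try, try again': retry fetch(), doubling the wait each time."""
    for attempt in range(attempts):
        try:
            return fetch()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts, let the caller decide
            sleep(delay_seconds * (2 ** attempt))


class CachedReading:
    """Rule 4: remember the last good value and the date it was fetched."""

    def __init__(self, cache_path, sleep=time.sleep):
        self.cache_path = cache_path
        self.sleep = sleep

    def _load(self):
        try:
            with open(self.cache_path, encoding="utf-8") as f:
                return json.load(f)
        except (OSError, ValueError):
            return None  # no usable cache yet

    def get(self, fetch, today=None):
        today = today or date.today()
        try:
            value = call_with_retry(fetch, sleep=self.sleep)
        except Exception:
            return self._load()  # service down or changed: use last good data
        record = {"value": value, "updated": today.isoformat()}
        try:
            with open(self.cache_path, "w", encoding="utf-8") as f:
                json.dump(record, f)
        except (OSError, TypeError, ValueError):
            pass  # a failed cache write must not spoil a good reading
        return record


def format_reading(record):
    """Rule 1: always show something sensible, never a crash."""
    if record is None:
        return "Today's Temp = unavailable"
    return f"Today's Temp = {record['value']} [last updated {record['updated']}]"
```

When the service is up, `get` returns today's value and refreshes the file. When it is down, you get the old value with its old date, which is exactly what the weather line above is showing.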
The best way to get software to fail is to have issues with either Paths, Permissions or Parentheses.
Even though I am pulling this number out of my hat, I'd estimate that at least 80% of all software bugs are caused by one of these.
Paths: your software tried to open a file that doesn't exist, the network is not working, or the file was moved.
Parentheses: syntax style errors. Some are obvious fixes and are mostly caught by a compiler or linter, but many quietly change the logic, which is what causes the errors.
C - you left out the semicolon
Python - your indents are off, which changed the logic
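To make the Python case concrete, here is a small made-up example (both function names are mine). The two functions differ only in how far the `return` is indented; both are valid Python, so no compiler will complain, but the second one gives up after the first item.

```python
def total_valid(values):
    """Add up every non-negative value."""
    total = 0
    for v in values:
        if v >= 0:
            total += v
    return total


def total_valid_misindented(values):
    """Same code, but the return has slipped inside the loop."""
    total = 0
    for v in values:
        if v >= 0:
            total += v
        return total  # runs on the first pass, so the loop never continues
```

With the input `[1, 2, 3]` the first function returns 6 and the second returns 1. A linter might flag it, but you can't count on that.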
Permissions: the biggest issue is that the software can't read something because it doesn't have access - for example a file on a server, a local folder with restricted permissions, or a select from a database with incorrect grants / logon access.
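For path and permission problems, one possible approach in Python is to funnel every file read through a single function that reports what went wrong instead of crashing. The function name `read_text_safely` and the wording of the messages are my own; the exception classes are the standard Python ones.

```python
from pathlib import Path


def read_text_safely(path):
    """Return (text, problem); exactly one of the two is None."""
    try:
        return Path(path).read_text(encoding="utf-8"), None
    except FileNotFoundError:
        return None, f"missing: {path} does not exist (moved or never created?)"
    except PermissionError:
        return None, f"no access: check the permissions on {path}"
    except (OSError, UnicodeError) as exc:
        # Anything else: disk errors, a directory where a file should be, bad encoding.
        return None, f"could not read {path}: {exc}"
```

The caller then decides what to do with the problem: fall back to defaults, show the message, or retry later. The same pattern would apply to database logons, with that driver's own exception classes.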
How does long term maintenance work in this situation?
You need to pay for services in advance, including domain name renewal, hosting and your own internet access, so you can get to it to fix it.
This is the weak point of the exercise: you have to pick a hosting provider that is not going to go out of business, change policies, get hacked, etc.
Best to pick a big one and pay in advance (don't just set up recurring payments - your credit card expiry date is a nice trap that stops people's domains being renewed).
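If you want something to nag you before one of those dates sneaks up, a tiny check like the one below could run on a schedule. It is only a sketch: the function name `due_soon`, the 90 day warning window, and the idea of keeping the dates in a plain dictionary are all my assumptions.

```python
from datetime import date


def due_soon(renewals, today, warn_days=90):
    """Return (name, days_left) for anything expiring within warn_days, soonest first.

    Items that have already expired show up with a negative days_left.
    """
    soon = [
        (name, (expires - today).days)
        for name, expires in renewals.items()
        if (expires - today).days <= warn_days
    ]
    return sorted(soon, key=lambda item: item[1])
```

You would feed it your own dates (domain, hosting, and the expiry date of the card that pays for them) and send yourself the result however you like.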
What bugs? We just spent MONTHS of our own time after hours making this software that is supposed to last for decades - where did the bugs come from?
Answer = there are always bugs. But at least these will be bugs under your own control, which can be solved.
So that's it - a brief look at the main things that will cause your software to fail. Hope you got some tips that might help make the world of software a bit nicer.