Debugger is not a naughty word

Conclusion

The biggest cause of software bugs is, at heart, commercial pressure: a software company is usually racing to produce a product before its competitors do, even if it's buggy. So managers want to stick with tried-and-tested programming languages for which a pool of cheap programmers is easily found, and the programmers don't get much time to write unit tests or use formal methods. This creates a culture in which it is considered normal to write buggy code and find the bugs later, so training initiatives don't tend to put much emphasis on the rigorous application of practices that reduce the incidence of bugs at the cost of slowing down initial development.

Personally, I like to write unit tests, and I would have written them for the application that exhibited the reminder bug if there had been time; but I was already behind due to my house flooding and desperate to catch up, there had been no explicit request for them from the client, there was a deadline, and I was being paid by the day - so I skipped them. Whether that was the right or the wrong call to make under short-term pressure, I don't know; but in the long term, I'm now paying the cost.

Outside of the safety-critical world, this will probably take a long time to change, as the pace of development in the software industry doesn't look set to slow down...


7 Comments

  • By @ndy Macolleague, Tue 15th Jan 2008 @ 2:02 pm

    Hi,

    Here is a quick "highest number such that we've seen all numbers less than or equal to it" algorithm that I just cooked up whilst reading this. It's pretty much at the pseudo-code level and I've not even tried to compile it, but I think it illustrates a point.

    I don't store the numbers that come in: I just store a set of flags so that we know whether we have seen a given number or not.

    int main (void) { int i; /* Most recent value read */ int blk, idx; /* The block that i lives in and the index into block */ int mem[1024]; /* Some space. Highest value is 1024 * sizeof(int) * 8. */ int h = 0; /* The block that we are up to */ int val; /* The answer so far */

        while (h < 1024) {
                i = read();
                blk = i / (sizeof(i) * 8);
                idx = i % (sizeof(i) * 8);
    
                mem[blk] |= (1 << idx);
    
                for ( ; h<1024, mem[h] != 0xFFFF; h++);
    
                val = ((h-1) * (sizeof(int) * 8)) + pop_count(mem[h]);
    
                /* pop_count is number of bits that are 1 */
    
                printf("Highest with all less than or equal: %d.\n", val);
        }
    

    }

    Seeing as we are tracking h, we could probably throw away / reuse the early parts of mem as they fill up. That would mean that mem would only have to be big enough to store the range of values that might be undecided at any given time. Doing that would make the illustration above less clear, so I didn't bother, as it'd be mostly memory management rather than part of the core algorithm.

  • By @ndy Macolleague, Tue 15th Jan 2008 @ 2:04 pm

    Hmph! The formatting of the variable declarations has been messed up. Here it is again:

    int main (void) {

        int i; /* Most recent value read */
    
        int blk, idx; /* The block that i lives in and the index into block */
    
        int mem[1024]; /* Some space. Highest value is 1024 * sizeof(int) * 8. */
    
        int h = 0; /* The block that we are up to */
    
        int val; /* The answer so far */
    
    
        while (h < 1024) {
                i = read();
                blk = i / (sizeof(i) * 8);
                idx = i % (sizeof(i) * 8);
    
                mem[blk] |= (1 << idx);
    
                for ( ; h<1024, mem[h] != 0xFFFF; h++);
    
                val = ((h-1) * (sizeof(i) * 8)) + pop_count(mem[h]);
    
                /* pop_count is number of bits that are 1 */
    
                printf("Highest with all less than or equal: %d.\n", val);
        }
    

    }

  • By Gavan Fantom, Tue 15th Jan 2008 @ 2:12 pm

    I would have expected that for nearly-sequential input, storing ranges (i.e. extents) would be a reasonable compression. The answer is then the size (well, strictly speaking, the higher end) of the first extent.

    The size of the extent map would then be a reasonable metric for the extent of disorder in the input.

    Depending on the quality of the input, an extent map of all numbers not seen so far may also be a suitable choice.
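
    For illustration, a sketch of the idea in C might look like this (hypothetical and untuned: a fixed-size sorted list of extents with naive linear insertion, just to show the shape of it):

    #include <string.h>

    #define MAX_EXTENTS 128

    /* A closed range [lo, hi] of values seen so far; the list is kept
       sorted by lo and non-overlapping. */
    struct extent { int lo, hi; };
    static struct extent ext[MAX_EXTENTS];
    static int n = 0;

    /* Record that value v has been seen. Linear scan for clarity. */
    static void seen(int v)
    {
        int i;
        for (i = 0; i < n && ext[i].lo <= v; i++)
            if (v <= ext[i].hi)
                return;                         /* duplicate, nothing to do */
        /* v belongs just before ext[i]; try growing a neighbour first */
        if (i > 0 && ext[i-1].hi + 1 == v) {
            ext[i-1].hi = v;
            if (i < n && ext[i].lo == v + 1) {  /* bridged a gap: merge */
                ext[i-1].hi = ext[i].hi;
                memmove(&ext[i], &ext[i+1], (n - i - 1) * sizeof *ext);
                n--;
            }
            return;
        }
        if (i < n && ext[i].lo == v + 1) { ext[i].lo = v; return; }
        if (n == MAX_EXTENTS) return;           /* full; real code would grow */
        memmove(&ext[i+1], &ext[i], (n - i) * sizeof *ext);
        ext[i].lo = ext[i].hi = v;
        n++;
    }

    /* The answer so far: the high end of the first extent, provided it
       starts at 0; -1 if we haven't seen 0 yet. The number of extents,
       n, is the disorder metric mentioned above. */
    static int highest_complete(void)
    {
        return (n > 0 && ext[0].lo == 0) ? ext[0].hi : -1;
    }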

  • By @ndy Macolleague, Tue 15th Jan 2008 @ 2:19 pm

    Hi,

    This thing probably has loads of bugs... The for loop will prevent the while loop from terminating. So change it to "while (val < (1024 * sizeof(int) * 8))" or just change the memory management model and make it "while (1)".

  • By @ndy Macolleague, Tue 15th Jan 2008 @ 2:27 pm

    Well... the for loop should have prevented the while loop from terminating, as it should be h<1023, not h<1024; otherwise we end up with a bounds overflow on the next line. Perhaps I should have compiled this... Maybe I still should...

    Oh dear.
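
    For the record, a version with those fixes folded in might look like this (still just an illustrative sketch: it assumes a 32-bit int, and I've swapped pop_count for a count of contiguous low bits, since a gap in the partial block shouldn't count towards the answer):

    #include <stdio.h>

    #define BLOCKS 1024
    #define BITS   32                  /* assumes a 32-bit int */

    /* Number of contiguous 1 bits starting from bit 0. */
    static int low_ones(unsigned int x)
    {
        int n = 0;
        while (x & 1) { n++; x >>= 1; }
        return n;
    }

    int main (void)
    {
        unsigned int mem[BLOCKS] = { 0 };  /* seen-flags for BLOCKS * BITS values */
        int h = 0;                         /* first block that is not yet full */
        int i, val;

        while (h < BLOCKS && scanf("%d", &i) == 1) {
            if (i < 0 || i >= BLOCKS * BITS)
                continue;                  /* ignore out-of-range values */

            mem[i / BITS] |= 1u << (i % BITS);

            /* && rather than the comma operator, and an all-ones test
               that matches the block width, fix the two loop bugs. */
            while (h < BLOCKS && mem[h] == 0xFFFFFFFFu)
                h++;

            /* val == -1 means we haven't even seen 0 yet */
            val = h * BITS - 1 + (h < BLOCKS ? low_ones(mem[h]) : 0);
            printf("Highest with all less than or equal: %d.\n", val);
        }
        return 0;
    }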

  • By Gavan Fantom, Tue 15th Jan 2008 @ 2:36 pm

    In my experience, the biggest factor in improving quality in software is visibility. You touched on a few of the "rules" which have been proposed over time, but the overriding rule as far as I'm concerned is "Write clear and readable code". If you can read it and understand it, you have more chance of spotting when something is not right. This usually extends to having a clear write-up or diagram describing the design clearly so that someone trying to understand the code can understand the design at the same time.

    But that's not the be all and end all of visibility. Unit tests can help here, but only if you actually run them.

    Amazingly, large codebases with multiple developers often suffer from the problem of the head of the repository not even compiling. When everybody is updating all the time it gets noticed and fixed regularly, but when you have people doing the bulk of their development on a snapshot or a branch and only updating to the head or the trunk occasionally, this can become a really big problem. Again the way to solve this is by greater visibility - in this case autobuilding.

    Another frequent failure is to spot a bug (or a potential bug), not fix it straight away, and then forget about it. Two years later it rears its ugly head, and by that time you've completely forgotten about it and have to debug from scratch. It is only after spending weeks debugging that you realise that you've already seen this, and then you wish you'd fixed it at the time, or at least written a TODO item, or tracked it in a bug database.

    And if it's in a bug database, it's visible. It's measurable. You can generate a report of all known bugs.

    So once you have readable code, documented design, (some) unit tests which you run daily right after your nightly builds, and a bug database full of all the bugs which you spotted and didn't have time to fix, you still have bugs that you don't know about. What about them?

    Well, software is not fault-tolerant except in extremely specific cases. So if a fault does occur, make sure your code spots it early and fails noisily. This will increase visibility of the problem. It will also prevent knock-on failures, as the code will simply stop executing once bad data has been detected. And also, hopefully, it will make it easier to isolate where the problem occurred. This is all extremely good news when it comes to debugging. A failed assertion is much more useful than a Segmentation violation.
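
    For instance, a trivial (and entirely hypothetical) C illustration:

    #include <assert.h>
    #include <string.h>

    /* Check the data at the boundary, so a bad name fails right here
       with a file and line number, not in some strcpy three modules
       away. */
    void store_name(char *dest, size_t dest_len, const char *name)
    {
        assert(dest != NULL);
        assert(name != NULL);
        assert(strlen(name) < dest_len);   /* room for the NUL too */
        strcpy(dest, name);
    }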

    The more you can see, and the clearer it is, the higher the quality of the resulting software, given a competent and conscientious programmer.

  • By alaric, Thu 17th Jan 2008 @ 10:29 am

    @ndy: That's similar to the principle I used, I guess. Since the algorithm runs continuously with an ever-increasing (and, eventually, wrapping back to 0) sequence of numbers (yes, before people ask, this is for packet sequence numbering in a network client!), I indeed have a buffer, but one set up to slide along the space of sequence numbers, so it only stores the "frothy" region; the "solid" region, where we have seen all the sequence numbers, gets pushed off the bottom of the buffer ASAP to free up space for froth.
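
    Stripped to its bones, the idea looks something like this (a hypothetical sketch, not the actual client code, which also has to cope with the sequence numbers wrapping and with "too new" numbers signalling packet loss):

    #define WINDOW 256                 /* how much "froth" we allow for */

    static unsigned char seen[WINDOW]; /* ring buffer of seen-flags */
    static unsigned int  base = 0;     /* lowest sequence number not yet seen */

    /* Record receipt of sequence number s. Afterwards, every number
       below base has been seen; numbers outside the window are dropped. */
    static unsigned int receive(unsigned int s)
    {
        if (s < base || s >= base + WINDOW)
            return base;
        seen[s % WINDOW] = 1;
        while (seen[base % WINDOW]) {  /* solid region: slide it off... */
            seen[base % WINDOW] = 0;   /* ...freeing the slot for froth */
            base++;
        }
        return base;
    }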

    Gavan: Yep, good points, thanks


Creative Commons Attribution-NonCommercial-ShareAlike 2.0 UK: England & Wales