Friday, September 23, 2016

C++ Optimization Bibliography

I wrote a wonderful book on performance tuning in C++. Plug, plug, plug. Buy it now!

The web is full of free optimization advice. The best of these references support their findings with timing benchmarks. Intermediate in quality are references that speak to the behavior of specific processors, but don't contain timing data to allow the reader to reason about the size of the performance speedup. Least interesting are simple lists of recommendations presented without evidence. These are a few frequently-cited references I found useful.

Most people think of code tweaks when they write about optimization. The performance improvement from statement-level optimization is unlikely to make much difference unless it is magnified by being frequently executed, in a loop or frequently called function, so these tweaks are of uncertain value.

Free Resources on the Web 

Here is a free taste of the optimization drug, to whet your appetite and make you want to read my book, or to get you over those two days of the shakes while Amazon delivers your paper copy.

Short Notes

Tips for Optimizing C/C++ Code (PDF)
This short note contains two dozen short thought-mantras about optimizing code. Explanations are necessarily brief. It can serve as an, "I didn't know that!" kind of checklist.

Optimizing C and C++ Code (HTML)
Two dozen short thought-mantras on code optimization from game developers.

Optimization in C and C++: good practices, pitfalls, by S├ębastien Binet (PDF)
Optimization advice on PowerPoint slides. Covers a lot of ground, focuses on linked structures, great graphics.

Programming Optimization (HTML), by Paul Hsieh
An interesting mix of general advice, plus brief specific low-level advice for optimizing statements and expressions.

Performance Optimization (HTML)
This is chapter 25 of the book Object Oriented System Design. It is notable for providing language-independent descriptions of design-time optimization techniques. It uses a dense and somewhat obsolete jargon defined elsewhere in the book, that can be a little hard to read. 

Three Optimization Tips for C++, Andrei Alexandrescu (HTML)
Anything Alexandrescu says is worth reading. This is more of a case study than a general advice piece, and contains many more than three tips. It features reduction in strength, reduction in array writing, and memoization.

This link has commentary and more measurements on the counting digits example Alexandrescu uses in his article.

Efficient C/C++ Coding Techniques, by Darren G. Moss (PDF)
This is a performance analysis of how several compilers available in 2001 generate code for common flow-of-control constructs, loops that index tables, and methods of structure composition. The findings are unsurprising; that switch is often better than if-elseif-else, that pointers are efficient in loops, and that a class with virtual multiple inheritance is expensive (if that wasn't obvious).

Writing Efficient C and C Code Optimization, Koushik Ghosh (HTML)
A grab-bag of statement- and expression-level optimization techniques, provided without supporting evidence for its effectiveness. Compilers may do some of these optimizations.

Techniques for Scientific C++, by Todd Veldhuizen (PDF)
A somewhat older, but interesting article with a grab-bag of optimization advice. Covers optimizing compile time, eliminating virtual functions, how aliasing interferes with optimization, expression templates,

C++ Optimization Strategies and Techniques (HTML) by Pete Isensee
Isensee really knows his stuff. This web page represents the state of the art in optimization circa 1999 (as he notes himself). It is slightly dated now, but still very relevant. The advice is well supported with timing measurements and examples.

Faster C++ (HTML)
Another grab-bag of statement-level optimizations, plus discussion of profiling, compiler flags, and cache issues. Good bibliography at the end.

FastC++: Coding Cpp Efficiently blog (HTML)
I haven't explored this blog extensively, but he seems to know a lot about getting the C++ compiler to use SSE extensions.

The Top 10 Myths of Video Game Optimization, Eric Preisz (HTML)
Interesting viewpoint on several well-loved optimizations. Of particular interest to me, that LUT-based algorithms may be a lose if they have reduced cache performance, since a cache miss is slower than the slowest instruction. The moral would be that small LUTs are good, but big ones can be problematical.

Mentor low Latency C or C++ code (HTML)
This is an article on verifying that programs meet their latency goals, not on optimizing C++ per se, But it has a bunch of good rules about memory allocation, and a couple of other optimization topics, covered without supporting evidence.

Longer Works

Technical Report on C++ Performance ISO/IEC TR 18015:2004 (PDF)
This is a definitive 202 page report on performance-related issues in C++03 by the ISO C++ Committee, fully supported with extensive timing evidence. At the time this TR was written, C++ was under an unjustified attack for being inefficient compared to C. TR 18015 is amazing, offering a mature look at C++ which includes much of the optimization advice in Optimized C++, albeit in a compressed form. I was aware of TR 18015 before I began writing my book,  then forgot about it until the book was in production.It's a gem.

TR 18015 covers all of the following:
  • Types of programs (embedded programs on small devices, servers running flat out) with performance issues
  • Costs of various C++ features: namespaces, type conversion and casting, classes and member function calls, exception handling, and templates, all supported with run time measurements.
  • Two methods of exception handling (code, table). This discussion also contains a list of alternative error handling mechanisms.
  • Ways to make the standard I/O functions faster by fiddling with locales.
  • A hardware interface using atomic and volatile. 

TR 18015 spawned some articles in the embedded systems press. Some of these articles offer a much less mature “C++ is less efficient than C” viewpoint.

Efficient C/C++ Coding Techniques, Moss, 2000
"Explains" that C++ is less efficient than C if you use virtual multiple inheritance when simple containment will do. Some comparisons of control flow choices also motivated by poor understanding about what’s good about C++. Great diagrams on virtual functions and virtual inheritance, though.
Using C++ Efficiently In Embedded Applications, Quiroz, 1998
"Explains" that C++ exception handling is expensive because you don’t really need to destroy objects as you exit each stack frame, which is fine if resource leaks are OK with you. Paper talks about some C++ features and categorizes them into cheap, maybe-cheap, and expensive categories.
Reducing Run-Time Overhead in C++ Programs, Saks, 2002
A paper on optimizing C++. It has great 90/10 law quotes, and covers implicit copying and custom new and delete. It cites Bulka and Mayhew in its bibliography.
Representing and Manipulating Hardware in Standard C and C++, Saks
A paper on how hardware registers may be represented in C++.

Optimizing Software in C++: Agner Fog (PDF)
This is a long (150 page) treatment of statement and expression-level optimization in C++. The material is informed by deep knowledge of the x86 architecture, and makes reference to architectural details. It is only partially translatable to other platforms. Agner Fog actively maintains this document, so its content changes from time to time. If you find it useful, it may be worth caching a copy for your own use, lest the material change out from under you.

Optimizing C++: A book about improving C++ performance (Wikipedia | PDF)
There is some useful information in this small collection of Wikipedia entries, though not enough to call it a book. Good things about this wikibook include:
  • It evolved substantially during the time I was writing Optimized C++, growing to cover more material better.
  • It has a bibliography, including free on-line references, and some books that were new to me.
There are several problems with it, though:
  • No topic is covered exhaustively. There are, for instance, a few algorithmic techniques, but not a complete or even extensive catalog.
  • Most optimization advice is supported only by hand-waving arguments about how a compiler might behave.

Books

Efficient C++, Dov Bulka and David Mayhew (Amazon link) 1999, Addison Wesley, ISBN: 0-201-37950-3
Meticulously researched, but now slightly dated. I have tremendous respect for this work.

The Software Optimization Cookbook: (PDF table of contents)
I have not read this book, but it appears to be a useful source of information on processor-dependent optimization for the IA-32 (Intel 32-bit) architecture.

Optimizing C++, Steve Heller (HTML), 1999, Prentice-Hall, ISBN: 0-13-977430-0
Warning: not about performance tuning in C++ at all, but rather about optimizing an ISAM database. You can view it for free on the web.

Optimized C++, Kurt Guntheroth (Amazon link), 2016, O'Reilly Media, ISBN: 978-1491922064
My book, the best of everything, of course.

Other Notes

Microsoft Visual C++ Tips and Tricks has all sorts of debugger tricks. Not about optimization, but interesting.

Friday, December 4, 2015

Mastery Takes Time

I have talked about achieving mastery previously. Here's another, perhaps better researched article on how long it takes to achieve mastery of a discipline. For those too indolent to click a link, what it says is that it takes about 10,000 hours, or about five full-time years, to obtain a reasonable degree of mastery, and it doesn't matter what it is; playing the violin, swinging a baseball bat, or coding in C++. A study of violinists indicated that "natural" talent was not a factor. The people who were the best were the ones who practiced the most and the longest.

This is something recent college grads all seem not to know about the careers on which they are embarking. They think (like I once thought) that their natural intelligence and their good education was what made the difference. As an Old Hand, I can tell you that practice has a lot to do with it, and good as you are, you will get a lot better when you've been doing the job for 20 years.

Don't Teach Yourself to Code by Writing a Library

It is a terrible idea to teach yourself how to use some unfamiliar feature of C++ by designing a library that heavily uses this feature.

I cannot think of an easily explained example, because libraries that come into existence in this way are very rarely easily explained. Even experienced C++ folks look at the thing and say, "Huh...WHAT?! Can you DO THAT?" Readers with more than a couple years of development experience are grimacing right now, because chances are good that the code they are paid to work in every day relies on a library that some genius thought would be cooler if it all used some tricks they'd been reading about.

It may be cool to the original author, but to anyone who hasn't read the same blogs as the author, it's just mysterious. Especially mysterious when the author has trouble doing what they originally imagined was possible, and ends up covering this lack of experience with a hack, that then lingers long after the author stops working on the library.

Writing libraries is a great time to slow down; to be conservative; to solicit the widest possible review; to ask your peers, "Is this good style?"

Thursday, June 25, 2015

The Mysterious Hidden Languages of C++

C++ is old enough to have teenaged children (like Java, conceived when C++ was itself a wild teenager). Some lucky developers have spent virtually their whole career writing C++. But while C++ grew and solidified, the world of programming languages bubbled and foamed around it. Languages and language ideas came and went. Lexical styles changed and changed again. In these modern days of Hascell, python and Javascript, it can be hard to remember that the folks who originally designed C and C++ learned to program in BCPL.
 

The Secret Expression Language Inside C++


An expression language is a language where every executable statement produces a value. The vestigal expression language embedded in C++ includes individual expressions, sequences, blocks, if/else, and function call, but not iteration, alas.

x = 1, y = 2, z = x + y is a sequence. It sets the values of variables x, y, and z, and produces the value 3.

(rand(), 0) is a block. It calls the function rand(), and produces the value 0.

x == 1 ? y : (z, 0) is an if/else statement. It evaluates the boolean condition x == 1. If x and y have the values set in the previous expression, it produces the value 2. If x is not equal to 1, it produces zero.
 

C++ Has a Shiny New Functional Language Too


Here's the old syntax for a function.

int foo(int x)
{
    return x*x-1;
}
Here's the same function as a lambda. Return value in a different place. Syntax for associating a name with the function different. New rules to remember about the difference between lambdas and function pointers.

auto foo = [](int x)
{
    return x*x-1;
};
 
Lambdas have a whole new scheme (ouch, pun) of type deduction rules. Someday they are going to be special. But today they are mostly syntactic sugar, for people who grew up with Javascript or some exotic functional language so they can pretend that C++ is trendy too.

Monday, March 30, 2015

Aphorisms for Work

Here are some aphorisms that I think upon, usually during meetings with grumpy managers.

It takes the same time to do a project right as to do it wrong

You just spend the tine in different places. You can rush through coding without any planning, but you will spend a lot of time in front of the debugger, and maybe have to throw your code away and start over when you understand the problem better. Or you can do thoughtful design, code at a deliberate pace, run your unit tests, and be done. It comes down to a lifestyle choice.

The most dangerous thing in the world is a half-understood Harvard Business Review article 

There's an HBR article that says, essentially, "All other things being equal, getting software done sooner results in greater revenue." This one paper is the cause of so much misery for development teams. Managers love to use this article as an excuse to pry the software out of development and jam it into production before it's finished. They never read the "all other things being equal" part of the article. 

There is an HBR article that says, "Pay is not what motivates employees the most." There are managers who read this article and think they can overwork and underpay their employees. They don't read the part about what does motivate employees, like empowerment, recognition, and work/life balance.

The most pain is inflicted when the manager hears a summary second-hand, and doesn't even know he has only half the advice. He thinks, "There's an HRB article that says you should ship the project as early as possible", or "HBR says employees don't care that much about pay."

When everything is important, nothing is

Ever have your manager tell you on Monday to work on this, because it is the most important thing, and then tell you on Tuesday to work on that, because it is the most important thing. And then you ask him to choose which is most important, and he says they both are? That is what a weak manager does when he doesn't want to make a decision. There must be a most important thing, the thing to finish first. 

If everything is important, if everything is an emergency, then nothing is really important, because your manager will blame you for what you didn't choose, no matter what you did choose. If your manager behaves in this way, it's a good time to get your resume updated and start looking for work.

The 90/90 Rule: The first 90% of development takes 90% of the time, and the last 10% takes the other 90% of the time

Oh, how many times has a developer said he was 90% done in one month, only to spend a second month on that last pesky ten percent? You're 90 percent done with the part of the task you know about. Generally that's about half the task. It's a very mature organization indeed that can correctly estimate completion percentages. Your best chance is to estimate things using a consistent process, and then consistently record how your estimate matched reality. Then you can de-rate future estimates until your estimation process matures.

Wednesday, January 28, 2015

Software Quality: A Race to the Bottom?

Another Old Hand, whose opinions are based on very relevant experience, says that the software industry is trapped in a race to the bottom in software quality.

The trap is that investors' needs are divorced from software companies' mechanisms for producing value. Investors want the stock price or corporate valuation to improve in every reporting period; every year, every quarter, every month, every day. Investors are entirely uninterested in what a company did last month. They invest today to reap profit in the future. Companies, on the other hand, can only demonstrate their ability to produce value by completing projects; shipping disks, posting content, publishing interfaces. But it is hard to encode much value into a software product in a day or a month.

To impress investors, a company must ship, soon. Since investors aren't interested in what the software does, there is a strong incentive to ship early, but less incentive to ship complete or useful code. For a startup which has never shipped, it's all about who has the value story with the most zeros on the end of it and the shortest time to delivery.

To impress partners and customers, a company must ship. Customers are largely uninterested in what a company plans to do. They want products now. If a potential customer wants functionality, and there are competitors, the one who ships first wins the customer. Having won, it's a relatively smaller problem to keep the customer. Customers are used to defective software.

Thus, according to my friend, what wins is crazy, caffeine-fueled speed. Quality doesn't matter. By the time the customer finds out the product is imperfect, the sale is made, the money paid.

This reasoning depresses me. I hate what it means for my profession. Yet I can find no flaws in the argument, only small exceptions; established quality brands; software products (databases, kernels) that can never work in the marketplace if they aren't reliable; special customers who know what they need and write quality into their purchase agrements.

Monday, January 5, 2015

Reading Your Own Press Releases: A Management Antipattern

To customers, a new company is an unknown. A new company has no reputation. A new company must focus on providing innovative or cost-effective solutions to customer problems, because it's the only story they have to tell. Nothing wrong with that.

Well-established companies have more to say. In addition to presenting their latest product's features, they can tell customers about their previous products, their many years in business, their reputation for quality. Nothing wrong with that either. It's easy for a customer to choose an established company versus an upstart competitor, even without doing a bunch of homework, because the established company is a known quantity.

But sometimes, an established company loses its way. They begin to tell themselves about their reputation for quality, their years in business, and their leadership position. Their executives tell their product teams that they can charge a premium price because they are the "Cadillac brand", regardless of the strength of competitive products. If they say this often enough, it becomes received wisdom within the company. I call this behavior reading your own press releases.

A company that is reading its own press releases is already past its prime. When a company starts reading its own press releases, it stops worrying about solving customer problems, or being better, faster, and stronger than the competition. They may even get away with this for a few years if they really were market leaders. But eventually, they find themselves sidelined by their hungrier competitors, and their position of advantage evaporates.

I've worked for two companies who had this problem. One is now a division of a conglomerate, who fixed the problem by replacing complacent managers. The other annoyed its partners so much with slow product delivery that the partners made the company irrelevant. Last I checked, this second company was still in business, but they shed three quarters of their staff in a paroxysm of downsizing, and never recovered their former greatness.

I bet you can see some obvious examples in the news of large technology companies who have been reading their own press releases. Where are their new products? Missing in action or underwhelming critics and customers. When your division manager starts talking this way, tell them to wake up and smell the innovation.