General optimization rules
This is a dangerous question, so let me try to phrase it carefully. Premature optimization is the root of all evil, but when you know you need to optimize, there is a basic set of rules worth knowing. That set is what interests me.
For example, imagine you have a list of several thousand items. How do you look up the item with a specific unique ID? You simply use a dictionary that maps IDs to items, of course.
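As a minimal sketch of that idea (the item records and field names here are made up for illustration):

```python
# Hypothetical data: a list of item records.
items = [{"id": 7, "name": "foo"}, {"id": 42, "name": "bar"}]

# Build the index once: O(n). Every lookup afterwards is O(1) on average,
# instead of an O(n) linear scan of the list.
by_id = {item["id"]: item for item in items}

def find(item_id):
    # Dictionary lookup instead of scanning the whole list.
    return by_id.get(item_id)
```

The one-time cost of building the dictionary pays for itself as soon as you do more than a handful of lookups.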
And if you know there is a value in the database that is needed all the time, you cache it rather than sending a database query for it a hundred times per second.
Or even something as simple as shipping a release build instead of a debug build in production.
I think there are a few more basic ideas.
I'm specifically not looking for "rule 1: don't do it; rule 2 (for experts only): don't do it yet" or "use a profiler", but for really simple, general advice. If you think that makes the question moot, you have probably misunderstood my intention.
Nor am I looking for advice specific to any of my projects, or for complex low-level tricks. Think of it as: how do you avoid the most important performance mistakes you made as a beginner?
Edit: This might be a good description of what I'm looking for: imagine creating a presentation (not a concrete practical example) of general optimization rules for people who have a solid technical background (say, a CS degree) but for some reason have never written a single line of code. Cover the most important aspects. Pseudocode is fine. Don't assume specific languages or even architectures.
- Minimize the number of network accesses
- Minimize the number of hard disk accesses
These are orders of magnitude slower than anything else your program can do, so avoiding them can matter a great deal. Typical techniques for this are:
- Caching
- Increasing the granularity of network and disk accesses (fewer, larger requests)
For example, B-trees are absolutely ubiquitous in database systems because they increase the granularity of disk access when traversing an on-disk index, so far fewer disk reads are needed per lookup.
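A minimal caching sketch in Python, where `fetch_config` and its call counter are hypothetical stand-ins for a slow database query or network request:

```python
import functools

CALLS = 0  # counts how often the "expensive" backend is actually hit

@functools.lru_cache(maxsize=None)
def fetch_config(key):
    # Stand-in for a slow database query or network round trip.
    global CALLS
    CALLS += 1
    return f"value-for-{key}"

# A hundred lookups of the same key hit the backend only once;
# the other ninety-nine are served from the in-memory cache.
for _ in range(100):
    fetch_config("timeout")
```

In real code you would also need an invalidation strategy (`lru_cache` here caches forever), but the basic shape is the same.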
I think something extremely important is to be very careful in all code that is executed often. This is usually the code in critical inner loops.
Rule 1: Know This Code
For this code, avoid all overhead. Small differences in runtime can have a significant impact on overall performance. For instance, if you implement an image filter, a difference of 0.001 ms per pixel adds a full second to the filter's execution time on a 1000x1000 image (which is not a large image).
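The arithmetic behind that estimate, spelled out:

```python
pixels = 1000 * 1000                       # a 1000x1000 image: one million pixels
per_pixel_ms = 0.001                       # extra cost per pixel, in milliseconds
total_s = pixels * per_pixel_ms / 1000.0   # milliseconds -> seconds
# total_s comes out to 1.0: one full extra second for the whole image
```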
Things to avoid / do in inner loops:
- do not go through interfaces (e.g. DB queries, RPC calls, etc.)
- don't jump around in RAM; try to access memory linearly
- if you need to read from disk, read big chunks outside of the inner loop (buffering)
- avoid virtual function calls
- avoid function calls / use inline functions
- use float instead of double if possible
- avoid numerical tricks if possible
- use ++a instead of a++ in C++
- iterate over pointers if possible
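As a sketch of the "read big chunks" point, here is byte-by-byte versus chunked reading of a file in Python (the scratch file is made up; both loops compute the same result, but the chunked version issues far fewer read calls):

```python
import os
import tempfile

# Create a small scratch file to read back (hypothetical data).
path = os.path.join(tempfile.mkdtemp(), "data.bin")
with open(path, "wb") as f:
    f.write(bytes(range(256)) * 1024)  # 256 KiB

# Slow pattern: one read() call per byte inside the inner loop.
slow_total = 0
with open(path, "rb") as f:
    while (b := f.read(1)):
        slow_total += b[0]

# Fast pattern: pull large chunks out of the I/O layer,
# then do the per-byte work entirely in memory.
fast_total = 0
with open(path, "rb") as f:
    while (chunk := f.read(64 * 1024)):
        fast_total += sum(chunk)
```

Both totals are identical; the difference is 262,144 system-level reads versus four.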
Second general piece of advice: every layer and interface has a cost. Try to avoid big stacks of different technologies, or the system will spend more time converting data between them than doing the actual work. Keep things simple.
And as others have said, use the correct algorithm: optimize the complexity of the algorithm first, before optimizing its implementation.
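A tiny illustration of complexity-first thinking: membership tests against a list are O(n) per query, while a set is O(1) on average, and no amount of micro-tuning the list loop closes that gap.

```python
ids = list(range(100_000))

# O(n) per query: scans the list from the front each time.
def contains_list(x):
    return x in ids

# One-time O(n) conversion, then O(1) average per query via hashing.
id_set = set(ids)

def contains_set(x):
    return x in id_set
```

Changing the data structure beats any micro-optimization of the linear scan.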
I know you are looking for specific coding hints, and they are easy to find: caching, loop unrolling, code hoisting, data and code locality, blah, blah ...
The biggest hint of all is: don't use them.
Would it help if I said, "It's the secret the almighty powers don't want you to know!"? Pick your almighty powers: Microsoft, Google, Sun, etc., etc.
Don't use them
Not until you know with dead certainty what the problems are, and then the coding hints are obvious.