General optimization rules
This is a dangerous question, so let me try to phrase it carefully. Premature optimization is the root of all evil, but when you know you need to optimize, there is a basic set of rules worth knowing. That set is what interests me.
For example, imagine you have a list of several thousand items. How do you look up the item with a specific unique ID? You simply use a dictionary that maps IDs to items, of course.
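As a minimal sketch of that idea (the item records and field names here are made up for illustration):

```python
# Hypothetical data: a list of item records.
items = [{"id": 7, "name": "foo"}, {"id": 42, "name": "bar"}]

# Build the index once: O(n). Every lookup afterwards is O(1) on average,
# instead of an O(n) linear scan of the list.
by_id = {item["id"]: item for item in items}

def find(item_id):
    # Dictionary lookup instead of scanning the whole list.
    return by_id.get(item_id)
```

The one-time cost of building the dictionary pays for itself as soon as you do more than a handful of lookups.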
And if you know there is a value in the database that is needed all the time, you cache it rather than sending a database query for it a hundred times per second.
Or even something as simple as shipping a release build instead of a debug build in production.
I think there are a few more basic ideas.
I'm specifically not looking for "rule 1: don't do it; rule 2 (for experts only): don't do it yet" or "use a profiler", but for really simple, general advice. If you think that makes the question moot, you have probably misunderstood my intention.
Nor am I looking for advice specific to any of my projects, or for complex low-level tricks. Think of it as: how do you avoid the most important performance mistakes you made as a beginner?
Edit: This might be a good description of what I'm looking for: imagine creating a presentation (not a concrete practical example) of general optimization rules for people who have a solid technical background (say, a CS degree) but for some reason have never written a single line of code. Cover the most important aspects. Pseudocode is fine. Don't assume specific languages or even architectures.
- Minimize the number of network accesses
- Minimize the number of hard disk accesses
These are orders of magnitude slower than anything else your program can do, so avoiding them can matter a great deal. Typical techniques for this are:
- Caching
- Increasing the granularity of network and disk accesses (fewer, larger requests)
For example, B-trees are absolutely ubiquitous in database systems because they increase the granularity of disk access when traversing an on-disk index, so far fewer disk reads are needed per lookup.
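A minimal caching sketch in Python, where `fetch_config` and its call counter are hypothetical stand-ins for a slow database query or network request:

```python
import functools

CALLS = 0  # counts how often the "expensive" backend is actually hit

@functools.lru_cache(maxsize=None)
def fetch_config(key):
    # Stand-in for a slow database query or network round trip.
    global CALLS
    CALLS += 1
    return f"value-for-{key}"

# A hundred lookups of the same key hit the backend only once;
# the other ninety-nine are served from the in-memory cache.
for _ in range(100):
    fetch_config("timeout")
```

In real code you would also need an invalidation strategy (`lru_cache` here caches forever), but the basic shape is the same.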
I think something extremely important is to be very careful in all code that is executed often. This is usually the code in critical inner loops.
Rule 1: Know This Code
For this code, avoid all overhead. Small differences in runtime can have a significant impact on overall performance. For instance, if you implement an image filter, a difference of 0.001 ms per pixel adds a full second to the filter's execution time on a 1000x1000 image (which is not a large image).
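The arithmetic behind that estimate, spelled out:

```python
pixels = 1000 * 1000                       # a 1000x1000 image: one million pixels
per_pixel_ms = 0.001                       # extra cost per pixel, in milliseconds
total_s = pixels * per_pixel_ms / 1000.0   # milliseconds -> seconds
# total_s comes out to 1.0: one full extra second for the whole image
```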
Things to avoid / do in inner loops:
- do not go through interfaces (e.g. DB queries, RPC calls, etc.)
- don't jump around in RAM; try to access memory linearly
- if you need to read from disk, read big chunks outside of the inner loop (buffering)
- avoid virtual function calls
- avoid function calls / use inline functions
- use float instead of double if possible
- avoid numerical tricks if possible
- use ++a instead of a++ in C++
- iterate over pointers if possible
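As a sketch of the "read big chunks" point, here is byte-by-byte versus chunked reading of a file in Python (the scratch file is made up; both loops compute the same result, but the chunked version issues far fewer read calls):

```python
import os
import tempfile

# Create a small scratch file to read back (hypothetical data).
path = os.path.join(tempfile.mkdtemp(), "data.bin")
with open(path, "wb") as f:
    f.write(bytes(range(256)) * 1024)  # 256 KiB

# Slow pattern: one read() call per byte inside the inner loop.
slow_total = 0
with open(path, "rb") as f:
    while (b := f.read(1)):
        slow_total += b[0]

# Fast pattern: pull large chunks out of the I/O layer,
# then do the per-byte work entirely in memory.
fast_total = 0
with open(path, "rb") as f:
    while (chunk := f.read(64 * 1024)):
        fast_total += sum(chunk)
```

Both totals are identical; the difference is 262,144 system-level reads versus four.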
Second general piece of advice: every layer and interface has a cost. Try to avoid big stacks of different technologies, or the system will spend more time converting data between them than doing the actual work. Keep things simple.
And as others have said, use the correct algorithm: optimize the complexity of the algorithm first, before optimizing its implementation.
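A tiny illustration of complexity-first thinking: membership tests against a list are O(n) per query, while a set is O(1) on average, and no amount of micro-tuning the list loop closes that gap.

```python
ids = list(range(100_000))

# O(n) per query: scans the list from the front each time.
def contains_list(x):
    return x in ids

# One-time O(n) conversion, then O(1) average per query via hashing.
id_set = set(ids)

def contains_set(x):
    return x in id_set
```

Changing the data structure beats any micro-optimization of the linear scan.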
I know you are looking for specific coding hints, and they are easy to find: caching, loop unrolling, code hoisting, data and code locality, blah, blah ...
The biggest hint of all is: don't use them.
Would it help if I said, "It's the secret the almighty powers don't want you to know!"? Pick your almighty powers: Microsoft, Google, Sun, etc., etc.
Don't use them
Not until you know with dead certainty what the problems are, and then the coding hints are obvious.