Coding is an art! Anyone can draw, but only those who draw with the elegant have their paintings valued. Similarly any programmer can code, but a Good Program is an artistic job. I see lot of people do code, but only few actually write Good code. This means they have understood the elegance of the language and knows how to use it.
Big Code – Good Code
Big Code, a Good Code? Sounds awkward? A Good code I mean here is a code which runs quicker than other solutions for the same problem.
if T(Sol1) < T(Sol2) Then, Solution 1 is a Good code., Where T is Time function , measures Time Taken
( though there are various other constraints involved, for time being let not discuss them )
Hardcore programmers are always tries to write short programs. They couple few lines of code into one and show that they have done a great coding. But, things are not always the way it looks. Short is sweet, but not always !
Look into this code, 3 solutions for one simple problem.
Problem: Find the greatest of two numbers a, b
Million Dollar Question
Which code runs faster? If you have chosen the A’s code as faster. You have correctly made a wrong decision.
The fastest program is the B’s program. So, 3 lines of code, still fastest code? Ok.
In the A’s code, consider if a < b, two operations should be performed.
- Compare ‘a’ , ‘b’
- Branch to Else if ‘a’ < ‘b’
- Return the expression ‘b’
But in B’s solution, we have already assigned a value to ‘x’. Only job is to compare and Assign. So the second operation is saved.
In such a small code, you won’t find a big difference. But think of a macro system with MLOC ( Million Lines of Code ).
I will explain you by solving a bit complex problem, come on join me.
Problem: Given number ‘N’ , split into ‘k’ integers, such that N = k1 + k2 + k3 +…kn and k1 * k2 * k3 is maximum.
You can try various solutions. One simple solution is
![]()
Great code, you would be proud to write one line solution of such great problem. But wait, is your code is faster to run for N > 10^10, K > 10^5?
Consider the optimize solution
More optimised
The last code works much faster than any other. Try looping it for million times with various values of ‘N & k ‘. and measure the time it takes to complete the million time execution.
How to write a speedy code?
So, this is yet another million dollar question. Take a look at the above three solutions. You will find a resemblance.
Solution – 1 is a generalized form of other two. Solution – 3 is more detailed form of Solution – 1.
So, what is been detailed in the Solution 3. If you look at more closely, you will find that Solution-3 has checks for ‘Boundary Conditions’.
Boundary Conditions are very important scenarios in any Logical System. You prove that system works perfect for boundary conditions, you prove the system to work for any values between it. Similar to Mathematical Induction. So, this applies for programming too.
If you are able to identify the boundary conditions, and make the appropriate formula for those conditions, then you will skip unnecessary computations.
We will look what have save the computational cycles.
Lets assume, the Input are N=10, K = 2
Using Solution-1:
residue = 0, mid = 5
maxProduct = Math.pow(5+1,0) * Math.pow( 5 ,(2-0) ) = 25
Operations performed:
Note: costs are just for calculation, not exact one
- Math.pow
- Addition – (5+1)
- Math.pow
- Subtraction – ( 2-0 )
- Multiplication
lets assume the cost for each operation is
Using Solution-3:
The code will branch out to the 1st if condition ( residue == 0 )
maxProduct = Math.pow( 5, 2) = 25
Operations performed
So, you have saved more than 55% of the computational cycles by covering up the boundary conditions.
So, Why you need a Speedy code?
Everyone in life loves speed. Speed bike, Speed cars ( that’s why you would love F1 ), A speedy 2 GB – Core 2 Duo Powered Machine.
Because Speed makes you feel Good !! So does a Speedy Code will be a Good code.
But, to achieve something you have to compromise with other factors, like I mentioned earlier.
If you need a 2 GB – Core 2 Duo Powered Machine, you should not consider the Money factor.
Similarly while you code, there are lot of other factors like
- Memory Management
- Security
- Transaction Isolations etc.,
This is just a introduction, will start covering up other topics soon.
14 Comments
Would be very helpful for those who start coding and intermediator’s .Things that they need to have always in mind when they are coding.Nice Tips!!!
Hi,
Checking the Boundary conditions is good when since it will reduce the process time if the input lies in boundary.but the short form of coding is appreciated to reduce the detailed logic,thats why mathematical formulas are developed.we can write unique formula for each input but the mathematical simplicity is to handle all the inputs.We can divide a formula into any numbers with boundary conditions hence to reduce the time take.So programming with handling boundary conditions is Good Programming but we can critisize Short programming.
Sorry that is we cant critisize Short programming…
Hi Senthil, You are right in a sense.
Short Programming is good but mathematical formulas are genralization of concepts.
But when it comes to programming, you have to split the problems into sub problems ( Divide & Conquer ), identify the boundries, so that you can solve it quick. That’s why we have the concepts of Dynamic Programming ( Divide & Con ).
Math can’t be used as it is in Programming.
Even I love short programming. Coz, short is sweet, but not always
Nice article and some nice point here and there.
But you have at least one bug:
if ( residue == 0 )
…
else if ( k == 1 && residue == 0 )
…
In you “else if” statement you check to see i residue is 0 again… but you checked that case in the first line. See?
A stupid bug in an article that goes into the details of statement, where every check and branch count. Here you can take out the entire else-if statement.
I can’t agree with your assessments. I spent 3 years doing assembly level optimization in the video game industry and working with output from Microsoft Visual C++ and GNU gcc, and I find your arguments specious.
Using C# (VS 2008 Express), I made a quick benchmark for your first example:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace TestShortCode
{
class Program
{
delegate int TestMethod(int a, int b);
static int Method1(int a, int b) {
int x = a < b ? a : b;
return x;
}
static int Method2(int a, int b)
{
int x = b;
if (a < b)
x = a;
return x;
}
static void Profile(TestMethod code)
{
Random r = new Random();
code(0, 0);
DateTime startTime = DateTime.Now;
for (int i = 0; i < 100000000; i++)
{
int a = r.Next(100);
int b = r.Next(100);
code(a, b);
}
DateTime endTime = DateTime.Now;
TimeSpan delta = endTime – startTime;
Console.WriteLine(delta);
}
static void Main(string[] args)
{
Profile(Method1);
Profile(Method2);
}
}
}
Running it a half-dozen times, I got consistently the same results:
00:00:09.9218750
00:00:09.9375000
I notice people tend to ASSUME they know what their compilers are doing, but modern compilers do some crazy-ass stuff even with optimization turned off.
Your first example is not a good one because it ignores two things: 1) you assume the compiler somehow does a “return” when in fact even in debug output it would generate case B, and 2) the branch is the single most expensive operation (writes go straight to the cache and on most CPUs have 0-cost writes, at least for single store instructions).
In VM contexts, the issue gets even more fuddled where the ad-hoc compiler can use profiling of your code to figure out which branch is more likely and arrange for the most common case to be the fall-through.
On to your major point: that is true that writing “bigger” code can lead to time savings–the most famous example is loop unwinding. But as Donald Knuth liked to say “Premature optimization is the root of all evil”. One should profile your code and _only_ optimize the hot-spots. My personal experience has given me the following strategy:
For slow code try these steps (in order):
1. Change the algorithm. In 3-D world rendering (like for Halo), back-face culling is a HELL of a lot faster than a faster polygon renderer.
2. I/O bound is I/O bound. If your CPU monitor is less than half-full but your HD light is solid, you’re definitely I/O bound. Reduce I/O access by using caching or memory mapping (<– most often overlooked speed-up for file-based processing). If you are connecting to a database, figure out where in the database pipeline is the slowdown. Most of the time, it’s the query. Be generous with indexing on SQL databases.
3. Watch your memory access. This is very CPU/System dependent, but generally CPU time is cheaper than memory accessing (at least for C/C++/Assembly programming). It can _sometimes_ be faster to calculate a value than to use a look-up table. There are exceptions, but one should look at the actual assembly generated before deciding this because compilers can play mind-games on you. In interpreted languages (like Perl & Python), look-up tables are faster than calculations.
@Bjarne, that was a silly mistake. I’ve corrected it. Thanks for pointing it. I’ve moved the “else if” as “if”.
@Travers
I’m glad that you tried it. I’m trying to explain how a Big code can’t be a Bad Code ( as normally people think ).
Here are my results when I tried while I was writing this post.
For IF Condition test
1 – Time: 37.5
2 – Time: 29.7
3 – Time: 31.2
For the Problem test
1 – Time: 42.2
2 – Time: 34.4
3 – Time: 32.8
Please find the code for above two tests at http://drop.io/mymindleaks. You will find two java files.
I strongly belive, Manual optimization is more effective than Compiler Optimization.
@Travers,
Also your points on improving the slow code is great!
The bit about (a > b ? c : d) makes no sense. Why does it check ab? All it should expand to is:
if (a > b)
//do code section ‘c’
else
//do code section ‘d’
I’m not saying you’re wrong, but a decent compiler should optimize that to exactly what you program, not insert a bunch of code that you have no use for.
Maybe Java and possibly even .NET has this issue (it may even be in the specs for the language(s) to do extra boundary checking because of the nature of the language(s)), but I haven’t found any issue with this in C++. I look at the Assembly code output by the compiler, and things are the same. I’m glad I don’t suffer from this issue in C++. It does exactly what I write rather than what it thinks I write. ^_^
Eh you write “But wait, is your code is faster to run for N > 1^10, K > 1^5″. In my book 1^10 = 1 and 1^5 is 1. I think you meant to write:
N > 10^10, K > 10^5
@rpgfan, a ‘CMP’ statement would be created in the assembly. Which makes the comparision. if the 1st condition fails, a BRANCH assembly statement would also be executed, which is another costlier statement .
The else-if test for ‘residue == k’ in your “more optimised” solution is unnecessary, as it will never be true.
In the same vein, the first test for ‘k == 1 && residue == 0′ is slightly redundant. If k == 1, then residue == 0 is always true.
I also feel that I should mention something else. The “optimalness” of your boundary-checking depends on the values of n and k that you test with. If your conditions of k == 1 and residue == 0 are frequently true, then you’ll probably see a good speedup. However, if both of those conditions are frequently false (allow n and k to vary over all positive integers, and see how often they’re true), then your solution has become sub-optimal, as you’re now checking for conditions that are rare. The bottom line is that checking for boundary cases only saves time if they occur frequently, and if the cost of checking for them is less than the cost of simply using the formula.
Some even use good grammar. There’s probably some relationship there, but I won’t pursue it.