[Freeipa-devel] Suggested additions to codding standard page on freeIPA wiki.

Sat Nov 1 01:55:56 UTC 2008

Some notes from me, sorry for the excessive length.

Dmitri Pal wrote:
> Simo Sorce wrote:
> > On Fri, 2008-10-31 at 15:56 -0400, Dmitri Pal wrote:
> >   
> >
> >> *Length of line: *
> >>
> >> Not more than 100-110 characters. Just keep it reasonable. Longer
> >> lines are harder to deal with due to different sized of monitors
> >> and different editors used.
> >>     
> >
> > I am for strict adherence to not more than 80 chars per line.
> > While I myself used to use up to 120 chars I found out that 80
> > lines is more usable in many situations, including debugging python
> > code/C code on a console, letting people use their tools of choice,
> > and yes, fitting 3 terminals in a row :-)
> >
> >   
> I think 80 is to short with the modern monitors and screens.
> We should not limit us but try no to be overwhelmingly long

I agree with Simo on this. Besides the reasons he stated, also think
about your eyes. Take your favorite book and count the number of
letters per line. It will be cca 66. That's because if it was more,
your eyes would be tired or hurt after long hours of reading (more
eyeball movement). Also, this convention is held by almost all
projects I can think of and also our preferred style for python code.
I don't have problems fitting my code into 80 characters even with 8
spaces of indentation, with 4 it will be very easy.

> >> *Module structure: *
> >>
> >> The module should be structured in the following order:
> >> * Copyright boilerplate
> >> * Includes

Let's also say that includes should be grouped properly. Standard
headers and local headers should definitely be separated by a blank
line. Other logical grouping should be reasonably done if needed. Files
inside the groups should be sorted alphabetically, unless a specific
order is required - this however is very rare, and must not happen with
our own headers.

Also, one shouldn't depend on the fact that one header file includes
other one, unless it is really obvious and/or desirable, like in cases
when one header file practically "enhances" the other one, for example
with more error codes, etc.

> >> * Local defines – used only inside module. Global/shared defines
> >> should be in the header
> >> * Local module typedefs - Global/shared typedefs should be in the
> >> header 
> >
> > I would like to discourage use of typdefs unless really necessary,
> > like when a very basic type is defined. I find the code is much more
> > comprehensible if a struct is immediately recognizable as such as
> > well as pointers.
> >
> >   
> I think this is should be formulated as preference with suggestion
> not to use typedefs.

I like using typedefs if the convention is to always end them with _t.
However, this way you sometime need both struct and typedef (self
reference, as in case of linear lists for example). I also like structs
alone and this approach is much more clearer, because there's no need
to mix typedefs and structs. And it's always apparent at first glance
that we're dealing with a struct as Simo noted, so I'm vouching for
structs.

Also, let's not forget that hiding a pointer behind a typedef is a
mortal sin.

> >> * Module variable declarations – variables that are global for the
> >> module. 
> >
> > Global variables should be discouraged as well, they make for very
> > poor code. Should be used only if no other way can be found. They
> > tend to be not thread/async safe
> >
> >   
> Agree. But if there are some they should be covered.

Yup, global variables should be used only if necessary and when used in
a threaded code should be lockable. But let's not be religiously
against their use as many are in case of goto.. Anyway, they should be
mentioned and I think that everyone knows he should use them only when
really needed.

> >> * Declarations of external functions
> >>     
> >
> > Should not be needed, usually headers should contain foreign
> > functions definitions.
> >
> >   
> Agree. The word usually is crucial here. Sometimes you have
> collisions on the headers in this case you need to include things
> explicitly. But this is more an indication of the bad header design.

I agree with Dmitri, sometimes it just happens.

I wanted to tease Simo by mentioning ipa_pwd_extop.c, line 125, but
then I realized I was the one who added the declaration :-) Oh well,
but still better than to use the function without the declaration at
all..

> >> * Declarations of static functions
> >>     
> >
> > I find it much easier to declare functions only if they need to be
> > referenced before the body of the function is defined. Otherwise
> > just use the ordering in the file to avoid declarations.
> >
> >   
> I agree but this should be a preference not a rule.
> 
> >> * Implementation of the exposed functions from the module
> >> * Implementation of the static functions
> >>     
> >
> > Given the above comment, static functions usually come first.
> >
> >   
> I kind of disagree. This should be a preference.
> 
> 
> >> It is not required but recommended to group and position functions
> >> in the module so that it is easier to understand.
> >>     
> >
> > This is very controversial, for me easier to understand is when the
> > innermost function is first in the file and the outermost is last,
> > but for some that seem an unnatural bottom to top ordering.
> >
> >   
> No it is not. I think this allows you to follow your natural flaw of 
> logic and me mine.
> I prefer implementing the functions that are main to the module first 
> and stub out the functions that I need.
> So naturally I read the code the same way. I do not care about the 
> details and low level functions that act as helpers.
> I want to see and concentrate on the core functionality and not dig
> it into the module to find it scattered somewhere in the middle or in
> the bottom.
> 
> I guess this is a personal preference so I would suggest leaving it
> as it was spelled.

On this subject, I strongly agree with Dmitri. I always found it
unnatural that main function is on the bottom, it just doesn't make
sense. I suspect people do this only because they are lazy to use
forward declarations. This is not a good way. Simo please reconsider,
I'm sure you will agree with me that code is much more read than
written. By this logic, I think that if we can get a more easily
readable code by making just a little more effort, it is worth it. And
I definitely think this is more readable as it follows logic. You
basically only need to read the main function to get the idea what the
program does. You can then just go the way down as you'd read a book to
get into more details of the implementation.

Maybe you don't like the prototypes being on the beginning of the file?
You can quickly skip them and read what's after them. Additionally,
they serve as a nice reference. I actually always used to write the
prototypes on the top. Also, if by mistake your declaration is
different than the definition, the compiler will tell you.

> >> *Comments: *
> >>
> >> Sections (groups of functions) can be separated with a line of
> >> comments like this:
> >>
> >> /************ Main Interface Functions ****************/
> >>
> >> Or this
> >>
> >> /********* Static Internal Functions **********/
> >>     
> >
> > Is it really necessary to state this? Static functions are usually
> > pretty obviously static ... Also I usually prefer to keep static
> > functions related to a specific public function close to said
> > function so that there is no need to jump around the file when
> > checking what a related function does.
> >
> >   
> Yes. I think it will. It will help you with your approach to
> processing code to read the code I write and vice versa.
> These things are very helpful when you face the code that is
> organized in the way it is not intuitive for you to digest.

I agree with Dmitri, it's easier to find stuff in a file that has
sections.

Also (sorry about the wrapping made by my email client):

/*
 * VERY important single-line comments look like this.
 */

/* Most single-line comments look like this. */

/*
 * Multi-line comments look like this. Make them real
sentences. Fill
 * them so they look like real paragraphs.
 */

So ideally, no
/* Multiline comments
   that look like this */

I always disliked any ascii art in my files, I always found them
distracting. Couldn't we just separate the sections with the "VERY
important" single-line comments? If one wants to really distinguish the
comment from the others, he might add some empty lines around.

> >> Each function must have a comment describing what this function
> >> does and returns.
> >> There is no specific format for this but it is a good practice to 
> >> describe what parameters mean and what return codes function
> >> returns under what conditions.
> >>
> >> Inside the function comments should help reader to digest the
> >> logic of the function. They should answer the main question “why
> >> is this line of code here and what it tries to accomplish?”. Not
> >> everything should be commented but practice shows that comments
> >> giving the hint about what the developer what trying to accomplish
> >> with this or that construct usually turn out to be pretty helpful.
> >>     
> >
> > This is fine as long as we don't have things like:
> >
> > int i = x +1; /* add 1 to x and assign it to i */
> >
> > Let's avoid useless comments :-)
> >
> >   
> 
> Agree.

+1, the general logic should be:
Comment on *what* your code does, not *how* it does it. If you really
must write how it works, you should probably rewrite it.

> >> Do not use Hungarian notation (prefixing variable with the type).
> >> The only exception to the rule is pointers. The following example 
> >> illustrates the case:
> >>
> >> void function_that_does_something(data_set inventory) {
> >> int number_items =0;
> >>
> >> err=get_number_of_items_in_set(inventory, number_items);
> >> ...
> >> }
> >>
> >> int get_number_of_items_in_set(data_set inventory, int*
> >> p_number_items) { ...
> >> }
> >>     
> >
> > This definition is redundant, int * already defines the variable as
> > pointer. Please let's avoid the insane Hungarian Notation
> > completely.
> >
> >   
> >> In this case the variable in the second function is related to the 
> >> variable in the calling function but it is a pointer. This is
> >> useful since naming variables same will cause confusion with type
> >> (especially if there will be cutting and pasting of code which I
> >> use to do a lot), and naming them differently will break the
> >> relation. Like having number_items in one function and item_count
> >> in the other and one is actually a pointer to another is hard to
> >> work with. Prefixing with "p_" solves the issue nicely.
> >>     
> >
> > No, the compiler solves any issue nicely, it will tell you if you
> > are using the wrong type.
> >
> >   
> 
> There are cases when these things caused a lot of trouble. Not 
> explicitly indicating pointers where variables are both pointers and 
> just variables IMO makes the code much harder to read and maintain.
> I just makes things harder for no value but if others insist I would
> not resist. Any other opinions?

I agree there shouldn't be any Hungarian notation (and considering that
my name is "Nagy", that's saying a lot!). Only exceptions should be
cases similar to these:

size_t strlen(const char *str);
void free(void *ptr);

Also, no one can complain about:
size_t size;

Which reminds me of another thing: If appropriate, always use the
const modifier for pointers passed to the function. This makes the
intentions of the function more clearer, plus allows the compiler to
catch more bugs and make some optimizations.

> >> *Names of the functions: *
> >>
> >> Looking at the code (server.c for example) I have seen all sorts
> >> of names for functions used in one module:
> >>
> >> setup_signals(void)
> >> BlockSignals(true, SIGPIPE);
> >> poptGetNextOpt(pc)
> >>
> >> We should try to stick to one style which will be low case, multi
> >> word, _ separated like:
> >> monitor_task_init(struct task_server *task);
> >>     
> >
> > The later is the preferred form, poptGetNextOpt and BlockSignals
> > come from external libraries, so we have no control over them.
> >
> >   
> Yes.

+1, consistency in general should always be preferred, unless the
breaking of it can actually make the code more cleaner and easier to
read.

> >> * Declarations *
> >> Major declarations such as typedef's, stuct tags, etc. deserve to
> >> be identifiable on their own in some fashion.
> >> Often this was done with an initial cap, but since we're using 
> >> lower_case_with_underscores then I think the convention of
> >> appending "_t" is a good way to go and is consistent with the
> >> Posix and kernel conventions.
> >>     
> >
> > Only if we are defining a new type, something very rare. Again I am
> > for avoiding typedefs altogether except for rare cases.
> >
> > If I need a structure to hold something a bout a socket this is how
> > it should be defined:
> >
> > struct socket_stuff {
> >     int foo;
> >     long bar;
> >     struct baz;
> > };
> >
> > no typdefs.
> >
> >   
> Well... I kind of disagree but I can live with that.

Again, preferably structs as noted above.

> >> Other name formats should not be used unless it is an external
> >> function we do not have control over.
> >>
> >> The wiki page will be updated based on this discussion.
> >>     
> >
> > Thanks.
> >
> >
> > I'd like to add some more style related conventions I tend to use.
> >
> > Function definitions:
> >
> > Always define a function on one line with the opening parenthesis
> > on the next line.
> > Example:
> > struct foobar *foo_bar_fn(int foo, int bar)
> > {

I disagree with the position of the type. There are a lot of types with
variable length and with both the type and function name on one line,
it's hard to quickly find the name of the function. In contrast, it is
much easier to find it when it's on the beginning of a line and the
type and name are easily distinguishable. IMO it also makes it easier
to spot the header of a function. One disadvantage I can think of is
the case when you need to know the return type in the debugger. But
this IMO doesn't happen often and so I think it is not that important.

Are there any arguments against?

> > Do not put the "{" on the same line or the return type on a previous
> > line like.
> >
> >   
> 
> +1
+1

> > Example of NOT ok declaration:
> > struct foobar *
> > foo_bar_fn(int foo, int bar) {

As I mentioned above..

> > if the function declaration does not fit 80 chars, put arguments
> > after the first on following lines and indent them to start after
> > the "("
> >
> > Example:
> > int very_long_function_name_that_makes_arguments_not_fit_80(int foo,
> >                                                             int bar)
> > {
> +1

+1

> > Do not put spaces before or after parenthesis:
> >
> > OK:  int foo(int bar, int baz);
> > NOT OK: bad ( arg1 , arg2 );
> >
> >
> >   
> +1

+1, but we should use space after standard C keywords, so:
OK:
if (expr)
while (expr)

NOT OK:
if(expr)
while(expr)

etc.

> > Avoid complex variable initializations (like calling functions) when
> > declaring variables like:
> >
> > 	int foobar = get_foobar(baz);
> >
> > but split it in:
> > 	int foobar;
> >
> > 	foobar = get_foobar(baz);
> >
> >   
> +1

+1, and don't initialize static or global variables to 0 or NULL.

> > This makes it simpler to follow code in a debugger, and also avoid
> > forgetting to test the return value of the function for errors.
> >
> >
> > Always declare all variables at the top of the function, normally
> > try to avoid declaring local variables in internal loops.
> >
> >   
> +1

Bjarne Stroustrup would concur. It's often much better to have local
variables then to have 10 of them at the top of the function, regular
human can't keep the track of them. Then again, it might be the case
when breaking the function into smaller pieces is desirable.

I would say the truth is somewhere in the middle. I wouldn't define
this as a strict rule, only recommendation. The decision should be made
in the interest of clarity and readability, and may wary from case to
case.

> > Use only C style comments /* blah */ and not C++ style comment: //
> > blah
> >
> >
> >   
> +1

+1

> > Add to these the coding style predicaments already available on
> > freeipa.org :-)
> >
> >
> >   
> I am planning to add this back to freeIPA.org
> 
> Any other opinions and suggestions?

Please let's wait some time so that other people get the chance to
state their opinions.

Some additional rules I think would be useful:

Macros that are unsafe should be in upper-case. This also applies to
macros that span multiple lines:

#define MY_MACRO(a, b) do {                       \
             foo((a) + (b));                      \
             bar(a);                              \
} while (0)

Notice that arguments should be in parentheses if there's a risk, but
in case of bar(a) this isn't the case (caution however won't hurt).
Also notice that a is referenced two times, and hence the macro is
dangerous. Wrapping the body in do { } while (0) makes it safe to use
it like this:

if (expr)
    MY_MACRO(x, y);

Notice the semicolon is used after the invocation, not in the macro
definition. Also notice the position of \'s.

Otherwise, if a macro is safe (for example a simple wrapping function),
then the case can be lower-case. Reasoning is that too much upper-case
looks ugly.

When using #ifdefs, it's nice to add comments in the pairing #endif:

#ifndef _HEADER_H_
#define _HEADER_H_

/* something here */

#endif /* !_HEADER_H_ */

or:

#ifdef HAVE_PTHREADS

/* some code here */

#else /* !HAVE_PTHREADS */

/* some other code here */

#endif /* HAVE_PTHREADS */

Try to align members of union and struct definitions, but not at all
costs:

struct some_data {
    int                size;      /* Allocated bytes. */
    char               *buffer;   /* Buffer containing the names. */
    struct some_data   *next;     /* Next member. */
    struct some_name_that_is_very_long  var;  /* Won't fit. */
};

Creating lists and queues was already done a lot of times. When
possible, use some common functions for manipulating these to avoid
mistakes.

Don't use binary not (!) in tests, unless you test for a boolean:
good: if (*str = '\0')
bad:  if (!*str)

Switches:

switch (var) {
case 0:
    break;
case 1:
    printf("meh.\n");
    /* FALLTHROUGH */
case 2:
    printf("2\n");
    break;
default:
    /* Always have default */
    break;
}

This is all from me right now.

Martin