Red Hat Speaks on gcc 2.96-RH

Red Hat chose to ship a branch of the public GNU Compiler Collection (gcc) when it launched the Red Hat Linux 7 series. The choice was controversial, but it was made for specific reasons. This is an overview of the key issues that led Red Hat's engineering organization to use gcc 2.96-rh.

  • gcc 2.96 is more standards compliant than any other version of gcc released at the time Red Hat made this decision.

    It may not be "standards compliant" as in "what most others are shipping", but 2.96 is almost fully ISO C99 and ISO C++ 98 compliant, unlike any previous version of gcc.

  • gcc 2.96 has more complete support for C++. Older versions of gcc could handle only a very limited subset of C++.

    Earlier versions of g++ often had problems with templates and other valid C++ constructs.

  • Most of gcc 2.96's perceived "bugs" are actually broken code that older gccs accepted because they were not standards compliant or, to use another term for the same thing, buggy.

    A C or C++ compiler that doesn't implement the standardized language is buggy, and that is not a feature.

    The initial release of gcc 2.96 did contain a few genuine bugs. All known ones have been fixed in the version available from updates. Those early bugs don't make the whole compiler broken, though: there has never been a 100% bug-free compiler, or any other 100% bug-free non-trivial program.

    The current version can be taken from Red Hat Linux 7.3. It will work without changes on prior 7.x releases of Red Hat Linux.

  • gcc 2.96 generates better, more optimized code.

  • gcc 2.96 supports all architectures Red Hat currently supports, including ia64. No other compiler can do this. Maintaining a different compiler for every architecture is a development (find a bug, then fix it four times), QA, and support nightmare.

  • The binary incompatibility issues are not as bad as is sometimes rumored.

    First of all, they affect dynamically linked C++ code only. If you don't use C++, you aren't affected. If you use C++ and link statically, you aren't affected.

    If you don't mind depending on a current glibc, you can also link statically against the C++ libraries while linking dynamically against glibc and any other C libraries you use:

        g++ -o test test.cc -Wl,-Bstatic -lstdc++ -Wl,-Bdynamic

    (Thanks to Pavel Roskin for pointing this out.)

    Second, the same issues have appeared with every major release of gcc so far. gcc 2.7.x C++ is not binary compatible with gcc 2.8.x. gcc 2.8.x C++ is not binary compatible with egcs 1.0.x. egcs 1.0.x C++ is not binary compatible with egcs 1.1.x. egcs 1.1.x C++ is not binary compatible with gcc 2.95. gcc 2.95 C++ is not binary compatible with gcc 3.0.

    Besides, the incompatibility is easy to work around. Either link statically, or distribute libstdc++ with your program and install it if necessary. Because it has a different soname, it can coexist with other libstdc++ versions without causing any problems.

    Red Hat Linux 7 also happens to be the first Linux distribution to use the current version of glibc, 2.2.x. This update is not binary compatible with older distributions either (unless you update glibc there, and nothing prevents you from updating libstdc++ at the same time). If you want to distribute something binary-only, link it statically and it will run everywhere.

Here is a list of common constructs that gcc 2.96 either refuses to compile or compiles with unexpected results (some of these snippets are taken from actual "bug" reports), along with why each one is broken code rather than a compiler problem, and how to fix it:

Language: C

Broken because...

Arguments passed through "..." undergo the default argument promotions: according to ISO C99, char and short are promoted to int (and float to double), so va_arg can never be asked for a short.

Broken Code


num = va_arg(args, short);
	

Correct Code


num = (short)va_arg(args, int);
or simply
num = va_arg(args, int);
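
As a fuller illustration of the rule above, here is a minimal sketch of a variadic function whose callers pass short values. The function name sum_shorts and its count parameter are made up for this example; the only point is that va_arg reads an int and the result is narrowed afterwards.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical helper: sums `count` values that the caller passes as short.
 * Because of the default argument promotions, each one arrives as an int. */
long sum_shorts(int count, ...)
{
	va_list args;
	long total = 0;
	int i;

	assert(count >= 0);
	va_start(args, count);
	for (i = 0; i < count; i++) {
		/* Read the promoted type, then narrow it back explicitly. */
		short value = (short)va_arg(args, int);
		total += value;
	}
	va_end(args);
	return total;
}
```

A caller can pass short variables directly, for example sum_shorts(2, x, y) with x and y declared as short; the promotion to int happens automatically at the call site.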
	

Broken because...

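This is not supposed to work: it breaks the ISO C aliasing rules, because an int object may not be accessed through a pointer to float, and the optimizer is allowed to assume that the store never changes i. Older gcc versions could deal with it because they were not optimizing well.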

Broken Code


int i;
*(float *)&i = 2.0;
return i;
	

Correct Code

Don't do this. If you really need the bit pattern of a float, copy it with memcpy().
Quick fix: compile with -fno-strict-aliasing
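
If the goal is to look at the bit pattern of a float, one way to do it without breaking the aliasing rules is to copy the bytes with memcpy(). The sketch below assumes that float and int have the same size; the helper name float_bits_as_int is invented for this example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns the object representation of `value`
 * reinterpreted as an int, without casting between pointer types. */
int float_bits_as_int(float value)
{
	int bits;

	/* Assumption: float and int have the same size on this platform. */
	assert(sizeof bits == sizeof value);
	/* memcpy copies raw bytes, so no object is accessed through an
	 * lvalue of the wrong type. */
	memcpy(&bits, &value, sizeof bits);
	return bits;
}
```

On a platform with IEEE 754 single-precision floats and a 32-bit int, float_bits_as_int(2.0f) returns 0x40000000.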

Broken because...

The compiler is free to add arbitrary padding between struct members. There is no guarantee whatsoever that s->b will be stored at (char *)&s->a + sizeof(int).

Broken Code


struct a {
	int a;
	int b;
};
[...]
struct a *s=(struct a *)malloc(sizeof(struct a));
memset((char *)s + sizeof(int), 0, sizeof(int));
assert(s->b==0);
	

Correct Code

Fix the code: access the member by name (s->b = 0;), or use offsetof(struct a, b) if a byte offset is really needed.
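
One possible fix is sketched below: if a byte offset is really required, let offsetof() from <stddef.h> compute it instead of assuming the layout. The helper name zero_member_b is made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct a {
	int a;
	int b;
};

/* Hypothetical helper: clears member b through a byte offset.
 * In most code, plain `s->b = 0;` is the simpler and better choice. */
void zero_member_b(struct a *s)
{
	assert(s != NULL);
	/* b can never start before the end of a, but it may start later. */
	assert(offsetof(struct a, b) >= sizeof s->a);
	/* offsetof yields the real offset of b, including any padding after a. */
	memset((char *)s + offsetof(struct a, b), 0, sizeof s->b);
}
```

The cast to char * matters: pointer arithmetic on a struct a * advances in units of whole structs, not bytes.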



Language: C++

Broken because...

Silently dropping a const qualifier is tolerated (but not nice) in C, where compilers merely warn about it. In C++ it is an error.

Broken Code


int main() {
	void *a;
	const void *b;
	a=b;
}
	

Correct Code


int main() {
	void *a;
	const void *b;
	a=(void*)b;
}
	

Broken because...

ISO C++ 98 defines or as an alternative token for ||, therefore compliant compilers parse the code as int ||=1;.

Broken Code


int or=1;
	

Correct Code


int o_r=1;
	

Broken because...

The function's definition carries a different exception specification than its declaration (declarations are commonly in header files, so this is usually not as easy to spot as it is here). This is as wrong as declaring int some_func(int i); and then defining int some_func(char *i) { }.

Broken Code


int some_func() throw(someException);
int some_func()
{
}
	

Correct Code


int some_func() throw(someException);
int some_func() throw(someException)
{
}
	

Broken because...

According to the C++ standard: A template parameter shall not be redeclared within its scope (including nested scopes).

Broken Code


template <class TYPE>
class a
{
	template<class TYPE>
	friend ostream &operator <<(ostream &os, a<TYPE> &b);
};
	

Correct Code


template <class TYPE>
class a
{
	template<class T>
	friend ostream &operator <<(ostream &os, a<T> &b);
};
	

Broken because...

C++ forbids using functions that don't have a prototype.

Broken Code


int main()
{
	exit(0);
}
	

Correct Code


#include <stdlib.h>
int main()
{
	exit(0);
}
	

Broken because...

ISO C++ 98 defines xor as an alternative token for ^, therefore compliant compilers parse the code as #if !defined(^) [...].

Broken Code


#if !defined(xor)
#define xor ^
#endif
	

Correct Code

Don't use xor. ^ is supported by both older and newer compilers, and checking for a compliant compiler with defined(xor) may work for some compilers, but is definitely not required by the standard.

Broken because...

The definition of enum aaa is private to class a, so class b can't use it.

Broken Code


class a {
private:
	enum aaa { TEST, TEST1, TEST2 };
	void xyz(aaa x);
};
class b {
	void xyz(a::aaa x);
};
	

Correct Code


class a {
public:
	enum aaa { TEST, TEST1, TEST2 };
private:
	void xyz(aaa x);
};
class b {
	void xyz(a::aaa x);
};
	

Broken because...

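Not valid ISO C++: the class keyword belongs only in the template parameter list. In the argument list of A<...>, the parameter is referred to simply as X.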

Broken Code


template <class X> void *A<class X>::operator new(size_t a) { }
	

Correct Code


template <class X> void *A<X>::operator new(size_t a) { }
	

Broken because...

A template and its instantiation are not the same class according to ISO C++ 98, so C<A> has no access to the private members of C<B>.

Broken Code


class A {
public:
	typedef int a;
};
class B;
template <class A>
class C {
	typedef A b;
public:
	void c() { C<B>::b::a i; };
};
	

Correct Code


class A {
public:
	typedef int a;
};
class B;
template <class A>
class C {
public:
	typedef A b;
	void c() { C<B>::b::a i; };
};
	



Language: C/Assembler

Broken because...

While this is not exactly commonly found code, it's one of the perceived gcc 2.96 "bugs" most talked about.

This code used to be in MPlayer. It caused MPlayer to miscompile, and led its maintainers to add loud and blatantly wrong statements to their documentation.

The reason this code miscompiles (only the packuswb %%mm0, %%mm0 instruction is executed) is that recent versions of gcc, starting with 2.96, support both the Intel and AT&T variants of x86 assembly.

The pipe character is an actual symbol in the Intel variant, so it must not be used in asm() constructs, not even in comments inside them.

This has since been fixed in the MPlayer code - unfortunately its maintainers didn't remove their unjustified comments about gcc 2.96 at the same time.

Broken Code


asm( [...]
"packuswb %%mm0, %%mm0 # B6 B4 B2 B0 | B6 B4 B2 B0\n\t"
"packuswb %%mm1, %%mm1 # R6 R4 R2 R0 | R6 R4 R2 R0\n\t"
"packuswb %%mm2, %%mm2 # G6 G4 G2 G0 | G6 G4 G2 G0\n\t"
[...]
)
	

Correct Code


asm( [...]
"packuswb %%mm0, %%mm0 # B6 B4 B2 B0  B6 B4 B2 B0\n\t"
"packuswb %%mm1, %%mm1 # R6 R4 R2 R0  R6 R4 R2 R0\n\t"
"packuswb %%mm2, %%mm2 # G6 G4 G2 G0  G6 G4 G2 G0\n\t"
[...]
)
