Thomas Touhey: Why the Unix return code is an int
On Mastodon, Koz Ross (on quitter.se, his instance, which has disappeared since this post was published) asked:
UNIX historians, why does main return a *signed* int? As far as I’m aware, exit statuses can only be positive or 0.
I first wrote a quick answer, but then decided to look up some sources, and the result eventually got too big to fit into a Mastodon post; hence this article.
One of the reasons is that the unsigned keyword did not exist in C until the mid-1970s, as hardware did not support unsigned operations until then, and Unix had already been around for several years. But the “when should I use an unsigned type” question is still asked frequently today.
First of all, you shouldn’t use unsigned for everything that can’t be negative. Here’s what Peter van der Linden says in Expert C Programming:
Avoid unnecessary complexity by minimizing your use of unsigned types. Specifically, don’t use an unsigned type to represent a quantity just because it will never be negative (e.g. “age” or “national_debt”). Use a signed type like int and you won’t have to worry about boundary cases in the detailed rules for promoting mixed types.
Only use unsigned types for bitfields or binary masks. Use casts in expressions, to make all the operands signed or unsigned, so the compiler does not have to choose the result type.
Mixed-type promotion is what happens when two integers of different types have to be converted to a common type before an operation can be applied to them, such as when you multiply a signed int by an unsigned int. Also, keep in mind that the unsigned version of a type only gives you one more bit: for example, as int is only guaranteed to be at least 16 bits wide, you can portably store values up to 32767 in it, and up to 65535 in an unsigned int (use long/unsigned long if you need 32 bits in a portable fashion).
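To make the promotion issue concrete, here is a minimal sketch of the multiplication case. The function names mixed_product and signed_product are made up for this illustration, and the second function assumes that long long is wide enough to hold the product of any int and any unsigned int, which holds on common platforms with a 32-bit int but is not guaranteed by the C standard.

```c
/* Hypothetical helper: multiply a signed int by an unsigned int with no
 * casts. The usual arithmetic conversions turn `a` into an unsigned int
 * first, so a negative `a` wraps around to a very large value. */
unsigned int mixed_product(int a, unsigned int b)
{
    return a * b;
}

/* Hypothetical helper following the advice quoted above: cast both
 * operands to the same signed type so that the compiler does not have to
 * choose the result type.
 * Assumption: long long can hold the product (true with a 32-bit int). */
long long signed_product(int a, unsigned int b)
{
    return (long long)a * (long long)b;
}
```

With these definitions, mixed_product(-1, 2u) yields the largest unsigned int value minus one instead of -2, while signed_product(-1, 2u) yields -2 as one would expect.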
int is used in ctype.h as well, as passing an int to a function is usually quicker than passing an unsigned char (usually because the caller has to pass it as an int anyway, and the called function then has to apply the & 255 mask to it).
Now, you may ask: yeah, but what if I want to return 32768 as a return code? Well, you can’t. Here’s what the Single Unix Specification (basically POSIX) says about it, in section 2.13, Status Information:
Status information is data associated with a process detailing a change in the state of the process. It shall consist of: […]
If the new state is terminated:
- The low-order 8 bits of the status argument that the process passed to _Exit(), _exit(), or exit(), or the low-order 8 bits of the value the process returned from main().
So only values from 0 to 255 make it through as the exit code, a range that int can hold in a portable fashion. And because of how the exit code is defined, the value the parent sees is simply return_code & 255.
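The truncation can be observed directly. The following is a minimal sketch, assuming a POSIX system; observed_exit_status is a name made up for this illustration. It forks a child that exits with the given status, then returns what the parent gets back through waitpid() and WEXITSTATUS().

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with `status`, and return
 * the exit status seen by the parent, or -1 if something went wrong. */
int observed_exit_status(int status)
{
    int wstatus;
    pid_t pid = fork();

    if (pid < 0)
        return -1;      /* fork failed */
    if (pid == 0)
        _exit(status);  /* child: terminate right away with `status` */

    /* Parent: wait for the child and check that it exited normally. */
    if (waitpid(pid, &wstatus, 0) < 0 || !WIFEXITED(wstatus))
        return -1;

    /* Only the low-order 8 bits of `status` are reported here. */
    return WEXITSTATUS(wstatus);
}
```

Calling observed_exit_status(32768) or observed_exit_status(256) should give 0, observed_exit_status(257) should give 1, and a child exiting with -1 should be seen as 255, which matches return_code & 255 in each case.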