重温K&R(2)

daybreakcx posted @ 2010年2月08日 02:08 in 学习笔记 , 3208 阅读

《The C Programming Language》第二章是讲类型，操作和表达式的，题目没有第一章的多，但是都很不错。

Exercise 2-1. Write a program to determine the ranges of char, short, int, and long variables, both signed and unsigned, by printing appropriate values from standard headers and by direct computation. Harder if you compute them: determine the ranges of the various floating-point types.
解：这个题目是要你用一个程序来算char，short，int和long的范围，你可以用标准头文件里头的宏，也可以自己算，更进一步，你可以算一下浮点类型的最大值。首先是头文件里头定义的常量，我们可以从”limites.h“里头获取我们所要的信息，这都很容易计算，当然要注意的一点就是如果无符号数则要用“%u”等输出。

然后是另一条路线：计算最大值。对于像char那类的数值，我们可以用循环，每次+1直到+1之后数值比本身小，那么这次就是最大值了，但是这个在位数比较多的时候就行不通了，比如32位的情况下，等待时间非常的长。

于是我们需要换一个思路，能否直接构造出最大值。我们知道数值是以补码的形式存储于计算机中的，最高位为符号位，那么对于有符号数，我们只要将最高位设0,其余置1即我们最大值，最高位置1,其余设0就是最小值。对于无符号数，最小值就是0,最大值则是全1。

接着我们需要面对的问题就是如何去构造这个01序列，我们很自然的想到了-1,-1的表示为各个位上均为1（补码），那么有符号最大值就是-1逻辑右移一位，最小值就是最大值+1。对于无符号数，表示最大值就是-1本身，这样我们就得到我们要的结果了。

接下来剩下的问题就是浮点数的表示了，对于浮点数遵循IEEE 754,比如float我的机子上是32位数,0位为符号位，1~8位为指数位，9~31位为数值位，我们可以直接构造出最大值，而最大值和最小值是相反数，我们只要算其中一个就行了（变0位可以得到对方）。那么这个数值到底该是多少呢？由于数值位是原码表示的，我们全设1即可，剩下就是让指数尽量大了，指数对于浮点数来说要进一步计算，具体参照IEEE的文件，这里要说的一点是能否为指数取全1,答案是否定的，因为它是预留的。浮点数预留了如inf还有NaN（Not a Number）等数值，而指数全1就是NaN。对于double，是64位的浮点数，其中符号1位，指数15位，其余的是数值。至于long double类型，由于gcc为了4字节对其将其设置为了96位，而一般long double为80位的，我的程序里头并没有对于long double进行处理，有兴趣的情自行尝试。有了以上知识构造就很容易了，但是问题还没有结束。

是否我们可以很顺利地构造出那个数值呢？你如果按照自己所想去直接编码某些时候会出问题，得不到想要的数值，比如我原先就那么干了，得到的数很离谱，仔细分析下，将结果16进制打印出来，问题很容易得到解决。由于我的机子是小端存储的，所以我置位的顺序是反的比如我构造float的最大值，我用0x7fffffff，但是实际上如果我按照数组顺序开始输入，得到的值将是0xffffff7f，这个根据自己的机子情况调整一下即可。

#include <stdio.h>
#include <string.h>
#include <limits.h>
union Float {
	float fa;
	unsigned char f[4];
}f;
union Double {
	double da;
	unsigned char d[8];
}d;
int main ()
{
	printf("The size of \"char\" is %d, its range is [%d, %d].\n", sizeof(char), CHAR_MIN, CHAR_MAX);
	printf("Compute Range: [%d, %d].\n", (char)((((unsigned char)(-1)) >> 1) + 1), ((unsigned char)(-1)) >> 1);
	printf("The size of \"short\" is %d, its range is [%d, %d].\n", sizeof(short), SHRT_MIN, SHRT_MAX);
	printf("Compute Range: [%d, %d].\n", (short)((((unsigned short)(-1)) >> 1) + 1), ((unsigned short)(-1)) >> 1);
	printf("The size of \"int\" is %d, its range is [%d, %d].\n", sizeof(int), INT_MIN, INT_MAX);
	printf("Compute Range: [%d, %d].\n", (int)((((unsigned int)(-1)) >> 1) + 1), ((unsigned int)(-1)) >> 1);
	printf("The size of \"long\" is %d, its range is [%d, %d].\n", sizeof(long), LONG_MIN, LONG_MAX);
	printf("Compute Range: [%d, %d].\n", (long)((((unsigned long)(-1)) >> 1) + 1), ((unsigned long)(-1)) >> 1);
	printf("-----------------------------------------------------\n");
	printf("The size of \"unsigned char\" is %d, its range is [0, %d].\n", sizeof(unsigned char), UCHAR_MAX);
	printf("Compute Range: [0, %d].\n", (unsigned char)(-1));
	printf("The size of \"unsigned short\" is %d, its range is [0, %d].\n", sizeof(unsigned short), USHRT_MAX);
	printf("Compute Range: [0, %d].\n", (unsigned short)(-1));
	printf("The size of \"unsigned int\" is %d, its range is [0, %u].\n", sizeof(unsigned int), UINT_MAX);
	printf("Compute Range: [0, %u].\n", (unsigned int)(-1));
	printf("The size of \"unsigned long\" is %d, its range is [0, %u].\n", sizeof(unsigned long), ULONG_MAX);
	printf("Compute Range: [0, %u].\n", (unsigned long)(-1));
	printf("-----------------------------------------------------\n");
	memset(f.f, 0xff, sizeof(f.f));
	f.f[3] = f.f[2] = 0x7f;
	printf("The maximum of \"float\" is : %e\n", f.fa);
	memset(d.d, 0xff, sizeof(d.d));
	d.d[6] = 0xef;
	d.d[7] = 0x7f;
	printf("The maximum of \"double\" is : %e\n", d.da);
	return 0;
}

Exercise 2-2. Write a loop equivalent to the for loop above without using && or ||.

解：这里说不用与以及或操作写一个等价的循环形式，原来的形式是：

for (i = 0; i < lim - 1 && (c = getchar()) != '\n' && c != EOF; i++)
	s[i] = c;

我们可以用if-else语句来进行改写：

for (i = 0; i < lim - 1; i++)
	if ((c = getchar()) != '\n')
		if (c != EOF)
			s[i] = c;

Exercise 2-3. Write a function htoi(s), which converts a string of hexadecimal digits (including an optional 0x or 0X) into its equivalent integer value. The allowable digits are 0 through 9, a through f, and A through F.
解：这个题目要求我们写一个将十六进制字符串转为十进制数值的函数，当然需要同时能处理带或者不带0x与0X开头的情况。我把字符串的检测给省去了，具体目的为了说明执行过程：

int htoi(char s[])
{
	int ret = 0, i = 0;
	if (s[1] == 'x' || s[1] == 'X')
		i = 2;
	for (; s[i]; i++)
		if (s[i] >= 'a')
			ret = ret * 16 + s[i] - 'a' + 10;
		else if (s[i] >= 'A')
			ret = ret * 16 + s[i] - 'A' + 10;
		else
			ret = ret * 16 + s[i] - '0';
	return ret;
}

Exercise 2-4. Write an alternative version of squeeze(s1,s2) that deletes each character in s1 that matches any character in the string s2.
解：这里需要我们写一个squeeze函数，目的是要在s1中删除在s2中出现的字符，我考虑用一个数组标记某个字符是否出现，然后将文中原先的squeeze进行拓展：

void squeeze(char s1[], char s2[])
{
	int ins2[128], i, j;
	for (i = 0; i < 128; i++)
		ins2[i] = 0;
	for (i = 0; s2[i]; i++)
		ins2[s2[i]] = 1;
	for (i = j = 0; s1[i]; i++)
		if (!ins2[s1[i]])
			s1[j++] = s1[i];
	s1[j] = 0;
}

Exercise 2-5. Write the function any(s1,s2), which returns the first location in a string s1 where any character from the string s2 occurs, or -1 if s1 contains no characters from s2. (The standard library function strpbrk does the same job but returns a pointer to the location.)
解：这个题目和上题方法一样，我们如法炮制：

int any(char s1[], char s2[])
{
	int ins2[128], i, j;
	for (i = 0; i < 128; i++)
		ins2[i] = 0;
	for (i = 0; s2[i]; i++)
		ins2[s2[i]] = 1;
	for (i = j = 0; s1[i]; i++)
		if (ins2[s1[i]])
			return i;
	return -1;
}

Exercise 2-6. Write a function setbits(x,p,n,y) that returns x with the n bits that begin at position p set to the rightmost n bits of y, leaving the other bits unchanged.
解：这里是在原先的getbits的基础上多了一个设位的过程，本身并不难，改改就可以了。

unsigned setbits(unsigned x, int p, int n, unsigned y)
{
    return y | ((x >> (p + 1 - n)) & ~(~0 << n));
}

Exercise 2-7. Write a function invert(x,p,n) that returns x with the n bits that begin at position p inverted (i.e., 1 changed into 0 and vice versa), leaving the others unchanged.
解：这题的原理也是一样的，改改就能用：

unsigned setbits(unsigned x, int p, int n)
{
    return ~(~0 << n) ^ ((x >> (p + 1 - n)) & ~(~0 << n));
}

Exercise 2-8. Write a function rightrot(x,n) that returns the value of the integer x rotated to the right by n positions.
解：这题要我们将x右移n位，你可以一位一位的移动，当然也可以一次搞定，我选择后者：

#include <limits.h>
unsigned rightrot(unsigned x, int n)
{
    int l = sizeof(x) * CHAR_BIT;
	if (n >= l)
		n %= l;
	return (x >> n) | ((x & ~(~0 << n)) << (l - n));
}

Exercise 2-9. In a two's complement number system, x &= (x-1) deletes the rightmost 1-bit in x. Explain why. Use this observation to write a faster version of bitcount.
解：这个知识我很早就知道，x &= (x - 1)的操作本质上就是式子x-1将x最右边的1变0,然后其右的0变1,与原来的x相与之后就去掉了最右边的1,代码很好写：

int bitcount(unsigned x)
{
    int b = 0;
	while (x) {
		b++;
		x &= (x - 1);
	}
	return b;
}

Exercise 2-10. Rewrite the function lower, which converts upper case letters to lower case, with a conditional expression instead of if-else.
解：这里要我们用”?:“表达式来重写lower这个函数，知道了原理其实特别的简单：

char lower(char c)
{
    return c >= 'A' && c <= 'Z' ? c - 'A' + 'a' : c;
}

知识补缺：

1.原先生成最低n位的1我用的是(1<<n) - 1，原文中提供了另一种方法~(~0<<n)

2.printf输出字符的时候，\xhh可以用hh十六进制表示一个字符，也可以用\ooo以八进制表示一个字符

3.C中的字符串常量是可拼接的，比如"hello, " "world"就等价于"hello, world"

4.枚举类型因为不常用有点遗忘，其本质上看成是宏的集合，其数值的规律是后者是前者+1,而不管本身后方的值多少，默认从0开始，是一种多对一的映射

5.普通整数常量常常看成是int，如果想要限定的话，无符号尾部加上u或者U，长整数尾部加上l或者L，无符号长整数加上ul或者UL，64位整数加LL

6.浮点数处理要注意，如inf和nan等值，寄存器处理浮点数的时候采用的是拓展精度的浮点数，如果你用的是双精度或者单精度就有可能在从寄存器拷贝到内存的时候产生精度丢失，同时可能比如两次同样操作的结果其一已经放入内存，而另一个还在寄存器中，比较的时候导致结果不一致。gcc中带-O2选项编译的时候可以带-ffloat-store选项强制存入内存，但是也不是完全的安全，最安全的是全用long double，可是会导致一定程度性能的损失，但一般情况下不会有这么严格的精度问题。

7.除了C/C++以外，很少语言支持无符号整数，C中无符号整数提供了方便，同时也带来了引入bug的可能，比如-1<0U的结果是false，原因就在于当有符号和无符号整数操作的时候，有符号整数隐式类型转换为无符号整数，这个问题常常发生，需要引起注意，平时可以尽量避免使用无符号整数，否则很有可能因为疏忽而导致错误，下面函数就是一个很好的例子：

float sum_elements(float a[], unsigned length)
{
	int i;
	float result = 0;
	for (i = 0; i <= length-1; i++)
		result += a[i];
	return result;
}

乍看之下很正常，但是仔细分析就会发现i<=length-1一句，由于length是无符号的，所以其比较就会将i隐式转换为无符号整数，这个不是最可怕的，最可怕的是当length传入0的时候第一次length-1由于是无符号减法会变成无符号的正最大而执行，此时就会出现内存错误，需要谨慎。