Integers in Computers from First Principles


Computers use the binary positional number system to represent numbers. For instance, 123 in binary is 01111011. Each 0 or 1 is a bit. A sequence of 8 bits is called a byte, which can represent 256 distinct values. In the previous post, Number is about counting, we saw why a sequence of bits can represent numbers. Building on that idea, we can now look at the different kinds of numbers in the binary positional number system used by computers.
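To make the 123 example concrete, here is a minimal Python sketch of the conversion in both directions. The helper names `to_bits` and `from_bits` are made up for illustration; they are not from any library.

```python
def to_bits(n: int, width: int = 8) -> str:
    """Return n as a zero-padded binary string of the given width."""
    if not 0 <= n < 2 ** width:
        raise ValueError(f"{n} does not fit in {width} bits")
    return format(n, f"0{width}b")


def from_bits(bits: str) -> int:
    """Rebuild the number by reading the bits left to right.

    Each step doubles the running total (shifting it one position)
    and then adds the next bit.
    """
    total = 0
    for bit in bits:
        total = total * 2 + int(bit)
    return total


print(to_bits(123))           # 01111011
print(from_bits("01111011"))  # 123
print(2 ** 8)                 # 256 distinct values in one byte
```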

We are familiar with the decimal positional number system, i.e. 0, 1, 2, 3, and so on. In it, the natural numbers start from 0, and each next natural number is the previous one plus one: 0, 1, 2, 3, 4, and so forth.

Put another way, a natural number is either zero or the successor of a natural number.

A computer represents natural numbers as unsigned integers. But space in a computer is limited, unlike in everyday math, where it is unlimited. Therefore an 8-bit unsigned integer can only represent the natural numbers from 0 to 255, a 16-bit one from 0 to 65535, and so on. It is a simple one-to-one mapping onto our concept of natural numbers, just with a limited size.
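The ranges above follow from counting bit patterns: n bits give 2^n patterns, so the largest unsigned value is 2^n - 1. A small Python sketch, where `unsigned_range` is a hypothetical helper name:

```python
def unsigned_range(width: int) -> tuple:
    """Smallest and largest natural number that fit in `width` bits."""
    return 0, 2 ** width - 1


print(unsigned_range(8))   # (0, 255)
print(unsigned_range(16))  # (0, 65535)
```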

After the natural numbers, we need negative numbers to complement them and form the integers. So where do negative numbers come from? Well, we like symmetry, so let us define positive numbers first. For natural numbers, positive means that we have something, that it exists. So the natural numbers without zero form the positive numbers. A negative number is then the opposite of a positive number: when a positive number and its corresponding negative number are added together, the result is zero. They melt together and become nothing. To write this concept down, we put a minus sign in front of a positive number to denote the negative number. This is called the sign-magnitude representation of negative numbers.

However, this representation does not fit computers as well as the natural numbers do. Suppose we use the leftmost bit to show whether a number is positive or negative. When we calculate (+7) + (-7) in binary with ordinary addition, we get 0111 + 1111 = 0110 once the carry out of the fourth bit is dropped. That reads as +6, not the expected 0000. In everyday math, we calculate (+7) + (-7) by subtraction, that is 7 - 7, which is zero for sure. Likewise, to get 7 - 13 = -6, we subtract the smaller magnitude from the larger one (13 - 7 = 6) and then give the result the sign of the number with the larger magnitude. As this procedure shows, subtraction is considerably more complicated than addition. One more problem with sign-magnitude representation is that +0 and -0 both mean 0 but have different forms. That is no big deal in everyday math, partly because we are used to it and partly because we simply treat them as the same. But a computer wastes a state by using two distinct bit patterns, 0000 and 1000, to represent the same thing.
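Both problems can be reproduced with a few lines of Python. This is only a sketch of a 4-bit sign-magnitude encoding; `sm_encode` and `sm_decode` are illustrative names, and "naive" addition here means adding the raw bit patterns and keeping the low 4 bits.

```python
WIDTH = 4
MASK = (1 << WIDTH) - 1      # 0b1111, keeps only the low 4 bits
SIGN_BIT = 1 << (WIDTH - 1)  # 0b1000, the leftmost bit


def sm_encode(value: int) -> int:
    """Encode value as sign bit plus magnitude."""
    magnitude = abs(value)
    if magnitude >= SIGN_BIT:
        raise ValueError(f"{value} does not fit in {WIDTH} bits")
    return (SIGN_BIT if value < 0 else 0) | magnitude


def sm_decode(pattern: int) -> int:
    """Decode a sign-magnitude bit pattern back to an integer."""
    magnitude = pattern & (SIGN_BIT - 1)
    return -magnitude if pattern & SIGN_BIT else magnitude


# Adding the raw patterns of +7 and -7 does not give zero.
naive_sum = (sm_encode(7) + sm_encode(-7)) & MASK
print(format(naive_sum, "04b"), sm_decode(naive_sum))  # 0110 6

# Two different patterns both decode to zero.
print(sm_decode(0b0000), sm_decode(0b1000))  # 0 0
```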

That is not good. So let us start again from the beginning: a negative number is what we add to a positive number to get zero. In a computer, zero as a natural number is 0000. So for a given positive number, e.g. 0111, how do we find another number that makes their sum 0000? Let us first try flipping every bit of the positive number, which gives 1000. But 0111 + 1000 = 1111, not 0000. To reach 0000, we need to add 1 to 1111 and drop the carry out of the fourth bit. So how about defining the negative number as the flipped positive number plus one? Let us see: 1000 + 0001 = 1001, and 0111 + 1001 = 0000 once the carry is dropped. That is the one we want. So we define the negative number as the bitwise complement of the positive number, plus one. This flip-then-add-one process is negation. Since a positive number and its negative sum to zero, negating a positive number gives its negative, and conversely, negating a negative number gives back the positive one. We can try negating 1001, and we get 0111 back.
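The flip-then-add-one procedure can be sketched in Python as follows. The 4-bit width and the name `negate` are choices made for this illustration; Python integers are unbounded, so the mask is what simulates a fixed-width register.

```python
WIDTH = 4
MASK = (1 << WIDTH) - 1  # 0b1111, keeps only the low 4 bits


def negate(pattern: int) -> int:
    """Two's complement negation: flip every bit, then add one."""
    flipped = ~pattern & MASK
    return (flipped + 1) & MASK


seven = 0b0111
minus_seven = negate(seven)

print(format(minus_seven, "04b"))                   # 1001
print(format((seven + minus_seven) & MASK, "04b"))  # 0000
print(format(negate(minus_seven), "04b"))           # 0111
```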

That is the two's complement representation of negative numbers in computers.

One more property is that the leftmost bit naturally acts as the sign bit: 0000 is 0, and 1000 is now -8 rather than a second zero. So a 4-bit signed integer represents the numbers from -8 to 7. Based on this representation of signed integers, the arithmetic can be derived naturally.
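To check the -8 to 7 range, one can decode all sixteen 4-bit patterns. The sketch below assumes the usual reading of two's complement, where a pattern with the leftmost bit set stands for its unsigned value minus 2^width; `to_signed` is a hypothetical helper name.

```python
def to_signed(pattern: int, width: int = 4) -> int:
    """Interpret an unsigned bit pattern as a two's complement integer."""
    if pattern & (1 << (width - 1)):   # leftmost bit set: negative
        return pattern - (1 << width)
    return pattern


for pattern in range(16):
    print(format(pattern, "04b"), to_signed(pattern))

values = [to_signed(p) for p in range(16)]
print(min(values), max(values))  # -8 7
```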

One more key reason two's complement works in a computer is that a computer has limited resources. It cannot represent infinitely many numbers, so its integers have a fixed range, and every arithmetic operation carries an implicit modulo. For example, with 4-bit unsigned integers, adding 0001 to 1111 gives 0000. In everyday math, 15 + 1 = 16, not 0, but (15 + 1) % 16 is 0. The % 16 is implicit. In general, every result is implicitly taken modulo 2 to the power of the number of bits.
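Because Python integers do not wrap around on their own, the implicit modulo has to be written out explicitly to imitate fixed-width hardware. A minimal sketch, with `add_unsigned` as an illustrative name:

```python
def add_unsigned(a: int, b: int, width: int = 4) -> int:
    """Add two unsigned integers and wrap the result to `width` bits."""
    return (a + b) % (2 ** width)


print(add_unsigned(0b1111, 0b0001))  # 0, i.e. (15 + 1) % 16
print(add_unsigned(0b0111, 0b1001))  # 0, i.e. 7 plus the pattern for -7

# Keeping only the low bits with a mask gives the same result as the modulo.
print((15 + 1) & 0b1111)             # 0
```

The second call shows why the wraparound matters for negative numbers: 0111 + 1001 is 16 in everyday math, and it only becomes 0000 because of the implicit modulo 16.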

That is it, have fun.