One possible basis for building up data representation is to insist that all of the wires will either have a standard amount of electricity flowing through them, or none at all. Two possibilities: On or off.
This is a very small amount of information. It is called a "bit". It's the theoretically smallest amount of information you can have.
We can store more information by using more bits. For example, if we use two bits, we can store four different possibilities:
In general, n bits have 2n possible configurations. This is assuming we know which order they're in; in other words, possibilities 2 and 3 above have to count as different configurations.
With current computers, you frequently find bits organized into groups of eight, which are somewhat humorously called "bytes". This means there are 28 possible configurations of the bits in a byte. 28 is 256.
We want a representation of numbers with these bits, and we want a representation which is convenient for arithmetic. We want to be able to build arithmetic circuitry, and software, which doesn't have huge addition tables built in. We want the numbers to be in order, in some sense.
In fact there's a fairly straightforward way to represent numbers with bits. Each bit can be on or off; we can call this "0" or "1".
Then if we put a bunch of bits together, like this: 10011
we can treat that as a number in base 2 (binary).
You know that "926" in "base ten" means 9*102 + 2*101 + 6*100
so 10011 in base two is 1*24 + 0*23 + 0*22 + 1*21 + 1*20
Actually we can omit the zeroes, and the multiplication by one...
binary is very easy.
So we get
24 + 21 + 20
= 16 + 2 + 1
= 19
A byte is generally eight bits. What's the largest number we could represent in that byte?
111111112
= 1*27 + 1*26 + 1*25 + 1*24 +
1*23 + 1*22 + 1*21 + 1*20
= 255
All bits zero will be zero, of course.
So an eight-bit number can be anywhere from zero to 255, inclusive. If we want larger numbers, we have to use more bits, i.e. more than one byte.
In general, we can use n bits to represent an integer from 0 to 2n-1 inclusive, e.g. 8 bits gives us 0 to 255.
How do we get negative numbers? Two ways:
We won't discuss the two's-complement representation further in this course. It's discussed in CSC 258, at least if you take it from me.
In either case, you can see that we have 2n possibilities for the numbers,
and we can choose all non-negative,
which gives us 0 to 2n-1; or we can choose about half
each, positive and negative.
A straightforward representation for non-integer numbers
involves adding something like decimal points.
But "decimal" means base 10,
so we tend to call it a "binary point".
There's also a generic term "radix point",
which applies equally well to base 2 or to base 10 or to any base.
I claim that we can represent two and a half by 10.1 in binary.
That is, 2.510 = 10.12.
10.12
This is called "fixed point".
We can use the same basic format as we did for integers, so long as we agree on
how many places there are after the radix point.
But computer floating-point numbers are usually based on base two, of course.
To represent this in a computer, in (say) 32 bits, we can use something like
the
IEEE floating-point standard (IEEE 754):
If the exponent is all ones (looks like 255), the number is a special value:
If exponent is all zeroes, the meaning of the number is
+/- 0.{mantissa} x 2-126
For exponents other than all ones (255) and all zeroes (0), the meaning of the
number is:
Why subtract 127? 8 bits represents a number from 0 to 255 -- subtract 127 to
give us the ability to express exponents from -127 to 128.
Note that we save a bit by not representing the '1.'. This means that all
floating-point numbers must be "normalized" -- they begin with "1", DOT.
This restriction resembles that in scientific notation in base ten,
except that in base ten there
are nine possible digits for that single digit to the left of the decimal
point, e.g. '6' in Avogadro's number, whereas in base two there is only the one
possibility.
The sole digit to the left of the radix point can't be zero (in either base 2
or base 10). If it's zero, you just slide around the radix point as needed,
adjusting the exponent value to compensate,
which is called
"normalizing" the number.
This is why the special rule for an all-zeroes exponent is essential; otherwise
we couldn't represent the number zero itself.
This rule also gives us a few smaller-magnitude numbers,
e.g. 0.00000000000001 x 2-126,
but its chief importance is that it gives us a way to represent zero.
(Why -126 instead of -127 (which would seem more obvious, to me anyway)?
Consider trying to represent the number
2-127 itself.
With the exponent being -126, we could represent this as (in effect)
0.1*2-126.
If the actual exponent for this special case of an all-zeroes exponent field
were -127, this number would be unrepresentable.
But this is
getting rather far afield and you don't have to know this
for this course.)
1210 is 11002
.4 is 2/5
In base ten, we'd
divide 5 into 2 in base 10 to get 0.4. You do this with long division.
For base two, we instead
divide 5 into 2 in base 2, or if you like divide the base two number 101 into 10 (these
being the base-two representations of 5 and 2 respectively), using long
division, in base two.
This turns out to be a repeating fraction, with "0110" repeating.
so 12.4 = 1100.0110011001100110...
Convert 130 to base two by repeated integer division, and take the remainders.
(We could have done this for 12 near the beginning of this example, except
that it was so easy to see that 1100 is 8+4 is 12, but we do need to have an
algorithm, so here it is.)
130 div 2 = 65 remainder 0
so IEEE 754 single-precision representation is:
Remember that we omit the initial '1' from the mantissa field, thus gaining an
additional bit of precision.
There are no spaces there in any sense of course, I've just shown the
grouping.
The computer knows the grouping by counting bits.
Now let's take that value,
01000001010001100110011001100110,
and interpret it as an IEEE single-precision floating-point number and see what it is.
We separate the bits into the three fields:
the first bit is the sign bit, the next 8 are the exponent field,
the remaining 23 are the mantissa field.
So the sign bit is 0, meaning that the number is not negative.
The exponent field
is 10000010, or 27+22, which is 128+2 or 130.
And the mantissa field is
10001100110011001100110.
Remember that there is an implied '1' at the beginning there, if you look at
how the mantissa field substitutes into the meaning of the number, towards the
beginning of this document.
Substituting into our formula, we get
1.10001100110011001100110 x 2130-127
So you see why we used 130 for the exponent field when the actual exponent is
supposed to be 3.
The mantissa, then, is
Add these up with your calculator and you'll get
1.5499999523162841796875.
This is times 23, which is 8;
multiply by 8 and you'll get
12.3999996185302734375.
So
01000001010001100110011001100110
isn't actually a representation of 12.4 exactly, but it's as close as we can
get,
just as how 0.3333333333 in base ten is not equal to a third but it is as
close as we can get with ten significant decimal digits.
Fixed-point representation
Suppose we want numbers other than integers, e.g. 2½?
= 1*21 + 0*20 + 1*2-1
= 1*2 + 0*1 + 1*½
= 2 + 0 + ½
= 2½
Floating-point representation
Floating-point numbers involve an exponent of the base,
like scientific notation.
Example: Avogadro's number can be written as 6.02x1023 in base 10.
6.02x1023 is approximately (1 and 63/64)x278
or 1.111111 (base two) x 21001110 (base two)
+/- 1.{mantissa} x 2 {exponent}-127
IEEE 754 standard sizes
IEEE 754 has two standard sizes:
Example floating-point conversions
Convert 12.4 to IEEE single-precision.
= 1.1000110011001100110... x 23
so use 130 for the exponent field (because 130-127 == 3 -- see the
conversion back the other way, below, if this isn't clear)
6 div 2 = 32 remainder 1
3 div 2 = 16 remainder 0
1 div 2 = 8 remainder 0
8 div 2 = 4 remainder 0
4 div 2 = 2 remainder 0
2 div 2 = 1 remainder 0
1 div 2 = 0 remainder 1
Read the remainders upwards to get the value.
so 13010 = 100000102
0 10000010 10001100110011001100110
20
+ 2-1
+ 2-5
+ 2-6
+ 2-9
+ 2-10
+ 2-13
+ 2-14
+ 2-17
+ 2-18
+ 2-21
+ 2-22
[list of topics covered, evening section]
[course notes available (evening section)]
[main course page]