Number Representations & States

"how numbers are stored and used in computers"

Strings

A string is a sequence of characters, where each character is represented by one or more numbers. In the early days of computing, each character was stored in a single byte using the ASCII encoding, which defines 128 different characters (it uses only 7 of the byte's 8 bits). The full character set is listed below:

Dec  Character
  0  null
  1  start of heading
  2  start of text
  3  end of text
  4  end of transmission
  5  enquiry
  6  acknowledge
  7  bell
  8  backspace
  9  horizontal tab
 10  line feed
 11  vertical tab
 12  form feed
 13  carriage return
 14  shift out
 15  shift in
 16  data link escape
 17  device control 1
 18  device control 2
 19  device control 3
 20  device control 4
 21  negative acknowledge
 22  synchronous idle
 23  end of transmission block
 24  cancel
 25  end of medium
 26  substitute
 27  escape
 28  file separator
 29  group separator
 30  record separator
 31  unit separator
 32  space
 33  !
 34  "
 35  #
 36  $
 37  %
 38  &
 39  '
 40  (
 41  )
 42  *
 43  +
 44  ,
 45  -
 46  .
 47  /
 48  0
 49  1
 50  2
 51  3
 52  4
 53  5
 54  6
 55  7
 56  8
 57  9
 58  :
 59  ;
 60  <
 61  =
 62  >
 63  ?
 64  @
 65  A
 66  B
 67  C
 68  D
 69  E
 70  F
 71  G
 72  H
 73  I
 74  J
 75  K
 76  L
 77  M
 78  N
 79  O
 80  P
 81  Q
 82  R
 83  S
 84  T
 85  U
 86  V
 87  W
 88  X
 89  Y
 90  Z
 91  [
 92  \
 93  ]
 94  ^
 95  _
 96  `
 97  a
 98  b
 99  c
100  d
101  e
102  f
103  g
104  h
105  i
106  j
107  k
108  l
109  m
110  n
111  o
112  p
113  q
114  r
115  s
116  t
117  u
118  v
119  w
120  x
121  y
122  z
123  {
124  |
125  }
126  ~
127  delete
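As a quick check against the table, here is a minimal Python sketch: ord maps a character to its ASCII number, and chr maps back.

```python
print(ord("A"))       # 65
print(chr(97))        # a
print(repr(chr(10)))  # '\n': line feed, one of the control characters
```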

If you wanted to transmit some information between computer systems, you would send a sequence of bytes, where each byte holds a single character.
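Here is a minimal sketch of that round trip in Python (the message text is just an example): the string is encoded to one ASCII byte per character, transmitted as raw bytes, and decoded back on the receiving end.

```python
message = "Hello!"              # example message, one byte per character
data = message.encode("ascii")  # the bytes that would be transmitted

print(list(data))               # [72, 101, 108, 108, 111, 33]
print(data.decode("ascii"))     # the receiver reverses the mapping: Hello!
```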

Why 7 bits instead of 8?

ASCII was developed in the early 1960s when computing hardware was significantly constrained. Many early computers communicated via teletype machines that used 7-bit characters, and ASCII was designed to be compatible with these existing systems.

In many early computer systems, which were far less reliable than the ones we have today, the 8th bit was reserved for error detection through parity checking. In telecommunications, this extra bit helped verify whether data had been transmitted correctly.
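Here is a sketch of even parity, assuming the convention where the 8th bit is set so that every byte carries an even number of 1 bits; the receiver can then flag any byte in which a single bit flipped in transit.

```python
def add_even_parity(byte7: int) -> int:
    """Set the 8th bit so the byte has an even number of 1 bits."""
    parity = bin(byte7).count("1") % 2
    return byte7 | (parity << 7)

def is_valid(byte8: int) -> bool:
    """A received byte passes the check if its 1-bit count is even."""
    return bin(byte8).count("1") % 2 == 0

encoded = add_even_parity(ord("C"))  # 'C' = 0b1000011, three 1 bits
print(bin(encoded))                  # 0b11000011: the parity bit was set
print(is_valid(encoded))             # True
print(is_valid(encoded ^ 0b100))     # False: a single flipped bit is caught
```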

For English text and basic computing needs, 128 characters was considered adequate: enough for all uppercase and lowercase letters, the digits, common punctuation, and control characters.

The limitations of ASCII's 7-bit encoding eventually led to the development of extended ASCII variants and ultimately Unicode, which addresses the need to represent characters from all of the world's writing systems.

Unicode

Unicode is a universal character standard that aims to assign a unique number, called a code point, to every character in every writing system. Its most widely used encoding, UTF-8, is a superset of ASCII (the first 128 code points encode to the same single bytes) and is by far the most common text encoding in the world.
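To make the relationship concrete, a small Python example: every character gets a code point, and UTF-8 stores ASCII characters in one byte while other characters take more.

```python
for ch in "Aé€":
    print(ch, ord(ch), ch.encode("utf-8"))

# A 65    b'A'             one byte, identical to ASCII
# é 233   b'\xc3\xa9'      two bytes
# € 8364  b'\xe2\x82\xac'  three bytes
```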

Encodings

link to all string encodings