Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Bits, Bytes and Characters
Search
Shaikhul Islam
January 29, 2021
Education
0
100
Bits, Bytes and Characters
Computer Science 101, what is bit, bytes, character and unicode
Shaikhul Islam
January 29, 2021
Tweet
Share
Other Decks in Education
See All in Education
認知情報科学科_キャリアデザイン_大学院の紹介
yuyakurodou
0
150
ニュースメディアにおける生成 AI の活用と開発 / UTokyo Lecture Business Introduction
upura
0
160
ビジネススキル研修紹介(株式会社27th)
27th
PRO
1
230
情報処理工学問題集 /infoeng_practices
kfujita
0
170
Ch2_-_Partie_3.pdf
bernhardsvt
0
110
Medidas en informática
irocho
0
420
TP5_-_UV.pdf
bernhardsvt
0
120
1127
cbtlibrary
0
170
Semantic Web and Web 3.0 - Lecture 9 - Web Technologies (1019888BNR)
signer
PRO
2
2.6k
ThingLink
matleenalaakso
28
3.8k
Казармы и гарнизоны
pnuslide
0
150
人々はさくらになにを込めたか
jamashita
0
140
Featured
See All Featured
The MySQL Ecosystem @ GitHub 2015
samlambert
250
12k
Practical Orchestrator
shlominoach
186
10k
Speed Design
sergeychernyshev
25
700
Thoughts on Productivity
jonyablonski
68
4.4k
GraphQLの誤解/rethinking-graphql
sonatard
67
10k
Raft: Consensus for Rubyists
vanstee
137
6.7k
Rails Girls Zürich Keynote
gr2m
94
13k
Creating an realtime collaboration tool: Agile Flush - .NET Oxford
marcduiker
26
1.9k
Embracing the Ebb and Flow
colly
84
4.5k
The Art of Programming - Codeland 2020
erikaheidi
53
13k
The World Runs on Bad Software
bkeepers
PRO
66
11k
How to train your dragon (web standard)
notwaldorf
88
5.8k
Transcript
Bits, Bytes and Characters Shaikhul Islam Chowdhury dev.to/shaikhul github.com/shaikhul
Bit • Smallest unit of storage • Bit is 0
or 1 • 8 bits - 1 Byte
Byte • Group of 8 bit • 1 bit pattern
- 0, 1 - 2 entry • 2 bit pattern - 00, 01, 10, 11 - 4 entry • n bit - 2^n entry possible • 1 Byte ◦ 8 bit - 2^8 - 255 entry ◦ Can hold 0 - 255 numbers
Bytes • How many bytes? • All storage are measured
in Bytes • Bigger units ◦ KB (1000 B), ◦ MB (1000 KB), ◦ GB (1000 MB), ◦ TB (1000 GB) etc
Character and Unicode • Characters are represented as code point
- range 0 - 0x10FFFF ( 1 million) Character Unicode Code Point Glyph Latin small letter a 0x61 a Black chess knight 0x265E ♞ Euro currency 0x20AC €
Character and Unicode (Code Point) Python In [22]: chr(0x0041) Out[22]:
'A' In [23]: chr(0x00df) Out[23]: 'ß' In [24]: chr(0x6771) Out[24]: '東' In [25]: chr(0x10400) Out[25]: '' Java jshell> new String(Character.toChars(0x0041)) $13 ==> "A" jshell> new String(Character.toChars(0x00df)) $14 ==> "ß" jshell> new String(Character.toChars(0x6771)) $15 ==> "東" jshell> new String(Character.toChars(0x10400)) $16 ==> ""
(Character) Encoding • Unicode string is a sequence of code
points (limit 0 - 0x10FFFF) • character encoding - translate sequence of code points into Bytes to store into memory ◦ ASCII: 7 bit (0 - 127), english letters ◦ UTF-8: most common, default in python ◦ UTF-16 etc
(Character) Encoding - String to Bytes Python In [40]: c
= chr(0x20ac) In [41]: c Out[41]: '€' In [42]: c.encode('utf-8') Out[42]: b'\xe2\x82\xac' Java jshell> String str = new String(Character.toChars(0x20ac)) str ==> "€" jshell> import java.nio.charset.* jshell> byte bytes[] = str.getBytes(StandardCharsets.UTF_8) bytes ==> byte[3] { -30, -126, -84 } jshell> for (byte b: bytes) { System.out.printf("%x ", b); } e2 82 ac
References • Stanford CS 101 on Bits and Bytes •
Unicode HOWTO — Python 3.9.1 documentation • Unicode (The Java™ Tutorials > Internationalization > Working with Text)
Thank You