Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Bits, Bytes and Characters
Search
Shaikhul Islam
January 29, 2021
Education
0
120
Bits, Bytes and Characters
Computer Science 101, what is bit, bytes, character and unicode
Shaikhul Islam
January 29, 2021
Tweet
Share
Other Decks in Education
See All in Education
今も熱いもの!魂を揺さぶる戦士の儀式:マオリ族のハカ
shubox
0
210
Tutorial: Foundations of Blind Source Separation and Its Advances in Spatial Self-Supervised Learning
yoshipon
1
120
Pythonパッケージ管理 [uv] 完全入門
mickey_kubo
20
14k
実務プログラム
takenawa
0
6.2k
Virtual and Augmented Reality - Lecture 8 - Next Generation User Interfaces (4018166FNR)
signer
PRO
0
1.7k
SARA Annual Report 2024-25
sara2023
1
180
新卒研修に仕掛ける 学びのサイクル / Implementing Learning Cycles in New Graduate Training
takashi_toyosaki
1
160
データ分析
takenawa
0
6.3k
Course Review - Lecture 12 - Next Generation User Interfaces (4018166FNR)
signer
PRO
0
1.7k
モンテカルロ法(3) 発展的アルゴリズム / Simulation 04
kaityo256
PRO
7
1.3k
技術勉強会 〜 OAuth & OIDC 入門編 / 20250528 OAuth and OIDC
oidfj
5
1.3k
Pydantic(AI)とJSONの詳細解説
mickey_kubo
0
110
Featured
See All Featured
Music & Morning Musume
bryan
46
6.6k
Producing Creativity
orderedlist
PRO
346
40k
A better future with KSS
kneath
239
17k
GraphQLの誤解/rethinking-graphql
sonatard
71
11k
Fashionably flexible responsive web design (full day workshop)
malarkey
407
66k
Mobile First: as difficult as doing things right
swwweet
223
9.7k
Building a Modern Day E-commerce SEO Strategy
aleyda
42
7.4k
ReactJS: Keep Simple. Everything can be a component!
pedronauck
667
120k
VelocityConf: Rendering Performance Case Studies
addyosmani
332
24k
XXLCSS - How to scale CSS and keep your sanity
sugarenia
248
1.3M
ピンチをチャンスに:未来をつくるプロダクトロードマップ #pmconf2020
aki_iinuma
126
52k
GitHub's CSS Performance
jonrohan
1031
460k
Transcript
Bits, Bytes and Characters Shaikhul Islam Chowdhury dev.to/shaikhul github.com/shaikhul
Bit • Smallest unit of storage • Bit is 0
or 1 • 8 bits - 1 Byte
Byte • Group of 8 bit • 1 bit pattern
- 0, 1 - 2 entry • 2 bit pattern - 00, 01, 10, 11 - 4 entry • n bit - 2^n entry possible • 1 Byte ◦ 8 bit - 2^8 - 255 entry ◦ Can hold 0 - 255 numbers
Bytes • How many bytes? • All storage are measured
in Bytes • Bigger units ◦ KB (1000 B), ◦ MB (1000 KB), ◦ GB (1000 MB), ◦ TB (1000 GB) etc
Character and Unicode • Characters are represented as code point
- range 0 - 0x10FFFF ( 1 million) Character Unicode Code Point Glyph Latin small letter a 0x61 a Black chess knight 0x265E ♞ Euro currency 0x20AC €
Character and Unicode (Code Point) Python In [22]: chr(0x0041) Out[22]:
'A' In [23]: chr(0x00df) Out[23]: 'ß' In [24]: chr(0x6771) Out[24]: '東' In [25]: chr(0x10400) Out[25]: '' Java jshell> new String(Character.toChars(0x0041)) $13 ==> "A" jshell> new String(Character.toChars(0x00df)) $14 ==> "ß" jshell> new String(Character.toChars(0x6771)) $15 ==> "東" jshell> new String(Character.toChars(0x10400)) $16 ==> ""
(Character) Encoding • Unicode string is a sequence of code
points (limit 0 - 0x10FFFF) • character encoding - translate sequence of code points into Bytes to store into memory ◦ ASCII: 7 bit (0 - 127), english letters ◦ UTF-8: most common, default in python ◦ UTF-16 etc
(Character) Encoding - String to Bytes Python In [40]: c
= chr(0x20ac) In [41]: c Out[41]: '€' In [42]: c.encode('utf-8') Out[42]: b'\xe2\x82\xac' Java jshell> String str = new String(Character.toChars(0x20ac)) str ==> "€" jshell> import java.nio.charset.* jshell> byte bytes[] = str.getBytes(StandardCharsets.UTF_8) bytes ==> byte[3] { -30, -126, -84 } jshell> for (byte b: bytes) { System.out.printf("%x ", b); } e2 82 ac
References • Stanford CS 101 on Bits and Bytes •
Unicode HOWTO — Python 3.9.1 documentation • Unicode (The Java™ Tutorials > Internationalization > Working with Text)
Thank You