Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Bits, Bytes and Characters
Search
Shaikhul Islam
January 29, 2021
Education
0
99
Bits, Bytes and Characters
Computer Science 101, what is bit, bytes, character and unicode
Shaikhul Islam
January 29, 2021
Tweet
Share
Other Decks in Education
See All in Education
Flinga
matleenalaakso
2
13k
技術を楽しもう/enjoy_engineering
studio_graph
1
420
CSS3 and Responsive Web Design - Lecture 5 - Web Technologies (1019888BNR)
signer
PRO
1
2.5k
セキュリティ・キャンプ全国大会2024 S17 探査機自作ゼミ 事前学習・当日資料
sksat
3
850
脳卒中になってしまった さあ、どうする
japanstrokeassociation
0
660
Comment aborder et contribuer sereinement à un projet open source ? (Masterclass Université Toulouse III)
pylapp
0
3.2k
子どものためのプログラミング道場『CoderDojo』〜法人提携例〜 / Partnership with CoderDojo Japan
coderdojojapan
4
14k
Algo de fontes de alimentación
irocho
1
370
Medidas en informática
irocho
0
300
Epithelium Flashcards
ndevaul
0
1k
Web Application Frameworks - Lecture 4 - Web Technologies (1019888BNR)
signer
PRO
0
2.6k
Qualtricsで相互作用実験する「SMARTRIQS」入門編
kscscr
0
320
Featured
See All Featured
GitHub's CSS Performance
jonrohan
1030
460k
実際に使うSQLの書き方 徹底解説 / pgcon21j-tutorial
soudai
169
50k
KATA
mclloyd
29
14k
The Pragmatic Product Professional
lauravandoore
31
6.3k
How to Create Impact in a Changing Tech Landscape [PerfNow 2023]
tammyeverts
47
2.1k
For a Future-Friendly Web
brad_frost
175
9.4k
Building Adaptive Systems
keathley
38
2.3k
Optimising Largest Contentful Paint
csswizardry
33
2.9k
YesSQL, Process and Tooling at Scale
rocio
169
14k
Agile that works and the tools we love
rasmusluckow
327
21k
Cheating the UX When There Is Nothing More to Optimize - PixelPioneers
stephaniewalter
280
13k
How to Think Like a Performance Engineer
csswizardry
20
1.1k
Transcript
Bits, Bytes and Characters Shaikhul Islam Chowdhury dev.to/shaikhul github.com/shaikhul
Bit • Smallest unit of storage • Bit is 0
or 1 • 8 bits - 1 Byte
Byte • Group of 8 bit • 1 bit pattern
- 0, 1 - 2 entry • 2 bit pattern - 00, 01, 10, 11 - 4 entry • n bit - 2^n entry possible • 1 Byte ◦ 8 bit - 2^8 - 255 entry ◦ Can hold 0 - 255 numbers
Bytes • How many bytes? • All storage are measured
in Bytes • Bigger units ◦ KB (1000 B), ◦ MB (1000 KB), ◦ GB (1000 MB), ◦ TB (1000 GB) etc
Character and Unicode • Characters are represented as code point
- range 0 - 0x10FFFF ( 1 million) Character Unicode Code Point Glyph Latin small letter a 0x61 a Black chess knight 0x265E ♞ Euro currency 0x20AC €
Character and Unicode (Code Point) Python In [22]: chr(0x0041) Out[22]:
'A' In [23]: chr(0x00df) Out[23]: 'ß' In [24]: chr(0x6771) Out[24]: '東' In [25]: chr(0x10400) Out[25]: '' Java jshell> new String(Character.toChars(0x0041)) $13 ==> "A" jshell> new String(Character.toChars(0x00df)) $14 ==> "ß" jshell> new String(Character.toChars(0x6771)) $15 ==> "東" jshell> new String(Character.toChars(0x10400)) $16 ==> ""
(Character) Encoding • Unicode string is a sequence of code
points (limit 0 - 0x10FFFF) • character encoding - translate sequence of code points into Bytes to store into memory ◦ ASCII: 7 bit (0 - 127), english letters ◦ UTF-8: most common, default in python ◦ UTF-16 etc
(Character) Encoding - String to Bytes Python In [40]: c
= chr(0x20ac) In [41]: c Out[41]: '€' In [42]: c.encode('utf-8') Out[42]: b'\xe2\x82\xac' Java jshell> String str = new String(Character.toChars(0x20ac)) str ==> "€" jshell> import java.nio.charset.* jshell> byte bytes[] = str.getBytes(StandardCharsets.UTF_8) bytes ==> byte[3] { -30, -126, -84 } jshell> for (byte b: bytes) { System.out.printf("%x ", b); } e2 82 ac
References • Stanford CS 101 on Bits and Bytes •
Unicode HOWTO — Python 3.9.1 documentation • Unicode (The Java™ Tutorials > Internationalization > Working with Text)
Thank You