Bits, Bytes and Characters
Shaikhul Islam
January 29, 2021
Education
Computer Science 101: what bits, bytes, characters, and Unicode are
Transcript
Bits, Bytes and Characters Shaikhul Islam Chowdhury dev.to/shaikhul github.com/shaikhul
Bit
• Smallest unit of storage
• A bit is 0 or 1
• 8 bits = 1 byte
Byte
• A group of 8 bits
• 1-bit patterns: 0, 1 (2 values)
• 2-bit patterns: 00, 01, 10, 11 (4 values)
• n bits: 2^n possible values
• 1 byte
  ◦ 8 bits: 2^8 = 256 values
  ◦ Can hold the numbers 0 to 255
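The counting rule on this slide can be checked with a few lines of Python. This is only an illustrative sketch; the helper name `bit_patterns` is made up for this example and is not part of the original deck.

```python
from itertools import product


def bit_patterns(n: int) -> list[str]:
    """Return every n-bit pattern as a string of 0s and 1s, in counting order."""
    return ["".join(bits) for bits in product("01", repeat=n)]


print(bit_patterns(1))        # ['0', '1']
print(bit_patterns(2))        # ['00', '01', '10', '11']
print(len(bit_patterns(8)))   # 256, i.e. 2**8
print(int("11111111", 2))     # 255, the largest value one byte can hold
```

There are 256 patterns in a byte, but because counting starts at 0 the largest number it can hold is 255.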
Bytes
• How many bytes?
• All storage is measured in bytes
• Bigger units
  ◦ KB (1000 B)
  ◦ MB (1000 KB)
  ◦ GB (1000 MB)
  ◦ TB (1000 GB), etc.
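As a rough sketch of how these units stack up, the hypothetical helper below (`human_size` is a name invented for this example) divides a byte count by 1000 until it fits the largest suitable unit. It assumes the decimal units listed on the slide; some tools use powers of 1024 instead.

```python
UNITS = ["B", "KB", "MB", "GB", "TB"]


def human_size(num_bytes: int) -> str:
    """Format a byte count using the decimal units from the slide (1 KB = 1000 B)."""
    size = float(num_bytes)
    for unit in UNITS:
        # Stop at the first unit where the value drops below 1000,
        # or at TB, the largest unit in the list.
        if size < 1000 or unit == UNITS[-1]:
            return f"{size:.1f} {unit}"
        size /= 1000
    raise AssertionError("unreachable: the loop always returns")


print(human_size(999))            # 999.0 B
print(human_size(1500))           # 1.5 KB
print(human_size(2_000_000_000))  # 2.0 GB
```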
Character and Unicode
• Characters are represented as code points, in the range 0 to 0x10FFFF (about 1.1 million)

Character             Unicode Code Point   Glyph
Latin small letter a  0x61                 a
Black chess knight    0x265E               ♞
Euro currency         0x20AC               €
Character and Unicode (Code Point)
Python
    In [22]: chr(0x0041)
    Out[22]: 'A'
    In [23]: chr(0x00df)
    Out[23]: 'ß'
    In [24]: chr(0x6771)
    Out[24]: '東'
    In [25]: chr(0x10400)
    Out[25]: '𐐀'
Java
    jshell> new String(Character.toChars(0x0041))
    $13 ==> "A"
    jshell> new String(Character.toChars(0x00df))
    $14 ==> "ß"
    jshell> new String(Character.toChars(0x6771))
    $15 ==> "東"
    jshell> new String(Character.toChars(0x10400))
    $16 ==> "𐐀"
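Going the other way, from a character back to its code point, can be sketched with Python's built-in `ord` and the standard `unicodedata` module. The helper name `describe` is invented for this example; the three characters are the ones from the earlier table.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point, glyph, and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {ch} {unicodedata.name(ch)}"


for ch in ["a", "♞", "€"]:
    print(describe(ch))
# U+0061 a LATIN SMALL LETTER A
# U+265E ♞ BLACK CHESS KNIGHT
# U+20AC € EURO SIGN
```

`ord` is the inverse of `chr`, so `chr(ord(ch)) == ch` holds for any single character.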
(Character) Encoding
• A Unicode string is a sequence of code points (each in the range 0 to 0x10FFFF)
• A character encoding translates a sequence of code points into bytes so it can be stored in memory
  ◦ ASCII: 7 bits (0 to 127), English letters
  ◦ UTF-8: most common, the default in Python
  ◦ UTF-16, etc.
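To make the difference between encodings concrete, the sketch below encodes the same short string several ways and compares the byte counts. The helper names are invented for this example, and the `-le` codec variants are an assumption chosen so that no byte-order mark is added to the counts.

```python
def encoded_sizes(text: str) -> dict[str, int]:
    """Return the number of bytes `text` occupies in a few common encodings."""
    return {enc: len(text.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}


def fits_in_ascii(text: str) -> bool:
    """Return True if every character is in the 7-bit ASCII range (0 to 127)."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


print(encoded_sizes("a€"))   # {'utf-8': 4, 'utf-16-le': 4, 'utf-32-le': 8}
print(fits_in_ascii("a"))    # True
print(fits_in_ascii("€"))    # False: code point 0x20AC is outside 0 to 127
```

In UTF-8, "a" takes 1 byte and "€" takes 3; in UTF-16 each of these two characters takes 2 bytes.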
(Character) Encoding - String to Bytes
Python
    In [40]: c = chr(0x20ac)
    In [41]: c
    Out[41]: '€'
    In [42]: c.encode('utf-8')
    Out[42]: b'\xe2\x82\xac'
Java
    jshell> String str = new String(Character.toChars(0x20ac))
    str ==> "€"
    jshell> import java.nio.charset.*
    jshell> byte bytes[] = str.getBytes(StandardCharsets.UTF_8)
    bytes ==> byte[3] { -30, -126, -84 }
    jshell> for (byte b: bytes) { System.out.printf("%x ", b); }
    e2 82 ac
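Where the three bytes e2 82 ac come from can be sketched by hand. The function below is an illustration only, not how Python implements its codec: `utf8_three_byte` is a made-up name, and it covers only the code point range that UTF-8 stores in three bytes (0x0800 to 0xFFFF), without rejecting the surrogate range.

```python
def utf8_three_byte(code_point: int) -> bytes:
    """Encode a code point from 0x0800 to 0xFFFF using UTF-8's three-byte layout.

    Layout: 1110xxxx 10xxxxxx 10xxxxxx, where the x bits are the code point.
    Note: this sketch does not reject surrogates (0xD800 to 0xDFFF).
    """
    if not 0x0800 <= code_point <= 0xFFFF:
        raise ValueError("this sketch handles only three-byte code points")
    first = 0b11100000 | (code_point >> 12)             # top 4 bits
    second = 0b10000000 | ((code_point >> 6) & 0b111111)  # middle 6 bits
    third = 0b10000000 | (code_point & 0b111111)          # low 6 bits
    return bytes([first, second, third])


encoded = utf8_three_byte(0x20AC)
print(encoded)                   # b'\xe2\x82\xac'
print(encoded.decode("utf-8"))   # €
```

Decoding with `bytes.decode("utf-8")` reverses the encoding step and returns the original string.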
References
• Stanford CS 101 on Bits and Bytes
• Unicode HOWTO (Python 3.9.1 documentation)
• Unicode (The Java™ Tutorials > Internationalization > Working with Text)
Thank You