Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Bits, Bytes and Characters
Search
Shaikhul Islam
January 29, 2021
Education
0
130
Bits, Bytes and Characters
Computer Science 101, what is bit, bytes, character and unicode
Shaikhul Islam
January 29, 2021
Tweet
Share
Other Decks in Education
See All in Education
質のよいアウトプットをできるようになるために~「読む・聞く、まとめる、言葉にする」を読んで~
amarelo_n24
0
260
高校におけるプログラミング教育を考える
naokikato
PRO
0
160
CHARMS-HP-Banner
weltraumreisende
0
1k
今の私を形作る4つの要素と偶然の出会い(セレンディピティ)
mamohacy
2
100
Online Privacy
takahitosakamoto
1
120
情報科学類で学べる専門科目38選
momeemt
0
620
EVOLUCIÓN DE LAS NEUROCIENCIAS EN LOS CONTEXTOS ORGANIZACIONALES
jvpcubias
0
180
20250830_MIEE祭_会社員視点での学びのヒント
ponponmikankan
1
170
Презентация "Знаю Россию"
spilsart
0
250
吉岡研究室紹介(2025年度)
kentaroy47
0
290
AWSと共に英語を学ぼう
amarelo_n24
0
170
20250830_本社にみんなの公園を作ってみた
yoneyan
0
130
Featured
See All Featured
Gamification - CAS2011
davidbonilla
81
5.5k
Why Our Code Smells
bkeepers
PRO
339
57k
Six Lessons from altMBA
skipperchong
28
4k
What’s in a name? Adding method to the madness
productmarketing
PRO
23
3.7k
Improving Core Web Vitals using Speculation Rules API
sergeychernyshev
19
1.2k
Chrome DevTools: State of the Union 2024 - Debugging React & Beyond
addyosmani
7
890
Product Roadmaps are Hard
iamctodd
PRO
54
11k
Code Reviewing Like a Champion
maltzj
525
40k
The Language of Interfaces
destraynor
162
25k
Save Time (by Creating Custom Rails Generators)
garrettdimon
PRO
32
1.6k
Statistics for Hackers
jakevdp
799
220k
Stop Working from a Prison Cell
hatefulcrawdad
271
21k
Transcript
Bits, Bytes and Characters Shaikhul Islam Chowdhury dev.to/shaikhul github.com/shaikhul
Bit • Smallest unit of storage • Bit is 0
or 1 • 8 bits - 1 Byte
Byte • Group of 8 bit • 1 bit pattern
- 0, 1 - 2 entry • 2 bit pattern - 00, 01, 10, 11 - 4 entry • n bit - 2^n entry possible • 1 Byte ◦ 8 bit - 2^8 - 255 entry ◦ Can hold 0 - 255 numbers
Bytes • How many bytes? • All storage are measured
in Bytes • Bigger units ◦ KB (1000 B), ◦ MB (1000 KB), ◦ GB (1000 MB), ◦ TB (1000 GB) etc
Character and Unicode • Characters are represented as code point
- range 0 - 0x10FFFF ( 1 million) Character Unicode Code Point Glyph Latin small letter a 0x61 a Black chess knight 0x265E ♞ Euro currency 0x20AC €
Character and Unicode (Code Point) Python In [22]: chr(0x0041) Out[22]:
'A' In [23]: chr(0x00df) Out[23]: 'ß' In [24]: chr(0x6771) Out[24]: '東' In [25]: chr(0x10400) Out[25]: '' Java jshell> new String(Character.toChars(0x0041)) $13 ==> "A" jshell> new String(Character.toChars(0x00df)) $14 ==> "ß" jshell> new String(Character.toChars(0x6771)) $15 ==> "東" jshell> new String(Character.toChars(0x10400)) $16 ==> ""
(Character) Encoding • Unicode string is a sequence of code
points (limit 0 - 0x10FFFF) • character encoding - translate sequence of code points into Bytes to store into memory ◦ ASCII: 7 bit (0 - 127), english letters ◦ UTF-8: most common, default in python ◦ UTF-16 etc
(Character) Encoding - String to Bytes Python In [40]: c
= chr(0x20ac) In [41]: c Out[41]: '€' In [42]: c.encode('utf-8') Out[42]: b'\xe2\x82\xac' Java jshell> String str = new String(Character.toChars(0x20ac)) str ==> "€" jshell> import java.nio.charset.* jshell> byte bytes[] = str.getBytes(StandardCharsets.UTF_8) bytes ==> byte[3] { -30, -126, -84 } jshell> for (byte b: bytes) { System.out.printf("%x ", b); } e2 82 ac
References • Stanford CS 101 on Bits and Bytes •
Unicode HOWTO — Python 3.9.1 documentation • Unicode (The Java™ Tutorials > Internationalization > Working with Text)
Thank You