Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Bits, Bytes and Characters
Search
Shaikhul Islam
January 29, 2021
Education
0
96
Bits, Bytes and Characters
Computer Science 101, what is bit, bytes, character and unicode
Shaikhul Islam
January 29, 2021
Tweet
Share
Other Decks in Education
See All in Education
自己紹介 / who-am-i
yasulab
2
3.9k
Matz に頼られたので張り切って2時間ほどドイツと日本の互いの Ruby 学習事情についてディスカッションした話
yasulab
1
360
プロダクト・エンジニア・QAE 3軸でのナレッジシェアのススメ
hinac0
1
690
#英語力ランキング批判:EF-EPI,TOEFLスコア,英語教育実施状況調査
terasawat
0
490
HonkyTonk
savannamenzel
0
100
White Snake: Qing's Mission
movingcastal
0
240
HCL Domino 14.0 AutoUpdate を試してみた
harunakano
0
640
謙虚なアジャイルコーチ__アダプティブ_ムーブ_による伴走支援.pdf
antmiyabin
0
190
🎓 ChatGPT を組み込んだ24時間TA : 教育現場における LLM 活用の課題と改善
yasslab
PRO
0
770
困ったときのガイドライン / We Support You in Any Situation
yasulab
2
3.7k
Цифровые финансы - магистерская программа Финэка МГИМО 2024 г.
niellony
0
690
Flip-videochat
matleenalaakso
0
14k
Featured
See All Featured
JavaScript: Past, Present, and Future - NDC Porto 2020
reverentgeek
45
4.8k
CoffeeScript is Beautiful & I Never Want to Write Plain JavaScript Again
sstephenson
158
15k
Why You Should Never Use an ORM
jnunemaker
PRO
53
9k
5 minutes of I Can Smell Your CMS
philhawksworth
202
19k
The Pragmatic Product Professional
lauravandoore
31
6.2k
Code Reviewing Like a Champion
maltzj
517
39k
Building Applications with DynamoDB
mza
90
6k
Creating an realtime collaboration tool: Agile Flush - .NET Oxford
marcduiker
23
1.7k
Exploring the Power of Turbo Streams & Action Cable | RailsConf2023
kevinliebholz
26
3.9k
A Philosophy of Restraint
colly
202
16k
The Art of Programming - Codeland 2020
erikaheidi
48
13k
Testing 201, or: Great Expectations
jmmastey
36
7k
Transcript
Bits, Bytes and Characters Shaikhul Islam Chowdhury dev.to/shaikhul github.com/shaikhul
Bit • Smallest unit of storage • Bit is 0
or 1 • 8 bits - 1 Byte
Byte • Group of 8 bit • 1 bit pattern
- 0, 1 - 2 entry • 2 bit pattern - 00, 01, 10, 11 - 4 entry • n bit - 2^n entry possible • 1 Byte ◦ 8 bit - 2^8 - 255 entry ◦ Can hold 0 - 255 numbers
Bytes • How many bytes? • All storage are measured
in Bytes • Bigger units ◦ KB (1000 B), ◦ MB (1000 KB), ◦ GB (1000 MB), ◦ TB (1000 GB) etc
Character and Unicode • Characters are represented as code point
- range 0 - 0x10FFFF ( 1 million) Character Unicode Code Point Glyph Latin small letter a 0x61 a Black chess knight 0x265E ♞ Euro currency 0x20AC €
Character and Unicode (Code Point) Python In [22]: chr(0x0041) Out[22]:
'A' In [23]: chr(0x00df) Out[23]: 'ß' In [24]: chr(0x6771) Out[24]: '東' In [25]: chr(0x10400) Out[25]: '' Java jshell> new String(Character.toChars(0x0041)) $13 ==> "A" jshell> new String(Character.toChars(0x00df)) $14 ==> "ß" jshell> new String(Character.toChars(0x6771)) $15 ==> "東" jshell> new String(Character.toChars(0x10400)) $16 ==> ""
(Character) Encoding • Unicode string is a sequence of code
points (limit 0 - 0x10FFFF) • character encoding - translate sequence of code points into Bytes to store into memory ◦ ASCII: 7 bit (0 - 127), english letters ◦ UTF-8: most common, default in python ◦ UTF-16 etc
(Character) Encoding - String to Bytes Python In [40]: c
= chr(0x20ac) In [41]: c Out[41]: '€' In [42]: c.encode('utf-8') Out[42]: b'\xe2\x82\xac' Java jshell> String str = new String(Character.toChars(0x20ac)) str ==> "€" jshell> import java.nio.charset.* jshell> byte bytes[] = str.getBytes(StandardCharsets.UTF_8) bytes ==> byte[3] { -30, -126, -84 } jshell> for (byte b: bytes) { System.out.printf("%x ", b); } e2 82 ac
References • Stanford CS 101 on Bits and Bytes •
Unicode HOWTO — Python 3.9.1 documentation • Unicode (The Java™ Tutorials > Internationalization > Working with Text)
Thank You