Go Out with Blackhole

Tenstorrent Japan
October 29, 2025

Tenstorrent TechTalk #4 in Tokyo, LT2

Transcript

  1. Go Out with Blackhole
    Tenstorrent Tech Talk #4 - Lightning Talk
    Tetsuya Hayashi
    Note: This material is not official information from Tenstorrent. Please understand that it is purely personal hobby information.
  2. Introduction
    I want to hack on Blackhole anytime, anywhere!
    I really want to hack on the actual hardware in my own hands
    I want to bring my own Blackhole and show it off
    I want to bring Blackholes together and connect them at 800 Gbps
    A 2D torus with 4 people, a 3D torus with 8 people? That would be incredible! (1)
    (1) Might be doable with effort. TT is freedom!
    ※ The image on the right was generated by Gemini Nano Banana
  3. Shopping
    1. Thunderbolt 3 M.2 NVMe Adapter: Wavlink Portable M.2 NVMe SSD
    2. M.2 NVMe PCIe 3.0 x4 Adapter: ADT-Link R42UF
    3. 1000W ATX 3.1 Power Supply: Thermalright TR-TPFX-1000-W
    Purchased the entire set during AliExpress's July sale for ¥32,335
    DIY was cheaper than buying an eGPU box
    Reference site: https://darekasan-net.hatenablog.com/entry/2024/09/04/152918
  4. Tried various hosts
    Host                               Result  I/F          Memory  Notes
    Old ThinkPad X1 Carbon             ✗       TB3          16GB    Abandoned: the BIOS has no Above 4G Decoding option, so the PCIe memory could not be recognized
    Business card-sized x86 Radxa X4   ✗       M.2 OCuLink  8GB     Close! The small sample runs, but vLLM fails due to insufficient memory (tt-smi OK, run_op_on_device.py OK, vLLM NG) ※2
    Recent ThinkPad P14s Gen5          ◯       TB3          64GB    Worked with IOMMU (VT-d) off
    TTSMI (screenshot text)
      TT-SMI Version 3.0.32, Oct 23 2025 11:12:04 PM
      Host Info (Config Warning!)
        OS: Linux (x86_64)
        Distro: Ubuntu 24.04.3 LTS
        Kernel: 6.14.0-33-generic
        Hostname: hauynite
        Python: 3.12.3
        Memory: 7.54 GB
                * 32GB+
        Driver: TT-KMD 2.4.1
        * Recommended Config
      Device Information
        #: 0, Bus ID: 0000:03:00.0, Board Type: p100a, Board ID: 4323191105c, Coords: N/A, DRAM Trained: N/A, DRAM Speed: N/A, Link Speed: Gen3 / Gen5
    ※2 ERROR 10-12 03:13:08 [engine.py:453] [enforce fail at alloc_cpu.cpp:117] err == 0. DefaultCPUAllocator: can't allocate memory: you tried to allocate 17179869184 bytes. Error code 12 (Cannot allocate memory)
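The allocation error on this slide asks for 17179869184 bytes, which is exactly 16 GiB, on a host where TT-SMI reports 7.54 GB of memory. As a minimal sketch (the function names are hypothetical, and it assumes the Linux `/proc/meminfo` text format), a pre-flight check could compare that request against `MemTotal`:

```python
"""Sketch: would the host allocation from the vLLM error fit in RAM?"""

REQUESTED_BYTES = 17_179_869_184  # size copied from the error message (16 GiB)


def mem_total_bytes(meminfo_text: str) -> int:
    """Return MemTotal in bytes from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # The line looks like "MemTotal:        7907328 kB"
            return int(line.split()[1]) * 1024
    raise ValueError("MemTotal not found")


def fits_in_ram(requested: int, meminfo_text: str) -> bool:
    """True if the request is smaller than total RAM.

    Assumption: this ignores swap and memory already in use, so it is
    only a coarse lower bound on what the host needs.
    """
    return requested < mem_total_bytes(meminfo_text)
```

On a Linux host this could be called as `fits_in_ram(REQUESTED_BYTES, open("/proc/meminfo").read())`; an 8GB machine fails the check before vLLM is even started.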
  5. It worked!
    Board: My Blackhole p100a
    PC: ThinkPad P14s Gen5
        Intel Core Ultra 7 155H, 64GB
        Ubuntu 24.04.3, bare-metal install
    BIOS:
        Thunderbolt 3 -> Security Level: No Security
        Security -> Virtualization -> VT-d Feature: Disable ※
    ※ For some reason, in my case, vLLM threw errors and would not work when IOMMU (VT-d) was enabled!?
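Since the working configuration depends on VT-d being disabled, it can help to confirm from Linux what the kernel actually sees. The sketch below is an assumption-laden check, not something from the talk: it counts the entries under `/sys/kernel/iommu_groups`, which the kernel populates when an IOMMU is active. The function names are hypothetical.

```python
from pathlib import Path


def iommu_group_count(sysfs_root: str = "/sys") -> int:
    """Count IOMMU groups under <sysfs_root>/kernel/iommu_groups."""
    groups = Path(sysfs_root) / "kernel" / "iommu_groups"
    if not groups.is_dir():
        return 0
    return sum(1 for entry in groups.iterdir() if entry.is_dir())


def iommu_looks_enabled(sysfs_root: str = "/sys") -> bool:
    """Heuristic: at least one IOMMU group means an IOMMU is in use."""
    return iommu_group_count(sysfs_root) > 0
```

If `iommu_looks_enabled()` still returns `True` after changing the BIOS setting, the change probably did not take effect; this is only a heuristic and says nothing about why vLLM failed with VT-d on.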
  6. TT-SMI Display
    TTSMI (screenshot text)
      TT-SMI Version 3.0.32, Oct 20 2025 12:24:37 AM
      Host Info (Fully Compatible)
        OS: Linux (x86_64)
        Distro: Ubuntu 24.04.3 LTS
        Kernel: 6.14.0-33-generic
        Hostname: midnight
        Python: 3.12.3
        Memory: 62.30 GB
        Driver: TT-KMD 2.4.1
      Device Information
        #: 0, Bus ID: 0000:52:00.0, Board Type: p100a, Board ID: 4323191105c, Coords: N/A, DRAM Trained: N/A, DRAM Speed: N/A, Link Speed: Gen3 / Gen5, Link Width: x4 / x16
    Saving an SVG from tt-smi adds this window border, but this is not a Mac
  7. It worked! TT-Inference-Server
    Following the tutorial's “Deploying LLMs” section, vLLM worked smoothly.
    (request-venv) hayate@midnight:~/git/tt-inference-server$ curl -sS "http://localhost:8000/v1/completions" \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $VLLM_API_KEY" \
      -d "{ \"model\": \"meta-llama/$MODEL\", \"prompt\": \"Jim Keller is?\", \"max_tokens\": 60, \"temperature\": 0 }" | jq
    {
      "id": "cmpl-9c65c696ebaa4031a5900aaec091ab11",
      "object": "text_completion",
      "created": 1761145166,
      "model": "meta-llama/Llama-3.1-8B-Instruct",
      "choices": [
        {
          "index": 0,
          "text": " (Part 2)\nJim Keller is a renowned American computer architect and engineer, best known for his work at AMD and Apple. He is credited with designing the x86-64 architecture, which is the foundation of modern personal computers.\nKeller's career spans over three decades, with significant contributions to",
          "logprobs": null,
          "finish_reason": "length",
          "stop_reason": null,
          "prompt_logprobs": null
        }
      ],
      "usage": {
        "prompt_tokens": 5,
        "total_tokens": 65,
        "completion_tokens": 60,
        "prompt_tokens_details": null
      }
    }
    https://docs.tenstorrent.com/getting-started/vLLM-servers.html#deploying-llms
    ※ For tt-inference-server branches, try bh-getting-started first, then move on to dev if that succeeds
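The same completion request can be issued from Python instead of curl. This is a minimal sketch using only the standard library; the endpoint, headers, and JSON fields are copied from the curl command on the slide, while the function names and the `base_url` default are illustrative assumptions.

```python
import json
import urllib.request


def build_completion_request(model: str, prompt: str, api_key: str,
                             base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build the POST request that the slide sends with curl."""
    payload = {
        "model": f"meta-llama/{model}",
        "prompt": prompt,
        "max_tokens": 60,
        "temperature": 0,
    }
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def completion_text(req: urllib.request.Request) -> str:
    """Send the request and return the first choice's text."""
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]
```

With the server from the slide running, `completion_text(build_completion_request("Llama-3.1-8B-Instruct", "Jim Keller is?", os.environ["VLLM_API_KEY"]))` (after `import os`) would be the rough equivalent of the curl call piped through jq.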
  8. Summary
    Built a portable Blackhole environment using a Thunderbolt adapter + p100a
    Now able to go out with my Blackhole anytime, anywhere
    Future work
    Investigate performance degradation from the Thunderbolt connection (is 8.0 Gb/s sufficient?) (2)
    Evaluate Blackhole peer-to-peer 800 Gbps connection performance
    Requires two or more P150 units. Yes, I want them!
    (2) pci 0000:52:00.0: 8.000 Gb/s available PCIe bandwidth, limited by 2.5 GT/s PCIe x4 link at 0000:00:07.2 (capable of 504.112 Gb/s with 32.0 GT/s PCIe x16 link)
        00:07.2 PCI bridge: Intel Corporation Meteor Lake-P Thunderbolt 4 PCI Express Root Port #2 (rev 02)
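The two figures in the kernel message (8.000 Gb/s and 504.112 Gb/s) can be reproduced from the link speed, lane count, and line encoding. The sketch below assumes the standard PCIe encodings (8b/10b at 2.5 and 5.0 GT/s, 128b/130b from 8.0 GT/s up) and truncates the per-lane rate to whole Mb/s, which appears to match how the kernel arrives at 504.112; the table and function names are illustrative.

```python
# Raw rate in MT/s and encoding efficiency (payload bits / total bits) per link speed.
ENCODING = {
    "2.5 GT/s": (2500, 8, 10),
    "5.0 GT/s": (5000, 8, 10),
    "8.0 GT/s": (8000, 128, 130),
    "16.0 GT/s": (16000, 128, 130),
    "32.0 GT/s": (32000, 128, 130),
}


def link_bandwidth_mbps(speed: str, lanes: int) -> int:
    """Usable bandwidth in Mb/s for a PCIe link of the given speed and width."""
    rate, payload_bits, total_bits = ENCODING[speed]
    per_lane = rate * payload_bits // total_bits  # truncated to whole Mb/s
    return per_lane * lanes


def as_gbps(mbps: int) -> str:
    """Format Mb/s the way the kernel log does, e.g. 8000 -> '8.000 Gb/s'."""
    return f"{mbps // 1000}.{mbps % 1000:03d} Gb/s"
```

Here `as_gbps(link_bandwidth_mbps("2.5 GT/s", 4))` gives `8.000 Gb/s` and `as_gbps(link_bandwidth_mbps("32.0 GT/s", 16))` gives `504.112 Gb/s`. The same arithmetic puts a Gen3 x4 link, as reported by TT-SMI, at roughly 31.5 Gb/s. The sketch does not explain why the Thunderbolt root port is logged at 2.5 GT/s.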
  9. Tips. Linux Device Recognition and hugepages
    1. Add a udev rule so that Thunderbolt devices are recognized (authorized) on connection
       /etc/udev/rules.d/99-removable.rules
       ACTION=="add", SUBSYSTEM=="thunderbolt", ATTR{authorized}=="0", ATTR{authorized}="1"
       ※ Reference URL: https://wiki.archlinux.org/title/Thunderbolt
    2. Connect the p100a and verify with lspci
       Verify with lspci -vv -d 1e52:*
       52:00.0 Processing accelerators: Tenstorrent Inc Blackhole
       The device must be listed and three Memory regions (0, 2, 4) must be allocated
    3. Re-apply hugepages (mandatory for plug-and-play connections)
       If the device shows up in lspci, manually run sudo /opt/tenstorrent/bin/hugepages-setup.sh
       If it prints "Node 0 hugepages after: 4", it's OK.
       You can also check the info with cat /proc/meminfo
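The udev rule in step 1 matches Thunderbolt devices whose `authorized` attribute is `0` and writes `1` to it. To see whether any device is still waiting for authorization, the same attribute can be read from sysfs. This is a small sketch with a hypothetical function name, assuming the usual `/sys/bus/thunderbolt/devices/<device>/authorized` layout:

```python
from pathlib import Path


def unauthorized_thunderbolt_devices(sysfs_root: str = "/sys") -> list[str]:
    """Return names of Thunderbolt devices whose 'authorized' attribute is 0."""
    devices = Path(sysfs_root) / "bus" / "thunderbolt" / "devices"
    if not devices.is_dir():
        return []
    pending = []
    for dev in sorted(devices.iterdir()):
        attr = dev / "authorized"
        # Entries without this attribute (for example domains) are skipped.
        if attr.is_file() and attr.read_text().strip() == "0":
            pending.append(dev.name)
    return pending
```

An empty list after plugging in the adapter suggests the rule (or another authorization mechanism) has done its job, and the lspci check in step 2 is the next thing to look at.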
  10. lspci and hugepages-setup.sh output
    $ lspci -vv -d 1e52:*
    52:00.0 Processing accelerators: Tenstorrent Inc Blackhole
            Subsystem: Tenstorrent Inc Blackhole
            Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
            Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
            Latency: 0
            Interrupt: pin A routed to IRQ 198
            Region 0: Memory at 4800000000 (64-bit, prefetchable) [size=512M]
            Region 2: Memory at 4820000000 (64-bit, prefetchable) [size=1M]
            Region 4: Memory at 4000000000 (64-bit, prefetchable) [size=32G]
            Capabilities: <access denied>
            Kernel driver in use: tenstorrent
            Kernel modules: tenstorrent
    $ sudo /opt/tenstorrent/bin/hugepages-setup.sh
    Node 0 hugepages needed: 4
    Node 0 hugepages after: 4
    Completed hugepage setup
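The two checks described in the tips (memory regions 0, 2 and 4 present in the lspci output, and a "hugepages after" line from the setup script) can be automated by parsing the text shown on this slide. A minimal sketch, with hypothetical function names and regular expressions written against exactly the output format above:

```python
import re

REGION_RE = re.compile(r"Region (\d+): Memory at ([0-9a-f]+) \(.*?\) \[size=(\w+)\]")
HUGEPAGES_RE = re.compile(r"Node (\d+) hugepages after: (\d+)")


def memory_regions(lspci_text: str) -> dict[int, str]:
    """Map region index to size string for each 'Region N: Memory at ...' line."""
    return {int(m.group(1)): m.group(3) for m in REGION_RE.finditer(lspci_text)}


def has_expected_regions(lspci_text: str) -> bool:
    """True if regions 0, 2 and 4 are all allocated."""
    return {0, 2, 4}.issubset(memory_regions(lspci_text))


def hugepages_after(setup_output: str) -> dict[int, int]:
    """Map NUMA node to the hugepage count reported after setup."""
    return {int(m.group(1)): int(m.group(2)) for m in HUGEPAGES_RE.finditer(setup_output)}
```

Feeding it the captured text, for example from `subprocess.run(["lspci", "-vv", "-d", "1e52:*"], capture_output=True, text=True).stdout`, gives a quick pass/fail after each hot-plug.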